WebAssembly Is Not a Stack Machine (2019) (troubles.md)
66 points by arto | 2020-08-28 10:03:05+00:00 | 20 comments




Discussed at the time (with comments from one of the designers of WASM): https://news.ycombinator.com/item?id=19069587

And at least one other WASM implementer as well. A lot of good comments there.

I think that's actually a rather poor discussion, revolving around what is or isn't "ad-hominem", or which unimportant details of the history of WASM are inaccurate, or what the name of things should be.

All of the technical concerns raised by the post are either left unaddressed, or deferred to future extensions of WASM. In other words, the author must be correct in his analysis and the standard has made questionable trade-offs from the outset.

I can think of at least one reason why you might want a register machine: not every platform is going to use or allow a JIT, and a register machine would perform better in that case, as with Dalvik in the early days of Android.
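To make the interpreter argument concrete, here is a minimal sketch of the same expression, `a + b * c`, run through two toy interpreters. Both instruction sets are hypothetical and invented for illustration (they are not real Wasm or Dalvik bytecode); the only point is that the register form needs fewer dispatched instructions, which is usually what dominates the cost of a non-JIT interpreter.

```python
# Hypothetical toy instruction sets, not real Wasm or Dalvik bytecode.

def run_stack(code, local_vars):
    """Interpret a stack-style program; returns (result, dispatch count)."""
    stack, dispatches = [], 0
    for op, *args in code:
        dispatches += 1
        if op == "get":            # push a local onto the operand stack
            stack.append(local_vars[args[0]])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(op)
    return stack.pop(), dispatches


def run_register(code, regs):
    """Interpret a three-address program; returns (result, dispatch count)."""
    regs = list(regs)              # copy so the caller's list is untouched
    dispatches = 0
    for op, dst, lhs, rhs in code:
        dispatches += 1
        if op == "add":
            regs[dst] = regs[lhs] + regs[rhs]
        elif op == "mul":
            regs[dst] = regs[lhs] * regs[rhs]
        else:
            raise ValueError(op)
    return regs[dst], dispatches


# a + b * c with a, b, c in slots 0, 1, 2; slot 3 is a scratch register.
STACK_PROG = [("get", 0), ("get", 1), ("get", 2), ("mul",), ("add",)]
REG_PROG = [("mul", 3, 1, 2), ("add", 3, 0, 3)]

print(run_stack(STACK_PROG, [2, 3, 4]))        # (14, 5)
print(run_register(REG_PROG, [2, 3, 4, 0]))    # (14, 2)
```

Five dispatches versus two for the same result. Each register instruction is wider, so this trades dispatch count against code size rather than being a free win.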


Read past the first thread ;)

Always write an implementation first - and make it a good one. Then derive a standard from it.

"Oh, but the implementation details will leak through!"

So what? This is an ivory-tower concern. When you are designing a standard, you must have your mind on possible implementations, which is far more difficult without having created an actual implementation. You can't design in a total vacuum, otherwise your standard can't be implemented properly at all.


That's how we got the open version of MS Word. For small values of 'open', because 'do it like Word '95 did it' is not a very good way of describing a standard.

I just checked and was somewhat surprised to learn that AutoSpaceLikeWord95’s behavior is actually pretty well specified:

https://docs.microsoft.com/en-us/dotnet/api/documentformat.o...

I’m sure there are still gaps in the specification overall; I don’t actually know much about it, but I believe competing implementations have trouble reproducing the exact layout of Word documents, which should be possible with a good specification, and is mostly possible with HTML.

But I don’t see anything wrong with that particular attribute. Backwards compatibility is important.


The ECMAScript 4 standard was on the right track: evolve a standard and a liberally-licensed reference implementation in a language with high-quality formal semantics. Ambiguities in the textual standard should be resolved by looking at the reference implementation. Contradictions between the text and the reference implementation should be regarded as bugs in the standard.

People complained that ES4 was too much, too quickly, but I think we're slowly getting something equally complex, yet less cohesive and more ad hoc.


It seems to me that this is only a problem when you try to write wasm directly. If you compile from Rust, then this analysis is already done for you at a higher level. Or what am I missing?

As far as I can tell, this article refers to the difficulty in compiling wasm into native code to run, not compiling code into wasm.

The issue is that all the analysis the Rust compiler did is neither present in nor inferable from the wasm artifact. Since the compiler that has to translate wasm to machine code cannot make more assumptions than those that are in the wasm specification, it has to redo the analysis for certain things. That is time-consuming for an optimizing compiler (how much, I don't know), and I think impossible for a streaming compiler.

The block argument and multiple return value proposal is a very good proposal. https://github.com/WebAssembly/multi-value/blob/master/propo...

Any idea how likely this is to make it into the spec?


Looks like the proposal to fix this was merged into the standard in April.

https://github.com/WebAssembly/multi-value


The changes noted for the future (loop counters as arguments, returning multiple values) sound a bit like reinventing the Forth VM, which already does things this way. It might not hurt to review some papers in that sphere that may have walked this ground before.

https://www.researchgate.net/publication/2414672_A_Prelimina...


I'm not sure the timeline described ("only at the last minute did it switch to stack-based encoding for the operators") is accurate, but it is the case that for a while we were working towards more of a register-oriented encoding instead of the stack-oriented one that shipped. The representation of trees and operands was also different. I think what ultimately shipped was probably right, but the semantics described by the article for blocks are incredibly gross and if I had known about them I would've blocked them. The author's conclusion that this is due to wasm's asm.js-derived heritage is accurate (also, arguably the 'lots of locals' model was unavoidable since everyone was compiling wasm using JS runtimes anyway).

Incidentally this claim is false: "No streaming compiler had yet been built, hell, no compiler had yet been built." Early in development we had at least two different compilers used to generate test cases: one compiler for a home-grown imperative language written by Nick Bray, and another compiler for a subset of C# that I wrote [1]. Having those two compilers generating code early on was useful, given that neither Emscripten nor LLVM was capable of compiling real apps, so we would have been flying blind without them. Development of LLVM integration also started very early; the problem is just that it took a long time until it was usable.

As for whether the lessons from those compilers were actually paid attention to or acted upon, well...

P.S. I still don't understand the reasoning behind "blocks have return values". Does any popular programming language out there do this except maybe some of the ML-derived ones? I've never run into it in production software. It's certainly not something a typical compiler would generate unless the source language had it as a primitive.

1: https://github.com/kg/ilwasm/blob/master/third_party/tests/R...


I understood the comment about lack of compilers to be about compilers from Wasm to machine code.

> P.S. I still don't understand the reasoning behind "blocks have return values". Does any popular programming language out there do this except maybe some of the ML-derived ones?

Most languages have something like `cond ? expr1 : expr2` which would naturally compile to an if-else block with a return value.
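A minimal sketch of that lowering, using a hypothetical toy structured stack code rather than real Wasm: the `("if", then_body, else_body)` form pops a condition, runs one arm, and whatever the arm leaves on its stack becomes the value of the whole construct. All instruction names and the `ternary` helper are invented for illustration.

```python
# Hypothetical toy structured stack code, not real Wasm.

def run(body, local_vars):
    """Evaluate a toy structured stack program and return its operand stack."""
    stack = []
    for instr in body:
        op = instr[0]
        if op == "get":                 # push a local
            stack.append(local_vars[instr[1]])
        elif op == "const":             # push an immediate
            stack.append(instr[1])
        elif op == "if":
            cond = stack.pop()
            arm = instr[1] if cond != 0 else instr[2]
            # The chosen arm runs on a fresh stack; whatever it leaves
            # behind becomes the value of the whole if/else.
            stack.extend(run(arm, local_vars))
        else:
            raise ValueError(op)
    return stack


def ternary(cond_local, then_local, else_local):
    """Lower `cond ? a : b` over locals to the toy form."""
    return [
        ("get", cond_local),
        ("if", [("get", then_local)], [("get", else_local)]),
    ]


print(run(ternary(0, 1, 2), [1, 10, 20]))   # [10]
print(run(ternary(0, 1, 2), [0, 10, 20]))   # [20]
```

Without a value-producing if/else, the same lowering would need a scratch local written in both arms and read afterwards.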


'if' is a separate instruction from 'block' in wasm, though. The conditional having a return value makes sense, given that there are two paths it can take and a value might want to flow out. The logic behind applying that to all basic blocks is confusing to me.

It’s misleading to say that register machines carry no liveness or that they defeat liveness. That’s a bit much. Computing liveness on a register machine with dense integer numbering of locals is not that expensive. It’s certainly cheaper than running a good backend. And a good backend will certainly modify the code in a way that requires liveness to be recomputed.
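As a rough illustration of why dense integer numbering keeps this cheap, here is a sketch of the textbook backward liveness dataflow with each set of locals stored as a bitmask in a single integer. The block and instruction encoding is an assumption made up for this example; it is not how any particular engine represents code.

```python
def liveness(blocks, succs):
    """Compute the live-in set of each basic block as a bitmask of locals.

    blocks: list of blocks, each a list of (defs, uses) tuples of local indices.
    succs:  list of successor block indices, one list per block.
    """
    n = len(blocks)
    use_mask = [0] * n   # locals read before any write inside the block
    def_mask = [0] * n   # locals written somewhere inside the block
    for i, block in enumerate(blocks):
        for defs, uses in block:
            for u in uses:
                if not (def_mask[i] >> u) & 1:
                    use_mask[i] |= 1 << u
            for d in defs:
                def_mask[i] |= 1 << d

    live_in = [0] * n
    changed = True
    while changed:                       # iterate to a fixed point
        changed = False
        for i in reversed(range(n)):     # backward problem: visit later blocks first
            live_out = 0
            for s in succs[i]:
                live_out |= live_in[s]
            new_in = use_mask[i] | (live_out & ~def_mask[i])
            if new_in != live_in[i]:
                live_in[i] = new_in
                changed = True
    return live_in


# Locals: 0 = n, 1 = i, 2 = acc.
BLOCKS = [
    [((1,), ()), ((2,), ())],                        # i = 0; acc = 0
    [((2,), (2, 1)), ((1,), (1,)), ((), (1, 0))],    # acc += i; i += 1; test i < n
    [((), (2,))],                                    # return acc
]
SUCCS = [[1], [1, 2], []]

print([bin(m) for m in liveness(BLOCKS, SUCCS)])     # ['0b1', '0b111', '0b100']
```

Only `n` is live on entry, all three locals are live around the loop, and only `acc` is live at the return. Every set operation is a single bitwise instruction per machine word, which is the sense in which dense numbering keeps the analysis inexpensive.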

It’s also misleading to say that register machines defeat SSA. It’s not hard to convert from a register machine to SSA - the algorithm is almost linear. Powerful backends (like WK’s B3) recompute SSA after some transformations anyway.
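A sketch of the simplest piece of that conversion: renaming within straight-line code, where a single forward pass gives every assignment a fresh name. This deliberately ignores control flow; a full conversion also has to place phi nodes where paths merge, which is the part the "almost linear" algorithms handle. The instruction tuples and the `v<local>_<version>` naming scheme are assumptions for illustration only.

```python
def rename_straight_line(code, num_locals):
    """Rename mutable locals so that each name is assigned exactly once.

    code: list of (dst, op, srcs) tuples over local indices.
    Returns a list of (new_name, op, new_srcs) tuples.
    """
    version = [0] * num_locals
    current = [f"v{i}_0" for i in range(num_locals)]   # values on entry
    out = []
    for dst, op, srcs in code:
        # Read the operands first, so `x = x + y` sees the old x.
        new_srcs = tuple(current[s] for s in srcs)
        version[dst] += 1
        current[dst] = f"v{dst}_{version[dst]}"
        out.append((current[dst], op, new_srcs))
    return out


# r0 = r0 + r1; r0 = r0 * r0; r1 = r0 + r1
CODE = [(0, "add", (0, 1)), (0, "mul", (0, 0)), (1, "add", (0, 1))]

for line in rename_straight_line(CODE, 2):
    print(line)
# ('v0_1', 'add', ('v0_0', 'v1_0'))
# ('v0_2', 'mul', ('v0_1', 'v0_1'))
# ('v1_1', 'add', ('v0_2', 'v1_0'))
```

The pass touches each instruction once and does constant work per operand, so the straight-line case is linear in code size.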

I think that wasm combines elements of a stack machine and a register machine in a way that leads to a compact format and multiple reasonable paths to the sort of IR you’d want for translating a platform-agnostic form like wasm into any contemporary instruction set. I’m no wasm fanboy, but as far as binary IRs for transporting code into optimizing compilers go, this one is pretty slick.

