We (the rust-analyzer team) have been aware of the slowness in Rowan for a while, but other things always took priority. Beyond allocation, Rowan is structured internally as a doubly-linked list to support mutating trees, but:
1. Mutation isn’t really worth it; the API isn’t user-friendly.
2. In most cases, it’s straight up faster to create a new parse tree and replace the existing one. Cache effects of a linked list vs. an arena!
In fairness, I don’t think we predicted just how large L1/L2 caches would get over the coming years.
It’s funny how there is continuous reinvention of parsing approaches.
Why isn’t there already some parser generator with vector instructions, PGO, and low stack usage? Just endless rewrites of recursive descent with caching optimizations sprinkled in when needed.
Because you have to learn how to use any given parser generator, naive code is easy to write, and there are tons of applications for parsing that aren't really performance critical.
Hardware also changes over time, so something that was initially fast gets tried by people with new hardware, who find it not so fast for them and then create their own "fast X". Fast forward 10 more years, someone with new hardware finds that, "huh, why isn't it using extension Y", and now we have three libraries all called "Fast X".
So it went from parsing at 25MiB/s to 115MiB/s. I feel like 115MiB/s is very slow for a Rust program, I wonder what it's up to that makes it so slow now. No diss to the author, good speedup, and it might be good enough for them.
115 MiB/s is something like 20 to 30 cycles per byte on a laptop, 50 on a desktop. That’s definitely quite slow as far as a CPU’s capacity to ingest bytes, but unfortunately about as fast as it gets for scalar (machine) code that does meaningful work per byte. There may be another factor of 2 or 3 to be had somewhere, or there may not be. If you want to go meaningfully faster, as in at least at the speed of your disk[1], you need to stop doing work per byte and start vectorizing. For parsers, that is possible but hard.
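The cycles-per-byte figures above can be reproduced with a little arithmetic. This is only a sketch: the function name is made up for illustration, and the clock speeds in the note below (2.5 to 3.5 GHz for a laptop, about 6 GHz for a boosted desktop core) are assumptions, not numbers from the article.

```rust
/// Cycles a single core spends per input byte at a given sustained throughput.
/// `clock_hz` is an assumed core clock, `throughput_mib_per_s` the measured parse speed.
pub fn cycles_per_byte(clock_hz: f64, throughput_mib_per_s: f64) -> f64 {
    // 1 MiB = 1024 * 1024 bytes.
    let bytes_per_second = throughput_mib_per_s * 1024.0 * 1024.0;
    clock_hz / bytes_per_second
}
```

With those assumed clocks, `cycles_per_byte(2.5e9, 115.0)` is a bit under 21, `cycles_per_byte(3.5e9, 115.0)` is about 29, and `cycles_per_byte(6.0e9, 115.0)` is just under 50, which lines up with the "20 to 30 on a laptop, 50 on a desktop" estimate.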
A quick rule of thumb is that one or two bytes per peak clock cycle per core or so (not unlike an old 8 bit or 16 bit machine!) is the worst case for memory bandwidth when running highly multithreaded workloads that heavily access main RAM outside cache. So there's a lot of gain to be had before memory bandwidth is truly saturated, and even then one can plausibly move to GPU-based compute and speed things up further. (Unified memory+HBM may potentially add a 2x or 3x multiplier to this basic figure, but either way it's in the ballpark.)
The grammar matters also, of course. A pure Python program is going to be much slower than the equivalent Rust program, just because CPython is so slow.
I don't know if this does semantic analysis of the program as well.
The performance gain from using a single shared vector for the nodes is pretty crazy. It just goes to show how much allocation overhead can slow things down if you are not careful.
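For anyone wondering what "a single shared vector for the nodes" looks like in practice, here is a minimal sketch. It is not the article's or Rowan's actual layout; `NodeId`, `Node`, `Tree` and the field names are hypothetical. The idea is that nodes refer to each other by index into one `Vec` instead of each being its own heap allocation.

```rust
/// Index of a node inside the shared vector (hypothetical name).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct NodeId(u32);

/// One syntax node; links are indices, not pointers.
pub struct Node {
    pub kind: u16,
    pub first_child: Option<NodeId>,
    pub next_sibling: Option<NodeId>,
}

/// The whole tree lives in one contiguous allocation.
#[derive(Default)]
pub struct Tree {
    nodes: Vec<Node>,
}

impl Tree {
    /// Push a node and, if a parent is given, link it as that parent's last child.
    pub fn add(&mut self, kind: u16, parent: Option<NodeId>) -> NodeId {
        let id = NodeId(self.nodes.len() as u32);
        self.nodes.push(Node { kind, first_child: None, next_sibling: None });
        if let Some(p) = parent {
            let first = self.nodes[p.0 as usize].first_child;
            match first {
                None => self.nodes[p.0 as usize].first_child = Some(id),
                Some(mut last) => {
                    // Walk the sibling chain to its end, then append.
                    while let Some(next) = self.nodes[last.0 as usize].next_sibling {
                        last = next;
                    }
                    self.nodes[last.0 as usize].next_sibling = Some(id);
                }
            }
        }
        id
    }

    /// Collect the direct children of `parent` in insertion order.
    pub fn children(&self, parent: NodeId) -> Vec<NodeId> {
        let mut out = Vec::new();
        let mut cur = self.nodes[parent.0 as usize].first_child;
        while let Some(c) = cur {
            out.push(c);
            cur = self.nodes[c.0 as usize].next_sibling;
        }
        out
    }

    /// Number of nodes stored in the shared vector.
    pub fn len(&self) -> usize {
        self.nodes.len()
    }
}
```

Building a tree is then a series of `tree.add(kind, Some(parent))` calls. Every node costs at most an amortized `Vec` push rather than a separate allocation, and dropping the tree frees one buffer.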
This is like saying "HTML, CSS and JavaScript are all widely used, but the webcam capture API is used way less, so obviously it's a failure"
In its current scope, WASM is a way to port existing code or accelerate certain computations, which only some applications need. Most websites don't need it, like how most sites don't need to use webcam capture; that doesn't mean it's not useful for those that do
People use wasm for things that need wasm. My use case is my cross-platform game engine, because running both natively and in the browser was a priority for me. It is a wonderful tool and it is a truly magical feeling to see my native games running in the browser. But 99% of web developers are developing ordinary websites, so they don't need it. That's not an indictment of wasm.
You have the wrong understanding about wasm. It's absolutely not supposed to be replacing HTML, CSS or JS.
And yes, wasm is used widely. On the web for expensive computation (Google Earth, Figma, AutoCAD, Unity games) or server side for portability and sandboxing (Cloudflare Workers, Fastly, …)
It is definitely meant to replace JS in some applications. It isn't quite there yet for normal web pages but it will be eventually. There are a few front-end web frameworks written in Rust that use WASM.
The whole "it's not meant to replace JS" thing was just to reduce pushback from JS devs.
> The whole "it's not meant to replace JS" thing was just to reduce pushback from JS devs.
It was born at the same time as WebGL, at the time of JIT optimisation for JS engines. As a subset of JS first, then as wasm as we know it. It was originally for games and performance on the web.
At no point was there a conversation about "replacing JS", but more like, "JS can't do this stuff, let's have something else".
1. creating plugins that get executed in the browser to render files like Parquet, PSD, TIFF, SQLite, EPS, ZIP, TGZ, GIS related files and many more, where C libraries are almost always the reference implementations. There are almost a hundred supported file formats, most of which are supported through WASM
2. creating plugins that get executed in the server to generate your own endpoint or middleware while being sure you can't start exfiltrating data (which can be other people's files, and other sensitive stuff)
3. in the workflow engine to enable people to run their own sandboxed scripts without giving those a blank check to go crazy
It also simply lets you use rust on the web. That's why I use wasm. It's actually an extremely nice experience. I write all my business logic in rust, and all my UI logic in javascript. There is rust tooling that automatically generates typescript types and APIs for you that make interoperating the two languages basically effortless. And by using rust/wasm, I can reach a level of performance that would be hilariously impossible in javascript
I use a wasm xxhash implementation that is 40x faster than the fastest JavaScript version I can find. Drop in replacement. Call overhead is minimal, could be better with stringref if that ever gets available. Also some other audio analysis stuff in wasm I've been using is 400x faster than the JavaScript implementation but admittedly I just went straight to wasm rather than try to optimize the js in that case.
I'm writing a point and click adventure game, and for that I've built a dialogue editor that uses a local text-to-speech model to turn speech into audio that runs in WASM (or WebGPU if it's available).
From what I can tell WASM is mostly being used to run big libraries from other languages in web apps. That's not a particularly common thing to need, so it's not commonly used. That doesn't mean it's moving too slowly.
Sure, here's a Rust/WASM procedural skybox generator I threw together the other day, which is much, much faster at 16k renders than JavaScript. https://tkte.ch/night-sky/
wasm isn't meant to supersede html/css/js (unfortunately) and it's regularly used for high performance applications in the browser, web-based cad software, figma, youtube (i think they use wasm for codec fallback when support is spotty) etc
there are also games, stuff to do with video (ffmpeg built for wasm), ml applications (mlc); in fact it's currently impossible to use wasm w/o js to load the wasm binary
as a result, the web stack is a bit upside down now, w/ the seemingly "low level" and "high performance" parts over the slow bits (javascript)
WebAssembly is a virtual ISA, not a replacement for HTML and CSS. It was also never meant to kill Javascript (which is actually a pretty nice language if you stick to the 'good parts' via Typescript and linting), but at most as an alternative or complement to JS, and as that WASM works really well.
[1] https://www.youtube.com/watch?v=p6X8BGSrR9w
Isn't it more about the grammar than the programming language?
Just about nobody uses WebAssembly. It first appeared almost ten years ago. This is snail-speed evolution at best.
Yes, tons. Obviously not all, but large parts of these are WASM: https://itch.io/games/platform-web
Tools like Figma are only performant because of WASM.