She was a CS PhD and somewhat itinerant professor with a long career who wrote a prominent CS paper about computer memory, "Hitting the Memory Wall: Implications of the Obvious".
On her obituary page, you will see a prominent "Memory Wall" link that is NOT a reference to her paper, but a place for sharing your thoughts about her life.
The automated ones don't care, but it absolutely matters for the informal credit assignment process that actually runs academia.
I really wish we had a better way to "name" papers. Big clinical trials often have an acronym (often hilariously forced: "CXCessoR4"). That takes the emphasis off the (one) lead author, but it's impractically hard to come up with one for every research paper.
Yeah tenure is nice but there's just a hint of mystery behind the title "itinerant professor." Like a wizard that just pops up in places to work computer science magic.
High bandwidth memory (HBM) can deliver TB/s of memory bandwidth and has completely shattered the memory wall for individual cores/compute elements. The only way for compute to keep up is going wide and parallel as seen in GPUs.
Despite this, massively increased memory bandwidth does not translate into material performance improvements on non-parallel compute tasks, because few such tasks are actually bound by memory bandwidth; they are bound by memory latency instead.
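To put rough numbers on the latency point, here is a back-of-the-envelope sketch using Little's law (bytes in flight = bandwidth x latency). The 64-byte line size, 100 ns latency, and 1 TB/s target are illustrative assumptions, not measurements of any particular part, and the function names are made up for this example.

```python
import math


def achievable_gb_per_s(outstanding_requests, line_bytes, latency_ns):
    """Bandwidth a core can pull given how many misses it keeps in flight.

    Bytes per nanosecond is numerically the same as GB/s (1e9 bytes/s).
    """
    return outstanding_requests * line_bytes / latency_ns


def requests_needed(target_gb_per_s, line_bytes, latency_ns):
    """Little's law: concurrent requests required to sustain a target bandwidth."""
    return math.ceil(target_gb_per_s * latency_ns / line_bytes)


# Assumed, illustrative numbers: 64-byte cache lines, 100 ns memory latency.
LINE_BYTES = 64
LATENCY_NS = 100

# A strictly dependent access pattern (one miss in flight at a time).
serial = achievable_gb_per_s(1, LINE_BYTES, LATENCY_NS)

# Concurrent line fetches needed to saturate an assumed 1 TB/s (1000 GB/s).
needed = requests_needed(1000, LINE_BYTES, LATENCY_NS)

print(f"one miss in flight: {serial:.2f} GB/s")
print(f"misses in flight for 1 TB/s: {needed}")
```

Under these assumptions a single dependent chain of misses sees well under 1 GB/s no matter how much bandwidth the memory offers, which is why extra bandwidth alone does not help it.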
The best known general solutions for improving memory latency are per-compute-element memory caches. Unfortunately, caches increase the complexity and size of each compute element, forcing you to reduce the number of compute elements, yet a large number of compute elements is the only way to saturate HBM bandwidth.
To keep up, the best known techniques are either to batch algorithmically, which lets you go wide using vector/batch instructions, or to go the GPU route with memory-latency-hiding parallelism.
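A minimal sketch of what "algorithmically batch" can look like, using a pointer chase as a stand-in workload (my choice of example, not something from the comment above). Python will not show the hardware speedup; the point is only the restructuring: the batched version exposes many independent lookups per step, which is the shape that vector gathers or GPU threads can overlap. The function names are hypothetical.

```python
def chase_one(next_index, start, steps):
    """Follow a single dependent chain: each lookup needs the previous result."""
    i = start
    for _ in range(steps):
        i = next_index[i]
    return i


def chase_batched(next_index, starts, steps):
    """Advance many independent chains in lockstep.

    Within one step, no lookup depends on another lookup from the same step,
    so the memory accesses can in principle be issued together.
    """
    cursors = list(starts)
    for _ in range(steps):
        cursors = [next_index[i] for i in cursors]
    return cursors


# Tiny ring: 0 -> 1 -> 2 -> 3 -> 0
ring = [1, 2, 3, 0]
print(chase_one(ring, 0, 5))
print(chase_batched(ring, [0, 1, 2], 5))
```

Both versions compute the same results; only the order of the work changes, so the batched form is an option whenever you have several independent traversals to do.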
Oh, my knowledge is woefully out of date. But I believe the memory wall is a fact of life for the most part. Like many others, I nibbled around the edges of the constraint at massive cost in increased complexity. Outside of very specific exceptions, the cure tends to be worse than the disease.
> Rob Pike didn't really name my favorite editor after me.
https://dl.acm.org/doi/10.1145/216585.216588
I notice these things a bit more as she was my PhD thesis advisor.