Isak Westerlund
5.2K posts

Isak Westerlund
@westis96
Exploring Amortized Inference, Language and Speech.















Something is rotten with policy gradient. PG has become *the* RL loss for LLMs. But it’s not even good at basic RL. Even on MNIST with bandit feedback, vanilla PG performs far worse than cross-entropy because it wastes gradient budget. Delightful Policy Gradient: arxiv.org/abs/2603.14608…







𝗞-𝗺𝗲𝗮𝗻𝘀 𝗶𝘀 𝘀𝗶𝗺𝗽𝗹𝗲. 𝗠𝗮𝗸𝗶𝗻𝗴 𝗶𝘁 𝗳𝗮𝘀𝘁 𝗼𝗻 𝗚𝗣𝗨𝘀 𝗶𝘀𝗻’𝘁. That’s why we built Flash-KMeans — an IO-aware implementation of exact k-means that rethinks the algorithm around modern GPU bottlenecks. By attacking the memory bottlenecks directly, Flash-KMeans achieves 30x speedup over cuML and 200x speedup over FAISS — with the same exact algorithm, just engineered for today’s hardware. At the million-scale, Flash-KMeans can complete a k-means iteration in milliseconds. A classic algorithm — redesigned for modern GPUs. Paper: arxiv.org/abs/2603.09229 Code: github.com/svg-project/fl…







built a tiny macOS menu bar app called Porter to see which dev servers are running when switching between projects :)






