

/MachineLearning
15.7K posts





@AndrewCurran_ The article you are referencing is inaccurate and has been updated.






Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: goo.gle/4bsq2qI

This is pretty much worst-case performance: no harness at all and a very simplistic prompt.

Transformers are Bayesian Networks arxiv.org/abs/2603.17063




@doodlestein @AnthropicAI: please sponsor this man.


Holy crap. This is the genre of software that's in the most danger:
- Kind of mid in quality
- Highly niche use-cases
- It's been winner takes all for the space in the past
- Often involved special formats or protocols
And now Claude Code can just reverse engineer it. 🤯

