Sabitlenmiş Tweet
Krish Chhajer
110 posts

Krish Chhajer
@krish_chhajer
ece @uoft
Toronto, Canada Katılım Ağustos 2025
297 Takip Edilen364 Takipçiler
Krish Chhajer retweetledi

@sentra_app just killed @GoogleResearch's TurboQuant.
Introducing SpectralQuant — 5.95× KV cache compression on Mistral 7B at +7.5% perplexity overhead.
TurboQuant at the same compression: +22%.
3× less degradation. 15-second calibration. One per-model, then drop-in for any HuggingFace LLM, ViT, ESM, AlphaFold Evoformer, or VideoMAE.
📰 Paper & results: github.com/Dynamis-Labs/s…
Check out the findings and how the mechanism works below. ↓
English
Krish Chhajer retweetledi

We finally know why LLMs hallucinate. It's not the model. It's the geometry.
@OpenAI text-embedding-3-large: 91/3072 dimensions do real work.
@GeminiApp gemini-embedding-001: 80/3072 dimensions do real work.
~97% of your vector database is mathematically empty. Your RAG system is retrieving from noise.
@ashwingop and I present "The Geometry of Consolidation" - a proof that RAG compression has a hard floor no algorithm can beat, set by a single spectral number your embedding model cannot escape.
Every hallucination your RAG pipeline produces? This is why.
Paper + results: github.com/niashwin/geome…


English
Krish Chhajer retweetledi

reinventing Groq's LPU with @michael_trbo
we got instruction driven data movement working between SRAM memory blocks and MXM compute!!
English

what a run with @luthiraabeykoon, can't wait to get started on V3👀
thanks to everyone for the support!
luthira@luthiraabeykoon
Featured on @AlphaSignalAI, thanks everyone for all the support!
English
Krish Chhajer retweetledi

Krish Chhajer retweetledi
Krish Chhajer retweetledi

We implemented @karpathy 's MicroGPT fully on FPGA fabric.
No GPU.
No PyTorch.
No CPU inference loop.
Just a transformer burned into hardware, generating 50,000+ tokens/sec.
The model is small, but the idea is not: inference does not have to live only in software 👇
English
Krish Chhajer retweetledi







