ani (@anirudhbv_ce)
We finally know why LLMs hallucinate. It's not the model. It's the geometry.

@OpenAI text-embedding-3-large: 91 of 3072 dimensions do real work.
@GeminiApp gemini-embedding-001: 80 of 3072 dimensions do real work.

~97% of your vector database is mathematically empty. Your RAG system is retrieving from noise.

@ashwingop and I present "The Geometry of Consolidation": a proof that RAG compression has a hard floor no algorithm can beat, set by a single spectral number your embedding model cannot escape.

Every hallucination your RAG pipeline produces? This is why.

Paper + results: github.com/niashwin/geome…
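The "dimensions that do real work" figures can be estimated with a standard spectral measure of an embedding matrix. A minimal sketch, assuming the participation ratio of the covariance spectrum as the effective-dimension estimator (the paper's exact metric may differ), with synthetic data standing in for real embeddings:

```python
import numpy as np

def effective_dim(embeddings: np.ndarray) -> float:
    """Participation ratio of the covariance spectrum:
    (sum of eigenvalues)^2 / (sum of squared eigenvalues).
    Equals d for a perfectly isotropic cloud, ~1 when a
    single direction carries almost all the variance."""
    X = embeddings - embeddings.mean(axis=0)   # center the cloud
    s = np.linalg.svd(X, compute_uv=False)     # singular values
    ev = s ** 2                                # (unnormalized) covariance eigenvalues
    return float(ev.sum() ** 2 / (ev ** 2).sum())

# Synthetic demo: 3072-dim vectors whose variance lives in ~90 directions,
# mimicking the claimed anisotropy of real embedding models.
rng = np.random.default_rng(0)
d, k, n = 3072, 90, 2000
basis = rng.standard_normal((k, d))            # k "working" directions
X = rng.standard_normal((n, k)) @ basis + 0.01 * rng.standard_normal((n, d))
print(effective_dim(X))                        # far below 3072, near k
```

Run against actual embedding outputs, a number like 91/3072 would mean the spectrum is so top-heavy that roughly 97% of coordinates contribute almost nothing to pairwise distances.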