Naive RAG seems simple: chunk docs → embed → retrieve top-K → send to LLM.
But the cracks show fast:
→ Fixed chunks split context mid-thought
→ Top-K retrieval returns what's similar, not what's actually relevant
→ No query rewriting = garbage in, garbage out
→ Hallucinations slip through when retrieved chunks conflict
→ Zero re-ranking means noisy context crowds out the good stuff
Retrieve smarter, not bigger.
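A minimal sketch of the "smarter" pipeline: over-fetch candidates with cheap similarity search, then re-rank with a sharper relevance score before anything reaches the LLM. The `embed()` and `relevance()` functions here are toy stand-ins (bag-of-words and term overlap); a real system would use an embedding model and a cross-encoder re-ranker.

```python
# Two-stage retrieval sketch: cheap recall first, precise re-ranking second.
# embed() and relevance() are toy stand-ins, NOT production components.

from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Toy "embedding": bag-of-words counts (stand-in for a real model).
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def relevance(query: str, chunk: str) -> float:
    # Toy re-rank score: fraction of query terms the chunk covers
    # (stand-in for a cross-encoder, which scores query+chunk jointly).
    q = set(query.lower().split())
    return len(q & set(chunk.lower().split())) / len(q)

def retrieve(query: str, chunks: list[str], k: int = 4, final: int = 2) -> list[str]:
    qv = embed(query)
    # Stage 1: over-fetch top-K by similarity — high recall, noisy.
    candidates = sorted(chunks, key=lambda c: cosine(qv, embed(c)), reverse=True)[:k]
    # Stage 2: re-rank candidates, keep only the best few — high precision.
    return sorted(candidates, key=lambda c: relevance(query, c), reverse=True)[:final]

chunks = [
    "invoices are archived monthly to cold storage",
    "refund requests must be approved within 30 days",
    "the refund policy covers damaged goods only",
    "storage costs are billed per gigabyte",
]
print(retrieve("what does the refund policy cover", chunks))
```

The point of the two stages: stage 1 is cheap enough to scan everything, stage 2 is expensive but only sees K candidates, so noisy near-matches get filtered before they crowd the context window.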