
Can long-context language models (LCLMs) subsume retrieval, RAG, SQL, and more? Introducing LOFT: a benchmark stress-testing LCLMs on million-token tasks like retrieval, RAG, and SQL. Surprisingly, LCLMs rival specialized models trained for these tasks! arxiv.org/abs/2406.13121




















