Xia “Ben” Hu

@huxia
Associate Professor of CS@Rice working on AutoML, XAI and Network Analytics. Author of AutoKeras and NCF.




RAG for long context LLMs: Video

Will long context LLMs really kill RAG? This is a talk @RLanceMartin gave at a few recent meetups that pulls together threads from a few different projects related to this question.

Multi-needle in a haystack shows limitations in long-context LLM reasoning & retrieval over multiple facts. But RAG may evolve in a few ways. While query analysis likely remains critical, we may see a shift towards full document indexing (e.g., RAPTOR, multi-representation indexing) & long context embeddings. We may also see a shift away from a naive prompt:response paradigm to a "flow" paradigm where RAG answers are built iteratively (Self-RAG, C-RAG) with post-retrieval reasoning and feedback.

📽️ Video: youtu.be/SsHUNfhF32s
📓 Slides: docs.google.com/presentation/d…
⛓️ Links:
1/ Multi-needle analysis w/ @GregKamradt blog.langchain.dev/multi-needle-i…
2/ RAPTOR (@parthsarthi03 et al) github.com/parthsarthi03/… youtube.com/watch?v=jbGchd…
3/ Dense-X / multi-representation indexing (@tomchen0 et al) arxiv.org/pdf/2312.06648… blog.langchain.dev/semi-structure…
4/ Long context embeddings (@JonSaadFalcon, @realDanFu, @simran_s_arora) hazyresearch.stanford.edu/blog/2024-01-1… together.ai/blog/rag-tutor…
5/ Self-RAG (@AkariAsai et al), C-RAG (Shi-Qi Yan et al) arxiv.org/abs/2310.11511 arxiv.org/abs/2401.15884 blog.langchain.dev/agentic-rag-wi…
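
The "flow" idea in the post above (retrieve, grade what came back, then either answer or feed a rewritten query back into retrieval) can be sketched roughly as follows. This is a minimal illustration, not the Self-RAG or C-RAG implementation: `iterative_rag`, the toy corpus, the synonym table, and all `toy_*` helpers are hypothetical stand-ins for a real retriever, an LLM relevance grader, and an LLM generator.

```python
import re
from typing import Callable, List

Retriever = Callable[[str], List[str]]
Grader = Callable[[str, str], bool]
Generator = Callable[[str, List[str]], str]
Rewriter = Callable[[str], str]


def iterative_rag(question: str, retrieve: Retriever, grade: Grader,
                  generate: Generator, rewrite: Rewriter,
                  max_rounds: int = 3) -> str:
    """Retrieve, grade, and retry with a rewritten query until a passage passes."""
    query = question
    for _ in range(max_rounds):
        docs = retrieve(query)
        # Post-retrieval reasoning: keep only passages the grader accepts.
        relevant = [d for d in docs if grade(question, d)]
        if relevant:
            return generate(question, relevant)
        # Feedback step: nothing usable came back, so rewrite the query.
        query = rewrite(query)
    return generate(question, [])


# Toy stand-ins (assumptions for illustration only).
CORPUS = [
    "SelfExtend extends the context window without fine-tuning.",
    "RAPTOR builds a tree of document summaries.",
]
SYNONYMS = {"lengthen": "extends"}


def _words(text: str) -> set:
    # Lowercased alphanumeric tokens longer than three characters.
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if len(w) > 3}


def toy_rewrite(query: str) -> str:
    return " ".join(SYNONYMS.get(w, w) for w in query.lower().split())


def toy_retrieve(query: str) -> List[str]:
    return [d for d in CORPUS if _words(query) & _words(d)]


def toy_grade(question: str, doc: str) -> bool:
    return bool(_words(toy_rewrite(question)) & _words(doc))


def toy_generate(question: str, docs: List[str]) -> str:
    return "Based on: " + " ".join(docs) if docs else "I don't know."


answer = iterative_rag("how to lengthen inputs", toy_retrieve, toy_grade,
                       toy_generate, toy_rewrite)
print(answer)
```

In this toy run the first retrieval finds nothing, the rewrite swaps in a synonym, and the second round retrieves and keeps the SelfExtend passage. A real pipeline would replace each `toy_*` function with a model or index call.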

Despite the mixed feelings about Google's latest Gemma model, we're big fans! @GoogleAI Why? Coz we found it pairs incredibly well with our SelfExtend 🤣🤣🤣 - like, perfectly! With SelfExtend, no fine-tuning needed, we effortlessly expanded Gemma's window from 8k to 90k+! On the 'Needle in the haystack' task, Gemma-2b-it even struggled at 8k, but with SelfExtend, Gemma-2b-it easily tackles it within a 90k range! #AI #Gemma #SelfExtend #LLMs 🚀 Paper: arxiv.org/abs/2401.01325 Github: github.com/datamllab/Long…
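
For readers unfamiliar with the 'Needle in the haystack' task mentioned above, here is a rough sketch of how such a test prompt is usually assembled: a short fact (the needle) is buried at a chosen depth inside long filler text, and the model is asked to recall it. This is not the evaluation code behind the post. `build_haystack` and `needle_found` are hypothetical helpers, and length is counted in whitespace-separated words as a crude proxy for tokens.

```python
def build_haystack(filler: str, needle: str, depth: float, target_words: int) -> str:
    """Repeat `filler` to `target_words` words and insert `needle` at `depth` (0.0 to 1.0)."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    filler_words = filler.split()
    if not filler_words:
        raise ValueError("filler must contain at least one word")
    # Tile the filler until it is long enough, then trim to the target length.
    reps = target_words // len(filler_words) + 1
    words = (filler_words * reps)[:target_words]
    # Place the needle at the requested fractional depth.
    cut = int(len(words) * depth)
    return " ".join(words[:cut] + [needle] + words[cut:])


def needle_found(model_answer: str, expected: str) -> bool:
    """Score a response by checking whether the expected fact appears in it."""
    return expected.lower() in model_answer.lower()


needle = "The secret code is 4127."
haystack = build_haystack("the quick brown fox", needle, depth=0.5, target_words=100)
prompt = haystack + "\n\nQuestion: What is the secret code?"
```

Sweeping `depth` and `target_words` over a grid and recording `needle_found` for each model response gives the familiar depth-by-length heatmap. The model call itself is left out here because it depends on the serving setup.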




Worked for us on llama2 as well!



