Kate Sanders

283 posts

@kesnet50

LLM post-training, reasoning, and multimodality. Ph.D. @jhuclsp, incoming researcher at Microsoft Copilot Tuning.

Cambridge, MA · Joined August 2021
362 Following · 322 Followers
Kate Sanders reposted
Eugene Yang @EYangTW
🚨 Calling all RAG researchers & NLP folks: RAG4Reports is coming to ACL 2026 this July — a workshop + shared tasks dedicated to the hardest version of RAG: long-form, citation-backed, multilingual report generation. Here's why you should care 🧵👇 🔗 rag4reports.github.io
1 reply · 8 reposts · 18 likes · 4K views
Kate Sanders reposted
MAGMaR @MAGMaR_workshop
This year's shared task allows you to submit for the retrieval track, generation track, or full RAG track on a challenging new collection of unedited ("raw") videos. Research Papers (Apr. 1) Shared Task (Apr. 20)
0 replies · 1 repost · 1 like · 101 views
Kate Sanders reposted
MAGMaR @MAGMaR_workshop
📹 + 🧠 + 📝 = 🔥 First call for MAGMaR 2026, the 2nd workshop on multimodal augmented generation via multimodal retrieval! If #RAG isn't hard enough for you, try it multilingually and multimodally. Co-located with @aclmeeting in San Diego in July. nlp.jhu.edu/magmar/
1 reply · 2 reposts · 8 likes · 3.1K views
Kate Sanders @kesnet50
I will be at AAAI 2026 in Singapore next week! ✈️ I'm looking forward to seeing everyone's cool projects and discussing reasoning, post-training, and multimodality. Please reach out if you will be there and would like to connect.
1 reply · 2 reposts · 4 likes · 791 views
Kate Sanders reposted
Eugene Yang @EYangTW
🌍 Excited to announce: WSDM Cup 2026 Multilingual Retrieval is LIVE! Ever wondered how to build search systems that work across languages? We're challenging you to query in English and retrieve from Chinese, Persian, and Russian documents simultaneously. Ready to join? 🧵👇
1 reply · 8 reposts · 12 likes · 6.2K views
Kate Sanders reposted
Benjamin Van Durme @ben_vandurme
Compactor: Calibrated LLM KV cache Compression. 50% cache size with ~ZERO performance loss! compactor-vllm: inference engine for KV compression. Similar speed to vllm-v1 and 15x faster than NVIDIA KVPress, unlocks practical KV compression.
2 replies · 5 reposts · 21 likes · 1.3K views
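The kind of KV-cache compression Compactor advertises can be pictured with a toy sketch: score each cached token (here by a made-up attention score) and keep only the top half of the keys and values. Everything below, including the `compress_kv_cache` helper, the random scores, and the 50% keep ratio, is a hypothetical illustration, not Compactor's actual calibrated method.

```python
import numpy as np

def compress_kv_cache(keys, values, attn_scores, keep_ratio=0.5):
    """Toy KV-cache compression: keep the tokens with the highest
    attention scores, dropping the rest (illustrative only)."""
    n = keys.shape[0]
    k = max(1, int(n * keep_ratio))
    # Indices of the k most-attended tokens, kept in original order.
    top = np.sort(np.argsort(attn_scores)[-k:])
    return keys[top], values[top]

rng = np.random.default_rng(0)
keys = rng.normal(size=(8, 4))    # 8 cached tokens, head dim 4
values = rng.normal(size=(8, 4))
scores = rng.random(8)            # stand-in for attention statistics
ck, cv = compress_kv_cache(keys, values, scores)
print(ck.shape)  # (4, 4)
```

The real trade-off, which this sketch ignores, is choosing a scoring statistic whose pruning provably changes the attention output as little as possible.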
Kate Sanders reposted
Liaoyaqi Wang @LiaoyaqiW
🚀 Thrilled to share our new work, "Always Tell Me The Odds", at COLM 2025! LLMs struggle with accurate probability predictions, often giving coarse answers. We train decoder-based models to provide fine-grained, calibrated probabilities, significantly outperforming strong baselines!
1 reply · 4 reposts · 14 likes · 4.9K views
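The coarse-vs-fine-grained gap described above can be made concrete with the Brier score, a standard measure of probabilistic accuracy. This is a generic illustration with invented numbers; the paper's own metrics and models may differ.

```python
def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1
    outcomes; lower means better-calibrated predictions."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

outcomes = [1, 0, 1, 1, 0]
# A coarse predictor that only ever says 0.0 or 1.0 ...
coarse = [1.0, 0.0, 1.0, 0.0, 0.0]   # confident but wrong on the 4th item
# ... versus a fine-grained predictor with intermediate probabilities.
fine = [0.9, 0.2, 0.8, 0.6, 0.1]

print(round(brier_score(coarse, outcomes), 3))  # 0.2
print(round(brier_score(fine, outcomes), 3))    # 0.052
```

One badly wrong all-or-nothing answer costs more than several mildly uncertain ones, which is why fine-grained calibrated outputs win under proper scoring rules.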
Kate Sanders reposted
William Fleshman @willcfleshman
Did you know that LoRA A matrices can be frozen at init w/o degrading performance? 🤯 We leverage this trick to construct an unsupervised routing procedure that achieves identical performance to the previous best with orders of magnitude fewer FLOPs and ~50% less GPU memory. 🧵
1 reply · 6 reposts · 11 likes · 2.4K views
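The frozen-A trick follows from the structure of LoRA itself: the adapted layer computes W x + B(A x), so a random frozen A merely fixes a low-dimensional projection and all adaptation can live in B. A minimal NumPy sketch, where the dimensions, learning rate, and squared-error objective are all invented for illustration (this is not the paper's routing procedure):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 16, 4                              # model dim, LoRA rank (toy sizes)

W = rng.normal(size=(d, d))               # pretrained weight (frozen)
A = rng.normal(size=(r, d)) / np.sqrt(d)  # LoRA A: random init, FROZEN
B = np.zeros((d, r))                      # LoRA B: the only trained part

def forward(x):
    # Adapted layer: W x + B (A x); with B = 0 it equals the base layer.
    return W @ x + B @ (A @ x)

x = rng.normal(size=d)
assert np.allclose(forward(x), W @ x)     # identical to base at init

# One toy gradient step on B alone, for L = 0.5 * ||forward(x) - target||^2.
target = rng.normal(size=d)
err = forward(x) - target
B -= 0.01 * np.outer(err, A @ x)          # dL/dB = err (A x)^T; A untouched
print(np.linalg.norm(forward(x) - target) < np.linalg.norm(err))  # True
```

Since A never changes, its gradients and optimizer state disappear entirely, which is where the FLOP and memory savings come from.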
Kate Sanders reposted
Marc Marone @ruyimarone
3T tokens, ~1800 languages, 2 models - we’re releasing mmBERT, a modern multilingual encoder model!
11 replies · 67 reposts · 400 likes · 31K views
Kate Sanders reposted
Aleksa Gordić (水平问题) @gordic_aleksa
New in-depth blog post: "Inside vLLM: Anatomy of a High-Throughput LLM Inference System". Probably the most in-depth explanation of how LLM inference engines, and vLLM in particular, work! It took me a while to understand the codebase at this level and then to write it all up; I quickly realized I had underestimated the effort. 😅 It could easily have been a book/booklet (lol).

I covered:
* Basics of the inference engine flow (input/output request processing, scheduling, paged attention, continuous batching)
* "Advanced" stuff: chunked prefill, prefix caching, guided decoding (grammar-constrained FSM), speculative decoding, disaggregated P/D
* Scaling up: from smaller LMs that fit on a single GPU all the way to trillion+ parameters (via TP/PP/SP), i.e. multi-GPU, multi-node setups
* Serving the model on the web: from offline deployment to multiple API servers, load balancing, the DP coordinator, and multi-engine setups :)
* Measuring the performance of inference systems (latency: TTFT, ITL, E2E, TPOT; throughput) and the GPU roofline model

Lots of examples, lots of visuals!

---

I realize I've been silent on social; many of you noticed, and thanks for reaching out! :) I'm so back! Lots of things happened. Also, in general, I'm a bit sick of superficial content; it really is the equivalent of junk food (h/t @karpathy). I want to do the best, deepest technical work of my life over the next few years and write much more in depth (high-quality organic food ;)), so I might not be as frequent around here as I used to be (? we'll see). I'll aim to share a few paper summaries a week, or whatever is relevant / in the zeitgeist. If there are topics from the past few weeks or months you'd like covered, drop them in the comments and I might focus on some of them in my next posts.

---

Huge thank you to @Hyperstackcloud for providing an H100 node to run some of the experiments and analysis I needed for this write-up. The team there, led by Christopher Starkey, is amazing!
Also a big thank you to Nick Hill, who did a very thorough review of the post (basically a code review lol; Nick is a core vLLM contributor and principal SWE at Red Hat), and to my friends Kyle Krannen (NVIDIA Dynamo), @marksaroufim (PyTorch), and @ashVaswani (goat) for taking the time during the weekend when they didn't have to!
63 replies · 401 reposts · 2.6K likes · 323.5K views
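Of the topics the post covers, paged attention is the easiest to sketch: the KV cache is carved into fixed-size physical blocks, and each sequence keeps a block table mapping logical token positions to blocks allocated on demand, so memory is never reserved for a sequence's maximum length up front. A toy version of that bookkeeping follows; the block size, class, and method names are all invented for illustration, and vLLM's real block manager is far more involved.

```python
BLOCK_SIZE = 4  # tokens per KV block (toy value; vLLM uses e.g. 16)

class PagedKVCache:
    """Toy block-table bookkeeping in the spirit of paged attention."""
    def __init__(self, num_blocks):
        self.free = list(range(num_blocks))  # pool of physical blocks
        self.tables = {}    # seq_id -> list of physical block ids
        self.lengths = {}   # seq_id -> tokens written so far

    def append_token(self, seq_id):
        n = self.lengths.get(seq_id, 0)
        if n % BLOCK_SIZE == 0:  # current block is full: allocate a new one
            self.tables.setdefault(seq_id, []).append(self.free.pop(0))
        self.lengths[seq_id] = n + 1

    def physical_slot(self, seq_id, pos):
        # Translate a logical token position into a slot in GPU memory.
        block = self.tables[seq_id][pos // BLOCK_SIZE]
        return block * BLOCK_SIZE + pos % BLOCK_SIZE

cache = PagedKVCache(num_blocks=8)
for _ in range(6):                      # write 6 tokens for one sequence
    cache.append_token("seq0")
print(cache.tables["seq0"])             # [0, 1]  (two blocks in use)
print(cache.physical_slot("seq0", 5))   # 5  (block 1, offset 1)
```

Because blocks are allocated only as tokens arrive and returned to the free pool when a sequence finishes, many more sequences can be batched into the same GPU memory, which is what enables continuous batching at high throughput.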
Kate Sanders reposted
Isabel Cachola @isabelcachola
Our work on readability evaluation for Plain Language Summarization will appear at #EMNLP2025!! @DanielKhashabi @mdredze Paper: arxiv.org/pdf/2508.19221 TLDR: Traditional readability metrics correlate poorly with human judgements & LMs consider deeper readability features. 1/6
1 reply · 17 reposts · 46 likes · 4.3K views
Kate Sanders reposted
Kate Sanders @kesnet50
Taking off for Vienna #ACL2025! 🇦🇹 Excited to talk with people about transparent reasoning, multimodality, and fact verification. Stop by our multimodal RAG workshop on Friday 🔥🔥🔥 x.com/MAGMaR_worksho… Please reach out if you want to grab coffee!
MAGMaR @MAGMaR_workshop

New Workshop on Multimodal Augmented Generation via MultimodAl Retrieval (MAGMaR) to be held at @aclmeeting ACL in Vienna this summer. We have a new shared task that stumps most LLMs - including ones pretrained on our test collection. nlp.jhu.edu/magmar/

5 replies · 2 reposts · 6 likes · 937 views
Kate Sanders reposted
Eugene Yang @EYangTW
🚨Wouldn’t it be nice if your agentic search system could reason over all your docs? ✨Introducing Rank-K, a listwise reranker that benefits from test-time compute and long-context! Rank-K sets a new SoTA for reasoning-based reranking, without reasoning chains from other models.
2 replies · 28 reposts · 192 likes · 21.5K views