Saujas Vaduguru
@saujasv

175 posts

PhD student @LTIatCMU | Previous: Student @iiit_hyderabad, Intern @ChandarLab @Mila_Quebec | he/him/his

Joined September 2013
577 Following · 533 Followers
Pinned Tweet
Saujas Vaduguru @saujasv ·
People adapt their language to communicate more efficiently over time. How can we make models do this? In our recent work, we trained models in self-play, and found that using the right incentives can make models adapt to communicate efficiently even without human demonstrations.
[image]
1 reply · 6 reposts · 25 likes · 5K views
Saujas Vaduguru retweeted
Atharva Naik @Atharva93149016 ·
Simple string manipulation can still break LLMs. In our ACL Findings 2026 paper, PBEBench, even GPT-5 drops below 5% accuracy on long-horizon inductive reasoning tasks. 🧵
[image]
1 reply · 12 reposts · 22 likes · 2.1K views
Saujas Vaduguru retweeted
Andre He @Andre3035858461 ·
New paper: We uncover latent tokens: predicted-but-not-decoded positions that can give diffusion models an advantage over autoregressive models. We show AR models can also be augmented with latent tokens, overcoming known failure modes. arxiv.org/abs/2602.03769
2 replies · 28 reposts · 153 likes · 43.2K views
Saujas Vaduguru retweeted
Aditya Yadavalli @AdityaYadavall2 ·
What can LLMs tell us about the nature of spoken communication? 🗣️🤖 From an info-theory perspective, speech has two main channels: words (text) and the melody of speech (prosody). We propose a new method using spoken LLMs to quantify how these channels convey meaning (1/n)
[image]
1 reply · 11 reposts · 29 likes · 2.5K views
Saujas Vaduguru retweeted
Karthik Narasimhan @karthik_r_n ·
Loving the plan mode in Cursor. The most annoying thing with coding agents for me was that they often got small but important details wrong (or at least different from what I had in mind), and trying to fix them afterwards is a pain. Reviewing the planned implementation before the actual build is so much easier and better. Feels like a successful example of an AI tool leveraging critical human-in-the-loop input.
0 replies · 2 reposts · 18 likes · 1.8K views
Saujas Vaduguru retweeted
Wayne Chi @iamwaynechi ·
Tired of evaluating LLMs on made-up problems that look nothing like real tasks? Introducing EDIT-Bench, a code editing benchmark built from in-the-wild user interactions in VSCode. Real-world edits are challenging: only 1/40 models score > 60% pass@1.
[image]
2 replies · 12 reposts · 42 likes · 15.3K views
Saujas Vaduguru retweeted
Nicholas Tomlin @NickATomlin ·
I'm recruiting my first group of PhD students at TTIC! If you're interested, please apply! If you know people who might be interested, please spread the word! Application deadline is Dec 9, 2025, and there is no application fee: ttic.edu/studentapplica…
16 replies · 160 reposts · 600 likes · 88.2K views
Saujas Vaduguru @saujasv ·
@evanthebouncy is presenting this at EMNLP next week! Find him in Hall C on Wednesday (Nov 5), 13:00–14:00 to chat about this!
0 replies · 0 reposts · 3 likes · 131 views
Saujas Vaduguru @saujasv ·
When we instruct an agent to design something, its first output may not be precisely what we want. When humans collaborate, they refine their creations iteratively. Can we instruct an agent to refine its output? Is language the best medium for these instructions? We explore this in mrCAD.
[image]
1 reply · 8 reposts · 19 likes · 4.7K views