Jacob Andreas

2.8K posts

@jacobandreas

Teaching computers to read. Assoc. prof @MITEECS / @MIT_CSAIL / @NLP_MIT (he/him). https://t.co/5kCnXHjtlY https://t.co/2A3qF5vdJw

Cambridge, MA · Joined March 2007
950 Following · 23K Followers
Pinned Tweet
Jacob Andreas @jacobandreas
👉 New preprint: how do we make LMs more reliable once there's no more training data? Enforcing *consistency* of LM predictions across inputs lets us optimize, without supervision, for factual accuracy & faithful explanation (& gives a unifying view of many existing post-training algorithms)
Itamar Pres @PresItamar

New paper: It's time to optimize for 🔁self-consistency 🔁 We’ve pushed LLMs to the limits of available data, yet failures like sycophancy and factual inconsistency persist. We argue these stem from the same assumption: that behavior can be specified one I/O pair at a time. 🧵

3 replies · 2 reposts · 66 likes · 10.3K views
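The consistency idea in the pinned thread can be made concrete with a toy objective. A minimal sketch, assuming the "inputs" are paraphrases of one question and that disagreement is measured with a symmetric KL divergence; both choices are illustrative, not the paper's actual loss:

```python
# Hypothetical sketch of an unsupervised self-consistency objective:
# given a model's answer distributions on two paraphrases of the same
# question, penalize their disagreement. Names are illustrative.
import math

def kl(p, q, eps=1e-9):
    """KL divergence between two answer distributions (dicts)."""
    keys = set(p) | set(q)
    return sum(p.get(k, 0.0) * math.log((p.get(k, 0.0) + eps) / (q.get(k, 0.0) + eps))
               for k in keys if p.get(k, 0.0) > 0)

def consistency_loss(dist_a, dist_b):
    """Symmetric KL: zero iff the model answers both phrasings alike."""
    return 0.5 * (kl(dist_a, dist_b) + kl(dist_b, dist_a))

# Model answers "Is Lima the capital of Peru?" vs. a paraphrase of it.
p1 = {"yes": 0.9, "no": 0.1}
p2 = {"yes": 0.6, "no": 0.4}
loss = consistency_loss(p1, p2)  # > 0: inconsistent across inputs
```

Minimizing such a loss needs no labels, which is the appeal once supervised data runs out.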
Jacob Andreas reposted
Alana Renda @ICLR26 🇧🇷
Heading to #ICLR2026 (@iclr_conf) 🇧🇷 to present OpenEstimate! As LLMs get deployed in decision-making domains, they're increasingly expected to do subjective probability estimation, drawing on everything they know to form beliefs about unknown quantities. Our paper studies this capability with a leakage-resistant benchmark. This sits at the intersection of a few things I care about: RL in hard-to-verify domains, forecasting, and making LLMs honest about what they don't know. Come find me Saturday 10:30–1 at poster #1716 in Pavilion 3! And if you'd like to grab coffee and chat about any of these, DMs are open!
1 reply · 8 reposts · 40 likes · 4.2K views
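One standard way to score subjective probability estimates once the true outcomes are known is a Brier score. This is a hedged illustration of the kind of capability OpenEstimate probes, not the benchmark's actual metric:

```python
# Sketch of scoring an LLM's stated probabilities against
# later-revealed binary outcomes via a Brier score (illustrative;
# the benchmark's real evaluation may differ).

def brier_score(estimates, outcomes):
    """Mean squared error between stated probabilities and 0/1 outcomes.
    Lower is better; always answering 0.5 scores 0.25."""
    assert len(estimates) == len(outcomes)
    return sum((p - o) ** 2 for p, o in zip(estimates, outcomes)) / len(estimates)

# Three forecasts: confident-right, confident-right, hedged-right.
score = brier_score([0.9, 0.2, 0.7], [1, 0, 1])
```

A well-calibrated, honest model is rewarded for hedging exactly when it is genuinely uncertain.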
Jacob Andreas reposted
Gabe Grand @ ICLR 2026 🇧🇷
Hello Rio! Excited to take the big stage to present our work on Battleship agents + Bayesian Experimental Design: Oral 3A (Agents), Friday 10:30AM in the main amphitheater 👀 After that, come hang out with co-author @ValerPepe and me at poster session P3-#1602 from 3:15-5:45PM!
0 replies · 4 reposts · 22 likes · 1.7K views
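Bayesian experimental design in a Battleship-like setting can be sketched as choosing the probe with maximum expected information gain; for a single yes/no probe this reduces to maximizing the binary entropy of the hit probability. A toy illustration, not the paper's agent:

```python
# Toy Bayesian-experimental-design step for a Battleship-style query:
# probe the cell whose hit/miss outcome is most uncertain, i.e. whose
# binary entropy is largest. Purely illustrative.
import math

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def best_query(hit_probs):
    """hit_probs: cell -> P(hit). The most informative probe is the
    one whose hit probability is closest to 0.5 (max binary entropy)."""
    return max(hit_probs, key=lambda c: entropy([hit_probs[c], 1 - hit_probs[c]]))

cell = best_query({"A1": 0.05, "B2": 0.45, "C3": 0.9})  # → "B2"
```

A near-certain hit (C3) or miss (A1) teaches the agent little; the 45% cell resolves the most uncertainty.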
Jacob Andreas reposted
Project CETI @ProjectCETI
Female sperm whales support one another during the birthing journey—behavior that was long considered unique to humans and a few primates. Read now in @ScienceMagazine: bit.ly/4bIKkeo
1 reply · 21 reposts · 53 likes · 3.3K views
Jacob Andreas reposted
Isha Puri @ ICLR @ishapuri101
Ask ChatGPT several times where's best to go for spring break, and it recommends Barcelona almost every time. This isn't a fluke. RL training rewards one best answer, so the model learns to commit to one mode and repeat it. Meet Multi-Answer RL: a simple RL method that trains LMs to reason through and output a distribution of answers in a single generation. [1/N]
22 replies · 73 reposts · 445 likes · 96.5K views
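The "distribution of answers in a single generation" idea can be illustrated with a toy parser and reward. The output format and the total-variation reward below are assumptions made for illustration, not the actual Multi-Answer RL method:

```python
# Illustrative sketch: parse a generation that states several answers
# with probabilities, then reward it by closeness to a reference
# distribution. Format and reward function are assumed, not the paper's.

def parse_answer_distribution(text):
    """Parse 'Barcelona: 0.5, Lisbon: 0.3, Rome: 0.2' into a dict."""
    dist = {}
    for part in text.split(","):
        name, prob = part.rsplit(":", 1)
        dist[name.strip()] = float(prob)
    total = sum(dist.values())
    return {k: v / total for k, v in dist.items()}  # renormalize

def reward(generated, reference):
    """1 minus total variation distance: equals 1.0 iff the
    distributions match exactly."""
    keys = set(generated) | set(reference)
    tv = 0.5 * sum(abs(generated.get(k, 0.0) - reference.get(k, 0.0)) for k in keys)
    return 1.0 - tv

gen = parse_answer_distribution("Barcelona: 0.5, Lisbon: 0.3, Rome: 0.2")
ref = {"Barcelona": 0.4, "Lisbon": 0.4, "Rome": 0.2}
```

Rewarding the whole stated distribution, rather than a single sampled answer, removes the incentive to collapse onto one mode.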
Jacob Andreas @jacobandreas
@jiaxinwen22 @PresItamar This actually reminds me of experiments @StephenLCasper did back in the day on full fine-tuning with a CCS-style loss, which I suppose would be the direct mapping onto the proposed objective here. Results were mixed, but I'm not sure if anyone has revisited this with modern LMs.
4 replies · 0 reposts · 1 like · 133 views
Jacob Andreas reposted
Itamar Pres @PresItamar
New paper: It's time to optimize for 🔁self-consistency 🔁 We’ve pushed LLMs to the limits of available data, yet failures like sycophancy and factual inconsistency persist. We argue these stem from the same assumption: that behavior can be specified one I/O pair at a time. 🧵
16 replies · 56 reposts · 427 likes · 73.9K views
Jiaxin Wen @jiaxinwen22
@PresItamar It seems the paper only cites a few prior self-consistency papers? I initially thought you'd even forgotten to cite the deductive closure training paper from Jacob, but I found it later, lol.
1 reply · 0 reposts · 2 likes · 518 views
Jacob Andreas reposted
Samuel Marks @saprmarks
This is a really lovely position piece, laying out a unified framework for using self-consistency as a training objective! Intuition pump: I sometimes need to decide whether to trust people who are smarter than me. One way I do this is by judging their self-consistency.
Quoting Itamar Pres @PresItamar's self-consistency announcement (above).
4 replies · 8 reposts · 67 likes · 8.8K views
Jacob Andreas reposted
Belinda Li @belindazli
New blog post on introspection for interpretability, and why I think training models to self-explain is a promising frontier for interpretability research:
8 replies · 37 reposts · 240 likes · 21.2K views
Jacob Andreas reposted
Athul Paul Jacob @apjacob03
Percepta is hiring for research (RL, modeling, OR), engineering, and product roles in Europe and on the US East Coast. A few of us are at NeurIPS this week. Reach out to me, @EugeneVinitsky, @ChristosTzamos, @justintchiu, or @zzzzgq to learn more!
6 replies · 10 reposts · 54 likes · 17.8K views
Jacob Andreas reposted
Pratyusha Sharma @pratyusha_PS
📢 Some big (& slightly belated) life updates!
1. I defended my PhD at MIT this summer! 🎓
2. I'm joining NYU as an Assistant Professor starting Fall 2026, with a joint appointment in Courant CS and the Center for Data Science. 🎉
🔬 My lab will focus on empirically studying the science of deep learning and applying deep learning to accelerate the natural sciences. Very broadly interested in questions at the intersection of language, reasoning, and sequential decision making. (Plus any other fun problems that catch our eye along the way!)
🚀 I am recruiting 2 PhD students for this cycle! If you're interested in joining, please apply here: cs.nyu.edu/dynamic/phd/ad… cds.nyu.edu/phd-admissions…
99 replies · 95 reposts · 1.8K likes · 244.3K views
Jacob Andreas reposted
John Hewitt @johnhewtt
Come do a PhD with me at Columbia! My lab tackles basic problems in alignment, interpretability, safety, and capabilities of language systems. If you love adventuring in model internals and behaviors, to understand and improve them, let's do it together! (pic: a run in Central Park)
13 replies · 128 reposts · 948 likes · 78.8K views
Jacob Andreas reposted
Siva Reddy @sivareddyg
Jacob Andreas (@jacobandreas) on "the specification problem":
Can we build interactive systems for task specification? Use the LM as an interviewer about the task, then use the interview transcript itself as the task prompt. This outperforms or is competitive with active learning and user-designed prompting.
Can we do better? Let the LM parametrize the problem with its most important features, then verbalize those features and ask questions. Use the answers as preferences, and optimize for the most informative questions.
The next part of the talk is about scaling this up, and reasoning under uncertainty.
Quoting Siva Reddy @sivareddyg:

Check out the IVADO workshop on Deploying Autonomous Agents: Lessons, Risks and Real-World Impact, happening today through Wednesday in Montreal, with an exciting lineup of speakers. #Agents #LLMs ivado.ca/en/events/2nd-…
3 replies · 13 reposts · 86 likes · 21.7K views
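The "interview transcript as the task prompt" idea from the talk notes can be sketched in a few lines; `build_task_prompt`, the question list, and `answer_fn` are hypothetical names for illustration, not the system described in the talk:

```python
# Minimal sketch of interactive task specification: interview the
# user about the task, then feed the raw transcript to the downstream
# model as its prompt. All names here are placeholders.

def build_task_prompt(task, questions, answer_fn):
    """Collect Q&A about the task and return the transcript as a prompt."""
    transcript = [f"Task: {task}"]
    for q in questions:
        transcript.append(f"Q: {q}")
        transcript.append(f"A: {answer_fn(q)}")
    transcript.append("Now perform the task as specified above.")
    return "\n".join(transcript)

prompt = build_task_prompt(
    "summarize my emails",
    ["How long should each summary be?", "Which emails can be skipped?"],
    lambda q: "one sentence" if "long" in q else "newsletters",
)
```

The transcript makes the user's latent preferences explicit, which is why it can beat a prompt the user would have written unaided.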