Jacob Andreas

2.8K posts

@jacobandreas

Teaching computers to read. Assoc. prof @MITEECS / @MIT_CSAIL / @NLP_MIT (he/him). https://t.co/5kCnXHjtlY https://t.co/2A3qF5vdJw

Cambridge, MA · Joined March 2007
950 Following · 23K Followers
Pinned Tweet
Jacob Andreas @jacobandreas
👉 New preprint: how do we make LMs more reliable once there's no more training data? Enforcing *consistency* of LM predictions across inputs lets us optimize, without supervision, for factual accuracy & faithful explanation (& gives a unifying view of many existing post-training algorithms)
Itamar Pres @PresItamar

New paper: It's time to optimize for 🔁self-consistency 🔁 We’ve pushed LLMs to the limits of available data, yet failures like sycophancy and factual inconsistency persist. We argue these stem from the same assumption: that behavior can be specified one I/O pair at a time. 🧵

3 replies · 2 reposts · 66 likes · 10.3K views
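The consistency idea in the pinned thread can be made concrete with a toy objective. A minimal sketch, assuming the "inputs" are paraphrases of one question and that disagreement is measured with a symmetric KL divergence; both choices are illustrative, not the paper's actual loss:

```python
# Hypothetical sketch of an unsupervised self-consistency objective:
# given a model's answer distributions on two paraphrases of the same
# question, penalize their disagreement. Names are illustrative.
import math

def kl(p, q, eps=1e-9):
    """KL divergence between two answer distributions (dicts)."""
    keys = set(p) | set(q)
    return sum(p.get(k, 0.0) * math.log((p.get(k, 0.0) + eps) / (q.get(k, 0.0) + eps))
               for k in keys if p.get(k, 0.0) > 0)

def consistency_loss(dist_a, dist_b):
    """Symmetric KL: zero iff the model answers both phrasings alike."""
    return 0.5 * (kl(dist_a, dist_b) + kl(dist_b, dist_a))

# Model answers "Is Lima the capital of Peru?" vs. a paraphrase of it.
p1 = {"yes": 0.9, "no": 0.1}
p2 = {"yes": 0.6, "no": 0.4}
loss = consistency_loss(p1, p2)  # > 0: inconsistent across inputs
```

Minimizing such a loss needs no labels, which is the appeal once supervised data runs out.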
Jacob Andreas reposted
Alana Renda @ICLR26 🇧🇷
Heading to #ICLR2026 (@iclr_conf) 🇧🇷 to present OpenEstimate! As LLMs get deployed in decision-making domains, they're increasingly expected to do subjective probability estimation, drawing on everything they know to form beliefs about unknown quantities. Our paper studies this capability with a leakage-resistant benchmark. This sits at the intersection of a few things I care about: RL in hard-to-verify domains, forecasting, and making LLMs honest about what they don't know. Come find me Saturday 10:30–1 at poster #1716 in Pavilion 3! And if you'd like to grab coffee and chat about any of these, DMs are open!
1 reply · 8 reposts · 40 likes · 4.2K views
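One standard way to score subjective probability estimates once the true outcomes are known is a Brier score. This is a hedged illustration of the kind of capability OpenEstimate probes, not the benchmark's actual metric:

```python
# Sketch of scoring an LLM's stated probabilities against
# later-revealed binary outcomes via a Brier score (illustrative;
# the benchmark's real evaluation may differ).

def brier_score(estimates, outcomes):
    """Mean squared error between stated probabilities and 0/1 outcomes.
    Lower is better; always answering 0.5 scores 0.25."""
    assert len(estimates) == len(outcomes)
    return sum((p - o) ** 2 for p, o in zip(estimates, outcomes)) / len(estimates)

# Three forecasts: confident-right, confident-right, hedged-right.
score = brier_score([0.9, 0.2, 0.7], [1, 0, 1])
```

A well-calibrated, honest model is rewarded for hedging exactly when it is genuinely uncertain.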
Jacob Andreas reposted
Gabe Grand @ ICLR 2026 🇧🇷
Hello Rio! Excited to take the big stage to present our work on Battleship agents + Bayesian Experimental Design: Oral 3A (Agents), Friday 10:30AM in the main amphitheater 👀 After that, come hang out with co-author @ValerPepe and me at poster session P3-#1602 from 3:15-5:45PM!
0 replies · 4 reposts · 22 likes · 1.7K views
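Bayesian experimental design in a Battleship-like setting can be sketched as choosing the probe with maximum expected information gain; for a single yes/no probe this reduces to maximizing the binary entropy of the hit probability. A toy illustration, not the paper's agent:

```python
# Toy Bayesian-experimental-design step for a Battleship-style query:
# probe the cell whose hit/miss outcome is most uncertain, i.e. whose
# binary entropy is largest. Purely illustrative.
import math

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def best_query(hit_probs):
    """hit_probs: cell -> P(hit). The most informative probe is the
    one whose hit probability is closest to 0.5 (max binary entropy)."""
    return max(hit_probs, key=lambda c: entropy([hit_probs[c], 1 - hit_probs[c]]))

cell = best_query({"A1": 0.05, "B2": 0.45, "C3": 0.9})  # → "B2"
```

A near-certain hit (C3) or miss (A1) teaches the agent little; the 45% cell resolves the most uncertainty.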
Jacob Andreas reposted
Project CETI @ProjectCETI
Female sperm whales support one another during the birthing journey—behavior that was long considered unique to humans and a few primates. Read now in @ScienceMagazine: bit.ly/4bIKkeo
1 reply · 21 reposts · 53 likes · 3.3K views
Jacob Andreas reposted
Isha Puri @ ICLR @ishapuri101
Ask ChatGPT several times where's best to go for spring break, and it recommends Barcelona almost every time. This isn't a fluke. RL training rewards one best answer, so the model learns to commit to one mode and repeat it. Meet Multi-Answer RL: a simple RL method that trains LMs to reason through and output a distribution of answers in a single generation. [1/N]
22 replies · 73 reposts · 445 likes · 96.5K views
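The "distribution of answers in a single generation" idea can be illustrated with a toy parser and reward. The output format and the total-variation reward below are assumptions made for illustration, not the actual Multi-Answer RL method:

```python
# Illustrative sketch: parse a generation that states several answers
# with probabilities, then reward it by closeness to a reference
# distribution. Format and reward function are assumed, not the paper's.

def parse_answer_distribution(text):
    """Parse 'Barcelona: 0.5, Lisbon: 0.3, Rome: 0.2' into a dict."""
    dist = {}
    for part in text.split(","):
        name, prob = part.rsplit(":", 1)
        dist[name.strip()] = float(prob)
    total = sum(dist.values())
    return {k: v / total for k, v in dist.items()}  # renormalize

def reward(generated, reference):
    """1 minus total variation distance: equals 1.0 iff the
    distributions match exactly."""
    keys = set(generated) | set(reference)
    tv = 0.5 * sum(abs(generated.get(k, 0.0) - reference.get(k, 0.0)) for k in keys)
    return 1.0 - tv

gen = parse_answer_distribution("Barcelona: 0.5, Lisbon: 0.3, Rome: 0.2")
ref = {"Barcelona": 0.4, "Lisbon": 0.4, "Rome": 0.2}
```

Rewarding the whole stated distribution, rather than a single sampled answer, removes the incentive to collapse onto one mode.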
Jacob Andreas @jacobandreas
@jiaxinwen22 @PresItamar This actually reminds me of experiments @StephenLCasper did back in the day on full fine-tuning with a CCS-style loss, which I suppose would be the direct mapping onto the proposed objective here. Results were mixed, but I'm not sure if anyone has revisited this with modern LMs.
4 replies · 0 reposts · 1 like · 133 views
Jacob Andreas reposted
Itamar Pres @PresItamar
New paper: It's time to optimize for 🔁self-consistency 🔁 We’ve pushed LLMs to the limits of available data, yet failures like sycophancy and factual inconsistency persist. We argue these stem from the same assumption: that behavior can be specified one I/O pair at a time. 🧵
16 replies · 56 reposts · 427 likes · 73.9K views
Jiaxin Wen @jiaxinwen22
@PresItamar It seems the paper only cites a few prior self-consistency papers? I initially thought you'd even forgotten to cite the deductive closure training paper from Jacob, but I found it later, lol.
1 reply · 0 reposts · 2 likes · 518 views
Jacob Andreas reposted
Samuel Marks @saprmarks
This is a really lovely position piece, laying out a unified framework for using self-consistency as a training objective! Intuition pump: I sometimes need to decide whether to trust people who are smarter than me. One way I do this is by judging their self-consistency.
Quoting Itamar Pres @PresItamar's self-consistency announcement (above).
4 replies · 8 reposts · 67 likes · 8.8K views
Jacob Andreas reposted
Belinda Li @belindazli
New blog post on introspection for interpretability, and why I think training models to self-explain is a promising frontier for interpretability research:
8 replies · 37 reposts · 240 likes · 21.2K views
Jacob Andreas reposted
Athul Paul Jacob @apjacob03
Percepta is hiring for research (RL, modeling, OR), engineering, and product roles in Europe and on the US East Coast. A few of us are at NeurIPS this week. Reach out to me, @EugeneVinitsky, @ChristosTzamos, @justintchiu, or @zzzzgq to learn more!
6 replies · 10 reposts · 54 likes · 17.8K views
Jacob Andreas reposted
Pratyusha Sharma @pratyusha_PS
📢 Some big (& slightly belated) life updates!
1. I defended my PhD at MIT this summer! 🎓
2. I'm joining NYU as an Assistant Professor starting Fall 2026, with a joint appointment in Courant CS and the Center for Data Science. 🎉
🔬 My lab will focus on empirically studying the science of deep learning and applying deep learning to accelerate the natural sciences. Very broadly interested in questions at the intersection of language, reasoning, and sequential decision making. (Plus any other fun problems that catch our eye along the way!)
🚀 I am recruiting 2 PhD students for this cycle! If you're interested in joining, please apply here: cs.nyu.edu/dynamic/phd/ad… cds.nyu.edu/phd-admissions…
99 replies · 95 reposts · 1.8K likes · 244.3K views
Jacob Andreas reposted
John Hewitt @johnhewtt
Come do a PhD with me at Columbia! My lab tackles basic problems in alignment, interpretability, safety, and capabilities of language systems. If you love adventuring in model internals and behaviors, to understand and improve them, let's do it together! (pic: a run in Central Park)
13 replies · 128 reposts · 948 likes · 78.8K views
Jacob Andreas reposted
Siva Reddy @sivareddyg
Jacob Andreas (@jacobandreas) on "the specification problem":
Can we build interactive systems for task specification? Use the LM as an interviewer about the task, then use the interview transcript itself as the task prompt. This outperforms or is competitive with active learning and user-designed prompting.
Can we do better? Let the LM parametrize the problem with its most important features, then verbalize those features and ask questions. Use the answers as preferences, and optimize for the most informative questions.
The next part of the talk is about scaling this up, and reasoning under uncertainty.
Quoting Siva Reddy @sivareddyg:

Check out the IVADO workshop on Deploying Autonomous Agents: Lessons, Risks and Real-World Impact, happening today through Wednesday in Montreal, with an exciting lineup of speakers. #Agents #LLMs ivado.ca/en/events/2nd-…
3 replies · 13 reposts · 86 likes · 21.7K views
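The "interview transcript as the task prompt" idea from the talk notes can be sketched in a few lines; `build_task_prompt`, the question list, and `answer_fn` are hypothetical names for illustration, not the system described in the talk:

```python
# Minimal sketch of interactive task specification: interview the
# user about the task, then feed the raw transcript to the downstream
# model as its prompt. All names here are placeholders.

def build_task_prompt(task, questions, answer_fn):
    """Collect Q&A about the task and return the transcript as a prompt."""
    transcript = [f"Task: {task}"]
    for q in questions:
        transcript.append(f"Q: {q}")
        transcript.append(f"A: {answer_fn(q)}")
    transcript.append("Now perform the task as specified above.")
    return "\n".join(transcript)

prompt = build_task_prompt(
    "summarize my emails",
    ["How long should each summary be?", "Which emails can be skipped?"],
    lambda q: "one sentence" if "long" in q else "newsletters",
)
```

The transcript makes the user's latent preferences explicit, which is why it can beat a prompt the user would have written unaided.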