Stephan Rabanser
@steverab

10.1K posts

Postdoctoral Researcher @Princeton. Reliable, safe, trustworthy machine learning. Previously: @UofT @VectorInst @TU_Muenchen @Google @awscloud

Princeton, NJ · Joined April 2010
380 Following · 661 Followers
Pinned Tweet
Stephan Rabanser @steverab
📣 Excited to share my first work @Princeton : 𝗧𝗼𝘄𝗮𝗿𝗱𝘀 𝗮 𝗦𝗰𝗶𝗲𝗻𝗰𝗲 𝗼𝗳 𝗔𝗜 𝗔𝗴𝗲𝗻𝘁 𝗥𝗲𝗹𝗶𝗮𝗯𝗶𝗹𝗶𝘁𝘆 AI agents keep getting more capable. But are they actually reliable? 📄 Paper: arxiv.org/abs/2602.16666 📊 Dashboard: hal.cs.princeton.edu/reliability 🧵👇
Stephan Rabanser @steverab
🧵𝗧𝗵𝗲 𝗰𝗼𝗺𝗺𝗼𝗻 𝘁𝗵𝗿𝗲𝗮𝗱: many models lack metacognition about their own reliability. They frequently don't distinguish ambiguous from unambiguous inputs, don't calibrate confidence to actual uncertainty, and don't adapt their strategy when tools fail. The good news: the failures we observe are often systematic, not completely random. This means we can test for them, track progress, and ultimately build agents that fail more gracefully.
🔗 Full analysis of the failure modes on GAIA: hal.cs.princeton.edu/reliability/be…
📊 More results on our interactive dashboard: hal.cs.princeton.edu/reliability/
📄 Our reliability paper: arxiv.org/abs/2602.16666
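The calibration gap described in the tweet above ("don't calibrate confidence to actual uncertainty") is commonly quantified with expected calibration error (ECE). The sketch below is a generic, minimal implementation of that standard metric, not the paper's exact evaluation; the toy inputs are invented for illustration.

```python
def expected_calibration_error(preds, n_bins=10):
    """preds: list of (confidence, is_correct) pairs.
    Bin predictions by confidence, then average the gap between mean
    confidence and empirical accuracy per bin, weighted by bin size."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in preds:
        idx = min(int(conf * n_bins), n_bins - 1)  # clamp conf == 1.0
        bins[idx].append((conf, ok))
    total = len(preds)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(1.0 for _, ok in b if ok) / len(b)
        ece += (len(b) / total) * abs(avg_conf - accuracy)
    return ece

# Toy data: an agent that says 99% sure but is right 1/4 of the time
# scores far worse than one whose confidence matches its accuracy.
overconfident = [(0.99, False)] * 3 + [(0.99, True)]
calibrated = [(0.9, True)] * 9 + [(0.9, False)]
```

A well-calibrated agent can still be wrong often; ECE only rewards knowing *how* often.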
Stephan Rabanser @steverab
6️⃣𝗩𝗶𝘀𝗶𝗼𝗻 𝘁𝗮𝘀𝗸𝘀 𝗮𝗿𝗲 𝗮 𝗿𝗲𝗹𝗶𝗮𝗯𝗶𝗹𝗶𝘁𝘆 𝘄𝗲𝗮𝗸 𝘀𝗽𝗼𝘁. Tasks requiring image processing and interpretation (e.g., chess boards, text-art layouts, worksheets) produce the most overconfident wrong answers. The agent queries a vision model once, trusts the output wholesale, and submits. There is little cross-checking, verification, or fallback to more structured representations (formal verifiers, simulators, etc.). The agent offloads all cognition to a single VLM call and accepts the result.
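One cheap form of the cross-checking the tweet above finds missing is self-consistency: sample the vision model several times and accept an answer only when a clear majority agrees, abstaining otherwise. The sketch below shows the idea; `query_vlm` is a hypothetical stand-in (stubbed deterministically here), not a real API.

```python
from collections import Counter

def query_vlm(image, question, seed):
    # Hypothetical stand-in for a vision-model call: answers correctly
    # except on one sample, mimicking an occasional hallucination.
    # Replace with a real (temperature-sampled) VLM call in practice.
    return "Qd5" if seed != 3 else "Nf3"

def cross_checked_answer(image, question, n_samples=7, min_agree=0.6):
    """Sample the model several times and return the majority answer
    only when agreement clears the threshold; otherwise abstain."""
    answers = [query_vlm(image, question, seed=i) for i in range(n_samples)]
    best, votes = Counter(answers).most_common(1)[0]
    return best if votes / n_samples >= min_agree else None
```

A single call (e.g., the seed-3 sample) submits the hallucinated move; the majority vote recovers, and low agreement triggers an abstention instead of an overconfident wrong answer.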
Stephan Rabanser @steverab
In our paper "Towards a Science of AI Agent Reliability" we put numbers on the capability-reliability gap. Now we're showing what's behind them! We conducted an extensive analysis of failures on GAIA across Claude Opus 4.5, Gemini 2.5 Pro, and GPT 5.4. Here's what we found ⬇️
Stephan Rabanser Retweeted
Princeton Center for Information Technology Policy
New paper out now: 𝗧𝗼𝘄𝗮𝗿𝗱𝘀 𝗮 𝗦𝗰𝗶𝗲𝗻𝗰𝗲 𝗼𝗳 𝗔𝗜 𝗔𝗴𝗲𝗻𝘁 𝗥𝗲𝗹𝗶𝗮𝗯𝗶𝗹𝗶𝘁𝘆 by @steverab, @sayashk, @PKirgis, @khl53182440, @SaitejaUtpala, @random_walker. Read the breakdown on AI as Normal Technology Blog: normaltech.ai/p/new-paper-to…
Stephan Rabanser @steverab
[quoted tweet: the pinned announcement above]
Stephan Rabanser @steverab
Our recommendations:
1️⃣ Better benchmarks that evaluate reliability, not just accuracy
2️⃣ Model providers should optimize for reliability directly
3️⃣ Reliability metrics should inform deployment governance
4️⃣ Reliability requirements should vary by deployment context
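Recommendation 1️⃣ above implies scoring consistency across repeated runs rather than best-case success. One generic way to contrast the two (shown here as an illustration, not as the paper's definition) is pass@k, the chance that at least one of k runs succeeds, versus pass^k, the chance that all k succeed:

```python
from math import comb

def pass_at_k(n, c, k):
    """Capability-style: P(at least one of k sampled runs succeeds),
    given c observed successes out of n total runs."""
    if n - c < k:
        return 1.0  # too few failures to fill an all-failure sample
    return 1.0 - comb(n - c, k) / comb(n, k)

def pass_pow_k(n, c, k):
    """Reliability-style: P(all k sampled runs succeed)."""
    return comb(c, k) / comb(n, k)

# An agent that succeeds on 8 of 10 runs looks perfect under pass@3
# but clears pass^3 less than half the time.
capability = pass_at_k(10, 8, 3)
reliability = pass_pow_k(10, 8, 3)
```

The gap between the two numbers on the same run log is one concrete way a benchmark can report reliability alongside accuracy.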
Stephan Rabanser Retweeted
Starc @Starc_Institute
#NeurIPS is an intense, fast-moving conference, and it’s always a pity that it only lasts one week. Many thoughtful poster conversations end as soon as the conference does. To keep some of those ideas alive, we recorded a set of poster presentations and are sharing them here.
🎖️🎖️🎖️ Today’s highlight is Gatekeeper: Improving Model Cascades Through Confidence Tuning. arxiv.org/abs/2502.19335
This work looks at how smaller models in a cascade decide when to answer and when to defer to a larger model, and shows that better confidence calibration—via a simple, task-agnostic loss—can significantly improve both deferral behavior and resource usage across vision, language, and multimodal settings.
Thanks to the authors: @steverab, Nathalie Rauschmayr, Achin Kulshrestha, Petra Poklukar, @wittawatj, @seanAugenstein, @ccwang1992, @fedassa
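The routing pattern the tweet above describes can be sketched as a confidence-thresholded two-stage cascade. This is the generic structure only; Gatekeeper's contribution is the loss that tunes the small model's confidence, not this routing logic, and the stand-in models below are invented for illustration.

```python
def cascade(x, small_model, large_model, threshold=0.8):
    """Route the input through the cheap model first; defer to the
    large model only when the small model's confidence falls below
    the threshold. Returns (answer, which_model_answered)."""
    answer, confidence = small_model(x)
    if confidence >= threshold:
        return answer, "small"
    return large_model(x), "large"

# Hypothetical stand-in models: the small model is confident on the
# easy input and uncertain on the hard one.
small = lambda x: ("cat", 0.95) if x == "easy.jpg" else ("dog", 0.40)
large = lambda x: "wolf"
```

With better-calibrated confidences, the small model defers exactly on the inputs it would get wrong, which is where the compute savings come from.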
Stephan Rabanser Retweeted
Tim G. J. Rudner @timrudner
I'm so happy to share that I’ll be joining @UofT as an Assistant Professor of Statistical Sciences and Computer Science, with an appointment at the @VectorInst, in 2026! I'm recruiting postdocs and PhD students: timrudner.com! Please help me spread the word! 🧵(1/5)