Stephan Rabanser
@steverab

10.1K posts

Postdoctoral Researcher @Princeton. Reliable, safe, trustworthy machine learning. Previously: @UofT @VectorInst @TU_Muenchen @Google @awscloud

Princeton, NJ · Joined April 2010
380 Following · 661 Followers
Pinned Tweet
Stephan Rabanser @steverab
📣 Excited to share my first work @Princeton : 𝗧𝗼𝘄𝗮𝗿𝗱𝘀 𝗮 𝗦𝗰𝗶𝗲𝗻𝗰𝗲 𝗼𝗳 𝗔𝗜 𝗔𝗴𝗲𝗻𝘁 𝗥𝗲𝗹𝗶𝗮𝗯𝗶𝗹𝗶𝘁𝘆 AI agents keep getting more capable. But are they actually reliable? 📄 Paper: arxiv.org/abs/2602.16666 📊 Dashboard: hal.cs.princeton.edu/reliability 🧵👇
Stephan Rabanser @steverab
🧵𝗧𝗵𝗲 𝗰𝗼𝗺𝗺𝗼𝗻 𝘁𝗵𝗿𝗲𝗮𝗱: many models lack metacognition about their own reliability. They frequently don't distinguish ambiguous from unambiguous inputs, don't calibrate confidence to actual uncertainty, and don't adapt their strategy when tools fail. The good news: the failures we observe are often systematic, not completely random. This means we can test for them, track progress, and ultimately build agents that fail more gracefully.
🔗 Full analysis of the failure modes on GAIA: hal.cs.princeton.edu/reliability/be…
📊 More results on our interactive dashboard: hal.cs.princeton.edu/reliability/
📄 Our reliability paper: arxiv.org/abs/2602.16666
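The calibration gap described in the tweet above ("don't calibrate confidence to actual uncertainty") is commonly quantified with expected calibration error (ECE). The sketch below is a generic, minimal implementation of that standard metric, not the paper's exact evaluation; the toy inputs are invented for illustration.

```python
def expected_calibration_error(preds, n_bins=10):
    """preds: list of (confidence, is_correct) pairs.
    Bin predictions by confidence, then average the gap between mean
    confidence and empirical accuracy per bin, weighted by bin size."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in preds:
        idx = min(int(conf * n_bins), n_bins - 1)  # clamp conf == 1.0
        bins[idx].append((conf, ok))
    total = len(preds)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(1.0 for _, ok in b if ok) / len(b)
        ece += (len(b) / total) * abs(avg_conf - accuracy)
    return ece

# Toy data: an agent that says 99% sure but is right 1/4 of the time
# scores far worse than one whose confidence matches its accuracy.
overconfident = [(0.99, False)] * 3 + [(0.99, True)]
calibrated = [(0.9, True)] * 9 + [(0.9, False)]
```

A well-calibrated agent can still be wrong often; ECE only rewards knowing *how* often.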
Stephan Rabanser @steverab
6️⃣𝗩𝗶𝘀𝗶𝗼𝗻 𝘁𝗮𝘀𝗸𝘀 𝗮𝗿𝗲 𝗮 𝗿𝗲𝗹𝗶𝗮𝗯𝗶𝗹𝗶𝘁𝘆 𝘄𝗲𝗮𝗸 𝘀𝗽𝗼𝘁. Tasks requiring image processing and interpretation (e.g., chess boards, text-art layouts, worksheets) produce the most overconfident wrong answers. The agent queries a vision model once, trusts the output wholesale, and submits. There is little cross-checking, verification, or fallback to more structured representations (formal verifiers, simulators, etc.). The agent offloads all cognition to a single VLM call and accepts the result.
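One cheap form of the cross-checking the tweet above finds missing is self-consistency: sample the vision model several times and accept an answer only when a clear majority agrees, abstaining otherwise. The sketch below shows the idea; `query_vlm` is a hypothetical stand-in (stubbed deterministically here), not a real API.

```python
from collections import Counter

def query_vlm(image, question, seed):
    # Hypothetical stand-in for a vision-model call: answers correctly
    # except on one sample, mimicking an occasional hallucination.
    # Replace with a real (temperature-sampled) VLM call in practice.
    return "Qd5" if seed != 3 else "Nf3"

def cross_checked_answer(image, question, n_samples=7, min_agree=0.6):
    """Sample the model several times and return the majority answer
    only when agreement clears the threshold; otherwise abstain."""
    answers = [query_vlm(image, question, seed=i) for i in range(n_samples)]
    best, votes = Counter(answers).most_common(1)[0]
    return best if votes / n_samples >= min_agree else None
```

A single call (e.g., the seed-3 sample) submits the hallucinated move; the majority vote recovers, and low agreement triggers an abstention instead of an overconfident wrong answer.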
Stephan Rabanser @steverab
In our paper "Towards a Science of AI Agent Reliability" we put numbers on the capability-reliability gap. Now we're showing what's behind them! We conducted an extensive analysis of failures on GAIA across Claude Opus 4.5, Gemini 2.5 Pro, and GPT 5.4. Here's what we found ⬇️
Stephan Rabanser Retweeted
Princeton Center for Information Technology Policy
New paper out now: 𝗧𝗼𝘄𝗮𝗿𝗱𝘀 𝗮 𝗦𝗰𝗶𝗲𝗻𝗰𝗲 𝗼𝗳 𝗔𝗜 𝗔𝗴𝗲𝗻𝘁 𝗥𝗲𝗹𝗶𝗮𝗯𝗶𝗹𝗶𝘁𝘆 by @steverab, @sayashk, @PKirgis, @khl53182440, @SaitejaUtpala, @random_walker. Read the breakdown on AI as Normal Technology Blog: normaltech.ai/p/new-paper-to…
Stephan Rabanser @steverab
[quoted tweet: the pinned announcement above]
Stephan Rabanser @steverab
Our recommendations:
1️⃣ Better benchmarks that evaluate reliability, not just accuracy
2️⃣ Model providers should optimize for reliability directly
3️⃣ Reliability metrics should inform deployment governance
4️⃣ Reliability requirements should vary by deployment context
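Recommendation 1️⃣ above implies scoring consistency across repeated runs rather than best-case success. One generic way to contrast the two (shown here as an illustration, not as the paper's definition) is pass@k, the chance that at least one of k runs succeeds, versus pass^k, the chance that all k succeed:

```python
from math import comb

def pass_at_k(n, c, k):
    """Capability-style: P(at least one of k sampled runs succeeds),
    given c observed successes out of n total runs."""
    if n - c < k:
        return 1.0  # too few failures to fill an all-failure sample
    return 1.0 - comb(n - c, k) / comb(n, k)

def pass_pow_k(n, c, k):
    """Reliability-style: P(all k sampled runs succeed)."""
    return comb(c, k) / comb(n, k)

# An agent that succeeds on 8 of 10 runs looks perfect under pass@3
# but clears pass^3 less than half the time.
capability = pass_at_k(10, 8, 3)
reliability = pass_pow_k(10, 8, 3)
```

The gap between the two numbers on the same run log is one concrete way a benchmark can report reliability alongside accuracy.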
Stephan Rabanser Retweeted
Starc @Starc_Institute
#NeurIPS is an intense, fast-moving conference, and it’s always a pity that it only lasts one week. Many thoughtful poster conversations end as soon as the conference does. To keep some of those ideas alive, we recorded a set of poster presentations and are sharing them here.
🎖️🎖️🎖️ Today’s highlight is Gatekeeper: Improving Model Cascades Through Confidence Tuning. arxiv.org/abs/2502.19335
This work looks at how smaller models in a cascade decide when to answer and when to defer to a larger model, and shows that better confidence calibration—via a simple, task-agnostic loss—can significantly improve both deferral behavior and resource usage across vision, language, and multimodal settings.
Thanks to the authors: @steverab, Nathalie Rauschmayr, Achin Kulshrestha, Petra Poklukar, @wittawatj, @seanAugenstein, @ccwang1992, @fedassa
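The routing pattern the tweet above describes can be sketched as a confidence-thresholded two-stage cascade. This is the generic structure only; Gatekeeper's contribution is the loss that tunes the small model's confidence, not this routing logic, and the stand-in models below are invented for illustration.

```python
def cascade(x, small_model, large_model, threshold=0.8):
    """Route the input through the cheap model first; defer to the
    large model only when the small model's confidence falls below
    the threshold. Returns (answer, which_model_answered)."""
    answer, confidence = small_model(x)
    if confidence >= threshold:
        return answer, "small"
    return large_model(x), "large"

# Hypothetical stand-in models: the small model is confident on the
# easy input and uncertain on the hard one.
small = lambda x: ("cat", 0.95) if x == "easy.jpg" else ("dog", 0.40)
large = lambda x: "wolf"
```

With better-calibrated confidences, the small model defers exactly on the inputs it would get wrong, which is where the compute savings come from.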
Stephan Rabanser Retweeted
Tim G. J. Rudner @timrudner
I'm so happy to share that I’ll be joining @UofT as an Assistant Professor of Statistical Sciences and Computer Science, with an appointment at the @VectorInst, in 2026! I'm recruiting postdocs and PhD students: timrudner.com! Please help me spread the word! 🧵(1/5)