Alana Renda @ICLR26 🇧🇷 (@alanamarzoev) - Twitter profile

Starting in 30 min! If you’re interested in deploying LLMs in decision making domains + reasoning under uncertainty come chat with me and @JillianRossA_

Alana Renda @ICLR26 🇧🇷@alanamarzoev

Heading to #ICLR2026 (@iclr_conf) 🇧🇷 to present OpenEstimate! As LLMs get deployed in decision-making domains, they're increasingly expected to do subjective probability estimation, drawing on everything they know to form beliefs about unknown quantities. Our paper studies this capability with a leakage-resistant benchmark. This sits at the intersection of a few things I care about: RL in hard-to-verify domains, forecasting, and making LLMs honest about what they don't know. Come find me Saturday 10:30–1 at poster #1716 in Pavilion 3! And if you'd like to grab coffee and chat about any of these, DMs are open!

Alana Renda @ICLR26 🇧🇷 retweeted

Laura Ruis@LauraRuis·1d

One piece of code can replace 100 chains of thought when training LLMs. Come chat with us tomorrow at the afternoon poster session #iclr2026 P3 poster #507 🕺

Jonny Cook@JonnyCoook

Humans augment learning from experience and demonstrations with learning from general instructions, rules, and descriptions of behaviour. PBB enables LLMs to do the same, unlocking a highly sample efficient form of learning. Excited to share this work in Brazil! 🇧🇷

Alana Renda @ICLR26 🇧🇷 retweeted

Jillian Ross @ICLR26@JillianRossA_·2d

On my way to #ICLR2026 to present OpenEstimate with @alanamarzoev and give a spotlight talk at the FINAI Workshop. Over the past few years, @AndrewWLo and I have been studying whether LLMs can be trusted to give sound investment advice. In my talk, I'll show that LLMs demonstrate heuristic collapse: rather than weighing all relevant factors, they latch onto a few salient features and ignore the rest. Heuristic collapse has direct consequences for whether LLMs can meet the legal standard of a fiduciary — and for AI advisors more broadly. This is one of many reasons I think investing is one of the best domains for studying LLMs. Through this domain, I've been able to study LLM reasoning, human-LLM interaction, and emergent systemic effects. If you're working on any of these topics, I'd love to meet. Come find me before or after the talk on Monday at 1:35PM!

Alana Renda @ICLR26 🇧🇷@alanamarzoev·2d

Link to full paper: arxiv.org/abs/2510.15096

Alana Renda @ICLR26 🇧🇷@alanamarzoev·2d

Heading to #ICLR2026 (@iclr_conf) 🇧🇷 to present OpenEstimate! As LLMs get deployed in decision-making domains, they're increasingly expected to do subjective probability estimation, drawing on everything they know to form beliefs about unknown quantities. Our paper studies this capability with a leakage-resistant benchmark. This sits at the intersection of a few things I care about: RL in hard-to-verify domains, forecasting, and making LLMs honest about what they don't know. Come find me Saturday 10:30–1 at poster #1716 in Pavilion 3! And if you'd like to grab coffee and chat about any of these, DMs are open!

Alana Renda @ICLR26 🇧🇷 retweeted

Gabe Grand @ ICLR 2026 🇧🇷@gabe_grand·Oct 27

Do AI agents ask good questions? We built “Collaborative Battleship” to find out—and discovered that weaker LMs + Bayesian inference can beat GPT-5 at 1% of the cost. Paper, code & demos: gabegrand.github.io/battleship Here's what we learned about building rational information-seeking agents... 🧵🔽

Alana Renda @ICLR26 🇧🇷 retweeted

Jacob Andreas@jacobandreas·Oct 23

👉 New preprint! We have lots of great benchmarks for tasks where it's possible, in principle, for models to get all the answers exactly correct. But what about tasks that *intrinsically* require reasoning about uncertain facts and quantities?

Alana Renda @ICLR26 🇧🇷@alanamarzoev

🚨 New paper up on how LLMs reason under uncertainty! 🎲 Many real world uses of LLMs are characterized by the unknown—not only are the models prompted with partial information, but often even humans don't know the "right answer" to the questions asked. Yet most LLM evals focus on problems with clearly defined success criteria. There’s a gap in our understanding of how models perform in this setting. We investigate.... 🔎

Alana Renda @ICLR26 🇧🇷@alanamarzoev·Oct 22

This was joint work with @JillianRossA_ @MikeCafarella @jacobandreas We’ve open sourced our benchmark OpenEstimate to drive research and progress in this space. Stay tuned for more! 📝 Paper: arxiv.org/abs/2510.15096 ⚙️ Source code: github.com/alanarenda/ope…

Alana Renda @ICLR26 🇧🇷@alanamarzoev·Oct 22

All of that’s to say… There's a lot of room for improvement! And we’re starting to see some action– maybe new RL methods like RLCR from @MehulDamani2, @ishapuri101 could make things better 👀 x.com/MehulDamani2/s…

Mehul Damani @ICLR@MehulDamani2

🚨New Paper!🚨 We trained reasoning LLMs to reason about what they don't know. o1-style reasoning training improves accuracy but produces overconfident models that hallucinate more. Meet RLCR: a simple RL method that trains LLMs to reason and reflect on their uncertainty -- improving both accuracy ✅ and calibration 🎯. [1/N]

Alana Renda @ICLR26 🇧🇷@alanamarzoev·Oct 22

🚨 New paper up on how LLMs reason under uncertainty! 🎲 Many real world uses of LLMs are characterized by the unknown—not only are the models prompted with partial information, but often even humans don't know the "right answer" to the questions asked. Yet most LLM evals focus on problems with clearly defined success criteria. There’s a gap in our understanding of how models perform in this setting. We investigate.... 🔎

Alana Renda @ICLR26 🇧🇷@alanamarzoev·Oct 9

Dr. GRPO paper was presented at @COLM_conf today, and it's a great read: arxiv.org/pdf/2503.20783 If I had a nickel for every time someone found a bug in a core ML algorithm, I would have at least two nickels

Alana Renda @ICLR26 🇧🇷@alanamarzoev·Oct 7

Bonjour from Montreal 🇨🇦 spending the next few days here @ COLM! DM me if you’re around and want to chat about research or non-research topics, including but not limited to: reasoning under uncertainty, forecasting, summarization/RAG, and startups

Alana Renda @ICLR26 🇧🇷 retweeted

Alex Renda@alex_renda_·Oct 6

✈️ 🦙 Heading to COLM through Thursday! We’re hiring ML researchers at Jane Street for intern and full time roles, as well as supporting grad students through our fellowship program — DM me or stop by the JS booth if you want to chat about what we’re doing with ML @ JS!

Alana Renda @ICLR26 🇧🇷@alanamarzoev·Feb 13

me, deepresearch, and operator rn

GIF

Alana Renda @ICLR26 🇧🇷@alanamarzoev·Feb 13

after a week of deliberation finally took the leap and upgraded to the ChatGPT pro plan... feels like waking up on Christmas morning 🥲

Alana Renda @ICLR26 🇧🇷 retweeted

Readyset@readysetio·Apr 7

Streaming dataflow provides a unique solution to scaling OLTP applications. Want to learn how? Founder and CEO of Readyset, @alanamarzoev, will be giving a talk on this subject at @qconlondon on Tuesday, April 9th at 10:35AM BST! Learn more: qconlondon.com/presentation/a…

Alana Renda @ICLR26 🇧🇷 retweeted

apuchitnis@apuchitnis·Feb 21

caching can be really helpful to reduce backend load, but cache invalidation is famously one of the hard problems in CS. enter readyset.io - a cache that is **always in sync** with postgres, so you don't need to invalidate stale data 😮
