

Heading to #ICLR2026 (@iclr_conf) 🇧🇷 to present OpenEstimate! As LLMs get deployed in decision-making domains, they're increasingly expected to do subjective probability estimation, drawing on everything they know to form beliefs about unknown quantities. Our paper studies this capability with a leakage-resistant benchmark. This sits at the intersection of a few things I care about: RL in hard-to-verify domains, forecasting, and making LLMs honest about what they don't know. Come find me Saturday 10:30–1 at poster #1716 in Pavilion 3! And if you'd like to grab coffee and chat about any of these, DMs are open!









