

Rheeya Uppaal

@RUppaal
CS PhD @UWMadison, working on safe and transparent #NLProc. Former @AmazonScience, @GoldmanSachs, @UMassAmherst. Climate's friend with @project_wren.







How do you check your favourite VLM’s hallucination rate? Ask it questions about an image and verify the final answer - right? Wrong! Reasoning VLMs introduce a second dimension: the reasoning trace itself. If you only evaluate answers, your results can be deeply misleading. 🤔



How do LLMs build compositions to learn arithmetic? In a synthetic study, we find that models consistently prefer to learn addition rules in reverse order. Check out our paper arxiv.org/pdf/2601.22510 and blog yiqiao-zhong.github.io/jekyll/update/…

