Hritik Bansal

Ashima Suvarna@NeurIPS2025🌻@suvarna_ashima·3d

Congratulations Dr. Bansal!! It has been an honor to stand beside you and watch you become everything you’ve worked for. I am so deeply proud of you, today and always.🌻

Finally defended my Ph.D. thesis! 🥳 A very warm thank you to my family, friends, and advisors — @kaiwei_chang, @adityagrover_, @VioletNPeng, and Hongjing Lu. Next, I will be joining @AnthropicAI as a Member of Technical Staff. My defense slides ⬇️

Hritik Bansal@hbXNov·6d

@adityagrover_ @kaiwei_chang Very grateful for your guidance and for learning so much from you!

Aditya Grover@adityagrover_·6d

Congrats, @hbXNov! Absolute delight to have co-advised Hritik with @kaiwei_chang. Back in 2022, Hritik had the foresight to use synthetic data to advance multimodal AI well before it became mainstream. His extensive body of PhD work since then has been instrumental in building systems that seamlessly reason across modalities.

Hritik Bansal@hbXNov·6d

Slides: docs.google.com/presentation/d…

Hritik Bansal@hbXNov·6d

Finally defended my Ph.D. thesis! 🥳 A very warm thank you to my family, friends, and advisors — @kaiwei_chang, @adityagrover_, @VioletNPeng, and Hongjing Lu. Next, I will be joining @AnthropicAI as a Member of Technical Staff. My defense slides ⬇️

Hritik Bansal retweeted

Lunjun Zhang@LunjunZhang·Feb 26

RL optimizes weights. Evolution optimizes contexts. What if we combine RL and Evolutionary Algorithm (EA) into a new paradigm of LLM self-improvement? In "Evolutionary System Prompt Learning for Reinforcement Learning in LLMs", we show that RL and EA are deeply synergistic.

Hritik Bansal@hbXNov·Feb 21

This paper is accepted to #CVPR2026! Link: arxiv.org/abs/2510.12225

New paper 📢 Most powerful vision-language (VL) reasoning datasets remain proprietary 🔒, hindering efforts to study their principles and develop similarly effective datasets in the open 🔓. Thus, we introduce HoneyBee, a 2.5M-example dataset created through careful data curation. It trains VLM reasoners that outperform InternVL2.5/3-Instruct and Qwen2.5-VL-Instruct across model scales (e.g., an 8% MathVerse improvement over QwenVL at the 3B scale). 🧵👇 Work done during my internship at @AIatMeta w/ 🤝 @ramakanth1729, @Devendr06654102, @scottyih, @gargighosh, @adityagrover_, and @kaiwei_chang.

Hritik Bansal@hbXNov·Feb 18

Code: github.com/Hritikbansal/D…

Hritik Bansal@hbXNov·Feb 18

Lastly, the blog also contains a personal account of how this project came into existence, and my impressions of changing research paradigms: "From Research 'On' AI to Research 'With' AI." Blog: huggingface.co/blog/hbXNov/de…

Hritik Bansal@hbXNov·Feb 18

New blog 📢 Can we extract dense advantages without new annotations or models in GRPO? The answer is YES! 💡Answer correctness splits rollouts into positives and negatives. Just upweight positive tokens which differ significantly from the negative tokens! 🧵👇
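To make the idea concrete, here is a rough Python sketch, not the blog's actual method. It computes the usual group-normalized advantage per rollout (as in GRPO), then spreads it over tokens with a made-up weighting: a token in a correct rollout gets a larger weight if it never appears in any incorrect rollout. The `token_weights` helper, the `boost` value, and the "absent from all negatives" test are all assumptions standing in for whatever notion of "differ significantly" the blog uses.

```python
from statistics import mean, pstdev
from typing import List


def group_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Group-relative advantage: (reward - group mean) / (group std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def token_weights(rollout: List[str], negatives: List[List[str]],
                  boost: float = 2.0) -> List[float]:
    """Hypothetical weighting (an assumption, not the blog's rule):
    boost tokens that do not occur in any negative rollout."""
    seen_in_negatives = {tok for neg in negatives for tok in neg}
    return [boost if tok not in seen_in_negatives else 1.0 for tok in rollout]


# Toy group of four rollouts for one prompt; reward 1.0 means a correct answer.
rollouts = [["x", "=", "4"], ["x", "=", "5"], ["x", "=", "4"], ["x", "=", "7"]]
rewards = [1.0, 0.0, 1.0, 0.0]

adv = group_advantages(rewards)
negatives = [r for r, rew in zip(rollouts, rewards) if rew == 0.0]

# Positives get per-token weights; negatives keep a flat sequence-level advantage.
dense = [
    [a * w for w in token_weights(r, negatives)] if rew > 0 else [a] * len(r)
    for r, rew, a in zip(rollouts, rewards, adv)
]
```

In this toy group, only the token "4" in the correct rollouts is absent from the incorrect ones, so it is the only token whose advantage is scaled up.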

Hritik Bansal@hbXNov·Feb 18

@kaiwei_chang @baharanm Congratulations, great effort!

Kai-Wei Chang@kaiwei_chang·Feb 18

Today, we launched the UCLA DataX Center for AI Technology, co-directed by @baharanm and me. The center is dedicated to advancing the foundations of AI technology and enabling trustworthy real-world AI applications by bringing together researchers across disciplines. At the kickoff event, I had the pleasure of speaking about “Recent Advances in Multimodal Large Language Models and Their Applications,” discussing our recent multimodal LLM research that expands AI capabilities in mathematical reasoning, healthcare, and agentic AI. I also discussed current limitations of these models and potential paths forward. If you’re interested, please see the talk slides below. bit.ly/3ZIfWLD Thank you to everyone who joined us to celebrate this launch, and the panelists, Guy Van den Broeck @guyvdb, @YuchenCui1, @ElisaKreiss, and Karen McKinnon, for the fruitful discussion. We’re excited to grow this center into a hub for collaboration. datax.ucla.edu/news-events/ev…

Hritik Bansal@hbXNov·Feb 8

@Yihe__Deng @WeiWang1973 @kaiwei_chang @baharanm @adityagrover_ congrats!

Yihe Deng@Yihe__Deng·Feb 8

Finished my PhD defense this week! Immensely grateful to my advisor @WeiWang1973 and committee @kaiwei_chang @baharanm @adityagrover_ for their guidance and support over these years 🙏

Hritik Bansal retweeted

Lunjun Zhang@LunjunZhang·Feb 6

New work💡: "EMA Policy Gradient: Taming Reinforcement Learning for LLMs with EMA Anchor and Top-k KL" Two ideas, both minimal, both effective: 🚀 use a target network (EMA) for reference policy 🚀 Top-k KL that works like knowledge distillation but remains unbiased at any k
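For readers unfamiliar with target networks, here is a minimal Python sketch of an exponential-moving-average (EMA) anchor update, assuming parameters are stored as plain lists of floats keyed by name. The `ema_update` function, the dict layout, and the `decay` value are illustrative assumptions, not taken from the paper. The Top-k KL term is not sketched, since the post does not give its form.

```python
from typing import Dict, List


def ema_update(anchor: Dict[str, List[float]],
               policy: Dict[str, List[float]],
               decay: float = 0.99) -> None:
    """In-place EMA step: anchor <- decay * anchor + (1 - decay) * policy."""
    for name, weights in policy.items():
        anchor[name] = [
            decay * a + (1.0 - decay) * p
            for a, p in zip(anchor[name], weights)
        ]


# Toy parameters: the anchor (reference policy) slowly tracks the trained policy.
policy = {"w": [1.0, 2.0]}
anchor = {"w": [0.0, 0.0]}
ema_update(anchor, policy, decay=0.9)
```

With `decay=0.9`, one step moves each anchor weight one tenth of the way toward the corresponding policy weight. The reference therefore drifts with training instead of staying frozen.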

Hritik Bansal@hbXNov·Jan 26

@iclr_conf @clarkipeng @YonatanBitton @kaiwei_chang @adityagrover_ paper: arxiv.org/abs/2503.06800

Hritik Bansal@hbXNov·Jan 26

VideoPhy-2 is accepted at @iclr_conf! great work by the talented undergrad @clarkipeng, my friend @YonatanBitton, Roman, and advisors @kaiwei_chang @adityagrover_ 😇

Video generative models hold the promise of being general-purpose simulators of the physical world 🤖 How far are we from this goal❓ 📢Excited to announce VideoPhy-2, the next edition in the series to test the physical likeness of the generated videos for real-world actions. 🧵
