Hritik Bansal

742 posts

@hbXNov

CS PhD @UCLA | Prev: Bachelors @IITDelhi, Intern @MetaAI FAIR, @GoogleDeepMind @AmazonScience | Multimodal ML, Language models | Cricket🏏

Joined May 2018
2K Following · 2K Followers
Ashima Suvarna@NeurIPS2025🌻
Congratulations Dr. Bansal!! It has been an honor to stand beside you and watch you become everything you’ve worked for. I am so deeply proud of you, today and always.🌻
Hritik Bansal @hbXNov

Finally defended my Ph.D. thesis! 🥳 A very warm thank you to my family, friends, and advisors — @kaiwei_chang, @adityagrover_, @VioletNPeng, and Hongjing Lu. Next, I will be joining @AnthropicAI as a Member of Technical Staff. My defense slides ⬇️

Aditya Grover @adityagrover_
Congrats, @hbXNov! Absolute delight to have co-advised Hritik with @kaiwei_chang. Back in 2022, Hritik had the foresight to use synthetic data to advance multimodal AI well before it became mainstream. His extensive body of PhD work since then has been instrumental in building systems that seamlessly reason across modalities.
Hritik Bansal retweeted
Lunjun Zhang @LunjunZhang
RL optimizes weights. Evolution optimizes contexts. What if we combine RL and Evolutionary Algorithm (EA) into a new paradigm of LLM self-improvement? In "Evolutionary System Prompt Learning for Reinforcement Learning in LLMs", we show that RL and EA are deeply synergistic.
Hritik Bansal @hbXNov
Lastly, the blog also contains a personal account of how this project came into existence, and my impressions of changing research paradigms: "From Research 'On' AI to Research 'With' AI." Blog: huggingface.co/blog/hbXNov/de…
Hritik Bansal @hbXNov
New blog 📢 Can we extract dense advantages without new annotations or models in GRPO? The answer is YES! 💡Answer correctness splits rollouts into positives and negatives. Just upweight positive tokens which differ significantly from the negative tokens! 🧵👇
Kai-Wei Chang @kaiwei_chang
Today, we launched the UCLA DataX Center for AI Technology, co-directed by @baharanm and me. The center is dedicated to advancing the foundations of AI technology and enabling trustworthy real-world AI applications by bringing together researchers across disciplines.

At the kickoff event, I had the pleasure of speaking about “Recent Advances in Multimodal Large Language Models and Their Applications,” discussing our recent multimodal LLM research that expands AI capabilities in mathematical reasoning, healthcare, and agentic AI. I also discussed current limitations of these models and potential paths forward. If you’re interested, please see the talk slides below. bit.ly/3ZIfWLD

Thank you to everyone who joined us to celebrate this launch, and to the panelists, Guy Van den Broeck (@guyvdb), @YuchenCui1, @ElisaKreiss, and Karen McKinnon, for the fruitful discussion. We’re excited to grow this center into a hub for collaboration. datax.ucla.edu/news-events/ev…
Hritik Bansal retweeted
Lunjun Zhang @LunjunZhang
New work💡: "EMA Policy Gradient: Taming Reinforcement Learning for LLMs with EMA Anchor and Top-k KL" Two ideas, both minimal, both effective: 🚀 use a target network (EMA) for reference policy 🚀 Top-k KL that works like knowledge distillation but remains unbiased at any k