Adithya Bhaskar
@AdithyaNLP

80 posts

Third year CS PhD candidate at Princeton University (@princeton_nlp @PrincetonPLI), previously CS undergrad at IIT Bombay

Princeton, NJ · Joined June 2023
498 Following · 462 Followers

Pinned Tweet
Adithya Bhaskar@AdithyaNLP·
Language models that think, chat better. We used longCoT (w/ reward model) for RLHF instead of math, and it just works. Llama-3.1-8B-Instruct + 14K ex beats GPT-4o (!) on chat & creative writing, & even Claude-3.7-Sonnet (thinking) on AlpacaEval2 and WildBench! Read on. 🧵 1/8
Adithya Bhaskar Retweeted
dr. jack morris@jxmnop·
it always disappointed me that such a small subset of mathematical ideas matters for AI. i miss doing real math
Adithya Bhaskar Retweeted
Yinghui He@yinghui_he_·
STAT has been accepted to ICLR 2026! See you in Brazil 🇧🇷 Skill-Targeted Adaptive Training (STAT) is a continual learning method that squeezes out 🚨 7~10% more performance on extensively trained models like Qwen. It constructs a 🧩 Missing-Skill-Profile for each model based on what skills the model lacks in its responses, and adaptively curates post-training data accordingly. Check out our Blog Post 👉 ying-hui-he.github.io/Skill-Targeted… 🔗arXiv: arxiv.org/abs/2510.10023 💻GitHub: github.com/princeton-pli/…
Adithya Bhaskar Retweeted
Xindi Wu@cindy_x_wu·
New #NVIDIA Paper: We introduce Motive, a motion-centric, gradient-based data attribution method that traces which training videos help or hurt video generation. By isolating temporal dynamics from static appearance, Motive identifies which training videos shape motion. 🔗 research.nvidia.com/labs/sil/proje… 1/10
Adithya Bhaskar@AdithyaNLP·
@suchenzang Hey Susan, would love to chat at NeurIPS. Tried DMing you but got a popup telling me I need to be verified to do that!
Susan Zhang@suchenzang·
i'm in san diego this week! dm to say hi irl if you're also around :) also, a throwback to the first NeurIPS i ever attended, and the 2007 paper that won the test of time that year:
Adithya Bhaskar@AdithyaNLP·
@LiuZuxin Hi Zuxin, would love to chat! Can’t DM you as I don’t have premium so commenting instead.
Zuxin Liu@LiuZuxin·
I’ll be at #NeurIPS2025 from Dec 1–6 👋 If you’re around and want to chat about agents, RL, or reasoning models, feel free to ping me and say hi!
Evgenii Nikishin@nikishin_evg·
Visiting San Diego for NeurIPS from Dec 3 till Dec 7. Let's grab coffee!
Adithya Bhaskar@AdithyaNLP·
@VincentMoens @NeurIPSConf Hey Vincent, I would love to chat at NeurIPS. It seems that I can't DM you here without a premium account, so commenting instead.
vmoens@VincentMoens·
I’ll be at @NeurIPSConf in San Diego next week, dm if you want to chat!
Adithya Bhaskar@AdithyaNLP·
@WenhuChen Hi Wenhu, I would love to chat at NeurIPS. It appears that I cannot message you here without a premium account, so commenting instead.
Wenhu Chen@WenhuChen·
I will be attending NeurIPS from Dec 2nd to Dec 4th. Happy to chat about anything related to LLM/Agent/Multimodal research and career decisions! We have 3 spotlight papers and 2 posters at the main conference.
Adithya Bhaskar@AdithyaNLP·
I will be at NeurIPS 2025 from 12/2 to 12/7. These days, I am most interested in bridging mid-training and post-training (of LLMs). Hit me up if you want to chat!
Adithya Bhaskar Retweeted
William Yang@YangWilliam_·
Text-to-image (T2I) models can generate rich supervision for visual learning, but generating subtle distinctions remains challenging. Fine-tuning helps, but too much tuning → overfitting and loss of diversity. How do we preserve fidelity without sacrificing diversity? (1/8)
Adithya Bhaskar Retweeted
Yinghui He@yinghui_he_·
Claude Skills shows performance benefits from leveraging LLM skill catalogs at inference time. Our previous work (linked under thread 5/5) showed the same 6 months ago!

🌟 Our new work, STAT, shows that leveraging skills during training can greatly help too‼️ E.g., Qwen can continue to learn new tricks from Hendrycks MATH, which it had been over-trained on.

🚨 We introduce Skill-Targeted Adaptive Training (STAT), which uses a supervisor model and a skill catalog to construct a 🧩 Missing-Skill-Profile for each student model, and then modifies training to squeeze out >=7% more performance! The intervention can be as simple as reweighting existing training sets. You can also think of this as a more effective distillation method.

More in threads 🧵
📎 [arxiv]: arxiv.org/abs/2510.10023
💻 [github]: github.com/princeton-pli/…
🥳 Amazing collaborators: @Abhishek_034, @Yong18850571, @prfsanjeevarora
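The "reweighting existing training sets" intervention the thread mentions can be sketched in a few lines. This is an illustrative toy, not the paper's actual procedure: the skill names, the miss-rate profile, and the additive weighting rule are all assumptions for demonstration.

```python
from collections import Counter

def missing_skill_profile(labeled_responses):
    """Estimate how often each skill is missing when it is required.
    `labeled_responses` is a list of (skills_required, skills_missing)
    pairs, as a supervisor model might label a student model's outputs."""
    required_counts = Counter()
    missing_counts = Counter()
    for required, missing in labeled_responses:
        required_counts.update(required)
        missing_counts.update(missing)
    return {s: missing_counts[s] / required_counts[s] for s in required_counts}

def reweight(dataset, profile, base=1.0):
    """Upweight training examples that exercise the skills the model
    lacks: weight = base + sum of miss rates over the example's skills."""
    return [(example, base + sum(profile.get(s, 0.0) for s in skills))
            for example, skills in dataset]

# Toy labels: the model misses "casework" whenever it's needed,
# but never misses "algebra".
labels = [({"algebra", "casework"}, {"casework"}),
          ({"algebra"}, set()),
          ({"casework"}, {"casework"})]
profile = missing_skill_profile(labels)   # {"algebra": 0.0, "casework": 1.0}
weighted = reweight([("prob-1", {"casework"}),
                     ("prob-2", {"algebra"})], profile)
# "prob-1" (casework) gets weight 2.0; "prob-2" (algebra) stays at 1.0.
```

The same profile could instead drive data selection or synthesis; reweighting is just the simplest intervention the thread names.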
Adithya Bhaskar@AdithyaNLP·
@xiye_nlp and I have been using tinker to run some experiments for our recent paper (go check it out!), and we can attest that it is really convenient!
- Don't have to worry about moving stuff to devices, OOMs, etc.
- Great conceptual modularity
- Great throughput!
Thinking Machines@thinkymachines

Introducing Tinker: a flexible API for fine-tuning language models. Write training loops in Python on your laptop; we'll run them on distributed GPUs. Private beta starts today. We can't wait to see what researchers and developers build with cutting-edge open models! thinkingmachines.ai/tinker

Adithya Bhaskar Retweeted
Xi Ye@xiye_nlp·
Check out our new work on making reasoning models think broadly! 🤔 We find a minimalist, surprisingly effective recipe to THINK for CHAT: RLVR + a strong reward model, trained on real-world prompts. This project was fun and surprised me in a few ways 👇

📌 We can run RL directly on a base model (no SFT), showing base models might already chat well. Llama-3.1-8B-Base with only 7K prompts ends up chatting well, matching Llama-3.1-8B-Instruct. This is interesting since Instruct was trained with a complex multi-stage pipeline. Also nice to see this working on Llama, while most RLVR papers only show success on Qwen.

📌 Interesting findings about rewards. Leaderboard scores of reward models aren't always the best indicator of downstream performance. We also tested checklist-based rewards, which help on synthetic instruction-following tasks (IFEval) but didn't generalize well to chat. I still believe in this direction, and would love to see more open-source efforts.

📌 Real user prompts (shout out to WildChat @wzhao_nlp) were the most effective. These prompts often require "thinking before answering," which makes them fit for teaching models general thinking. The recipe is simple; we need good ingredients to cook better.

📌 Algorithms, like GRPO vs PPO, have a bigger impact when training directly from base models, but once warm-started with SFT, models are less sensitive to the choice.

Overall, my feeling is: if we start with a strong base LM and put it in the right "chat environment" (good prompts + good rewards), simple RL training goes a long way. Thus we are quite excited to explore more on pretraining and reward design!
Adithya Bhaskar@AdithyaNLP

Language models that think, chat better. We used longCoT (w/ reward model) for RLHF instead of math, and it just works. Llama-3.1-8B-Instruct + 14K ex beats GPT-4o (!) on chat & creative writing, & even Claude-3.7-Sonnet (thinking) on AlpacaEval2 and WildBench! Read on. 🧵 1/8

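The GRPO-vs-PPO point in the thread hinges on GRPO needing no learned value function: each sampled response is scored relative to its own group. A minimal sketch of that group-relative advantage computation (the reward-model scores below are made up for illustration):

```python
def grpo_advantages(rewards):
    """Group-relative advantages, the core of GRPO's credit assignment:
    each of a prompt's sampled responses is normalized against the mean
    and standard deviation of its own group's reward scores, so no
    separate value network is needed (unlike PPO)."""
    n = len(rewards)
    mean = sum(rewards) / n
    std = (sum((r - mean) ** 2 for r in rewards) / n) ** 0.5
    if std == 0.0:  # all samples tied: no learning signal for this group
        return [0.0] * n
    return [(r - mean) / std for r in rewards]

# Four sampled chat responses scored by a (hypothetical) reward model:
advantages = grpo_advantages([1.0, 0.0, 0.5, 0.5])
# The best response gets a positive advantage, the worst a negative one,
# and the advantages sum to zero.
```

In the full algorithm these advantages weight the policy-gradient update per token; the sketch stops at the scoring step, which is the part that differs from PPO.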
Adithya Bhaskar Retweeted
DAIR.AI@dair_ai·
Top AI Papers of The Week (September 22-28):
- ATOKEN
- LLM-JEPA
- Code World Model
- Teaching LLMs to Plan
- Agents Research Environments
- Language Models that Think, Chat Better
- Embodied AI: From LLMs to World Models
Read on for more: