Syeda Nahida Akter

241 posts

@__SyedaAkter

PhD student at @LTIatCMU @SCSatCMU and research intern @NVIDIA. Working on improving Reasoning of Generative Models! (@reasyaay.bsky.social)

Pittsburgh, PA · Joined May 2020
543 Following · 608 Followers
Pinned Tweet
Syeda Nahida Akter @__SyedaAkter
Most LLMs learn to think only after pretraining—via SFT or RL. But what if they could learn to think during it? 🤔 Introducing RLP: Reinforcement Learning Pre-training—a verifier-free objective that teaches models to “think before predicting.” 🔥 Result: Massive reasoning boosts & gains that COMPOUND after post-training! 📝 Blog: research.nvidia.com/labs/adlr/RLP 🔗Paper: github.com/NVlabs/RLP/blo… 🧵↓
8
42
257
20.3K
Syeda Nahida Akter retweeted
Prithviraj (Raj) Ammanabrolu @rajammanabrolu
Ever wished we had fewer X-training hyphenates? Pre, mid, post etc. Why not just Training?
Trying to bridge the divides (and get all our friends into one team again), we intro *Introspective X Training*, an offline RL inspired method that scales effectively across any LLM stage by annotating your data with a thinking reward generated language critique!
Up to 2.8x FLOP efficiency + 5-10 point score gains (esp with math and code) at any stage from scratch to 24T tokens on 8b (active) sized models!! We burned much compute ablating so you wouldn't have to.
Moral of the story is ‼️don't throw out any data via filtering, just feedback condition it‼️
You can spend FLOPs up front on inference to *classify* data quality and then train so that tokens aren't all treated equally based on the feedback starting early in training itself. Right now they're really only separated out much later during mid/post training.
This improves overall compute efficiency and gives us benchmark perf not possible with just baseline methods!
Paper here: arxiv.org/abs/2605.20285
Thanks to @BrandoCui and @GXiming for leading this w/ @__SyedaAkter @davidjesusacu @hyunw_kim @jaehunjung_com Yuxiao Qu @shrimai_ @YejinChoinka
1
17
89
16K
Syeda Nahida Akter @__SyedaAkter
Excited to present Nemotron-CrossThink @eaclmeeting 🇲🇦🚀 We extend RL beyond math while achieving accurate answers with significantly fewer tokens! 📍 Salle Le Riad 🗓 26 March | ⏰ 9:00–10:30 AM (Moroccan Time) Session: Reasoning and Self-Learning in LLMs
Syeda Nahida Akter @__SyedaAkter

RL boosts LLM reasoning—but why stop at math & code? 🤔 Meet Nemotron-CrossThink—a method to scale RL-based self-learning across law, physics, social science & more. 🔥Resulting in a model that reasons broadly, adapts dynamically, & uses 28% fewer tokens for correct answers! 🧵↓

0
0
5
288
Syeda Nahida Akter retweeted
Shrimai @shrimai_
Excited to share Nemotron-CrossThink @eaclmeeting 🚀 We move beyond math-only RL by bringing multi-domain reasoning into RL training with scalable, verifiable rewards. Don't miss the oral presentation tomorrow at 9.00am in Salle Le Riad by @__SyedaAkter Link: arxiv.org/pdf/2504.13941
0
1
14
635
Syeda Nahida Akter retweeted
Shrimai @shrimai_
📢 Thrilled to see Nemotron-3 Super out in the world. 🚀 A hybrid MoE model with long-context support and strong reasoning capabilities — designed for scalable agentic AI systems and efficient inference. 🎉 Proud to be part of the team pushing forward open, efficient, and scalable AI systems. @NVIDIAAI @nvidia
2
2
37
1.3K
Syeda Nahida Akter retweeted
Shrimai @shrimai_
Thrilled to share that all three of our papers were accepted to @iclr_conf 🎉
1⃣ RLP: Reinforcement as a Pretraining Objective
2⃣ Front-Loading Reasoning: The Synergy between Pretraining & Post-Training Data
3⃣ Nemotron-CC-Math: A 133B-token High-Quality Math Pretraining Dataset
Together, they explore how data, reasoning, and reinforcement during pretraining shape stronger LLMs.
3
10
80
13.3K
Syeda Nahida Akter retweeted
Shrimai @shrimai_
Excited to be part of the launch of @nvidia Nemotron 3 Nano (30B) 🚀 A hybrid MoE reasoning model with 1M context, SWE-Bench-leading performance, and 1.5–3.3× faster inference. Super and Ultra are coming in the next few months. Open, fast, frontier-level 🔥
1
3
33
2K
Syeda Nahida Akter retweeted
Bryan Catanzaro @ctnzr
Today, @NVIDIA is launching the open Nemotron 3 model family, starting with Nano (30B-3A), which pushes the frontier of accuracy and inference efficiency with a novel hybrid SSM Mixture of Experts architecture. Super and Ultra are coming in the next few months.
41
222
1.2K
505.1K
Syeda Nahida Akter retweeted
Siva Reddy @sivareddyg
Lots of insights in @YejinChoinka's talk on RL training. RIP next-token prediction (NTP) training, and welcome Reinforcement Learning Pretraining (RLP). #COLM2025 No place to even stand in the room.
7
22
291
77.5K
Syeda Nahida Akter retweeted
VentureBeat @VentureBeat
By teaching models to reason during foundational training, RLP aims to reduce logical errors and boost reliability for complex reasoning workflows. venturebeat.com/ai/nvidia-rese…
0
4
8
4.2K
Syeda Nahida Akter retweeted
Shrimai @shrimai_
Thank you @rohanpaul_ai for highlighting our work!💫 Front-Loading Reasoning shows that including reasoning data in pretraining is beneficial, does not lead to overfitting after SFT, & has a latent effect unlocked by SFT! Paper: arxiv.org/abs/2510.03264 Blog: research.nvidia.com/labs/adlr/Syne…
Rohan Paul @rohanpaul_ai

New @nvidia paper shows that teaching reasoning early during pretraining builds abilities that later fine-tuning cannot recover. Doing this early gives a 19% average boost on tough tasks after all post-training. Pretraining is the long first stage where the model learns to predict the next word from lots of text. Supervised fine-tuning is a later stage where it studies step-by-step answers from labeled examples. Reinforcement learning then rewards better answers so the model improves further. Diversity matters most in pretraining, while high quality matters most in supervised fine-tuning, roughly 11% vs 15% gains. Even doubling supervised fine-tuning on a base that skipped early reasoning could not catch up. Adding lots of mixed-quality supervised fine-tuning data even cut math by about 5%. High-quality reasoning added in pretraining looked small at first, then showed up strongly after supervised fine-tuning. Teams should load diverse reasoning into pretraining, use a small high-quality set for supervised fine-tuning, then stabilize with rewards.
Paper: arxiv.org/abs/2510.03264
Paper Title: "Front-Loading Reasoning: The Synergy between Pretraining and Post-Training Data"

0
2
9
1.1K
Syeda Nahida Akter @__SyedaAkter
Thank you @rohanpaul_ai for sharing our work! In "Front-Loading Reasoning", we show that injecting reasoning data into pretraining builds models that reach the frontier. On average, +22% (pretraining) → +91% (SFT) → +49% (RL) relative gains. 🚀 🔗Paper: arxiv.org/pdf/2510.03264 📝 Blog: research.nvidia.com/labs/adlr/Syne…
Rohan Paul @rohanpaul_ai

New @nvidia paper shows that teaching reasoning early during pretraining builds abilities that later fine-tuning cannot recover. Doing this early gives a 19% average boost on tough tasks after all post-training. Pretraining is the long first stage where the model learns to predict the next word from lots of text. Supervised fine-tuning is a later stage where it studies step-by-step answers from labeled examples. Reinforcement learning then rewards better answers so the model improves further. Diversity matters most in pretraining, while high quality matters most in supervised fine-tuning, roughly 11% vs 15% gains. Even doubling supervised fine-tuning on a base that skipped early reasoning could not catch up. Adding lots of mixed-quality supervised fine-tuning data even cut math by about 5%. High-quality reasoning added in pretraining looked small at first, then showed up strongly after supervised fine-tuning. Teams should load diverse reasoning into pretraining, use a small high-quality set for supervised fine-tuning, then stabilize with rewards.
Paper: arxiv.org/abs/2510.03264
Paper Title: "Front-Loading Reasoning: The Synergy between Pretraining and Post-Training Data"

0
0
13
1.1K
Syeda Nahida Akter retweeted
AK @_akhaliq
Nvidia presents RLP: Reinforcement as a Pretraining Objective
3
21
94
42.8K
Syeda Nahida Akter retweeted
Shrimai @shrimai_
When should LLMs learn to reason: early in pretraining or late in fine-tuning?🤔 Front-Loading Reasoning shows that injecting reasoning data early creates durable, compounding gains that post-training alone cannot recover. Paper: tinyurl.com/3tzkemtp Blog: research.nvidia.com/labs/adlr/Syne…
4
12
47
3.8K
Syeda Nahida Akter @__SyedaAkter
Our work provides a principled guide for training reasoning-centric LLMs:
➣ Don't wait: Inject reasoning data into pretraining.
➣ Be strategic: Use DIVERSE data for pretraining, emphasize HIGH-QUALITY data for SFT.
➣ Be careful: Avoid polluting your SFT with low-quality data.
This moves us from "more data" to a smarter, phase-aware approach.
1
0
2
226
Syeda Nahida Akter @__SyedaAkter
When should an LLM learn to reason? 🤔 Early in pretraining or late in fine-tuning? Our new work, "Front-Loading Reasoning", challenges the "save it for later" approach. We show that injecting reasoning data into pretraining is critical for building models that reach the frontier. 📝 Blog: research.nvidia.com/labs/adlr/Syne… 🔗Paper: tinyurl.com/3tzkemtp 🧵↓
3
34
144
18.2K