Adi Renduchintala
@rendu_a
463 posts

Applied Research Scientist @NVIDIA; formerly Research Scientist @MetaAI; PhD @jhuclsp. Also lurking on Mastodon: [email protected]

Redwood City, CA · Joined July 2016
763 Following · 577 Followers
Adi Renduchintala retweeted
Wei Ping@_weiping·
🚀 Introducing Nemotron-Cascade! 🚀 We’re thrilled to release Nemotron-Cascade, a family of general-purpose reasoning models trained with cascaded, domain-wise reinforcement learning (Cascade RL), delivering best-in-class performance across a wide range of benchmarks.

💻 Coding powerhouse
After RL, our 14B model:
• Surpasses DeepSeek-R1-0528 (671B) on LiveCodeBench v5/v6/Pro.
• Achieves silver-medal performance at IOI 2025 🥈.
• Reaches 43.1% pass@1 on SWE-Bench Verified, and 53.8% with test-time scaling.

🧠 What is Cascade RL?
Instead of mixing heterogeneous prompts across domains, Cascade RL trains sequentially, domain by domain, which reduces engineering complexity, mitigates heterogeneous verification latencies, and enables domain-specific curricula and tailored hyperparameter tuning.

✨ Key insight
Using RLHF for alignment as a pre-step dramatically boosts complex reasoning, far beyond preference optimization. Subsequent domain-wise RLVR stages rarely hurt the benchmark performance attained in earlier domains and may even improve it, as illustrated in the following figure.

🤗 Models & training data 🔥 👉 huggingface.co/collections/nv…
📄 Technical report with detailed training and data recipes 👉 arxiv.org/pdf/2512.13607
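The sequential, domain-by-domain schedule described above can be sketched in a few lines. This is a hypothetical illustration: the stage names, the toy "model", and the per-domain hyperparameters are all made up for the example, not Nemotron-Cascade's actual training code.

```python
# Hypothetical sketch of Cascade RL's schedule: an RLHF alignment stage
# first, then one RLVR stage per domain, each run to completion with its
# own hyperparameters (instead of mixing prompts across domains).

def run_rl_stage(model, domain, hparams):
    """Stand-in for one full RL stage on a single domain."""
    model["stages"].append((domain, hparams["lr"]))
    return model

def cascade_rl(model, schedule):
    # Sequential, not mixed: each domain finishes before the next starts,
    # which sidesteps heterogeneous verification latencies.
    for domain, hparams in schedule:
        model = run_rl_stage(model, domain, hparams)
    return model

schedule = [
    ("rlhf_alignment", {"lr": 1e-6}),  # alignment as a pre-step
    ("math",           {"lr": 2e-6}),  # then domain-wise RLVR stages,
    ("coding",         {"lr": 1e-6}),  # each with a tailored curriculum
    ("agentic",        {"lr": 5e-7}),  # and learning rate
]
model = cascade_rl({"stages": []}, schedule)
print([domain for domain, _ in model["stages"]])
# → ['rlhf_alignment', 'math', 'coding', 'agentic']
```

The design point the tweet emphasizes is exactly what the loop makes visible: each stage sees one domain's prompts and one set of hyperparameters, rather than a blended batch.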
Adi Renduchintala retweeted
Bryan Catanzaro@ctnzr·
Today, @NVIDIA is launching the open Nemotron 3 model family, starting with Nano (30B-3A), which pushes the frontier of accuracy and inference efficiency with a novel hybrid SSM Mixture of Experts architecture. Super and Ultra are coming in the next few months.
Adi Renduchintala retweeted
Unsloth AI@UnslothAI·
NVIDIA releases Nemotron 3 Nano, a new 30B hybrid reasoning model! 🔥 Nemotron 3 has a 1M context window and best-in-class performance for SWE-Bench, reasoning and chat. Run the MoE model locally with 24GB RAM. Guide: docs.unsloth.ai/models/nemotro… GGUF: huggingface.co/unsloth/Nemotr…
Prithviraj (Raj) Ammanabrolu@rajammanabrolu·
y'all can't just shove all your post training data into pre/mid then call your RL runs "cold start" smh
Adi Renduchintala retweeted
Aayush Karan@aakaran31·
We found a new way to get language models to reason. 🤯 No RL, no training, no verifiers, no prompting. ❌ With better sampling, base models can achieve single-shot reasoning on par with (or better than!) GRPO while avoiding its characteristic loss in generation diversity.
Ramon Astudillo@RamonAstudill12·
Peer review is at risk of disappearing mainly for reasons unrelated to the rise of bureaucrats to power on the orgs that coordinate/control it, but this is definitely making the situation far worse.
Adi Renduchintala retweeted
Oleksii Kuchaiev@kuchaev·
We are excited to release Nvidia-Nemotron-Nano-V2 model! This is a 9B hybrid SSM model with open base model and training data. This model also supports runtime "thinking" budget control. HF collection with base and post trained models: huggingface.co/collections/nv…
Adi Renduchintala@rendu_a·
We have been hard at work on improving hybrid models! Looking forward to seeing how Mamba hybrid models shape the reasoning LLM space. I’m super excited to be a part of this effort.
NVIDIA AI Developer@NVIDIAAIDev

We're excited to share leaderboard-topping 🏆 NVIDIA Nemotron Nano 2, a groundbreaking 9B-parameter open, multilingual reasoning model that's redefining efficiency in AI and earned the leading spot on the @ArtificialAnlys Intelligence Index leaderboard among open models in the same parameter range. It's built on a unique hybrid Transformer-Mamba architecture, a combination that delivers the same accuracy you expect, but with higher throughput. This enables high performance per cost, making it perfect for real-world applications like customer service agents and chatbots.

🏗️ Hybrid Architecture: By combining the strengths of Transformer and Mamba architectures, the model achieves up to 6X faster throughput than other 8B open models, with the highest reasoning accuracy.
🏦 Thinking Budget: Reduces unnecessary token generation to cut costs by up to 60%, making it an ideal solution for balancing performance and total cost of ownership (TCO).
🔢 Open Datasets: The model's training datasets are fully open, giving maximum transparency when using the model for enterprise applications.
🤗 Technical details on @HuggingFace ➡️ nvda.ws/3JfcKST
🏆 Leaderboard ➡️ nvda.ws/47B7iUh
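Read as a stream-processing sketch, a runtime "thinking budget" caps how many reasoning tokens are kept before the model is forced out of its thinking block. The `<think>`/`</think>` markers and the cutoff logic below are assumptions for illustration only, not Nemotron's actual budget-control mechanism:

```python
# Toy "thinking budget": keep at most `budget` tokens inside the
# <think>...</think> block, force-close it once the budget is spent,
# and drop the remaining reasoning tokens. Marker names are assumed.

def apply_thinking_budget(tokens, budget):
    out, in_think, spent, closed = [], False, 0, False
    for tok in tokens:
        if tok == "<think>":
            in_think, spent, closed = True, 0, False
            out.append(tok)
        elif tok == "</think>":
            in_think = False
            if not closed:          # real close, budget never exhausted
                out.append(tok)
        elif in_think and spent >= budget:
            if not closed:
                out.append("</think>")  # force-close at the budget
                closed = True
            # budget spent: discard further reasoning tokens
        else:
            if in_think:
                spent += 1
            out.append(tok)
    return out

stream = ["<think>", "step1", "step2", "step3", "</think>", "The", "answer"]
print(apply_thinking_budget(stream, budget=2))
# → ['<think>', 'step1', 'step2', '</think>', 'The', 'answer']
```

With a budget larger than the reasoning trace, the stream passes through unchanged; the savings come entirely from truncating overlong thinking.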

Adi Renduchintala retweeted
NVIDIA AI Developer@NVIDIAAIDev·
NVIDIA’s Graduate Fellowship Program is now accepting applications for the 2026–2027 academic year. Selected Ph.D. students receive tuition and stipend coverage up to $60K, plus mentorship and technical support from top NVIDIA researchers during an NVIDIA internship. If you’re advancing work in AI, robotics, computer graphics, autonomous vehicles, healthcare, HPC, or related fields — this is your moment. 📅 Apply by Sept. 15, 2025: nvda.ws/3UsbEpe
dr. jack morris@jxmnop·
@Miles_Brundage theoretically, because the update is so low-rank; empirically, because the generations have nothing to do with the training data. e.g. I didn’t train the model to output Harry Potter; somehow it knew that already.
dr. jack morris@jxmnop·
OpenAI hasn’t open-sourced a base model since GPT-2 in 2019. they recently released GPT-OSS, which is reasoning-only... or is it? turns out that underneath the surface, there is still a strong base model. so we extracted it. introducing gpt-oss-20b-base 🧵
tawsif@sleeping4cat·
@abeirami @rendu_a free registration and travel support can go miles.
Ahmad Beirami@abeirami·
Instead of complaining that peer review is dead, take a positive step to improve it today. The reviewers are not aliens, they are us!
- Revise your review and make it clear. Identify the crucial points that impacted your score negatively and positively.
- If the paper is lacking information about its claims, communicate your asks and the reasoning concretely. Don't just ask for 2 more experiments because you feel the authors didn't work hard enough. Don't ask for experiments unless they verify a hypothesis (which you clearly explained).
- Look for the missing information that you identified and make sure it is not already in the paper.
- Recommend acceptance if the paper's claim adds a new nugget of information to the literature (no matter the size of the nugget), and if the paper has substantiated the claim via theoretical / empirical evidence.
Believe me, this doesn't take much time, and will improve the state of peer review significantly!
Adi Renduchintala@rendu_a·
@abeirami How about invited/panel talks from outstanding reviewers! They can talk about their reviewing process and/or highlight their own research.
Adi Renduchintala@rendu_a·
@abeirami +1 to all points made. I’d love to figure out ways to incentivize reviews (almost) as much as writing papers. Money? Credits if they are a student?
Adi Renduchintala retweeted
Oleksii Kuchaiev@kuchaev·
AI model post training is rapidly improving. The plot below (starting from the same base model) illustrates about 10 months of progress in the *open* post-training research. I’m not convinced that closed research can move as fast.
Adi Renduchintala@rendu_a·
Transformers are still dominating the LLM scene, but we show that higher-throughput alternatives exist that are just as strong! Grateful to have played a part in the Nemotron-H Reasoning effort. 🙏 Technical report will be out soon, stay tuned!
NVIDIA AI Developer@NVIDIAAIDev

👀 Nemotron-H tackles large-scale reasoning while maintaining speed -- with 4x the throughput of comparable transformer models.⚡ See how #NVIDIAResearch accomplished this using a hybrid Mamba-Transformer architecture, and model fine-tuning ➡️ nvda.ws/43PMrJm

Adi Renduchintala retweeted
Graham Neubig@gneubig·
Some people have said that OpenAI achieved state of the art results on the SWE-Bench Verified leaderboard with their codex model, but that's actually not quite correct, no matter how you measure it. A quick 🧵
Adi Renduchintala retweeted
Yann LeCun@ylecun·
NSF budgets slashed by 50%, ongoing grants cancelled, NSF staff drastically reduced, all 37 divisions abolished, and grants will now be reviewed by a political kommissar. How will that help technological leadership? linkedin.com/posts/yann-lec…
Adi Renduchintala retweeted
Naomi Saphra@nsaphra·
idk dude I come here and look at my feed, literally everyone on my following feed is subscribed to DOGE and not a single professional scientific researcher has noted that every division at the NSF was just abolished