Jack Lin

41 posts

@jacklin_64

Joined March 2020
152 Following · 114 Followers
Jack Lin reposted
Bryan Catanzaro @ctnzr
Thank you to everyone in the community who is testing and using Nemotron models. It's great to see Nemotron-Cascade-2, Nemotron-3-Super and Nemotron-3-Nano trending on HF. The Nemotron team is working hard to incorporate all your feedback into Nemotron 4. And yes, Nemotron 3 Ultra is still on track for release. huggingface.co/models?pipelin…
Bryan Catanzaro tweet media
20 replies · 39 reposts · 225 likes · 54.8K views
Jack Lin reposted
DailyPapers @HuggingPapers
NVIDIA just released Nemotron-Cascade 2 on Hugging Face: a 30B MoE model with 3B activated parameters that achieves gold medal performance at IMO and IOI 2025.
DailyPapers tweet media
8 replies · 41 reposts · 318 likes · 28.1K views
Jack Lin reposted
Wei Ping @_weiping
🚀 Introducing Nemotron-Cascade 2 🚀
Just 3 months after Nemotron-Cascade 1, we’re releasing Nemotron-Cascade 2: an open 30B MoE with 3B active parameters, delivering best-in-class reasoning and strong agentic capabilities.
🥇 Gold medal-level performance on IMO 2025, IOI 2025, and ICPC World Finals 2025:
• Capabilities once thought achievable only by frontier proprietary models (e.g., Gemini Deep Think) or frontier-scale open models (i.e., DeepSeek-V3.2-Speciale-671B-A37B).
• Remarkably high intelligence density with 20× fewer parameters.
🏆 Best-in-class across math, code reasoning, alignment, and instruction following:
• Outperforms the latest Qwen3.5-35B-A3B (2026-02-24) and even the larger Qwen3.5-122B-A10B (2026-03-11).
🧠 Powered by Cascade RL + multi-domain on-policy distillation:
• Significantly expands Cascade RL across a much broader range of reasoning and agentic domains than Nemotron-Cascade 1, while distilling from the strongest intermediate teacher models throughout training to recover regressions and sustain gains.
🤗 Model + SFT + RL data: 👉 huggingface.co/collections/nv…
📄 Technical report: 👉 research.nvidia.com/labs/nemotron/…
Wei Ping tweet media
41 replies · 143 reposts · 897 likes · 160.9K views
Jack Lin reposted
Yangyi Chen @YangyiChen6666
Super proud to introduce my first work at NVIDIA!! Nemotron-Cascade, our RL scaling efforts to build fully open-source general-purpose reasoning models that achieve SoTA performance on math, coding, and SWE. I am extremely honored to join this small but closely-connected team led by the wonderful @_weiping!
Yangyi Chen tweet media
7 replies · 21 reposts · 128 likes · 7.2K views
Jack Lin @jacklin_64
Check out the first comprehensive study on cascade RL for building general-purpose reasoning models. We also release the training data and strong 8B and 14B general-purpose reasoning models.
Wei Ping @_weiping

🚀 Introducing Nemotron-Cascade! 🚀
We’re thrilled to release Nemotron-Cascade, a family of general-purpose reasoning models trained with cascaded, domain-wise reinforcement learning (Cascade RL), delivering best-in-class performance across a wide range of benchmarks.
💻 Coding powerhouse. After RL, our 14B model:
• Surpasses DeepSeek-R1-0528 (671B) on LiveCodeBench v5/v6/Pro.
• Achieves silver medal performance at IOI 2025 🥈.
• Reaches 43.1% pass@1 on SWE-Bench Verified, and 53.8% with test-time scaling.
🧠 What is Cascade RL? Instead of mixing heterogeneous prompts across domains, Cascade RL trains sequentially, domain by domain, which reduces engineering complexity, mitigates heterogeneous verification latencies, and enables domain-specific curricula and tailored hyperparameter tuning.
✨ Key insight: using RLHF for alignment as a pre-step dramatically boosts complex reasoning, far beyond preference optimization. Subsequent domain-wise RLVR stages rarely hurt the benchmark performance attained in earlier domains and may even improve it, as illustrated in the attached figure.
🤗 Models & training data 🔥 👉 huggingface.co/collections/nv…
📄 Technical report with detailed training and data recipes 👉 arxiv.org/pdf/2512.13607

0 replies · 1 repost · 5 likes · 469 views
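The Cascade RL recipe described in the quoted post can be sketched as a simple sequential training loop: alignment RLHF first, then one RLVR stage per domain, each starting from the previous stage's checkpoint with its own hyperparameters. This is a minimal illustrative sketch; the domain names, hyperparameter values, and the `rl_stage` stub are assumptions for illustration, not the actual Nemotron recipe.

```python
def rl_stage(policy, domain, lr, steps):
    # Stand-in for one domain-specific RL stage; a real implementation
    # would roll out the policy, verify answers, and apply RL updates.
    policy["history"].append((domain, lr, steps))
    return policy

def cascade_rl(policy, curriculum):
    # Run stages strictly in sequence: each domain starts from the
    # checkpoint produced by the previous stage, so curricula and
    # hyperparameters can be tuned per domain.
    for domain, hparams in curriculum:
        policy = rl_stage(policy, domain, **hparams)
    return policy

policy = {"history": []}
curriculum = [
    ("alignment_rlhf", {"lr": 1e-6, "steps": 100}),  # RLHF as the pre-step
    ("math",           {"lr": 5e-7, "steps": 200}),  # then domain-wise RLVR
    ("code",           {"lr": 5e-7, "steps": 200}),
    ("swe_agent",      {"lr": 2e-7, "steps": 150}),
]
policy = cascade_rl(policy, curriculum)
print([d for d, _, _ in policy["history"]])
# → ['alignment_rlhf', 'math', 'code', 'swe_agent']
```

The point of the structure, per the post, is that stages never mix prompts: each domain gets its own rollout/verification loop, avoiding the latency mismatch of heterogeneous verifiers in one batch.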
Jack Lin reposted
Jimmy Lin @lintool
@yupp_ai @UWaterloo Today marks the beginning of this journey for me, and I’m happy to share more details in the coming months! Until then, I hope you’ll try out yupp.ai and share your feedback. (9/9)
3 replies · 6 reposts · 20 likes · 3.1K views
Jack Lin reposted
Xueguang Ma @xueguang_ma
Introducing DRAMA🎭: Diverse Augmentation from Large Language Models to Smaller Dense Retrievers. We propose to train a smaller dense retriever using a pruned LLM as the backbone, fine-tuned with diverse LLM data augmentations. With single-stage training, DRAMA achieves strong performance on both English and multilingual retrieval tasks—enabling smaller retrievers to benefit from ongoing LLM advancements.
Xueguang Ma tweet media
1 reply · 21 reposts · 75 likes · 11.4K views
Jack Lin reposted
Xueguang Ma @xueguang_ma
In this work led by @ShengyaoZhuang, we explore various settings to attack recent document screenshot retrievers like DSE and ColPali. 🚨 What you see might not be what you searched for.
Shengyao Zhuang @ShengyaoZhuang

Our new paper, which studies the vulnerability of document screenshot retrievers like DSE and ColPali to pixel poisoning attacks, is now available on Arxiv! arxiv.org/pdf/2501.16902 This work was done with @EkaterinaKhr, @xueguang_ma, @bevan_koopman, @lintool, @guidozuc.

0 replies · 2 reposts · 10 likes · 619 views
Jack Lin reposted
Victoria X Lin @VictoriaLinML
#NeurIPS2024 I will present "Nearest Neighbor Speculative Decoding for LLM Generation and Attribution" led by @alexlimh23 at the poster session today. ⏰ Thu Dec 12 at 4:30-7:30 PM PST 🏛️ East Exhibit Hall A-C, #2201 🔗 neurips.cc/virtual/2024/p… Please drop by if you would like to chat about semi-parametric language modeling, beyond token-level decoding and generation attribution!
Victoria X Lin tweet media
Minghan @alexlimh23

1/ Excited to share that our paper "NEST🪺: Nearest Neighbor Speculative Decoding for LLM Generation and Attribution" is accepted at #NeurIPS2024! 🚀 Catch us at the poster session on Thu, Dec 12, 4:30–7:30 PM PST, East Exhibit Hall A-C, #2201. [Details: neurips.cc/virtual/2024/p…]

2 replies · 4 reposts · 64 likes · 8.2K views
Jack Lin @jacklin_64
I will present our paper FLAME on factuality alignment for LLMs with @luyu_gao at #NeurIPS2024! 🎉 Join us at East Exhibit Hall A-C, Booth #3501 for a chat on Wed (Dec 11, 4:30–7:30 pm). Looking forward to connecting! More details: neurips.cc/virtual/2024/p…
Xilun Chen @ccsasuke

Introducing FLAME🔥: Factuality-Aware Alignment for LLMs We found that the standard alignment process **encourages** hallucination. We hence propose factuality-aware alignment while maintaining the LLM's general instruction-following capability. arxiv.org/abs/2405.01525

0 replies · 5 reposts · 14 likes · 2.7K views
Jack Lin reposted
Jimmy Lin @lintool
Congratulations to Dr. @jacklin_64 for successfully defending his Ph.D. thesis "Building a Robust Retrieval System with Dense Retrieval Models"! 🎉
Jimmy Lin tweet media
8 replies · 6 reposts · 119 likes · 10.3K views
Jack Lin reposted
Nan Wang @nanwang_t
Crucial work in the field of multimodal embeddings! It’s impressive that multimodal embeddings are reaching SOTA-level performance, comparable to text-only embeddings, on retrieval tasks.
Jack Lin @jacklin_64

Introducing MM-Embed, the first multimodal retriever achieving SOTA results on the multimodal M-BEIR benchmark and compelling results (among top-5 retrievers) on the text-only MTEB retrieval benchmark. Paper: arxiv.org/abs/2411.02571 🤗 Model: huggingface.co/nvidia/MM-Embed

1 reply · 1 repost · 1 like · 640 views
Jack Lin @jacklin_64
Finally, for challenging multimodal queries, a free performance boost is possible: prompt multimodal LLMs as zero-shot rerankers.
Jack Lin tweet media
1 reply · 0 reposts · 1 like · 316 views
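The zero-shot reranking idea above can be sketched as: retrieve candidates first, then score each one by prompting a multimodal LLM to judge its relevance to the query, and sort by that score. In this minimal sketch, `llm_relevance` is a hypothetical stand-in scorer (simple word overlap); a real system would prompt a multimodal LLM with the query (text and/or image) plus the candidate, e.g. asking "Does this document answer the query? Yes/No" and using the probability of "Yes" as the score.

```python
def llm_relevance(query, candidate):
    # Stand-in for the LLM relevance judgment: word-overlap score.
    # A real reranker would call a multimodal LLM here.
    return len(set(query.split()) & set(candidate.split()))

def rerank(query, candidates, top_k=3):
    # Sort retrieved candidates by the (mock) LLM relevance score,
    # highest first, and keep the top_k.
    scored = sorted(candidates, key=lambda c: llm_relevance(query, c),
                    reverse=True)
    return scored[:top_k]

docs = [
    "a photo of the aurora over Waterloo",
    "dense retrieval with pruned language models",
    "a recipe for fresh pasta",
]
print(rerank("aurora over Waterloo", docs, top_k=1))
# → ['a photo of the aurora over Waterloo']
```

Because the reranker only rescores a small candidate list produced by the retriever, the extra LLM calls are bounded, which is why the post describes it as a nearly free boost for hard multimodal queries.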
Jack Lin @jacklin_64
Introducing MM-Embed, the first multimodal retriever achieving SOTA results on the multimodal M-BEIR benchmark and compelling results (among top-5 retrievers) on the text-only MTEB retrieval benchmark. Paper: arxiv.org/abs/2411.02571 🤗 Model: huggingface.co/nvidia/MM-Embed
3 replies · 24 reposts · 91 likes · 8.6K views
Jack Lin @jacklin_64
The sky last night was insane! Thanks to Waterloo for this epic aurora show.
0 replies · 0 reposts · 2 likes · 244 views