Yu Lei
366 posts

@_OutofMemory_
PhD @UTCompSci | Learn to understand ourselves and build intelligence.🤖🧠👁️

[1/D] 🤔 What are drifting models really connected to?

📢 Our new paper, A Unified View of Drifting and Score-Based Models, shows that the bridge to score-based models is clear and precise (w/ team and @mittu1204, @StefanoErmon, @MoleiTaoMath)!

✍️ Main takeaway: drifting is more closely connected to score-based (diffusion) modeling than it may first appear!
🔗 arxiv.org/abs/2603.07514

🎯 Here’s why: Drifting’s mean-shift moves a sample toward the kernel-weighted average of nearby samples. The score function points toward regions of higher density. So both describe local directions that push samples toward where data is denser.

We show that this link is exact for Gaussian kernels (Section 4.1):
📌 Drifting’s mean-shift = a rescaled score-matching field between the Gaussian-smoothed data and model distributions, i.e. the vector field underlying score matching (Tweedie!).
📌 This also clarifies the bridge to Distribution Matching Distillation (DMD): both use score-based transport directions and differ only in how the score is realized. Drifting does so nonparametrically through kernel neighborhoods, whereas DMD relies on a pretrained diffusion teacher.

🤔 So what happens for the default Laplace kernel used in drifting models? Let’s look below 👇
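The Gaussian-kernel claim above can be checked numerically. The following is a minimal 1-D sketch, not the paper's implementation: it assumes the drift is the mean-shift toward data samples minus the mean-shift toward model samples (my reading of the post), and it relies on the standard identity that a Gaussian-kernel mean-shift equals σ² times the score of the Gaussian-smoothed empirical density. All function names, sample sizes, and the bandwidth `sigma` are hypothetical choices for illustration.

```python
import math
import random

def kernel_logits(x, samples, sigma):
    """Log of the (unnormalized) Gaussian kernel between x and each sample."""
    return [-(s - x) ** 2 / (2.0 * sigma ** 2) for s in samples]

def mean_shift(x, samples, sigma):
    """Kernel-weighted average of the samples, minus x."""
    logits = kernel_logits(x, samples, sigma)
    top = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    return sum(w * s for w, s in zip(weights, samples)) / total - x

def log_smoothed_density(x, samples, sigma):
    """Log-density of the Gaussian-smoothed samples, up to an x-independent constant."""
    logits = kernel_logits(x, samples, sigma)
    top = max(logits)
    return top + math.log(sum(math.exp(l - top) for l in logits) / len(samples))

def score(x, samples, sigma, eps=1e-5):
    """Central-difference estimate of d/dx log p_sigma(x)."""
    hi = log_smoothed_density(x + eps, samples, sigma)
    lo = log_smoothed_density(x - eps, samples, sigma)
    return (hi - lo) / (2.0 * eps)

random.seed(0)
data = [random.gauss(2.0, 1.0) for _ in range(200)]    # stand-in for data samples
model = [random.gauss(-1.0, 1.0) for _ in range(200)]  # stand-in for model samples
x, sigma = 0.3, 0.8

# Assumed drift: attraction toward data minus attraction toward model samples.
drift = mean_shift(x, data, sigma) - mean_shift(x, model, sigma)
# Difference of smoothed scores, rescaled by sigma^2.
rescaled_score_gap = sigma ** 2 * (score(x, data, sigma) - score(x, model, sigma))

print(drift, rescaled_score_gap)
```

The two printed values agree up to finite-difference error, which is the sense in which the mean-shift direction is a rescaled difference of smoothed scores. Nothing here covers the Laplace kernel, where the post indicates the relationship needs separate treatment.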

Calling all researchers! 🤖The CoRL 2026 website is officially live at corl.org with key dates for your submissions: 🗓 May 25: Abstract Submission 🗓 May 28: Full Paper Submission 🗓 Nov 9-12: Conference in Austin, TX Send us your coolest work! #RobotLearning



VLAs (from VLMs) ❌ => WAMs (from Video Models) ✅

Why WAMs?
1️⃣ World Physics: VLMs know the internet, but Video Models implicitly model the physical laws essential for manipulation.
2️⃣ The "GPT Direction": VLAs are like BERT (rely heavily on task-specific post-training). WAMs are like GPT (pre-train & prompt), unlocking incredible zero-shot transfer!

What I want to see in 2026:
📈 Scaling Laws: We will see much clearer scaling laws for robotics compared to VLAs.
🤝 Human-to-Robot Transfer: Unlocking massive transfer capabilities using video as a shared representation space.
🤖 Zero-Shot Mastery: Moving from short-horizon tasks to long-horizon, dexterous manipulation without task-specific demonstrations.

We recently open-sourced the checkpoints, training and inference code. Dive into the research! 👇
📄 Paper: arxiv.org/abs/2602.15922
💻 Code: github.com/dreamzero0/dre…
🤗 HF: huggingface.co/GEAR-Dreams/Dr…


We have seen rapid progress in humanoid control: specialist robots can reliably generate agile, acrobatic, but preset motions. Our singular focus this year: putting generalist humanoids to real work. To progress toward this goal, we developed SONIC (nvlabs.github.io/GEAR-SONIC/), a Behavior Foundation Model for real-time, whole-body motion generation that supports teleoperation and VLA inference for loco-manipulation. Today, we’re open-sourcing SONIC on GitHub. We are excited to see what the community builds upon SONIC and to collectively push humanoid intelligence toward real-world deployment at scale. 🌐 Paper: arxiv.org/abs/2511.07820 📃 Code: github.com/NVlabs/GR00T-W…


Can a single learned controller generalize across diverse humanoid embodiments? Introducing XHugWBC, a novel cross-embodiment training framework that enables generalist humanoid control through:
1) physics-consistent morphological randomization
2) unified state-action representation with semantic alignment across different robots
3) graph-based policy for cross-humanoid control

We find that a single policy, trained once, can zero-shot generalize to unseen robots. The resulting generalist policy reaches approximately 85% of the performance achieved by the specialist, and the fine-tuned generalist shows an approximately 10% improvement over the generalist policy.
🔗 Website: xhugwbc.github.io
📕 Arxiv: arxiv.org/abs/2602.05791

🚀 Introducing CHIP: Adaptive Compliance for Humanoid Control through Hindsight Perturbation! Current humanoids face a trade-off: they are either Agile & Stiff OR Slow & Soft. CHIP breaks this barrier. We enable on-the-fly switching between Compliant (wiping 🧼, collaborative holding 📦) and Stiff (lifting dumbbells 🏋️, opening doors 🚪💪) behaviors—all while maintaining agile skills like running! 🏃💨 Website: nvlabs.github.io/CHIP/ Join me for a deep dive on how CHIP enables adaptive control for complex tasks. 🧵↓




