

Runpei Dong
@RunpeiDong
CS PhD student @UofIllinois | Previously @Tsinghua_IIIS and XJTU | Interested in robot learning & machine learning


Zero teleoperation. Zero real-world data. ➔ Autonomous humanoid loco-manipulation in reality.

Introducing VIRAL: Visual Sim-to-Real at Scale.

We achieved 54 autonomous cycles (walk, stand, place, pick, turn) using a simple recipe:
1. RL
2. Simulation
3. GPUs

Website: viral-humanoid.github.io
Arxiv: arxiv.org/abs/2511.15200

Deep dive with me: 🧵






We find that RL post-training can substantially improve BC policies without teaching them anything fundamentally new. So what is RL doing? In DICE-RL, it contracts a broad behavior prior toward high-value modes. (1/n) zhanyisun.github.io/dice.rl.2026/
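A minimal sketch of that general idea (this is not the DICE-RL algorithm itself; the objective, networks, and toy data below are illustrative assumptions): RL post-training can be viewed as re-weighting a behavior-cloned prior by an exponentiated advantage, so the fine-tuned policy concentrates on high-value modes already present in the prior.

```python
# Sketch only: advantage-weighted regression as one way to "contract" a BC prior
# toward high-value modes. Not the DICE-RL method; sizes and data are toy assumptions.
import torch
import torch.nn as nn

obs_dim, act_dim, beta = 16, 4, 1.0

bc_policy = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, act_dim))
value_fn  = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, 1))
q_fn      = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.Tanh(), nn.Linear(64, 1))
optim = torch.optim.Adam(bc_policy.parameters(), lr=3e-4)

def awr_update(obs, act):
    """One post-training step: stay close to the cloned data, but up-weight
    (state, action) pairs with high estimated advantage."""
    with torch.no_grad():
        adv = q_fn(torch.cat([obs, act], dim=-1)) - value_fn(obs)  # A(s, a) = Q - V
        weight = torch.clamp(torch.exp(adv / beta), max=20.0)      # favor high-value modes
    loss = (weight * (bc_policy(obs) - act).pow(2).sum(-1, keepdim=True)).mean()
    optim.zero_grad()
    loss.backward()
    optim.step()
    return loss.item()

# Toy usage with random tensors standing in for a replay buffer.
obs, act = torch.randn(256, obs_dim), torch.randn(256, act_dim)
print(awr_update(obs, act))
```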


We believe we’re the first robotics company to demonstrate a robot peeling an apple with dual dexterous human-like hands. This breakthrough closes a key gap in robotics, achieving bimanual, contact-rich manipulation and moving far beyond the limits of simple grippers. 🧵↓

Today’s AI models (VLMs) are excellent at perception but struggle with action. Controlling high-degree-of-freedom hands for tasks like this is incredibly complex, and precise finger-level teleoperation is nearly impossible for humans.

Our first step was a shared-autonomy system: rather than controlling every finger, the operator triggers pre-learned skills, like a “rotate apple or tennis ball” primitive, via a keyboard press or pedal. This makes scalable data collection and RL training possible.

How does the AI manage this? We created "MoDE-VLA" (Mixture of Dexterous Experts). It fuses vision, language, force, and touch data using a team of specialist "experts," making control in high-dimensional spaces stable and effective.

The combination of these two innovations allows for seamless, contact-rich manipulation. The human provides high-level guidance, and the robot executes the complex in-hand coordination required. This work paves the way for robots that can safely handle delicate tasks in human environments.

Want the full technical details?
📄 Read the full research paper: arxiv.org/abs/2603.08122
Visit us at NVIDIA GTC Booth #1838, Hall 3 to learn more!

#Robotics #AI #DexterousManipulation #VLA #NVIDIAGTC @nvidia @nvidiagtc
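The post does not spell out the MoDE-VLA architecture, so here is only a hedged sketch of the generic pattern it names: a gated mixture of specialist experts over fused vision, language, force, and touch features. All module names, dimensions, and the expert count are assumptions for illustration.

```python
# Generic multimodal mixture-of-experts sketch; not the actual MoDE-VLA model.
import torch
import torch.nn as nn

MODALITY_DIMS = {"vision": 512, "language": 768, "force": 6, "touch": 32}  # assumed sizes

class ModalityMoE(nn.Module):
    def __init__(self, dims, hidden=256, num_experts=4, act_dim=22):
        super().__init__()
        # Per-modality encoders project each input into a shared embedding space.
        self.encoders = nn.ModuleDict({k: nn.Linear(d, hidden) for k, d in dims.items()})
        # Specialist experts map the fused embedding to an action chunk.
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(hidden, hidden), nn.GELU(), nn.Linear(hidden, act_dim))
             for _ in range(num_experts)])
        self.gate = nn.Linear(hidden, num_experts)  # soft routing over experts

    def forward(self, inputs):
        fused = torch.stack([enc(inputs[k]) for k, enc in self.encoders.items()]).sum(0)
        weights = torch.softmax(self.gate(fused), dim=-1)                   # (B, E)
        expert_out = torch.stack([e(fused) for e in self.experts], dim=1)   # (B, E, act_dim)
        return (weights.unsqueeze(-1) * expert_out).sum(1)                  # gated mixture

model = ModalityMoE(MODALITY_DIMS)
batch = {k: torch.randn(2, d) for k, d in MODALITY_DIMS.items()}
print(model(batch).shape)  # torch.Size([2, 22])
```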

1/ World models are getting popular in robotics 🤖✨ But there’s a big problem: most are slow and break physical consistency over long horizons.

2/ Today we’re releasing Interactive World Simulator: an action-conditioned world model that supports stable long-horizon interaction.

3/ Key result:
✅ 10+ minutes of interactive prediction
✅ 15 FPS
✅ on a single RTX 4090 🔥

4/ Why this matters: it unlocks two critical robotics applications:
🚀 Scalable data generation for policy training
🧪 Faithful policy evaluation

5/ You can play with our world model NOW at yixuanwang.me/interactive_wo… NO git clone, NO pip install, NO python. Just click and play!

NOTE ⚠️ ALL videos here are generated purely by our model in pixel space! They are **NOT** from a real camera.

More details coming 👇 (1/9)

#Robotics #AI #MachineLearning #WorldModels #RobotLearning #ImitationLearning
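For readers wondering what "action-conditioned, long-horizon interactive prediction" means operationally, here is a hedged sketch of the generic autoregressive rollout loop: each predicted frame is fed back as context for the next step, conditioned on the user's action. This is not the released Interactive World Simulator; the tiny model, shapes, and names are assumptions.

```python
# Generic action-conditioned rollout loop; the real system is far larger and faster.
import torch
import torch.nn as nn

class TinyDynamicsModel(nn.Module):
    """Stand-in predictor: next frame from a stack of past frames plus an action."""
    def __init__(self, frame_shape=(3, 64, 64), context=4, act_dim=7):
        super().__init__()
        c, h, w = frame_shape
        self.net = nn.Sequential(
            nn.Linear(context * c * h * w + act_dim, 1024), nn.ReLU(),
            nn.Linear(1024, c * h * w), nn.Sigmoid())
        self.frame_shape, self.context = frame_shape, context

    def forward(self, frames, action):                  # frames: (B, context, C, H, W)
        b = frames.shape[0]
        x = torch.cat([frames.reshape(b, -1), action], dim=-1)
        return self.net(x).view(b, *self.frame_shape)

@torch.no_grad()
def rollout(model, init_frames, actions):
    """Interactive loop: feed each predicted frame back as context for the next step."""
    frames = list(init_frames.unbind(1))
    video = []
    for a in actions.unbind(1):                          # actions: (B, T, act_dim)
        ctx = torch.stack(frames[-model.context:], dim=1)
        nxt = model(ctx, a)
        frames.append(nxt)
        video.append(nxt)
    return torch.stack(video, dim=1)                     # (B, T, C, H, W)

model = TinyDynamicsModel()
print(rollout(model, torch.rand(1, 4, 3, 64, 64), torch.rand(1, 10, 7)).shape)
```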

Real-world loco-manipulation demands more than replaying fixed reference motions. We argue that true autonomy requires two capabilities:
1️⃣ flexibly leveraging whatever signals are available — dense references, partial cues, state estimates, or egocentric perception
2️⃣ remaining capable when any of these signals are missing or unreliable

We introduce ULTRA — an all-in-one controller for unified humanoid loco-manipulation 🤖

It supports:
• general reference tracking
• sparse goal following
• execution with motion capture
• execution with egocentric perception

🔗 Project page: ultra-humanoid.github.io
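One common way a single controller can consume whatever signals are available and still run when some are missing is to pair every optional signal with an availability flag and zero it out when absent. Below is a hedged sketch of that generic pattern; it is not ULTRA's actual observation interface, and all signal names and sizes are assumptions.

```python
# Generic "optional signal + availability flag" observation builder; names/sizes assumed.
import torch

SIGNAL_DIMS = {"dense_reference": 60, "sparse_goal": 7,
               "state_estimate": 13, "ego_vision_feat": 128}

def build_policy_input(available: dict) -> torch.Tensor:
    """available maps signal name -> tensor of shape (B, dim), or None if unavailable."""
    batch = next(v.shape[0] for v in available.values() if v is not None)
    parts = []
    for name, dim in SIGNAL_DIMS.items():
        sig = available.get(name)
        if sig is None:
            parts.append(torch.zeros(batch, dim + 1))         # zeroed signal + flag 0
        else:
            parts.append(torch.cat([sig, torch.ones(batch, 1)], dim=-1))  # signal + flag 1
    return torch.cat(parts, dim=-1)

# Example: dense reference missing, egocentric features present.
obs = build_policy_input({"dense_reference": None,
                          "sparse_goal": torch.randn(2, 7),
                          "state_estimate": torch.randn(2, 13),
                          "ego_vision_feat": torch.randn(2, 128)})
print(obs.shape)  # torch.Size([2, 212])
```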



💥Excited to share our paper “AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time” at #EMNLP2025 🚀 this Friday, Nov. 7, during Gather Session 4. Come say hi virtually!👋 📄Paper: arxiv.org/pdf/2505.24863 🪩Website & Code: alphaone-project.github.io #AI #LLMs #Reasoning

ResMimic: a two-stage residual framework that unleashes the power of a pre-trained general motion tracking policy. It enables expressive whole-body loco-manipulation with payloads up to 5.5 kg without task-specific design, generalizes across poses, and exhibits reactive behavior.
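A hedged sketch of a two-stage residual setup in the spirit described above: a frozen, pre-trained general tracking policy proposes actions, and a small residual policy adds a bounded correction on top. Network sizes, dimensions, and the residual scale are illustrative assumptions, not ResMimic's actual design.

```python
# Generic residual-on-top-of-base-policy sketch; all shapes and scales are assumed.
import torch
import torch.nn as nn

obs_dim, act_dim = 93, 29

base_policy = nn.Sequential(nn.Linear(obs_dim, 256), nn.ELU(), nn.Linear(256, act_dim))
for p in base_policy.parameters():
    p.requires_grad_(False)          # stage 1: keep the general tracking policy frozen

residual_policy = nn.Sequential(nn.Linear(obs_dim + act_dim, 128), nn.ELU(),
                                nn.Linear(128, act_dim))

def act(obs, residual_scale=0.1):
    """Stage 2: the residual policy sees the observation plus the base action
    and outputs a bounded correction added to it."""
    with torch.no_grad():
        a_base = base_policy(obs)
    delta = residual_scale * torch.tanh(residual_policy(torch.cat([obs, a_base], dim=-1)))
    return a_base + delta

print(act(torch.randn(1, obs_dim)).shape)  # torch.Size([1, 29])
```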


Our humanoid now learns loco-manip skills that generalize across space (Stanford) & time (day & night), using egocentric vision, trained only in simulation.
visualmimic.github.io

🚀 Introducing Any2Track — a foundational tracker for humanoid motion tracking.

Achieving reliable tracking of whole-body humanoid motions under real-world disturbances remains an open challenge, given the complexity of dynamics, frequent contacts, and unpredictable perturbations. Any2Track takes a step toward addressing this challenge — demonstrating the ability to track any motion while maintaining robustness under any disturbance.

We have released the training code (MuJoCo simulator, multi-GPU training), hoping this resource can benefit the community.

Code 👉 github.com/GalaxyGeneralR…


