

Elgce
@BenQingwei
Ph.D. at MMLab, CUHK https://t.co/WpwzUeEBwi


Introducing Gallant: Voxel Grid-based Humanoid Locomotion and Local-navigation across 3D Constrained Terrains 🤖
Project page: gallantloco.github.io
arXiv: arxiv.org/abs/2511.14625

Gallant is, to our knowledge, the first system to run a single policy that handles full-space constraints on a humanoid robot, including ground-level barriers, lateral clutter, and overhead obstacles.

Instead of elevation maps or depth cameras, Gallant uses a voxel grid built directly from raw LiDAR as its perception representation, giving it inherent 3D coverage of the scene. With our custom LiDAR simulation toolkit (github.com/agent-3154/sim…), we model realistic scans, including returns from the robot’s own moving links, which is crucial for sim-to-real transfer.

On the control side, we use a target-based training scheme rather than standard velocity tracking: the robot is given a goal and learns to discover its own velocities and trajectories along the path, so no external high-frequency command stream is needed during deployment.

The policy itself is intentionally lightweight: just a 3-layer CNN + 3-layer MLP (~0.3M params), running onboard on the Unitree G1’s Orin NX at 50 Hz with no extra compute. Training takes about 6 hours on 8× NVIDIA RTX 4090 GPUs. The resulting policy transfers directly to the real robot and achieves a >90% success rate on most tested terrain types.

Gallant is our “half-way” step toward robust perceptive locomotion, a problem we believe remains fundamental for humanoid robots. We’re now working toward closing the gap to near-100% reliability and expanding the pipeline further. Code will be fully released soon. Discussion, feedback, and collaboration are very welcome! 🙌
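For readers curious what a policy this small looks like, here is a minimal PyTorch sketch at the reported shape (3-layer CNN + 3-layer MLP, roughly 0.3M parameters). The grid size, channel widths, and input dimensions are our own assumptions for illustration, not Gallant’s actual configuration:

```python
import torch
import torch.nn as nn

class VoxelPolicy(nn.Module):
    """Illustrative 3-layer 3D CNN + 3-layer MLP, roughly the reported
    ~0.3M-parameter scale under these assumed sizes (not Gallant's config)."""
    def __init__(self, grid=(24, 24, 24), proprio_dim=48, num_joints=29):
        super().__init__()
        self.cnn = nn.Sequential(          # 3 conv layers over the LiDAR voxel grid
            nn.Conv3d(1, 8, 3, stride=2, padding=1), nn.ELU(),
            nn.Conv3d(8, 16, 3, stride=2, padding=1), nn.ELU(),
            nn.Conv3d(16, 32, 3, stride=2, padding=1), nn.ELU(),
            nn.Flatten(),
        )
        with torch.no_grad():              # infer the flattened feature size
            feat = self.cnn(torch.zeros(1, 1, *grid)).shape[1]
        self.mlp = nn.Sequential(          # 3-layer MLP head -> joint targets
            nn.Linear(feat + proprio_dim, 256), nn.ELU(),
            nn.Linear(256, 128), nn.ELU(),
            nn.Linear(128, num_joints),
        )

    def forward(self, voxels, proprio):
        # voxels: (B, 1, 24, 24, 24) occupancy grid built from raw LiDAR returns
        # proprio: (B, 48) joint states, base orientation, and the goal target
        return self.mlp(torch.cat([self.cnn(voxels), proprio], dim=-1))
```

A forward pass this small is trivially cheap at 50 Hz on an Orin NX, which is consistent with the no-extra-compute claim.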

SONIC is now open-source! Generalist whole-body teleoperation for EVERYONE! Our team has long been building comprehensive pipelines for whole-body control, kinematic planning, and teleoperation, and they will all be shared. This will be a continuous release: the inference code and model are already there, with training code and GR00T integration coming soon!
Code: github.com/NVlabs/GR00T-W…
Docs: nvlabs.github.io/GR00T-WholeBod…
Site: nvlabs.github.io/GEAR-SONIC/

We have seen rapid progress in humanoid control: specialist robots can reliably perform agile, acrobatic, yet preset motions. Our singular focus this year: getting generalist humanoids to do real work. To progress toward this goal, we developed SONIC (nvlabs.github.io/GEAR-SONIC/), a Behavior Foundation Model for real-time, whole-body motion generation that supports teleoperation and VLA inference for loco-manipulation. Today, we’re open-sourcing SONIC on GitHub. We are excited to see what the community builds on SONIC and to collectively push humanoid intelligence toward real-world deployment at scale.
🌐 Paper: arxiv.org/abs/2511.07820
📃 Code: github.com/NVlabs/GR00T-W…

🧐Applying world models to improve real-world policies on challenging manipulation tasks used to be considered out of reach. 😌After sustained effort, we’re now seeing encouraging progress. 🚀Thrilled to introduce RISE: Self-Improving Robot Policy with Compositional World Model
opendrivelab.com/kai0-rl/
arxiv.org/abs/2602.11075
RISE is, to our knowledge, the first work to use a world model as an effective learning environment for challenging real-world manipulation, enabling policy improvement on tasks that demand high dynamics, dexterity, and precision. Incredible teamwork with @lin_kunyang111 @francislee2020 @YueXiangyu @HaoZhao_AIRSUN @smch_1127
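To sketch what “world model as learning environment” means mechanically: roll the policy out in imagination and improve it against imagined returns. A hedged illustration follows; every name here (world_model.step, reward_fn) is an assumption for clarity, and RISE’s actual compositional world model and update rule are in the paper:

```python
import torch

def improve_in_imagination(policy, world_model, reward_fn, optimizer,
                           init_obs, horizon=32, iters=200):
    # Hypothetical sketch: roll the policy out inside a learned world model
    # and ascend the imagined return by backpropagating through the rollout.
    # None of these names are RISE's actual API.
    for _ in range(iters):
        obs, ret = init_obs, 0.0
        for _ in range(horizon):
            action = policy(obs)                 # differentiable action
            obs = world_model.step(obs, action)  # imagined next observation
            ret = ret + reward_fn(obs, action)   # accumulate imagined reward
        loss = -ret.mean()                       # maximize imagined return
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

The appeal for real-world manipulation is that the expensive, risky rollouts happen in the learned model, while the real robot is only needed to collect data and verify the improved policy.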



Today, we present a step-change in robotic AI @sundayrobotics. Introducing ACT-1: A frontier robot foundation model trained on zero robot data. - Ultra long-horizon tasks - Zero-shot generalization - Advanced dexterity 🧵->


How do you give a humanoid general motion capability? Not just single motions, but all motion?

Introducing SONIC, our new work on supersizing motion tracking for natural humanoid control. We argue that motion tracking is the scalable foundation task for humanoids, so we “supersized” it: 9k+ GPU hours and 100M+ motion frames.

But tracking alone is not enough; we show how to make a useful control system out of it:
- Universal Kinematic Planner: enables game-like gamepad control and high-level teleoperation, just like controlling a character in a game.
- VR Full-Body Teleop: direct, real-time whole-body control by a human wearing a VR headset.
- VR Keypoint Teleop: control the upper body (hands/head) while our planner handles robust locomotion automatically.
- VLA Integration: we connect the motion tracker to Vision-Language-Action (VLA) models for autonomous task execution!

We use a Universal Token Space to UNIFY this command space, turning our robust tracker into a general-purpose, programmable humanoid brain. This is the generalist “System 1” for humanoids. 🚀

Project: nvlabs.github.io/SONIC/
#Humanoids #Robotics #AI #FoundationModels #NVIDIAResearch 🧠🔥
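To make the Universal Token Space idea concrete, here is a minimal, hypothetical sketch (all dimensions and encoder names are our own assumptions, not SONIC’s actual interface): each command source is projected into one shared token space, so the tracker consumes a fixed-shape conditioning signal no matter who is commanding it.

```python
import torch
import torch.nn as nn

class CommandTokenizer(nn.Module):
    # Hypothetical sketch of a unified command-token interface: every input
    # modality (gamepad, VR keypoints, VLA output) is projected into one
    # shared token space that conditions the low-level motion tracker.
    # Dimensions and names are illustrative, not SONIC's actual design.
    def __init__(self, token_dim=64):
        super().__init__()
        self.gamepad_enc = nn.Linear(6, token_dim)       # stick axes + buttons
        self.keypoint_enc = nn.Linear(3 * 3, token_dim)  # hands + head, xyz each
        self.vla_enc = nn.Linear(128, token_dim)         # VLA action embedding

    def forward(self, source, payload):
        # Whatever the source, the tracker only ever sees one token shape.
        if source == "gamepad":
            return self.gamepad_enc(payload)
        if source == "vr_keypoints":
            return self.keypoint_enc(payload.flatten(-2))
        return self.vla_enc(payload)
```

The design benefit of such a scheme is that the tracker is trained once against tokens, and supporting a new command source only requires a new encoder, not retraining the controller.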

Meet BFM-Zero: A Promptable Humanoid Behavioral Foundation Model w/ Unsupervised RL👉 lecar-lab.github.io/BFM-Zero/
🧩ONE latent space for ALL tasks
⚡Zero-shot goal reaching, tracking, and reward optimization (any reward at test time), from ONE policy
🤖Natural recovery & transitions
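One way to read “any reward at test time”: embed sampled states with the model’s representation map, weight the embeddings by the new task’s reward, and use the result as the latent prompt for the single pretrained policy. A hedged sketch under assumed names (backward_map, policy); BFM-Zero’s actual procedure is in the paper:

```python
import torch

def prompt_with_reward(backward_map, reward_fn, states):
    # Hypothetical sketch of test-time reward prompting: embed sampled
    # states, weight the embeddings by the new task's reward, and use the
    # result as the latent that conditions the single pretrained policy.
    with torch.no_grad():
        b = backward_map(states)             # (N, latent_dim) state embeddings
        r = reward_fn(states).unsqueeze(-1)  # (N, 1) rewards for the new task
        z = (r * b).mean(dim=0)              # reward-weighted latent prompt
        return z / z.norm()

# action = policy(obs, z)  # same policy, new task, zero-shot
```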


We are excited to re-introduce our Behavior Foundation Model for Humanoid Robots, built upon a unified perspective of diverse whole-body control (WBC) tasks, as a promising step toward a foundation model for general humanoid control.
🔗Website: bfm4humanoid.github.io
📜Paper: arxiv.org/abs/2509.13780


🚀 3 steps to ace IROS 2025 Nav Track: Setup · Develop · Submit 🦾 📺 We’ve prepared a Quickstart Guide to help you quickly grasp the task, explore the dataset, and submit your model to the leaderboard. 🥇 Winner prize: $10K 📌 internrobotics.shlab.org.cn/challenge/2025/

🚀 Introducing LeVERB, the first 𝗹𝗮𝘁𝗲𝗻𝘁 𝘄𝗵𝗼𝗹𝗲-𝗯𝗼𝗱𝘆 𝗵𝘂𝗺𝗮𝗻𝗼𝗶𝗱 𝗩𝗟𝗔 (upper- & lower-body), trained on sim data and deployed zero-shot. It addresses interactive tasks such as navigation, sitting, and locomotion from verbal instructions. 🧵 ember-lab-berkeley.github.io/LeVERB-Website/
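A hedged sketch of what a “latent” VLA hierarchy typically means here: a slow vision-language model emits a latent command, and a fast whole-body policy decodes it into joint actions at control rate. All names below are illustrative assumptions, not LeVERB’s actual API:

```python
import torch

class LatentVLA:
    # Hypothetical two-rate hierarchy: the VLM runs slowly on images and
    # language, the whole-body controller runs fast on proprioception, and
    # a latent vector is the only interface between them.
    def __init__(self, vlm, wbc_policy):
        self.vlm = vlm            # slow: image + instruction -> latent command
        self.wbc = wbc_policy     # fast: proprioception + latent -> actions
        self.z = None

    def high_level(self, image, instruction):
        self.z = self.vlm(image, instruction)   # e.g. a (64,) latent "verb"

    def low_level(self, proprio):
        return self.wbc(proprio, self.z)        # whole-body joint targets
```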


After a long time, RoboDuet, the first research project of my life, has finally been accepted to RA-L! This is really inspiring for me. RoboDuet is fully open-sourced, including both the training and deployment code. If you're interested, just give it a try!

HOMIE will be presented at #RSS2025 today!
Spotlight Talks: 4:30pm-5:30pm
Poster: 6:30pm-8:00pm, Board #34
@li_yitang will be there to help us present this paper, and I will be online to introduce and discuss it 🥳
Talk video: drive.google.com/file/d/10uYskZ…