Sirui Chen
@eric_srchen
125 posts

PhD student in CS at Stanford; previously undergrad at HKU. Interested in robotics.

Stanford, CA · Joined September 2023
597 Following · 571 Followers
Pinned Tweet
Sirui Chen @eric_srchen
What RL-based humanoid controllers are missing relative to industrial robots is precision and force control. CHIP can do both. We propose a simple recipe for building a humanoid impedance controller, which can be used for wiping, carrying large objects, and multi-robot collaboration.
Zi-ang Cao @ziang_cao

🚀 Introducing CHIP: Adaptive Compliance for Humanoid Control through Hindsight Perturbation! Current humanoids face a trade-off: they are either Agile & Stiff OR Slow & Soft. CHIP breaks this barrier. We enable on-the-fly switching between Compliant (wiping 🧼, collaborative holding 📦) and Stiff (lifting dumbbells 🏋️, opening doors 🚪💪) behaviors—all while maintaining agile skills like running! 🏃💨 Website: nvlabs.github.io/CHIP/ Join me for a deep dive on how CHIP enables adaptive control for complex tasks. 🧵↓
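Neither tweet spells out the control law, but the classic joint-space impedance formulation behind controllers like this is a PD law around a reference motion whose gains set stiffness. A minimal Python sketch (gain values, the 29-joint count, and all function names are illustrative assumptions, not the CHIP recipe):

import numpy as np

def impedance_torque(q, qd, q_ref, qd_ref, kp, kd, tau_ff=0.0):
    # Joint-space impedance: tau = Kp*(q_ref - q) + Kd*(qd_ref - qd) + feedforward.
    # Low gains -> compliant (wiping, co-holding); high gains -> stiff (lifting, doors).
    return kp * (q_ref - q) + kd * (qd_ref - qd) + tau_ff

# "Adaptive compliance" then amounts to modulating kp/kd online, per joint:
kp_soft = np.full(29, 20.0)    # compliant arms while wiping
kp_stiff = np.full(29, 150.0)  # stiff whole body while lifting or running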

Sirui Chen reposted
Xiaomeng Xu @XiaomengXu11
Can we learn whole-body mobile manipulation directly from human demonstrations? Introducing Whole-Body Mobile Manipulation Interface (HoMMI) Egocentric + UMI, 0 teleop -> bimanual & whole-body manipulation, long-horizon navigation, active perception hommi-robot.github.io
Sirui Chen reposted
Haochen Shi @HaochenShi74
Excited to release Minimalist Compliance Control! We achieve robust, compliant robot interaction across robot arms, dexterous hands, and humanoids, with NO force sensors or learning. If you’re wondering what remains, please see the thread below😉 Website: …nimalist-compliance-control.github.io
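The method itself is in the thread, not the tweet; one textbook way to get compliance with neither force sensors nor learning is to bound the PD tracking torque so contact forces stay limited. A minimal sketch under that assumption (not necessarily the paper's actual algorithm; all names are illustrative):

import numpy as np

def compliant_pd(q, qd, q_des, kp, kd, tau_max):
    # PD tracking with per-joint torque saturation: clamping |tau|
    # bounds how hard the robot can push at a contact, yielding
    # compliant behavior without any force sensing or learning.
    tau = kp * (q_des - q) - kd * qd
    return np.clip(tau, -tau_max, tau_max)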
Sirui Chen reposted
Yuke Zhu @yukez
We have seen rapid progress in humanoid control — specialist robots can reliably generate agile, acrobatic, but preset motions. Our singular focus this year: putting generalist humanoids to work on real tasks. To progress toward this goal, we developed SONIC (nvlabs.github.io/GEAR-SONIC/), a Behavior Foundation Model for real-time, whole-body motion generation that supports teleoperation and VLA inference for loco-manipulation. Today, we're open-sourcing SONIC on GitHub. We are excited to see what the community builds upon SONIC and to collectively push humanoid intelligence toward real-world deployment at scale.
🌐 Paper: arxiv.org/abs/2511.07820
📃 Code: github.com/NVlabs/GR00T-W…
Sirui Chen reposted
Tian Gao @TianGao_19
Long-tail scenarios remain a major challenge for autonomous driving. Unusual events—like accidents or construction zones—are underrepresented in driving data, yet require semantic and commonsense reasoning grounded in control. We propose SteerVLA, a framework that uses VLM reasoning to steer a driving policy via grounded, fine-grained language instructions. Paper: arxiv.org/abs/2602.08440 Website: steervla.github.io
Sirui Chen reposted
Xuhui Kang @JoshuaK78925
How can robots handle fragile, soft everyday objects like humans do, using vision & tactile to regulate force? 🤖🥚 Introducing our full-stack solution: a low-cost ($150) force gripper (0.45~45N), a force-aware teleoperator, and a reactive policy for learning force control.
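The tweet gives the hardware numbers (a $150 gripper covering 0.45–45 N) but not the control loop; a minimal force-regulation loop for such a gripper might look like the following (function and variable names are hypothetical, not from the paper):

def force_servo_step(f_meas, f_target, aperture, gain=1e-4, dt=0.02):
    # Integral-style force servo: close the gripper a little when grip
    # force is below target, open when above, e.g. ~1 N for an egg vs.
    # ~30 N for a heavy jar. A 50 Hz loop rate (dt = 0.02 s) is assumed.
    err = f_target - f_meas            # force error in newtons
    return aperture - gain * err * dt  # smaller aperture -> more force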
Sirui Chen @eric_srchen
@robotsdigest Hi, thanks for posting our paper CHIP. It appears that the video and image are from HUMI (also an awesome paper). Could you make corrections? Thanks!
Robots Digest 🤖 @robotsdigest
Humanoid robots do have rizz. CHIP shows how to add adaptive compliance without breaking motion tracking. A single controller handles wiping, door opening, box lifting, writing, and even running while carrying objects.
Sirui Chen reposted
Robots Digest 🤖 @robotsdigest
Humanoid robots are agile but stiff. CHIP shows how to add adaptive compliance without breaking motion tracking. A single controller handles wiping, door opening, box lifting, writing, and even running while carrying objects.
Sirui Chen reposted
Zi-ang Cao @ziang_cao
🚀 Introducing CHIP: Adaptive Compliance for Humanoid Control through Hindsight Perturbation! Current humanoids face a trade-off: they are either Agile & Stiff OR Slow & Soft. CHIP breaks this barrier. We enable on-the-fly switching between Compliant (wiping 🧼, collaborative holding 📦) and Stiff (lifting dumbbells 🏋️, opening doors 🚪💪) behaviors—all while maintaining agile skills like running! 🏃💨 Website: nvlabs.github.io/CHIP/ Join me for a deep dive on how CHIP enables adaptive control for complex tasks. 🧵↓
Sirui Chen reposted
Yuanhang Zhang @Yuanhang__Zhang
Robust humanoid perceptive locomotion is still underexplored, especially when different cameras see different terrains, paths get narrow, and payloads disturb balance... Introducing RPL, tackling this with one unified policy:
• Challenging terrains (slopes, stairs, and stepping stones);
• Multiple directions;
• Payloads.
Trained in sim. Validated long-horizon in the real world. Watch the robot walk it all🦿 Details below👇
Sirui Chen reposted
Jim Fan @DrJimFan
I'm on a singular mission to solve the Physical Turing Test for robotics. It's the next, or perhaps THE last grand challenge of AI. Super-intelligence in text strings will win a Nobel prize before we have chimpanzee-intelligence in agility & dexterity. Moravec's paradox is a curse to be broken, a wall to be torn down. Nothing can stand between humanity and exponential physical productivity on this planet, and perhaps some day on planets beyond.

We started a small lab at NVIDIA and grew to 30 strong very recently. The team punches way above its weight. Our research footprint spans foundation models, world models, embodied reasoning, simulation, whole-body control, and many flavors of RL - basically the full stack of robot learning.

This year, we launched:
- GR00T VLA (vision-language-action) foundation models: open-sourced N1 in Mar, N1.5 in June, and N1.6 this month;
- GR00T Dreams: video world model for scaling synthetic data;
- SONIC: humanoid whole-body control foundation model;
- RL post-training for VLAs and RL recipes for sim2real.

These wouldn't have been possible without the numerous collaborating teams at NVIDIA, strong leadership support, and coauthors from university labs. Thank you all for believing in the mission. Thread on the gallery of milestones:
Sirui Chen reposted
Carlo Sferrazza @carlo_sferrazza
Sim-to-real learning for humanoid robots is a full-stack problem. Today, Amazon FAR is releasing a full-stack solution: Holosoma. To accelerate research, we are open-sourcing a complete codebase covering multiple simulation backends, training, retargeting, and real-world inference.
Sirui Chen reposted
Elgce @BenQingwei
Introducing Gallant: Voxel Grid-based Humanoid Locomotion and Local-navigation across 3D Constrained Terrains 🤖
Project page: gallantloco.github.io
Arxiv: arxiv.org/abs/2511.14625

Gallant is, to our knowledge, the first system to run a single policy that handles full-space constraints — including ground-level barriers, lateral clutter, and overhead obstacles — on a humanoid robot.

Instead of elevation maps or depth cameras, Gallant uses a voxel grid built directly from raw LiDAR as its perception representation, giving it inherent 3D coverage of the scene. With our custom LiDAR simulation toolkit (github.com/agent-3154/sim…), we model realistic scans, including returns from the robot's own moving links, which is crucial for sim-to-real transfer.

On the control side, we use a target-based training scheme rather than standard velocity tracking. The robot is given a goal and learns to discover its own in-path velocities and trajectories, so no external high-frequency command stream is needed during deployment.

The policy itself is intentionally lightweight: just a 3-layer CNN + 3-layer MLP (~0.3M params), running onboard on the Unitree G1's Orin NX at 50 Hz with no extra compute. Training takes about 6 hours on 8× NVIDIA RTX 4090 GPUs. The resulting policy transfers directly to the real robot and achieves >90% success rate on most tested terrain types.

Gallant is our "half-way" step toward robust perceptive locomotion — a problem we believe remains fundamental for humanoid robots. We're now working toward closing the gap to near-100% reliability and expanding the pipeline further. Code will be fully released soon. Discussion, feedback, and collaboration are very welcome! 🙌
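The tweet pins down the policy architecture (3-layer CNN + 3-layer MLP, ~0.3M params, voxel-grid input, 50 Hz on an Orin NX); here is a PyTorch sketch of a network matching that description. Grid resolution, channel widths, the proprioception size, and the action dimension are assumptions, not values from the paper:

import torch
import torch.nn as nn

class VoxelPolicy(nn.Module):
    # 3-layer 3D CNN over the LiDAR voxel grid + 3-layer MLP head,
    # roughly matching the stated ~0.3M parameter budget.
    def __init__(self, grid=(16, 32, 32), proprio_dim=64, act_dim=29):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv3d(1, 8, 3, stride=2, padding=1), nn.ELU(),
            nn.Conv3d(8, 16, 3, stride=2, padding=1), nn.ELU(),
            nn.Conv3d(16, 16, 3, stride=2, padding=1), nn.ELU(),
            nn.Flatten(),
        )
        with torch.no_grad():  # infer flattened feature size from the grid
            feat = self.cnn(torch.zeros(1, 1, *grid)).shape[1]
        self.mlp = nn.Sequential(
            nn.Linear(feat + proprio_dim, 256), nn.ELU(),
            nn.Linear(256, 128), nn.ELU(),
            nn.Linear(128, act_dim),  # whole-body joint targets
        )

    def forward(self, voxels, proprio):
        # voxels: (B, 1, D, H, W) occupancy grid; proprio: (B, proprio_dim)
        return self.mlp(torch.cat([self.cnn(voxels), proprio], dim=-1))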