Roger Qiu
@RogerQiu_42
PhD student at UCSD. Previously CS @ Illinois. https://t.co/MKZHmOws8f

Introducing EgoVerse: an ecosystem for robot learning from egocentric human data. Built and tested by 4 research labs + 3 industry partners, EgoVerse enables both science and scaling: 1,300+ hrs, 240 scenes, 2,000+ tasks, and growing. Dataset design, findings, and ecosystem 🧵

Ever wanted a single policy that controls diverse robots and different dexterous hands, or to observe the emergent behaviors of cross-embodiment training? Introducing our #CVPR2026 paper XL-VLA: Cross-Hand Latent Representation for Vision-Language-Action Models.

A large human behavior model. Introducing In-N-On, our latest findings in scaling egocentric data for humanoids.
1. Pre-training and post-training with human data
2. 1,000+ hours of in-the-wild data and 20+ hours of on-task data with accurate action labels
Website: xiongyicai.github.io/In-N-On/
Arxiv: arxiv.org/abs/2511.15704
By simply scaling data, our robot can follow novel language instructions. Check out the 🧵
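A rough sketch of the two-stage recipe described above (my reading of the tweet, not the In-N-On code; the model, data loaders, and loss are placeholders):

```python
import torch

def train_stage(model, loader, optimizer, loss_fn, epochs):
    """Generic supervised stage shared by pre-training and post-training."""
    for _ in range(epochs):
        for batch in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(batch["obs"]), batch["action"])
            loss.backward()
            optimizer.step()

def in_n_on_style_recipe(model, wild_loader, ontask_loader, loss_fn):
    # Stage 1: pre-train on large-scale in-the-wild egocentric human data.
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
    train_stage(model, wild_loader, opt, loss_fn, epochs=1)
    # Stage 2: post-train on the smaller on-task set with accurate action
    # labels, typically at a lower learning rate.
    opt = torch.optim.AdamW(model.parameters(), lr=1e-5)
    train_stage(model, ontask_loader, opt, loss_fn, epochs=5)
    return model
```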

What if a humanoid robot could choose how to interact with the environment 🤖 — soft when it needs compliance, stiff when it needs precision, and force-aware when it must push/pull? That’s exactly what our Heterogeneous Meta-Control (HMC) framework enables. Our new framework learns to blend and route across position, impedance, and hybrid control from real demonstrations, letting a humanoid wipe, push, lift, and insert with smoothness and stability in contact-rich & forceful scenarios. Website: 👉 loco-hmc.github.io
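For intuition only, here is a minimal sketch of what blending and routing across position, impedance, and hybrid control could look like: a learned router weights three per-mode torque commands. This is not the HMC implementation; all names and gains are illustrative.

```python
import numpy as np

def position_ctrl(q, qd, q_ref, kp=200.0, kd=20.0):
    """Stiff PD tracking of a reference joint position (precision)."""
    return kp * (q_ref - q) - kd * qd

def impedance_ctrl(q, qd, q_ref, stiffness=30.0, damping=5.0):
    """Low-stiffness tracking for compliant, soft contact."""
    return stiffness * (q_ref - q) - damping * qd

def hybrid_ctrl(q, qd, q_ref, f_meas, f_ref, kp=100.0, kf=0.5):
    """Track position while regulating a measured contact force (push/pull)."""
    return kp * (q_ref - q) + kf * (f_ref - f_meas)

def blended_torque(obs, q, qd, q_ref, f_meas, f_ref, router):
    """`router` is a learned network mapping observations to softmax weights
    over the three modes; the command is their weighted sum."""
    w = router(obs)  # shape (3,), non-negative, sums to 1
    taus = np.stack([
        position_ctrl(q, qd, q_ref),
        impedance_ctrl(q, qd, q_ref),
        hybrid_ctrl(q, qd, q_ref, f_meas, f_ref),
    ])
    return w @ taus  # per-joint torque command
```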

🧵 Evaluating robot policies in the real world is slow, expensive, and hard to scale. During my internship at @SceniXai this summer, we had many discussions around two key questions: how accurate must a simulator be for evaluation to be meaningful, and how do we get there? Our new framework, Real2Sim-Eval, takes a step toward that answer. By combining Gaussian Splatting for photorealistic rendering with soft-body digital twins for realistic dynamics, we make simulation predictive of real-world performance. 👉 real2sim-eval.github.io
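One simple way to quantify "simulation predictive of real-world performance" (my own sketch, not the Real2Sim-Eval codebase) is to check how strongly per-policy success rates in the simulator correlate with success rates measured on the real robot; the numbers below are made up for illustration.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

# One success-rate entry per evaluated policy (illustrative values only).
sim_success  = np.array([0.82, 0.55, 0.91, 0.34, 0.70])
real_success = np.array([0.78, 0.48, 0.88, 0.30, 0.65])

r, _ = pearsonr(sim_success, real_success)     # linear agreement
rho, _ = spearmanr(sim_success, real_success)  # ranking agreement
print(f"Pearson r = {r:.2f}, Spearman rho = {rho:.2f}")
```

A high rank correlation means the simulator can be used to pre-screen and order candidate policies before spending real-robot evaluation time.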

Ever wanted to enjoy all the privileged information in sim while seamlessly transferring to the real world? How can we correct policy mistakes after deployment? 👉 Introducing GSWorld, a real2sim2real photorealistic simulator with interaction physics and fully open-sourced code.

🚀 Want to build a 3D-aware manipulation policy, but troubled by noisy depth perception? Want to train your manipulation policy in simulation, but tired of bridging the sim2real gap by degrading geometric perception, e.g. by adding noise? These notorious problems are gone with our Camera Depth Models! Camera Depth Models (CDMs) are plug-in modules for a real-robot pipeline, transforming noisy depth into high-quality perception and enabling seamless sim-to-real transfer, so real-robot manipulation works as it does in simulation!
🎯 Why it matters: Accurate geometry from CDMs takes a sim-data-driven policy on a set of complex, long-horizon tasks from 0% to 85%+ success! You can now train in simulation and deploy on real robots WITHOUT further domain adaptation. Just plug our CDMs into your existing pipeline!
✨ Highlights:
• Zero-shot sim-to-real transfer with 73%+ success (vs 0% baseline)
• Depth-only imitation learning achieves 85%+ success
• Works with RealSense D435/L515, Kinect, ZED2i & more
🛠️ Everything is open:
• We open-source CDMs for 5 distinct cameras
• We open-source the collected ByteCamDepth dataset: 170K+ RGB-depth pairs across 7 cameras & 10 configurations, a comprehensive real-world depth dataset
• We open-source our code for sim-to-real and camera depth model inference. We also share our modular real-robot control framework for manipulation, which provides a unified interface for controlling various robot arms, integrating sensors, and executing policies in real time
• We also made a clean sim-to-real tutorial based on our framework!
Check everything and interactive demos at …nipulation-as-in-simulation.github.io
We expect CDMs to become a foundation for your daily robotics research! #Robotics #ComputerVision #SimToReal #DepthPerception #OpenSource
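For readers wondering what "plug-in module" means in practice, here is a hedged sketch of how a camera depth model might slot between a noisy depth sensor and a sim-trained policy; `cdm` and `policy` are assumed pre-loaded torch modules, and nothing below is the released API.

```python
import numpy as np
import torch

def act_with_refined_depth(cdm: torch.nn.Module,
                           policy: torch.nn.Module,
                           rgb: np.ndarray,
                           raw_depth: np.ndarray) -> np.ndarray:
    """Refine raw sensor depth with the CDM, then query the policy."""
    rgb_t = torch.from_numpy(rgb).float().unsqueeze(0)          # (1, H, W, 3)
    depth_t = torch.from_numpy(raw_depth).float().unsqueeze(0)  # (1, H, W)
    with torch.no_grad():
        clean_depth = cdm(rgb_t, depth_t)    # sim-quality depth estimate
        action = policy(rgb_t, clean_depth)  # policy trained on clean sim depth
    return action.squeeze(0).numpy()
```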


Current AI models learn from only a fraction of human intelligence. At CoRL 2025, our brand-new "Human to Robot (H2R)" workshop explores how robots can learn from the vast, untapped physical human experience. sites.google.com/view/h2r-corl2… Extended abstract / paper submission deadline: Aug 15th. Co-organized with @xiaolonw, @simar_kareer, @RogerQiu_42, Sha Yi, James Fort, @NimaFazeli7, and Jianlong Ye

For years, I’ve been tuning parameters for robot designs and controllers on specific tasks. Now we can automate this at dataset scale. Introducing Co-Design of Soft Gripper with Neural Physics: a soft gripper trained in simulation to deform while handling load.


Timely episode given the recent progress updates from Tesla on using human data (without teleop) to train Optimus. Thanks @RogerQiu_42 for coming on the pod with me & @chris_j_paxton; really cool work!

