Roger Qiu

57 posts

@RogerQiu_42

PhD student at UCSD. Previously CS @ Illinois. https://t.co/MKZHmOws8f

San Diego, CA · Joined March 2024
184 Following · 566 Followers
Pinned Tweet
Roger Qiu@RogerQiu_42·
Diverse training data leads to a more robust humanoid manipulation policy, but collecting robot demonstrations is slow. Introducing our latest work, Humanoid Policy ~ Human Policy. We advocate human data as a scalable data source for co-training egocentric manipulation policies. ⬇️
Replies: 8 · Reposts: 52 · Likes: 247 · Views: 86.2K
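A rough sketch of what co-training on mixed human and robot demonstrations can look like in practice. The dataset shapes, policy network, and 0.5 mixing weight below are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical co-training loop: mix batches of plentiful human demos with
# scarce robot demos in every gradient update. All sizes/weights are assumed.
import torch
from torch.utils.data import DataLoader, TensorDataset

def make_loader(n, obs_dim=64, act_dim=7, batch_size=32):
    # Stand-in for real egocentric demo data: (observation, action) pairs.
    obs = torch.randn(n, obs_dim)
    act = torch.randn(n, act_dim)
    return DataLoader(TensorDataset(obs, act), batch_size=batch_size, shuffle=True)

human_loader = make_loader(1024)   # cheap-to-collect human demonstrations
robot_loader = make_loader(128)    # scarce robot (humanoid) demonstrations

policy = torch.nn.Sequential(
    torch.nn.Linear(64, 256), torch.nn.ReLU(), torch.nn.Linear(256, 7)
)
opt = torch.optim.Adam(policy.parameters(), lr=1e-4)
human_weight = 0.5  # assumed mixing coefficient between the two data sources

for (h_obs, h_act), (r_obs, r_act) in zip(human_loader, robot_loader):
    # Behavior-cloning loss on each data source, mixed into a single update.
    loss_h = torch.nn.functional.mse_loss(policy(h_obs), h_act)
    loss_r = torch.nn.functional.mse_loss(policy(r_obs), r_act)
    loss = human_weight * loss_h + (1.0 - human_weight) * loss_r
    opt.zero_grad()
    loss.backward()
    opt.step()
```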
Roger Qiu retweeted
Isabella Liu@Isabella__Liu·
VLAs/VAs are doing well on short skills like pick-and-place. But real tasks rarely stop after one action; they require 1) many interdependent steps, 2) progress tracking, and 3) recovery from mistakes. In our paper LoHo-Manip, we address long-horizon manipulation with trace-conditioned VLA planning: a task manager tracks what's done, plans what remains, and guides execution with visual traces.
Replies: 1 · Reposts: 45 · Likes: 238 · Views: 24.4K
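A minimal sketch of the task-manager loop described above: track completed subtasks, plan what remains, and condition low-level execution on a visual trace. Every class and function name here is a hypothetical stand-in, not the LoHo-Manip API.

```python
# Illustrative long-horizon control loop with progress tracking and retry-on-failure.
from dataclasses import dataclass, field

@dataclass
class TaskManager:
    subtasks: list                          # ordered subtasks for the full task
    done: list = field(default_factory=list)

    def next_subtask(self):
        remaining = [t for t in self.subtasks if t not in self.done]
        return remaining[0] if remaining else None

    def mark_done(self, subtask):
        self.done.append(subtask)

def plan_visual_trace(observation, subtask):
    # Stand-in for the planner that draws a visual trace for the current subtask.
    return f"trace for '{subtask}'"

def execute(observation, trace):
    # Stand-in for the trace-conditioned VLA policy; returns a success flag.
    return True

manager = TaskManager(subtasks=["open drawer", "pick cup", "place cup", "close drawer"])
observation = None  # would come from the robot's camera

while (subtask := manager.next_subtask()) is not None:
    trace = plan_visual_trace(observation, subtask)
    if execute(observation, trace):
        manager.mark_done(subtask)   # progress tracking
    # on failure, the same subtask is retried next iteration (recovery from mistakes)
```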
Roger Qiu retweeted
Project Aria @Meta@meta_aria·
⚡️EgoVerse is a first-of-its-kind, collaborative ecosystem for human-to-robot learning. The consortium leverages Project Aria to capture high-fidelity, egocentric human data — including 3D hand and head poses — to train next-gen robot manipulation policies. With over 1,300 hours of data across 2,000+ tasks, EgoVerse is a prime example of how the Aria Research Kit is being used by our partners to accelerate the future of embodied AI.
Learn more: 🔗 egoverse.ai 📰 arxiv.org/abs/2604.07607
Apply for the Aria Research Kit: projectaria.com/research-kit/
#MachineLearning #Robotics #ProjectAria #EgoVerse #ComputerVision
@simar_kareer, @ryan_punamiya, @RogerQiu_42, @XiongyiCai, @yexelal
Replies: 2 · Reposts: 44 · Likes: 226 · Views: 17.3K
Roger Qiu retweeted
Xiongyi Cai@XiongyiCai·
How do you teach a robot to do something it has never seen before? 🤖 With human data. Our new Human0 model is co-trained on human and humanoid data. It allows the robot to understand a novel language command and execute it perfectly in the wild without prior practice. Real-world success rate: ~100%. Watch it happen 👇
Xiongyi Cai@XiongyiCai

A large human behavior model. Introducing In-N-On, our latest findings in scaling egocentric data for humanoids.
1. Pre-training and post-training with human data
2. 1,000+ hours of in-the-wild data and 20+ hours of on-task data with accurate action labels
Website: xiongyicai.github.io/In-N-On/
Arxiv: arxiv.org/abs/2511.15704
By simply scaling data, our robot can follow novel language instructions. Check out the 🧵

Replies: 5 · Reposts: 27 · Likes: 175 · Views: 22.2K
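A very small sketch of the two-stage recipe the In-N-On thread describes: pre-train on large-scale in-the-wild human data, then post-train on a smaller set of on-task data with accurate action labels. The hour counts come from the tweet; the stage structure and function names are assumptions.

```python
# Hypothetical two-stage training schedule, not the released training code.
from dataclasses import dataclass

@dataclass
class Stage:
    name: str
    data_source: str
    hours: float
    has_action_labels: bool

schedule = [
    Stage("pre-train",  "in-the-wild egocentric human video", 1000.0, False),
    Stage("post-train", "on-task demonstrations",               20.0, True),
]

def train(policy, stage: Stage):
    # Stand-in: label-free objectives for pre-training, supervised behavior
    # cloning once accurate action labels are available.
    objective = "behavior cloning" if stage.has_action_labels else "action-free pre-training"
    print(f"{stage.name}: {stage.hours}h of {stage.data_source} -> {objective}")
    return policy

policy = object()  # placeholder for the actual model
for stage in schedule:
    policy = train(policy, stage)
```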
Roger Qiu@RogerQiu_42·
Excited to see Deformable Real2Sim in action!
Kaifeng Zhang@kaiwynd

🧵 Evaluating robot policies in the real world is slow, expensive, and hard to scale. During my internship at @SceniXai this summer, we had many discussions around two key questions: how accurate must a simulator be for evaluation to be meaningful, and how do we get there? Our new framework, Real2Sim-Eval, takes a step toward that answer. By combining Gaussian Splatting for photorealistic rendering and soft-body digital twins for realistic dynamics, we make simulation predictive of real-world performance. 👉 real2sim-eval.github.io

Replies: 0 · Reposts: 1 · Likes: 8 · Views: 1.3K
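One way to read "making simulation predictive of real-world performance" is as a correlation check between sim and real success rates across policies. The sketch below uses made-up numbers purely for illustration; it is not the Real2Sim-Eval evaluation code.

```python
# Check how well success rates in the reconstructed simulator track real-robot results.
import numpy as np

policies = ["policy_a", "policy_b", "policy_c", "policy_d"]
sim_success = np.array([0.80, 0.55, 0.30, 0.65])   # success rate in the digital twin
real_success = np.array([0.75, 0.50, 0.35, 0.60])  # success rate on the real robot

# Pearson correlation: closer to 1.0 means sim scores predict real scores well.
corr = np.corrcoef(sim_success, real_success)[0, 1]
print(f"sim-to-real correlation: {corr:.3f}")

# Ranking agreement also matters: does the best policy in sim also win in the real world?
print("best in sim :", policies[int(sim_success.argmax())])
print("best in real:", policies[int(real_success.argmax())])
```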
Danijar Hafner@danijarh·
Today is my last day at @GoogleDeepMind. After almost exactly 10 years at Google including 12 internships and the last 2 1/2 years full time, it really feels like a chapter coming to an end. I'm grateful for all the experiences and friends I've made at Google and DeepMind.

I still remember my first Brain internship in Mountain View in 2016 with James Davidson and @V_Vanhoucke, at a time when nobody had a working PPO implementation and we were wrangling with TensorFlow graphs 😄

The moment @lukaszkaiser showed us the first plausible Wikipedia page generated by a "big" LSTM. @ashVaswani full of excitement explaining the compute efficiency of a new architecture that later became the Transformer and asking me to try it for RL (I did not :P)

The excitement to work on Deep RL and generative models at DeepMind during my master's in London, which turned into PlaNet with @countzerozzz and @itfische. Figuring out Karl Friston's free energy principle with Nicolas Heess and @AdaptiveAgents (which took a few more years to get right).

Spending a good part of my PhD at the Brain Team in Toronto working on multiple generations of Dreamer with @mo_norouzi, various collaborations, and celebrating the Turing Award with @geoffreyhinton.

And over the last few years working from Berkeley/SF on world models with @wilson1yan with significant resources thanks to @countzerozzz and @koraykv, and seeing video models & world models accomplish results that seemed completely out of reach just a few years ago.

With mixed feelings but also excitement, it's time to start a new chapter!
San Francisco, CA 🇺🇸 · Replies: 138 · Reposts: 47 · Likes: 2K · Views: 292K
Roger Qiu retweeted
An-Chieh Cheng@anjjei·
Let your robot peek around corners, size up the gap between chairs, and know exactly where everything sits. 🤖 Our new work SR-3D masters this kind of spatial reasoning. It learns from multiple views to understand distances, layouts, and how objects relate in 3D space.
Replies: 6 · Reposts: 43 · Likes: 180 · Views: 73.1K
Roger Qiu@RogerQiu_42·
In 3 years, robots will always work with near-perfect depth.
Minghuan Liu@ericliuof97

🚀 Want to build a 3D-aware manipulation policy, but troubled by noisy depth perception? Want to train your manipulation policy in simulation, but tired of bridging the sim2real gap by degenerating geometric perception, like adding noise? Now these notorious problems are gone with our Camera Depth Models! The Camera Depth Models (CDMs) can be plug-in modules in a real robot pipeline, transforming noisy depth into high-quality perception, enabling seamless sim-to-real transfer and making real robot manipulation work as in simulation!
🎯 Why it matters: Accurate geometry with CDMs helps a sim-data-driven policy solve a set of complex, long-horizon tasks from 0% to 85%+ success! Now you can even train in simulation and deploy on real robots WITHOUT further domain adaptation. Just plug our CDMs into your existing pipeline!
✨ Highlights:
• Zero-shot sim-to-real transfer with 73%+ success (vs 0% baseline)
• Depth-only imitation learning achieves 85%+ success
• Works with RealSense D435/L515, Kinect, ZED2i & more
🛠️ Everything is open:
• We open-source CDMs for 5 distinct cameras
• We open-source the collected ByteCamDepth Dataset, which contains 170K+ RGB-depth pairs across 7 cameras & 10 configurations, a comprehensive real-world depth dataset
• We open-source our code for sim-to-real and camera depth model inference. We also share our modular real-robot control framework for manipulation, which provides a unified interface for controlling various robot arms, integrating sensors, and executing policies in real time!
• We also made a clean sim-to-real tutorial based on our framework!
Check everything and interactive demos at …nipulation-as-in-simulation.github.io
We expect CDMs to be a foundation of your daily robotic research! #Robotics #ComputerVision #SimToReal #DepthPerception #OpenSource

Replies: 0 · Reposts: 1 · Likes: 4 · Views: 870
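A minimal sketch of the plug-in idea above: insert a camera depth model between the raw sensor depth and the policy so the policy always sees cleaned-up geometry, in sim and on the real robot alike. The `CameraDepthModel` class and its `refine` method are hypothetical placeholders, not the released CDM interface.

```python
# Hypothetical depth-refinement plug-in in an existing perception-to-policy pipeline.
import numpy as np

class CameraDepthModel:
    """Placeholder for a learned model that denoises and completes sensor depth."""
    def refine(self, rgb: np.ndarray, raw_depth: np.ndarray) -> np.ndarray:
        # A real CDM would run a neural network here; as a stand-in we just
        # fill holes (zero depth) with the median of the valid measurements.
        depth = raw_depth.copy()
        valid = depth > 0
        if valid.any():
            depth[~valid] = np.median(depth[valid])
        return depth

def policy_step(rgb, depth):
    # Stand-in for a depth-conditioned manipulation policy.
    return np.zeros(7)  # e.g. a 7-DoF action

cdm = CameraDepthModel()
rgb = np.zeros((480, 640, 3), dtype=np.uint8)           # camera image
raw_depth = np.random.rand(480, 640).astype(np.float32)  # noisy sensor depth
raw_depth[100:200, 300:400] = 0.0                         # simulated sensor holes

clean_depth = cdm.refine(rgb, raw_depth)  # the only change to the existing pipeline
action = policy_step(rgb, clean_depth)
```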
Roger Qiu@RogerQiu_42·
By default, Feature Splatting generates CLIP-aligned features. So you can use text prompts to query the consistent 3D features "synthesized" from videos "synthesized" by Genie. Here is an example heatmap response to the prompt 'hose'. 2D->3D is real.
Replies: 0 · Reposts: 0 · Likes: 5 · Views: 273
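A hedged sketch of querying CLIP-aligned 3D features with a text prompt, as in the 'hose' heatmap above. It assumes you already have per-Gaussian features in the same embedding space as the CLIP text encoder (here ViT-B-32, 512-dim); the Feature Splatting repo provides its own query utilities, and this is not that code.

```python
# Text prompt -> CLIP text embedding -> cosine similarity against per-Gaussian features.
import torch
import open_clip

model, _, _ = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
tokenizer = open_clip.get_tokenizer("ViT-B-32")

# Stand-in for the distilled per-Gaussian CLIP-aligned features (N x 512).
gaussian_features = torch.randn(100_000, 512)

with torch.no_grad():
    text_feat = model.encode_text(tokenizer(["hose"]))  # (1, 512)

# Cosine similarity between the prompt and every Gaussian gives a relevance heatmap.
text_feat = torch.nn.functional.normalize(text_feat, dim=-1)
gauss_feat = torch.nn.functional.normalize(gaussian_features, dim=-1)
heatmap = (gauss_feat @ text_feat.T).squeeze(-1)         # (N,)

print("most 'hose'-like Gaussian indices:", heatmap.topk(5).indices.tolist())
```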
Roger Qiu@RogerQiu_42·
1. Gemini CLI breaks the videos down into images.
2. Merge all images into the same COLMAP run.
3. Run Feature Splatting (feature-splatting.github.io).
That's it!
Replies: 1 · Reposts: 0 · Likes: 8 · Views: 326
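A rough scripted version of steps 2-3 above. Frame extraction is shown with ffmpeg as a stand-in for however Gemini CLI splits the videos, and the COLMAP calls are the standard sparse-reconstruction sequence; the paths, frame rate, and video names are arbitrary. No command is invented for Feature Splatting itself; run it per the instructions at feature-splatting.github.io.

```python
# Merge frames from multiple generated videos into a single COLMAP reconstruction.
import subprocess
from pathlib import Path

videos = ["garden_prompt1.mp4", "garden_prompt2.mp4"]  # clips from different prompts
images = Path("images")
images.mkdir(exist_ok=True)

# 1) Extract frames from every video into one shared image folder.
for i, video in enumerate(videos):
    subprocess.run(
        ["ffmpeg", "-i", video, "-vf", "fps=2", str(images / f"vid{i}_%05d.jpg")],
        check=True,
    )

# 2) Register all frames in a single COLMAP run so the merged videos share
#    one consistent 3D coordinate frame.
subprocess.run(["colmap", "feature_extractor",
                "--database_path", "db.db", "--image_path", str(images)], check=True)
subprocess.run(["colmap", "exhaustive_matcher", "--database_path", "db.db"], check=True)
Path("sparse").mkdir(exist_ok=True)
subprocess.run(["colmap", "mapper", "--database_path", "db.db",
                "--image_path", str(images), "--output_path", "sparse"], check=True)

# 3) Train Feature Splatting on the resulting images and poses
#    (see feature-splatting.github.io for the actual training command).
```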
Roger Qiu@RogerQiu_42·
Genie 3 shows amazing 3D consistency even across multiple prompts. You can step inside a garden. Merge videos generated by different text prompts into the same world, and it just works. It gives you RGB renderings and consistent 3D features. Done in 30 minutes ⬇️
Replies: 2 · Reposts: 9 · Likes: 71 · Views: 6.8K
Roger Qiu@RogerQiu_42·
Join our CoRL workshop in Seoul to check out the latest progress in learning from humans - the next potentially scalable data source for robot learning!
Danfei Xu@danfei_xu

Current AI models only learn from a fraction of human intelligence. At CoRL 2025, our brand new "Human to Robot (H2R)" workshop explores how robots can learn from the vast, untapped physical human experience. sites.google.com/view/h2r-corl2… Extended abstract / paper submission deadline Aug 15th. Co-organized with @xiaolonw, @simar_kareer, @RogerQiu_42, Sha Yi, James Fort, @NimaFazeli7, and Jianlong Ye

Replies: 0 · Reposts: 0 · Likes: 4 · Views: 326
Roger Qiu retweeted
Jianglong Ye@jianglong_ye·
How to generate billion-scale manipulation demonstrations easily? Let us leverage generative models! 🤖✨ We introduce Dex1B, a framework that generates 1 BILLION diverse dexterous hand demonstrations for both grasping 🖐️and articulation 💻 tasks using a simple C-VAE model.
Replies: 15 · Reposts: 86 · Likes: 375 · Views: 72.7K
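Since the tweet names a simple C-VAE as the generator, here is a minimal conditional VAE sketch in PyTorch. The dimensions, architecture, and the grasp/articulation conditioning vector are illustrative assumptions and not the Dex1B model.

```python
# Minimal conditional VAE: encode a hand configuration x given context c,
# then decode (or sample) new configurations conditioned on c.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CVAE(nn.Module):
    def __init__(self, x_dim=24, c_dim=10, z_dim=8, hidden=128):
        super().__init__()
        # Encoder q(z | x, c): hand pose x conditioned on scene/task context c.
        self.enc = nn.Sequential(nn.Linear(x_dim + c_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, z_dim)
        self.logvar = nn.Linear(hidden, z_dim)
        # Decoder p(x | z, c): generates a hand configuration from latent + context.
        self.dec = nn.Sequential(
            nn.Linear(z_dim + c_dim, hidden), nn.ReLU(), nn.Linear(hidden, x_dim)
        )

    def forward(self, x, c):
        h = self.enc(torch.cat([x, c], dim=-1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        x_hat = self.dec(torch.cat([z, c], dim=-1))
        return x_hat, mu, logvar

    def sample(self, c):
        # At generation time: draw z from the prior and decode, conditioned on c.
        z = torch.randn(c.shape[0], self.mu.out_features)
        return self.dec(torch.cat([z, c], dim=-1))

model = CVAE()
x = torch.randn(64, 24)   # e.g. dexterous hand joint configurations
c = torch.randn(64, 10)   # e.g. object / task conditioning
x_hat, mu, logvar = model(x, c)
recon = F.mse_loss(x_hat, x)
kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
loss = recon + 1e-3 * kl  # assumed KL weight
```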
Michael Cho - Rbt/Acc@micoolcho·
@RogerQiu_42 @chris_j_paxton Roger, I think that your team at UCSD is absolutely crushing it with the multiple papers coming out. Great to have folks like u still moving the needle in academia and doing things in the open. Congrats on the paper again.
Replies: 1 · Reposts: 0 · Likes: 2 · Views: 73