Michael Xu

150 posts


@mxu_cg

AI Resident @1x_tech 🇺🇸

Joined June 2011
396 Following · 654 Followers
Michael Xu @mxu_cg
Naruto was doing massively parallel reinforcement learning before Isaac Gym
[image attached]
3 replies · 3 reposts · 12 likes · 491 views
Michael Xu reposted
Ruixuan Liu @RuixuanLiu_
Interlocking bricks aren’t just toys, they unlock an infinite space of creativity and a powerful foundation for physical and spatial intelligence. 💡 But how well do we truly understand what we build? What happens when these structures face real-world forces, for example robotic manipulation?

🚀 Introducing #BrickSim — the first physics-based simulator for interlocking brick assemblies. #BrickSim reveals how forces propagate through complex builds, accurately simulating assembly, disassembly, and structural collapse in real time.

More than a simulator, #BrickSim is a unified virtual platform for robotics and Physical AI:
🤖 Seamless integration with diverse robotic systems
🧠 Develop and deploy intelligent control policies
🎮 Real-time teleoperation for interactive dexterous manipulation
🏟️ A unified platform for physical and spatial intelligence
🔧 Lowering the barrier to study contact-rich, ultra-long-horizon tasks

🙌 Huge thanks to our amazing team! @yushijinhun, Weiyi Piao, Siyu Li, @ChangliuL @ICL_at_CMU @CMU_Robotics
🌐 Code: github.com/intelligent-co…
🌐 Paper: arxiv.org/abs/2603.16853
#Robotics #Simulation #PhysicalAI #EmbodiedAI #Manipulation #Brick #Assembly
4 replies · 15 reposts · 67 likes · 8K views
Michael Xu reposted
Junzhe (JJ) He @JayHe748646
Will release something interesting in the coming months.
15 replies · 53 reposts · 580 likes · 36.1K views
Michael Xu reposted
Zhiyang (Frank) Dou @frankzydou
We have seen many works unlock the power of pretrained models for images and videos 🏞️. But what about human motion 🕺💃? Can we leverage a pretrained motion prior for a wide range of downstream tasks?

Yes!! UMO is a simple yet effective framework that, for the first time, unlocks the priors of a motion foundation model (i.e., HY-Motion) for 10+ tasks, including editing, reaction generation, stylization, trajectory control, obstacle avoidance, keyframe infilling, and more.

Amazing work! @xiaoyan_cong and @kunkun0w0.
🏠 Webpage: oliver-cong02.github.io/UMO.github.io/
📄 Paper: arxiv.org/abs/2603.15975

With the growing number of tools for transferring SMPL motion to humanoids, we hope it could also become a source of skills for humanoid robot learning.
#Graphics #Motion #Animation #AIGC #GenerativeAI #Vision #3DV #Robotics #Robot #Humanoid #Learning #GenAI
[image attached]
Quoting Xiaoyan Cong @xiaoyan_cong:

💡 Introducing UMO -- one unified model that unlocks motion foundation model (HY-Motion @TencentHunyuan) priors for 10+ tasks: editing, reaction generation, stylization, trajectory control, obstacle avoidance, keyframe infilling... (1/8)
🌐 Webpage: oliver-cong02.github.io/UMO.github.io/
📄 Paper: arxiv.org/abs/2603.15975

0 replies · 23 reposts · 82 likes · 8.4K views
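A rough sketch of the general pattern behind one of the listed tasks, keyframe infilling, reused from generic diffusion-inpainting ideas rather than from UMO or HY-Motion themselves. It assumes the frozen prior is an iterative denoiser, which is an assumption the thread does not state; the denoiser below is a stand-in and all dimensions are invented.

import numpy as np

T, D = 60, 63                      # frames, pose dims (both hypothetical)
rng = np.random.default_rng(0)

def frozen_prior_denoise(x_t, t):
    """Stand-in for a pretrained motion denoiser: here it only shrinks the noise."""
    return x_t * (1.0 - 1.0 / (t + 2))

def infill(keyframes, steps=50):
    known = np.zeros((T, D))
    mask = np.zeros((T, 1))
    for frame, pose in keyframes.items():
        known[frame] = pose
        mask[frame] = 1.0
    x = rng.normal(size=(T, D))            # start from noise
    for t in reversed(range(steps)):
        x = frozen_prior_denoise(x, t)      # frozen prior proposes a full motion
        x = mask * known + (1 - mask) * x   # task constraint overwrites the keyframes
    return x

motion = infill({0: np.zeros(D), 59: np.ones(D)})
print(motion.shape)  # (60, 63): full trajectory consistent with both keyframes

The point of the sketch is only that the prior stays frozen while the task is expressed as a constraint applied around it; how UMO actually conditions HY-Motion is described in the paper linked above.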
Michael Xu reposted
Zhikai Zhang @Zhikai273
🎾 Introducing LATENT: Learning Athletic Humanoid Tennis Skills from Imperfect Human Motion Data
Dynamic movements, agile whole-body coordination, and rapid reactions. A step toward athletic humanoid sports skills.
Project: zzk273.github.io/LATENT/
Code: github.com/GalaxyGeneralR…
162 replies · 643 reposts · 4.1K likes · 1.4M views
Michael Xu reposted
Chenhao Li @breadli428
Simplicity should be valued more. When a task can be solved equally well with a simpler framework, one should not be blamed for having “nothing new”. Many unnecessary novelties are invented for the sake of novelty, while the effort of making simpler methods general goes unappreciated.
[image attached]
6 replies · 13 reposts · 169 likes · 7.9K views
Michael Xu reposted
Sirui Xu @xu_sirui
People often debate what a good interface for humanoid control should be. Should it focus on general motion tracking? Or should it prioritize sparse goal following? To us, these capabilities are not contradictory—they are complementary foundations of autonomy.

We aim to build an all-in-one interface for loco-manipulation that can operate under multiple forms of guidance: motion tracking, sparse goal commands, accurate motion-capture states, or egocentric partial observations. When rich signals are available, the system can leverage them; when they are missing, it should still function robustly.

Such flexibility is essential in real-world settings. During teleoperation, a robot may track a full reference trajectory. During deployment, the same motor prior must be effective with only a high-level goal, relying on egocentric perception to complete the task. A practical control system should seamlessly span both regimes—and everything in between.

Project Page: ultra-humanoid.github.io
Paper: arxiv.org/abs/2603.03279
Quoting Xialin He @Xialin_He:

Real-world loco-manipulation demands more than replaying fixed reference motions. We argue that true autonomy requires two capabilities:
1️⃣ flexibly leveraging whatever signals are available — dense references, partial cues, state estimates, or egocentric perception
2️⃣ remaining capable when any of these signals are missing or unreliable

We introduce ULTRA — an all-in-one controller for unified humanoid loco-manipulation 🤖
It supports:
• general reference tracking
• sparse goal following
• execution with motion capture
• execution with egocentric perception
🔗 Project page: ultra-humanoid.github.io

1 reply · 12 reposts · 61 likes · 5.5K views
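One generic way to realize an interface like this (not the ULTRA implementation; the modality names and sizes below are invented for illustration) is to give a single policy a fixed-size observation with per-modality validity masks, zero-filling whatever guidance happens to be absent:

import numpy as np

MODALITY_DIMS = {           # hypothetical sizes
    "dense_reference": 60,   # e.g. target joint positions over a short horizon
    "sparse_goal": 7,        # e.g. target root pose
    "mocap_state": 34,       # e.g. estimated full-body state
    "egocentric": 128,       # e.g. visual feature embedding
}

def build_observation(signals):
    """Zero-fill missing modalities and append a validity mask so the network
    can tell 'signal is zero' apart from 'signal is missing'."""
    parts, mask = [], []
    for name, dim in MODALITY_DIMS.items():
        value = signals.get(name)
        if value is None:
            parts.append(np.zeros(dim, dtype=np.float32))
            mask.append(0.0)
        else:
            parts.append(np.asarray(value, dtype=np.float32).reshape(dim))
            mask.append(1.0)
    return np.concatenate(parts + [np.array(mask, dtype=np.float32)])

# Teleoperation: dense reference available; deployment: only a sparse goal + egocentric cues.
obs_teleop = build_observation({"dense_reference": np.zeros(60), "mocap_state": np.zeros(34)})
obs_deploy = build_observation({"sparse_goal": np.zeros(7), "egocentric": np.zeros(128)})
assert obs_teleop.shape == obs_deploy.shape  # one policy, one observation size

Masked zero-filling is only the simplest option; the ULTRA paper linked above describes how the actual controller handles the different guidance regimes.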
Michael Xu reposted
Bernt Bornich @BerntBornich
These guys get it (equally true for the rest of the robot, for safety and the ability to learn through failure, not just the sim2real gap).
Adding NEO's hands:
DOF: 22 (44 active tendons per hand, fully actuated)
Ratio: 8:1 (w/ tendons, 1X custom high-torque motors)
Sim2Real Gap: Low (10-15% friction, high stiffness)
Force Transparency: High (motor currents)
Reliability: High (3.5M cycles at nominal load)
[image attached]
Quoting Quanting Xie @DanielXieee:

Why does manipulation lag so far behind locomotion? New post on one piece we don't talk about enough: the gearbox.

The Gap
You've probably seen those dancing humanoid robots from Chinese New Year. Locomotion isn't entirely solved, but it's clearly on a trajectory. We haven't seen anything close for manipulation. Why?

When sim-to-real transfer fails, the instinct is to blame the algorithm. Train bigger networks. Crank up domain randomization. Those approaches have made real progress; we don't deny that. But we started wondering: are we treating the symptom or the disease?

The Hardware Bottleneck
Fingers are too small for powerful motors, so most hands use massive gearboxes (200:1, 288:1) to get enough torque. But those gearboxes break everything manipulation needs:
• Stiction and backlash are complex to simulate. Policies trained on smooth physics hallucinate when they hit that reality.
• Reflected inertia scales as N². At large gear ratios, the finger hits with sledgehammer momentum.
• Friction blocks force information. The hand becomes blind.
And they're the first thing to break.

What we are trying to build at Origami: we cut the gear ratio from 288:1 to 15:1 using axial flux motors and thermal optimization. The transmission becomes more transparent: backdrivable, low friction, forces propagate to motor current. Early signs are encouraging. Still running quantitative benchmarks.

Why Interactive?
I love how the Science Center uses interactive devices to explain complex ideas. I want to borrow this concept and help people understand the hard problems in robotics better visually. The post has demos where you can toggle friction, slide gear ratios, and watch the sim-to-real gap widen in real time.

What's inside:
• Interactive demos (friction curves, N² scaling, contact patterns)
• Comparison table: 14 robot hands by sim-to-real gap and force transparency
• The math behind why low ratio matters

Read it here: origami-robotics.com/blog/dexterity…

We're not claiming we've solved dexterity. The deadlock has many pieces. But we think this one's foundational. Curious what you think.

11 replies · 25 reposts · 222 likes · 61.1K views
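The N² point is just the standard gearbox reflection rule: inertia reflected to the joint grows with the square of the gear ratio, while available torque grows only linearly. A minimal sketch with invented motor numbers (not NEO or Origami specs):

def reflected_inertia(rotor_inertia_kgm2, gear_ratio):
    """Inertia felt at the joint: rotor inertia is reflected through the gearbox as N^2."""
    return rotor_inertia_kgm2 * gear_ratio ** 2

def output_torque(motor_torque_nm, gear_ratio, efficiency=0.8):
    """Continuous torque at the joint (linear in N), ignoring stiction and backlash."""
    return motor_torque_nm * gear_ratio * efficiency

rotor_J = 1e-6    # kg*m^2, invented value for a small finger motor
motor_tau = 0.02  # N*m continuous, invented value

for N in (288, 15):
    print(f"N = {N:3d}: reflected inertia {reflected_inertia(rotor_J, N):.2e} kg*m^2, "
          f"joint torque ~{output_torque(motor_tau, N):.2f} N*m")

# For the same rotor, 288:1 carries roughly (288/15)^2 ~ 370x the reflected inertia
# of 15:1 (the "sledgehammer momentum" the post describes), while the torque
# advantage is only ~19x.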
Michael Xu reposted
Sirui Xu @xu_sirui
InterPrior is accepted to #CVPR2026. Meanwhile, we’d love to share our ongoing open-sourcing efforts:
InterAct: github.com/wzyabcas/Inter…
InterMimic: github.com/Sirui-Xu/Inter…
1. InterMimic now supports multi-GPU training, as well as IsaacLab replay and inference.
2. InterAct—our large-scale HOI dataset—now supports converting all its data into simulation-ready formats, directly compatible with, e.g., InterMimic and Holosoma for robot-object retargeting.
3. InterAct also adds support for more HOI data, including processing from ARCTIC (articulated objects) and ParaHome (full scenes), plus a workflow for acquiring higher-quality HOI data from these MoCap sources.
We’ll keep pushing forward, and more updates will come. We’d also love any feedback and contributions from the community.
This would not have been possible without my labmates Jiangshan Gong, Ziyin Wang, and Yucheng Zhang.
(Videos here are data replay only; physics is not enabled.)
Quoting Sirui Xu @xu_sirui:

Humanoids need autonomy + versatility + generalization to be truly useful. Loco-manipulation makes that hard. InterPrior is our step toward bridging the gap — one policy, no reference. Could be promising for immersive games 🎮 and real robots 🤖
🔗 sirui-xu.github.io/InterPrior
📜 arxiv.org/abs/2602.06035
[1/9]

2 replies · 37 reposts · 216 likes · 14.3K views
Michael Xu reposted
Chenhao Li @breadli428
⭐️⭐️⭐️ Our Robotic World Model repo reached 500 stars! ⭐️⭐️
🤖 We made it easy to start Model-Based RL with real robots! If you don't like simulator hassles, try our lite version with pretrained checkpoints here (even on Colab):
github.com/leggedrobotics…
github.com/leggedrobotics…
2 replies · 31 reposts · 213 likes · 9.4K views
Michael Xu reposted
Kevin Zakka @kevin_zakka
New in mjlab from the amazing @ki_ki_ki1: 8 new terrains and a viser-based terrain visualizer 😎
3 replies · 17 reposts · 157 likes · 16.2K views
Michael Xu reposted
Kevin Zakka @kevin_zakka
mjlab v1.0.0 is officially out and considered stable. Huge thanks to everyone who contributed code, reported issues, and gave feedback. This release wouldn’t have happened without you. github.com/mujocolab/mjlab
15 replies · 42 reposts · 316 likes · 27.8K views
Michael Xu reposted
Han Xue @__Axian__
Are humanoid robots ready to step into our homes? 🤖🏠
Meet Click-and-Traverse: navigate through cluttered space like Jackie Chan!
🌟 ONE policy for ALL indoor scenes
🤸‍♂️ Conquer OMNI-SPATIAL (= ground + lateral + overhead) constraints
🖱️ Easy teleoperation: simply click a goal, and the robot smoothly traverses towards it
Check it out now!
👉 Project: axian12138.github.io/CAT/
🚀 Code: github.com/GalaxyGeneralR…
4 replies · 26 reposts · 214 likes · 8K views
Michael Xu reposted
Chenhao Li @breadli428
🌎 World models can predict, but controlling real robots from imagination has long failed due to hallucination.
🧠 Introducing Uncertainty-Aware RWM: a black-box, end-to-end neural dynamics model with long-horizon uncertainty propagation.
🎯 sites.google.com/view/uncertain…
7 replies · 39 reposts · 258 likes · 42.1K views
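For intuition only (this is not the RWM code), one common way to propagate uncertainty over a long imagined horizon is to roll out an ensemble of learned dynamics models and treat their disagreement as the uncertainty signal, truncating imagination once it grows too large; everything below (models, threshold, dimensions) is a made-up stand-in:

import numpy as np

def make_fake_model(seed):
    """Stand-in for one learned dynamics model: s' = A s + B a, with member-specific error."""
    r = np.random.default_rng(seed)
    A = np.eye(4) + 0.01 * r.normal(size=(4, 4))
    B = 0.1 * r.normal(size=(4, 2))
    return lambda s, a: A @ s + B @ a

ensemble = [make_fake_model(i) for i in range(5)]

def imagined_rollout(s0, policy, horizon, max_std=0.5):
    states = [np.stack([s0] * len(ensemble))]       # one copy of the state per member
    for t in range(horizon):
        a = policy(states[-1].mean(axis=0))
        nxt = np.stack([m(s, a) for m, s in zip(ensemble, states[-1])])
        states.append(nxt)
        if nxt.std(axis=0).max() > max_std:          # disagreement = propagated uncertainty
            break                                    # stop trusting the imagination here
    return states

rollout = imagined_rollout(np.zeros(4), policy=lambda s: np.ones(2), horizon=100)
print(f"trusted imagination horizon: {len(rollout) - 1} steps")

Ensemble disagreement is only one uncertainty estimator; the linked project page describes how RWM actually models and propagates uncertainty end to end.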
Michael Xu reposted
Chen Tessler @ChenTessler
We're thrilled to share ProtoMotions v3.1! With a fully modular architecture and robust domain randomization, we’re taking a massive step toward bridging the gap between animation and real-world robotics deployment. Git: github.com/NVlabs/ProtoMo… 🧵 👇
4 replies · 35 reposts · 275 likes · 12.8K views
Michael Xu reposted
C Zhang @ChongZitaZhang
Releasing AME2: Agile and Generalized Legged Locomotion via Attention-Based Neural Map Encoding
arxiv.org/abs/2601.08485
In this work, we discuss how to achieve a combination of generalization and agility in legged locomotion, and propose a general solution.
10 replies · 40 reposts · 283 likes · 79.1K views
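As a generic illustration of attention-based map encoding (not the AME2 architecture; patch size, dimensions, and weights below are arbitrary), local height-map patches can be turned into tokens that the robot state attends over, yielding a fixed-size terrain embedding regardless of map extent:

import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def encode_map(height_map, robot_state, Wq, Wk, Wv, patch=4):
    """Cross-attention: the robot state queries flattened height-map patches."""
    H, W = height_map.shape
    tokens = (height_map
              .reshape(H // patch, patch, W // patch, patch)
              .transpose(0, 2, 1, 3)
              .reshape(-1, patch * patch))           # (num_patches, patch*patch)
    q = robot_state @ Wq                              # (d,)
    k = tokens @ Wk                                   # (num_patches, d)
    v = tokens @ Wv                                   # (num_patches, d)
    attn = softmax(k @ q / np.sqrt(q.shape[0]))       # one weight per terrain patch
    return attn @ v                                   # fixed-size map embedding

rng = np.random.default_rng(0)
d, state_dim, patch = 32, 48, 4
Wq = 0.1 * rng.normal(size=(state_dim, d))
Wk = 0.1 * rng.normal(size=(patch * patch, d))
Wv = 0.1 * rng.normal(size=(patch * patch, d))
z = encode_map(rng.normal(size=(16, 16)), rng.normal(size=state_dim), Wq, Wk, Wv)
print(z.shape)   # (32,) regardless of map size, as long as it tiles into patches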
Michael Xu reposted
Eric Jang @ericjang11
It's not every day one comes across magic. So when you come across a magical glowing rock that is intelligent, you ought to find some way to "harness the magic". Strap it to a steam engine, put it to work!

We started looking into video world models in early 2024, with the key thesis that video generation is one of "the great magic rocks of deep learning". Clearly, video models understand the joint distribution of text and video quite well. They have "better-than-expected generalization" when compared to robots. The data distribution for a robot is not far from that: if you can predict video, you are probably not far from predicting actions!

Thanks to the persistent and thorough work by the WM team, we've also recently unlocked the ability to connect the excellent generalization capabilities of WMs to humanoid robots like NEO. This has also been enormously validating for the human egocentric data paradigm. We see a lot of transfer from this data, especially since NEO has human kinematics. As we've scaled up compute and data, the results keep getting better.

1x.tech/discover/1x-wo…
1x.tech/discover/redwo…
3 replies · 4 reposts · 54 likes · 5.3K views