Masoud Moghani

64 posts

Masoud Moghani

@MasoudMoghani

Interning at GEAR @NVIDIAAI, PhDing @UofT

San Francisco, CA Katılım Nisan 2022

1.7K Takip Edilen389 Takipçiler

Sabitlenmiş Tweet

Masoud Moghani@MasoudMoghani·1 Nis

Simulations scale for rigid objects, but deformable objects remain an open frontier. SoftMimicGen generates large-scale training data from just a handful of demonstrations. softmimicgen.github.io

English

166

22K

Masoud Moghani@MasoudMoghani·25 Nis

@aaronistan 👌🏻

QME

Aaron Tan@aaronistan·24 Nis

We are hosting the largest showing of Lume in Palo Alto Bring your own laundry Details below

English

176

121

2.6K

857K

Masoud Moghani retweetledi

Zhengyi “Zen” Luo@zhengyiluo·15 Nis

SONIC training code + Finetuning checkpoint + VLA data collection scripts are open-sourced. Little easter egg on the GEAR-SONIC website too :) github.com/NVlabs/GR00T-W…

Zhengyi “Zen” Luo@zhengyiluo

SONIC is now open-source! Generalist whole-body teleoperation for EVERYONE! Our team has long been building comprehensive pipelines for whole-body control, kinematic planner, and teleoperation, and they will all be shared. This will be a continuous update; inference code + model already there, training code and gr00t integration coming soon! Code: github.com/NVlabs/GR00T-W… Docs: nvlabs.github.io/GR00T-WholeBod… Site: nvlabs.github.io/GEAR-SONIC/

English

184

32.3K

Masoud Moghani@MasoudMoghani·9 Nis

@aaronistan pretty cool, congrats!!

English

236

Aaron Tan@aaronistan·8 Nis

Introducing Lume. A lamp that does your chores. Order now. Shipping this summer.

English

448

170

2.5K

1.3M

Masoud Moghani@MasoudMoghani·7 Nis

Deformable tasks hit a data and sim bottleneck. SoftMimicGen is a step toward fixing it.

Stephen James@stepjamUK

𝗧𝗵𝗲 𝗿𝗲𝗮𝘀𝗼𝗻 𝗿𝗼𝗯𝗼𝘁𝘀 𝘀𝘁𝗶𝗹𝗹 𝘀𝘁𝗿𝘂𝗴𝗴𝗹𝗲 𝘄𝗶𝘁𝗵 𝗳𝗼𝗼𝗱, 𝗳𝗮𝗯𝗿𝗶𝗰, 𝗼𝗿 𝗳𝗹𝗲𝘅𝗶𝗯𝗹𝗲 𝗽𝗮𝗿𝘁𝘀 𝗮𝘁 𝘀𝗰𝗮𝗹𝗲 𝗶𝘀𝗻’𝘁 𝗷𝘂𝘀𝘁 𝗵𝗮𝗿𝗱𝘄𝗮𝗿𝗲 𝗼𝗿 𝗺𝗼𝗱𝗲𝗹𝘀. 𝗜𝘁’𝘀 𝗱𝗮𝘁𝗮. Rigid-body manipulation has dominated robot learning research because simulation handles it cleanly. Deformable objects don't simulate easily, so teams have been forced to collect everything by hand. Slowly, expensively, and at a scale that doesn't generalise. @nvidia and the @UofT recently published SoftMimicGen: a data generation system that extends automated demonstration synthesis to deformable object manipulation for the first time. Here's why it matters. • 𝗦𝗰𝗮𝗹𝗲 𝘄𝗶𝘁𝗵𝗼𝘂𝘁 𝘁𝗵𝗲 𝗵𝗲𝗮𝗱𝗰𝗼𝘂𝗻𝘁. SoftMimicGen generates thousands of demonstrations from a small number of human-collected seeds, dramatically reducing dataset bottlenecks. • 𝗔 𝗹𝗼𝗻𝗴-𝘀𝘁𝗮𝗻𝗱𝗶𝗻𝗴 𝗴𝗮𝗽, 𝗰𝗹𝗼𝘀𝗲𝗱. The approach builds on methods that worked for rigid objects and brings them into domains like food handling, healthcare, and logistics. • 𝗦𝗶𝗺𝘂𝗹𝗮𝘁𝗶𝗼𝗻 𝘁𝗵𝗮𝘁 𝗮𝗰𝘁𝘂𝗮𝗹𝗹𝘆 𝘁𝗿𝗮𝗻𝘀𝗳𝗲𝗿𝘀. Policies trained on generated data show strong real-world generalisation, narrowing the sim-to-real gap. Every major advance in robot learning traces back to a data problem. Better algorithms, better architectures, better hardware - none of it fires without the training data to back it up. Collecting that data manually doesn't scale. The infrastructure to generate, manage, and deploy it does. Great work @MasoudMoghani, @AjayMandlekar, Sean Huver, and the full NVIDIA and University of Toronto team on this piece of work. Link to paper: arxiv.org/html/2603.2572…

English

197

Masoud Moghani retweetledi

Siddhant Haldar@haldar_siddhant·3 Nis

Very cool results from @MasoudMoghani! Some evidence that Point Bridge can enable zero-shot sim2real transfer for deformable objects. x.com/haldar_siddhan…

Masoud Moghani@MasoudMoghani

Simulations scale for rigid objects, but deformable objects remain an open frontier. SoftMimicGen generates large-scale training data from just a handful of demonstrations. softmimicgen.github.io

English

Masoud Moghani retweetledi

Ajay Mandlekar@AjayMandlekar·2 Nis

Deformable manipulation has long stalled simulation adoption. SoftMimicGen reduces this barrier with 4 embodiments, scalable data generation, and proven sim-to-real results. Check out Masoud’s thread below for more!

Masoud Moghani@MasoudMoghani

Simulations scale for rigid objects, but deformable objects remain an open frontier. SoftMimicGen generates large-scale training data from just a handful of demonstrations. softmimicgen.github.io

English

6.9K

Masoud Moghani@MasoudMoghani·2 Nis

@Stone_Tao thanks @Stone_Tao

English

Stone Tao@Stone_Tao·2 Nis

@MasoudMoghani awesome to see this released!

English

Masoud Moghani@MasoudMoghani·1 Nis

Simulations scale for rigid objects, but deformable objects remain an open frontier. SoftMimicGen generates large-scale training data from just a handful of demonstrations. softmimicgen.github.io

English

166

22K

Masoud Moghani@MasoudMoghani·1 Nis

This was a team effort! Great collaborators: Mahdi Azizian, @animesh_garg, @yukez, Sean Huver, @AjayMandlekar Paper: arxiv.org/abs/2603.25725 Website: softmimicgen.github.io Environments and pipeline will be released.

English

244

Masoud Moghani@MasoudMoghani·1 Nis

SoftMimicGen is a strict generalization of MimicGen! It handles deformable AND rigid objects. We show this on a rigid cube stacking task. Any task MimicGen can do, SoftMimicGen can do too, plus the entire world of soft object manipulation.

English

246

Masoud Moghani retweetledi

Jim Fan@DrJimFan·1 Nis

The power of the Claw, in the palm of a robot hand. Agentic robotics is here! Today, we open-source CaP-X: vibe agents, alive in the physical world. They incarnate as robot arms and humanoids with a rich set of perception APIs, actuation APIs, and auto synthesize skill libraries as they go. CaP-X is a strict superset of our old stack, because policies like VLAs are “just” API calls as well. It solves many tasks zero-shot that a learned policy would struggle with. And we are doing much more than vibing. CaP-X is our most systematic, scientific study on agentic robotics so far: - We build a comprehensive agentic toolkit: perception (SAM3 segmentation, Molmo pointing, depth, point cloud), control (IK solvers, grasp planner, navigation), and visualization (EEF, mask overlays) that work across different robots. - CaP-Gym: LLM’s first Physical Exam! 187 manipulation tasks across RoboSuite, LIBERO-PRO, and BEHAVIOR. Tabletop, bimanual, mobile manipulation. Sim and real. Can’t wait to see the gradients flow from CaP-Gym to the next wave of frontier LLM releases. - CaP-Bench: we benchmark 12 frontier LLMs/VLMs (Gemini, GPT, Opus, Qwen, DeepSeek, Kimi, and more) across 8 evaluation tiers. We systematically vary API abstraction level, agentic harness, and visual grounding methods. Lots of insights in our paper. - CaP-Agent0: a training-free agentic harness that matches or exceeds human expert code on 4 out of 7 tasks without task-specific tuning. - CaP-RL: if you get a gym, you get RL ;). A 7B OSS model jumps from 20% to 72% success after only 50 training iterations. The synthesized programs transfer to real robots with minimal sim-to-real gap. 3 years ago, our team created Voyager, one of the earliest agentic AI that plays and learns in Minecraft continuously. Its key ideas — skill libraries, self-reflection loops, and in-context planning — have since influenced many modern agentic designs. Today, the agent graduates from Minecraft and gets a real job. It’s April Fool’s, but this Claw is getting its hands dirty for real! Link in thread:

English

100

114

721

70.6K

Masoud Moghani retweetledi

Max Fu@letian_fu·1 Nis

Robotics: coding agents’ next frontier. So how good are they? We introduce CaP-X: an open-source framework and benchmark for coding agents, where they write code for robot perception and control, execute it on sim and real robots, observe the outcomes, and iteratively improve code reliability. From @NVIDIA @Berkeley_AI @CMU_Robotics @StanfordAILab capgym.github.io 🧵

English

128

630

157.2K

Masoud Moghani@MasoudMoghani·26 Şub

@bowenwen_me Great work, congrats!

English

Bowen Wen@bowenwen_me·26 Şub

Our paper has been accepted to CVPR 2026🎊. Code will be released very soon! Stay tuned at github.com/NVlabs/Fast-Fo…

Bowen Wen@bowenwen_me

A new milestone for real-time accurate 3D spatial computing! Introducing ⚡️Fast-FoundationStereo⚡️, a real-time zero-shot stereo depth estimation model that accelerates the original FoundationStereo by >10x with comparable quality. Details in threads 🧵 (1/N)

English

8.9K

Masoud Moghani retweetledi

Ruijie Zheng@ruijie_zheng12·25 Şub

Proud to introduce EgoScale: We pretrained a GR00T VLA model on 20K+ hours of egocentric human video and discovered that robot dexterity can be scaled, not with more robots, but with more human data. A thread on 🧵what we learned. 👇

English

330

95.8K

Masoud Moghani retweetledi

Zhengyi “Zen” Luo@zhengyiluo·20 Şub

English

203

914

212.5K

Keşfet

@aaronistan @Stone_Tao @animesh_garg @yukez @AjayMandlekar @NVIDIA @Berkeley_AI @CMU_Robotics