Fernando Castañeda

17 posts

@FerCastanedaGR

Research Scientist @nvidia GEAR | PhD @UCBerkeley

San Francisco, CA · Joined February 2026
19 Following · 41 Followers
Pinned Tweet
Fernando Castañeda @FerCastanedaGR
Now you can use GR00T N1.7 and SONIC together to enable tasks that require TRUE whole-body coordination!! Including simultaneous precise hand and foot placement, like opening a trash can with the foot pedal and throwing an object inside! Try it yourself, it is so fun!
Zhengyi “Zen” Luo @zhengyiluo

Open-sourcing the whole package here! The last piece of our SONIC open-source, data collection, gr00t VLA post-training, inference just hit the repo! Train your Autonomous policies on G1 Whole-body with SONIC and gr00t N1.7! 🧑‍💻Code: github.com/NVlabs/GR00T-W… 📑Docs: nvlabs.github.io/GR00T-WholeBod…

0
10
35
8.6K
Fernando Castañeda retweeted
Jim Fan @DrJimFan
I promise this will be the best 20 min you spend today! Robotics: Endgame, the sequel to my last year's Sequoia AI Ascent talk, "Physical Turing Test". I laid out the roadmap for solving Physical AGI as a simple parallel to the LLM success story. Be a good scientist, copy homework ;) And stay till the end, more easter eggs and predictions for your polymarket!

00:30 DGX-1 origin story at OpenAI, I was there in 2016 signing with Jensen and Elon. Heading to the Computer History Museum!
01:42 The Great Parallel
03:31 Robotics, the Endgame
03:39 Why VLAs fall short
04:32 Video world models as the 2nd pretraining paradigm
06:09 World Action Models (WAM)
07:46 Strategies for robot data collection and the FSD equivalent to physical data flywheel for robot manipulation
11:06 EgoScale and the Dexterity Scaling Law we discovered recently
14:00 Physical RL: bridging the last mile
15:39 DreamDojo: an end-to-end neural physics engine for scaling RL in silico
17:00 Civilizational Technology Tree and my predictions for the near future. Spoiler: it's closer than you think.

Thanks to my friends at Sequoia for inviting me back to AI Ascent this year! I had a blast! Last year's talk is attached in the thread if you missed it.
153
536
3.4K
530.2K
Fernando Castañeda retweeted
Dhruv Diddi @DhruvDiddi
🚀 @Solo__Tech breakthrough: NVIDIA Sonic goes beyond the G1. 🌏 We’ve achieved a global first: migrating NVIDIA Sonic to a completely different humanoid morphology, the AGIBOT X2 🥇 This is a massive leap for transferable humanoid intelligence. We are moving away from single-robot controllers toward architectures that generalize across diverse embodiments.

The Specs:
Hardware: AGIBOT X2 Ultra (31 DoF)
Precision: 14-DoF Dexterous Hands
Capabilities:
- End-to-end whole-body locomotion with manipulation
- Single leg balancing and stylized motion
- Upper body gestures

Big respect to the team fueling this innovation all the way! @meetsitaram @zeeshaan_7788 @Samarth_1506 @DevSodhi @flyingtaxiguy @build @frontiertower @nvidia @AGIBOTofficial @nebiusai @UFBots @vitl2907 @XeniaBulatov @NVIDIARobotics

To learn more about Solo Tech at the frontier of Physical AI: getsolo.tech/blog

#PhysicalAI #Robotics #Humanoids #NVIDIA #AGIBOT #SoloTech #EmbodiedAI #SoloSeven
17
48
326
26.1K
Fernando Castañeda retweeted
Tairan He @TairanHe99
GR00T-VisualSim2Real is now open source! VIRAL and DoorMan are now available with training code, simulation assets, and the full recipe for bringing visual sim-to-real loco-manipulation skills to your own humanoids. Repo: github.com/NVlabs/GR00T-V…
Tairan He @TairanHe99

Zero teleoperation. Zero real-world data. ➔ Autonomous humanoid loco-manipulation in reality. Introducing VIRAL: Visual Sim-to-Real at Scale. We achieved 54 autonomous cycles (walk, stand, place, pick, turn) using a simple recipe:
1. RL
2. Simulation
3. GPUs
Website: viral-humanoid.github.io
Arxiv: arxiv.org/abs/2511.15200
Deep dive with me: 🧵

6
98
615
114.3K
Fernando Castañeda retweeted
tingwu.wang @TingwuWang
What is missing to bring real-time motion research into AAA games and real-world robotics? We present MotionBricks, a step toward bridging this gap with two key components:
- a single generative latent motion backbone covering 350,000+ motion skills, running at 15,000 FPS with 2 ms latency and substantially improved quality and reliability.
- a unified smart primitive interface for locomotion, object / scene interaction, with fine-grained control over generated behaviors.
Webpage: nvlabs.github.io/motionbricks/
Code: github.com/NVlabs/GR00T-W…
Paper: arxiv.org/abs/2604.24833 (ACM TOG / SIGGRAPH 2026)
27
150
1.2K
150.2K
Fernando Castañeda retweeted
Fernando Castañeda @FerCastanedaGR
Try our foundation model for whole-body control! It’s open-source! Super proud to be part of this team 🚀
Jim Fan @DrJimFan

What can half of GPT-1 do? We trained a 42M transformer called SONIC to control the body of a humanoid robot. It takes a remarkable amount of subconscious processing for us humans to squat, turn, crawl, sprint. SONIC captures this "System 1" - the fast, reactive whole-body intelligence - in a single model that translates any motion command into stable, natural motor signals. And it's all open-source!!

The key insight: motion tracking is the one, true scalable task for whole body control. Instead of hand-engineering rewards for every new skill, we use dense, frame-by-frame supervision from human mocap data. The data itself encodes the reward function: "configure your limbs in any human-like position while maintaining balance".

We scaled humanoid motion RL to an unprecedented scale: 100M+ mocap frames and 500,000+ parallel robots across 128 GPUs. NVIDIA Isaac Lab allows us to accelerate physics at 10,000x faster tick, giving robots many years of virtual experience in only hours of wall clock time. After 3 days of training, the neural net transfers zero-shot to the real G1 robot with no finetuning. 100% success rate across 50 diverse real-world motion sequences.

One SONIC policy supports all of the following:
- VR whole-body teleoperation
- Human video. Just point a webcam to live stream motions.
- Text prompts. "Walk sideways", "dance like a monkey", "kick your left foot", etc.
- Music audio. The robot dances to the beat, adapting to tempo and rhythm.
- VLA foundation models. We plugged in GR00T N1.5 and achieved 95% success on mobile tasks.

We open-source the code and model checkpoints!! Deep dive in thread:
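A minimal sketch of the idea in the quoted post, that a mocap frame can itself serve as the reward signal. The exponential form, the `sharpness` constant, the joint-angle pose representation, and every function name here are assumptions for illustration; the post does not specify SONIC's actual reward.

```python
import math

def tracking_reward(robot_pose, reference_pose, sharpness=2.0):
    """Per-frame reward in (0, 1]; equals 1.0 when the robot matches the mocap frame.

    Hypothetical form: exp(-sharpness * mean squared joint-angle error).
    """
    if len(robot_pose) != len(reference_pose):
        raise ValueError("poses must have the same number of joints")
    sq_err = sum((q - q_ref) ** 2 for q, q_ref in zip(robot_pose, reference_pose))
    return math.exp(-sharpness * sq_err / len(reference_pose))

def episode_return(robot_traj, reference_traj):
    """Dense supervision: one reward term for every frame of the reference clip."""
    return sum(tracking_reward(q, q_ref) for q, q_ref in zip(robot_traj, reference_traj))

# Toy two-frame, three-joint reference clip (made-up joint angles in radians).
reference = [[0.0, 0.5, -0.5], [0.1, 0.6, -0.4]]
perfect = episode_return(reference, reference)
sloppy = episode_return([[0.3, 0.2, -0.1], [0.4, 0.1, 0.0]], reference)
print(perfect, sloppy)
```

No skill-specific reward is written anywhere in this sketch: swapping in a different reference clip changes the task without touching the reward code.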

1
0
1
25
Fernando Castañeda @FerCastanedaGR
Human data as the most scalable data source for robotics! Seeing first-hand the generalization capabilities unlocked by human video data was mind-blowing to me. Check out our paper!
Jim Fan @DrJimFan

We trained a humanoid with 22-DoF dexterous hands to assemble model cars, operate syringes, sort poker cards, fold/roll shirts, all learned primarily from 20,000+ hours of egocentric human video with no robot in the loop. Humans are the most scalable embodiment on the planet. We discovered a near-perfect log-linear scaling law (R² = 0.998) between human video volume and action prediction loss, and this loss directly predicts real-robot success rate.

Humanoid robots will be the end game, because they are the practical form factor with minimal embodiment gap from humans. Call it the Bitter Lesson of robot hardware: the kinematic similarity lets us simply retarget human finger motion onto dexterous robot hand joints. No learned embeddings, no fancy transfer algorithms needed. Relative wrist motion + retargeted 22-DoF finger actions serve as a unified action space that carries through from pre-training to robot execution.

Our recipe is called "EgoScale":
- Pre-train GR00T N1.5 on 20K hours of human video, mid-train with only 4 hours (!) of robot play data with Sharpa hands. 54% gains over training from scratch across 5 highly dexterous tasks.
- Most surprising result: a *single* teleop demo is sufficient to learn a never-before-seen task. Our recipe enables extreme data efficiency.
- Although we pre-train in 22-DoF hand joint space, the policy transfers to a Unitree G1 with 7-DoF tri-finger hands. 30%+ gains over training on G1 data alone.

The scalable path to robot dexterity was never more robots. It was always us. Deep dives in thread:
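A log-linear scaling law of the kind described above can be written as loss = a + b · log10(hours). The sketch below fits that form by ordinary least squares and computes R². The data points are synthetic placeholders, not the EgoScale measurements, and the function name is hypothetical.

```python
import math

def fit_log_linear(hours, losses):
    """Least-squares fit of loss = a + b * log10(hours); returns (a, b, r_squared)."""
    xs = [math.log10(h) for h in hours]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(losses) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, losses))
    syy = sum((y - y_mean) ** 2 for y in losses)
    b = sxy / sxx                          # slope per decade of video hours
    a = y_mean - b * x_mean                # intercept at log10(hours) = 0
    r_squared = (sxy * sxy) / (sxx * syy)  # squared correlation of the simple linear fit
    return a, b, r_squared

# SYNTHETIC placeholder data for illustration only (not results from the paper).
hours = [1000, 2000, 5000, 10000, 20000]
losses = [0.700, 0.671, 0.629, 0.600, 0.569]

intercept, slope, r2 = fit_log_linear(hours, losses)
print(f"loss = {intercept:.3f} + {slope:.3f} * log10(hours), R^2 = {r2:.4f}")
```

A negative slope with R² close to 1 is what a "near-perfect log-linear" relationship looks like in this parameterization: each tenfold increase in video hours lowers the loss by roughly the same amount.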

0
0
0
17
Fernando Castañeda retweeted
Ruijie Zheng @ruijie_zheng12
Proud to introduce EgoScale: We pretrained a GR00T VLA model on 20K+ hours of egocentric human video and discovered that robot dexterity can be scaled, not with more robots, but with more human data. A thread on 🧵what we learned. 👇
25
65
331
97.7K
Fernando Castañeda retweeted
Zhengyi “Zen” Luo @zhengyiluo
SONIC is now open-source! Generalist whole-body teleoperation for EVERYONE! Our team has long been building comprehensive pipelines for whole-body control, kinematic planner, and teleoperation, and they will all be shared. This will be a continuous update; inference code + model already there, training code and gr00t integration coming soon! Code: github.com/NVlabs/GR00T-W… Docs: nvlabs.github.io/GR00T-WholeBod… Site: nvlabs.github.io/GEAR-SONIC/
36
206
927
249.2K
Fernando Castañeda retweeted
Yuke Zhu @yukez
We have seen rapid progress in humanoid control — specialist robots can reliably generate agile, acrobatic, but preset motions. Our singular focus this year: putting generalist humanoids to do real work. To progress toward this goal, we developed SONIC (nvlabs.github.io/GEAR-SONIC/), a Behavior Foundation Model for real-time, whole-body motion generation that supports teleoperation and VLA inference for loco-manipulation. Today, we’re open-sourcing SONIC on GitHub. We are excited to see what the community builds upon SONIC and to collectively push humanoid intelligence toward real-world deployment at scale. 🌐 Paper: arxiv.org/abs/2511.07820 📃 Code: github.com/NVlabs/GR00T-W…
11
67
351
66.7K