Max Yin @ CyberOrigin AI

215 posts

@PengYin18

Scientist @CarnegieMellon | Founder @cyberorigin_ai | Assistant Professor @HongKong

Singapore · Joined February 2022

1.6K Following · 570 Followers

Pinned Tweet

Max Yin @ CyberOrigin AI@PengYin18·22 Oct

We are excited to share our #AGI system, #CYBER, A General Robotic Operation System for Embodied AI training, evaluation, and deployment. In parallel, we have also released more than 150 hours of human operation data on #huggingface for world model and action model training. github.com/CyberOrigin207…

1.5K

Max Yin @ CyberOrigin AI retweeted

Figure@Figure_robot·14 May

Day 2 is Live: Watch humanoid robots Bob, Frank, and Gary running 24/7. This is fully autonomous running Helix-02 x.com/i/broadcasts/1…

234

542

3.2K

1.4M

Max Yin @ CyberOrigin AI@PengYin18·10 May

@RemiCadene Hi @RemiCadene, where to get one? So cool

Remi Cadene@RemiCadene·7 May

Hot new tendon driven hand with teleop exoskeleton High number of actuated DoF Packed with quality debug software I wonder how affordable and reliable it is but looks super promising already 😇

107

900

58.1K

Max Yin @ CyberOrigin AI@PengYin18·7 Apr

@MozarellaPesto @lucasmaes_ Is it possible to use your method with the current EgoCentric dataset taken from human operation?

237

Matteo@MozarellaPesto·6 Apr

I extended LeWorldModel by @lucasmaes_ et al to no longer require action conditioning. It now learns controllable dynamics directly from raw video. It uses latent action modelling to self-learn discrete control codes between latent states. Hence 'La leWorldmodel' 👇👇👇

213

17.6K

Max Yin @ CyberOrigin AI@PengYin18·31 Mar

Finally, SLAM is back, great take from OpenAI.

Bruno Santos🇵🇹@brunoeducsant

OpenAI is hiring for SLAM engineer. Who would say.

182

Max Yin @ CyberOrigin AI@PengYin18·1 Mar

Amazing analysis and great take for manipulation speedup! Great work @DanielXieee

Quanting Xie@DanielXieee

Why does manipulation lag so far behind locomotion? New post on one piece we don't talk about enough: The gearbox.

The Gap
You've probably seen those dancing humanoid robots from Chinese New Year. Locomotion isn't entirely solved; but clearly it's on a trajectory. But we haven't seen anything close for manipulation. 𝗪𝗵𝘆?

When sim-to-real transfer fails, the instinct is to blame the algorithm. Train bigger networks. Crank up domain randomization. Those approaches have made real progress; we don't deny that. But we started wondering: are we treating the symptom or the disease?

The Hardware Bottleneck: Fingers are too small for powerful motors. So most hands use massive gearboxes (200:1, 288:1) to get enough torque. But those gearboxes break everything manipulation needs:
• Stiction and backlash are complex to simulate. Policies trained on smooth physics hallucinate when they hit that reality.
• Reflected inertia scales as N². At large gear ratio, the finger hits with sledgehammer momentum.
• Friction blocks force information. The hand becomes blind.
And they're the first thing to break.

What we are trying to build at Origami, we cut the gear ratio from 288:1 to 15:1 using axial flux motors and thermal optimization. The transmission becomes more transparent: backdrivable, low friction, forces propagate to motor current. Early signs are encouraging. Still running quantitative benchmarks.

Why Interactive? I love how Science Center uses interactive devices to explain complex ideas. I want to borrow this concept and help people understand the hard problems in robotics better visually. The post has demos where you can toggle friction, slide gear ratios, watch the sim-to-real gap widen in real-time.

What's inside:
• Interactive demos (friction curves, N² scaling, contact patterns)
• Comparison table: 14 robot hands by sim-to-real gap and force transparency
• The math behind why low-ratio matters
Read it here: origami-robotics.com/blog/dexterity…

We're not claiming we've solved dexterity. The deadlock has many pieces. But we think this one's foundational. Curious what you think.
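
To make the N² point in the quoted post concrete, here is a minimal sketch of how reflected inertia grows with gear ratio. It assumes an ideal gearbox and, purely for illustration, the same rotor at both ratios; the rotor inertia value and the function name are hypothetical and do not come from the post (which describes switching to axial flux motors, so the real rotor inertias would differ).

```python
def reflected_inertia(rotor_inertia: float, gear_ratio: float) -> float:
    """Rotor inertia as seen from the output side of an ideal gearbox: J * N^2."""
    return rotor_inertia * gear_ratio ** 2


# Hypothetical rotor inertia in kg*m^2, chosen only for illustration.
ROTOR_INERTIA = 1e-7

high = reflected_inertia(ROTOR_INERTIA, 288)  # 288:1 gearbox from the post
low = reflected_inertia(ROTOR_INERTIA, 15)    # 15:1 gearbox from the post
ratio = high / low                            # equals (288 / 15) ** 2

print(f"288:1 reflected inertia: {high:.3e} kg*m^2")
print(f"15:1 reflected inertia:  {low:.3e} kg*m^2")
print(f"ratio: {ratio:.2f}x")
```

Under these assumptions the rotor inertia cancels out, so the comparison depends only on (288/15)², roughly 369. This is a back-of-the-envelope illustration of the scaling, not a reproduction of the post's own math.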

477

Max Yin @ CyberOrigin AI@PengYin18·14 Jan

@AustinSo16867 Sure, DM me anytime

AustinSo@AustinSo16867·11 Jan

@PengYin18 Hi Max! I'm a Master's student at Peking, starting up a tactile robot startup. Visiting SZ-HK soon. Could we connect?

Max Yin @ CyberOrigin AI retweeted

Jon Hernandez@JonhernandezIA·24 Dec

📁 Demis Hassabis, CEO of DeepMind, says robotics didnt fail because of hardware. It failed because intelligence was missing. Gemini level models finally give robots the software brain they needed. When intelligence works, hardware follows. AGI doesnt live behind a screen. It moves.

199

1.9K

900.9K

Max Yin @ CyberOrigin AI@PengYin18·26 Dec

@pevidex @aakashgupta How should we understand this point? Do you think the Gemini model can map a specific action to motors with unique physical properties?

pevidex@pevidex·26 Dec

@aakashgupta Option 2 (Key insight):"hardware companies thinking they have a moat while Google turns motor commands into another LLM output is wild.

3.2K

Aakash Gupta@aakashgupta·25 Dec

The entire robotics industry is about to compress a decade of progress into 18 months, and nobody’s pricing it in. The hardware has been ready for years. Boston Dynamics had Atlas doing backflips in 2018. The bottleneck was never motors or actuators. It was that every robot behavior had to be hand-coded. Pick up a box? That’s one program. Pick up a bottle? Different program. Move the box from shelf A to shelf B in a warehouse with slightly different lighting? Start over. Foundation models broke this completely. Before VLAs, teaching a robot one skill gave you exactly one skill. Zero compounding. Zero transfer. A robot trained to fold shirts couldn’t fold towels without starting from scratch. The labor intensity of data generation meant robotics datasets stayed narrow, robots overfit, and small variations like object weight or table height caused failures. Now a single Gemini Robotics model handles tasks it has never seen in training. Google’s On-Device model learns new behaviors with 50-100 demonstrations. Not 50,000. Fifty. That’s a 1000x reduction in the data requirement for new capabilities. The speed implications cascade through everything. First order: deployment timelines collapse. What took robotics teams 6-12 months of custom programming now takes days of fine-tuning. Second order: the addressable market explodes. Tasks that were never economical to automate suddenly are, because the integration cost dropped by orders of magnitude. Third order: the data flywheel accelerates. Every robot running Gemini Robotics feeds learning back into the foundation model. More deployments means faster improvement means more deployments. Physical Intelligence raised at $2.4B because investors finally understood this. Boston Dynamics partnered with Toyota Research Institute to bolt Large Behavior Models onto Atlas. Every humanoid company is scrambling to either build or license the intelligence layer they don’t have. 
The market is still valuing robotics companies on their hardware differentiation. But hardware is commoditizing. Boston Dynamics spent a decade perfecting locomotion, and now that’s table stakes. The value is migrating entirely to whoever owns the foundation model that generalizes across embodiments. Google trained Gemini on the largest multimodal corpus ever assembled. Then they added physical actions as an output modality. That’s not a robotics company bolting on AI. That’s an AI company whose models now output motor commands. The companies pricing this correctly are building around foundation model access, not around proprietary hardware. The companies pricing this wrong are still acting like the moat is in the mechanical engineering. AGI moving into the physical world isn’t a 10-year prediction. Gemini Robotics shipped in March. The 1.5 version with chain-of-thought reasoning shipped in September. They’re iterating on a 6-month release cycle while hardware companies iterate on 3-year cycles. The gap between software intelligence timelines and hardware development timelines is the entire trade.

Jon Hernandez@JonhernandezIA

244

934

6.4K

1.3M

Max Yin @ CyberOrigin AI retweeted

Jim Fan@DrJimFan·13 Sep

There was something deeply satisfying about ImageNet. It had a well curated training set. A clearly defined testing protocol. A competition that rallied the best researchers. And a leaderboard that spawned ResNets and ViTs, and ultimately changed the field for good. Then NLP followed. No matter how much OpenAI, Anthropic, and xAI disagree, they at least agree on one thing: benchmarking. MMLU, HLE, SWEBench - you can’t make progress until you are able to measure it. Robotics still doesn’t have such a rallying call. No one agrees on anything: hardware, task, scoring, simulation engine, or real world environment. Everyone is SOTA, by definition, on the benchmark they define on the fly for each paper. From the maker of ImageNet - BEHAVIOR takes a stab at the daunting challenge of unifying robotics benchmarking on a reproducible physics engine (Isaac Sim). The project started before I graduated from Stanford Vision Lab, and took so many years of dedication and PhD careers to build. I hope BEHAVIOR is either the hill-climbing signal we need, or the spark that finally gets us talking about how to measure real progress as a field.

Fei-Fei Li@drfeifei

(1/N) How close are we to enabling robots to solve the long-horizon, complex tasks that matter in everyday life? 🚨 We are thrilled to invite you to join the 1st BEHAVIOR Challenge @NeurIPS 2025, submission deadline: 11/15. 🏆 Prizes: 🥇 $1,000 🥈 $500 🥉 $300

199

228K

Max Yin @ CyberOrigin AI retweeted

Zhecheng Yuan@fancy_yzc·3 Sep

👐How can we leverage multi-source human motion data, transform it into robot-feasible behaviors, and deploy it across diverse scenarios?  👤🤖Introduce 𝐇𝐄𝐑𝐌𝐄𝐒: a versatile human-to-robot embodied learning framework tailored for mobile bimanual dexterous manipulation.

174

24.4K

Max Yin @ CyberOrigin AI@PengYin18·28 Aug

@chris_j_paxton Disagree. Robotaxis work not only because the data comes from real cars, but also because it comes from real scenarios. As far as we can tell, we cannot put 7 million robots to work in our daily lives; people really underestimate how hard the engineering is.

Chris Paxton@chris_j_paxton

I have some real reservations about this actually

130

Max Yin @ CyberOrigin AI retweeted

The Humanoid Hub@TheHumanoidHub·20 Aug

Their approach focuses on long-horizon, language-conditioned manipulation and locomotion by mapping sensor inputs and language prompts into whole-body control at high frequency. The development cycle follows a continuous loop: teleoperated data collection, curation into pipelines, large-scale model training, and rigorous evaluation to guide improvements. Central principles include maximizing task coverage through VR-based teleoperation, building generalist policies across multiple embodiments, and enabling rapid iteration with strong infrastructure. It has enabled Atlas to perform complex, multi-step tasks like folding Spot legs, rope tying, or tire manipulation, while adapting to errors in real time. Policies can even run faster at inference, achieving up to 2x execution speed without retraining. This work highlights scalable methods for creating adaptable, general-purpose humanoid robots capable of performing diverse tasks with robustness and autonomy. Technical blog: bostondynamics.com/blog/large-beh…

6.7K

Max Yin @ CyberOrigin AI retweeted

Ilir Aliu@IlirAliu_·10 Aug

Every robot you see is a data firehose generating terabytes of chaos. This hidden crisis is the #1 reason robots fail, and it's costing the industry billions. You see hardware, but not the data swamp drowning engineers. In 2025, a quiet revolution is fixing it. Here’s how. 🧵

109

694

6.2K

1.6M

Max Yin @ CyberOrigin AI retweeted

Elon Musk@elonmusk·3 Aug

3.4K

4.9K

47K

6.3M

Max Yin @ CyberOrigin AI@PengYin18·24 Jul

@qu3tzalify @chris_j_paxton In that case, the generalization ability is way more restricted.

Maxime Alvarez@qu3tzalify·22 Jul

@PengYin18 @chris_j_paxton Actually the opposite no? If you had to build a simulation or data collection tool specifically for your embodiment, you now need to rebuild that. The real data way "just" requires you to go out and collect new data.

Chris Paxton@chris_j_paxton·21 Jul

Robot learning needs a ton of data to work well. This is a great read on why “shortcuts” like simulation data might not be so useful and may hurt performance. Personally, I'm not completely convinced; i think things like simulation and human data play a role, but *never as a replacement for robot data,* just as an expansion to it.

Sergey Levine@svlevine

I wrote a fun little article about all the ways to dodge the need for real-world robot data. I think it has a cute title. sergeylevine.substack.com/p/sporks-of-agi

137

13.7K

Max Yin @ CyberOrigin AI@PengYin18·22 Jul

@svlevine Great article. We have roughly 200k hours of human video demonstrations; could that help with better behavior cloning for robotics?

372

Sergey Levine@svlevine·21 Jul

I wrote a fun little article about all the ways to dodge the need for real-world robot data. I think it has a cute title. sergeylevine.substack.com/p/sporks-of-agi

120

822

154.4K

Max Yin @ CyberOrigin AI@PengYin18·16 Jul

@chris_j_paxton This is too slow and impossible to apply in the real world. Btw, who will pay for slow workers that have only limited operating ability?

Chris Paxton@chris_j_paxton·15 Jul

same policy btw

Agility@agilityrobotics

How would you do if somebody pulled the rug out from beneath you?

5.3K

Max Yin @ CyberOrigin AI retweeted

Elon Musk@elonmusk·14 Jul

23.3K

25.5K

430.3K

102.4M

Max Yin @ CyberOrigin AI retweeted

The Humanoid Hub@TheHumanoidHub·12 Jul

The ORCA v1 hand is a 17-DoF, tendon-driven, humanoid hand with integrated tactile sensors and poppable joints. One fully assembled hand is priced at $5,937.00. The design is open-sourced for non-commercial use.

103

620

30.2K
