Rhoda AI
@RhodaAI
37 posts
Building at the frontier of embodied intelligence.
Palo Alto, California · Joined August 2025
16 Following · 2.5K Followers

Pinned Tweet
The future we're building toward is one where robots adapt to new tasks in seconds.
At Rhoda, we tackle real-world problems through fundamental research.
Full story + technical deep-dive: rhoda.ai/research/direc…

How it works: we train on paired human-demonstration and robot-execution data. Because our DVA, FutureVision, has long-context visual memory built in (x.com/RhodaAI/status…), we prepend the full human video to the model's context and predict robot actions closed-loop. The model watches a human do something once and understands what to do next.
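A minimal sketch of that loop, under assumptions: `policy`, `robot`, `camera`, and `demo_frames` are hypothetical interfaces for illustration, not Rhoda's actual API.

```python
# Hedged sketch of the rollout described above: prepend one human demo video
# to the model's context, then predict robot actions closed-loop.
# All names here (policy, robot, camera, demo_frames) are hypothetical.

def one_shot_rollout(policy, robot, camera, demo_frames, max_steps=1000):
    # The full human demonstration goes into context first; the model's
    # long-context visual memory is what makes this feasible.
    context = list(demo_frames)
    for _ in range(max_steps):
        # Closed loop: each new robot observation is appended, and the next
        # action is predicted from the demo plus everything seen since.
        context.append(camera.read())
        if policy.task_complete(context):
            return True
        robot.execute(policy.predict_action(context))
    return False
```

The design point worth noticing: the demo is never distilled into weights. It lives entirely in context, which is why no retraining is needed.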

Teaching a robot a new task typically means stopping operations, collecting teleoperated demonstrations, and retraining. That process takes hours at a minimum. We wanted to know if we could collapse it to seconds: a single human demo, on the fly, no retraining required.
Early research preview: we can.

At Rhoda, we tackle real-world problems through fundamental research. Full story and technical deep-dive: rhoda.ai/research/direc…

How? Our DVA implements robot policy as future video generation.
Given the context, the model generates future videos (bottom left), predicting not just the correct cup to pick up but even the appearance of the hidden object.
Native training on long, continuous videos gives the model built-in long-context memory.
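One plausible reading of "policy as future video generation", sketched under assumptions: the video model predicts future frames, and a separate inverse-dynamics step decodes actions from them. The tweet confirms the frame-prediction part; the action-decoding head is our assumption, and all names are illustrative.

```python
# Minimal sketch of a video-generation policy, assuming the common recipe of
# pairing a video predictor with an inverse-dynamics head. Illustrative only.

class VideoGenerationPolicy:
    def __init__(self, video_model, inverse_dynamics):
        self.video_model = video_model            # predicts future frames from context
        self.inverse_dynamics = inverse_dynamics  # recovers the action between two frames

    def act(self, context_frames, horizon=8):
        # 1) Imagine the future: generate a short video of what should happen
        #    next (e.g., the correct cup being picked up).
        future = self.video_model.generate(context_frames, num_frames=horizon)
        # 2) Ground it in control: infer the actions that carry the robot from
        #    each frame to the next predicted frame.
        frames = [context_frames[-1], *future]
        return [self.inverse_dynamics(prev, nxt)
                for prev, nxt in zip(frames, frames[1:])]
```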

Here’s something we’ve never seen done before.
Real-world tasks are long and ambiguous. Solving them requires visual memory and state tracking. Most robot policies only see the last few frames. Ours doesn't.
We put our DVA, FutureVision, to the perfect testbed: the shell game 🐚. The DVA nails it.
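To see why a few frames can't solve this, consider a toy version of the shell game: the decisive observation (where the ball went) happens long before the policy must act, so any short frame window has already discarded it. A hypothetical illustration:

```python
# Toy illustration: a sliding-window policy loses the one observation that
# matters, while a full-history (long-context) policy keeps it.

SHELL_GAME = [
    "ball placed under cup B",     # the decisive observation
    "cups shuffled (swap A<->B)",
    "cups shuffled (swap B<->C)",
    "cups shuffled (swap A<->C)",
    "pick the cup with the ball",  # the moment the policy must act
]

def visible_history(frames, window=None):
    # window=None models full-history access; an integer models a frame window.
    return frames if window is None else frames[-window:]

print(visible_history(SHELL_GAME, window=2))  # decisive frame already gone
print(visible_history(SHELL_GAME))            # full history keeps the evidence
```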

"I don't think the world is going back to non video based pretraining." Our CEO @startupjag spoke with @bheater at @a3automate on why video is the foundation for robots that actually work in production. bit.ly/4s4cbvD

4/ At Rhoda, we solve real-world problems with fundamental research.
Full story + technical deep-dive: rhoda.ai/research/direc…

3/ Achieving a 100% autonomy rate over a 2.5-hour continuous run means the model needs to handle all kinds of edge cases. Whether it's pulling a drifted box back into range or re-attempting a failed flip, the model self-corrects in real time (sketched after the list).
-> The trash is out of reach. The robot must reposition the box before attempting another grab.
-> The door won't fall open. The robot recognizes a latch probably wasn't fully released and goes back to fix it.
-> The first flip fails. The robot doesn't hesitate; it goes straight for a second attempt.
-> The box has drifted too far to reach the latch. The robot pulls it back into range.
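Under the same hypothetical interfaces as the earlier sketch, self-correction needs no explicit failure branch: a failed flip or a drifted box simply shows up in the next observation, and the closed-loop policy re-plans from it.

```python
# Hedged sketch of the self-correction pattern above; `policy`, `robot`, and
# `camera` are the same hypothetical interfaces as before, not Rhoda's API.

def run_until_done(policy, robot, camera, max_steps=100_000):
    context = []
    for _ in range(max_steps):
        context.append(camera.read())  # fresh observation, success or failure
        if policy.task_complete(context):
            return True
        # If the last flip failed, the predicted next action is a retry; if
        # the box drifted, it is a pull back into range. Recovery is just the
        # policy's best next action given what it now sees.
        robot.execute(policy.predict_action(context))
    return False
```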

1/ We are speedrunning industrial robotics.
It took us just 19 days from the first day of data collection to filming a 2.5-hour continuous run of our model autonomously breaking down industrial containers, with zero human intervention.
The data efficiency of our DVA model is fundamentally changing how fast we bring robots out of the lab and into the factory.
Autonomous operation with 3 hours of data collection at a customer factory.

At Rhoda, we solve real-world problems with fundamental research.
Full story + technical deep-dive on our blog: rhoda.ai/research/direc…

Trained on just 11 hours of robot data, our model is surprisingly robust, thanks to web-scale pre-training.
It doesn't just avoid errors; it handles them. If the lid tears off, it finds a new way to grip. If a bearing is stuck, it shakes the bag loose.
Watch our robot navigate these corner cases: 👇