Rhoda AI

37 posts


@RhodaAI

Building at the frontier of embodied intelligence.

Palo Alto, California · Joined August 2025
16 Following · 2.5K Followers
Pinned Tweet
Rhoda AI @RhodaAI ·
To bring generalist intelligent robots to the real world, we have to overcome the data scarcity problem. At Rhoda, we are solving it by reformulating robot policies as video generation. Today, we introduce the Direct Video-Action Model (DVA).
Rhoda AI @RhodaAI ·
How? Existing video models aren't optimized for real-time inference. Instead of fine-tuning off-the-shelf video models, we co-design inference-aware model architectures and model-aware inference optimizations from the ground up.
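One common inference-aware design choice in this spirit is chunked causal generation with a key/value cache, so each new chunk attends to everything already generated without re-encoding it. The toy sketch below illustrates that idea only; the names, shapes, and identity projections are illustrative assumptions, not Rhoda's actual architecture.

```python
import numpy as np

# Toy sketch of an inference-aware design choice: causal generation with a
# key/value cache. Each step attends over all cached past tokens instead of
# recomputing them, keeping per-step cost proportional to one attention pass.
# Shapes and the identity q/k/v projections are illustrative assumptions.

rng = np.random.default_rng(0)
D = 8  # feature dimension per token

def attend(q, K, V):
    """Single-head scaled dot-product attention (no batching)."""
    scores = q @ K.T / np.sqrt(D)
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V

class CachedGenerator:
    def __init__(self):
        self.K = np.empty((0, D))  # cached keys for all past tokens
        self.V = np.empty((0, D))  # cached values for all past tokens

    def step(self, x):
        """Process one new token, appending it to the cache."""
        self.K = np.vstack([self.K, x[None]])  # identity projections keep
        self.V = np.vstack([self.V, x[None]])  # the toy minimal
        return attend(x, self.K, self.V)

gen = CachedGenerator()
outs = [gen.step(rng.normal(size=D)) for _ in range(16)]
print(gen.K.shape)  # (16, 8) — the cache grew; nothing was re-encoded
```

The point of the sketch: old frames are encoded exactly once, which is the kind of property a streaming, real-time video policy needs by construction rather than as an afterthought.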
Rhoda AI @RhodaAI ·
Can a large foundation video model run as a real-time robot policy at the edge, on a single RTX 5090?
• ✅ No quantization
• ✅ No distillation
• ✅ Full denoising (all the way from noise to clean video)
We just proved it's possible. 👇🎬
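"Full denoising" means running every step of the diffusion schedule, from pure noise to a clean sample, with no step-count distillation. The minimal loop below shows that shape; the `denoiser` is a hypothetical stand-in that nudges toward a fixed target "frame", where a real system would call the video model.

```python
import numpy as np

# Minimal sketch of a full diffusion sampling loop: every step of the
# schedule is executed (no skipped steps, no distilled few-step sampler).
# `denoiser` and `target` are illustrative stand-ins, not a real model.

rng = np.random.default_rng(0)
T = 50                            # full schedule length
target = np.full(16, 0.5)         # stand-in "clean frame"

def denoiser(x, t):
    """Stand-in for the learned model: predicts the noise in x,
    assuming the clean signal is `target`."""
    return x - target

x = rng.normal(size=16)           # start from pure noise
for t in range(T, 0, -1):         # all the way from noise to clean
    eps_hat = denoiser(x, t)
    x = x - eps_hat / t           # one small correction per step

print(np.allclose(x, target))  # True — the full schedule recovers the frame
```

Running all T steps is what few-step distillation tries to avoid; the claim above is that the model is fast enough to skip that trade-off entirely.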
Rhoda AI @RhodaAI ·
The future we're building toward is one where robots adapt to new tasks in seconds. At Rhoda, we tackle real-world problems through fundamental research. Full story + technical deep-dive: rhoda.ai/research/direc…
Rhoda AI @RhodaAI ·
How it works: we train on paired human demo and robot execution data. Because our DVA, FutureVision, has long-context visual memory built in (x.com/RhodaAI/status…), we prepend the full human video into the model's context and predict robot actions closed-loop. The model watches a human do something once and understands what to do next.
Quoted tweet (Rhoda AI @RhodaAI):

Here’s something we’ve never seen done before. Real-world tasks are long and ambiguous. Solving them requires visual memory and state tracking. Most robot policies only see the last few frames. Ours doesn't. We put our DVA, FutureVision, to the perfect testbed: the shell game 🐚. The DVA nails it.

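The scheme described above — prepend the full human demo into the context once, then predict robot actions closed-loop as new observations stream in — can be sketched as a control loop. Everything below is a toy stand-in under assumed interfaces: `toy_policy` is not the DVA, and the "environment" is one line of fake dynamics.

```python
import numpy as np

# Sketch of the closed-loop scheme: the human demo video sits at the front
# of the context, the robot's observation history grows behind it, and an
# action is predicted every control tick. `toy_policy` is a hypothetical
# stand-in for the model; it "imitates" by steering toward the demo average.

rng = np.random.default_rng(0)

def toy_policy(context):
    demo, obs = context["demo"], context["obs"]
    goal = demo.mean(axis=0)       # crude summary of the demonstration
    return goal - obs[-1]          # action: step toward demonstrated state

human_demo = rng.normal(size=(30, 4))   # 30-frame demo, 4-dim "frames"
obs = [rng.normal(size=4)]              # robot's first observation

for tick in range(200):                 # closed loop: act, observe, repeat
    action = toy_policy({"demo": human_demo, "obs": np.stack(obs)})
    obs.append(obs[-1] + 0.1 * action)  # toy environment dynamics

final_err = np.linalg.norm(obs[-1] - human_demo.mean(axis=0))
print(final_err < 1e-3)  # True — the loop converges on the demo behavior
```

The structural point survives the toy: the demo is consumed once as context, not as training data, so no gradient step is needed to adopt the new task.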
Rhoda AI @RhodaAI ·
Teaching a robot a new task typically means stopping operations, collecting teleoperated demonstrations, and retraining. That process takes hours at a minimum. We wanted to know if we could collapse it to seconds — from a single human demo, on the fly, no retraining required. Early research preview: we can.
Rhoda AI @RhodaAI ·
At Rhoda, we tackle real-world problems through fundamental research. Full story and technical deep-dive: rhoda.ai/research/direc…
Rhoda AI @RhodaAI ·
How? Our DVA implements robot policy as future video generation. Given the context, the model generates future videos (bottom left) predicting not just the correct cup to pick up, but even the appearance of the hidden object. Native training on long, continuous videos gives the model built-in long-context memory.
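"Policy as future video generation" is often realized as: roll out K predicted future frames, then recover actions from consecutive predicted frames with an inverse-dynamics map. The sketch below assumes that recipe; both the frame predictor and the inverse-dynamics model are toy stand-ins, not Rhoda's components.

```python
import numpy as np

# Sketch of a robot policy implemented as future video generation, under an
# assumed two-stage recipe: (1) predict K future frames from the context,
# (2) map consecutive predicted frames to actions via inverse dynamics.
# Both functions are illustrative stand-ins for learned models.

rng = np.random.default_rng(0)

def predict_future(context, k=4):
    """Toy frame predictor: linearly extrapolates the last observed motion."""
    vel = context[-1] - context[-2]
    return np.stack([context[-1] + (i + 1) * vel for i in range(k)])

def inverse_dynamics(f0, f1):
    """Toy inverse dynamics: the action is the frame-to-frame change."""
    return f1 - f0

context = np.cumsum(rng.normal(size=(8, 3)), axis=0)  # 8 observed "frames"
future = predict_future(context)                      # imagined video
actions = [inverse_dynamics(a, b)                     # actions read off it
           for a, b in zip(future[:-1], future[1:])]

print(future.shape, len(actions))  # (4, 3) 3
```

The appeal of this factorization is that the hard part — predicting what the world will look like, including hidden objects — lives in the video model, where web-scale pre-training applies.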
Rhoda AI @RhodaAI ·
Here’s something we’ve never seen done before. Real-world tasks are long and ambiguous. Solving them requires visual memory and state tracking. Most robot policies only see the last few frames. Ours doesn't. We put our DVA, FutureVision, to the perfect testbed: the shell game 🐚. The DVA nails it.
Rhoda AI @RhodaAI ·
"I don't think the world is going back to non-video-based pretraining." Our CEO @startupjag spoke with @bheater at @a3automate on why video is the foundation for robots that actually work in production. bit.ly/4s4cbvD
Rhoda AI @RhodaAI ·
3/ Achieving a 100% autonomous rate in a 2.5-hour continuous run means the model needs to handle all kinds of edge cases. Whether it's pulling a drifted box back into range or re-attempting a failed flip, the model self-corrects in real time.
-> The trash is out of reach. The robot must reposition the box before attempting another grab.
-> The door won't fall open. The robot recognizes a latch probably wasn't fully released and goes back to fix it.
-> The first flip fails. The robot doesn't hesitate; it goes for a second attempt.
-> The box has drifted too far to reach the latch. The robot pulls it back into range.
Rhoda AI @RhodaAI ·
1/ We are speedrunning industrial robotics. It took us just 19 days from the first day of data collection to filming a 2.5-hour continuous run of our model autonomously breaking down industrial containers, with zero human intervention. The data efficiency of our DVA model is fundamentally changing how fast we bring robots out of the lab and into the factory.
Autonomous operation with 3 hours of data collection at a customer factory.
Rhoda AI @RhodaAI ·
At Rhoda, we solve real-world problems with fundamental research. Full story + technical deep-dive in our technical blog: rhoda.ai/research/direc…
Rhoda AI @RhodaAI ·
Trained on just 11 hours of robot data, our model is surprisingly robust, thanks to web-scale pre-training. It doesn't just avoid errors; it handles them. If the lid tears off, it finds a new way to grip. If a bearing is stuck, it shakes the bag loose. Watch our robot navigate through these corner cases: 👇
Rhoda AI @RhodaAI ·
Most robot demos are "golden runs": a perfect take selected from many attempts. But real-world deployment is about continuous operation. Watch our DVA model tackle a real-world decanting task for 1.5 hours straight: uncut, zero human intervention. 🧵👇