RoboPapers

282 posts

@RoboPapers

@chris_j_paxton, @micoolcho & @DJiafei geeking out weekly with authors of robotics AI papers. On YouTube / X / Spotify / Substack

Joined February 2025
2 Following · 5.3K Followers
RoboPapers (@RoboPapers)
Robotics fundamentally involves understanding the dynamics of how things change in the world in response to action and force. This is impossible to learn from static images; it’s far more effective and data-efficient to learn from video. @elvisnavah joins us to talk about @mimicrobotic. One of the key findings from mimic-video is that pretraining on web-scale video allows robots to learn physics priors. As a result, policies train faster, generalize better, and achieve more impressive dexterity than policies trained on static images or image-language pairs, as a VLM is. Watch Episode #81 of RoboPapers with @micoolcho and @chris_j_paxton to learn more!
2
10
51
45.5K
RoboPapers (@RoboPapers)
Sports like tennis are great examples of the sort of dynamic whole-body interaction that’s possible with humanoid robots. But capturing examples of fast, dynamic interactions from humans is really difficult. Enter LATENT, which uses lower-quality human data plus reinforcement learning to teach a robot to play tennis, completing back-and-forth volleys at a human level. LATENT has three steps: (1) collecting imperfect human data, like a backswing, (2) using this data to learn a latent action space, and (3) training a high-level policy in simulation that can compose these actions and execute tennis skills on a robot. @josh00_lu and @LianYunrui join us to tell us about their method. Watch Episode #80 of RoboPapers, with @chris_j_paxton and @DJiafei, now to learn more!
0
14
56
29.4K
RoboPapers (@RoboPapers)
Training robot foundation models faces two key hurdles: how to get enough data to train an effective model, and how to make sure that new skills can be acquired quickly. The team at @RhodaAI believes that the answer is training Direct Video Action models from web data. Web data is plentiful, to the point where Rhoda can train their base model on hundreds of years of video data. Then, with the addition of robot data, they can quickly adapt their purpose-built video foundation model to new tasks with as little as 20 hours of in-domain data, performing complex, multi-step manipulation tasks. @tongzhou_mu, @ericryanchan, and @changanvr joined us to talk more about their approach. Watch Episode #79 of RoboPapers, with @micoolcho, @chris_j_paxton, and @DJiafei, to learn more!
4
12
73
23.9K
RoboPapers (@RoboPapers)
Robotics has changed dramatically over the last eight years. @xiao_ted has been at the cutting edge of robot learning throughout this period, spending those eight years at Google Brain/Google DeepMind. And he’s identified three eras of robot learning:
- The Era of Existence Proofs: trying different methods like QT-Opt and on-robot RL
- The Era of Foundation Models: transitioning to data collection and clean objectives (i.e. supervised learning)
- The Era of Scaling: orders of magnitude more data and larger models, enabling reasoning, long-horizon actions, and cross-embodiment transfer
Watch Episode #78 of RoboPapers, with @micoolcho and @DJiafei, to learn more!
4
33
188
36.2K
RoboPapers (@RoboPapers)
World models have many different uses, from evaluation to training data generation to robot planning. DreamDojo is a new foundation world model that allows for impressively general and long-horizon interaction, generating coherent videos for interaction sequences over a minute long. It works in a wide range of environments and even generalizes to previously unseen ones. We talked to @ShenyuanGao and @willjhliang about how they built DreamDojo, and about what tricks were necessary to scale world model learning on data with sparse action labels, pretraining on 44,000 hours of human data and adapting to a wide variety of robots, environments, and skills. Watch Episode #77 of RoboPapers with @micoolcho and @chris_j_paxton now to learn more!
0
12
65
19.6K