Yevgen Chebotar
@YevgenChebotar

45 posts

Robotic foundation models @NVIDIA 🤖 Previously @GoogleDeepMind (RT-2, VLAs, Offline RL) and @Figure_robot (Helix)

Joined March 2017
343 Following · 2.1K Followers
Yevgen Chebotar retweeted
Ruijie Zheng @ruijie_zheng12
Proud to introduce EgoScale: We pretrained a GR00T VLA model on 20K+ hours of egocentric human video and discovered that robot dexterity can be scaled, not with more robots, but with more human data. A thread 🧵 on what we learned. 👇
Yevgen Chebotar retweeted
Zhengyi “Zen” Luo @zhengyiluo
SONIC is now open-source! Generalist whole-body teleoperation for EVERYONE! Our team has long been building comprehensive pipelines for whole-body control, kinematic planning, and teleoperation, and they will all be shared. This will be a continuous update: the inference code and model are already there, with training code and GR00T integration coming soon! Code: github.com/NVlabs/GR00T-W… Docs: nvlabs.github.io/GR00T-WholeBod… Site: nvlabs.github.io/GEAR-SONIC/
Yevgen Chebotar @YevgenChebotar
We are moving towards world-model-based backbones for robotic policies, pre-trained on web-scale videos and outputting robotic actions directly within the same diffusion model! This unlocks new levels of transfer to unseen tasks and motions, something that has been missing in VLM-based pre-training ever since the early days of RT-2 VLA models, which, although generalizing well to new objects and semantics, always struggled with new “verbs” and “motions”. We also observe signs of efficient cross-embodiment transfer to new robots with a small amount of data. There are a lot of optimizations that allow us to run a 14B world action model in real time! My favorite trick is DreamZero-Flash, which, while still denoising videos and actions jointly, introduces separate noise schedules so that actions can be denoised much faster than videos, enabling higher-frequency control! Check out the website, paper, and especially the eval gallery! dreamzero0.github.io
Joel Jang @jang_yoel

Introducing DreamZero 🤖🌎 from @nvidia > A 14B “World Action Model” that achieves zero-shot generalization to unseen tasks & few-shot adaptation to new robots > The key? Jointly predicting video & actions in the same diffusion forward pass Project Page: dreamzero0.github.io 🧵 (1/10)
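To make the DreamZero-Flash idea concrete, here is a minimal sketch, not the DreamZero implementation: the module names, dimensions, and step counts below are illustrative assumptions. It shows one denoiser jointly processing a video stream and an action stream under separate noise schedules, so the action stream finishes denoising (and can be executed) long before the video stream does.

```python
import torch

class JointDenoiser(torch.nn.Module):
    """Toy stand-in for a world-action model denoising both streams jointly."""
    def __init__(self, video_dim=64, action_dim=7):
        super().__init__()
        self.video_head = torch.nn.Linear(video_dim + action_dim, video_dim)
        self.action_head = torch.nn.Linear(video_dim + action_dim, action_dim)

    def forward(self, video, action):
        joint = torch.cat([video, action], dim=-1)  # one joint forward pass
        return self.video_head(joint), self.action_head(joint)

@torch.no_grad()
def sample(model, video_steps=50, action_steps=10, video_dim=64, action_dim=7):
    video = torch.randn(1, video_dim)
    action = torch.randn(1, action_dim)
    for step in range(video_steps):
        v_eps, a_eps = model(video, action)
        # Video follows the slow schedule: a small denoising step per iteration.
        video = video - v_eps / video_steps
        # Actions follow a faster schedule and are fully denoised after
        # `action_steps` iterations -- early enough for high-frequency control.
        if step < action_steps:
            action = action - a_eps / action_steps
    return video, action

video, action = sample(JointDenoiser())
print(action.shape)  # torch.Size([1, 7])
```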

Yevgen Chebotar @YevgenChebotar
Excited to join the NVIDIA GEAR team to help build the next generation of open robotic foundation models!
Yevgen Chebotar @YevgenChebotar
The path to VLAs lies through VLMs. A very nice intro for everyone interested in working with Vision-Language Models: arxiv.org/abs/2405.17247
Yevgen Chebotar @YevgenChebotar
Some personal updates! Excited to join the team @Figure_robot to help build AI for the robot age! 🤖
Yevgen Chebotar @YevgenChebotar
RT-H learns a hierarchy all the way from high-level tasks through low-level “language motions” to robot actions! ✅ Improved performance and generalization through better data sharing ✅ Automated grounded “bottom-up” labeling ✅ Ability to intervene and correct with language
Suneel Belkhale @suneel_belkhale

Is language capable of representing low-level *motions* of a robot? RT-Hierarchy learns an action hierarchy using motions described in language, like “move arm forward” or “close gripper” to improve policy learning. 📜: arxiv.org/abs/2403.01823 🏠: rt-hierarchy.github.io (1/10)
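As a rough illustration of the two-level inference described above (the policies below are hypothetical stubs, not the RT-H models): the high level maps a task to a “language motion”, the low level maps that motion to an action, and a human correction simply overrides the predicted motion.

```python
from typing import Optional, Tuple

def high_level_policy(task: str, observation: dict) -> str:
    """Stand-in for the task -> language-motion model (e.g. a VLM)."""
    return "move arm forward"  # would be predicted from (task, image)

def low_level_policy(motion: str, observation: dict) -> Tuple[float, float, float]:
    """Stand-in for the language-motion -> robot-action model."""
    motion_to_delta = {
        "move arm forward": (0.05, 0.0, 0.0),  # toy end-effector deltas
        "close gripper": (0.0, 0.0, -1.0),
    }
    return motion_to_delta.get(motion, (0.0, 0.0, 0.0))

def step(task: str, observation: dict, human_correction: Optional[str] = None):
    # Interventions happen at the language-motion level, not on raw actions.
    motion = human_correction or high_level_policy(task, observation)
    return motion, low_level_policy(motion, observation)

print(step("pick up the can", {}))                   # autonomous rollout
print(step("pick up the can", {}, "close gripper"))  # corrected with language
```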

Yevgen Chebotar retweeted
Ted Xiao @xiao_ted
Had a great time today with @YevgenChebotar and @QuanVng visiting @USCViterbi to give a talk on “Robot Learning in the Era of Foundation Models”. Slides out soon, packed with works from *just the past 5 months* 🤯 Thanks to @daniel_t_seita for hosting!
Sasha @_shydrie
@YevgenChebotar Would it be possible to get the website up again? It is currently unavailable.
Yevgen Chebotar @YevgenChebotar
Offline RL strikes back! In our new Q-Transformer paper, we introduce a scalable framework for offline reinforcement learning using Transformers and autoregressive Q-Learning to learn from mixed-quality datasets! Website and paper: q-transformer.github.io 🧵
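A hedged sketch of the autoregressive Q-learning idea, as assumed from the paper's description (tensor shapes are toy, and reusing one placeholder tensor for every intermediate dimension is purely for brevity): each discretized action dimension becomes its own Bellman step, so the max is taken over one dimension's bins at a time instead of over the exponentially large joint action space.

```python
import torch

num_dims, num_bins, gamma = 3, 256, 0.98  # toy sizes

def td_targets(q_next_dim, q_next_state, reward, done):
    """Targets for Q(s, a_1..a_i) over a batch.

    q_next_dim:   [B, num_bins] Q-values over the bins of the *next action
                  dimension* at the same state
    q_next_state: [B, num_bins] Q-values over the first action dimension
                  at the next state s'
    reward, done: [B] scalars for the environment transition
    """
    targets = torch.empty(reward.shape[0], num_dims)
    for i in range(num_dims):
        if i < num_dims - 1:
            # Intermediate dimensions: max over the next dimension's bins;
            # no reward or discount, since the environment has not stepped.
            targets[:, i] = q_next_dim.max(dim=-1).values
        else:
            # Final dimension: an ordinary Bellman backup into s'.
            targets[:, i] = reward + gamma * (1.0 - done) * q_next_state.max(dim=-1).values
    return targets

B = 4
t = td_targets(torch.rand(B, num_bins), torch.rand(B, num_bins),
               torch.rand(B), torch.zeros(B))
print(t.shape)  # torch.Size([4, 3])
```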
Yevgen Chebotar @YevgenChebotar
Exciting times for Robot Learning! 60 datasets from 22 different robots and 21 institutions, combined into a single Open X-Embodiment data repository, resulting in over 1 million episodes and improved RT-X models! An amazing and very important collaboration across the world! 🤖🌐
Quan Vuong @QuanVng

RT-X: generalist AI models lead to 50% improvement over RT-1 and 3x improvement over RT-2, our previous best models. 🔥🥳🧵 Project website: robotics-transformer-x.github.io

Yevgen Chebotar @YevgenChebotar
Our real robot policies significantly improve upon RT-1 and other baselines when trained on a limited amount of human demonstrations, by leveraging autonomously collected negatives and the dynamic-programming properties of Q-learning.
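A toy illustration of why the negatives help (the 4-state chain, actions, and rewards below are invented for this sketch): Q-learning's dynamic programming propagates value through states shared between successful demonstrations and failed autonomous rollouts, so even zero-reward episodes teach which actions lead away from success.

```python
import numpy as np

n_states, n_actions, gamma = 4, 2, 0.9
Q = np.zeros((n_states, n_actions))

# (s, a, r, s', done): one successful demo and one failed autonomous rollout
# that overlap in state 1, so the failure still informs the choice there.
transitions = [
    (0, 0, 0.0, 1, False), (1, 0, 0.0, 2, False), (2, 0, 1.0, 3, True),  # demo
    (0, 1, 0.0, 1, False), (1, 1, 0.0, 0, False),                        # negative
]

for _ in range(200):  # sweep the offline dataset until values converge
    for s, a, r, s2, done in transitions:
        target = r + (0.0 if done else gamma * Q[s2].max())
        Q[s, a] += 0.5 * (target - Q[s, a])

print(Q[1])  # action 0 (toward the goal) ends up valued above action 1
```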
Yevgen Chebotar @YevgenChebotar
Excited to present RT-2, a large unified Vision-Language-Action model! By converting robot actions to strings, we can directly train large vision-language models to output actions while retaining their web-scale knowledge and generalization capabilities! robotics-transformer2.github.io
Google DeepMind @GoogleDeepMind

Today, we announced RT-2: a first-of-its-kind vision-language-action model to control robots. 🤖 It learns from both web and robotics data and translates this knowledge into generalised instructions. Find out more: dpmd.ai/introducing-rt2
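The actions-as-strings trick is easy to sketch. The exact token format below is an assumption made for illustration (RT-2 describes discretizing each action dimension into 256 bins): continuous actions map to bin indices rendered as text, so a VLM's ordinary text output can be decoded back into an action.

```python
import numpy as np

def action_to_string(action, low=-1.0, high=1.0, bins=256):
    """Map a continuous action vector to a space-separated token string."""
    clipped = np.clip(action, low, high)
    ids = np.round((clipped - low) / (high - low) * (bins - 1)).astype(int)
    return " ".join(str(i) for i in ids)

def string_to_action(text, low=-1.0, high=1.0, bins=256):
    """Invert the mapping when decoding the VLM's generated text."""
    ids = np.array([int(t) for t in text.split()])
    return low + ids / (bins - 1) * (high - low)

a = np.array([0.1, -0.5, 0.9])
s = action_to_string(a)
print(s)                    # "140 64 242" -- plain text a VLM can emit
print(string_to_action(s))  # approximately recovers the original action
```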
