Ignat Georgiev

28 posts

@imgeorgiev

Robot Learning PhD @ Georgia Tech

Atlanta, GA · Joined May 2018
85 Following · 443 Followers
Pinned Tweet
Ignat Georgiev @imgeorgiev
We have a new ICML paper! Adaptive Horizon Actor Critic (AHAC). Joint work with @krishpopdesu @xujie7979 @eric_heiden @animesh_garg. AHAC is a first-order model-based RL algorithm that learns high-dimensional tasks in minutes and outperforms PPO by 40%. 🧵(1/4)
4 replies · 63 retweets · 363 likes · 52.1K views
Ignat Georgiev retweeted
Danfei Xu @danfei_xu
Introducing EgoVerse: an ecosystem for robot learning from egocentric human data. Built and tested by 4 research labs + 3 industry partners, EgoVerse enables both science and scaling: 1300+ hrs, 240 scenes, 2000+ tasks, and growing. Dataset design, findings, and ecosystem 🧵
34 replies · 158 retweets · 857 likes · 252.1K views
Ignat Georgiev retweeted
Albert Wilcox @albertwilcoxiii
Imitation learning has seen great success, but IL policies still struggle with OOD observations. We designed a 3D backbone, Adapt3R, that can combine with your favorite IL algorithm to enable zero-shot generalization to unseen embodiments and camera viewpoints!
2 replies · 26 retweets · 206 likes · 17.6K views
Ignat Georgiev retweeted
Jeremy Collins @jerthesquare_
Robotics data is expensive and slow to collect. Robotics labs and companies spend months just to collect around 10k hours of demonstration data, all while that much video is uploaded to YouTube every 20 minutes. However, none of this video data contains action labels. How can we bridge the gap? AMPLIFY solves this problem by learning Actionless Motion Priors that unlock better sample efficiency, generalization, and scaling for robot learning.
10 replies · 71 retweets · 531 likes · 90.6K views
Ignat Georgiev @imgeorgiev
Exciting opportunity!
0 replies · 0 retweets · 4 likes · 442 views
Ignat Georgiev retweeted
Edward Johns @Ed__Johns
I have Post-Doc, PhD, and Research Assistant positions available for a new project working with me on dexterous robot learning! Come and join me, my team, and our robots, at Imperial College London! 🤖🧑‍💻🇬🇧💂🤖 See robot-learning.uk/dexterous-robo… for more info. Please share/retweet!
1 reply · 37 retweets · 215 likes · 25K views
Ignat Georgiev @imgeorgiev
@Stone_Tao Yep, fully agree with everything you mentioned. I'm not sure better rewards or task decomposition is the answer, though. One more thought: we can engineer rewards to be smoother, but even then, how do we make the dynamics smooth?
0 replies · 0 retweets · 3 likes · 330 views
Stone Tao @Stone_Tao
Very nice post! Still dissecting it a little more deeply myself, but the part on the non-trivial optimization landscape really hits home; it's something that has troubled me a lot.

Trying to approximate a non-smooth (or piecewise-smooth) function is fairly problematic because the value function can never do well around sharp transitions. I once experimented with a fairly strange function approximation problem that I documented here: blog.stoneztao.com/posts/nn-fnc-a…. Essentially, smooth neural nets (lots of tanh activations) are super good smooth-function approximators, but fail where the function has sharp/instantaneous changes. One proposed solution was to combine them with decision trees to directly figure out where the sharp change is (DTs, after all, are very good with tabular-like data, and in many ways staged rewards are like a tabular-data classification problem).

In all my PPO experiments with ManiSkill, the value function loss spikes essentially exactly when the model learns to get to the next stage of the reward function. Designing reward functions with smoother transitions between stages (something we did not do consistently in ManiSkill) makes PPO more stable.

That said, this does speak to the deeper, more complex nature of reward function design. I'd say this blog post reinforces my belief that using LLMs to scale up reward function generation is still very far away (which significantly limits the use case for dense-reward RL in robotics). But the avenue of RL + sparse rewards + demonstration data (less data than used by IL) still has strong promise.
[image attached]
3 replies · 1 retweet · 27 likes · 2.8K views
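To make the failure mode above concrete, here is a minimal sketch (mine, not from the linked post) of a small tanh MLP fit to a step function; the fit error concentrates around the discontinuity, which is exactly the sharp-transition problem described:

```python
# Minimal illustrative sketch: a smooth (tanh) MLP fit to a step function.
# All names here are illustrative; this is not code from the linked post.
import torch

torch.manual_seed(0)
x = torch.linspace(-1, 1, 512).unsqueeze(1)
y = (x > 0).float()  # piecewise-constant target with one sharp jump

net = torch.nn.Sequential(
    torch.nn.Linear(1, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2000):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(net(x), y)
    loss.backward()
    opt.step()

# The residual concentrates around the discontinuity at x = 0: smooth
# tanh features cannot represent an instantaneous change.
err = (net(x) - y).abs().squeeze()
near = err[x.squeeze().abs() < 0.05].mean()
far = err[x.squeeze().abs() > 0.2].mean()
print(f"mean |err| near jump: {near:.3f}, away from jump: {far:.3f}")
```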
Ignat Georgiev @imgeorgiev
Behavior Cloning (BC) has been the new hot thing in #Robotics for the past year. I finally sank my teeth into it and tried to decipher why it works so well on problems where RL struggles: imgeorgiev.com/2025-01-31-why… Let me know if you have other interesting perspectives!
12 replies · 27 retweets · 189 likes · 19.9K views
Ignat Georgiev @imgeorgiev
@adv8p Yeah, I had this point brought up on LinkedIn too. The DAgger problem is very much still a theoretical issue, but for some reason I don't see it materialize that often in today's world of large(r) datasets and trajectory chunking. Thoughts?
1 reply · 0 retweets · 0 likes · 127 views
underscore advait patel @_advaitpatel
@imgeorgiev This does not quite seem to be true - isn't the DAgger problem a thing? Other than that, cool post!
[image attached]
1 reply · 0 retweets · 0 likes · 251 views
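For context on the exchange above, a hypothetical sketch of the DAgger loop being referred to (Ross et al., 2011): the learner acts, an expert relabels the visited states, and the aggregated dataset corrects the covariate shift that plain BC suffers from. `env`, `policy`, `expert_action`, and `train` are illustrative placeholders, not from any specific codebase:

```python
# Hypothetical DAgger sketch (Ross et al., 2011); gym-style env assumed.
def dagger(env, policy, expert_action, train, iters=10, horizon=200):
    dataset = []
    for _ in range(iters):
        obs = env.reset()
        for _ in range(horizon):
            # Relabel states visited by the *learner* with expert actions,
            # so the training distribution matches the test distribution.
            dataset.append((obs, expert_action(obs)))
            obs, _, done, _ = env.step(policy(obs))  # act with the learner
            if done:
                break
        policy = train(policy, dataset)  # supervised fit on the aggregate
    return policy
```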
Ignat Georgiev retweeted
Utkarsh Mishra @utkarshm0410
How can robots compositionally generalize over multi-object multi-robot tasks for long-horizon planning? At #CoRL2024, we introduce Generative Factor Chaining (GFC), a diffusion-based approach that composes spatial-temporal factors into long-horizon skill plans. (1/7)
2 replies · 32 retweets · 147 likes · 24.1K views
Ignat Georgiev @imgeorgiev
@jia_xuhui Just published a paper on multi-task deep RL with large models, so I'm very interested in this opportunity!
0 replies · 0 retweets · 0 likes · 892 views
Xuhui Jia @jia_xuhui
Our team at Google DeepMind is hiring a Research intern specializing in Multi-modal Generative Models starting this fall or as soon as possible. The position may be extended, and we prefer candidates with a PhD background in diffusion modeling, image grounding, and/or deep RL.
3 replies · 23 retweets · 217 likes · 35.9K views
Ignat Georgiev @imgeorgiev
Completely forgot to give the time - Hall C 4-9, Wednesday, 7:30 - 9:00 am!
5 replies · 0 retweets · 1 like · 526 views
Ignat Georgiev @imgeorgiev
We derive a bound on this sample error and find that backpropagating through contact is the main culprit. From these insights, we propose AHAC, which dynamically adapts its horizon to avoid differentiating through contact. AHAC scales to tasks with up to 152 action dimensions and beats model-free baselines by 40%!
[image attached]
2 replies · 0 retweets · 3 likes · 755 views
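A toy sketch of the truncation idea as I read it from this tweet (not the paper's actual implementation), assuming a differentiable `sim_step(state, action)` that also reports whether contact occurred:

```python
# Toy sketch only: cut the gradient path at contact events so stiff
# contact dynamics never enter the first-order gradient. `sim_step`
# and `policy` are assumed differentiable placeholders, not AHAC code.
import torch

def contact_truncated_loss(state, policy, sim_step, max_horizon=32):
    total = torch.zeros(())
    for _ in range(max_horizon):
        state, reward, in_contact = sim_step(state, policy(state))
        total = total + reward
        if in_contact:
            # Stop gradients here: contact makes the dynamics stiff and
            # the sample gradient high-variance, so the effective horizon
            # ends rather than differentiating through the contact.
            state = state.detach()
    return -total  # minimize with a first-order optimizer (e.g. Adam)
```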
Ignat Georgiev @imgeorgiev
AHAC is a first-order RL method that uses gradients from the simulator to learn faster and better policies - outperforming PPO by 40%. Differentiable simulation is a powerful framework for scaling RL. However, even when given ground-truth dynamics, not all gradients are useful!
1 reply · 0 retweets · 3 likes · 578 views
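For readers unfamiliar with first-order policy gradients, here is a minimal sketch of the general recipe: the return is differentiated through the simulator dynamics rather than estimated with a score function as in PPO. `sim_step` and `policy` are assumed differentiable placeholders:

```python
# Minimal sketch of a first-order (analytic) policy gradient through a
# differentiable simulator. Illustrative only; not AHAC itself.
import torch

def first_order_loss(state, policy, sim_step, horizon=32, gamma=0.99):
    ret = torch.zeros(())
    for t in range(horizon):
        state, reward = sim_step(state, policy(state))
        ret = ret + (gamma ** t) * reward
    return -ret  # gradient flows through the dynamics, not a score function

# Usage sketch:
# opt = torch.optim.Adam(policy.parameters(), lr=3e-4)
# first_order_loss(state0, policy, sim_step).backward(); opt.step()
```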
Ignat Georgiev @imgeorgiev
Excited for ICML next week! I'll be presenting Adaptive Horizon Actor Critic - a model-based RL method that learns high-dimensional tasks in minutes using differentiable simulation. Stop by Hall C 4-9 or get in touch if you want to grab a coffee some other time! More on AHAC in the 🧵
7 replies · 38 retweets · 306 likes · 25.2K views
Ignat Georgiev @imgeorgiev
This means that PWM can scale to billion-parameter models more effectively. Incredibly, multi-task PWM almost matches the performance of single-task experts like DreamerV3 and SAC. Check out the paper, code, and models at imgeorgiev.com/pwm
0 replies · 1 retweet · 8 likes · 693 views
Ignat Georgiev @imgeorgiev
We also tested PWM on 30-task and 80-task multi-task settings from dm_control and MetaWorld. After training a single large multi-task world model, we extract policies in <10 min per task using PWM. We surpass TD-MPC2 by 27% and 8% respectively, without the need for online planning! 🧵
[image attached]
1 reply · 0 retweets · 6 likes · 760 views
Ignat Georgiev @imgeorgiev
🔔 New paper - PWM: Policy Learning with Large World Models. Joint work with @VarunGiridhar3 @ncklashansen @animesh_garg. PWM is a multi-task RL method that solves 80 tasks across different embodiments in <10 min per task using world models and first-order gradient optimization 🧵
1 reply · 25 retweets · 150 likes · 19.1K views
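A hedged sketch of the recipe as described in this thread: freeze a pretrained multi-task world model and optimize a per-task policy with first-order gradients through imagined latent rollouts. The `wm.step(z, a) -> (z_next, reward)` interface is an assumption for illustration, not the released PWM API:

```python
# Illustrative sketch of policy extraction with a frozen world model,
# per the thread's description. The world-model interface is assumed.
import torch

def extract_policy(wm, policy, z0, horizon=16, iters=2000, gamma=0.99):
    # Only policy parameters are optimized; the world model stays frozen.
    opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
    for _ in range(iters):
        z, ret = z0, torch.zeros(())
        for t in range(horizon):
            # Imagined rollout: latent dynamics and reward come from the
            # (frozen) world model; gradients flow back to the policy.
            z, reward = wm.step(z, policy(z))
            ret = ret + (gamma ** t) * reward
        opt.zero_grad()
        (-ret).backward()  # first-order gradient through the world model
        opt.step()
    return policy
```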