Brandon retweeted

Excited to share the project that has surprised me the most in the last year!
Large-scale RL in simulation, with no demos and no reward engineering, can solve dynamic, dexterous, and contact-rich tasks. The learned behaviors are reactive, forceful, and use the environment for recovery in ways that are extremely challenging to bake in or teleoperate!
You can play with the policies yourself to see: weirdlabuw.github.io/omnireset/
And the learned behavior transfers to real-world robots from RGB camera inputs!
So what's the trick? Using simulator resets carefully! Let's unpack (1/10)