Patrick Yin

34 posts

Patrick Yin

@patrickhyin

phd @uwcse, student researcher @microsoft, undergrad @berkeleyai

Seattle, WA Beigetreten Haziran 2023

246 Folgt376 Follower

Angehefteter Tweet

Patrick Yin@patrickhyin·1d

We’re releasing OmniReset, a framework for training robot policies using large-scale RL and diverse resets for contact-rich, dexterous manipulation. OmniReset pushes the frontier of robustness and dexterity, without any reward engineering or demonstrations. Try the policies yourself in our interactive simulator! weirdlabuw.github.io/omnireset/ (1/N 🧵)

English

417

85.7K

Patrick Yin@patrickhyin·2h

@Saketh_Vaishya We use 4 L40S GPUs to train our RL policy for a single task. Take a look at our documentation for more details about the compute we use at every step: #compute-hardware-requirements" target="_blank" rel="nofollow noopener">uw-lab.github.io/UWLab/main/sou…

English

Saketh Saketh@Saketh_Vaishya·3h

@patrickhyin One quick question, How large compute that this require to run at such large scale. ?

English

Patrick Yin@patrickhyin·1d

English

417

85.7K

Patrick Yin@patrickhyin·3h

@Saketh_Vaishya I honestly have no idea, but I’d love to try 🙂

English

Saketh Saketh@Saketh_Vaishya·5h

@patrickhyin Will it work for tasks like type of industrial tasks like robot doing sanding.?

English

Patrick Yin@patrickhyin·3h

I would say First-Try Success Rate (Real) vs. Policy Success Rate (Sim) is a more fair comparison for the sim2real gap in these experiments. Because our policies are trained with broad state coverage, they can recover from failures and retry until success. You can see this behavior in the first ~20 seconds of the full, uncut evaluation videos at the bottom of our website: weirdlabuw.github.io/omnireset/

English

Patrick Yin@patrickhyin·3h

@nikamanth Nice catch, Naveen! This is a typo on our end. The real experiments are fully zero-shot sim2real, with no co-training or finetuning on real data.

English

Patrick Yin@patrickhyin·4h

Thank you! Yes, on-policy distillation would likely help a lot. The main limitation for us was compute. With 3 high-resolution cameras and high-fidelity rendering, we could only fit ~16–32 environments per 4090, which is orders of magnitude fewer than the 65K+ environments we use for state-based RL. Making RGB DAgger or RL more compute-efficient is definitely a very interesting direction to explore.

English

Naveen Appiah@nikamanth·8h

@patrickhyin @patrickhyin Amazing work! Why no "on-policy distillation" to RGB observation though ? Shouldn't it be even more robust ?

English

Patrick Yin@patrickhyin·4h

@joao_p_araujo Thank you! The easier tasks take 8 hours to train on 4 L40S GPUs. The harder ones can take take as long as 32 hours. All our training curves are available to see in UWLab! uw-lab.github.io/UWLab/main/sou…

English

João Araújo@joao_p_araujo·9h

@patrickhyin Congratulations on this amazing work and the ICLR acceptance! How many GPUs did you use, and how long did it take to train?

English

Patrick Yin@patrickhyin·1d

@paravn No real-world demos. 100% sim data!

English

382

Parav@paravn·1d

@patrickhyin The env scaling curves are beautiful Do you use any real-world demonstration to train the distilled policy?

English

460

Patrick Yin@patrickhyin·1d

@YouJiacheng That’s right! It turns out if you can reset the robot to a dense amount of interesting states kinematically (not necessarily just true initial states), RL will figure out the dynamics to maximize reward and achieve its goal

English

You Jiacheng@YouJiacheng·1d

ok it turns out the reset states are not necessarily a legit initial state.

English

540

You Jiacheng@YouJiacheng·1d

but if the task inherently requires a specific & small set of initial state, how to perform broad resets? or what if the states near the goal are not physically stable (e.g. in the air) without robot?

Patrick Yin@patrickhyin

RL has struggled to learn contact-rich, dexterous manipulation because exploration breaks down. Key Idea: Reset broadly and scale up the number of parallel environments. OmniReset covers a superset of contact-rich states along the path needed to solve the task (visualized below). With large-scale RL (64K+ environments), the policy learns what states are important and how to connect them into a coherent strategy for solving the task.

English

2.7K

Patrick Yin retweetet

Abhishek Gupta@abhishekunique7·1d

Excited to share the project that has surprised me the most in the last year! Large-scale RL in simulation, no demos and no reward engineering can solve dynamic, dexterous and contact rich tasks. The learned behaviors are reactive, forceful and use the environment for recovery in ways that are extremely challenging to bake in or teleoperate! You can play with the policies yourself to see: weirdlabuw.github.io/omnireset/ And, the learned behavior transfers to real world robots from RGB camera inputs! So what’s the trick - using simulator resets carefully! Let’s unpack (1/10)

English

544

61.9K

Patrick Yin@patrickhyin·1d

@alfie14x Thank you! We didn’t try DIAYN for the paper, but it’s definitely an interesting direction to explore!

English

483

Alfred Cueva@alfie14x·1d

@patrickhyin Very exciting work! Did you consider using DIAYN as another baseline too?

English

553

Patrick Yin@patrickhyin·1d

@philfung yes videos are 1x speed!

English

574

pfung@philfung·1d

@patrickhyin is this video at 1x speed?

San Francisco, CA 🇺🇸 English

618

Patrick Yin@patrickhyin·1d

Website: weirdlabuw.github.io/omnireset/ Paper: arxiv.org/abs/2603.15789 Code: github.com/UW-Lab/UWLab We’ll present at #ICLR2026 this April! Joint work with @TylerW24089, @octi_zhang, @jtran_uw, Ignacio Dagnino, Eeshani Shilamkar, @numfortiapo, Simran Bagaria, Xinlei Liu, Galen Mullins, @Andrey__Kolobov, @abhishekunique7

English

1.5K

Patrick Yin@patrickhyin·1d

We’re building UWLab, a shared ecosystem for training robot policies in simulation and transferring them to the real world, built on Isaac Lab. This includes the full OmniReset codebase, along with tasks, algorithms, and deployment in one clean, modular stack: github.com/UW-Lab/UWLab

English

1.9K

Patrick Yin retweetet

Arhan Jain@prodarhan·18 Ara

Excited to introduce PolaRiS, a real-to-sim recipe for turning short real-world videos into high fidelity simulation environments for scalable and reliable zeroshot generalist policy evaluation. polaris-evals.github.io (1/N 🧵)

English

235

63.7K

Patrick Yin retweetet

Marius Memmel@memmelma·10 Eki

How can we help *any* image-input policy generalize better to visual and semantic variations? 👉 Meet PEEK 🤖 — a framework that uses VLMs to decide *where* to look and *what* to do, so downstream policies — from ACT, 3D-DA, or even π₀ — generalize more effectively!

English

4.6K

Patrick Yin retweetet

Chuning Zhu@chuning_zhu·10 Nis

Scaling imitation learning has been bottlenecked by the need for high-quality robot data, which are expensive to collect. But are we utilizing existing data to the fullest extent? A thread (1/11)

English

253

51.3K

Patrick Yin@patrickhyin·14 Şub

We will be presenting this paper at #ICLR2025 this April! Website: weirdlabuw.github.io/sgft/ Paper: arxiv.org/abs/2502.02705 Fun collaboration with Tyler Westenbroek, Simran Bagaria, Kevin Huang, @chinganc_rl, @Andrey__Kolobov , @abhishekunique7 🧵6/6

English

327

Patrick Yin@patrickhyin·14 Şub

Here is a comparison of time to learn each task with our method vs existing baselines using sim2real transfer, RL finetuning, and/or model-based RL. In each case, our method outperforms baselines in sample efficiency by at least 2x! 🧵5/6

English

361

Patrick Yin@patrickhyin·14 Şub

Current RL finetuning methods are too inefficient to make autonomous real world robot learning tractable. We propose Simulation-Guided Fine-Tuning (SGFT) - a simple, general sim2real framework that extracts structured exploration priors from sim to accelerate real world RL. 🧵1/6

English

117

9.1K

Entdecken

@Saketh_Vaishya @nikamanth @joao_p_araujo @paravn @YouJiacheng @alfie14x @philfung @elonmusk