Patrick Yin

34 posts

@patrickhyin

phd @uwcse, student researcher @microsoft, undergrad @berkeleyai

Seattle, WA · Joined June 2023
246 Following · 376 Followers

Pinned Tweet
Patrick Yin @patrickhyin:
We’re releasing OmniReset, a framework for training robot policies using large-scale RL and diverse resets for contact-rich, dexterous manipulation. OmniReset pushes the frontier of robustness and dexterity, without any reward engineering or demonstrations. Try the policies yourself in our interactive simulator! weirdlabuw.github.io/omnireset/ (1/N 🧵)
Patrick Yin @patrickhyin:
@Saketh_Vaishya We use 4 L40S GPUs to train our RL policy for a single task. Take a look at our documentation for more details about the compute we use at every step: uw-lab.github.io/UWLab/main/sou… (#compute-hardware-requirements)
Saketh Saketh @Saketh_Vaishya:
@patrickhyin One quick question: how much compute does this require to run at such a large scale?
Saketh Saketh @Saketh_Vaishya:
@patrickhyin Will it work for industrial tasks, like a robot doing sanding?
Patrick Yin @patrickhyin:
I would say First-Try Success Rate (Real) vs. Policy Success Rate (Sim) is a fairer comparison for the sim2real gap in these experiments. Because our policies are trained with broad state coverage, they can recover from failures and retry until success. You can see this behavior in the first ~20 seconds of the full, uncut evaluation videos at the bottom of our website: weirdlabuw.github.io/omnireset/
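To make the two metrics concrete, here is a minimal sketch, assuming a hypothetical log format where each episode records the outcome of every attempt (illustrative only, not the paper's evaluation code):

```python
# Hypothetical per-episode logs: each inner list is one episode's sequence of
# attempt outcomes (True = success, False = a failed attempt before a retry).
episodes = [
    [False, True],   # failed once, recovered, then succeeded
    [True],          # succeeded on the first try
    [False, False],  # never succeeded within the episode
]

# First-Try Success Rate: fraction of episodes whose first attempt succeeds.
first_try = sum(ep[0] for ep in episodes) / len(episodes)

# Policy Success Rate: fraction of episodes that eventually succeed,
# counting recoveries and retries.
eventual = sum(any(ep) for ep in episodes) / len(episodes)

print(f"first-try: {first_try:.2f}, eventual: {eventual:.2f}")  # 0.33 vs 0.67
```

A policy trained with broad state coverage can have an eventual success rate well above its first-try rate, so comparing real first-try numbers against sim eventual numbers overstates the sim2real gap.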
Patrick Yin @patrickhyin:
@nikamanth Nice catch, Naveen! This is a typo on our end. The real experiments are fully zero-shot sim2real, with no co-training or finetuning on real data.
Patrick Yin @patrickhyin:
Thank you! Yes, on-policy distillation would likely help a lot. The main limitation for us was compute. With 3 high-resolution cameras and high-fidelity rendering, we could only fit ~16–32 environments per 4090, which is orders of magnitude fewer than the 65K+ environments we use for state-based RL. Making RGB DAgger or RL more compute-efficient is definitely a very interesting direction to explore.
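As a rough illustration of where the compute goes, here is a minimal sketch of DAgger-style distillation from a privileged state-based teacher into an RGB student. The network, environment, and teacher below are toy stand-ins assumed for the example, not the released OmniReset/UWLab code:

```python
import torch

NUM_ENVS = 32  # ~16-32 RGB environments fit per 4090, vs. 65K+ for state-based RL

class StudentPolicy(torch.nn.Module):
    """Toy CNN over 3 stacked RGB camera views (9 channels total)."""
    def __init__(self, action_dim: int = 7):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Conv2d(9, 32, 8, stride=4), torch.nn.ReLU(),
            torch.nn.Conv2d(32, 64, 4, stride=2), torch.nn.ReLU(),
            torch.nn.Flatten(),
            torch.nn.Linear(64 * 14 * 14, action_dim),  # 14x14 feature map for 128x128 input
        )

    def forward(self, rgb):
        return self.net(rgb)

def teacher_action(state):
    """Stand-in for the frozen state-based RL teacher."""
    return torch.zeros(state.shape[0], 7)

def env_step():
    """Stand-in for one batched, rendered simulator step under the student."""
    rgb = torch.rand(NUM_ENVS, 9, 128, 128)    # 3 high-res cameras, rendered
    state = torch.rand(NUM_ENVS, 64)           # privileged sim state
    return rgb, state

student = StudentPolicy()
opt = torch.optim.Adam(student.parameters(), lr=3e-4)

for it in range(100):
    rgb, state = env_step()
    with torch.no_grad():
        target = teacher_action(state)          # relabel visited states with teacher actions
    loss = torch.nn.functional.mse_loss(student(rgb), target)
    opt.zero_grad(); loss.backward(); opt.step()
    # Each iteration yields only NUM_ENVS labeled frames, so throughput scales
    # with how many rendered environments fit in GPU memory.
```

The loop makes the bottleneck visible: rendering, not the gradient step, caps how many labeled frames the student sees per iteration.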
João Araújo @joao_p_araujo:
@patrickhyin Congratulations on this amazing work and the ICLR acceptance! How many GPUs did you use, and how long did it take to train?
Patrick Yin @patrickhyin:
@paravn No real-world demos. 100% sim data!
Parav @paravn:
@patrickhyin The env scaling curves are beautiful! Do you use any real-world demonstrations to train the distilled policy?
Patrick Yin @patrickhyin:
@YouJiacheng That’s right! It turns out that if you can kinematically reset the robot to a dense set of interesting states (not necessarily just true initial states), RL will figure out the dynamics to maximize reward and achieve its goal.
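A minimal sketch of that reset trick on a toy 1-D task; the environment, policy, and state pools here are invented for illustration, not the OmniReset implementation:

```python
import random

class ToyEnv:
    """Toy 1-D environment; reward is defined for *any* state, not just initial ones."""
    def reset_to(self, state):
        # Kinematic reset: teleport directly to the sampled state; the dynamics
        # need not have produced it.
        self.state, self.t = state, 0
        return self.state

    def step(self, action):
        self.state += action
        self.t += 1
        reward = -abs(self.state)              # goal: drive the state to 0
        return self.state, reward, self.t >= 50

def sample_reset_state(initial_states, intermediate_states, p_intermediate=0.7):
    # Mix true initial states with kinematically-set intermediate states
    # (object already in hand, mid-fall, gripper near contact, ...).
    pool = intermediate_states if random.random() < p_intermediate else initial_states
    return random.choice(pool)

env = ToyEnv()
initial_states = [0.0]                          # the "legit" starting configuration
intermediate_states = [-2.0, -1.0, 1.0, 2.0, 5.0]  # dense coverage of interesting states

for episode in range(100):
    state = env.reset_to(sample_reset_state(initial_states, intermediate_states))
    done = False
    while not done:
        action = random.choice([-0.1, 0.1])     # placeholder for the RL policy
        state, reward, done = env.step(action)
        # an RL update (e.g., PPO) would consume (state, action, reward) here
```

Because episodes start all over the state space, the learner is forced to discover recovery behavior from every region, not just from the canonical start.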
You Jiacheng @YouJiacheng:
OK, it turns out the reset states are not necessarily legitimate initial states.
[image]
Patrick Yin retweeted
Abhishek Gupta @abhishekunique7:
Excited to share the project that has surprised me the most in the last year! Large-scale RL in simulation, with no demos and no reward engineering, can solve dynamic, dexterous, and contact-rich tasks. The learned behaviors are reactive, forceful, and use the environment for recovery in ways that are extremely challenging to bake in or teleoperate! You can play with the policies yourself to see: weirdlabuw.github.io/omnireset/ And the learned behavior transfers to real-world robots from RGB camera inputs! So what’s the trick? Using simulator resets carefully! Let’s unpack (1/10)
Patrick Yin @patrickhyin:
@alfie14x Thank you! We didn’t try DIAYN for the paper, but it’s definitely an interesting direction to explore!
Alfred Cueva @alfie14x:
@patrickhyin Very exciting work! Did you consider using DIAYN as another baseline too?
pfung @philfung:
@patrickhyin is this video at 1x speed?
Patrick Yin @patrickhyin:
We’re building UWLab, a shared ecosystem for training robot policies in simulation and transferring them to the real world, built on Isaac Lab. This includes the full OmniReset codebase, along with tasks, algorithms, and deployment in one clean, modular stack: github.com/UW-Lab/UWLab
Patrick Yin retweeted
Arhan Jain @prodarhan:
Excited to introduce PolaRiS, a real-to-sim recipe for turning short real-world videos into high-fidelity simulation environments for scalable and reliable zero-shot generalist policy evaluation. polaris-evals.github.io (1/N 🧵)
Patrick Yin retweeted
Marius Memmel @memmelma:
How can we help *any* image-input policy generalize better to visual and semantic variations? 👉 Meet PEEK 🤖 — a framework that uses VLMs to decide *where* to look and *what* to do, so downstream policies — from ACT, 3D-DA, or even π₀ — generalize more effectively!
Patrick Yin retweeted
Chuning Zhu @chuning_zhu:
Scaling imitation learning has been bottlenecked by the need for high-quality robot data, which are expensive to collect. But are we utilizing existing data to the fullest extent? A thread (1/11)
Patrick Yin @patrickhyin:
Here is a comparison of the time to learn each task with our method vs. existing baselines using sim2real transfer, RL finetuning, and/or model-based RL. In each case, our method outperforms baselines in sample efficiency by at least 2x! 🧵5/6
[image]
Patrick Yin @patrickhyin:
Current RL finetuning methods are too inefficient to make autonomous real-world robot learning tractable. We propose Simulation-Guided Fine-Tuning (SGFT), a simple, general sim2real framework that extracts structured exploration priors from sim to accelerate real-world RL. 🧵1/6
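For intuition only, here is one hedged sketch of how a sim-trained policy could serve as a structured exploration prior during real-world RL; the mixing scheme, toy task, and all names are assumptions for illustration, not the actual SGFT algorithm:

```python
import random

ACTIONS = (-0.1, 0.1)

def sim_prior_action(state):
    """Stand-in for the frozen policy trained in simulation."""
    return 0.1 if state < 0 else -0.1

def greedy_action(state, q):
    """The real-world learner's current greedy choice."""
    return max(ACTIONS, key=lambda a: q.get((round(state, 1), a), 0.0))

q, state, eps = {}, 2.0, 0.3
for step in range(2000):
    # Explore by sampling the sim prior instead of uniform noise, so
    # exploration stays near behaviors that already worked in simulation.
    action = sim_prior_action(state) if random.random() < eps else greedy_action(state, q)
    next_state = state + action + random.gauss(0.0, 0.01)
    reward = -abs(next_state)                  # toy objective: drive the state to 0
    key = (round(state, 1), action)
    best_next = max(q.get((round(next_state, 1), a), 0.0) for a in ACTIONS)
    old = q.get(key, 0.0)
    q[key] = old + 0.1 * (reward + 0.9 * best_next - old)  # one-step Q-learning update
    state = next_state
```

The point of the pattern is that the prior shapes *where* the real-world learner explores, which is far cheaper than learning exploration from scratch on hardware.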