Entong Su

58 posts

@EntongSu

Ph.D. Student @uwcse @uw_robotics

Washington, USA · Joined August 2022
1.3K Following · 562 Followers
Pinned Tweet
Entong Su@EntongSu·
Pretrained diffusion/flow policies are powerful, but brittle at deployment. We introduce RFS, a data-efficient RL framework that:
• steers latent noise for global adaptation
• applies residual actions for precise local correction
Works in sim and real-world dexterous manipulation 🖐️🤖
👉📄 Paper + videos: entongsu.github.io/rfs/
Entong Su retweeted
Binghao Huang@binghao_huang·
🤲 Tactile sensing is powerful for robot manipulation, but hardware is still difficult to access, reproduce, and scale.
🎯 That’s why we built FlexiTac: an open-source, low-cost, and scalable tactile sensing solution designed for real robotic systems.
• Project page: flexitac.github.io
We hope FlexiTac can help democratize tactile sensing for robotics research. (1/n)
Entong Su retweeted
Patrick Yin@patrickhyin·
We’re building UWLab, a shared ecosystem for training robot policies in simulation and transferring them to the real world, built on Isaac Lab. This includes the full OmniReset codebase, along with tasks, algorithms, and deployment in one clean, modular stack: github.com/UW-Lab/UWLab
Entong Su retweeted
Patrick Yin@patrickhyin·
We’re releasing OmniReset, a framework for training robot policies using large-scale RL and diverse resets for contact-rich, dexterous manipulation. OmniReset pushes the frontier of robustness and dexterity, without any reward engineering or demonstrations. Try the policies yourself in our interactive simulator! weirdlabuw.github.io/omnireset/ (1/N 🧵)
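The diverse-reset idea can be sketched in a few lines (all names below are illustrative stand-ins, not the OmniReset API): instead of always resetting the simulator to one nominal start, each episode begins from a mix of nominal, perturbed, and previously visited states, so RL explores contact-rich situations that demos never cover.

```python
import random

random.seed(0)

def nominal_reset():
    """Canonical start state for the task (stand-in)."""
    return {"obj_pos": (0.0, 0.0), "grasped": False}

def perturbed_reset(scale=0.05):
    """Nominal start with randomized object placement."""
    s = nominal_reset()
    s["obj_pos"] = tuple(p + random.uniform(-scale, scale) for p in s["obj_pos"])
    return s

replay_buffer = []  # states visited during earlier rollouts

def sample_reset():
    """Mix reset sources so RL also trains from off-nominal, mid-task states."""
    r = random.random()
    if r < 0.4 or not replay_buffer:
        return nominal_reset()
    if r < 0.7:
        return perturbed_reset()
    return random.choice(replay_buffer)  # restart from a visited state

for episode in range(100):
    state = sample_reset()
    # ... roll out the policy; here we push a stand-in "visited" state ...
    replay_buffer.append(perturbed_reset())
```

The mixing ratios are arbitrary here; the point is that the reset distribution, not the reward, is what gets engineered.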
Entong Su retweeted
Abhishek Gupta@abhishekunique7·
Excited to share the project that has surprised me the most in the last year! Large-scale RL in simulation, with no demos and no reward engineering, can solve dynamic, dexterous, and contact-rich tasks. The learned behaviors are reactive, forceful, and use the environment for recovery in ways that are extremely challenging to bake in or teleoperate! You can play with the policies yourself to see: weirdlabuw.github.io/omnireset/ And the learned behavior transfers to real-world robots from RGB camera inputs! So what’s the trick? Using simulator resets carefully! Let’s unpack (1/10)
Entong Su retweeted
Marius Memmel@memmelma·
There’s a discussion going on rn about two recent robotic reward models: TOPReward⛰️ and Robometer🌡️ Which one is better? It depends entirely on your objective! Here is a deep dive into the conceptual differences, strengths, and weaknesses of both. 🧵👇
Entong Su retweeted
Jesse Zhang@Jesse_Y_Zhang·
A reward model that works, zero-shot, across robots, tasks, and scenes? Introducing Robometer: Scaling general-purpose robotic reward models with 1M+ trajectories. Enables zero-shot: online/offline/model-based RL, data retrieval + IL, automatic failure detection, and more! 🧵 (1/12)
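In spirit, a general-purpose reward model lets you score arbitrary trajectories without per-task reward engineering. A toy sketch (the scoring function below is a hypothetical stand-in, not Robometer's actual interface):

```python
def reward_model(frame, task_description):
    """Stand-in for a pretrained general reward model: scores task progress
    for an observation frame given a natural-language task description."""
    return min(1.0, frame["progress"]) if task_description else 0.0

# A toy trajectory whose "progress" rises from 0.0 to 1.0
trajectory = [{"progress": 0.1 * t} for t in range(11)]
task = "put the cup on the shelf"

# Label every step: dense rewards for online/offline RL
rewards = [reward_model(f, task) for f in trajectory]

# Retrieval / failure detection: keep trajectories that end near success
success = rewards[-1] > 0.9
```

The same scores drive the uses listed in the thread: RL reward labeling, data retrieval for IL, and flagging failures when the final score stays low.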
Entong Su retweeted
Carolina Higuera@carohiguerarias·
Most world models for robot manipulation learn physics from pixels. But pixels don’t see it all. Can we ground these models in the "feeling" of contact to disambiguate visually identical states? Visuo-Tactile World Models (VT-WM): robot imagination in a shared space👇
Entong Su@EntongSu·
⚡ Pretrain broadly. 🎯 Adapt efficiently and precisely. 🔒 Never rewrite the base policy.
Residual Flow Steering (RFS) turns pretrained generative policies into reliable, adaptable controllers for dexterous manipulation.
💙 Huge thanks to my collaborators Tyler Westenbroek, Anusha Nagabandi, @abhishekunique7, and to 🙇 everyone who contributed along the way.
Happy to chat about flow models, residual RL, or sim-to-real dexterous manipulation 👋
Entong Su@EntongSu·
On a Franka + LEAP hand with only ~50 corrective real demos:
✅ +30–40% success over zero-shot
✅ strong gains on unseen objects
✅ fixes real failures: loose grasps, mistimed actions, misplacement
Entong Su@EntongSu·
Following this pipeline, we evaluate Residual Flow Steering (RFS) on 6 dexterous manipulation tasks in simulation:
🤏 grasping
📦 pick-and-place
🧱 stacking
🫗 pouring
➡️ push-to-grasp
⏳ long-horizon packing
Across all tasks, RFS consistently outperforms diffusion/flow RL, offline RL, and residual RL baselines, with especially large gains on long-horizon and high-precision tasks, where efficient recovery and targeted correction matter most.
Entong Su@EntongSu·
Dexterous systems are hard. Why? 👇
• Demos cover only a tiny slice of contacts & motions
• Small execution errors compound fast
• Deployment-time correction is essential
🎯 Many failures are:
• rare
• safety-critical
• expensive to observe on real robots
👉 Direct data collection is often infeasible.
Why simulation matters 🧪
Simulation lets us scale exploration of:
• base-policy-induced states
• failure modes never seen in demos
Things we simply can’t observe systematically in the real world.
But simulation isn’t enough ⚡
• Contact & dynamics mismatches create sim-to-real gaps.
• Bridging them requires high-precision adaptation at deployment.
✨ Residual Flow Steering (RFS)
RFS enables data-efficient real-world adaptation by correcting residual sim-to-real errors using only a small amount of human-corrected data 🤝
Our sim-to-real pipeline 🔁
1️⃣ Train a base policy
• limited VR demos
• flow matching
2️⃣ Online RL in simulation + RFS
• recovery-aware
• robust adaptation
3️⃣ Offline RL on the real robot + RFS
• minimal human corrections
• precise deployment-time fixes
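The three training stages can be written as a skeleton (function names and numbers below are illustrative placeholders, not the released code):

```python
def train_base_policy(vr_demos):
    """Stage 1: fit a flow-matching base policy on a handful of VR demos (stub)."""
    return {"stage": "base", "demos": len(vr_demos), "adapters": None}

def online_rl_in_sim(policy, episodes):
    """Stage 2: train RFS adapters (steering + residual) with online RL in sim;
    the base policy itself stays frozen throughout."""
    return {**policy, "stage": "sim_rl", "adapters": {"sim_episodes": episodes}}

def offline_rl_on_robot(policy, corrections):
    """Stage 3: refine the adapters with offline RL on a few human corrections."""
    adapters = {**policy["adapters"], "real_corrections": len(corrections)}
    return {**policy, "stage": "deployed", "adapters": adapters}

policy = train_base_policy(vr_demos=["demo"] * 30)
policy = online_rl_in_sim(policy, episodes=10_000)
policy = offline_rl_on_robot(policy, corrections=["fix"] * 50)
```

Each stage only adds or updates adapter state; the base policy dict entries are never overwritten, mirroring the frozen-base design.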
Entong Su@EntongSu·
💡 Residual Flow Steering (RFS) combines two complementary adaptation mechanisms. Instead of fine-tuning the generative model, RFS learns:
◦ Latent noise steering → adjusts the global behavior mode of the policy
◦ Residual action control → applies local, high-precision corrections
Both components are trained with RL, while the pretrained generative policy remains frozen. This separation enables efficient adaptation without disrupting prior behavior.
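A minimal sketch of how the two mechanisms might compose, assuming a frozen base policy that maps latent noise plus an observation to an action (all networks below are toy stand-ins; the steering and residual parameters would in practice be trained with RL):

```python
import numpy as np

rng = np.random.default_rng(0)

def base_flow_policy(obs, z):
    """Frozen pretrained generative policy (stand-in): maps latent noise z
    and an observation to a 2-D action, e.g. by integrating a learned flow."""
    return np.tanh(obs[:2] + z)

def steer_latent(obs, z, theta):
    """Latent noise steering: shift the initial noise to pick a behavior mode."""
    return z + theta @ obs

def residual_action(obs, phi):
    """Residual action head: small, high-precision local correction."""
    return 0.05 * np.tanh(phi @ obs)

obs = rng.normal(size=4)
z = rng.normal(size=2)                 # initial latent noise
theta = 0.1 * rng.normal(size=(2, 4))  # steering params (RL-trained in practice)
phi = 0.1 * rng.normal(size=(2, 4))    # residual params (RL-trained in practice)

z_steered = steer_latent(obs, z, theta)    # global adaptation
a_base = base_flow_policy(obs, z_steered)  # frozen policy, never updated
a = a_base + residual_action(obs, phi)     # local correction on top
```

Only `theta` and `phi` carry trainable state, so adaptation cannot overwrite what the base policy learned from demos.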
Entong Su@EntongSu·
🧠 Why pretrained generative policies fail at deployment
Pretrained diffusion/flow policies often look great in training, yet break in the real world.
◦ Limited demos → vulnerable to dynamic changes
◦ Distribution shift → small errors snowball
◦ Correcting mistakes without destroying pretrained behavior is hard
➡️ So how should we fix this? 🛠️ Existing solutions:
🔁 Option 1: Policy finetuning
Update the entire diffusion/flow model with RL
◦ Expensive
◦ Unstable
◦ Easily forgets demonstrations
🎚️ Option 2: Policy modulation
Instead of rewriting the policy, steer it. Two common approaches:
◦ Residual actions → local correction
◦ Latent / noise steering → global behavior shift
✨ Each helps, but only partially
🎥 Toy example below 👇
Entong Su retweeted
Jiafei Duan@DJiafei·
Why do generalist robotic models fail when a cup is moved just two inches to the left? It’s not a lack of motor skill, it’s an alignment problem. Today, we introduce VLS: Vision-Language Steering of Pretrained Robot Policies, a training-free framework that guides robot behavior in real time. Check out the project: vision-language-steering.github.io/webpage/ 👇🧵 (Watch till the end: VLS runs uncut, steering pretrained policies across long-horizon tasks.)
Entong Su retweeted
Abhishek Gupta@abhishekunique7·
Excited to put out new work - PolaRiS, a framework for scalable generalist policy evaluation! The idea is simple - short videos of scenes get converted into high-fidelity simulation environments that match the real world. Then you can evaluate your favorite generalist policy on entirely unseen environments purely in simulation, without requiring real-world evaluations 🪇! Simple, right? Turns out getting it to really work needs some careful research and engineering. Let’s investigate! (1/8) polaris-evals.github.io
Arhan Jain@prodarhan

Excited to introduce PolaRiS, a real-to-sim recipe for turning short real-world videos into high fidelity simulation environments for scalable and reliable zeroshot generalist policy evaluation. polaris-evals.github.io (1/N 🧵)

Entong Su retweeted
Arhan Jain@prodarhan·
Excited to introduce PolaRiS, a real-to-sim recipe for turning short real-world videos into high fidelity simulation environments for scalable and reliable zeroshot generalist policy evaluation. polaris-evals.github.io (1/N 🧵)
Entong Su retweeted
RoboPapers@RoboPapers·
World models — action-conditioned predictive models of the environment — are an exciting area of research for robots that can be useful both for training and for test-time compute. But video-based world models waste a lot of predictive power on reconstructing pixels, which makes model and data requirements much higher and limits how far out into the future their predictions remain viable. Instead, what if we learned a purely semantic world model, one which predicts which properties will be true about the world after a sequence of actions, without reconstructing the whole images? Jacob Berg tells us more. Watch Episode #53 of RoboPapers now, with @micoolcho and @chris_j_paxton!
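The contrast with pixel reconstruction can be illustrated with a toy semantic world model that predicts truth values of a few properties after each action (the states, actions, and transition rules below are invented for illustration, not the episode's actual model):

```python
def semantic_world_model(state, action):
    """Predict which semantic propositions hold after an action,
    instead of reconstructing full images (toy transition rules)."""
    nxt = dict(state)
    if action == "move_to_object":
        nxt["near_object"] = True
    if action == "grasp" and state["near_object"]:
        nxt["holding_object"] = True
    return nxt

# Roll the model forward over a short action sequence
state = {"near_object": False, "holding_object": False}
for a in ["move_to_object", "grasp"]:
    state = semantic_world_model(state, a)
```

Because the prediction target is a handful of booleans rather than megapixels, far less capacity and data go into modeling task-irrelevant appearance.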