Entong Su

58 posts

@EntongSu

Ph.D. Student @uwcse @uw_robotics

Washington, USA · Joined August 2022
1.3K Following · 562 Followers
Pinned Tweet
Entong Su@EntongSu·
Pretrained diffusion/flow policies are powerful, but brittle at deployment. We introduce RFS, a data-efficient RL framework that:
• steers latent noise for global adaptation
• applies residual actions for precise local correction
Works in sim and real-world dexterous manipulation 🖐️🤖
👉📄 Paper + videos: entongsu.github.io/rfs/
Entong Su retweeted
Binghao Huang@binghao_huang·
🤲 Tactile sensing is powerful for robot manipulation, but hardware is still difficult to access, reproduce, and scale.
🎯 That’s why we built FlexiTac: an open-source, low-cost, and scalable tactile sensing solution designed for real robotic systems.
• Project page: flexitac.github.io
We hope FlexiTac can help democratize tactile sensing for robotics research. (1/n)
Entong Su retweeted
Patrick Yin@patrickhyin·
We’re building UWLab, a shared ecosystem for training robot policies in simulation and transferring them to the real world, built on Isaac Lab. This includes the full OmniReset codebase, along with tasks, algorithms, and deployment in one clean, modular stack: github.com/UW-Lab/UWLab
Entong Su retweeted
Patrick Yin@patrickhyin·
We’re releasing OmniReset, a framework for training robot policies using large-scale RL and diverse resets for contact-rich, dexterous manipulation. OmniReset pushes the frontier of robustness and dexterity, without any reward engineering or demonstrations. Try the policies yourself in our interactive simulator! weirdlabuw.github.io/omnireset/ (1/N 🧵)
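The diverse-reset idea can be sketched in a few lines (all names below are illustrative stand-ins, not the OmniReset API): instead of always resetting the simulator to one nominal start, each episode begins from a mix of nominal, perturbed, and previously visited states, so RL explores contact-rich situations that demos never cover.

```python
import random

random.seed(0)

def nominal_reset():
    """Canonical start state for the task (stand-in)."""
    return {"obj_pos": (0.0, 0.0), "grasped": False}

def perturbed_reset(scale=0.05):
    """Nominal start with randomized object placement."""
    s = nominal_reset()
    s["obj_pos"] = tuple(p + random.uniform(-scale, scale) for p in s["obj_pos"])
    return s

replay_buffer = []  # states visited during earlier rollouts

def sample_reset():
    """Mix reset sources so RL also trains from off-nominal, mid-task states."""
    r = random.random()
    if r < 0.4 or not replay_buffer:
        return nominal_reset()
    if r < 0.7:
        return perturbed_reset()
    return random.choice(replay_buffer)  # restart from a visited state

for episode in range(100):
    state = sample_reset()
    # ... roll out the policy; here we push a stand-in "visited" state ...
    replay_buffer.append(perturbed_reset())
```

The mixing ratios are arbitrary here; the point is that the reset distribution, not the reward, is what gets engineered.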
Entong Su retweeted
Abhishek Gupta@abhishekunique7·
Excited to share the project that has surprised me the most in the last year! Large-scale RL in simulation, with no demos and no reward engineering, can solve dynamic, dexterous, and contact-rich tasks. The learned behaviors are reactive, forceful, and use the environment for recovery in ways that are extremely challenging to bake in or teleoperate! You can play with the policies yourself to see: weirdlabuw.github.io/omnireset/ And the learned behavior transfers to real-world robots from RGB camera inputs! So what’s the trick? Using simulator resets carefully! Let’s unpack (1/10)
Entong Su retweeted
Marius Memmel@memmelma·
There’s a discussion going on rn about two recent robotic reward models: TOPReward⛰️ and Robometer🌡️ Which one is better? It depends entirely on your objective! Here is a deep dive into the conceptual differences, strengths, and weaknesses of both. 🧵👇
Entong Su retweeted
Jesse Zhang@Jesse_Y_Zhang·
A reward model that works, zero-shot, across robots, tasks, and scenes? Introducing Robometer: Scaling general-purpose robotic reward models with 1M+ trajectories. Enables zero-shot: online/offline/model-based RL, data retrieval + IL, automatic failure detection, and more! 🧵 (1/12)
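In spirit, a general-purpose reward model lets you score arbitrary trajectories without per-task reward engineering. A toy sketch (the scoring function below is a hypothetical stand-in, not Robometer's actual interface):

```python
def reward_model(frame, task_description):
    """Stand-in for a pretrained general reward model: scores task progress
    for an observation frame given a natural-language task description."""
    return min(1.0, frame["progress"]) if task_description else 0.0

# A toy trajectory whose "progress" rises from 0.0 to 1.0
trajectory = [{"progress": 0.1 * t} for t in range(11)]
task = "put the cup on the shelf"

# Label every step: dense rewards for online/offline RL
rewards = [reward_model(f, task) for f in trajectory]

# Retrieval / failure detection: keep trajectories that end near success
success = rewards[-1] > 0.9
```

The same scores drive the uses listed in the thread: RL reward labeling, data retrieval for IL, and flagging failures when the final score stays low.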
Entong Su retweeted
Carolina Higuera@carohiguerarias·
Most world models for robot manipulation learn physics from pixels. But pixels don’t see it all. Can we ground these models in the "feeling" of contact to disambiguate visually identical states? Visuo-Tactile World Models (VT-WM): robot imagination in a shared space👇
Entong Su@EntongSu·
⚡ Pretrain broadly. 🎯 Adapt efficiently and precisely. 🔒 Never rewrite the base policy.
Residual Flow Steering (RFS) turns pretrained generative policies into reliable, adaptable controllers for dexterous manipulation.
💙 Huge thanks to my collaborators Tyler Westenbroek, Anusha Nagabandi, @abhishekunique7, and to 🙇 everyone who contributed along the way.
Happy to chat about flow models, residual RL, or sim-to-real dexterous manipulation 👋
Entong Su@EntongSu·
On a Franka + LEAP hand with only ~50 corrective real demos:
✅ +30–40% success over zero-shot
✅ strong gains on unseen objects
✅ fixes real failures: loose grasps, mistimed actions, misplacement
Entong Su@EntongSu·
Following this pipeline, we evaluate Residual Flow Steering (RFS) on 6 dexterous manipulation tasks in simulation:
🤏 grasping
📦 pick-and-place
🧱 stacking
🫗 pouring
➡️ push-to-grasp
⏳ long-horizon packing
Across all tasks, RFS consistently outperforms diffusion/flow RL, offline RL, and residual RL baselines, with especially large gains on long-horizon and high-precision tasks, where efficient recovery and targeted correction matter most.
Entong Su@EntongSu·
Dexterous systems are hard. Why? 👇
• Demos cover only a tiny slice of contacts & motions
• Small execution errors compound fast
• Deployment-time correction is essential
🎯 Many failures are:
• rare
• safety-critical
• expensive to observe on real robots
👉 Direct data collection is often infeasible.
Why simulation matters 🧪
Simulation lets us scale exploration of:
• base-policy-induced states
• failure modes never seen in demos
Things we simply can’t observe systematically in the real world.
But simulation isn’t enough ⚡
• Contact & dynamics mismatches create sim-to-real gaps.
• Bridging them requires high-precision adaptation at deployment.
✨ Residual Flow Steering (RFS)
RFS enables data-efficient real-world adaptation by correcting residual sim-to-real errors using only a small amount of human-corrected data 🤝
Our sim-to-real pipeline 🔁
1️⃣ Train a base policy
• limited VR demos
• flow matching
2️⃣ Online RL in simulation + RFS
• recovery-aware
• robust adaptation
3️⃣ Offline RL on the real robot + RFS
• minimal human corrections
• precise deployment-time fixes
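The three training stages can be written as a skeleton (function names and numbers below are illustrative placeholders, not the released code):

```python
def train_base_policy(vr_demos):
    """Stage 1: fit a flow-matching base policy on a handful of VR demos (stub)."""
    return {"stage": "base", "demos": len(vr_demos), "adapters": None}

def online_rl_in_sim(policy, episodes):
    """Stage 2: train RFS adapters (steering + residual) with online RL in sim;
    the base policy itself stays frozen throughout."""
    return {**policy, "stage": "sim_rl", "adapters": {"sim_episodes": episodes}}

def offline_rl_on_robot(policy, corrections):
    """Stage 3: refine the adapters with offline RL on a few human corrections."""
    adapters = {**policy["adapters"], "real_corrections": len(corrections)}
    return {**policy, "stage": "deployed", "adapters": adapters}

policy = train_base_policy(vr_demos=["demo"] * 30)
policy = online_rl_in_sim(policy, episodes=10_000)
policy = offline_rl_on_robot(policy, corrections=["fix"] * 50)
```

Each stage only adds or updates adapter state; the base policy dict entries are never overwritten, mirroring the frozen-base design.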
Entong Su@EntongSu·
💡 Residual Flow Steering (RFS) combines two complementary adaptation mechanisms. Instead of fine-tuning the generative model, RFS learns:
◦ Latent noise steering → adjusts the global behavior mode of the policy
◦ Residual action control → applies local, high-precision corrections
Both components are trained with RL, while the pretrained generative policy remains frozen. This separation enables efficient adaptation without disrupting prior behavior.
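A minimal sketch of how the two mechanisms might compose, assuming a frozen base policy that maps latent noise plus an observation to an action (all networks below are toy stand-ins; the steering and residual parameters would in practice be trained with RL):

```python
import numpy as np

rng = np.random.default_rng(0)

def base_flow_policy(obs, z):
    """Frozen pretrained generative policy (stand-in): maps latent noise z
    and an observation to a 2-D action, e.g. by integrating a learned flow."""
    return np.tanh(obs[:2] + z)

def steer_latent(obs, z, theta):
    """Latent noise steering: shift the initial noise to pick a behavior mode."""
    return z + theta @ obs

def residual_action(obs, phi):
    """Residual action head: small, high-precision local correction."""
    return 0.05 * np.tanh(phi @ obs)

obs = rng.normal(size=4)
z = rng.normal(size=2)                 # initial latent noise
theta = 0.1 * rng.normal(size=(2, 4))  # steering params (RL-trained in practice)
phi = 0.1 * rng.normal(size=(2, 4))    # residual params (RL-trained in practice)

z_steered = steer_latent(obs, z, theta)    # global adaptation
a_base = base_flow_policy(obs, z_steered)  # frozen policy, never updated
a = a_base + residual_action(obs, phi)     # local correction on top
```

Only `theta` and `phi` carry trainable state, so adaptation cannot overwrite what the base policy learned from demos.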
Entong Su@EntongSu·
🧠 Why pretrained generative policies fail at deployment
Pretrained diffusion/flow policies often look great in training, yet break in the real world.
◦ Limited demos → vulnerable to dynamic changes
◦ Distribution shift → small errors snowball
◦ Correcting mistakes without destroying pretrained behavior is hard
➡️ So how should we fix this? 🛠️ Existing solutions:
🔁 Option 1: Policy finetuning
Update the entire diffusion/flow model with RL
◦ Expensive
◦ Unstable
◦ Easily forgets demonstrations
🎚️ Option 2: Policy modulation
Instead of rewriting the policy, steer it. Two common approaches:
◦ Residual actions → local correction
◦ Latent / noise steering → global behavior shift
✨ Each helps, but only partially
🎥 Toy example below 👇
Entong Su retweeted
Jiafei Duan@DJiafei·
Why do generalist robotic models fail when a cup is moved just two inches to the left? It’s not a lack of motor skill, it’s an alignment problem. Today, we introduce VLS: Vision-Language Steering of Pretrained Robot Policies, a training-free framework that guides robot behavior in real time. Check out the project: vision-language-steering.github.io/webpage/ 👇🧵 (Watch till the end: VLS runs uncut, steering pretrained policies across long-horizon tasks.)
Entong Su retweeted
Abhishek Gupta@abhishekunique7·
Excited to put out new work - PolaRiS, a framework for scalable generalist policy evaluation! The idea is simple - short videos of scenes get converted into high-fidelity simulation environments that match the real world. Then you can evaluate your favorite generalist policy on entirely unseen environments purely in simulation, without requiring real-world evaluations 🪇! Simple, right? Turns out getting it to really work needs some careful research and engineering. Let’s investigate! (1/8) polaris-evals.github.io
Arhan Jain@prodarhan

Excited to introduce PolaRiS, a real-to-sim recipe for turning short real-world videos into high fidelity simulation environments for scalable and reliable zeroshot generalist policy evaluation. polaris-evals.github.io (1/N 🧵)

Entong Su retweeted
Arhan Jain@prodarhan·
Excited to introduce PolaRiS, a real-to-sim recipe for turning short real-world videos into high fidelity simulation environments for scalable and reliable zeroshot generalist policy evaluation. polaris-evals.github.io (1/N 🧵)
Entong Su retweeted
RoboPapers@RoboPapers·
World models — action-conditioned predictive models of the environment — are an exciting area of research for robots that can be useful both for training and for test-time compute. But video-based world models waste a lot of predictive power on reconstructing pixels, which makes model and data requirements much higher and limits how far out into the future their predictions remain viable. Instead, what if we learned a purely semantic world model, one which predicts which properties will be true about the world after a sequence of actions, without reconstructing the whole images? Jacob Berg tells us more. Watch Episode #53 of RoboPapers now, with @micoolcho and @chris_j_paxton!
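The contrast with pixel reconstruction can be illustrated with a toy semantic world model that predicts truth values of a few properties after each action (the states, actions, and transition rules below are invented for illustration, not the episode's actual model):

```python
def semantic_world_model(state, action):
    """Predict which semantic propositions hold after an action,
    instead of reconstructing full images (toy transition rules)."""
    nxt = dict(state)
    if action == "move_to_object":
        nxt["near_object"] = True
    if action == "grasp" and state["near_object"]:
        nxt["holding_object"] = True
    return nxt

# Roll the model forward over a short action sequence
state = {"near_object": False, "holding_object": False}
for a in ["move_to_object", "grasp"]:
    state = semantic_world_model(state, a)
```

Because the prediction target is a handful of booleans rather than megapixels, far less capacity and data go into modeling task-irrelevant appearance.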