Suning Huang
@suning_huang

89 posts

PhD @Stanford | BEng @Tsinghua_Uni. Learning to teach robots to learn. Nice to meet you ;)

Palo Alto · Joined January 2024
281 Following · 455 Followers

Pinned Tweet
Suning Huang @suning_huang
🤖Low-data post-training can teach a VLA policy a new robot skill. But it also makes it too attached to the training demos. We call this lock-in🔒: the policy can execute the post-training task, yet fails to respond to seemingly obvious prompt changes. DeLock preserves steerability using only the policy’s own pretrained knowledge. No extra supervision needed!🚀🚀🚀 #Robotics #AI #EmbodiedAI #VLA
5 replies · 44 reposts · 175 likes · 28.7K views
Suning Huang reposted
Robots Digest 🤖 @robotsdigest
Ever fine-tuned a VLA policy on a small demo dataset and found it suddenly stops listening to new instructions? This paper calls it lock-in: the model just repeats what it saw during training, like always picking bread even when you say apple. Low-data post-training quietly kills steerability. The fix? DeLock is surprisingly simple and clever.
1 reply · 17 reposts · 61 likes · 4.4K views
Suning Huang @suning_huang
Thanks for the thoughtful point! DeLock is not meant to replace SFT or to make arbitrary unseen skills work out of the box. It aims to reduce the combinatorial burden of SFT by leveraging the pretrained backbone to connect post-trained skills with related novel instructions, so we don’t need demos for every variation. Its effectiveness therefore depends both on the similarity between the trained and novel tasks and on how much the VLA backbone already knows about the relevant concepts and skills.
0 replies · 0 reposts · 0 likes · 73 views
Far @FarAICoder
@suning_huang de-locking sounds nice but i bet it still crashes if you ask the robot to hold a coffee instead of a wrench
1 reply · 0 reposts · 0 likes · 112 views
Suning Huang reposted
Mac Schwager @MacSchwager
How well do VLAs generalize to new prompts after SFT? If you've worked with them, you'll know the answer. The problem is the fine-tuning methodology, not the model. Suning has a clever and effective solution that requires no new data, just better SFT and inference methods. 👇
[Quoted tweet: @suning_huang — the pinned DeLock announcement above]
0 replies · 4 reposts · 20 likes · 3K views
Suning Huang reposted
Jeannette Bohg @leto__jean
Ever post-trained a VLA and watched it ignore every novel instruction? We call this lock-in. Prior fixes bloat datasets with foundation model labels. 🔓DeLock is different: regularized finetuning + contrastive prompts at inference. Result: Pretraining priors preserved.
[Quoted tweet: @suning_huang — the pinned DeLock announcement above]
0 replies · 5 reposts · 34 likes · 7K views
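The tweet above describes DeLock as "regularized finetuning + contrastive prompts at inference." The thread doesn't spell out the mechanism, but a common way to steer a policy with contrastive prompts at inference is classifier-free-guidance style: compare the model's output under the actual instruction against a neutral prompt and amplify the difference. A minimal toy sketch of that general idea (NOT the real DeLock implementation; `policy_logits` is a stand-in stub, and the guidance weight `w` is a hypothetical parameter):

```python
# Toy sketch of contrastive prompt guidance at inference (CFG-style).
# policy_logits is a deterministic stub standing in for a VLA action head.
import numpy as np

def policy_logits(prompt: str, num_actions: int = 4) -> np.ndarray:
    """Stub action head: maps a prompt to logits over discrete actions."""
    seed = sum(map(ord, prompt)) % (2**32)  # stable per-prompt seed for the toy
    rng = np.random.default_rng(seed)
    return rng.normal(size=num_actions)

def contrastive_guidance(task_prompt: str, neutral_prompt: str, w: float = 2.0) -> np.ndarray:
    """Amplify whatever the instruction changes relative to a neutral prompt."""
    l_task = policy_logits(task_prompt)
    l_neutral = policy_logits(neutral_prompt)
    # w > 1 up-weights prompt-specific evidence over the (possibly locked-in) prior
    guided = l_neutral + w * (l_task - l_neutral)
    e = np.exp(guided - guided.max())
    return e / e.sum()  # softmax over actions

probs = contrastive_guidance("pick up the apple", "perform the task")
print(probs)
```

The intuition matches the lock-in story: if post-training makes the policy nearly ignore the prompt, `l_task - l_neutral` isolates the small signal the instruction still carries, and the guidance weight magnifies it.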