Arhan Jain (@prodarhan) - Twitter Profile

Pinned Tweet

Arhan Jain@prodarhan·18 Dec

Excited to introduce PolaRiS, a real-to-sim recipe for turning short real-world videos into high-fidelity simulation environments for scalable and reliable zero-shot generalist policy evaluation. polaris-evals.github.io (1/N 🧵)

8

48

235

63.7K

Arhan Jain retweeted

Patrick Yin@patrickhyin·1d

We’re releasing OmniReset, a framework for training robot policies using large-scale RL and diverse resets for contact-rich, dexterous manipulation. OmniReset pushes the frontier of robustness and dexterity, without any reward engineering or demonstrations. Try the policies yourself in our interactive simulator! weirdlabuw.github.io/omnireset/ (1/N 🧵)

18

87

413

83.2K

Arhan Jain@prodarhan·4 Mar

@KarlPertsch congrats!!

0

106

Karl Pertsch@KarlPertsch·4 Mar

This one has been a long time coming: today we’re introducing MEM, an approach for giving VLAs short-term and long-term memory. Memory is such an obvious capability, but adding it isn’t easy (most VLAs today are memory-less). A short thread on challenges, solutions, and the new capabilities MEM unlocks for us.

8

10

109

8.9K

Arhan Jain@prodarhan·26 Feb

Very cool use of PolaRiS by @jackyk02 and @XilunZhang1999 as a testbed for their new work in scaling verifiers for robot policies‼️

Jacky Kwok@jackyk02

🧵(6) DROID Eval
CoVer-VLA achieves 14% gains in task progress and 9% in success rate on the challenging red-team PolaRiS benchmark. In the pan cleaning task, π₀.₅ shows incorrect intent, grasping the pan handle. In contrast, CoVer-VLA correctly uses sponge to scrub the pan.

1

0

8

547

Arhan Jain@prodarhan·26 Feb

@XilunZhang1999 Congrats to you and Jacky, very cool work :)

0

1

63

Arhan Jain retweeted

XilunZhang@XilunZhang1999·25 Feb

Excited to share CoVer-VLA, a fully self-supervised action verifier for VLA models and the first work of my PhD! 🤖 We developed a lightweight verifier that assesses VLA action quality by aligning actions with text-visual features. Best of all? It requires zero failure data and scales seamlessly to large robotics datasets. Beyond verification, CoVer learns aligned action representations via contrastive learning, opening doors for more downstream robotics tasks such as data curation and OOD detection! 🚀 Huge thanks to my amazing collaborators and advisors, and a special shout-out to @prodarhan for the help with PolaRiS! Truly an incredible platform. Please check out more details in the post, and try to CoVer your VLA policy!

Jacky Kwok@jackyk02

Introducing CoVer-VLA💫: a contrastive verifier + hierarchical test-time scaling framework for VLAs!
- Lightweight 1B verifier 🧠
- Outperforms π₀ & π₀.₅ 🦾
- Trained on Bridge & DROID 🤖
Turns out scaling verification > scaling policy learning for VLA alignment! 🧵👇
🌐 Website: cover-vla.github.io
📄 Paper: arxiv.org/abs/2602.12281
🤗 Models: huggingface.co/cover-vla
💻 Code: github.com/cover-vla/cove…

1

8

26

3.8K

Arhan Jain@prodarhan·21 Feb

2

0

9

225

Arhan Jain@prodarhan·17 Feb

@ethnlshn don’t cry ethan

0

100

Ethan Shen@ethnlshn·17 Feb

😭😭

Andon Labs@andonlabs

In Vending-Bench Arena, Sonnet 4.6 wins over Opus 4.6 by obsessing over monopolies. It tracks competitor pricing fanatically, undercuts competitors by exactly one cent on everything else, and when rivals run low on stock, it undercuts harder to drain them faster.

2

0

4

654

Arhan Jain@prodarhan·16 Feb

@verityw_ yay congrats will!!

0

2

137

Arhan Jain retweeted

Will Chen@verityw_·16 Feb

How can robot policies be trained to best leverage VLMs' CoT reasoning and in-context learning for generalization? The key is Steerable Policies: vision-language-action models that can be flexibly controlled in many ways! steerable-policies.github.io 1/9

7

37

142

22.5K

Nicholas Pfaff@NicholasEPfaff·11 Feb

Meet SceneSmith: An agentic system that generates entire simulation-ready environments from a single text prompt. VLM agents collaborate to build scenes with dozens of objects per room, articulated furniture, and full physics properties. We believe environment generation is no longer the bottleneck for scalable robot training and evaluation in simulation. Website: scenesmith.github.io 👇🧵(1/8)

18

79

560

71.4K

Chris Paxton@chris_j_paxton·11 Feb

Building sim environments is hard -- especially ones that are useful for assessing progress on real-world performance. With PolaRiS, you can scan an environment for 2-5 minutes and use 2D Gaussian splatting + some tools to create a high-fidelity simulation.

RoboPapers@RoboPapers

Evaluating robot policies is hard. Ideally, instead of testing every new policy on a real robot, you could test in simulation; but simulations rarely correlate well with real-world performance. In order to make good, useful simulations, you need to spend a great deal of time and effort. That’s where PolaRiS comes in: it’s a toolkit that lets you take a short video of a real scene and turn it into a high-fidelity simulation. It provides what you need to build a good evaluation environment, and it “ships” with off-the-shelf environments that already show strong sim-to-real correlation, meaning that they can be used to inform policy performance. @prodarhan and @KarlPertsch join us to talk about what they have built, why, and how you can use it. Watch Episode #62 of RoboPapers, with @chris_j_paxton and @DJiafei, now!

5

10

95

8.3K
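
The "sim-to-real correlation" mentioned in the RoboPapers post above can be made concrete with a small sketch. The post does not say which statistic PolaRiS reports, so Pearson's r is an assumption here, and `pearson_r`, `sim_scores`, and `real_scores` are hypothetical names chosen for illustration. The idea is to pair each policy's score in simulation with its score on the real robot and measure how closely the two move together.

```python
import math
from typing import Sequence


def pearson_r(sim_scores: Sequence[float], real_scores: Sequence[float]) -> float:
    """Pearson correlation between paired sim and real scores, one pair per policy."""
    if len(sim_scores) != len(real_scores):
        raise ValueError("need one sim score and one real score per policy")
    n = len(sim_scores)
    if n < 2:
        raise ValueError("need at least two policies")
    mean_sim = sum(sim_scores) / n
    mean_real = sum(real_scores) / n
    # Covariance and variances, left unnormalized because the 1/n factors cancel.
    cov = sum((s - mean_sim) * (r - mean_real) for s, r in zip(sim_scores, real_scores))
    var_sim = sum((s - mean_sim) ** 2 for s in sim_scores)
    var_real = sum((r - mean_real) ** 2 for r in real_scores)
    if var_sim == 0 or var_real == 0:
        raise ValueError("scores must vary across policies")
    return cov / math.sqrt(var_sim * var_real)
```

A value near 1 would mean that policies scoring higher in simulation also score higher on the real robot, which is the property that lets a simulated benchmark inform real-world policy performance.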

Arhan Jain@prodarhan·11 Feb

@shreyasgite @chris_j_paxton haha we tried to do a very early variant of this here, using our vision policy to bootstrap state-based learning in new sim environments, to improve the vision policy downstream: casher-robot-learning.github.io/CASHER/ But there is still some work to do to fill all the gaps!

0

23

Shreyas Gite@shreyasgite·11 Feb

@chris_j_paxton So it’s happening. How long do you think before this gets automated? where you scan the env, upload few training episodes and you get a loop.
While True:
- Policy trained in the freshly minted sim
- Deployed on the robot
- Eval data back to sim
- Sim update
- New policy

1

0

2

182
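
The loop in Shreyas Gite's reply above can be sketched in Python. This is a minimal illustration under assumptions: `run_loop`, `train_policy`, `deploy_and_collect`, and `update_sim` are hypothetical placeholders rather than part of PolaRiS or any real library, the sim and policy are plain dictionaries, and the loop is bounded by `num_rounds` instead of running forever.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins: a sim and a policy are plain dicts in this sketch.
Sim = Dict[str, int]
Policy = Dict[str, int]
EvalData = List[str]


def run_loop(
    sim: Sim,
    train_policy: Callable[[Sim], Policy],
    deploy_and_collect: Callable[[Policy], EvalData],
    update_sim: Callable[[Sim, EvalData], Sim],
    num_rounds: int,
) -> Tuple[Policy, Sim]:
    """Train in sim, deploy on the robot, feed eval data back, update the sim, repeat."""
    if num_rounds < 1:
        raise ValueError("num_rounds must be at least 1")
    policy: Policy = {}
    for _ in range(num_rounds):
        policy = train_policy(sim)              # policy trained in the current sim
        eval_data = deploy_and_collect(policy)  # deployed on the robot; eval data comes back
        sim = update_sim(sim, eval_data)        # sim updated from the real-world evals
    # The next call to train_policy(sim) would produce the "new policy" step.
    return policy, sim
```

Passing the three steps in as callables keeps the sketch independent of any particular simulator or robot stack; each one can be swapped for a real implementation.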

Arhan Jain@prodarhan·11 Feb

@NicholasEPfaff release quality is amazing! congrats!

0

3

260

Arhan Jain@prodarhan·11 Feb

Check out our deep dive with @chris_j_paxton and @DJiafei on using simulation to faithfully evaluate generalist policies in digital twins!🤠

0

1

14

745

Arhan Jain@prodarhan·11 Feb

@micoolcho @KarlPertsch thanks for having us!

0

1

51

Michael Cho - Rbt/Acc@micoolcho·11 Feb

Tks @prodarhan @KarlPertsch for sharing some Sim2Real magic on our pod 🙏

2

1

12

1.2K

Arhan Jain retweeted

Gabriele Tinelli@GabrieleTin·11 Feb

A key bottleneck to fast deployment is understanding how you'll fail in a new env. We need ways to spin up rapid simulations / evaluations of robot policies in new environments if we want to keep high iteration speed. @prodarhan and @KarlPertsch are building something cool.

0

1

263

Arhan Jain retweeted

Jiafei Duan@DJiafei·9 Feb

Fun chat and great insights from @prodarhan and @KarlPertsch on evaluation!

RoboPapers@RoboPapers

Full episode dropping soon! Geeking out with @prodarhan @KarlPertsch on PolaRiS: Scalable Real-to-Sim Evaluations for Generalist Robot Policies polaris-evals.github.io Co-hosted by @chris_j_paxton @DJiafei

0

2

10

2.5K

Arhan Jain retweeted

RoboPapers@RoboPapers·9 Feb

Full episode dropping soon! Geeking out with @prodarhan @KarlPertsch on PolaRiS: Scalable Real-to-Sim Evaluations for Generalist Robot Policies polaris-evals.github.io Co-hosted by @chris_j_paxton @DJiafei