Yash Jangir
@off_jangir

25 posts

Robotics @CarnegieMellon @CMU_robotics with @katerinafragiad and @ybisk | Noobie at Twitter

Pittsburgh, PA, USA · Joined August 2020
398 Following · 116 Followers
Pinned Tweet
Yash Jangir @off_jangir ·
🤖 What would LMArena for robotics look like?
Introducing RobotArena ∞
We turn real videos into simulated environments and evaluate robot policies at scale using VLM scoring + human preferences.
A scalable benchmark for robot generalists.
🔗 robotarenainf.github.io
Details 🧵👇
4 replies · 22 reposts · 110 likes · 16.9K views
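The pinned thread describes ranking robot policies from pairwise VLM and human preferences. One common way such pairwise votes could be aggregated into a leaderboard is Elo-style rating, sketched below; the rating scheme, K-factor, and policy names are illustrative assumptions, not RobotArena's actual aggregation method.

```python
def elo_update(r_a, r_b, a_won, k=32.0):
    """One Elo update from a single pairwise preference (a VLM or human
    vote saying which policy's rollout looked better)."""
    expected_a = 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))
    delta = k * ((1.0 if a_won else 0.0) - expected_a)
    return r_a + delta, r_b - delta

def rank_policies(votes, policies):
    """Fold a stream of (policy_a, policy_b, a_won) votes into ratings
    and return policies sorted best-first."""
    ratings = {p: 1000.0 for p in policies}
    for a, b, a_won in votes:
        ratings[a], ratings[b] = elo_update(ratings[a], ratings[b], a_won)
    return sorted(ratings, key=ratings.get, reverse=True)
```

A scheme like this needs no absolute scores, only comparisons, which is why it pairs naturally with both VLM judgments and human preference clicks.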
Yash Jangir @off_jangir ·
Unprecedented Evaluation Scale
We evaluate robot policies across
✅ hundreds of environments
✅ thousands of perturbations
✅ thousands of VLM and human preference evaluations
VLAs perform well in BridgeSim, but degrade sharply in DROIDSim and RH20T. VLMs and humans agree on these rankings.
[image attached]
1 reply · 0 reposts · 2 likes · 378 views
Yash Jangir @off_jangir ·
Amazing work from @YunzhuLiYZ’s group. Check it out!!
Yunzhu Li @YunzhuLiYZ

For a long time, I was skeptical about action-conditioned video prediction for robotics. Many models look impressive, but once you ask them to handle long-horizon manipulation with real physical interaction, things quickly fall apart (e.g., Genie is amazing but mostly focused on navigation). This project changed my mind.

I'm beyond excited to share Interactive World Simulator, a project we have been working on for the past ~1.5 years 🤖 One of the first world models that produces convincing results for long-horizon robotic manipulation involving complex physical interactions, across a diverse range of objects (rigid objects, deformables, ropes, object piles). It directly unlocks scalable data generation for robotic policy training and policy evaluation.

Try it yourself (no installation needed): yixuanwang.me/interactive_wo… Play directly with the simulator in your browser.

Key Takeaways:
1️⃣ 15 Hz long-horizon action-conditioned video prediction for 10+ minutes on a single RTX 4090 GPU
2️⃣ Visual and dynamic fidelity: people often ask how much sim data equals one real data point. In our experiments, it turns out to be close to one-to-one using the Interactive World Simulator
3️⃣ Stress testing matters: we emphasize interactive stress testing to understand robustness and stability and to build trust in the simulator
4️⃣ The model is trained with only ~6 hours of real-world random interaction data on a single GPU. Imagine what happens if we scale this 1000× or even 1M×

Huge credit to @YXWangBot, who led this effort with countless hours of work on data collection, training recipes, and system design. I'm incredibly proud of the work he did here! Enjoy the demos and videos. We also fully open-sourced the codebase for anyone interested in applying this to their own tasks. #Robotics #RobotLearning #WorldModels #EmbodiedAI

0 replies · 0 reposts · 1 like · 152 views
Jiafei Duan @DJiafei ·
@off_jangir 1) Yes, as shown in the paper. 2) We set the temperature to 0, so the reward is stable, with minimal difference between repeated runs.
1 reply · 0 reposts · 1 like · 51 views
Jiafei Duan @DJiafei ·
Instead of asking a VLM to output progress, it reads the model's internal belief directly from token logits.
No in-context learning. No fine-tuning. No reward training.
📈 We introduce: TOPReward, a zero-shot reward modeling approach for robotics using token probabilities from pretrained video VLMs.
The simplest way of doing reward modelling for robotics!
Project: topreward.github.io/webpage/
🧵👇
12 replies · 64 reposts · 362 likes · 105.7K views
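The thread's core idea, reading a reward off the VLM's answer-token probabilities rather than its decoded text, can be sketched generically. The two-token "yes"/"no" renormalization and the prompt below are assumptions for illustration; see the project page for TOPReward's actual formulation.

```python
import math

def token_prob_reward(logits, yes_token="yes", no_token="no"):
    """Zero-shot scalar reward from token log-scores: renormalize the
    model's probability mass over the 'yes'/'no' answer tokens to a
    question like "Is the robot making progress toward the goal?".
    `logits` maps answer tokens to unnormalized log-scores (illustrative
    stand-in for a real VLM's output head)."""
    p_yes = math.exp(logits[yes_token])
    p_no = math.exp(logits[no_token])
    return p_yes / (p_yes + p_no)

# With greedy decoding (temperature 0), the logits for a fixed trajectory
# are deterministic, so the reward is identical across repeated runs.
```

Because the reward comes straight from the forward pass, no in-context examples, fine-tuning, or reward-model training are involved, which matches the zero-shot framing in the tweet.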
Yash Jangir @off_jangir ·
@CSProfKGD Very interesting that the model also changed the expression to match the second image.
0 replies · 0 reposts · 1 like · 44 views
Yash Jangir @off_jangir ·
Very insightful @DanielXieee
Quanting Xie @DanielXieee

Why does manipulation lag so far behind locomotion? New post on one piece we don't talk about enough: the gearbox.

The Gap
You've probably seen those dancing humanoid robots from Chinese New Year. Locomotion isn't entirely solved, but clearly it's on a trajectory. We haven't seen anything close for manipulation. Why?

When sim-to-real transfer fails, the instinct is to blame the algorithm. Train bigger networks. Crank up domain randomization. Those approaches have made real progress; we don't deny that. But we started wondering: are we treating the symptom or the disease?

The Hardware Bottleneck
Fingers are too small for powerful motors, so most hands use massive gearboxes (200:1, 288:1) to get enough torque. But those gearboxes break everything manipulation needs:
• Stiction and backlash are complex to simulate. Policies trained on smooth physics hallucinate when they hit that reality.
• Reflected inertia scales as N². At large gear ratios, the finger hits with sledgehammer momentum.
• Friction blocks force information. The hand becomes blind.
And they're the first thing to break.

What we are trying to build at Origami: we cut the gear ratio from 288:1 to 15:1 using axial flux motors and thermal optimization. The transmission becomes more transparent: backdrivable, low friction, and forces propagate to motor current. Early signs are encouraging; we're still running quantitative benchmarks.

Why Interactive?
I love how the Science Center uses interactive devices to explain complex ideas. I want to borrow this concept and help people understand the hard problems in robotics better, visually. The post has demos where you can toggle friction, slide gear ratios, and watch the sim-to-real gap widen in real time.

What's inside:
• Interactive demos (friction curves, N² scaling, contact patterns)
• A comparison table: 14 robot hands by sim-to-real gap and force transparency
• The math behind why a low ratio matters

Read it here: origami-robotics.com/blog/dexterity…

We're not claiming we've solved dexterity. The deadlock has many pieces. But we think this one's foundational. Curious what you think.

0 replies · 0 reposts · 0 likes · 156 views
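The N² reflected-inertia point in the quoted post can be made concrete with a few lines. The rotor inertia value below is a made-up placeholder; the scaling law itself (output-side inertia grows with the square of the gear ratio) is standard.

```python
def reflected_inertia(motor_inertia, gear_ratio):
    """Inertia the fingertip 'feels' at the gearbox output:
    J_reflected = N^2 * J_motor for an N:1 gear ratio."""
    return gear_ratio ** 2 * motor_inertia

J_ROTOR = 1e-6  # kg*m^2, hypothetical small rotor

high = reflected_inertia(J_ROTOR, 288)  # high-ratio dexterous-hand gearbox
low = reflected_inertia(J_ROTOR, 15)    # low-ratio transmission
# Dropping from 288:1 to 15:1 cuts reflected inertia by (288/15)^2 ≈ 369x,
# which is the "sledgehammer momentum" difference the post describes.
```

This is why a modest-looking ratio change dominates: torque gain is linear in N, but the inertia penalty is quadratic.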
Yash Jangir @off_jangir ·
@DJiafei 1) Do you think this approach might improve correlation with human preference / annotation scores as well? 2) Just curious, did you evaluate how stable the reward is across repeated runs on the same trajectory?
1 reply · 0 reposts · 0 likes · 49 views
Jiafei Duan @DJiafei ·
@off_jangir RobotArena ∞ is cool work! We found the same trend with GVL-style scoring for open-source models. But with our new insight, open-source models can become decently good too!
1 reply · 0 reposts · 2 likes · 256 views
Yash Jangir reposted
Zhengzhong Tu @_vztu ·
Dear NeurIPS reviewers, please be reminded to delete the GPT prompts next time :)
[image attached]
24 replies · 59 reposts · 1K likes · 89.8K views
Yash Jangir reposted
Mihir Prabhudesai @mihirp98 ·
Extrapolating this trend to robotics, I believe that if one is doing sim2real, they should prefer Autoregressive > Diffusion (compute bottleneck). But if they are doing real-world training, then Autoregressive < Diffusion (data bottleneck). We don't empirically validate this for the robotics domain, but this would be my guess.
Mihir Prabhudesai @mihirp98

🚨 The era of infinite internet data is ending. So we ask:
👉 What's the right generative modelling objective when data, not compute, is the bottleneck?
TL;DR:
▶️ Compute-constrained? Train Autoregressive models
▶️ Data-constrained? Train Diffusion models
Get ready for 🤿 1/n

3 replies · 10 reposts · 125 likes · 16.9K views
Yash Jangir reposted
Sungjae Park @sungj1026 ·
Introducing DemoDiffusion: A simple approach for enabling one-shot imitation of a human demonstration, using a pre-trained ‘generalist’ diffusion-style (diffusion, flow-matching, etc.) policy. No additional training, no paired human-robot data, no online RL. 🧵(1/n)
1 reply · 10 reposts · 49 likes · 7.1K views