Ajay Sridhar

49 posts

@ajaysridhar0

cs phd student @StanfordAILab

Joined June 2023
314 Following · 358 Followers
Pinned Tweet
Ajay Sridhar @ajaysridhar0 ·
VLAs are great, but most lack the long-term memory humans rely on for everyday tasks. This is a critical gap for solving complex, long-horizon problems. Introducing MemER: Scaling Up Memory for Robot Control via Experience Retrieval. A thread 🧵 (1/8)
Ajay Sridhar retweeted
Patrick Yin @patrickhyin ·
We’re releasing OmniReset, a framework for training robot policies using large-scale RL and diverse resets for contact-rich, dexterous manipulation. OmniReset pushes the frontier of robustness and dexterity, without any reward engineering or demonstrations. Try the policies yourself in our interactive simulator! weirdlabuw.github.io/omnireset/ (1/N 🧵)
Ajay Sridhar retweeted
Jensen Gao @jensen_gao ·
Understanding generalization in robotics can be tricky. If a robot does the dishes in a new kitchen, does this require new behavior, or is the countertop just a new color? Excited to share RADAR 📡, work I did at @GoogleDeepMind towards better characterizing robot evaluations.
Ajay Sridhar retweeted
Yinpei Dai @YinpeiD ·
Robot memory methods are growing fast, but systematic evaluation is largely lacking. 📉 Introducing RoboMME: a new benchmark for memory-augmented robotic manipulation! 🤖🧠 Featuring 16 tasks across temporal, spatial, object, and procedural memory 🔗 robomme.github.io
Ajay Sridhar retweeted
Marcel Torné @marceltornev ·
We equipped PI policies with memory! And taught our robots to do long-horizon real world tasks such as preparing the items for a recipe, cooking a grilled cheese and cleaning the kitchen!
Physical Intelligence@physical_int

We’ve developed a memory system for our models that provides both short-term visual memory and long-term semantic memory. Our approach allows us to train robots to perform long and complex tasks, like cleaning up a kitchen or preparing a grilled cheese sandwich from scratch 👇

Ajay Sridhar retweeted
Noriaki Hirose @Noriaki_Hirose ·
My journey at UC Berkeley is coming to an end as I return to Japan. Over the past four years, I’ve had the privilege of collaborating with @svlevine and his students and I sincerely appreciate their support and contributions. I’ve learned and grown tremendously through our works
Ajay Sridhar retweeted
Yanjiang Guo @GYanjiang ·
Excited to share VLAW: Iterative Co-Improvement of Vision-Language-Action Policy and World Model. We explore improving a VLA inside a learned world model, and find that the key is to jointly improve the VLA and the WM! Website: sites.google.com/view/vlaw-arxiv
Ajay Sridhar retweeted
Noriaki Hirose @Noriaki_Hirose ·
Robotic foundation models generalize well—but high inference latency limits real-time deployment. 🚀 AsyncVLA enables real-time control of large robotic models, even under network delays. Great collaboration with @CatGlossop ,@shahdhruv_ & @svlevine ! #EmbodiedAI #EdgeAI
Ajay Sridhar retweeted
Perry Dong @perryadong ·
Reinforcement learning doesn't scale like supervised learning—yet. We introduce Transformer Q-Learning (TQL): a method that unlocks scaling of transformer-based value functions in RL. We show that value-based RL can also achieve performance gains through scale. (1/7)
Ajay Sridhar retweeted
Jie Wang @JieWang_ZJUI ·
VLAs nowadays enable robotic manipulation to perform impressive tasks like folding clothes, making coffee, and cleaning dishes. However, surprisingly, most VLAs lack memory. Unlike their close relatives LLMs, VLAs have no context window and no access to history. This causes them to repeatedly fail in the same way without learning from online experience. But why? Why not simply extend the context window like LLMs? It's not that we don't want to -- it's because it's extremely difficult. Here, I share a talk by @chelseabfinn at NeurIPS that scopes the challenges in developing long-horizon autonomy for embodied agents. At the end, there's a reading list on memory for robotics. ⭐
Ajay Sridhar retweeted
Suvir Mirchandani @suvir_m ·
Data collection remains a bottleneck in imitation learning for robotics: it’s tedious & often needs access to a robot. Can we make the data collection process more accessible and engaging? We introduce RoboCade, a platform for gamifying remote robot data collection 🎮🤖 (1/6)
Ajay Sridhar retweeted
Paul Zhou @zhiyuan_zhou_ ·
Do you ever find that finetuning a VLA overfits to the target task, to the point where generalist ability is lost and even minor deviations beyond the SFT data break the policy? We found an extremely simple solution: directly merge the base and finetuned policies in weight space 🤯 👇🧵
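The tweet's exact merging recipe isn't given here; a minimal sketch, assuming plain linear interpolation of matching parameters in weight space (the `merge_policies` name and the `alpha` knob are illustrative, not the authors' code):

```python
def merge_policies(base_state, finetuned_state, alpha=0.5):
    # Linear interpolation of matching parameters in weight space:
    # alpha = 0 recovers the base policy, alpha = 1 the finetuned one.
    return {
        name: (1.0 - alpha) * base_state[name] + alpha * finetuned_state[name]
        for name in base_state
    }

# Toy example with scalar "weights"; real checkpoints hold tensors,
# but the arithmetic is identical per parameter.
base = {"w": 0.0, "b": 2.0}
finetuned = {"w": 1.0, "b": 4.0}
merged = merge_policies(base, finetuned, alpha=0.5)
```

Intermediate `alpha` values would trade off generalist ability against target-task specialization.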
Ajay Sridhar @ajaysridhar0 ·
Had a great time chatting with @micoolcho and @chris_j_paxton about memory for robot policies!
RoboPapers @RoboPapers

Most robot policies today still largely lack memory: they make all their decisions based on what they can see right now. MemER aims to change that by learning which frames are important; this lets it deal with tasks like object search. @ajaysridhar0, @jenpan_, and @satviks107Sharma tell us about how to achieve this fundamental capability for long-horizon task execution. Watch Episode #54 of RoboPapers with @micoolcho and @chris_j_paxton to learn more!

Ajay Sridhar retweeted
Jenn Grannen @jenngrannen ·
Meet Scanford 📚🤖: a robot that improves foundation models by doing useful work in the wild. Deployed for 2 weeks in the Stanford East Asia Library, Scanford scans books, helps librarians, and continually improves the VLM it relies on. 🔗 scanford-robot.github.io 🧵1/8
Ajay Sridhar retweeted
Dhruv Shah @shahdhruv_ ·
My group @Princeton is hiring! We are looking for strong postdoc and PhD candidates to join our quest for intelligent robots in open-world environments. Read more below and get in touch 🤖🐅🧡 prism.robo.princeton.edu
Ajay Sridhar retweeted
Mateo Guaman Castro @mateoguaman ·
How can we create a single navigation policy that works for different robots in diverse environments AND can reach navigation goals with high precision? Happy to share our new paper, "VAMOS: A Hierarchical Vision-Language-Action Model for Capability-Modulated and Steerable Navigation"! 📜 Paper: arxiv.org/abs/2510.20818 🌐 Website: vamos-vla.github.io
Ajay Sridhar @ajaysridhar0 ·
@yoyu0203 Good point - the HLP is finetuned to predict the labeled indices of the keyframes from the context. We use a subset of subtask transition frames as labels. We only pick useful ones (e.g., last frame of "look inside bin") and skip others (no frames from "reset scooper").
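The labeling scheme in this reply can be sketched in a few lines: keep only the final frame of subtasks worth remembering, skip the rest. The `keyframe_labels` helper, the frame indices, and the usefulness filter are illustrative assumptions, not the authors' code:

```python
def keyframe_labels(subtask_segments, useful):
    # subtask_segments: list of (subtask_name, start_frame, end_frame) tuples
    # from the demonstration's subtask annotations.
    # useful: names of subtasks whose final frame is worth remembering.
    # Returns the frame indices the high-level policy is trained to predict.
    return [end for name, start, end in subtask_segments if name in useful]

segments = [
    ("look inside bin", 0, 14),
    ("reset scooper", 15, 29),   # skipped: not useful to remember
    ("look inside bin", 30, 44),
]
labels = keyframe_labels(segments, useful={"look inside bin"})
```

Framing keyframe selection as predicting these labeled indices turns it into an ordinary supervised objective, which sidesteps the differentiability concern raised in the question.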
Eason @yoyu0203 ·
@ajaysridhar0 Can you explain more about how you train the high-level policy to predict keyframes? Because it feels like predicting keyframes isn't differentiable.
Ajay Sridhar @ajaysridhar0 ·
What about just using a massive proprietary VLM like GPT-5 as the high-level policy? 1. Latency: At 10-15 seconds, they are far too slow for real-time robot control. 2. Accuracy: Even in an offline test, they were significantly less accurate than our finetuned model. (7/8)
Ajay Sridhar retweeted
Jenny Pan @jenpan_ ·
Robots need memory to handle complex, multi-step tasks. Can we design an effective method for this? We propose MemER, a hierarchical VLA policy that learns what visual frames to remember across multiple long-horizon tasks, enabling memory-aware manipulation. (1/5)