Philip Schroeder

30 posts

@_pschro

PhD student at MIT. @MIT_CSAIL @MITEECS @nlp_mit

Joined November 2025
316 Following · 244 Followers
Pinned Tweet
Philip Schroeder @_pschro
Excited to share our NeurIPS 2025 paper introducing our video reasoning framework, ROVER (Reasoning Over VidEo Recursively), which improves the visual understanding of VLMs in embodied settings.

ROVER is a recursive framework that lets the model maintain a compact attention window at each timestep of the video without losing global context across the full video. It works by decomposing the video into segments corresponding to each subtask within the full task trajectory, then generating a separate line of reasoning for each subtask instead of attempting to reason across the full trajectory at once.

We evaluate on simulated and real-world robotic manipulation tasks from RoboCasa and Open X-Embodiment. Overall, ROVER significantly improves the ability of VLMs to reason about what is happening at each moment during a robot task attempt. rover-vlm.github.io
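To make the recursion concrete, here is a minimal sketch of the decomposition loop described above, assuming a generic VLM client; `vlm_call`, the prompts, and the (subtask, start, end) segmentation format are hypothetical stand-ins, not ROVER's actual interface.

```python
# Minimal sketch of ROVER-style recursive video reasoning.
# `vlm_call` and the prompt/segmentation formats are assumptions.

def vlm_call(prompt, frames):
    """Placeholder for a vision-language model query; swap in a real client."""
    raise NotImplementedError

def segment_by_subtask(frames, task):
    """Ask the VLM for (subtask, start, end) triples covering the trajectory."""
    boundaries = vlm_call(
        prompt=f"List (subtask, start_frame, end_frame) for the task: {task}",
        frames=frames,
    )
    return [(name, frames[start:end]) for name, start, end in boundaries]

def reason_over_video(frames, task, context=""):
    """One compact attention window per subtask, recursively."""
    segments = segment_by_subtask(frames, task)
    if len(segments) <= 1:
        # Base case: the segment is atomic; reason over its frames directly.
        return vlm_call(prompt=f"{context}\nDescribe progress on: {task}",
                        frames=frames)
    summaries = []
    for subtask, seg_frames in segments:
        # Each subtask gets its own line of reasoning; earlier subtask
        # summaries stand in for global context across the full video.
        summaries.append(subtask + ": " + reason_over_video(
            seg_frames, subtask, context="\n".join(summaries)))
    return "\n".join(summaries)
```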
Philip Schroeder retweeted
Dominique Paul @DominiqueCAPaul
The @huggingface team just published an incredible post on fine-tuning π0 / π0.5 for shirt folding. Key finding: algorithmic tweaks gave +5–20%, while training only on the top 20% of data gave +50%. They document 1,900 engineering hours, created intuitive method visualisations, and, best of all, included a section on what didn't work (you won't find that in an academic paper). Recommendations (the data-filtering idea is sketched below):
→ Data quality > quantity
→ DAgger-style collection
→ Relative joint positions
→ Action interpolation + RTC
→ RABC during training
Highly recommend reading the full post. @LeRobotHF
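As a rough illustration of the "data quality > quantity" point, here is one plausible way to keep only the top 20% of demonstrations; the episode format and scoring rule are assumptions for illustration, not what the post actually used.

```python
# Hypothetical sketch of "train on the top 20% of data": score each episode
# and keep only the best-scoring fifth before fine-tuning.
import numpy as np

def filter_top_fraction(episodes, score_fn, keep_frac=0.2):
    """Keep the highest-scoring fraction of demonstration episodes."""
    scores = np.array([score_fn(ep) for ep in episodes])
    cutoff = np.quantile(scores, 1.0 - keep_frac)
    return [ep for ep, s in zip(episodes, scores) if s >= cutoff]

# Example scorer (an assumption): shorter successful episodes are often
# cleaner teleoperation data than long, hesitant ones.
def score_fn(ep):
    return -len(ep["actions"]) if ep["success"] else -np.inf
```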
Philip Schroeder retweeted
Jack Vial @jackvial89
pi0.5 with RTC. I trained the expert + the last 3 layers of the backbone on an RTX 4070 Ti Super. Going to use this as the base model for RECAP. Next up is collecting a first round of rollouts so I have data to train advantage conditioning.
Philip Schroeder retweeted
pfung @philfung
Last week, @positronic_ro released a cool robot study gauging how well the latest open models perform on real-life manufacturing tasks. They compared a human, a teleoperated arm, and the models pi0.5, GR00T-1.6, ACT, and HF-SmolVLA on simple pick-and-place bin tasks (batteries, towels, spoons, etc.). The models were fine-tuned using BC on ~350 demos. KEY CONCLUSIONS:
1. Not usable in real life: Open models were essentially unusable in real-life settings, even for simple pick/place tasks. The mean time to failure/assist was only 4 minutes, and the models worked at <5% the speed of a person.
2. Teleop works but is slow: Teleoperation succeeded as often as a human (i.e., 100%), but at 25% the speed.
3. Pi ≈ GR00T: pi0.5 and GR00T-1.6 performed similarly, slightly outperforming vanilla ACT.
Visit the website to see every success and failure; it's really insightful! phail.ai Thanks, folks at @positronic_ro, for releasing such a great study for all our benefit.
Philip Schroeder retweeted
Patrick Yin @patrickhyin
We’re releasing OmniReset, a framework for training robot policies using large-scale RL and diverse resets for contact-rich, dexterous manipulation. OmniReset pushes the frontier of robustness and dexterity, without any reward engineering or demonstrations. Try the policies yourself in our interactive simulator! weirdlabuw.github.io/omnireset/ (1/N 🧵)
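For context, here is what a "diverse resets" training loop might look like in its simplest form; the reset fields and the `env.reset(state=...)` interface are assumptions for illustration, not the OmniReset API.

```python
# Illustration of "diverse resets": each RL episode starts from a freshly
# randomized initial state, broadening the states the policy must recover from.
import numpy as np

def sample_reset(rng):
    # Hypothetical reset distribution for a dexterous manipulation scene.
    return {
        "object_pos": rng.uniform(-0.3, 0.3, size=3),
        "object_yaw": rng.uniform(-np.pi, np.pi),
        "hand_joints": rng.uniform(-0.5, 0.5, size=16),
    }

def collect_episode(env, policy, rng):
    obs = env.reset(state=sample_reset(rng))  # new diverse reset every episode
    done, transitions = False, []
    while not done:
        action = policy(obs)
        next_obs, reward, done, info = env.step(action)
        transitions.append((obs, action, reward, next_obs, done))
        obs = next_obs
    return transitions
```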
Philip Schroeder retweeted
Physical Intelligence @physical_int
We developed an RL method for fine-tuning our models for precise tasks in just a few hours or even minutes. Instead of training the whole model, we add an “RL token” output to π-0.6, our latest model, which is used by a tiny actor and critic to learn quickly with RL.
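A minimal sketch of the idea as described: the large model stays frozen and exposes one extra token embedding, and only a tiny actor and critic read it. All names and dimensions here are assumptions; this is not the π-0.6 architecture.

```python
# Sketch (assumed interfaces): the frozen VLA emits one extra "RL token"
# embedding per step; only this tiny actor-critic pair is trained with RL,
# which is why fine-tuning can be fast.
import torch
import torch.nn as nn

class TinyActorCritic(nn.Module):
    def __init__(self, token_dim=1024, action_dim=7, hidden=256):
        super().__init__()
        self.actor = nn.Sequential(
            nn.Linear(token_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim))
        self.critic = nn.Sequential(
            nn.Linear(token_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, rl_token):
        # rl_token: (batch, token_dim) embedding from the frozen base model.
        return self.actor(rl_token), self.critic(rl_token)
```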
Philip Schroeder retweeted
Eric Rosen @_ericrosen
What’s the easiest way to improve your pretrained diffusion policy? Swap your Gaussian with a single noise vector that maximizes your downstream reward function! (A sketch follows the quoted tweet below.)
✅ Keeps original policy weights frozen!
✅ No training new neural networks!
✅ No RL infrastructure needed!
Omkar Patil@op45_indian

🚨New paper alert 🚨 from @rai_inst! arxiv.org/abs/2603.15757 🤖Your robot policy is actually better than you think! We find that for a given policy, ALWAYS denoising a single noise vector, which we call a ✨Golden Ticket✨, leads to consistent performance improvements! 🧵...

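Here is a minimal sketch of what such a search could look like, assuming a frozen diffusion policy with a `denoise(obs, z)` method and an `evaluate` function that scores an acting function by rollout reward; both interfaces are hypothetical, not the paper's code.

```python
# Sketch of a "Golden Ticket" search: keep the diffusion policy frozen, try
# many initial noise vectors, and keep the single one whose rollouts score
# best. No weights are updated anywhere.
import numpy as np

def find_golden_ticket(policy, evaluate, noise_dim, n_candidates=64, seed=0):
    rng = np.random.default_rng(seed)
    candidates = rng.standard_normal((n_candidates, noise_dim))
    # Score each candidate by always denoising from that one fixed vector.
    rewards = [evaluate(lambda obs, z=z: policy.denoise(obs, z))
               for z in candidates]
    return candidates[int(np.argmax(rewards))]  # reuse this vector at test time
```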
Philip Schroeder retweeted
Omkar Patil @op45_indian
🚨New paper alert 🚨 from @rai_inst! arxiv.org/abs/2603.15757 🤖Your robot policy is actually better than you think! We find that for a given policy, ALWAYS denoising a single noise vector, which we call a ✨Golden Ticket✨, leads to consistent performance improvements! 🧵...
Philip Schroeder retweeted
Lakshita Dodeja @lakshitadodeja
Residual RL is a powerful strategy for adapting a pretrained base policy, but it struggles when:
❌ Exploration is uncontrolled
❌ Base policies are stochastic
We tackle this in our new RAL paper “Accelerating Residual Reinforcement Learning with Uncertainty Estimation” (1/5)
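For readers new to the setup, this is the generic residual-RL action rule; the uncertainty gate shown is an assumption for illustration only, not the mechanism proposed in the paper.

```python
# Generic residual RL: a learned residual corrects a frozen base policy's
# action. The uncertainty gate below is an illustrative assumption.
import numpy as np

def residual_action(base_policy, residual_policy, uncertainty, obs):
    a_base = base_policy(obs)
    # Plausible use of uncertainty: permit larger corrections (and hence
    # more exploration) only where the base policy is uncertain.
    gate = float(np.clip(uncertainty(obs), 0.0, 1.0))
    return np.clip(a_base + gate * residual_policy(obs, a_base), -1.0, 1.0)
```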
Philip Schroeder retweeted
0x796F @0x796F
You can now train @physical_int style robots in 1 day for only $5k. Anvil’s devkits have all the hardware, software, controls, cameras, and more, ready to go. (1/5)
Philip Schroeder retweeted
Jiaheng Hu @JiahengHu1
VLA models are capable generalists. But can they continually self-improve? Such Continual Reinforcement Learning (CRL) problems are traditionally considered very challenging. Surprisingly, we found that with the right setup, the simplest CRL recipe can work really well! arxiv.org/abs/2603.11653
Philip Schroeder retweeted
Seungwook Han @seungwookh
Can language models learn useful priors without ever seeing language? We pre-pre-train transformers on neural cellular automata — fully synthetic, zero language. This improves language modeling by up to 6%, speeds up convergence by 40%, and strengthens downstream reasoning. Surprisingly, it even beats pre-pre-training on natural text! Blog: hanseungwook.github.io/blog/nca-pre-p… (1/n)
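As a toy stand-in for the idea, the snippet below generates language-free token streams from an elementary 1-D cellular automaton; the blog post uses neural cellular automata, so treat this purely as an illustration of "fully synthetic, zero language" pretraining data.

```python
# Toy generator of synthetic, language-free pretraining tokens from an
# elementary 1-D cellular automaton (a stand-in for the post's neural CA).
import numpy as np

def ca_token_stream(rule=110, width=64, steps=32, seed=0):
    """Flat 0/1 token stream produced by evolving a random initial row."""
    rng = np.random.default_rng(seed)
    table = np.array([(rule >> i) & 1 for i in range(8)])
    state = rng.integers(0, 2, width)
    rows = []
    for _ in range(steps):
        rows.append(state.copy())
        # Encode each cell's (left, center, right) neighborhood as 0..7.
        neighborhoods = (np.roll(state, 1) << 2) | (state << 1) | np.roll(state, -1)
        state = table[neighborhoods]
    return np.concatenate(rows)  # feed this to the transformer as tokens
```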
Philip Schroeder retweeted
Anthony Liang @aliangdw
Just released RBM-1M-OOD! We collected and annotated 1k+ robot trajectories of varying expertise across 4 universities for evaluating reward models. 📂Dataset: huggingface.co/datasets/robom… Also give Robometer a try on your own robot trajectories here: huggingface.co/spaces/robomet…
Anthony Liang@aliangdw

Super excited to share Robometer, a reward model that works zero-shot across robots, tasks, and scenes! Try fine-tuning Robometer on your own dataset! 🌐Project website: robometer.github.io 💻Code: github.com/robometer/robo…

Philip Schroeder retweeted
Nishanth Kumar @nishanthkumar23
State-of-the-art robot policies often need hundreds of hours of data. What if we needed none? Introducing TiPToP: a manipulation system that zero-shots open-world tasks from pixels and language using vision foundation models and GPU-parallelized Task and Motion Planning (TAMP).
Philip Schroeder retweeted
Adam Hung @Adamjhung
Learning from human videos often requires restrictive, carefully choreographed human motions. We propose ✨3PoinTr✨: a scalable way to pretrain from casual human videos. It bridges the embodiment gap by learning 3D scene evolution, enabling learning from natural human motions.
Philip Schroeder retweeted
Yinpei Dai @YinpeiD
Robot memory methods are growing fast, but systematic evaluation is largely lacking. 📉 Introducing RoboMME: a new benchmark for memory-augmented robotic manipulation! 🤖🧠 Featuring 16 tasks across temporal, spatial, object, and procedural memory 🔗 robomme.github.io
Philip Schroeder retweeted
Itamar Pres @PresItamar
New paper: It's time to optimize for 🔁self-consistency 🔁 We’ve pushed LLMs to the limits of available data, yet failures like sycophancy and factual inconsistency persist. We argue these stem from the same assumption: that behavior can be specified one I/O pair at a time. 🧵
Philip Schroeder retweeted
pfung @philfung
added the excellent Robometer algorithm!! Easily compare your fav robot reward functions (Robometer, TOPReward, GVL, etc) in one website: philfung.github.io/rewardscope Thanks @aliangdw, @yigitkkorkmaz, @Jesse_Y_Zhang, @JiahuiZhang__32 for amazing Robometer paper!
pfung@philfung

Inspired by the TopReward paper, I made a lil web tool to test these robot manipulation rewards on your own videos. Try: philfung.github.io/rewardscope Record yourself folding a towel, upload it, and compare:
1. TopReward (this paper)
2. GVL (DeepMind)
3. Brute Force (i.e., at each frame, ask an LLM to reply with a probability)
TopReward (Qwen3VL-8B) holds its own surprisingly well against the others, even if those use ChatGPT! Great work @DJiafei, UW, AllenAI, thanks for pushing @VilleKuosmanen.

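The "Brute Force" baseline above is simple enough to sketch directly: query a VLM once per frame for a scalar probability. `vlm_prob` is a hypothetical wrapper around whatever vision-language API is available, and the prompt is illustrative.

```python
# Sketch of the "Brute Force" reward baseline: one VLM query per frame, each
# returning a scalar task-completion probability in [0, 1].
def brute_force_reward(frames, task, vlm_prob):
    rewards = []
    for frame in frames:
        answer = vlm_prob(
            image=frame,
            prompt=(f"On a scale of 0 to 1, how complete is the task "
                    f"'{task}' in this image? Reply with only a number."),
        )
        rewards.append(float(answer))
    return rewards
```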
Philip Schroeder retweeted
Shivin Dass @ShivinDass
Tried out Robometer on some of the old demos from TeleMoMa and it's pretty neat! The progress tracks are quite accurate given that these videos are pretty random. Also, their visualizer is a lot of fun to play with: robometer-rewardeval-ui.hf.space
Jesse Zhang@Jesse_Y_Zhang

A reward model that works, zero-shot, across robots, tasks, and scenes? Introducing Robometer: Scaling general-purpose robotic reward models with 1M+ trajectories. Enables zero-shot: online/offline/model-based RL, data retrieval + IL, automatic failure detection, and more! 🧵 (1/12)

Philip Schroeder retweeted
Marius Memmel @memmelma
There’s a discussion going on rn about two recent robotic reward models: TOPReward⛰️ and Robometer🌡️ Which one is better? It depends entirely on your objective! Here is a deep dive into the conceptual differences, strengths, and weaknesses of both. 🧵👇