Christopher Agia
@agiachris
94 posts
PhD candidate in Computer Science @Stanford. My interests span learning for robotic planning, control, vision systems, and their interfacing representations.

Stanford, California, USA · Joined November 2021
52 Following · 616 Followers
Pinned Tweet
Christopher Agia@agiachris·
What makes data “good” for robot learning? We argue: it’s the data that drives closed-loop policy success! Introducing CUPID 💘, a method that curates demonstrations not by "quality" or appearance, but by how they influence policy behavior, using influence functions. (1/6)
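The pinned thread's core idea — scoring each training demonstration by its estimated effect on closed-loop policy success — can be sketched with a generic gradient-based influence approximation (in the spirit of TracIn). This is an illustrative simplification, not CUPID's actual estimator; the function name, toy dimensions, and curation rule below are hypothetical.

```python
import numpy as np

def influence_scores(train_grads, success_grad):
    """Approximate each demo's influence on a target outcome as the dot
    product between its training-loss gradient and the gradient of a
    closed-loop success objective, both w.r.t. policy parameters.

    train_grads: (N, P) per-demonstration loss gradients
    success_grad: (P,) gradient of the success objective
    Returns: (N,) influence score per demonstration.
    """
    return train_grads @ success_grad

# Toy example: 5 demos, an 8-parameter "policy".
rng = np.random.default_rng(0)
train_grads = rng.normal(size=(5, 8))
success_grad = rng.normal(size=8)

scores = influence_scores(train_grads, success_grad)
# Curate by keeping the demos estimated to help closed-loop success most.
keep = np.argsort(scores)[::-1][:3]
print("keep demos:", keep)
```

Demos with large positive scores are the ones whose gradients align with improving the success objective; low or negative scores flag demos to down-weight or drop — the curation signal the thread describes, independent of surface-level "quality".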
Christopher Agia reposted
Andrej Karpathy@karpathy·
Judging by my tl there is a growing gap in understanding of AI capability. The first issue I think is around recency and tier of use. I think a lot of people tried the free tier of ChatGPT somewhere last year and allowed it to inform their views on AI a little too much. This is a group of reactions laughing at various quirks of the models, hallucinations, etc. Yes I also saw the viral videos of OpenAI's Advanced Voice mode fumbling simple queries like "should I drive or walk to the carwash". The thing is that these free and old/deprecated models don't reflect the capability in the latest round of state of the art agentic models of this year, especially OpenAI Codex and Claude Code. But that brings me to the second issue. Even if people paid $200/month to use the state of the art models, a lot of the capabilities are relatively "peaky" in highly technical areas. Typical queries around search, writing, advice, etc. are *not* the domain that has made the most noticeable and dramatic strides in capability. Partly, this is due to the technical details of reinforcement learning and its use of verifiable rewards. But partly, it's also because these use cases are not sufficiently prioritized by the companies in their hillclimbing because they don't lead to as much $$$ value. The goldmines are elsewhere, and the focus comes along. So that brings me to the second group of people, who *both* 1) pay for and use the state of the art frontier agentic models (OpenAI Codex / Claude Code) and 2) do so professionally in technical domains like programming, math and research. This group of people is subject to the highest amount of "AI Psychosis" because the recent improvements in these domains as of this year have been nothing short of staggering. When you hand a computer terminal to one of these models, you can now watch them melt programming problems that you'd normally expect to take days/weeks of work. 
It's this second group of people that assigns a much greater gravity to the capabilities, their slope, and various cyber-related repercussions. TLDR the people in these two groups are speaking past each other. It really is simultaneously the case that OpenAI's free and I think slightly orphaned (?) "Advanced Voice Mode" will fumble the dumbest questions in your Instagram's reels and *at the same time*, OpenAI's highest-tier and paid Codex model will go off for 1 hour to coherently restructure an entire code base, or find and exploit vulnerabilities in computer systems. This part really works and has made dramatic strides because of 2 properties: 1) these domains offer explicit reward functions that are verifiable, meaning they are easily amenable to reinforcement learning training (e.g. unit tests passed yes or no, in contrast to writing, which is much harder to explicitly judge), but also 2) they are a lot more valuable in b2b settings, meaning that the biggest fraction of the team is focused on improving them. So here we are.
staysaasy@staysaasy

The degree to which you are awed by AI is perfectly correlated with how much you use AI to code.

Christopher Agia reposted
Marco Pavone@drmapavone·
A central challenge in #physical #AI is data scarcity: vision-language-action (#VLA) models are fundamentally limited by the availability of high-quality robotics demonstrations.

In our recent work, we introduce R&B-EnCoRe (arxiv.org/pdf/2602.08167), a framework that enables models to self-bootstrap embodied #reasoning by leveraging synthetic visuo-textual data together with limited embodiment-specific experience. In essence, R&B-EnCoRe allows models to learn how to reason in an embodied setting.

Our approach treats reasoning as a latent variable and uses self-supervised refinement to learn reasoning strategies that are directly predictive of successful control—without human annotations, reward engineering, or external verifiers.

We validate the approach across a range of embodiments—including manipulation, navigation, and autonomous driving—and across model scales from 1B to 30B parameters, observing consistent improvements:
💪 +28% task success in real-world manipulation
🦿 +101% score in legged locomotion navigation
🚗 −21% collision rate in autonomous driving

Overall, this work highlights a promising direction: aligning internet-scale priors with embodiment-specific data to enable scalable, self-improving physical intelligence.

Kudos to an amazing team: Milan Ganai, Katie Luo, @JonasFrey96, Clark Barrett
🌐 Website: milanganai.github.io/rnb-encore/
📄 Paper: arxiv.org/pdf/2602.08167
Christopher Agia@agiachris·
@Majumdar_Ani Great points! Both also seem complementary – one modeling a policy, another modeling dynamics, implying WAM + AC-WM might be the new VLA + AC-WM.
Christopher Agia reposted
Marco Pavone@drmapavone·
What does it take to build autonomous vehicles that can reason about the world they drive in?

Tomorrow at #NVIDIAGTC, Patrick Liu and I will take a deep dive into the #Alpamayo #reasoning model family—a family of reasoning-based vision–language–action (#VLA) models that form a core component of the Alpamayo open platform (huggingface.co/blog/drmapavon…).

We’ll cover three main topics:
- How reasoning-based VLA models like Alpamayo 1 are designed and built
- What it takes to bring Alpamayo 1 to production, including some of our latest results
- Several exciting announcements about the expansion of the Alpamayo open platform

If you're working on autonomous driving, robotics, or foundation models for physical AI, this session will offer a look at where the field is heading.

Session details:
📅 Monday, Mar 16 | 3:00 PM PDT
📍 #NVIDIAGTC 2026
🔗 nvda.ws/4rze5oj

Looking forward to seeing many of you there. @NVIDIADRIVE @NVIDIAAI
Christopher Agia reposted
Haruki Nishimura@imp_aa·
Are you about to evaluate robot policies for your next paper, comparing your policy with baselines? Take a moment to review this article by @MashaItkina and myself, introducing practical tips on rigorous statistical analysis with easy-to-use Python tools: medium.com/toyotaresearch…
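The rigorous-evaluation advice above — don't just compare raw success rates across a handful of trials — can be illustrated with a standard percentile-bootstrap confidence interval on the difference between two policies' success rates. This is a generic sketch, not necessarily the specific procedure or tooling the linked article recommends; the policy outcomes below are made up.

```python
import numpy as np

def bootstrap_diff_ci(a, b, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI on mean(a) - mean(b).

    a, b: arrays of per-episode binary outcomes (1 = success) for two
    policies evaluated on the same task distribution.
    Returns (lo, hi), the (1 - alpha) confidence interval.
    """
    rng = np.random.default_rng(seed)
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        # Resample each policy's episodes with replacement.
        diffs[i] = rng.choice(a, a.size).mean() - rng.choice(b, b.size).mean()
    lo, hi = np.quantile(diffs, [alpha / 2, 1 - alpha / 2])
    return lo, hi

# Hypothetical results: 50 trials per policy.
ours = np.array([1] * 42 + [0] * 8)    # 84% success
base = np.array([1] * 33 + [0] * 17)   # 66% success
lo, hi = bootstrap_diff_ci(ours, base)
print(f"95% CI on success-rate difference: [{lo:.2f}, {hi:.2f}]")
```

If the interval excludes zero, the improvement over the baseline is unlikely to be an artifact of the finite trial count; with only a few trials per policy, the interval will typically be too wide to support any claim.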
Christopher Agia reposted
Jeannette Bohg@leto__jean·
Excited to share our new work on zero-shot dexterous tool use. We train ONE general policy, yet at test time the robot generalizes to brand-new tools and tasks it never saw during training. Also note: All videos are 1x.
Kushal@kushalk_

🤖 Can a single robot policy manipulate diverse tools without ever seeing them before? Introducing SimToolReal 🔨 : a generalist dexterous manipulation policy that transfers zero-shot sim→real to unseen tools + unseen tasks All videos are 1x speed (60 Hz control) 🧵👇

Christopher Agia reposted
Marco Pavone@drmapavone·
🚀 Strengthening Robot Safety with Multimodal Defenses

I’m excited to share our recent work, “Preventing Robotic Jailbreaking via Multimodal Domain Adaptation,” now available on arXiv: arxiv.org/pdf/2509.23281

As vision-language models (VLMs) become foundational components of modern robot autonomy, VLM-enabled robots also become increasingly vulnerable to jailbreaking attacks—adversarial prompts that can bypass safety filters and trigger unsafe or harmful behaviors in real-world robotic systems. This poses a significant challenge for the safe deployment of AI in autonomous vehicles, maritime robots, quadrupeds, and other embodied platforms.

📌 In this work, we introduce J-DAPT, a lightweight framework for robust multimodal jailbreak detection that delivers near-perfect detection performance across multiple robotic domains with minimal overhead. Our results demonstrate that it is indeed possible to effectively enhance safety defenses for vision-language models in robotics—an important step toward trustworthy and reliable autonomous systems.

📄 Read the full paper: arxiv.org/pdf/2509.23281

A great collaboration with the research groups of George Pappas and Mauro Conti. #Robotics #AI #Safety #MachineLearning #MultimodalAI
Christopher Agia reposted
Marco Pavone@drmapavone·
🚗 Imitation learning is everywhere—but is it enough?

So far, imitation learning—most commonly via behavior cloning (BC)—remains the go-to approach for training real-world autonomous vehicle (AV) driving policies. Yet BC operates in an open-loop (OL) fashion, overlooking the critical interdependence among inputs, outputs, and future states that comes with closed-loop (CL) operation. The result? The notorious—but often overlooked—OL–CL gap ⚠️

To address this challenge and encourage broader adoption of CL techniques, we’ve just published a survey (research.nvidia.com/publication/20…) presenting a comprehensive taxonomy of closed-loop training methods for end-to-end driving. Our framework organizes approaches along three key axes:
- Action generation
- Environment response generation
- Training objectives

💡 Bottom line: enabling technologies—like neural rendering, generative world models, and scalable RL—have now matured, making closed-loop AV training ready for wide-scale adoption.

We’d love to hear your thoughts—drop a comment and join the discussion! 💬

And as a reminder, we are hiring for full-time research scientist and research engineer positions:
🔹 [Sr.] Research Scientist: nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAEx…
🔹 [Sr.] Research Engineer: nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAEx…
@NVIDIADRIVE @NVIDIAAI @nvidia
Christopher Agia reposted
Marco Pavone@drmapavone·
Excited to unveil @nvidia's latest work on #Reasoning Vision–Language–Action (#VLA) models — Alpamayo-R1!

Alpamayo-R1 is a new #reasoning VLA architecture featuring a diffusion-based action expert built on top of the #Cosmos-#Reason backbone. It represents one of the core technologies driving NVIDIA’s push toward Level 4 autonomy and robotaxis (nvidianews.nvidia.com/news/nvidia-ub…), as announced by Jensen Huang at #gtc DC last week.

📄 Paper: Alpamayo-R1 research.nvidia.com/publication/20…

We present:
- Architecture & Design: How to transform a VLM into a driving-ready Reasoning VLA
- Chain of Causation Labeling: A new framework enabling reasoning-based learning
- Training Strategy: From internet-scale pre-training → AV-specific SFT → RL-based post-training
- Extensive Evaluation: From closed-loop simulation to real-world, on-vehicle testing

📈 Results: Alpamayo-R1 delivers significant performance gains over end-to-end baselines — especially in rare, safety-critical scenarios — all while maintaining real-time inference (99 ms end-to-end latency).

Coming soon: releases of model variants and reasoning metadata built on top of the Physical AI Dataset (huggingface.co/datasets/nvidi…)—with more updates on the way. Stay tuned!

🙌 Huge thanks to Wenjie Luo and @yan_wang_9 (project co-leads); the @nvidia AV Research team (@iamborisi, @YurongYou, @xinshuoweng, @tianran_, @wenhaoding95, and many others); collaborators across @nvidia Research (@liu_mingyu, @visualyang, @PavloMolchanov, and many others); and the @nvidia AV Product team (Sarah Tariq, Patrick Liu, Jack Huang, and many more). Full contributor list in the Appendix. @NVIDIADRIVE @NVIDIAAI
Christopher Agia@agiachris·
Join us @corl_conf 2025 this Saturday Sep 27th for the workshop on “Making Sense of Data in Robotics!” We have a lineup of exciting talks, posters, and tutorials on all topics in robotics data - from composition to interpretability! #CoRL2025
Joey Hejna@JoeyHejna

It's almost time for #CoRL 2025! A reminder that we're hosting the Data in Robotics workshop this Saturday Sept 27th. We have a packed schedule and are also attempting to livestream the event for those who can't attend in person.

Christopher Agia reposted
Jeannette Bohg@leto__jean·
We present CUPID 💘 at @corl_conf today. Spotlight: 15:30-16:30 Poster: 16:30-18:00, Poster #44 CUPID 💘 lets us trace back policy performance to training demonstrations using influence functions. This helps with curating demonstrations for DPs and VLAs.
Christopher Agia@agiachris

What makes data “good” for robot learning? We argue: it’s the data that drives closed-loop policy success! Introducing CUPID 💘, a method that curates demonstrations not by "quality" or appearance, but by how they influence policy behavior, using influence functions. (1/6)

Christopher Agia@agiachris·
We will be presenting 3 exciting papers at @corl_conf! 🎉
1️⃣ CUPID – Data curation for imitation learning
2️⃣ RoboMonkey – Test-time scaling for VLAs
3️⃣ FORTRESS – Safe reactive planning in OOD scenarios
Come find us at the poster sessions - we’d love to chat! Links in thread 🧵
Christopher Agia@agiachris

What makes data “good” for robot learning? We argue: it’s the data that drives closed-loop policy success! Introducing CUPID 💘, a method that curates demonstrations not by "quality" or appearance, but by how they influence policy behavior, using influence functions. (1/6)

Christopher Agia reposted
Joey Hejna@JoeyHejna·
🚨 We have extended the submission deadline for the workshop until Aug 29th! 🚨 Any completed or in-progress work concerning data composition, curation, and interpretability in robotics is welcome!
Joey Hejna@JoeyHejna

We're hosting the 1st workshop on Making Sense of Data in Robotics at @corl_conf this year! We'll investigate what makes robot learning data "good" by discussing: 🧩 Data Composition 🧹 Data Curation 💡 Data Interpretability Paper submissions are due 8/22/2025! 🧵(1/3)

Christopher Agia@agiachris·
Join the discussion on what makes robot learning data "good" at our @corl_conf workshop 🚀! Topics include data composition, curation, and interpretability. Submit by Aug 22!
Joey Hejna@JoeyHejna

We're hosting the 1st workshop on Making Sense of Data in Robotics at @corl_conf this year! We'll investigate what makes robot learning data "good" by discussing: 🧩 Data Composition 🧹 Data Curation 💡 Data Interpretability Paper submissions are due 8/22/2025! 🧵(1/3)

Christopher Agia reposted
Jeannette Bohg@leto__jean·
Data quality and composition are so important in robot models. Submit your paper on this topic to our #CoRL2025 workshop and discuss everything data with the robot learning community!
Joey Hejna@JoeyHejna

We're hosting the 1st workshop on Making Sense of Data in Robotics at @corl_conf this year! We'll investigate what makes robot learning data "good" by discussing: 🧩 Data Composition 🧹 Data Curation 💡 Data Interpretability Paper submissions are due 8/22/2025! 🧵(1/3)
