
Codex in the ChatGPT mobile app!
Yen-Jen Wang
430 posts

@wangyenjen
Ph.D. Student @Berkeley_AI



some news: I’ve joined OpenAI. After wrapping up my PhD in Robotics, I’m excited to keep working toward AGI in the physical world. exciting journey ahead :)



Zero teleoperation. Zero real-world data. ➔ Autonomous humanoid loco-manipulation in reality.
Introducing VIRAL: Visual Sim-to-Real at Scale.
We achieved 54 autonomous cycles (walk, stand, place, pick, turn) using a simple recipe:
1. RL
2. Simulation
3. GPUs
Website: viral-humanoid.github.io
Arxiv: arxiv.org/abs/2511.15200
Deep dive with me: 🧵


Humanoid robots have been prisoners of the lab. We set them free — with human data. We present EgoHumanoid: The first endorsement of human-to-humanoid transfer for whole-body loco-manipulation. 🔗 Home: opendrivelab.com/EgoHumanoid 📑 Arxiv: arxiv.org/abs/2602.10106 🧵👇





🚀 We are at #ICLR2026! 🐢❤️💛🖤 We are thrilled to present our latest research spanning Physical AI, AI Safety, Omnimodal Generation, and more. If you're attending, come say hi and check out our work! Here is our full schedule for the week:

📅 Thursday, April 23 (TODAY)
Poster: MomaGraph: State-Aware Unified Scene Graphs with Vision-Language Models for Embodied Task Planning
📍 Pavilion 3 P3-#1313 | ⏰ 3:15 p.m. – 5:45 p.m.
Poster: PropensityBench: Evaluating Latent Safety Risks in Large Language Models via an Agentic Approach
📍 Pavilion 4 P4-#3910 | ⏰ 3:15 p.m. – 5:45 p.m.

📅 Friday, April 24
Oral Presentation: MomaGraph: State-Aware Unified Scene Graphs with Vision-Language Models for Embodied Task Planning
📍 203 A/B (Oral Session 3D Vision language models II) | ⏰ 10:30 a.m. – 10:40 a.m.
Poster: Zebra-CoT: A Dataset for Interleaved Vision-Language Reasoning
📍 Pavilion 3 P3-#1806 | ⏰ 10:30 a.m. – 1:00 p.m.
Poster: ROVER: Benchmarking Reciprocal Cross-Modal Reasoning for Omnimodal Generation
📍 Pavilion 4 P4-#3016 | ⏰ 3:15 p.m. – 5:45 p.m.

📅 Saturday, April 25
Poster: TrustGen: A Platform of Dynamic Benchmarking on the Trustworthiness of Generative Foundation Models
📍 Pavilion 3 P3-#1822 | ⏰ 3:15 p.m. – 5:45 p.m.

📅 Sunday, April 26 (Workshops - 9:00 a.m. – 5:00 p.m.)
🎤 Invited Speaker/Panel (AFAA Workshop): Rethinking Test-Time Compute: From Token-Level Rewards to Robust Generative Agents (Room 211)
Paper (MM Intelligence Workshop): Towards Mitigating Hallucinations in Large Vision-Language Models by Refining Textual Embeddings (Room 204C)
Paper (AIMS Workshop): Advancing Regulation in Artificial Intelligence: An Auction-Based Approach (Room 210)
Paper (AFAA Workshop): OC-PRM: Overcredit-Contrastive Training for Precision-First Process Reward Models (Room 211)

📅 Monday, April 27 (Workshops - 9:00 a.m. – 5:00 p.m.)
Paper (FM4Science Workshop): SciPredict: Can LLMs Predict the Outcomes of Scientific Experiments in Natural Sciences? (Room 101B)
Paper (ICBINB Workshop): The Low-Frequency Trap: Why Scaling Doesn't Solve Simple Temporal Counting (Room 201C)

🔗 Learn more about our lab and papers here: furong-huang.com/
See you in Rio! 🇧🇷👋
#MachineLearning #AI #DeepLearning #UMD




🤖 For embodied agents in household environments, we tackle two fundamental questions:
1️⃣ What is the optimal scene representation?
2️⃣ Can a VLM leveraging this representation actually improve spatial understanding and task planning?

Introducing MomaGraph: State-Aware Unified Scene Graphs with Vision-Language Models for Embodied Task Planning.
👉: hybridrobotics.github.io/MomaGraph/
🔗: arxiv.org/abs/2512.16909

Key Ideas: MomaGraph jointly models spatial AND functional relationships with part-level interactive nodes. MomaGraph is designed to be:
✅ Task-Relevant: Filters visual noise to keep only what matters for the instruction.
✅ Dynamic & State-Aware: MomaGraph adapts. 🔄 It explicitly models object states and dynamic changes in the environment.

We built MomaGraph to bridge the gap between the Spatial VLM and Robotics communities. 🌉 Our hope is that this work serves as a foundation for the next generation of intelligent, adaptive embodied agents. 🦾✨
Questions and feedback welcome. 🚀
#Robotics #EmbodiedAI #CV #LLM #SceneGraph



SONIC is now open-source! Generalist whole-body teleoperation for EVERYONE! Our team has long been building comprehensive pipelines for whole-body control, kinematic planner, and teleoperation, and they will all be shared. This will be a continuous update; inference code + model already there, training code and gr00t integration coming soon!
Code: github.com/NVlabs/GR00T-W…
Docs: nvlabs.github.io/GR00T-WholeBod…
Site: nvlabs.github.io/GEAR-SONIC/




