Wenhao Yu
@Stacormed
Research Scientist @DeepMind
60 posts · 188 Following · 634 Followers
Joined March 2011
Wenhao Yu reposted
Anirudha Majumdar @Majumdar_Ani·
Generalist robots need a generalist evaluator. But how do you test safety without breaking things? 💥 🌎 Introducing our new work from @GoogleDeepMind: Evaluating Gemini Robotics Policies in a Veo World Simulator veo-robotics.github.io 🧵👇
Wenhao Yu reposted
Caden Lu @jyluxx·
Interacting with Gemini Robotics 1.5 is so fun! Our Embodied Reasoning model planned the multi-step task and orchestrated our Vision Language Action model for precise execution!
Wenhao Yu @Stacormed·
Gemini Robotics 1.5 is not only general, but also fairly dexterous! Enjoy some fun videos of the robot doing insertion, zipping, and more (remember, this is the *same checkpoint* that also controls two other, very different robots) 😆
Wenhao Yu @Stacormed·
Excited to share our latest work on Gemini Robotics 1.5! Our model can effectively learn from experience of drastically different robots, think on its own, and act as an agent. It’s an important step towards creating a general, intelligent, and friendly robot!
Google DeepMind@GoogleDeepMind

We’re making robots more capable than ever in the physical world. 🤖 Gemini Robotics 1.5 is a levelled up agentic system that can reason better, plan ahead, use digital tools such as @Google Search, interact with humans and much more. Here’s how it works 🧵

Wenhao Yu @Stacormed·
How can we train and apply world models as a step towards modeling the physical world? Come join us at the ICML 2025 workshop on Building Physically Plausible World Models to learn from top experts and share your own research and insights! physical-world-modeling.github.io
Wenhao Yu reposted
Yixin Lin @yixin_lin_·
Complementary to Gemini Robotics -- the massive vision-language-action (VLA) model released yesterday -- we also investigated how far we can push Gemini for robotics _purely from simulation data_ in Proc4Gem: 🧵
Wenhao Yu reposted
Sundar Pichai @sundarpichai·
We’ve always thought of robotics as a helpful testing ground for translating AI advances into the physical world. Today we’re taking our next step in this journey with our newest Gemini 2.0 robotics models. They show state of the art performance on two important benchmarks - generalization and embodied reasoning - which enable robots to draw from Gemini’s multimodal understanding of the world to make changes on the fly + adapt to their surroundings. This milestone lays the foundation for the next generation of robotics that can be helpful across a range of applications.
Wenhao Yu @Stacormed·
Got ideas for bimanual robots tackling real-world challenges? Check out the WBCD (What Bimanuals Can Do) competition at ICRA 2025! We have physical robots, realistic tasks, and amazing prizes for those who extend the boundary of what robots can do!
Wenhao Yu @Stacormed·
Wow, this is really good! In some ways I'm more impressed that it's teleoperated than if it were autonomous, because it feels very plausible to develop a highly specialized RL-based policy to do this, but being able to teleop it opens up a wide range of data to be collected.
Tesla Optimus@Tesla_Optimus

Got a new hand for Black Friday

Wenhao Yu @Stacormed·
How can we leverage the common-sense knowledge of a VLM to understand the progress (and even quality!) of a robotics trajectory? Check out GVL for a surprisingly simple and elegant way to do that! Awesome work by Jason!
Jason Ma@JasonMa2020

Excited to finally share Generative Value Learning (GVL), my @GoogleDeepMind project on extracting universal value functions from long-context VLMs via in-context learning! We discovered a simple method to generate zero-shot and few-shot values for 300+ robot tasks and 50+ datasets using SOTA VLMs like Gemini (Try out the demo on our website on your robot video today!) I worked a lot on leveraging foundation models as guidance for robots in my PhD, and to me, this result forges a new frontier in how we can use foundation models for robot learning, given its broad applicability independent of embodiment and task types. Quite excited about how we can build on this work as a community!
