Max Jiang

71 posts

Max Jiang

@maxjiang93

Staff Research Scientist / TLM @ Waymo | Generative Models, 3D Vision, Self-driving Cars | Prev. PhD@Berkeley, Google, Cruise | Opinions are my own

Mountain View, CA Katılım Kasım 2013

226 Takip Edilen364 Takipçiler

Max Jiang@maxjiang93·3d

@ZGojcic @FidlerSanja Very nice work, congrats!

English

Zan Gojcic@ZGojcic·4d

A new generation in AV simulation is here! We are announcing AlpaDreams, a real time interactive generative world model for AV simualtion! Just a year ago it took minutes to generate a few seconds of video, today it is real time and interactive! research.nvidia.com/labs/sil/proje…

English

103

16.8K

Max Jiang retweetledi

Songyou Peng@songyoupeng·4 Mar

📢Our team @GoogleDeepMind is hiring a Research Scientist in MTV, NYC, or SF! Join us to push the frontiers of visual perception & spatial reasoning for multimodal foundation models like Gemini, Nano Banana, and more! Send your CV to gdm-3d-scene-understanding-job@google.com

English

712

95.5K

Max Jiang retweetledi

Sundar Pichai@sundarpichai·7 Şub

Great use of Genie 3 from @waymo to create high-fidelity, interactive simulations of rare events that are nearly impossible to capture in the real world.

Google DeepMind@GoogleDeepMind

Genie 3 🤝 @Waymo The Waymo World Model generates photorealistic, interactive environments to train autonomous vehicles. This helps the cars navigate rare, unpredictable events before encountering them in reality. 🧵

English

133

160

1.7K

146.6K

Max Jiang retweetledi

Demis Hassabis@demishassabis·6 Şub

Super cool use case of Genie 3 simulations!

Waymo@Waymo

We’re excited to introduce the Waymo World Model—a frontier generative mode for large-scale, hyper-realistic autonomous driving simulation built on @GoogleDeepMind’s Genie 3. By simulating the “impossible”, we proactively prepare the Waymo Driver for some of the most rare and complex scenarios—from tornadoes to planes landing on freeways—long before it encounters them in the real world. waymo.com/blog/2026/02/t…

English

104

254.9K

Max Jiang@maxjiang93·6 Şub

Incredibly excited to share our most recent work: the Waymo World Model. We leverage the broad world knowledge in Google DeepMind's Genie 3 and bring it into our most advanced autonomous driving simulator to date, with emergent transfer of world knowledge even into the 3D domain.

Waymo@Waymo

English

5.8K

Max Jiang retweetledi

Dmitri Dolgov@dmitri_dolgov·12 Ara

Exponential scaling ongoing – @Waymo has officially doubled our fully autonomous cities in a matter of weeks, reaching 10 cities with the newest additions of San Antonio and Orlando. This is a testament to the maturity and generalizability of the Waymo Driver, our deliberate, safety-first approach to scaling, and an important step as we prepare to serve more riders across more cities soon.

English

933

48.8K

Max Jiang@maxjiang93·22 Eki

An early glimpse into the exciting future of fully generative AV world models ;)

Jiahao Wang@jiahaowg

🚗 Excited to present our paper "Drive&Gen: Co-Evaluating End-to-End Driving and Video Generation Models" at #IROS2025! 📅 Time: 10:55–11:00, Wed, 10.22.2025 📍 Room: 103C, Paper WeAT27.6 arxiv: arxiv.org/abs/2510.06209 #IROS2025 #WorldModels #AutonomousDriving #GenerativeAI

English

258

Max Jiang retweetledi

Jiahao Wang@jiahaowg·20 Eki

English

538

Max Jiang@maxjiang93·8 Eyl

If you share this mission and have related experience in world models / video, image generation models, diffusion models, LLMs, 3D, …, let’s chat!

English

106

Max Jiang@maxjiang93·8 Eyl

We are in the most unique position to leverage the data, compute, talent and a little bit of secret sauce 😉 to crack one of AI’s most exciting new frontiers.

English

117

Max Jiang@maxjiang93·8 Eyl

📣Super excited to share a new opportunity to work with me and my team to build the most advanced generative world model for simulating autonomous vehicles 🤖🚕🌎 enabling Waymo to scale faster, safer, and serve more people. careers.withwaymo.com/jobs/research-…

English

917

Max Jiang retweetledi

John Lambert@jlambert_·13 Ağu

Can a single autonomous driving simulation world model jointly insert, delete, and control the behavior of all agents and traffic lights in a bird's-eye-view scene? For the first time, we show this is possible in SceneDiffuser++, our CVPR '25 paper, w/ 60+ second simulations.🧵

GIF

English

1.2K

Max Jiang retweetledi

Jack Parker-Holder@jparkerholder·5 Ağu

Genie 3 feels like a watershed moment for world models 🌐: we can now generate multi-minute, real-time interactive simulations of any imaginable world. This could be the key missing piece for embodied AGI… and it can also create beautiful beaches with my dog, playable real time

English

264

526

4.8K

2.1M

Max Jiang@maxjiang93·1 May

@Micro_Yunha @MIT @MITBiology @MITEECS @MIT_SCC Wow that’s amazing, huge congrats Yunha!

English

Yunha Hwang@Micro_Yunha·28 Nis

It’s official!🎉I’m thrilled to announce that I will be joining @MIT as an assistant professor in a shared appointment between @MITBiology, @MITEECS and @MIT_SCC this fall. My lab will couple ML and high throughput experimentation to harness the remarkable functional diversity of microbial genomes. If you are excited about the intersection of AI and microbiology, please get in touch! It’s been an incredible journey building @tatta_bio with @ancornman1 to advance AI infrastructure for biology, and I will continue to further our mission as chief scientist. I am so grateful for all the support I received from my mentors, colleagues and collaborators over the years: @pgirguis, @sokrypton, @simroux_virus, @AlexJProbst, @annedekas

English

2.8K

160.4K

Max Jiang retweetledi

Jeff Dean@JeffDean·12 Mar

Exciting news: @Waymo is beginning public service on the Peninsula, starting with Palo Alto, Mountain View, and Los Altos! Initial service area below.

Waymo@Waymo

We’ve made millions of miles of memories over the past 15+ years, but today is special. We’re returning to where the journey began, gradually opening our doors to our first public riders in Mountain View, Los Altos, Palo Alto, and parts of Sunnyvale.

English

1.8K

499.1K

Max Jiang retweetledi

Sundar Pichai@sundarpichai·27 Şub

Exciting new @Waymo milestone: Waymo One is now serving 200k+ paid trips each week across LA, Phoenix and SF - that’s 20x growth in less than two years! Up next: Austin, Atlanta and Miami.

English

177

508

3.9K

703.6K

Max Jiang@maxjiang93·1 Şub

“More than national prides and competitions, I think it’s time to start thinking globally about the challenges and social changes that AI will bring everywhere in the world. And open-source technology is likely our most important asset…”

Thomas Wolf@Thom_Wolf

Finally took time to go over Dario's essay on DeepSeek and export control and to be honest it was quite painful to read. And I say this as a great admirer of Anthropic and big user of Claude* The first half of the essay reads like a lengthy attempt to justify that closed-source models are still significantly ahead of DeepSeek. However, it mostly refers to internal unpublished evals which limit the credit you can give it, and statements like « DeepSeek-V3 is close to SOTA models and stronger on some very narrow tasks » transforming in a general conclusion « DeepSeek-V3 is actually worse than those US frontier models — let’s say by ~2x on the scaling curve » left me generally doubtful. The same applies to the takeaway that all discoveries and efficiency improvements of DeepSeek have been discovered long ago by closed-models companies, this statement mostly resulting from a comparison of DeepSeek openly published $6M training numbers with some vague « few $10M » on Anthropic side without providing much more details. I have no doubts the Anthropic team is extremely talented and I’ve regularly shared how impressed I am with Sonnet 3.5 but this longwinded comparison of open research with vague closed research and undisclosed evals has left me less convinced of their lead than I was before I reading it. Even more frustrating was the second half of the essay which dive into the US-China race scenario and totally misses the point that the DeepSeek model is open-weights, and largely open-knowledge due to its detailed tech report (and feel free to follow Hugging Face’s open-r1 reproduction project for the remaining non-public part: the synthetic dataset). If both DeepSeek and Anthropic models had been closed source, yes the arm-race interpretation could have make sense but having one of the model freely widely available for download and with detailed scientific report renders the whole « close-source arm-race competition » argument artificial and unconvincing in my opinion. Here is the thing: open-source knows no border. Both in its usage and its creation. Every company in the world, be it in Europe, Africa, South-America or the USA can now directly download and use DeepSeek without sending data to a specific country (China for instance) or depending on a specific company or server for running the core part of its technology. And just like most open-source library in the world are typically built by contributors from all over the world, we’ve already seen several hundred derivative models on the Hugging Face hub created everywhere in the world by teams adapting the original model to their specific use cases and explorations. What's more, with the open-r1 reproduction and the DeepSeek paper, the coming months will clearly see many open-source reasoning models being released by teams from all over the world. Just today, two other teams, AllenAI in Seattle and Mistral in Paris both independently released open-source base models (Tülu and Small3) which are already challenging the new state-of-the-art (with AllenAI indicating that its Tülu model surpasses the performance of DeepSeek-V3). And the scope is even much broader than this geographical aspect. Here is the thing we don’t talk nearly enough about: open-source will be more and more essential for our… safety! As AI becomes central to our lives, resiliency will increasingly become a very important element of this technology. Today we’re dependent on internet access for almost everything. Without access to the internet, we lose all our social media/news feeds, can’t order a taxi, book a restaurant, or reach someone on WhatsApp. Now imagine an alternate world to ours where all the data transiting through the internet would have to go through a single company’s data centers. The day this company suffers a single outage, the whole world would basically stop spinning (picture the recent CrowdStrike outage magnified a millionfold). Soon, as AI assistants and AI technology permeate our whole life to simplify many of our online and offline tasks, we (and companies using AI) will start to depend more on more on this technology for our daily activities and we will similarly start to find annoying or even painful any downtime in these AI assistants from outages. The most optimal way to avoid future downtime situations will be to build resilience deep in our technological chain. Open-source has many advantages like shared training costs, tunability, control, ownership, privacy but one of its most fundamental virtue in the long term –as AI becomes deeply embedded in our world– will likely be its strong resilience. It is one of the most straightforward and cost-effective ways to easily distribute compute across many independent providers and to even run models locally and on device with minimal complexity. More than national prides and competitions, I think it’s time to start thinking globally about the challenges and social changes that AI will bring everywhere in the world. And open-source technology is likely our most important asset for safely transitioning to a resilient digital future where AI is integrated into all aspects of society. *Claude is my default LLM for complex coding. I also love its character with hesitations and pondering, like a prelude to the chain-of-thoughts of more recent reasoning models like DeepSeek generations.

English

123

Max Jiang retweetledi

Robert Nishihara@robertnishihara·27 Oca

Just sat down to read the DeepSeek-R1 paper. We're entering an era where compute isn't primarily for training. It's for creating better data. I expect to see the money & compute spent on data processing (generation / annotation / curation) grow to match and exceed the money & compute spent on pre-training. People have talked about pre-training plateauing because we're "running out of data" on the internet to scrape. While that may be the case, capability improvements are going to continue full steam ahead. The improvements in intelligence are going to come not from putting in more data (scraped from the internet) but rather from putting in more compute (to generate higher-quality data). Intuitively, this feels similar to me to how people learn. You don't learn just by ingesting lots of tokens. In many cases, you learn by thinking more (I am referring to training time, but thinking more also applies at inference time). There are many creative ways to put in more compute to get better data, and this problem will be an important research area for a number of years. - In this paper, they train two models. Why two models? The first one (a reasoning model trained via RL) is used to generate data to train the second. This works by using the first model to generate reasoning traces and then selectively keeping only the high quality outputs (quality is judged by simply checking the results). This approach of "checking the results" works well for domains like math and coding where you can easily check the results. - In drug development, it is super common to put compute into generating better data in two phases. In the first phase, a generative model (e.g., for protein sequences) generates a massive number of candidate drugs. In the second phase, scoring or filtering is done with a slew of predictive models which may predict structure, toxicity, solubility, binding affinity, etc. After all this work is done, you may end up with 100 data points. - In physical domains (e.g., climate applications), expensive but accurate physics simulators exist (these simulations are run on super computers for long periods of time to simulate the physics of the atmosphere or some other system). All of that data can be used to train models, which is showing a ton of promise. The question of "how to put more compute into generating better data" is central to progress in AI right now.

English

156

976

173.9K

Max Jiang@maxjiang93·11 Ara

@YijingEric @ancornman1 @XiukunH @hongjjeon @jlambert_ @CarlosFuertes @tanmingxing @drago_anguelov @Waymo

QAM

179

Max Jiang@maxjiang93·11 Ara

Joint work with @YijingEric @ancornman1 Chris Davis @XiukunH @hongjjeon Sakshum Kulshrestha @jlambert_ Shuangyu Li Xuanyu Zhou @CarlosFuertes Chang Yuan @tanmingxing Yin Zhou @drago_anguelov

Filipino

136

Max Jiang@maxjiang93·11 Ara

[NeurIPS 2024] We are excited to share SceneDiffuser - SceneDiffuser is a diffusion based world model for traffic simulation. Paper: openreview.net/pdf?id=a4qT29L… Come talk to us at NeurIPS! Waymo Booth or Friday Poster Session. 11am, East Exhibit Hall A-C #1200.

English

1.3K

Keşfet

@ZGojcic @FidlerSanja @GoogleDeepMind @Waymo @Micro_Yunha @MIT @MITBiology @MITEECS