Ryan Po

26 posts

@Po_lhr

Ph.D. Candidate at Stanford

Joined October 2023
220 Following · 283 Followers
Pinned Tweet
Ryan Po
Ryan Po@Po_lhr·
🎮 Real-time multiplayer world model
👥 Arbitrary number of players
🧠 Generated entirely by a neural network

MultiGen is a real-time multiplayer diffusion game engine that supports an arbitrary number of players through a shared memory-based world model, rather than limiting interaction to just 2 players.

While single-player world models can already be entertaining, things really change once multiple people can step into the same generated world together. Here’s a 30-minute timelapse of 4-player gameplay running in real time.
14
16
151
39.3K
Ryan Po
Ryan Po@Po_lhr·
Super cool work from the Odyssey team! Great to see more momentum in this direction (and happy to see the MultiGen framework getting adopted 😉) Multi-player/agent systems that scale to arbitrary numbers of agents might not be solvable with brute force scaling alone. Can’t wait to see where the field goes with this direction!
Odyssey@odysseyml

Introducing Agora-1, a multi-agent world model. Multiple participants—human or AI—can now interact inside the same world simulation, all in real-time. Try our playable research preview today, with Agora-1 simulating a multiplayer GoldenEye deathmatch!

2
0
13
1.3K
Ryan Po retweeted
Gordon Wetzstein
Gordon Wetzstein@GordonWetzstein·
High-fidelity generation is hitting a scaling crisis as DiT compute grows with image resolution and video length. But do we need high-resolution denoising at every step? We introduce Spectral Progressive Diffusion, a plug-and-play framework for efficient image and video generation that directly exploits the spectral autoregression property of diffusion to grow resolution during denoising. [1/7]
22
64
407
84.1K
Ryan Po retweeted
Hansheng Chen
Hansheng Chen@HanshengCh·
New paper: AsymFlow🔥
JiT x0-prediction is not enough for pixel generation. Better keep velocity in a low-rank subspace:
- 1.57 FID on ImageNet (best pixel flow model)
- Finetunes FLUX.2 klein into pixel space, beats the original on HPSv3/DPG/GenEval (#1 overall on HPSv3)
1/7
20
54
276
52.5K
Ryan Po
Ryan Po@Po_lhr·
Had a lot of fun building this during spring break. Pretty surreal to see a multiplayer generative game actually running in the browser (it even works on mobile). Go try it!
Nataniel Ruiz@natanielruizg

Our previous intern released an extremely impressive re-implemented demo of our paper on multiplayer diffusion game engines. play-multigen.com I think this might be the first time you can play a fully-functional multiplayer generative game online with other people. 🤯

2
4
14
2.4K
Ryan Po
Ryan Po@Po_lhr·
@natanielruizg Thanks Nataniel! We're going to be running this demo for a couple more days, so grab your friends and try it out!
1
0
7
591
Nataniel Ruiz
Nataniel Ruiz@natanielruizg·
Our previous intern released an extremely impressive re-implemented demo of our paper on multiplayer diffusion game engines. play-multigen.com I think this might be the first time you can play a fully-functional multiplayer generative game online with other people. 🤯
15
30
160
29.4K
Ryan Po
Ryan Po@Po_lhr·
A couple of weeks ago, we introduced MultiGen, our work on real-time multiplayer world models. After spending way too many hours playing it with friends internally, we knew we had to share it. Today, we're excited to collab with @modal to let you experience it for yourselves. Grab your squad and play the live demo here 👇
Gordon Wetzstein@GordonWetzstein

We built a real-time multiplayer game generated entirely by a neural network—and now you can actually play it. In collaboration with @modal, we just launched the live demo for MultiGen, our diffusion-based multiplayer game engine. Grab some friends and try it here 👇

0
4
20
2.8K
Ryan Po
Ryan Po@Po_lhr·
@GordonWetzstein @modal Super excited about releasing this! We've been having so much fun playing MultiGen with our friends, and now everyone can try it from their browser (and phones).
0
0
6
293
Gordon Wetzstein
Gordon Wetzstein@GordonWetzstein·
We built a real-time multiplayer game generated entirely by a neural network—and now you can actually play it. In collaboration with @modal, we just launched the live demo for MultiGen, our diffusion-based multiplayer game engine. Grab some friends and try it here 👇
10
20
169
26.4K
Ryan Po retweeted
Gordon Wetzstein
Gordon Wetzstein@GordonWetzstein·
High-resolution image and video generation is hitting a wall because attention in DiTs scales quadratically with token count. But does every pixel need to be in full resolution? Introducing Foveated Diffusion: a new approach for efficient diffusion-based generation that allocates compute where it matters most. 1/7🧵
23
116
1.1K
161.9K
Ryan Po retweeted
Eric Chan
Eric Chan@ericryanchan·
Today, we announce our team’s progress in pursuing a different type of foundation model for robotics: the Direct Video Action Model (DVA), our best attempt to turn robotics into a generative modeling problem we can scale. Technical blog: rhoda.ai/research/direc…
12
26
197
20.2K
Ryan Po
Ryan Po@Po_lhr·
doomguy finds out he's AI generated
Nataniel Ruiz@natanielruizg

Excited to show some surprising inventions on generative multiplayer games we made at Google with Stanford. We call the work MultiGen.

I've always been inspired by early studios like id Software with Doom or Blizzard with Warcraft bringing networked video games to the next level. We are at the point in history where we can make strides like them, but for generative games. It's a strange feeling to be in the age of generative video games while still discovering how exactly to train the models and design the tools that make them useful.

All of the tools that have been invented for classic game engines need to be redesigned for generative games. For example, level and world design is not entirely possible with existing technology. We introduce editable memory to diffusion game engines, which allows for the design of new levels via a minimap. But we can easily imagine how this can be expanded with different creation tools. The end goal of this research direction is to let game designers guide the generation process of their world, at the granularity that they prefer.

Editable memory also allows us to add multiplayer to Generative Doom. We were amazed when we saw GameNGen some years ago, and now you can play it live with friends in real time, on your couch or even online. Shared representations like our editable memory seem like the future for this type of experience.

Models are, in some cases, expensive and approximate encoders but great interpolators and extrapolators. Leveraging their strengths lets you have completely new experiences that can be realized now and not in the distant future.

This work was started at my previous team and continued in collaboration with Stanford. Congratulations to all for the discoveries.

0
2
11
1.8K
Ryan Po
Ryan Po@Po_lhr·
It was a huge pleasure working with Nataniel and team on this project. Starting from his previous project (Unbounded), Nataniel’s vision for generative games is sure to shape the way we view entertainment in the coming years.
1
1
19
1.2K
Ryan Po
Ryan Po@Po_lhr·
One nice consequence of external memory is that it turns level design into a native part of the system. The world is defined explicitly through a top-down map layout, so users can build or modify the environment before inference starts, while the model generates first-person observations that stay aligned with that structure.
1
1
8
1.8K
Ryan Po retweeted
Gordon Wetzstein
Gordon Wetzstein@GordonWetzstein·
Video world models today have a very limited context length. Mode Seeking meets Mean Seeking (MMM) unlocks long-context, persistent video world models through a unified representation. 1/8 🧵
3
26
207
42.7K
Ryan Po retweeted
Nataniel Ruiz
Nataniel Ruiz@natanielruizg·
today we are releasing new research at Google. we tackle the previously unsolved task of editing motion in an existing video. it's called MotionV2V. with it you can move objects in videos, move the camera, and other unprecedented edits in user-provided video
11
43
179
17.9K
Ryan Po retweeted
Ceyuan Yang
Ceyuan Yang@CeyuanY·
In long video generation, the context keeps growing during chunk-wise or frame-wise rollout. Since scaling context may require context selection, we bring the idea of MoE to long-context modelling and propose Mixture of Contexts. All previous context/memory is considered, while the parts actually computed are chosen in a data-driven manner. You can easily enjoy 7x compute savings.
4
31
219
22.1K
Ryan Po
Ryan Po@Po_lhr·
"World model" has been an overloaded term, used by different communities in various contexts. Here's a fantastic blog from Xun, packed with great insights on what world models could look like. A very valuable read for anyone working in this space!
Xun Huang@xxunhuang

What exactly is a "world model"? And what limits existing video generation models from being true world models? In my new blog post, I argue that a true video world model must be causal, interactive, persistent, real-time, and physically accurate. xunhuang.me/blogs/world_mo…

0
1
8
766
Kfir Aberman
Kfir Aberman@AbermanKfir·
🚀 Career Update After years pushing the boundaries of Generative AI at some of the world’s top companies -> I’m going startup. I’ve joined @DecartAI as a founding team member, leading the charge to build our San Francisco office from the ground up. decart.ai
16
3
160
18K