Miki Rubinstein

107 posts

@MikiRubinstein

Research Scientist @Google, leading a Computer Vision and Graphics research group | PhD @MIT.

Cambridge, MA · Joined January 2021
39 Following · 320 Followers
Pinned Tweet
Miki Rubinstein retweeted
Google DeepMind @GoogleDeepMind
Our short film Dear Upstairs Neighbors is previewing at @sundancefest. 🎬 It’s a story about noisy neighbors, but behind the scenes, it’s about solving a huge challenge in generative AI: control. Developed by Pixar alumni, an Academy Award winner, researchers, and engineers, here’s how it came together. 🎨
369 replies · 413 reposts · 3.4K likes · 2.2M views
Miki Rubinstein retweeted
Google @Google
We’re at @sundancefest previewing our new animated short, "Dear Upstairs Neighbors" 📽️ In creating this film, our @GoogleDeepMind team of Pixar alumni, an Academy Award winner, researchers, and engineers designed new AI capabilities specifically for filmmakers. These tools gave director Connie He a new level of artistic control, allowing her to tell a story she's always wanted to share.
60 replies · 42 reposts · 289 likes · 64.3K views
Miki Rubinstein retweeted
Google DeepMind @GoogleDeepMind
We just dropped Nano Banana Pro, built on Gemini 3. 🍌 With state-of-the-art text rendering, vast world knowledge and studio-quality creative controls, Gemini 3 Pro Image can create and edit more complex visuals, infographics and more. Here’s what’s under the hood. 🧵
164 replies · 590 reposts · 3.8K likes · 1.5M views
Miki Rubinstein retweeted
Google DeepMind @GoogleDeepMind
This is Gemini 3: our most intelligent model that helps you learn, build and plan anything. It comes with state-of-the-art reasoning capabilities, world-leading multimodal understanding, and enables new agentic coding experiences. 🧵
213 replies · 1.1K reposts · 6.5K likes · 1.7M views
Miki Rubinstein retweeted
Google DeepMind @GoogleDeepMind
Veo is getting new precision editing capabilities that let you easily add or remove elements from a scene - all while preserving the integrity of your original video. 🎥
95 replies · 269 reposts · 2.2K likes · 266.4K views
Miki Rubinstein retweeted
Oliver Wang @oliver_wang2
🍌🍌It's finally here! In addition to the largest ELO lead in lmarena history, I'm most excited about the fact that people really loved using the model. QPS was way above what we expected, and the model racked up 2.5M votes (also a record)! Amazing job team banana 🚀🚀🍌🍌
Arena.ai @arena

🚨🍌Big Reveal: who was "Nano Banana?" The anonymous model, “nano-banana,” that caught the world's attention with its ability to follow complex instructions, preserve character identity, and maintain contextual details was: Gemini-2.5-Flash-Image-Preview by @GoogleDeepMind 🍌✨
- Now ranked #1 on the Image Edit Arena
- Also ranked #1 for Text-to-Image
In two weeks, “nano-banana” has driven over 5 million votes to the Image Edit Arena. With 2.5M+ votes for this model, it is the highest number of votes any model has received, with the largest Elo score lead (171) in Arena history. Congrats to the @GoogleDeepMind team on this incredible milestone in image edit and generation. 👏

9 replies · 13 reposts · 196 likes · 20.9K views
Miki Rubinstein retweeted
Zoubin Ghahramani @ZoubinGhahrama1
WoZ at the Sphere was a tour de force of AI for creative industries, combining state-of-the-art super-resolution, outpainting, generative video and computer vision research at @GoogleDeepMind to bring it all to life.
Lorraine Twohill @LorraineTwohill

The Wizard of Oz at Sphere last night was pure magic. The creative vision of very talented humans + the best AI tools and models. Incredible work from James & @SphereVegas team, @janetribeca, @GoogleDeepMind, @GoogleCloud, and many more. We are officially not in Kansas any more!

1 reply · 3 reposts · 23 likes · 6.1K views
Miki Rubinstein retweeted
Google DeepMind @GoogleDeepMind
What if you could not only watch a generated video, but explore it too? 🌐 Genie 3 is our groundbreaking world model that creates interactive, playable environments from a single text prompt. From photorealistic landscapes to fantasy realms, the possibilities are endless. 🧵
814 replies · 2.6K reposts · 13.4K likes · 3.7M views
Miki Rubinstein retweeted
Inbar Mosseri @inbar_mosseri
Excited to introduce our new Veo 2 capabilities! Now with reference-powered video generation (including style!), camera controls, outpainting, object add/removal & many more: deepmind.google/models/veo/#ca… Also presenting Flow, our new AI filmmaking tool. labs.google/flow
1 reply · 9 reposts · 33 likes · 2K views
Miki Rubinstein retweeted
Google Gemini @GeminiApp
Imagen 4 delivers visuals that pop with richer details, more nuanced color, and better text outputs. Everyone can make images for free in the Gemini App today: gemini.google.com #GoogleIO
181 replies · 350 reposts · 6.4K likes · 18.6M views
Miki Rubinstein retweeted
Google DeepMind @GoogleDeepMind
Video, meet audio. 🎥🤝🔊 With Veo 3, our new state-of-the-art generative video model, you can add soundtracks to clips you make. Create talking characters, include sound effects, and more while developing videos in a range of cinematic styles. 🧵
645 replies · 1.3K reposts · 8.1K likes · 1.6M views
Miki Rubinstein retweeted
Jon Barron @jon_barron
Happy Hanukkah! "A man using an upside-down Hanukkah menorah as a jetpack, blasting off" #Veo2
16 replies · 12 reposts · 227 likes · 17.1K views
Miki Rubinstein retweeted
Jia-Bin Huang @jbhuang0604
Introducing Generative Omnimatte: A method for decomposing a video into complete layers, including objects and their associated effects (e.g., shadows, reflections). It enables many cool applications, such as video stylization, compositions, moment retiming, and object removal.
22 replies · 165 reposts · 1.2K likes · 81.7K views
Miki Rubinstein retweeted
Daniel Geng @dangengdg
What happens when you train a video generation model to be conditioned on motion? Turns out you can perform "motion prompting," just like you might prompt an LLM! Doing so enables many different capabilities. Here’s a few examples – check out this thread 🧵 for more results!
20 replies · 146 reposts · 672 likes · 94.5K views
Miki Rubinstein retweeted
Chen Sun @jesu9
Motion is the new (and better) language for conditioned video generation, and Daniel @dangengdg shows that its vocabulary should be formed as points and tracks! You can now motion-prompt the same model for object and camera control, motion transfer, and many more!
Daniel Geng @dangengdg

What happens when you train a video generation model to be conditioned on motion? Turns out you can perform "motion prompting," just like you might prompt an LLM! Doing so enables many different capabilities. Here’s a few examples – check out this thread 🧵 for more results!

0 replies · 4 reposts · 22 likes · 2.3K views
Miki Rubinstein retweeted
Yao-Chih Lee @YaoChihLee
Excited to introduce our new paper, Generative Omnimatte: Learning to Decompose Video into Layers, with the amazing team at Google DeepMind! Our method decomposes a video into complete layers, including objects and their associated effects (e.g., shadows, reflections).
20 replies · 92 reposts · 610 likes · 92.3K views
Miki Rubinstein retweeted
Tali Dekel @talidekel
Working on layered video decomposition for a few years now, I'm super excited to share these results! Casual videos to *fully visible* RGBA layers, even under significant occlusions! Kudos @YaoChihLee, @erika_lu_, Sarah Rumbley, @GeyerMichal, @jbhuang0604, and @forrestercole
Yao-Chih Lee @YaoChihLee

Excited to introduce our new paper, Generative Omnimatte: Learning to Decompose Video into Layers, with the amazing team at Google DeepMind! Our method decomposes a video into complete layers, including objects and their associated effects (e.g., shadows, reflections).

1 reply · 16 reposts · 126 likes · 11K views
Miki Rubinstein retweeted
Jialu Li @JialuLi96
🎉Excited to introduce our new paper: Unbounded: A Generative Game of Character Life Simulation! We build a game of character life simulation that is fully encapsulated in generative models.
🌟We achieve this with:
▶️ A specialized, distilled LLM that dynamically generates game mechanics, narratives, and character interactions in real-time.
▶️ A dynamic regional IP-Adapter for vision models that ensures consistent yet flexible visual generation of a character across multiple environments. 🧵
6 replies · 49 reposts · 185 likes · 47.7K views