Miki Rubinstein

107 posts

@MikiRubinstein

Research Scientist @Google, leading a Computer Vision and Graphics research group | PhD @MIT.

Cambridge, MA · Joined January 2021

39 Following · 320 Followers

Pinned Tweet

Miki Rubinstein@MikiRubinstein·2 Jun

Excited about this new work we've just released, StyleDrop, stylizing text-to-image generation from very few examples (in many cases just one!). Check out the project page for some beautiful results: styledrop.github.io

AK@_akhaliq

StyleDrop: Text-to-Image Generation in Any Style introduce StyleDrop, a method that enables the synthesis of images that faithfully follow a specific style using a text-to-image model. The proposed method is extremely versatile and captures nuances and details of a user-provided style, such as color schemes, shading, design patterns, and local and global effects. It efficiently learns a new style by fine-tuning very few trainable parameters (less than 1% of total model parameters) and improving the quality via iterative training with either human or automated feedback. Better yet, StyleDrop is able to deliver impressive results even when the user supplies only a single image that specifies the desired style. An extensive study shows that, for the task of style tuning text-to-image models, StyleDrop implemented on Muse convincingly outperforms other methods, including DreamBooth and textual inversion on Imagen or Stable Diffusion. paper page: huggingface.co/papers/2306.00…

Miki Rubinstein retweeted

Google DeepMind@GoogleDeepMind·26 Jan

Our short film Dear Upstairs Neighbors is previewing at @sundancefest. 🎬 It’s a story about noisy neighbors, but behind the scenes, it’s about solving a huge challenge in generative AI: control. Developed by Pixar alumni, an Academy Award winner, researchers, and engineers, here’s how it came together. 🎨

Miki Rubinstein retweeted

Google@Google·27 Jan

We’re at @sundancefest previewing our new animated short, "Dear Upstairs Neighbors" 📽️ In creating this film, our @GoogleDeepMind team of Pixar alumni, an Academy Award winner, researchers, and engineers designed new AI capabilities specifically for filmmakers. These tools gave director Connie He a new level of artistic control, allowing her to tell a story she's always wanted to share.

Miki Rubinstein retweeted

Google DeepMind@GoogleDeepMind·20 Nov

We just dropped Nano Banana Pro, built on Gemini 3. 🍌 With state-of-the-art text rendering, vast world knowledge and studio-quality creative controls, Gemini 3 Pro Image can create and edit more complex visuals, infographics and more. Here’s what’s under the hood. 🧵

Miki Rubinstein retweeted

Google DeepMind@GoogleDeepMind·18 Nov

This is Gemini 3: our most intelligent model that helps you learn, build and plan anything. It comes with state-of-the-art reasoning capabilities, world-leading multimodal understanding, and enables new agentic coding experiences. 🧵

Miki Rubinstein retweeted

Google DeepMind@GoogleDeepMind·20 Oct

Veo is getting new precision editing capabilities that let you easily add or remove elements from a scene - all while preserving the integrity of your original video. 🎥

Miki Rubinstein retweeted

Oliver Wang@oliver_wang2·26 Aug

🍌🍌It's finally here! In addition to the largest ELO lead in lmarena history, I'm most excited about the fact that people really loved using the model. QPS was way above what we expected, and the model racked up 2.5M votes (also a record)! Amazing job team banana 🚀🚀🍌🍌

Arena.ai@arena

🚨🍌Big Reveal: who was "Nano Banana?" The anonymous model, “nano-banana,” that caught the world's attention with its ability to follow complex instructions, preserve character identity, and maintain contextual details was: Gemini-2.5-Flash-Image-Preview by @GoogleDeepMind 🍌✨
- Now ranked #1 on the Image Edit Arena
- Also ranked #1 for Text-to-Image
In two weeks, “nano-banana” has driven over 5 million votes to the Image Edit Arena. With 2.5M+ votes for this model, it is the highest number of votes any model has received, with the largest Elo score lead (171) in Arena history. Congrats to the @GoogleDeepMind team on this incredible milestone in image edit and generation. 👏

Miki Rubinstein retweeted

Zoubin Ghahramani@ZoubinGhahrama1·1 Sep

WoZ at the Sphere was a tour de force of AI for creative industries, combining state-of-the-art super-resolution, outpainting generative video and computer vision research at @GoogleDeepMind to bring it all to life.

Lorraine Twohill@LorraineTwohill

The Wizard of Oz at Sphere last night was pure magic. The creative vision of very talented humans + the best AI tools and models. Incredible work from James & @SphereVegas team, @janetribeca, @GoogleDeepMind, @GoogleCloud, and many more. We are officially not in Kansas any more!

Miki Rubinstein retweeted

Google DeepMind@GoogleDeepMind·5 Aug

What if you could not only watch a generated video, but explore it too? 🌐 Genie 3 is our groundbreaking world model that creates interactive, playable environments from a single text prompt. From photorealistic landscapes to fantasy realms, the possibilities are endless. 🧵

Miki Rubinstein retweeted

Inbar Mosseri@inbar_mosseri·20 May

Veo3 is out and it has a voice! 🗣️ deepmind.google/models/veo/ Veo3 can generate video and audio, including sound effects, ambient noise, even dialogue! The future of AI video is sounding amazing 🤩 #Veo3 #AI #GoogleAI #VideoGeneration

Miki Rubinstein retweeted

Inbar Mosseri@inbar_mosseri·20 May

Excited to introduce our new Veo 2 capabilities! Now with reference powered video generation (including style!), camera controls, outpainting, object add/removal & many more: deepmind.google/models/veo/#ca… Also presenting Flow, our new AI filmmaking tool. labs.google/flow

Miki Rubinstein retweeted

Google Gemini@GeminiApp·20 May

Imagen 4 delivers visuals that pop with richer details, more nuanced color, and better text outputs. Everyone can make images for free in the Gemini App today: gemini.google.com #GoogleIO

Miki Rubinstein retweeted

Google DeepMind@GoogleDeepMind·20 May

Video, meet audio. 🎥🤝🔊 With Veo 3, our new state-of-the-art generative video model, you can add soundtracks to clips you make. Create talking characters, include sound effects, and more while developing videos in a range of cinematic styles. 🧵

Miki Rubinstein@MikiRubinstein·2 Jan

@jon_barron Haha this is great. >9 candles though :)

Miki Rubinstein retweeted

Jon Barron@jon_barron·2 Jan

Happy Hanukkah! "A man using an upside-down Hanukkah menorah as a jetpack, blasting off" #Veo2

Miki Rubinstein retweeted

Jia-Bin Huang@jbhuang0604·26 Nov

Introducing Generative Omnimatte: A method for decomposing a video into complete layers, including objects and their associated effects (e.g., shadows, reflections). It enables many cool applications, such as video stylization, compositions, moment retiming, and object removal.

Miki Rubinstein retweeted

Daniel Geng@dangengdg·4 Dec

What happens when you train a video generation model to be conditioned on motion? Turns out you can perform "motion prompting," just like you might prompt an LLM! Doing so enables many different capabilities. Here’s a few examples – check out this thread 🧵 for more results!

Miki Rubinstein retweeted

Chen Sun@jesu9·4 Dec

Motion is the new (and better) language for conditioned video generation, and Daniel @dangengdg shows that its vocabulary should be formed as points and tracks! You can now motion-prompt the same model for object and camera control, motion transfer, and many more!

Daniel Geng@dangengdg

Miki Rubinstein retweeted

Yao-Chih Lee@YaoChihLee·26 Nov

Excited to introduce our new paper, Generative Omnimatte: Learning to Decompose Video into Layers, with the amazing team at Google DeepMind! Our method decomposes a video into complete layers, including objects and their associated effects (e.g., shadows, reflections).

Miki Rubinstein retweeted

Tali Dekel@talidekel·27 Nov

Working on layered video decomposition for a few years now, I'm super excited to share these results! Casual videos to *fully visible* RGBA layers, even under significant occlusions! Kudos @YaoChihLee, @erika_lu_, Sarah Rumbley, @GeyerMichal, @jbhuang0604, and @forrestercole

Yao-Chih Lee@YaoChihLee

Miki Rubinstein retweeted

Jialu Li@JialuLi96·25 Oct

🎉Excited to introduce our new paper: Unbounded: A Generative Game of Character Life Simulation! We build a game of character life simulation that is fully encapsulated in generative models. 🌟We achieve this with:
▶️ A specialized, distilled LLM that dynamically generates game mechanics, narratives, and character interactions in real-time.
▶️ A dynamic regional IP-Adapter for vision models that ensures consistent yet flexible visual generation of a character across multiple environments. 🧵
