Rodrigo Mira

148 posts

@RodrigomiraA

Senior Research Scientist @GoogleDeepMind, ex-Postdoc @MetaAI, PhD grad @imperialcollege. Audio-visual speech + GenAI + SSL.

New York City · Joined November 2020

562 Following · 364 Followers

Rodrigo Mira@RodrigomiraA·30 Oct

‼️‼️‼️Application closes tomorrow‼️‼️‼️ Excited to be co-hosting a student researcher with @SianGooding next summer 🥳 Come work with us!!

Sian Gooding@SianGooding

DEADLINE TOMORROW! 🚨 I’m co-hosting a Student Researcher (Summer 2026) @GoogleDeepMind with @RodrigomiraA ! 🚀 Join us if you're excited about: 🗣️ Speech simulation, 🎛️ Adaptive speech, 💬 LLM interaction Apply here: google.com/students

Rodrigo Mira retweeted

Charles 🎉 Frye@charles_irl·16 Oct

born to build artificial intelligence forced to debug python package installation

Rodrigo Mira retweeted

Shiqi Yang@shiqi_yang_147·25 Aug

Time to submit the posters! docs.google.com/forms/d/e/1FAI…

Shiqi Yang@shiqi_yang_147

Update, there will be 2 industrial demo sessions, Hedra and Veo 3 from Google DeepMind. Workshop site: goo.su/p8cj Organizer: @lightchaserx @RodrigomiraA @ShoukangHu @VickyKalogeiton @Tae_Hyun_Oh Stavros Petridis, Ming-Hsuan Yang #ICCV2025 #ICCV @ICCVConference

Rodrigo Mira retweeted

Shiqi Yang@shiqi_yang_147·2 Jul

And we are inviting the already accepted papers as the posters, submit your poster here docs.google.com/forms/d/e/1FAI… 2nd AVGenL: Audio-visual generation and learning at ICCV2025. #ICCV2025

Rodrigo Mira retweeted

Shiqi Yang@shiqi_yang_147·25 Jun

We are excited to announce the 2nd Workshop on AVGenL: Audio-Visual Generation & Learning at #ICCV2025! This year, in addition to an outstanding lineup of speakers, we’ll be featuring industrial demo sessions. Stay tuned—more details are coming soon! goo.su/p8cj

Rodrigo Mira retweeted

Antoni Bigata@toninio444·12 Jun

Excited to be at #CVPR2025! I'm presenting our paper, "KeyFace: Expressive Audio-Driven Facial Animation for Long Sequences via KeyFrame Interpolation." Stop by my poster at 📍 ExHall D, Poster #3. Details: cvpr.thecvf.com/virtual/2025/p… DM me to grab a coffee and talk research!

Rodrigo Mira retweeted

AIGCLINK@aigclink·5 May

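A newly released lip-sync tool: KeySync. Its core strength is solving expression leakage and occlusion: the lip movements are not affected by the facial expressions of the person in the original video or by the mouth being occluded. It can handle high-resolution video and accurately aligns the generated lip movements with the new audio, avoiding audio-video desynchronization. Its two-stage framework design preserves temporal coherence well, so mouth-shape changes look natural and smooth. The generation process can be controlled by adjusting parameters, for example by specifying the position of an occluding object. #唇形同步 #KeySync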

Rodrigo Mira retweeted

Dreaming Tulpa 🥓👑@dreamingtulpa·14 May

there is a new high-quality lip-sync model called KeySync 🔥

Rodrigo Mira retweeted

Antoni Bigata@toninio444·2 May

Excited to share our new paper: KeySync! 🚀 After facial animation, we turned to lip synchronization – a field with similar applications (multilingual content, virtual avatars) but unique challenges! 💻 Project Page (w/ code, models and demo) : antonibigata.github.io/KeySync/ [1/8] 🧵

Rodrigo Mira@RodrigomiraA·8 Apr

@Umberto_Senpai Thanks! Yeah the AC actually works this time 🙏

Umberto Cappellazzo@Umberto_Senpai·8 Apr

@RodrigomiraA Great work Rodrigo!! It seems like this time you didn't sweat that much as it happened at last Interspeech 😂

Rodrigo Mira@RodrigomiraA·8 Apr

We will be presenting our new paper in 1 hour at ICASSP 2025! Contextual Speech Extraction: Leveraging Textual History as an Implicit Cue for Target Speech Extraction (arxiv.org/pdf/2503.08798). If you're around, come to our poster session at 2 PM, poster number is 2E-1 :)

Rodrigo Mira retweeted

Antoni Bigata@toninio444·27 Mar

Excited to share our new paper, KeyFace 🔑, accepted to #CVPR2025! 🎉 We propose a novel two-stage audio-driven facial animation model, enhancing naturalness and consistency in long video sequences. 💻 Project Page: antonibigata.github.io/KeyFace/ [1/8] 🧵

Rodrigo Mira@RodrigomiraA·22 Mar

@PragyaKhanna8 @ieeeICASSP Thanks for the question! Not exactly, but we did try mixing our context embedding with a more traditional TSE-style speaker embedding in H(ybrid)-ContExt, and the results were promising - check Table 1. As expected, including the speaker embedding makes results much stronger.

Pragya Khanna@PragyaKhanna8·15 Mar

@RodrigomiraA @ieeeICASSP Intriguing work, have you considered modeling speaker-specific conversational patterns (e.g., prosody, speaking style) along with textual context to further refine speech extraction? Would this be beneficial for distinguishing speakers in overlapping dialogue scenarios?

Rodrigo Mira@RodrigomiraA·14 Mar

Contextual Speech Extraction (CSE) has been accepted to @ieeeICASSP! 🥳 We propose a new speech extraction strategy which requires only text-based context (e.g., dialogue history) to identify the target speaker. Paper, code, and samples: miraodasilva.github.io/cse-project-pa… [1/8] 🧵

Rodrigo Mira@RodrigomiraA·14 Mar

@AnandButani @ieeeICASSP Thanks for the interest Anand :) We haven't explored real-time extraction for our proof of concept, but this could definitely be done using a real-time speech separation backbone for example! Hoping to see this in future work.

Anand Butani@AnandButani·14 Mar

@RodrigomiraA @ieeeICASSP Fascinating how LLM contextual embedding streamlines audio processing. Have you explored its impact on real-time applications?

Rodrigo Mira@RodrigomiraA·14 Mar

@ieeeICASSP If you’re around at ICASSP 🇮🇳 this year, check out our poster: Speech Enhancement and Extraction IV Tuesday, April 8: 14:00 - 15:30 Minsu and I will be there, so come say hi! 🤗 #ICASSP2025 [8/8]

Rodrigo Mira@RodrigomiraA·14 Mar

@ieeeICASSP Notably, we find that CSE performance scales with the length of the context, rising sharply from 1-10 dialogue turns, and then plateauing around 50 turns. (More ablations in the paper) [7/8] 🧵
