Rodrigo Mira

148 posts

@RodrigomiraA

Senior Research Scientist @GoogleDeepMind, ex-Postdoc @MetaAI, PhD grad @imperialcollege. Audio-visual speech + GenAI + SSL.

New York City · Joined November 2020
562 Following · 364 Followers
Rodrigo Mira retweeted
Charles 🎉 Frye @charles_irl
born to build artificial intelligence forced to debug python package installation
Rodrigo Mira retweeted
Shiqi Yang @shiqi_yang_147
We are also inviting already accepted papers to be presented as posters. Submit your poster here: docs.google.com/forms/d/e/1FAI… 2nd AVGenL: Audio-Visual Generation and Learning at ICCV 2025. #ICCV2025
Shiqi Yang @shiqi_yang_147

Update, there will be 2 industrial demo sessions, Hedra and Veo 3 from Google DeepMind. Workshop site: goo.su/p8cj Organizer: @lightchaserx @RodrigomiraA @ShoukangHu @VickyKalogeiton @Tae_Hyun_Oh Stavros Petridis, Ming-Hsuan Yang #ICCV2025 #ICCV @ICCVConference

Rodrigo Mira retweeted
Shiqi Yang @shiqi_yang_147
Update, there will be 2 industrial demo sessions, Hedra and Veo 3 from Google DeepMind. Workshop site: goo.su/p8cj Organizer: @lightchaserx @RodrigomiraA @ShoukangHu @VickyKalogeiton @Tae_Hyun_Oh Stavros Petridis, Ming-Hsuan Yang #ICCV2025 #ICCV @ICCVConference
Shiqi Yang @shiqi_yang_147

We are excited to announce the 2nd Workshop on AVGenL: Audio-Visual Generation & Learning at #ICCV2025! This year, in addition to an outstanding lineup of speakers, we’ll be featuring industrial demo sessions. Stay tuned—more details are coming soon! goo.su/p8cj

Rodrigo Mira retweeted
Antoni Bigata @toninio444
Excited to be at #CVPR2025! I'm presenting our paper, "KeyFace: Expressive Audio-Driven Facial Animation for Long Sequences via KeyFrame Interpolation." Stop by my poster at 📍 ExHall D, Poster #3. Details: cvpr.thecvf.com/virtual/2025/p… DM me to grab a coffee and talk research!
Rodrigo Mira retweeted
AIGCLINK @aigclink
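A newly released lip-sync tool: KeySync. Its core strength is that it addresses expression leakage and occlusion: lip movements are not affected by the facial expressions of the person in the original video, and sync quality does not degrade when the mouth is occluded. It can handle high-resolution video and accurately aligns the generated lip movements with the new audio, avoiding audio-visual desynchronization. Its two-stage framework preserves temporal coherence well, so mouth shapes change naturally and smoothly. The generation process can be controlled by adjusting parameters, for example by specifying the position of occluding objects. #唇形同步 (lip sync) #KeySync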
Rodrigo Mira retweeted
Dreaming Tulpa 🥓👑 @dreamingtulpa
there is a new high-quality lip-sync model called KeySync 🔥
Rodrigo Mira retweeted
Antoni Bigata @toninio444
Excited to share our new paper: KeySync! 🚀 After facial animation, we turned to lip synchronization – a field with similar applications (multilingual content, virtual avatars) but unique challenges! 💻 Project Page (w/ code, models and demo): antonibigata.github.io/KeySync/ [1/8] 🧵
Umberto Cappellazzo @Umberto_Senpai
@RodrigomiraA Great work Rodrigo!! It seems like this time you didn't sweat as much as you did at the last Interspeech 😂
Rodrigo Mira @RodrigomiraA
We will be presenting our new paper in 1 hour at ICASSP 2025! Contextual Speech Extraction: Leveraging Textual History as an Implicit Cue for Target Speech Extraction (arxiv.org/pdf/2503.08798). If you're around, come to our poster session at 2 PM, poster number is 2E-1 :)
Rodrigo Mira retweeted
Antoni Bigata @toninio444
Excited to share our new paper, KeyFace 🔑, accepted to #CVPR2025! 🎉 We propose a novel two-stage audio-driven facial animation model, enhancing naturalness and consistency in long video sequences. 💻 Project Page: antonibigata.github.io/KeyFace/ [1/8] 🧵
Rodrigo Mira @RodrigomiraA
@PragyaKhanna8 @ieeeICASSP Thanks for the question! Not exactly, but we did try mixing our context embedding with a more traditional TSE-style speaker embedding in H(ybrid)-ContExt, and the results were promising - check Table 1. As expected, including the speaker embedding makes results much stronger.
Pragya Khanna @PragyaKhanna8
@RodrigomiraA @ieeeICASSP Intriguing work, have you considered modeling speaker-specific conversational patterns (e.g., prosody, speaking style) along with textual context to further refine speech extraction? Would this be beneficial for distinguishing speakers in overlapping dialogue scenarios?
Rodrigo Mira @RodrigomiraA
Contextual Speech Extraction (CSE) has been accepted to @ieeeICASSP! 🥳 We propose a new speech extraction strategy which requires only text-based context (e.g., dialogue history) to identify the target speaker. Paper, code, and samples: miraodasilva.github.io/cse-project-pa… [1/8] 🧵
Rodrigo Mira @RodrigomiraA
@AnandButani @ieeeICASSP Thanks for the interest Anand :) We haven't explored real-time extraction for our proof of concept, but this could definitely be done using a real-time speech separation backbone for example! Hoping to see this in future work.
Anand Butani @AnandButani
@RodrigomiraA @ieeeICASSP Fascinating how LLM contextual embedding streamlines audio processing. Have you explored its impact on real-time applications?
Rodrigo Mira @RodrigomiraA
@ieeeICASSP If you’re around at ICASSP 🇮🇳 this year, check out our poster: Speech Enhancement and Extraction IV, Tuesday, April 8, 14:00 - 15:30. Minsu and I will be there, so come say hi! 🤗 #ICASSP2025 [8/8]
Rodrigo Mira @RodrigomiraA
@ieeeICASSP Notably, we find that CSE performance scales with the length of the context, rising sharply from 1 to 10 dialogue turns and then plateauing around 50 turns. (More ablations in the paper) [7/8] 🧵