Scott Wisdom

20 posts

@ScottTWisdom

Research scientist at @GoogleAI working on sound separation

Joined July 2011
121 Following, 184 Followers
Scott Wisdom retweeted
Jason Baldridge @jasonbaldridge
Veo 3 is here, and in addition to better visuals, it makes noises and speaks! This was a massive effort made possible by incredible passion from the whole Veo team and the many other teams enabling it to launch today. Looking forward to seeing what others do with it! #veo3
Scott Wisdom retweeted
Sundar Pichai @sundarpichai
Veo 3, our SOTA video generation model, has native audio generation and is absolutely mindblowing. For filmmakers + creatives, we’re combining the best of Veo, Imagen and Gemini into a new filmmaking tool called Flow. Ready today for Google AI Pro and Ultra plan subscribers.
Scott Wisdom retweeted
Google DeepMind @GoogleDeepMind
We're sharing progress on our video-to-audio (V2A) generative technology. 🎥 It can add sound to silent clips that match the acoustics of the scene, accompany on-screen action, and more. Here are 4 examples - turn your sound on. 🧵🔊 dpmd.ai/v2a
Scott Wisdom retweeted
Vivek Kumar @vivek_kumar
It's so awesome to see the impact of the computational audio capabilities we developed featured in @madebygoogle 🎉 🎉 🎉 Congrats to John Hershey, @ScottTWisdom, @PGetreuer & everyone who contributed for pioneering new computational audio capabilities in Pixel8 #MadeByGoogle
Google Photos @googlephotos

Check out the 4 new Google Photos features coming first to Pixel 8 and 8 Pro ↓ Whether it’s noise from wind, traffic, or barking dogs, Audio Magic Eraser in Google Photos reduces distracting sounds in your video in just a few taps! 🪄

Scott Wisdom retweeted
Jonathan Le Roux @JonathanLeRoux
Strong showing at #SANE2022 to learn about the latest and greatest in speech and audio research from a stellar lineup!
Scott Wisdom retweeted
Efthymios Tzinis @ETzinis
I am 😃 that we will present AudioScopeV2 at #ECCV2022! If you want to learn about improved audio-visual attention models and calibration for on-screen sound separation check our paper w. @ScottTWisdom! project-page: google-research.github.io/sound-separati… new dataset: github.com/google-researc…
arXiv Sound @ArxivSound

"AudioScopeV2: Audio-Visual Attention Architectures for Calibrated Open-Domain On-Screen Sound Separation" (arXiv:2207.10141v1 [cs.SD]), Efthymios Tzinis, Scott Wisdom, Tal Remez, John R. Hershey, ift.tt/jOrEQWR

Scott Wisdom retweeted
AK @_akhaliq
Distance-Based Sound Separation
abs: arxiv.org/abs/2207.00562
project page: google-research.github.io/sound-separati…
With a single nearby speaker and four distant speakers, the model improves scale-invariant signal-to-noise ratio by 4.4 dB for near sounds and 6.8 dB for far sounds.
Scott Wisdom retweeted
Aswin Sivaraman @actuallyaswin
Happy to see my summer work with @ScottTWisdom, Hakan Erdogan, and John Hershey was accepted for presentation at @ieeeICASSP 2022 😊 My first ICASSP paper in the books! Immensely thankful for their mentorship. Our first version can be found on arXiv at: arxiv.org/abs/2110.10739
Scott Wisdom retweeted
Sundar Pichai @sundarpichai
We can learn a lot about our environment just by listening to the birds. New #GoogleAI approaches can help isolate and identify birdsongs, helping ecologists better understand food systems and forest health. 🐦 ai.googleblog.com/2022/01/separa…
Scott Wisdom retweeted
Yuma Koizumi @yuma_koizumi
Our DF-Conformer paper has received the “Best Speech Enhancement Paper Award” from #WASPAA2021! Yay!!
Scott Wisdom retweeted
Eduardo Fonseca @edfonseca_
🔊Happy to announce FSD50K: the new open dataset of human-labeled sound events! Over 51k Freesound audio clips, totalling over 100h of audio manually labeled using 200 classes drawn from the AudioSet Ontology. Paper: arxiv.org/pdf/2010.00475… Dataset: doi.org/10.5281/zenodo…