Sathvik Udupa

56 posts

@SathvikUdupa

Graduate Student, BUT Speech@FIT. Previously, SPIRE Lab, IISc.

Brno, Czechia · Joined April 2013

572 Following · 69 Followers

Sathvik Udupa@SathvikUdupa·3d

@rdesh26 These frameworks can plan and think about a response while listening to a human speaking (and planning). Hence, FD in terms of cognition.

Sathvik Udupa@SathvikUdupa·3d

@rdesh26 and this directly ties to background models/processes - KAME, MoshiRAG, your work, interaction models etc - they all mimic it

Desh Raj@rdesh26·3d

x.com/i/article/2054…

113

12.9K

Sathvik Udupa@SathvikUdupa·5d

@clarejtbirch Thanks for showing us a glimpse of the future :) Personally, I would prefer interacting with an AI with a robotic voice (with good prosody; e.g., Rocky from Project Hail Mary). I would still like to have a clear distinction between humans and any AI I'm speaking with.

clare ❤️‍🔥@clarejtbirch·6d

AI changes us. Thinking Machines exists to build AI tools that increase human participation, preserve dignity across different minds, and move fast without severing society from our slower layers of memory, culture, and care. Interaction models are such a tool: an experiment in real-time, full-duplex, multimodal-native ways of working with AI. Make the computer disappear.

Thinking Machines@thinkymachines

People talk, listen, watch, think, and collaborate at the same time, in real time. We've designed an AI that works with people the same way. We share our approach, early results, and a quick look at our model in action. thinkingmachines.ai/blog/interacti…

135

13.7K

Sathvik Udupa@SathvikUdupa·5d

@rdesh26 Conferencing setups as well as directed speech, third party speech (arxiv.org/pdf/2601.05564). Video provides a lot more information with diarization, gaze and localisation, so I wonder how far we can go with only streaming speech

Desh Raj@rdesh26·5d

@SathvikUdupa Do you mean a conferencing style setup with multiple users and 1 agent?

349

Desh Raj@rdesh26·5d

x.com/i/article/2054…

169

35.4K

Sathvik Udupa@SathvikUdupa·6d

@steeve @kyutai_labs While true, they're not claiming to be first; they refer to Moshi and other work. Different architecture, multimodal, async backend, and shows new capabilities. Always good to see more teams entering the space and contributing new ideas :)

489

Steeve Morin@steeve·6d

This is cool but @kyutai_labs demonstrated this like a year and a half ago. On an iPhone.

Thinking Machines@thinkymachines

16K

Sathvik Udupa reposted

SPIRE Lab@SPIRE_Lab·15 Oct

🚀 Voice Tech For All Challenge is LIVE! Build multilingual, expressive TTS for India’s low-resource languages. 🏆 Prizes worth ₹28.5L! 📅 Register: bit.ly/3KrKDjy 🔗 Learn more: syspin.iisc.ac.in/voicetechforall #VoiceTechForAll #TextToSpeech #AIForAll #Hackathon

140

Sathvik Udupa reposted

arXiv Sound@ArxivSound·6 Mar

"Good practices for evaluation of synthesized speech," Erica Cooper, Sébastien Le Maguer, Esther Klabbers, Junichi Yamagishi, ift.tt/5svQhOH

7.1K

Sathvik Udupa reposted

roon@tszzl·23 Dec

nobody should give or receive any career advice right now. everyone is broadly underestimating the scope and scale of change and the high variance of the future. your L4 engineer buddy at meta telling you “bro cs degrees are cooked” doesn’t know shit

230

576

8.7K

1.3M

Sathvik Udupa reposted

Tanvi Khandelwal@tanvik_2703·25 May

On 23/05/24, I fell victim to a fraud where I was told that my aadhaar was used for illegal parcel via fedex. I lost a sum of INR 31 Lakh. Requesting help from the agencies to please help me recover my lost money. Details in the thread @GoI_MeitY @AshwiniVaishnaw @ThePrintIndia

2.7K

Sathvik Udupa reposted

Journal of Physiology@JPhysiol·10 Apr

This #TechniquesinPhysiology paper from @varmaalok22, Mohini Sengupta, @VatsalaT (@NCBS_Bangalore), @SathvikUdupa and Prasanta Kumar Ghosh (@EE_IISc) introduces a machine-learning tool for identifying bistable states from calcium imaging data! 📜buff.ly/3TSBsdd

4.6K

Sathvik Udupa@SathvikUdupa·20 Mar

We will be chairing the LIMMITS 24 @ieeeICASSP special session on April 17 - 2024.ieeeicassp.org/program-schedu…. Looking forward to presentations and discussions around multilingual TTS with voice cloning!

697

Sathvik Udupa@SathvikUdupa·3 Mar

@arxiv my arxiv submission is on hold for about 3 months now, can this be resolved soon?

Sathvik Udupa@SathvikUdupa·15 Feb

@ieeeICASSP requesting quick resolution regarding documents I've requested for VISA application (it's been 18 days). I received an invitation letter yesterday but the country is different! I have not received the other document I've requested as well! #ICASSP2024

Sathvik Udupa reposted

IISc Bangalore@iiscbangalore·10 Feb

Open Day 2024 is here! ✨ 🏃‍♂️Visit our lush green campus on February 24th between 9 am to 5 pm to learn about the exciting research and activities happening here. 📲 Use the hashtag #IIScOpenDay2024 to get a chance to be featured on our page Details: openday.iisc.ac.in/index.php

152

415

45.5K

Sathvik Udupa@SathvikUdupa·4 Feb

Had a great time attending Google research week India! Thanks for a memorable experience @GoogleIndia @GoogleAI

Sathvik Udupa reposted

Thomas Hain@thomashain·23 Jan

Speech technology is not available for many languages due to a lack of data - and foundation models have a role to play here. If you are interested in this topic, consider contributing to this #interspeech2024 special session sites.google.com/view/is24-ssl-… ! We look forward to your paper!

2.6K

Sathvik Udupa@SathvikUdupa·3 Jan

@_josh_meyer_ So sorry to hear! I've used coqui VITS to train numerous high quality TTS systems, thank you for all your contributions! The legacy will live on!

214

Josh Meyer@_josh_meyer_·3 Jan

Coqui is shutting down. It's sad news to start the new year, but I want to take a minute to recognize everything we accomplished and thank the great people who made it possible.

First things first: the Team

I'm honored to have worked with such brilliant, dedicated, and inspiring individuals. We were a small team, but we left our scratch on the earth's crust. Our accomplishments stand on their own, but when you remember we were just a rag-tag team with limited compute... now that's special. Big tech had orders of magnitude more compute, data, and researchers, but we gave them a run for their money. We didn't just replicate the state-of-the-art... we created it! That wouldn't have been possible without this exact team. We were spread across five continents, native languages, and backgrounds... and we built something great. I'm sure that we built great tech because of that mix of perspectives. I will deeply miss our team, but I'm also excited to see what they do next. Whoever gets them on-board will be a lucky duck :)

What we accomplished

Way back in 2016, it all began as the Machine Learning Group at Mozilla. First was DeepSpeech, then Common Voice and TTS. Crazy how far the field has come since then.

We spun out as Coqui in 2021 in order to add rocket fuel to our mission. One of our biggest accomplishments at Coqui was XTTS. The state-of-the-art took a huge leap forward when we openly released model weights for XTTS v1... and v2 was even better! I'm thrilled to see where AI is heading, and proud that we could make some of that progress available to everyone.

Here's a tiny snapshot of what we accomplished at Coqui:

✅ 2021: Coqui STT v1.0 release. Coqui Model Zoo goes live. SC-GlowTTS released.
✅ 2022: YourTTS goes viral. Tons of open-source releases. Building the team.
✅ 2023: Coqui Studio webapp and API go live. First customers. XTTS open release.

I can confidently say that we pushed the state-of-the-art for generative speech technology... before it was called "generative" :)

Thank you

It took a village to make Coqui possible, and I want to thank everyone who gave us a shot. The real rockstars are the team, as I said above. Thank you!

A huge thanks to the community. You have always been our core. From the Mozilla days on IRC to the current Discord server. The community has contributed, supported, and made building in the open a joy. Thank you all!

Thank you to our investors. Coqui simply wouldn't have been possible without you. You believed in us before anyone else; you took a chance on us. More than just an investment, your thoughtful insights and discussions made Coqui a better company and a better product. I'm extremely grateful for your support. Thank you!

Thank you to our customers. Everything we built was for you, and I hope we managed to give you something you loved. Especially thank you for your feedback: both the good and the bad. We did our best to hear you and build you something better everyday. Thank you!

Lastly, thank you to our partners over the years. It's a long list of great folks I've been lucky enough to collaborate with. We worked on open science, open code, and open models. From joint research to hackathons, it was a blast! To the great folks at HuggingFace, Mozilla, Masakhane, Harvard, Indiana University, Google, MLCommons, Landing AI, NVIDIA, Intel, and Makerere University... thank you! Forgive me if I've left anyone out.

What's next

I can't yet say what comes next... but generative AI in 2024 is going to be bigger than ever. Generative voice will only get better, faster, cheaper, and easier to fine-tune... open-source will be a huge part of that. Speaking of open-source... Coqui TTS is on Github. Do something awesome with it!

Thank you all 💚 github.com/coqui-ai/TTS

151

955

210K

Sathvik Udupa reposted

Priyanshi@_Nanpi·9 Jun

@CPVIndia @Maahanmuuttovir @Opetushallitus As an Incoming International Student at Aalto University, Finland, I have been trying to book an appointment for verification of documents for Finland Residence Permit for Studies since May at @VFSGlobal Delhi, India. (1/4)

602
