Sathvik Udupa

56 posts

@SathvikUdupa

Graduate Student, BUT Speech@FIT. Previously, SPIRE Lab, IISc.

Brno, Czechia · Joined April 2013
572 Following · 69 Followers
Sathvik Udupa@SathvikUdupa·
@rdesh26 These frameworks can plan and think about a response while listening to a human who is speaking (and planning). Hence, full-duplex (FD) in terms of cognition.
English
0
0
1
45
Sathvik Udupa@SathvikUdupa·
@rdesh26 and this directly ties to background models/processes - KAME, MoshiRAG, your work, interaction models etc - they all mimic it
English
1
0
0
44
Sathvik Udupa@SathvikUdupa·
@clarejtbirch Thanks for showing us a glimpse of the future :) Personally, I would prefer interacting with an AI with a robotic voice (with good prosody; e.g., Rocky from Project Hail Mary). I would still like to have a clear distinction between humans and any AI I'm speaking with.
English
0
0
0
40
clare ❤️‍🔥@clarejtbirch·
AI changes us. Thinking Machines exists to build AI tools that increase human participation, preserve dignity across different minds, and move fast without severing society from our slower layers of memory, culture, and care. Interaction models are such a tool: an experiment in real-time, full-duplex, multimodal-native ways of working with AI. Make the computer disappear.
Thinking Machines@thinkymachines

People talk, listen, watch, think, and collaborate at the same time, in real time. We've designed an AI that works with people the same way. We share our approach, early results, and a quick look at our model in action. thinkingmachines.ai/blog/interacti…

English
9
10
135
13.7K
Sathvik Udupa@SathvikUdupa·
@rdesh26 Conferencing setups as well as directed speech, third party speech (arxiv.org/pdf/2601.05564). Video provides a lot more information with diarization, gaze and localisation, so I wonder how far we can go with only streaming speech
English
1
0
0
57
Desh Raj@rdesh26·
@SathvikUdupa Do you mean a conferencing style setup with multiple users and 1 agent?
English
1
0
0
349
Sathvik Udupa@SathvikUdupa·
@steeve @kyutai_labs While true, they're not claiming to be first; they refer to Moshi and other work. It has a different architecture, is multimodal, uses an async backend, and shows new capabilities. Always good to see more teams entering the space and contributing new ideas :)
English
0
0
4
489
Sathvik Udupa retweeted
arXiv Sound@ArxivSound·
"Good practices for evaluation of synthesized speech," Erica Cooper, Sébastien Le Maguer, Esther Klabbers, Junichi Yamagishi, ift.tt/5svQhOH
English
0
12
47
7.1K
Sathvik Udupa retweeted
roon@tszzl·
nobody should give or receive any career advice right now. everyone is broadly underestimating the scope and scale of change and the high variance of the future. your L4 engineer buddy at meta telling you “bro cs degrees are cooked” doesn’t know shit
English
230
576
8.7K
1.3M
Sathvik Udupa retweeted
Tanvi Khandelwal@tanvik_2703·
On 23/05/24, I fell victim to a fraud in which I was told that my Aadhaar had been used for an illegal parcel via FedEx. I lost INR 31 lakh. I request the agencies' help in recovering my lost money. Details in the thread @GoI_MeitY @AshwiniVaishnaw @ThePrintIndia
English
4
19
14
2.7K
Sathvik Udupa@SathvikUdupa·
@arxiv My arXiv submission has been on hold for about 3 months now. Can this be resolved soon?
English
0
0
1
96
Sathvik Udupa@SathvikUdupa·
@ieeeICASSP Requesting quick resolution regarding the documents I requested for my visa application (it's been 18 days). I received an invitation letter yesterday, but it names a different country! I have not received the other document I requested either! #ICASSP2024
English
0
0
0
61
Sathvik Udupa retweeted
IISc Bangalore@iiscbangalore·
Open Day 2024 is here! ✨ 🏃‍♂️Visit our lush green campus on February 24th between 9 am to 5 pm to learn about the exciting research and activities happening here. 📲 Use the hashtag #IIScOpenDay2024 to get a chance to be featured on our page Details: openday.iisc.ac.in/index.php
English
1
152
415
45.5K
Sathvik Udupa retweeted
Thomas Hain@thomashain·
Speech technology is not available for many languages due to a lack of data, and foundation models have a role to play here. If you are interested in this topic, consider contributing to this #interspeech2024 special session: sites.google.com/view/is24-ssl-… We look forward to your paper!
English
0
10
36
2.6K
Sathvik Udupa@SathvikUdupa·
@_josh_meyer_ So sorry to hear! I've used coqui VITS to train numerous high quality TTS systems, thank you for all your contributions! The legacy will live on!
English
0
0
1
214
Josh Meyer@_josh_meyer_·
Coqui is shutting down. It's sad news to start the new year, but I want to take a minute to recognize everything we accomplished and thank the great people who made it possible.

First things first: the Team

I'm honored to have worked with such brilliant, dedicated, and inspiring individuals. We were a small team, but we left our scratch on the earth's crust. Our accomplishments stand on their own, but when you remember we were just a rag-tag team with limited compute... now that's special. Big tech had orders of magnitude more compute, data, and researchers, but we gave them a run for their money. We didn't just replicate the state-of-the-art... we created it! That wouldn't have been possible without this exact team. We were spread across five continents, native languages, and backgrounds... and we built something great. I'm sure that we built great tech because of that mix of perspectives. I will deeply miss our team, but I'm also excited to see what they do next. Whoever gets them on-board will be a lucky duck :)

What we accomplished

Way back in 2016, it all began as the Machine Learning Group at Mozilla. First was DeepSpeech, then Common Voice and TTS. Crazy how far the field has come since then. We spun out as Coqui in 2021 in order to add rocket fuel to our mission. One of our biggest accomplishments at Coqui was XTTS. The state-of-the-art took a huge leap forward when we openly released model weights for XTTS v1... and v2 was even better! I'm thrilled to see where AI is heading, and proud that we could make some of that progress available to everyone.

Here's a tiny snapshot of what we accomplished at Coqui:
✅ 2021: Coqui STT v1.0 release. Coqui Model Zoo goes live. SC-GlowTTS released.
✅ 2022: YourTTS goes viral. Tons of open-source releases. Building the team.
✅ 2023: Coqui Studio webapp and API go live. First customers. XTTS open release.

I can confidently say that we pushed the state-of-the-art for generative speech technology... before it was called "generative" :)

Thank you

It took a village to make Coqui possible, and I want to thank everyone who gave us a shot. The real rockstars are the team, as I said above. Thank you!

A huge thanks to the community. You have always been our core. From the Mozilla days on IRC to the current Discord server. The community has contributed, supported, and made building in the open a joy. Thank you all!

Thank you to our investors. Coqui simply wouldn't have been possible without you. You believed in us before anyone else; you took a chance on us. More than just an investment, your thoughtful insights and discussions made Coqui a better company and a better product. I'm extremely grateful for your support. Thank you!

Thank you to our customers. Everything we built was for you, and I hope we managed to give you something you loved. Especially thank you for your feedback: both the good and the bad. We did our best to hear you and build you something better everyday. Thank you!

Lastly, thank you to our partners over the years. It's a long list of great folks I've been lucky enough to collaborate with. We worked on open science, open code, and open models. From joint research to hackathons, it was a blast! To the great folks at HuggingFace, Mozilla, Masakhane, Harvard, Indiana University, Google, MLCommons, Landing AI, NVIDIA, Intel, and Makerere University... thank you! Forgive me if I've left anyone out.

What's next

I can't yet say what comes next... but generative AI in 2024 is going to be bigger than ever. Generative voice will only get better, faster, cheaper, and easier to fine-tune... open-source will be a huge part of that. Speaking of open-source... Coqui TTS is on Github. Do something awesome with it!

Thank you all 💚 github.com/coqui-ai/TTS
English
151
98
955
210K
Sathvik Udupa retweeted
Priyanshi@_Nanpi·
@CPVIndia @Maahanmuuttovir @Opetushallitus As an Incoming International Student at Aalto University, Finland, I have been trying to book an appointment for verification of documents for Finland Residence Permit for Studies since May at @VFSGlobal Delhi, India. (1/4)
English
4
8
7
602