Shrey Gupta

29 posts

@Shrey2809

ML @mldcmu

Joined May 2021
158 Following · 70 Followers
Shrey Gupta retweeted
Arjun Desai (@jundesai)
It always felt awkward to me that models offer a choice between quality and speed. The world’s best human collaborators are perceptive, responsive, and fast. Why can’t our models be? Research should be about solving the fundamental limitations to unlock unimaginable experiences. This is what @cartesia has always been about. Awesome to see these breakthroughs powering Sonic-3.5 and Ink-2.
Karan Goel (@krandiash)

We released Sonic-3.5 and Ink-2, the #1 streaming models for text to speech and speech to text you can use in your voice agents today. New architectures enable new frontiers for speed and quality. We're now the only provider to have #1 models for both speaking and listening.

2 replies · 8 reposts · 45 likes · 6.2K views
Tobias Katsch (@TobiasKatsch)
the journey to SOTA at cartesia has been truly amazing. I am incredibly proud of the team!
Karan Goel (@krandiash)

We released Sonic-3.5 and Ink-2, the #1 streaming models for text to speech and speech to text you can use in your voice agents today. New architectures enable new frontiers for speed and quality. We're now the only provider to have #1 models for both speaking and listening.

1 reply · 0 reposts · 16 likes · 938 views
Shrey Gupta retweeted
Cartesia (@cartesia)
Two new models just dropped 👀 Sonic-3.5 and Ink-2 are the #1 streaming models for text to speech and speech to text
Karan Goel (@krandiash)

We released Sonic-3.5 and Ink-2, the #1 streaming models for text to speech and speech to text you can use in your voice agents today. New architectures enable new frontiers for speed and quality. We're now the only provider to have #1 models for both speaking and listening.

10 replies · 23 reposts · 100 likes · 14.4K views
Shrey Gupta (@Shrey2809)
686 days. that's how long @LewisHamilton waited for a win and he got his first in red this weekend, in Spain. bold bets take time. then they land all at once. @cartesia made one too: be the best in voice on one provider. Sonic 3.5 + Ink 2, both #1.
6 replies · 2 reposts · 23 likes · 389 views
Shrey Gupta retweeted
Pipecat AI (@pipecat_ai)
👏🏽 Try them today with the Pipecat CLI
Karan Goel (@krandiash)

We released Sonic-3.5 and Ink-2, the #1 streaming models for text to speech and speech to text you can use in your voice agents today. New architectures enable new frontiers for speed and quality. We're now the only provider to have #1 models for both speaking and listening.

2 replies · 1 repost · 7 likes · 1.7K views
Shrey Gupta retweeted
Eli (@elipughresearch)
InkIt is insanely good, it's been a nice keyboard replacement for a decent chunk of my work, and it's ~200ms, where most others are >5 seconds :) github.com/cartesia-ai/In…
Aiqi Liu (@MeetAiqi)

When @elipughresearch and the team dropped Ink-2, I had to see if this SOTA Speech-to-Text model lived up to the hype. So, I built a dictation app to find out. To my delight, it’s incredibly fast & accurate. You can just "ink it." InkIt is now free and open-source. Try it, fork it, and make it yours! 👇

1 reply · 3 reposts · 28 likes · 683 views
Shrey Gupta retweeted
Albert Gu (@_albertgu)
Within the span of a week, we launched streaming TTS (text-to-speech) and STT (speech-to-text) models that topped the leaderboards. I'm incredibly proud of the research team for their relentless pursuit of improvement, which has unlocked new state-of-the-art audio models on the Pareto frontier of speed and quality. As a research problem, speech requires fusing both text and audio and is the gateway to general multimodal models. We built Sonic-3.5 and Ink-2 from the ground up, developing multiple innovations along the way in a direction that will scale to general real-time intelligence. I've personally been deeply involved in building these models and more; it's been a blast working with the incredibly talented research team here @cartesia, and I can't wait to show the world what's coming next :)
Karan Goel (@krandiash)

We released Sonic-3.5 and Ink-2, the #1 streaming models for text to speech and speech to text you can use in your voice agents today. New architectures enable new frontiers for speed and quality. We're now the only provider to have #1 models for both speaking and listening.

3 replies · 16 reposts · 179 likes · 20.3K views
Aiqi Liu (@MeetAiqi)
When @elipughresearch and the team dropped Ink-2, I had to see if this SOTA Speech-to-Text model lived up to the hype. So, I built a dictation app to find out. To my delight, it’s incredibly fast & accurate. You can just "ink it." InkIt is now free and open-source. Try it, fork it, and make it yours! 👇
69 replies · 65 reposts · 157 likes · 26.7K views
Ayush Sachdeva (@2107ayush)
You shouldn't have to trade speed for quality. With our new ASR and TTS models, you don't. Ink-2 (ASR) and Sonic-3.5 (TTS) just debuted at #1 on their respective leaderboards across both accuracy and latency. Try them out!
1 reply · 3 reposts · 46 likes · 1.8K views
Shrey Gupta retweeted
Daniele Paliotta (@DanielePaliotta)
Need TTS? We are SOTA! Need ASR? We are SOTA too ❤️ Our models are now the best at speaking, and listening. Try Cartesia Sonic 3.5 and Ink 2: cartesia.ai/launch
8 replies · 1 repost · 38 likes · 1.3K views
Shrey Gupta (@Shrey2809)
the dead-air dance: you stop, it waits, you start, it interrupts, you both freeze. it's the thing that makes every voice bot feel like a bot. we just killed it 👇
Karan Goel (@krandiash)

We released Sonic-3.5 and Ink-2, the #1 streaming models for text to speech and speech to text you can use in your voice agents today. New architectures enable new frontiers for speed and quality. We're now the only provider to have #1 models for both speaking and listening.

0 replies · 0 reposts · 11 likes · 182 views
Shrey Gupta retweeted
Karan Goel (@krandiash)
We released Sonic-3.5 and Ink-2, the #1 streaming models for text to speech and speech to text you can use in your voice agents today. New architectures enable new frontiers for speed and quality. We're now the only provider to have #1 models for both speaking and listening.
725 replies · 597 reposts · 2.6K likes · 7M views