Shrey Gupta

Shrey Gupta retweeted

Arjun Desai @jundesai · Jun 15

It always felt awkward to me that models offer a choice between quality and speed. The world’s best human collaborators are perceptive, responsive, and fast. Why can’t our models be? Research should be about solving the fundamental limitations to unlock unimaginable experiences. This is what @cartesia has always been about. Awesome to see these breakthroughs powering Sonic-3.5 and Ink-2.

We released Sonic-3.5 and Ink-2, the #1 streaming models for text to speech and speech to text you can use in your voice agents today. New architectures enable new frontiers for speed and quality. We're now the only provider to have #1 models for both speaking and listening.

Shrey Gupta @Shrey2809 · Jun 15

@TobiasKatsch Maybe soon we cook more things... 😶🤫

Tobias Katsch @TobiasKatsch · Jun 15

the journey to SOTA at cartesia has been truly amazing. I am incredibly proud of the team!

Shrey Gupta @Shrey2809 · Jun 15

@cartesia Nothing but 🥇

Shrey Gupta retweeted

Cartesia @cartesia · Jun 15

Two new models just dropped 👀 Sonic-3.5 and Ink-2 are the #1 streaming models for text to speech and speech to text

Shrey Gupta @Shrey2809 · Jun 15

@rbuit_ @LewisHamilton @cartesia 2.4s pit stop is slow, time to speed it up lol

Ricardo Buitrago @rbuit_ · Jun 15

@Shrey2809 @LewisHamilton @cartesia if @LewisHamilton had Shrey on his team he would have waited 90ms for a win

Shrey Gupta @Shrey2809 · Jun 15

686 days. that's how long @LewisHamilton waited for a win and he got his first in red this weekend, in Spain. bold bets take time. then they land all at once. @cartesia made one too: be the best in voice on one provider. Sonic 3.5 + Ink 2, both #1.

Shrey Gupta @Shrey2809 · Jun 15

@gracejkim9 @dhwanigargg Time to try if we can transcribe the album now using ink-2, I need lyrics

Grace Kim @gracejkim9 · Jun 15

@dhwanigargg missed my olivia rodrigo album listening party for this

Shrey Gupta retweeted

Pipecat AI @pipecat_ai · Jun 15

👏🏽 Try them today with the Pipecat CLI

Shrey Gupta retweeted

Eli @elipughresearch · Jun 15

InkIt is insanely good, it's been a nice keyboard replacement for a decent chunk of my work, and it's ~200ms, where most others are >5 seconds :) github.com/cartesia-ai/In…

Aiqi Liu@MeetAiqi

When @elipughresearch and the team dropped Ink-2, I had to see if this SOTA Speech-to-Text model lived up to the hype. So, I built a dictation app to find out. To my delight, it’s incredibly fast & accurate. You can just "ink it." InkIt is now free and open-source. Try it, fork it, and make it yours! 👇

Shrey Gupta retweeted

Albert Gu @_albertgu · Jun 15

Within the span of a week, we launched streaming TTS (text-to-speech) and STT (speech-to-text) models that topped the leaderboards. I'm incredibly proud of the research team for their relentless pursuit of improvement, which has unlocked new state-of-the-art audio models on the Pareto frontier of speed and quality. As a research problem, speech requires fusing both text and audio and is the gateway to general multimodal models. We built Sonic-3.5 and Ink-2 from the ground up, developing multiple innovations along the way in a direction that will scale to general real-time intelligence. I've personally been deeply involved in building these models and more; it's been a blast working with the incredibly talented research team here @cartesia, and I can't wait to show the world what's coming next :)

Shrey Gupta @Shrey2809 · Jun 15

See the original launch post here x.com/krandiash/stat…

Shrey Gupta @Shrey2809 · Jun 15

Try it now at: cartesia.ai/launch

Aiqi Liu @MeetAiqi · Jun 15

When @elipughresearch and the team dropped Ink-2, I had to see if this SOTA Speech-to-Text model lived up to the hype. So, I built a dictation app to find out. To my delight, it’s incredibly fast & accurate. You can just "ink it." InkIt is now free and open-source. Try it, fork it, and make it yours! 👇

Shrey Gupta @Shrey2809 · Jun 15

@MeetAiqi @elipughresearch best product ever 🔥

Shrey Gupta @Shrey2809 · Jun 15

@2107ayush When we cooking the new kernels 😂

Ayush Sachdeva @2107ayush · Jun 15

cartesia.ai

Ayush Sachdeva @2107ayush · Jun 15

You shouldn't have to trade speed for quality. With our new ASR and TTS models, you don't. Ink-2 (ASR) and Sonic-3.5 (TTS) just debuted at #1 on their respective leaderboards across both accuracy and latency. Try them out!

Shrey Gupta retweeted

Daniele Paliotta @DanielePaliotta · Jun 15

Need TTS? We are SOTA! Need ASR? We are SOTA too ❤️ Our models are now the best at speaking, and listening. Try Cartesia Sonic 3.5 and Ink 2: cartesia.ai/launch

Shrey Gupta @Shrey2809 · Jun 15

the dead-air dance: you stop, it waits, you start, it interrupts, you both freeze. it's the thing that makes every voice bot feel like a bot. we just killed it 👇

We released Sonic-3.5 and Ink-2, the #1 streaming models for text to speech and speech to text you can use in your voice agents today. New architectures enable new frontiers for speed and quality. We're now the only provider to have #1 models for both speaking and listening.