Arjun Desai (@jundesai) - Twitter Profile

Arjun Desai retweeted

Engram@EngramLab·1d

x.com/i/article/2069…

Arjun Desai@jundesai·15 Jun

It always felt awkward to me that models offer a choice between quality and speed. The world’s best human collaborators are perceptive, responsive, and fast. Why can’t our models be? Research should be about solving the fundamental limitations to unlock unimaginable experiences. This is what @cartesia has always been about. Awesome to see these breakthroughs powering Sonic-3.5 and Ink-2.

Karan Goel@krandiash

We released Sonic-3.5 and Ink-2, the #1 streaming models for text to speech and speech to text you can use in your voice agents today. New architectures enable new frontiers for speed and quality. We're now the only provider to have #1 models for both speaking and listening.

Arjun Desai@jundesai·15 Jun

@krandiash CARTESIA

Arjun Desai retweeted

Karan Goel@krandiash·15 Jun

We released Sonic-3.5 and Ink-2, the #1 streaming models for text to speech and speech to text you can use in your voice agents today. New architectures enable new frontiers for speed and quality. We're now the only provider to have #1 models for both speaking and listening.

Arjun Desai@jundesai·15 Jun

@animeshbohara legendary

Animesh Bohara@animeshbohara·15 Jun

Some teams train good models. Some train fast ones. We don't think you should have to choose. Today at @cartesia we shipped two: Sonic 3.5 (speaking) + Ink 2 (listening). Both SOTA, both realtime. 90ms TTFA. 3.6% WER, #1 on AA. Try them out at play.cartesia.ai

Arjun Desai@jundesai·15 Jun

@krandiash @bclyang @_albertgu nice!

Karan Goel@krandiash·15 Jun

@bclyang @_albertgu me too (but barely)

Albert Gu@_albertgu·15 Jun

Within the span of a week, we launched streaming TTS (text-to-speech) and STT (speech-to-text) models that topped the leaderboards. I'm incredibly proud of the research team for their relentless pursuit of improvement, which has unlocked new state-of-the-art audio models on the Pareto frontier of speed and quality. As a research problem, speech requires fusing both text and audio and is the gateway to general multimodal models. We built Sonic-3.5 and Ink-2 from the ground up, developing multiple innovations along the way in a direction that will scale to general real-time intelligence. I've personally been deeply involved in building these models and more; it's been a blast working with the incredibly talented research team here @cartesia, and I can't wait to show the world what's coming next :)

Karan Goel@krandiash

We released Sonic-3.5 and Ink-2, the #1 streaming models for text to speech and speech to text you can use in your voice agents today. New architectures enable new frontiers for speed and quality. We're now the only provider to have #1 models for both speaking and listening.

Arjun Desai@jundesai·15 Jun

@bclyang @_albertgu nice!

Brandon Yang@bclyang·15 Jun

@_albertgu i'd like everyone to know i was also involved

Arjun Desai@jundesai·15 Jun

@lulu32125 love this

Lucy Liu@lulu32125·15 Jun

Sonic 3.5 brings expressive, natural speech. Ink 2 brings contextual endpointing that makes conversations flow naturally. No more choosing between quality and speed. No more stitching together multiple providers. Just a complete voice stack.

Arjun Desai@jundesai·15 Jun

@MeetAiqi @elipughresearch also love that it's open source

Aiqi Liu@MeetAiqi·15 Jun

When @elipughresearch and the team dropped Ink-2, I had to see if this SOTA Speech-to-Text model lived up to the hype. So, I built a dictation app to find out. To my delight, it’s incredibly fast & accurate. You can just "ink it." InkIt is now free and open-source. Try it, fork it, and make it yours! 👇

Arjun Desai@jundesai·15 Jun

@MeetAiqi @elipughresearch Already been using it, it's insane for my productivity

Arjun Desai@jundesai·15 Jun

@nahuum_maru awesome work @nahuum_maru !

Nahum Maru@nahuum_maru·15 Jun

i worked a bunch on Ink-2 and am very excited for it to be released!! very proud of the team 🔥🔥🔥

Karan Goel@krandiash

We released Sonic-3.5 and Ink-2, the #1 streaming models for text to speech and speech to text you can use in your voice agents today. New architectures enable new frontiers for speed and quality. We're now the only provider to have #1 models for both speaking and listening.

Arjun Desai@jundesai·15 Jun

@Brandon24784864 @cartesia 🤫

Brandon Chen@Brandon24784864·15 Jun

fun fact about @cartesia's sota tts and asr models: we had the models in house for several months, but we decided it was too dangerous to release such advanced voice capabilities to the public... until now. try them now.

Karan Goel@krandiash

We released Sonic-3.5 and Ink-2, the #1 streaming models for text to speech and speech to text you can use in your voice agents today. New architectures enable new frontiers for speed and quality. We're now the only provider to have #1 models for both speaking and listening.

Arjun Desai@jundesai·15 Jun

@DanielePaliotta lfg

Daniele Paliotta@DanielePaliotta·15 Jun

Need TTS? We are SOTA! Need ASR? We are SOTA too ❤️ Our models are now the best at speaking, and listening. Try Cartesia Sonic 3.5 and Ink 2: cartesia.ai/launch

Arjun Desai@jundesai·15 Jun

@krandiash so much more coming

Arjun Desai retweeted

Zubin Pratap@ZubinPratap·11 Jun

Give recruiters a phone number that talks back in your voice. A Conversational AI Voice Agent that knows your career history, handles interruptions, and tells your story. This one is for non-coders. 5 mins. Build with @cartesia + @claudeai. Remember: Voice AI agents are much more than just TTS and STT; it’s everything your agent has to handle while listening to you. Video 👇

Arjun Desai@jundesai·28 May

docs: docs.cartesia.ai/build-with-car…

Arjun Desai@jundesai·28 May

Building models from first principles allows you to think about the capabilities that actually matter for intelligence. Ink-2 is our first native streaming ASR model built with support for the everyday, real-world conversation — low latency, endpointing, and accuracy. exciting to see these results from AA. more to come soon.

Cartesia@cartesia

Cartesia Ink-2 debuts as #1 for accuracy on the brand-new streaming speech-to-text leaderboard from @ArtificialAnlys! We designed Ink-2 from the ground up for voice agents - with low latency, eager transcripts, and semantic endpointing.

Arjun Desai@jundesai·28 May

try it out today: play.cartesia.ai

Arjun Desai@jundesai·22 May

Kudos to the @cartesia TTS team for this feat!

Artificial Analysis@ArtificialAnlys

Cartesia’s Sonic-3.5 takes the #1 spot on the Artificial Analysis Speech Arena Leaderboard, surpassing Inworld Realtime TTS 1.5 Max and Google’s Gemini 3.1 Flash TTS.

Sonic-3.5 is the latest TTS model from @cartesia. It supports 42 languages, including 9 Indian languages, with 500+ voices available out of the box. The model has been highly preferred among voters in the TTS Arena, with its demonstrated naturalness and accurate transcript following.

Key takeaways:
➤ Quality: Sonic-3.5 has an Elo score of 1,218 (+16/-16) based on 1,144 arena appearances, placing it ahead of Inworld Realtime TTS 1.5 Max at 1,194 and Gemini 3.1 Flash TTS at 1,209
➤ Pricing: Sonic-3.5 is priced at $39/1M characters, a premium compared to Gemini 3.1 Flash TTS at $18.3/1M characters, and Inworld Realtime TTS 1.5 Max at $35/1M characters
➤ Speed: 105.5 characters per second, compared to 205 characters per second for Inworld Realtime TTS 1.5 Max and 26.3 characters per second for Gemini 3.1 Flash TTS

See more details and listen to samples below 🧵
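
To make the quality, pricing, and speed figures in the post above easier to compare, here is a rough Python sketch. The numbers are copied from the post. The helper names (`cost_usd`, `synthesis_seconds`, `elo_expected_win`) are hypothetical, and the win-probability function assumes the textbook Elo formula on a 400-point scale, which the post does not confirm the arena uses.

```python
# Figures quoted in the Artificial Analysis post; nothing here is measured independently.
MODELS = {
    "Sonic-3.5": {"elo": 1218, "usd_per_1m_chars": 39.0, "chars_per_sec": 105.5},
    "Inworld Realtime TTS 1.5 Max": {"elo": 1194, "usd_per_1m_chars": 35.0, "chars_per_sec": 205.0},
    "Gemini 3.1 Flash TTS": {"elo": 1209, "usd_per_1m_chars": 18.3, "chars_per_sec": 26.3},
}


def cost_usd(model: str, n_chars: int) -> float:
    """List-price cost of synthesizing n_chars characters."""
    return MODELS[model]["usd_per_1m_chars"] * n_chars / 1_000_000


def synthesis_seconds(model: str, n_chars: int) -> float:
    """Time to generate n_chars at the quoted characters-per-second rate."""
    return n_chars / MODELS[model]["chars_per_sec"]


def elo_expected_win(model_a: str, model_b: str) -> float:
    """Expected preference rate of model_a over model_b.

    Assumption: standard Elo logistic curve with a 400-point scale.
    """
    diff = MODELS[model_b]["elo"] - MODELS[model_a]["elo"]
    return 1.0 / (1.0 + 10 ** (diff / 400))


if __name__ == "__main__":
    n = 50_000  # hypothetical workload size, in characters
    for name in MODELS:
        print(f"{name}: ${cost_usd(name, n):.2f}, {synthesis_seconds(name, n):.0f} s")
    print(f"{elo_expected_win('Sonic-3.5', 'Gemini 3.1 Flash TTS'):.3f}")
```

Under that Elo assumption, the rating gaps in the post translate to preference rates only slightly above 50%, which is consistent with the quoted +16/-16 interval being comparable in size to the gaps themselves.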

Arjun Desai retweeted

Brad Menezes@bradmenezes·14 Apr

Introducing Superblocks 2.0: AI-generated enterprise apps – finally under IT control.

Vibe-coded apps just became the #1 attack vector in the enterprise. Business teams are building on production data, while IT has zero visibility. No reviews. No audits. No permissions. No control.

AI hackers are about to get 100x better. Anthropic proved it with Mythos.

Superblocks 2.0 is the only platform to take back control:
> Business teams build AI-powered apps with permissions baked in.
> IT and Security can audit everything and lock down anything, instantly.
> Engineering sets the standards. Every app follows them.

Instacart, SoFi, and LinkedIn run Superblocks in production today. And larger organizations we can't yet name are too:

A Fortune 500 just shut down 2,500 Replit users to standardize on Superblocks, running the platform air-gapped in their AWS environment.

A 150,000-employee global services firm replaced Lovable with Superblocks to unlock AI-built apps on restricted internal systems.

Every IT leader we’ve demoed to using Replit, Lovable or v0 asked for early access. Today we open access to the world.

The genie is out of the bottle on employee vibe coding. Let it run wild, or take back control – superblocks.com
