Josh Talks

@JoshTalksLive
Josh Talks AI - Datasets and Evaluations for Voice AI in India 🇮🇳 https://t.co/YBqGI96jXN Josh Talks Media - India’s most inspiring stories | 2.5B+ views

Drop 5/14: Introducing Bulbul V3, our latest text-to-speech model. It raises the bar for how human it sounds, while being super robust. In an independent third-party human listening study, Bulbul V3 delivers the highest listener preference and low error rates across use cases and languages. See details in our blog, but first watch the video. sarvam.ai/blogs/bulbul-v3


only the best and SOTA from 🇮🇳


In a blind study conducted by @JoshTalksLive, listeners cast over 20,000 votes comparing Bulbul V3, ElevenLabs (v3 alpha and v2.5 flash), and Cartesia Sonic-3. Bulbul V3 tops the scores for 8 kHz audio, setting a new benchmark for speech synthesis for voice agents.



We’ve spent years collecting & annotating speech data. What became equally important was learning how to evaluate speech models in a way that reflects real listeners, real accents, & real usage. This video explains how we ran this blind evaluation & what @JoshTalksLive is building in speech evaluations.
