Pratyush Kumar (@pratykumar)
Drop 2/14: Sarvam Audio, an efficient state-space-based audio language model that sets new benchmarks in speech recognition for Indian languages. It significantly outperforms Gemini 3 and GPT-4o Transcribe across a range of benchmarks. See details in our blog: sarvam.ai/blogs/sarvam-a… @SarvamAI
Pratyush Kumar (@pratykumar)
Sarvam Audio is trained on top of Sarvam 3B, a large language model trained from scratch by @SarvamAI that supports 22 Indian languages and English. The model excels at fine-grained control of transcripts across different formats and achieves the lowest word error rates across languages.
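For readers unfamiliar with the metric, word error rate (WER) is conventionally the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. The sketch below is a minimal illustration of that standard definition, not Sarvam's evaluation code; the function name `word_error_rate` and the plain whitespace tokenization are assumptions, and real benchmarks usually apply language-specific text normalization first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words.

    Hypothetical helper for illustration; tokenizes on whitespace only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.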
Pratyush Kumar (@pratykumar)
Sarvam Audio works well for multi-speaker transcription, with high-quality separation between speakers, and achieves the lowest diarisation error rates across languages.
Pratyush Kumar (@pratykumar)
Sarvam Audio's transcription improves when it is given the context of a conversation, which helps it correctly capture the user's intent.