
Cohere just took #1 on the Hugging Face Open ASR Leaderboard. First speech model. 5.42% WER. Open source.
━━━━━━━━━━━━━━━━━━━
Cohere Transcribe. Their first speech-to-text model — and it immediately tops the English accuracy charts.
→ 5.42% word error rate, #1 on the Hugging Face Open ASR Leaderboard
→ Validated by human evaluation, not just automated benchmarks
→ One of the strongest accuracy-to-speed ratios in its size class
→ Minutes of audio → usable transcripts in seconds
→ Open source — download weights directly from Hugging Face
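
What does a 5.42% WER actually mean? Word error rate is the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference, divided by the number of reference words. So roughly 5 to 6 errors per 100 words spoken. Here is a minimal sketch of that calculation. The function name `word_error_rate` and the sample sentences are illustrative assumptions, not Cohere's or the leaderboard's code, and real benchmarks usually apply more text normalization than the simple lowercasing used here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference word count.

    Hypothetical helper for illustration only; it lowercases and splits on
    whitespace, which is simpler than the normalization a benchmark would use.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion (word missing from transcript)
                curr[j - 1] + 1,                  # insertion (extra word in transcript)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    reference = "the quick brown fox jumps over the lazy dog"
    transcript = "the quick brown fox jumped over lazy dog"
    # One substitution (jumps -> jumped) plus one deletion (the) over 9 words.
    print(f"WER: {word_error_rate(reference, transcript):.2%}")
```

In this toy example, two errors across nine reference words gives a WER of about 22%, which shows how far that is from 5.42%.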
━━━━━━━━━━━━━━━━━━━
The bigger picture:
This isn't a standalone model release. Cohere is building toward full enterprise speech intelligence inside North — their agentic AI orchestration platform.
Translation: your AI agent will soon listen, transcribe, reason, and act — all within one enterprise platform. Transcribe is the ears.
The ASR space just got very crowded very fast. In the last few months: Mistral Voxtral, IBM Granite Speech, ElevenLabs Scribe, and now Cohere Transcribe — all pushing open-source ASR past what Whisper could do.
🔗 Blog: cohere.com/blog/transcribe
🔗 Model: lnkd.in/ggfeZye5
Building enterprise speech pipelines — transcription, voice agents, real-time audio processing — on-premise? That's what we do at Zingaro AI and LiteCompute AI. DM me.
♻️ Repost if useful.
Follow Pasha S for daily open-source AI drops.
