Diffio AI

26 posts

@diffioai

Audio restoration for podcasts. Visit https://t.co/KxDOlns8Wq for more.

Colorado Springs · Joined November 2025

5 Following · 6 Followers

Diffio AI@diffioai·6d

@IndianaJ9000 Yeah. Any help is appreciated.

Indiana Jones@IndianaJ9000·29 Mar

@diffioai Really interesting idea behind Diffio. I’m a SaaS writer and a few content ideas came to mind while checking it out — happy to share if useful.

Diffio AI@diffioai·29 Mar

More word alignment backends added to github.com/Diffio-AI/Cohe…

Diffio AI@diffioai·29 Mar

@viraataryabumi @Julianfmack @TheOneKloud github.com/Diffio-AI/Cohe… WhisperX style interface for cohere. Has VAD (cohere recommends) and word alignment which is always helpful. I also added automatic language detection.

Viraat Aryabumi@viraataryabumi·26 Mar

state-of-the-art ✅ open weights ✅ apache 2.0 ✅ multilingual ✅ we cooked. tip my hat to @Julianfmack and @TheOneKloud! legends! 🐐

Cohere@cohere

Introducing: Cohere Transcribe – a new state-of-the-art in open source speech recognition.

Diffio AI@diffioai·29 Mar

@tomaarsen @cohere github.com/Diffio-AI/Cohe… WhisperX style interface for cohere. Has VAD (cohere recommends) and word alignment which is always helpful. I also added automatic language detection.

tomaarsen@tomaarsen·26 Mar

This is a very solid release! Apache 2.0 as well, 2B parameters (i.e. quite runnable), 14 languages, and supported using Transformers already. Great work @cohere 👏

Cohere@cohere

Introducing: Cohere Transcribe – a new state-of-the-art in open source speech recognition.

Diffio AI@diffioai·29 Mar

@TheOneKloud github.com/Diffio-AI/Cohe… WhisperX style interface for cohere. Has VAD (cohere recommends) and word alignment which is always helpful. I also added automatic language detection.

Pierre Richemond 🇪🇺@TheOneKloud·26 Mar

Blog post : cohere.com/blog/transcribe Technical blog post : huggingface.co/blog/CohereLab… Model card : huggingface.co/CohereLabs/coh…

Pierre Richemond 🇪🇺@TheOneKloud

Excited and proud to introduce our latest: Cohere Transcribe, the best dedicated ASR model in the world. #1 EN HF leaderboard, SotA human evals, ahead of ElevenLabs, Qwen3, Mistral, Kyutai, and OpenAI. 14 supported languages. Apache 2.0, on HF for you to try. Our first audio model and a key step in powering North experiences. huggingface.co/CohereLabs/coh…

Diffio AI@diffioai·29 Mar

@fahdmirza github.com/Diffio-AI/Cohe… WhisperX style interface for cohere. Has VAD (cohere recommends) and word alignment which is always helpful. I also added automatic language detection.

Fahd Mirza@fahdmirza·28 Mar

💥 Cohere Transcribe is HERE and it's OPEN SOURCE 🎙️ ♠ A free 2B parameter ASR model you can run locally — Audio In, Text Out 🚀 🔹 14 languages supported: English, French, German, Arabic, Japanese, Korean & more 🔹 Conformer architecture — built from scratch for speech, not repurposed 🔹 Up to 3× faster real-time factor than other dedicated ASR models of the same size 🔹 Apache 2.0 license — fully free, no strings attached 🔹 Works on your own GPU, your own data, no API calls, no cost 🔥 Watch the full demo below 👇

Diffio AI@diffioai·29 Mar

@ClementDelangue @nickfrosst @huggingface @cohere github.com/Diffio-AI/Cohe… WhisperX style interface for cohere. Has VAD (cohere recommends) and word alignment which is always helpful. I also added automatic language detection.

clem 🤗@ClementDelangue·27 Mar

Got to meet @nickfrosst in Miami today to celebrate their awesome release of an open-source Apache 2.0 Transcribe model that could be a whisper killer and already trending on @huggingface! @cohere deserves much more visibility in the community as one of the leaders of North American open-source!

Diffio AI@diffioai·29 Mar

@jaxson @cohere github.com/Diffio-AI/Cohe… WhisperX style interface for cohere. Has VAD (cohere recommends) and word alignment which is always helpful. I also added automatic language detection.

Jaxson Khan@jaxson·27 Mar

If you regularly transcribe audio, @cohere Transcribe was just released - it's a free, open-source model that runs locally and is definitely worth checking out. I ran some tests against OpenAI's Whisper (which powers ChatGPT and many other apps). I used Steve Jobs' 2005 Stanford Commencement Address (15 min) on YouTube as the test video. Both models running locally on a MacBook M4. Some highlights of what each model heard: Cohere: "I learned about serif and sans serif typefaces" Whisper: "I learned about Sarah and Sans Sarah of typefaces" Cohere: "Bob Noyce" Whisper: "Bob Nois" Cohere: "tried to apologize for screwing up so badly" Whisper: "tried to apologize for sparing up so badly" I also tested Whisper's largest model (1.55B parameters) to get a closer comparison to Cohere's 2B parameters. It fixed some of the name errors but started repeating phrases and took much longer. How they compared: - Cohere (2B params): 119 seconds, ~98% accuracy - Whisper base (74M params): 69 seconds, ~90% accuracy - Whisper large (1.55B params): 915 seconds, ~93% accuracy Full side-by-side transcript comparison: github.com/jaxson/tests-p… (Note I believe that some of the different word counts stem from hallucination loops that were encountered by Whisper). Cohere Transcribe Model on Hugging Face: huggingface.co/CohereLabs/coh… Test video on YouTube: youtube.com/watch?v=UF8uR6… * Results may vary based on hardware, audio quality, and content. This is a very non scientific test! **Audio clips used under fair use for commentary/analysis. All rights belong to their respective owners.
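
Side-by-side transcripts like the ones quoted above are often summarized with word error rate (WER): the word-level edit distance between a reference and a hypothesis, divided by the reference length. The post does not say how its accuracy percentages were computed, so the following is only a minimal sketch of the standard WER calculation. Treating the Cohere line as the reference is an assumption made for illustration, and the function names are hypothetical.

```python
from typing import List


def word_edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Minimum number of word substitutions, insertions, and deletions."""
    prev = list(range(len(hyp) + 1))  # distance from an empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # delete the reference word
                curr[j - 1] + 1,         # insert the hypothesis word
                prev[j - 1] + (r != h),  # match (0) or substitution (1)
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate after lowercasing and whitespace tokenization."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return word_edit_distance(ref, hyp) / len(ref)


# Phrase pair quoted in the post; the Cohere line is ASSUMED to be correct.
cohere_line = "I learned about serif and sans serif typefaces"
whisper_line = "I learned about Sarah and Sans Sarah of typefaces"
score = wer(cohere_line, whisper_line)
print(f"WER on this phrase: {score:.3f}")
```

On this single phrase there are two substitutions and one insertion against eight reference words. A full evaluation would also normalize punctuation and numerals before scoring, which this sketch skips.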

Diffio AI@diffioai·29 Mar

@Tu7uruu github.com/Diffio-AI/Cohe… WhisperX style interface for cohere. Has VAD (cohere recommends) and word alignment which is always helpful. I also added automatic language detection.

steven@Tu7uruu·26 Mar

Just dropped on HF: Cohere’s cohere-transcribe-03-2026 > 🥇 #1 on the Open ASR leaderboard > 🌍 #4 multilingual > 📄 #6 long-form > Supports 12+ languages: English, German, French, Italian, Spanish, Portuguese, Greek, Dutch, Polish, Arabic, Vietnamese, Chinese, Japanese, Korean > Conformer-based encoder + lightweight Transformer decoder for transcription > And of course: Apache 2.0 license

Diffio AI@diffioai·29 Mar

@psk90_ai github.com/Diffio-AI/Cohe… WhisperX style interface for cohere. Has VAD (cohere recommends) and word alignment which is always helpful.

Pasha S@psk90_ai·28 Mar

Cohere just took #1 on the Hugging Face Open ASR Leaderboard. First speech model. 5.42% WER. Open source. ━━━━━━━━━━━━━━━━━━━ Cohere Transcribe. Their first speech-to-text model — and it immediately tops the English accuracy charts. → 5.42% word error rate — #1 on HuggingFace Open ASR Leaderboard → Validated by human evaluation, not just automated benchmarks → One of the strongest accuracy-to-speed ratios at its size class → Minutes of audio → usable transcripts in seconds → Open source — download weights directly from Hugging Face ━━━━━━━━━━━━━━━━━━━ The bigger picture: This isn't a standalone model release. Cohere is building toward full enterprise speech intelligence inside North — their agentic AI orchestration platform. Translation: your AI agent will soon listen, transcribe, reason, and act — all within one enterprise platform. Transcribe is the ears. The ASR space just got very crowded very fast. In the last few months: Mistral Voxtral, IBM Granite Speech, ElevenLabs Scribe, and now Cohere Transcribe — all pushing open-source ASR past what Whisper could do. 🔗 Blog: cohere.com/blog/transcribe 🔗 Model: lnkd.in/ggfeZye5 Building enterprise speech pipelines — transcription, voice agents, real-time audio processing — on-premise? That's what we do at Zingaro AI and LiteCompute AI. DM me. ♻️ Repost if useful. Follow Pasha S for daily open-source AI drops.

Diffio AI@diffioai·29 Mar

@mr_r0b0t @cohere @NousResearch @LottoLabs @sudoingX github.com/Diffio-AI/Cohe… WhisperX style interface for cohere.

mr-r0b0t@mr_r0b0t·26 Mar

Huge news for local transcription! Thanks @cohere, my @NousResearch Hermes agent will love this! @LottoLabs @sudoingX huggingface.co/CohereLabs/coh…

Diffio AI@diffioai·29 Mar

@nickfrosst @cohere @huggingface Just made a whisperX style interface for cohere. github.com/Diffio-AI/Cohe…

Nick Frosst@nickfrosst·26 Mar

@cohere just released the best speech->text model :) It currently ranks #1 for accuracy on @huggingface Open ASR Leaderboard, setting a new benchmark for real-world transcription performance. Read more 👇

Cohere@cohere

Introducing: Cohere Transcribe – a new state-of-the-art in open source speech recognition.

Diffio AI@diffioai·29 Mar

@cohere Love it. Just made a whisperX style interface for it with VAD and word alignment. github.com/Diffio-AI/Cohe…

Cohere@cohere·26 Mar

Introducing: Cohere Transcribe – a new state-of-the-art in open source speech recognition.

Diffio AI@diffioai·29 Mar

Cohere-transcribe with a whisperX style interface: github.com/Diffio-AI/Cohe…

Diffio AI@diffioai·25 Mar

- OpenAI Whisper timing github.com/openai/whisper OpenAI Whisper timing uses Whisper’s internal alignment heads and decoder cross-attention, then applies DTW over the token-to-frame attention matrix to derive word timestamps from the token sequence.
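
The DTW step described above can be illustrated with a small, dependency-free sketch. It is not Whisper's code: the toy attention matrix, the `1 - weight` cost convention, and the function names are all assumptions chosen for illustration, and any preprocessing of the attention weights is omitted.

```python
from typing import List, Tuple


def dtw_path(cost: List[List[float]]) -> List[Tuple[int, int]]:
    """Cheapest monotonic path from the first cell to the last cell."""
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    # acc[i][j] is the cheapest cost of reaching cell (i - 1, j - 1).
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i][j] = cost[i - 1][j - 1] + min(
                acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1]
            )
    # Walk back from the last cell, always taking the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (acc[i - 1][j - 1], i - 1, j - 1),
            (acc[i - 1][j], i - 1, j),
            (acc[i][j - 1], i, j - 1),
        )
    path.reverse()
    return path


def token_frame_spans(attn: List[List[float]]) -> List[Tuple[int, int]]:
    """First and last frame assigned to each token along the DTW path."""
    # ASSUMPTION: weights lie in [0, 1], so 1 - weight is a valid cost.
    cost = [[1.0 - w for w in row] for row in attn]
    spans = {}
    for token, frame in dtw_path(cost):
        start, end = spans.get(token, (frame, frame))
        spans[token] = (min(start, frame), max(end, frame))
    return [spans[t] for t in range(len(attn))]


# Toy token-to-frame attention: 3 tokens (rows) by 6 audio frames (columns).
attn = [
    [0.9, 0.8, 0.1, 0.0, 0.0, 0.0],
    [0.1, 0.1, 0.9, 0.8, 0.1, 0.0],
    [0.0, 0.1, 0.0, 0.2, 0.9, 0.9],
]
spans = token_frame_spans(attn)
print(spans)  # [(0, 1), (2, 3), (4, 5)]
```

Word timestamps would then follow by grouping tokens into words and multiplying frame indices by the frame duration.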

Diffio AI@diffioai·25 Mar

Word alignment error relative to SNR. See 🧵for details.
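
The post does not define either quantity, so the following is only one plausible reading, offered as a sketch: SNR in decibels from the mean power of a clean signal and a noise signal, and alignment error as the mean absolute difference between predicted and reference word start times. Both function names and both definitions are assumptions, not the method behind the plot.

```python
import math
from typing import Sequence


def snr_db(signal: Sequence[float], noise: Sequence[float]) -> float:
    """Signal-to-noise ratio in dB from mean sample power."""
    p_signal = sum(x * x for x in signal) / len(signal)
    p_noise = sum(x * x for x in noise) / len(noise)
    return 10.0 * math.log10(p_signal / p_noise)


def mean_abs_boundary_error(
    predicted: Sequence[float], reference: Sequence[float]
) -> float:
    """Mean absolute difference between paired timestamps, in seconds."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(abs(p - r) for p, r in zip(predicted, reference))
    return total / len(predicted)


# Toy data: noise at one tenth of the signal amplitude is about 20 dB SNR.
clean = [1.0, -1.0, 1.0, -1.0]
noise = [0.1, -0.1, 0.1, -0.1]
level = snr_db(clean, noise)

# Hypothetical word start times in seconds.
error = mean_abs_boundary_error([0.10, 0.52], [0.12, 0.50])
print(f"SNR: {level:.1f} dB, mean boundary error: {error * 1000:.0f} ms")
```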

Diffio AI@diffioai·25 Mar

- whisper-char-alignment github.com/30stomercury/w… whisper-char-alignment uses Whisper’s own decoder cross-attention maps, teacher-forces the reference text at character level, and applies DTW plus attention-head aggregation to infer word boundaries.

Diffio AI@diffioai·25 Mar

- WhisperX github.com/m-bain/whisperX WhisperX performs forced alignment with an external phoneme/CTC aligner, typically a wav2vec2-based model, to align a known transcript to the waveform and recover word timestamps.
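
Forced alignment of a known transcript can be sketched as a Viterbi search over frame-level scores. This is a simplified illustration and not WhisperX's implementation: it assigns every frame to exactly one transcript unit, leaves out the CTC blank symbol that a real CTC aligner handles, and uses hand-written probabilities in place of a wav2vec2 model. The function name `forced_align` is hypothetical.

```python
import math
from typing import Dict, List, Tuple


def forced_align(
    log_probs: List[Dict[str, float]], transcript: List[str]
) -> List[Tuple[str, int, int]]:
    """Best in-order assignment of frames to transcript units."""
    n_frames, n_units = len(log_probs), len(transcript)
    if n_units == 0 or n_frames < n_units:
        raise ValueError("need at least one frame per transcript unit")
    neg_inf = -math.inf
    best = [[neg_inf] * n_units for _ in range(n_frames)]
    advanced = [[False] * n_units for _ in range(n_frames)]
    best[0][0] = log_probs[0][transcript[0]]
    for t in range(1, n_frames):
        for k in range(min(t + 1, n_units)):
            stay = best[t - 1][k]  # same unit as the previous frame
            move = best[t - 1][k - 1] if k > 0 else neg_inf  # next unit
            advanced[t][k] = move > stay
            best[t][k] = max(stay, move) + log_probs[t][transcript[k]]
    # Backtrack from the last frame, which must carry the last unit.
    unit_of_frame = [0] * n_frames
    k = n_units - 1
    for t in range(n_frames - 1, -1, -1):
        unit_of_frame[t] = k
        if advanced[t][k]:
            k -= 1
    spans = []
    for idx, unit in enumerate(transcript):
        frames = [t for t, u in enumerate(unit_of_frame) if u == idx]
        spans.append((unit, frames[0], frames[-1]))
    return spans


# Toy per-frame probabilities for the characters "h" and "i" over 5 frames.
probs = [
    {"h": 0.9, "i": 0.1},
    {"h": 0.8, "i": 0.2},
    {"h": 0.3, "i": 0.7},
    {"h": 0.1, "i": 0.9},
    {"h": 0.1, "i": 0.9},
]
frames = [{c: math.log(p) for c, p in frame.items()} for frame in probs]
alignment = forced_align(frames, ["h", "i"])
print(alignment)  # [('h', 0, 1), ('i', 2, 4)]
```

Each span gives the first and last frame of a character, and word timestamps follow from the first and last character of each word.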

Diffio AI@diffioai·30 Dec

Codex Wrapped 2025 Total Tokens: 3,073,600,806 Total Messages: 1,782 Total Sessions: 512 Top model: GPT 5.2 Codex Total Estimated Cost: $814.14 Credit: @nummanali @moddi3io

Diffio AI@diffioai·23 Dec

@Meta Has anyone ever figured out how to get help from @Meta? I guess they're too busy spending all their money on GPUs to take any of my money.

Diffio AI@diffioai·23 Dec

No help so far, @Meta. I have tried email and requesting support through the website. 😢

Diffio AI@diffioai·22 Dec

@Meta Any chance I can get some help?
