matt brown
4K posts

@mnbbrown
engineering @ Elyos. ex @GoCardless, @bcgdv. pilot. redhead.


The part most people will skip: NVIDIA just made every voice AI API a commodity.

OpenAI charges $0.06/min input and $0.24/min output for Realtime API. Gemini Live bills 25 tokens/second of audio. Every startup building voice agents is hemorrhaging cash on per-minute API fees to run what is fundamentally a pipeline problem: ASR → LLM → TTS, three models stitched together with latency at every seam.

PersonaPlex replaces that entire pipeline with one 7B model. Runs on a single A100. Open weights, MIT license, commercial use permitted. Response latency: 0.170 seconds for turn-taking, 0.240 seconds for interruptions. It scores higher on dialog naturalness than Gemini (2.95 vs 2.80 MOS) and handles interruptions better than every commercial system they benchmarked.

This tells you everything about NVIDIA’s playbook. They don’t need to charge for the model. They need you to buy the GPU. Every company that self-hosts PersonaPlex instead of paying OpenAI per-minute is another A100/H100 sale. Every voice agent startup that drops their API dependency is another enterprise GPU contract. NVIDIA open-sourced the fishing rod because they sell the lake.

Built on the Moshi architecture from Kyutai, fine-tuned with under 5,000 hours of data. The voice AI margin is migrating from the application layer to the hardware layer. And NVIDIA is the only company that profits no matter which model wins.

330,000 downloads in the first month. That’s infrastructure capture disguised as generosity.
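To make the per-minute pricing concrete, here is a rough sketch of the arithmetic using only the rates quoted in the post above. The 3-minute/2-minute split of a call is a made-up example, the function names are hypothetical, and real bills depend on each provider's current price list.

```python
# Rates as quoted in the post above (USD per minute of audio).
# These are taken from the post, not from an official price list.
INPUT_USD_PER_MIN = 0.06
OUTPUT_USD_PER_MIN = 0.24

# Gemini Live audio billing rate as quoted in the post.
GEMINI_TOKENS_PER_SECOND = 25


def realtime_call_cost(input_minutes: float, output_minutes: float) -> float:
    """Estimated cost of one call at the quoted per-minute rates."""
    return input_minutes * INPUT_USD_PER_MIN + output_minutes * OUTPUT_USD_PER_MIN


def gemini_audio_tokens(seconds: float) -> float:
    """Number of audio tokens billed for a given duration at the quoted rate."""
    return seconds * GEMINI_TOKENS_PER_SECOND


# Hypothetical call: caller speaks for 3 minutes, agent speaks for 2 minutes.
cost = realtime_call_cost(input_minutes=3, output_minutes=2)
print(f"Estimated cost per call: ${cost:.2f}")

# One minute of audio expressed in billed tokens.
print(f"Tokens per minute of audio: {gemini_audio_tokens(60):.0f}")
```

At these rates the hypothetical call comes to about $0.66, so the agent's own speech accounts for most of the bill.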

At @Elyos_AI we benchmarked 13 STT providers on 100 real customer calls from trades businesses. Not synthetic lab data.

Real calls with:
- Background noise & multiple speakers
- UK postcodes & addresses
- Regional accents (England, Scotland, Ireland)
- Short confirmations to long explanations

Top performers:
🥇 @DeepgramAI Flux - 15.9% WER
🥈 @soniox_ai - 16.9% WER
🥉 @Speechmatics - 17.7% WER

@OpenAI Whisper? 39.8% WER - wouldn't recommend for production voice AI.

What's your experience with STT models? Are we there yet?
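For anyone unfamiliar with the metric: WER (word error rate) is the word-level edit distance between the reference transcript and the STT output, divided by the number of reference words. A minimal sketch follows. It assumes simple lowercase, whitespace-split tokenisation and is not necessarily how the benchmark above normalised its transcripts. The function name and example sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    # Assumption: lowercase + whitespace split; real benchmarks often
    # apply heavier normalisation (punctuation, numbers, spelling variants).
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Word-level Levenshtein distance, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: a postcode utterance where one word is split in two.
wer = word_error_rate("my postcode is sw1a 1aa", "my post code is sw1a 1aa")
print(f"WER: {wer:.1%}")
```

Here one substitution plus one insertion against five reference words gives 40%, which shows how quickly short utterances like postcodes and confirmations can push WER up.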

i know a small team in Texas making more than most “AI startups” just by fixing one boring problem for local contractors

they noticed something simple: contractors can handle the work but the admin part drains their entire week

quoting takes too long
scheduling gets messy
invoices pile up
follow-ups never happen

so they built a system that cleans up the admin headache:

• pulls job requests from text, email, and WhatsApp
• turns them into structured job details
• drafts the quote
• books the slot on the calendar
• sends reminders
• generates the invoice
• collects payment
• pushes everything to the accounting tool

all with off-the-shelf tools stitched together cleanly

they charge depending on the size of the business

people try it for a month, realise they’re saving hours every week, and then they stick around

now they manage the backend for 40+ trades businesses

solving a painful, recurring, unsexy operational problem that owners will happily pay to make disappear

everyone wants to build AI copilots for the Fortune 500

meanwhile, the people printing money are the ones automating inbox chaos for local businesses

Every single dev and product team I've spoken to in the last 30 days has moved from Cursor to Claude Code.

1. Is this permanent?
2. If so, what happens to Cursor?










