Michael Pappas

225 posts

@mpappas74

@MIT alum || Former Bridgewater Assoc. || Founder/CEO of @Modulate_ai

Joined May 2014

145 Following · 184 Followers

Michael Pappas@mpappas74·4d

How did we do it, while being literally 1000x smaller than many of our competitors? Aside from simply amazing talent, the secret is in the data. @modulate_ai got started on the hardest voice understanding problem out there - real-time understanding of video game voice chats. That means being able to keep up with new jargon and phrasing. It means handling incredibly low-quality microphones as well as high-end headsets. It means understanding not just words but emotions, to tell the difference between friendly banter and genuine hate speech. And it means doing it all in seconds. Others started with studio recordings, and produced technology that works in the studio. Modulate dove into real conversations, and built technology that leads the way in the real world.

Michael Pappas@mpappas74·4d

@huggingface just listed @modulate_ai as the best transcription model in the world. Best on which metric? Amazingly, nearly all of them. We're the most accurate model. We're more than 10x *cheaper* than the runner-up models, and our models aren't just good on clear audio, but outperform on noisy, real-world content too. Not to mention that we're the only provider that augments transcription with emotion, prosody, and behavioral understanding, to *truly* understand what's being said in a voice conversation. huggingface.co/spaces/hf-audi…

929

Michael Pappas@mpappas74·4d

@boardyai @netflix @modulate_ai absolutely

Boardy@boardyai·4d

@mpappas74 @netflix @modulate_ai i listen to tens of thousands of conversations a week. the words are the easy part. it's everything in the pauses and energy shifts that tells the real story.

Michael Pappas@mpappas74·4d

Everyone's seen this scene from @netflix Queen Charlotte 👀 Here's what Velma by @modulate_ai sees. Not just the words - but emotion, sentiment, conversational behaviors, speaker dynamics, and everything else encoded in the voice. That's the difference between transcription and TRUE voice understanding. Welcome to Velma in the Wild. I'll be doing this every week, putting iconic clips through the world's most powerful Voice AI to show what conversations actually sound like beyond the transcript. What should I analyze next? #VelmaInTheWild

193

Michael Pappas@mpappas74·24 Jun

x.com/i/article/2069…

148.2K

Michael Pappas@mpappas74·15 Jun

yeah, that's rough. stt is one of those things where reliability matters as much as accuracy. if you're open to trying an alternative, i'd be happy to get you set up with @modulate_ai stt and compare results on real-world audio. first 1,000 credits are on us: modulate.ai/api/speech-to-…

123

Oscar Merry@MerryOscar·9 Jun

Hi @xai - attempting to use the /v1/stt endpoint but seeing frequent internal errors - tried reaching out via support but got a generic response - API is quite unusable with this level of errors

Michael Pappas reposted

Modulate@modulate_ai·5 Jun

Ever misread someone just because of their tone? 😶 Now imagine that at scale - with AI. In this clip, our CEO @mpappas74 gets into why voice isn’t just words. It’s pauses, emphasis, emotion - all the subtle signals people naturally interpret. Miss that, and you’re misinterpreting the conversation itself. That’s the layer we focus on at @modulate_ai - not just the spoken word, but the entirety of the conversation.

247

Michael Pappas reposted

Modulate@modulate_ai·12 Jun

We’ve trained ourselves to question video. But we still trust voice. That’s what makes audio deepfakes so dangerous ⚠️ In this clip, our CEO @mpappas74 breaks down why voice is such a powerful attack surface - it’s how we verify identity, talk to banks, recognize people we know. We don’t second-guess it. We assume it’s real 🤷‍♀️ And that’s exactly what bad actors exploit 🙅 At @modulate_ai, we’re building for that reality.

252

Michael Pappas reposted

Aakash Gupta@aakashgupta·3 Jun

Most voice pipelines still look like this:
audio → transcribe → text → model → act
The problem is step one. The second you turn audio into text, you throw away tone, hesitation, sarcasm, stress. The signal that told you what the person actually meant.
So everything you built after that ran on the transcript. The actual conversation was already gone.
@modulate_ai trained Velma on 550M+ hours of raw audio to skip that step entirely. One model that listens to the audio instead of reading a summary of it.
#1 on the conversation understanding benchmark. 10x cheaper than running it through an LLM.
The part that makes this more interesting: Velma has already been running in production inside Call of Duty, GTA Online, and Fortune 500 contact centers. Now the API is open to everyone.

Modulate@modulate_ai

The world's first audio-native AI model is now available as an API. Velma listens and understands like a human — emotions, tone, intent, rhythm, vocal stress. Already analyzed 550M+ hours of conversation for Fortune 500s. Now open to developers. 🧵

12.4K

Michael Pappas@mpappas74·2 Jun

Hot take: most voice AI isn't actually understanding speech. It's reading a transcript of it. There's a meaningful difference. And it's why voice pipelines keep failing at the moments that matter most. Dropping something tomorrow that takes a very different approach 👀

Michael Pappas@mpappas74·2 Jun

You've spent hours tuning your voice pipeline. Better STT model. Cleaner NLP. More labels. And it still misses the call where a customer was clearly about to churn. The problem isn't your implementation. It's the architecture. Something different is coming @modulate_ai 👀

116

Michael Pappas reposted

Modulate@modulate_ai·26 May

95% of enterprise AI deployments fail. That’s not a tooling problem. It’s a design problem. In this clip, our CEO @mpappas74 breaks down why most AI products never make it past the demo stage - and what businesses actually need instead. Not another “AI employee” to manage. Tools that fit into real workflows and solve specific problems. At @modulate_ai, that’s the lens we build through.

156

Michael Pappas@mpappas74·26 May

x.com/i/article/2059…

Michael Pappas@mpappas74·25 May

do you remember when you joined X? i do! #MyXAnniversary

Michael Pappas reposted

There's An AI For That@theresanaiforit·12 May

There’s an AI for transcription. 🦾
taaft.co/modulate
If you’re building with voice, details matter:
Grok STT: $0.10/hr
→ transcripts only
@Modulate’s Velma: $0.03/hr
→ 14.9% WER
→ emotion detection
→ accent detection
→ PII redaction
Test it yourself...👇
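
For readers unfamiliar with the WER figure quoted above: word error rate is conventionally the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the model's output, divided by the number of reference words. The sketch below is a minimal illustration of that standard definition only. The function name `word_error_rate` and the sample sentences are hypothetical, and the post does not say which dataset or text normalization produced the 14.9% figure, so this is not a reproduction of that number.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: splits on whitespace and applies no text
    normalization (casing, punctuation), which real benchmarks usually do.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # delete a reference word
                    curr[j - 1] + 1,                  # insert a hypothesis word
                    prev[j - 1] + substitution_cost,  # match or substitute
                )
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical sample pair: one dropped word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Lower is better, and because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.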

8.7K

Michael Pappas@mpappas74·20 May

If you’re dealing with this today, test how we're approaching a better way to solve this problem: modulate.ai/deepfake-detec…

Michael Pappas@mpappas74·20 May

Voice fraud isn’t just a security problem. It’s a massive, ongoing cost center. In this video, I break down what it’s *actually* costing businesses today - and it’s more than most people realize 👀
There are two layers to it:
1. Direct losses $$$
When voice fraud hits, the damage can be immediate - and in some cases, reach hundreds of millions. Often unrecoverable.
2. The cost of trying to prevent it $$$$
Even if you’re never breached, you’re still paying:
- Added authentication friction that slows down users
- Frustrated customers - and lost revenue
- Teams tied up auditing calls, running investigations, and handling compliance
All of that adds up to tens of millions in ongoing operational cost.
So the real question isn’t “what happens if we get hit?” It’s “how much are we already spending because this risk exists?”

Michael Pappas@mpappas74·14 May

So here’s what I’m curious about: Are you betting on generalist AI to handle critical workflows? Or are you moving toward more specialized systems you can actually control and rely on? I share what I think in the video - and why we’ve taken a different approach at @modulate_ai

Michael Pappas@mpappas74·14 May

“Can’t my LLM provider just solve this too?” I hear this all the time - and I think it’s the wrong question. Because in practice, the more general a system tries to be, the harder it is to trust for any specific task. We’ve seen this before with software. Specialization wins when reliability matters 🧵

Michael Pappas retweetledi

Modulate@modulate_ai·12 May

AI regulation is solving the wrong problem. Right now, most policies are built around generative AI: models like ChatGPT that create content (and yes, can hallucinate). But that’s only half the picture. There’s another category: analytic AI. Systems designed to understand what’s happening and return fixed, verifiable answers - no guessing, no hallucinations. In this clip, our CEO @mpappas74 breaks down why treating both the same is a mistake - and how current regulations are unintentionally slowing down tools that don’t carry the same risks. At Modulate, this distinction is core to how we build. Because not all AI should be regulated like it makes things up 👀

402
