keveman
@keveman
Co-founder at Moonshine AI. Previously @CerebrasSystems, @googleai, @nvidia

Really neat paper! Find a linear map from the encoder output to its chosen codebook vector using a scale and two Householder reflections, then backprop through that map instead of the straight-through identity. Great work!
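
For the curious, here's a minimal PyTorch sketch of that gradient path as I read the tweet: build the scale and the two Householder reflections from the encoder output and its codebook match, detach them so they act as a fixed linear map, and push the encoder output through it. The function name, the eps guard, and the batching are my assumptions, not the paper's code.

```python
import torch

def rotate_to_code(e: torch.Tensor, q: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Carry encoder output e onto its chosen codebook vector q via a scale
    and two Householder reflections, all treated as constants, so gradients
    flow through that linear map rather than a copied identity.
    e, q: (batch, dim), with q = nearest codebook entry to e. Sketch only.
    """
    e_norm = e.norm(dim=-1, keepdim=True)
    q_norm = q.norm(dim=-1, keepdim=True)
    e_hat = (e / (e_norm + eps)).detach()
    q_hat = (q / (q_norm + eps)).detach()
    # With r = (e_hat + q_hat) / ||e_hat + q_hat||, the product of reflections
    # R = (I - 2 r r^T)(I - 2 e_hat e_hat^T) is a rotation sending e_hat to q_hat.
    # (The degenerate case e_hat ~= -q_hat is ignored in this sketch.)
    r = e_hat + q_hat
    r = (r / (r.norm(dim=-1, keepdim=True) + eps)).detach()
    lam = (q_norm / (e_norm + eps)).detach()  # scale matching the two magnitudes
    # Apply each Householder reflection as x -> x - 2 v (v . x).
    h = e - 2.0 * e_hat * (e_hat * e).sum(dim=-1, keepdim=True)
    out = h - 2.0 * r * (r * h).sum(dim=-1, keepdim=True)
    return lam * out  # forward value ~= q; backward Jacobian = lam * R
```

In the forward pass the map reproduces the codebook vector (up to eps), so the quantized value is unchanged; only the backward path differs from the usual straight-through estimator.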

hearing this sentiment from many

🚀 mlx-audio v0.4.1 is out!

New models:
→ Granite Speech 4.0 (STT and AST)
→ Canary STT (NVIDIA canary-1b-v2)
→ Moonshine STT
→ MMS STT
→ FireRedASR2-AED STT
→ SenseVoice STT
→ Fish Audio S2 Pro TTS

Plus:
→ Native MLX DeepFilterNet speech enhancement (v1/v2/v3)
→ OGG, Opus & Vorbis audio format support
→ LID fix for ECAPA/SpeechBrain alignment

Thank you to all contributors to this release: @lllucas, @beshkenadze, @andimarafioti, mm65x, irachex and Kylehowells! 🚀

> uv pip install -U mlx-audio

Leave us a star ⭐️ github.com/Blaizzy/mlx-au…

And if you want to save on costs (though not sure why; you can just print more ads?), probably cut unnecessary management layers. Those who only administer and no longer build or mentor slow everyone down.

All I'll say about the Anthropic/DoD situation is that it is just so characteristic of the Trump administration to go completely nuclear over someone giving them only 99% of what they want.

MatX notations just dropped on the timeline.

Please, Apple, can you add a Whisper-quality transcription model to iOS?

Faster 🟧

When I hear "realtime TTS" or "sub-200ms ASR", I'm like...


First, the report. Voxtral Realtime achieves state-of-the-art transcription performance at sub-500ms latency. We attribute this to training a strong causal audio encoder and incorporating novel Ada RMS-Norm delay conditioning. In the report, we highlight how we selected each of these components, alongside the training objective and inference details: arxiv.org/abs/2602.11298
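
That delay conditioning sounds like the adaptive-norm pattern (AdaLN/AdaRMSNorm style), where a conditioning vector modulates the norm's per-channel gain. Below is a minimal sketch of that pattern with a latency embedding as the condition; the layer name, the delay buckets, the zero-init, and all dimensions are my assumptions, and the report has Voxtral's actual formulation.

```python
import torch
import torch.nn as nn

class AdaRMSNorm(nn.Module):
    """RMSNorm whose per-channel gain is shifted by a projection of a
    conditioning vector (here, an embedding of the target latency).
    Generic adaptive-norm sketch, not Voxtral's exact layer.
    """
    def __init__(self, dim: int, cond_dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))
        # Zero-init the projection so the layer starts as a plain RMSNorm.
        self.to_gain = nn.Linear(cond_dim, dim)
        nn.init.zeros_(self.to_gain.weight)
        nn.init.zeros_(self.to_gain.bias)

    def forward(self, x: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, dim); cond: (batch, cond_dim)
        rms = x.pow(2).mean(dim=-1, keepdim=True).add(self.eps).rsqrt()
        gain = self.weight + self.to_gain(cond).unsqueeze(1)
        return x * rms * gain

# Hypothetical usage: embed a discrete delay setting and feed it to every
# norm layer, so one model can trade latency for accuracy at inference.
delay_emb = nn.Embedding(3, 64)          # e.g. 3 latency buckets (assumed)
norm = AdaRMSNorm(dim=512, cond_dim=64)
x = torch.randn(2, 100, 512)             # (batch, frames, channels)
y = norm(x, delay_emb(torch.tensor([0, 2])))
```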