Kevin

4.2K posts


@TheOneKev

Product Owner building independent AI systems. Interested in psychology, politics & the global economy

Joined June 2020
475 Following · 178 Followers
Kevin
Kevin@TheOneKev·
Still blown away by the speed of Gemma 4 now with MTP. Did 31B and 26B with 4-bit and 8-bit quants from @RedHat_AI and what can I say, those speeds are nuts. Especially the MoE model is just flying now on a single RTX 5000 Pro.
Kevin tweet media
English
0
0
0
21
Kevin
Kevin@TheOneKev·
@_philschmid @arena Big fan of both models. Especially now with MTP, they're both just crazy good. Token/s is just mind-blowing in combination with the intelligence, and super smooth on my RTX 5000 Pro. Really worthy successors for the Gemma 3 series. Amazing job.
English
0
0
0
56
Philipp Schmid
Philipp Schmid@_philschmid·
Gemma 4 shifts Pareto Frontier on Code @arena.🔥 Among open models, Gemma-4-31b ranks #13 and Gemma-4-26b-a4b ranks #17. Pretty good for open models you can run on a MBP. 👀
Philipp Schmid tweet media
English
6
10
83
4.4K
Kevin
Kevin@TheOneKev·
@yabbanx Half of the stuff probably won't work in 5 years. I like tech gimmicks, but too much - unnecessary stuff - just means expensive maintenance and repair costs.
English
0
0
3
302
yaw.
yaw.@yabbanx·
China is waging a serious ‘war’ against Western car manufacturers. This is not normal bro. 😂 🏷️ $80,000
English
1.6K
3.6K
22.1K
2.6M
Kevin reposted
Arena.ai
Arena.ai@arena·
Gemma-4 lands in Code Arena: Frontend Webdev and shifts the Pareto Frontier! Among open models, Gemma-4-31b ranks #13 and Gemma-4-26b-a4b ranks #17. Congrats to @GoogleDeepMind on shifting the frontier!
Arena.ai tweet media
Google DeepMind@GoogleDeepMind

Meet Gemma 4: our new family of open models you can run on your own hardware. Built for advanced reasoning and agentic workflows, we’re releasing them under an Apache 2.0 license. Here’s what’s new 🧵

English
11
37
388
38.1K
Kevin
Kevin@TheOneKev·
@osanseviero Absolute game changer. Great work, fellas. No doubt!
English
0
0
1
40
Omar Sanseviero
Omar Sanseviero@osanseviero·
Excited to introduce Gemma 4 Multi-Token Prediction Drafters⚡️ Accelerated inference right in your pockets
- Up to a 3x speedup
- Same quality guarantees
- Available in your favorite open-source tools
English
47
120
1K
145.2K
Kevin
Kevin@TheOneKev·
@bnjmn_marie That was really the critical thing missing with Gemma 4 31B. Speed. Now that this is also there, it's the best open model for me right now. Love it so far.
English
0
0
4
254
Benjamin Marie
Benjamin Marie@bnjmn_marie·
I benchmarked Google’s new MTP for Gemma 4 31B using vLLM with 4 speculative tokens, a fairly conservative setup.
Results:
- Much higher throughput than Qwen3.6’s MTP
- Lower latency too, helped by Gemma 4 generating fewer tokens
- For coding tasks with reasoning enabled, Gemma 4 is now at least 6x faster than Qwen3.6. So you can generate 5 outputs, run your tests to select the best one, and it would still be cheaper than a single output by Qwen3.6.
I’ve updated my full comparison with the new numbers: kaitchup.substack.com/p/qwen36-27b-v…
I also confirmed what others have reported: Gemma 4’s MTP handles a high number of speculative tokens very well. On simple text generation, I’m now testing values above 10 and reached 129 tok/s on an RTX Pro 6000, compared with 20 tok/s without MTP.
Next step: confirming how this translates to real tasks.
Benjamin Marie tweet media
English
31
31
306
20.5K
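The arithmetic behind the "5 outputs still cheaper than one" claim can be sketched as follows. This is a rough illustration only: it assumes cost scales linearly with generation time, that the five outputs are generated one after another, and that the time spent running tests is ignored. The function names are hypothetical and not taken from the linked comparison.

```python
def speedup(baseline_tok_s: float, accelerated_tok_s: float) -> float:
    """Throughput ratio between an accelerated and a baseline run."""
    return accelerated_tok_s / baseline_tok_s


def best_of_n_relative_cost(n_samples: int, speed_ratio: float) -> float:
    """Time to generate n samples on the faster model, expressed as a
    fraction of the time for one sample on the slower model.
    Assumption: cost is proportional to generation time."""
    return n_samples / speed_ratio


# Figures quoted in the post: 129 tok/s with MTP vs 20 tok/s without.
mtp_speedup = speedup(20, 129)

# "At least 6x faster" and "5 outputs" are also taken from the post.
relative_cost = best_of_n_relative_cost(5, 6.0)

print(f"MTP speedup on simple text generation: {mtp_speedup:.2f}x")
print(f"Best-of-5 cost relative to one slower output: {relative_cost:.2f}")
```

Under these assumptions, any value below 1.0 for the relative cost means the best-of-5 approach finishes sooner than a single output from the slower model.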
Kevin
Kevin@TheOneKev·
@bnjmn_marie Tested it on my two RTX 5000 Pros, one model on each, both Gemma 4 31B, both FP8, one with and one without MTP. The 3x was no lie. Went from 30 to almost 100 tok/s. That's incredible.
Kevin tweet media
English
0
0
2
73
Kevin
Kevin@TheOneKev·
Ok, had to try it out. Doing the first runs, both FP8 and each running on an RTX 5000 Pro. And what can I say? They weren't exaggerating. ~30 tok/s vs almost 100 tok/s. Which meant, in that test run, reducing the time from 16 to 5 secs. And I can't see any degradation or similar. Great job @googlegemma
Kevin tweet media
Google Gemma@googlegemma

Gemma 4 just got even faster! We're releasing Multi-Token Prediction (MTP) drafters that deliver up to a 3x speedup, without any degradation in output quality or reasoning logic.

English
0
0
0
24
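A quick consistency check of the numbers in the post above, as a minimal sketch. It assumes both runs produced the same number of tokens and that throughput was constant over the run; neither is stated in the post, and the rounded figures (~30 and ~100 tok/s) are used as-is.

```python
# Rounded figures from the post; treated as exact for illustration.
baseline_tok_s = 30.0    # without MTP
mtp_tok_s = 100.0        # with MTP ("almost 100")
baseline_seconds = 16.0  # reported run time without MTP

# Assumption: the output length is the same in both runs.
tokens_generated = baseline_tok_s * baseline_seconds

# Expected run time with MTP for that same output length.
mtp_seconds = tokens_generated / mtp_tok_s

print(f"Estimated output length: {tokens_generated:.0f} tokens")
print(f"Estimated time with MTP: {mtp_seconds:.1f} s")
```

The estimate lands close to the reported 5 seconds, so the throughput and wall-clock figures in the post agree with each other.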
Kevin reposted
Google Gemma
Google Gemma@googlegemma·
Gemma 4 just got even faster! We're releasing Multi-Token Prediction (MTP) drafters that deliver up to a 3x speedup, without any degradation in output quality or reasoning logic.
GIF
English
91
354
3.3K
193.5K
Michel Laclé
Michel Laclé@micheltamanda·
@TheOneKev This is the pro move brother! You gave me inspiration to move to the next level.
English
1
0
1
11
Michel Laclé
Michel Laclé@micheltamanda·
What LLM gateway are you using? I built my own to have a single point of configuration for my local AI systems. How did you all solve the pain point of having many local models over many local machines.
Michel Laclé tweet media
English
10
0
22
2.2K
Kevin
Kevin@TheOneKev·
@micheltamanda I use mine for local and external models though. Plus API Key Management.
Kevin tweet media
English
1
0
1
21
Kevin
Kevin@TheOneKev·
@Dozer3000 @gas0linr I halfway agree. At least as far as the status quo goes. But if you consider where LLMs came from and that scaling is still paying off, quite apart from possible new architectures soon, before long no dev will be able to keep up. It's not biologically possible.
German
1
0
1
122
Timotheus V.
Timotheus V.@Dozer3000·
@gas0linr In my opinion that statement is complete nonsense. The degree is worth it. Just because you can work more efficiently by vibe-coding with a few agents, good IT people won't become obsolete. Pure script coders will have a harder time, but otherwise there's plenty to do.
German
10
0
171
11.9K
Yves
Yves@gas0linr·
It's a truly wild feeling, as a computer scientist, to see what technological progress is doing to the industry. And it's frightening to realize that 90% of people don't see it coming. As a computer science student, I would drop out now (!) and reorient myself.
German
136
15
800
123.9K
Kevin
Kevin@TheOneKev·
@gas0linr 90% is probably still very(!) optimistic.
German
0
0
2
66
Kevin
Kevin@TheOneKev·
Sometimes it's hard to understand, when you're right in the middle of it, but this is literally history in the making. And I honestly think just a precursor of what will come. I think @sama even said it himself, that it will probably get bad first, before it can(!) become good.
Anonymous@YourAnonNews

Kevin O'Leary's massive data center was approved by a county commission in Utah last night without residents' approval of the measure. At 40,000 acres, it would be 2.5x the size of Manhattan. The commission approved the proposal despite opposition from hundreds of locals.

English
0
0
0
23
Kevin
Kevin@TheOneKev·
@isabelunraveled As a dad, it makes me happy reading that reply from your dad. That's how it should be. He's doing a great job.
English
0
0
6
2K
Isabel🌻
Isabel🌻@isabelunraveled·
me and my dad this week // me and my dad just after i was born
Isabel🌻 tweet media
Isabel🌻 tweet media
English
33
151
5.5K
236.3K
Kevin
Kevin@TheOneKev·
@TheoMediaAI @sama I mostly agree. The only things that come to my mind in that scenario are a) for how long that (driving) will still be a thing, but even more b) Augmented Reality, e.g. HUD. Great combo. No doubt though, voice only will have its use cases. Just not as the standard standalone UI.
English
0
0
0
22
Sam Altman
Sam Altman@sama·
pretty excited for voice models to get great
it's interesting to watch how people are already starting to change the way they interface with AI
English
930
242
6.3K
645K
Kevin
Kevin@TheOneKev·
@TheAhmadOsman Ngl, that 10x tokens thing...they really know how to get me. Tokenite 😄
English
0
0
2
326
Ahmad
Ahmad@TheAhmadOsman·
People keep treating everything like isolated events
- Dario / Anthropic fearmongering
- Policy maker pressure
- Elon’s lawsuit
- Sudden 10x tokens
- SF parties
All just random coincidences? Come on, look more than 2 steps ahead
We’re surrounded by existential risks & Psyops
English
40
21
384
31.8K
Kevin
Kevin@TheOneKev·
@0xSero Wait...you guys get the party, plus the "band-aid"?
GIF
English
0
0
2
286
0xSero
0xSero@0xSero·
Hecking frick. That's a lot of clanking
0xSero tweet media
English
12
1
225
11.9K