Sawyer Bowerman

4 posts

@_soyr_

AI @ Red Hat

Boston, MA · Joined February 2026
5 Following · 42 Followers
Red Hat AI @RedHat_AI
What compression looks like on @vllm_project. Same Gemma 4 31B. Red Hat AI's quantized version runs at nearly 2x tokens/sec, half the memory, 99%+ accuracy retained. Open source. Quantized with LLM Compressor. Links in comments. 🙏 @_soyr_ for the 2-minute demo.
8
42
442
31.4K
Red Hat AI @RedHat_AI
Michael Goin (@mgoin_) walks through what's new in @vllm_project v0.17, v0.18, and v0.19 in ~8 minutes. Flash Attention 4, new performance modes, zero-bubble async scheduling, online MXFP4 quantization, Gemma 4, and a lot more. 1,592 commits. 682 contributors (163 new). 🎉 🚀
1
10
74
4.2K
Sawyer Bowerman @_soyr_
As AI and LLMs become more and more powerful, the boundaries for efficient inference simply break down. If you want to experiment with quantized models as well, you can follow the code in the video, and check out huggingface.co/RedHatAI to pick any model you'd like!
Red Hat AI@RedHat_AI

What compression looks like on @vllm_project. Same Gemma 4 31B. Red Hat AI's quantized version runs at nearly 2x tokens/sec, half the memory, 99%+ accuracy retained. Open source. Quantized with LLM Compressor. Links in comments. 🙏 @_soyr_ for the 2-minute demo.

0
0
3
193
Sawyer Bowerman reposted
DevoxxUK @DevoxxUK
The Wildcard: Ever wanted to do your daily stand-up in @Minecraft? @_soyr_ shows us how (and why!) they did it. Don't miss these sessions, May 6-7! Register here: devoxx.co.uk
0
1
2
112