brian stevens

750 posts

@addvin

CEO, Neural Magic. Ex VP, CTO of Google Cloud and EVP, CTO of Red Hat, RPI and UNH alumnus, marathoner, Ironman, ADK MT 46er.

[email protected] Joined January 2009
167 Following · 4.9K Followers
brian stevens reposted
Red Hat AI (@RedHat_AI)
🇬🇧 London, June 10. @vllm_project & @_llm_d_ Inference Meetup, hosted by Red Hat AI, @nvidia, and @SteliaAI. Talks on vLLM updates, speculative decoding, llm-d in production, AI safety, and more. Plus food, drinks, and the people building this stuff. luma.com/iuecyow4
1 reply · 5 reposts · 12 likes · 5.4K views
brian stevens reposted
Red Hat AI (@RedHat_AI)
Llama 70B as a cloud endpoint costs exponentially more than Llama 8B. For teams where a smaller model meets the quality bar, that gap is hard to ignore. And with INT4 quantization: 4x smaller, 2x faster, less than 1% accuracy loss. The right model isn't always the biggest one. redhat.com/en/blog/when-l…
2 replies · 3 reposts · 14 likes · 1.3K views
brian stevens reposted
Red Hat AI (@RedHat_AI)
Calling Boston-area startups building with AI. 🤙
We're kicking off 2026 with the first event in a new monthly, in-person hackathon series hosted by @RedHat and @IBM in Boston’s Seaport District.
This one-day hackathon is designed specifically for local startups that want to move faster from idea to working prototype. Instead of a fixed theme, you bring a real AI problem your team is actively facing. We help you build a proof of concept using open source, enterprise-ready templates from aitemplates.io, including MCP Server, AI Agent, and UI templates.
What you will get:
⚡ Rapid prototyping without boilerplate
🧠 Hands-on guidance from Red Hat AI architects
🤝 Connections with other Boston-based AI startups and ecosystem partners
If you are a Boston startup looking to turn an AI challenge into something real, this is for you. Event details are shared after registration.
Register now: luma.com/i3q8df0x
3 replies · 5 reposts · 29 likes · 2.1K views
brian stevens reposted
SemiAnalysis (@SemiAnalysis_)
The @RedHat_AI team contributes a lot to vLLM and does amazing work for the open-source community. Great to see vLLM performing so well compared to TRT-LLM on H200! vLLM comes pretty close to B200, with the @NVIDIAAI team working on closing the gap for GPTOSS within the next couple of updates.
Quoting Red Hat AI (@RedHat_AI):
InferenceMAX, vLLM TPU, compressed-tensors, MoE support via transformers, DeepSeek-OCR, and more. Here’s what’s new in the @vllm_project community over the past two weeks:

3 replies · 15 reposts · 98 likes · 20.6K views
brian stevens reposted
Red Hat AI (@RedHat_AI)
InferenceMAX, vLLM TPU, compressed-tensors, MoE support via transformers, DeepSeek-OCR, and more. Here’s what’s new in the @vllm_project community over the past two weeks:
1 reply · 8 reposts · 42 likes · 24.4K views
brian stevens reposted
Red Hat AI (@RedHat_AI)
4 tracks. 12 sessions. 1 day of learning.
Join us on Oct. 16 for Red Hat AI Day of Learning, a free virtual event for developers, engineers & practitioners.
Tracks:
⚡ Fast & efficient inference
🎯 Model customization
🤖 Agentic AI
🌐 Scaling AI over hybrid cloud
Sessions include:
· Intro to vLLM and how to get started
· Model optimization with LLM Compressor
· Lossless LLM inference acceleration w/ Speculators
· End-to-end model customization
· Synthetic data generation and data processing
· Continual learning of LLMs with Training Hub
· Build open source agentic AI solutions
· Intro to Model Context Protocol (MCP)
· Intro to Llama Stack
· Intro to distributed inference
· Distributed inference with llm-d
· Scaling AI Infrastructure
👉 Register free: redhat.com/en/events/webi…
1 reply · 17 reposts · 39 likes · 7.4K views
brian stevens reposted
Red Hat AI (@RedHat_AI)
Qwen3-Next dropped yesterday and you can run it with Red Hat AI today.
✅ Day-zero support in vLLM
✅ Day-one deployment with Red Hat AI
Step-by-step guide: developers.redhat.com/articles/2025/…
The future of AI is open.
0 replies · 5 reposts · 18 likes · 2.3K views
Charles 🎉 Frye (@charles_irl)
Proud to announce that @modal_labs was a Day 0 Lunch Partner for the release of gpt-oss. That is, on Day 0 I got lunch with @mgoin_ of @vllm_project and then had him look at my and @shariqmobin's code to deploy the model on Modal.
Quoting Modal (@modal):
OpenAI has released its first open weights language model since GPT-2 over five years ago. gpt-oss has
- efficient mxfp4 MoEs
- native tool-calling & reasoning
- attention sinks for long context
This looks like a great model for self-hosted agents. Try it on Modal!

7 replies · 5 reposts · 139 likes · 8K views
brian stevens (@addvin)
@RedHat_AI Adding a shoutout to the @IBMResearch team working jointly with the AMD team on contributing Triton attention kernels in vLLM v1 that improved decode throughput by 3x on Llama and Granite models.
0 replies · 0 reposts · 4 likes · 49 views
Red Hat AI (@RedHat_AI)
What if you could run high-performance LLM inference on AMD GPUs, without vendor lock-in or hyperscaler markup? Check this out 👇 🧵
3 replies · 6 reposts · 45 likes · 3.4K views
brian stevens reposted
Mark Collier 柯理怀 (@sparkycollier)
Really excited to see the emergence of llm-d @addvin ! Inference is the biggest workload in human history and the open source tools need to keep evolving to serve it
Quoting NVIDIA AI Developer (@NVIDIAAIDev):
The llm-d project is a major step forward for the #opensource AI ecosystem, and we are proud to be one of the founding contributors, reflecting our commitment to collaboration as a catalyst for innovation in generative AI. As generative and agentic AI continue to evolve, scalable, high-performance inference will be critical to unlocking their full potential. That’s why we’re partnering with @RedHat and other contributors to grow the llm-d community and accelerate its capabilities—powered by our contributions, including innovations from NVIDIA Dynamo such as NIXL. 🔗 Explore and contribute on GitHub: nvda.ws/3FlttSL 📰 Read the launch blog: nvda.ws/3FgF0Tn 🎙️ Hear from NVIDIA’s VP of Engineering & AI Frameworks, Ujval Kapasi → nvda.ws/45pyVhU

0 replies · 2 reposts · 11 likes · 809 views
brian stevens reposted
NVIDIA AI Developer (@NVIDIAAIDev)
The llm-d project is a major step forward for the #opensource AI ecosystem, and we are proud to be one of the founding contributors, reflecting our commitment to collaboration as a catalyst for innovation in generative AI. As generative and agentic AI continue to evolve, scalable, high-performance inference will be critical to unlocking their full potential. That’s why we’re partnering with @RedHat and other contributors to grow the llm-d community and accelerate its capabilities—powered by our contributions, including innovations from NVIDIA Dynamo such as NIXL. 🔗 Explore and contribute on GitHub: nvda.ws/3FlttSL 📰 Read the launch blog: nvda.ws/3FgF0Tn 🎙️ Hear from NVIDIA’s VP of Engineering & AI Frameworks, Ujval Kapasi → nvda.ws/45pyVhU
1 reply · 17 reposts · 31 likes · 9K views
Red Hat AI (@RedHat_AI)
Llama 4 Herd is here! It brings a lot of goodies, like MoE architecture and native multimodality, enabling developers to build personalized multimodal experiences. With Day 0 support in vLLM, you can deploy Llama 4 with @vllm_project now! Let's dig into it. (a thread)
5 replies · 47 reposts · 184 likes · 148.5K views
brian stevens reposted
Red Hat AI (@RedHat_AI)
DeepSeek’s Open Source Week drops A LOT of exciting goodies! We’re hosting vLLM Office Hours tomorrow. Learn what they are, how they integrate with vLLM, & ask questions!
Date: Thursday, Feb 27
Time: 2PM ET / 11AM PT
Register: neuralmagic.com/community-offi…
#DeepSeek #AI
0 replies · 2 reposts · 9 likes · 930 views
brian stevens reposted
Matt Hicks (@matthicksj)
At @RedHat, we believe the future of AI is open. That's why I'm incredibly excited about our acquisition of @NeuralMagic. Together, we're furthering our commitment to our customers and the open source community to deliver on the future of AI—and that starts today.
Quoting Red Hat (@RedHat):
Today, Red Hat completed the acquisition of @NeuralMagic, a pioneer in software and algorithms that accelerate #GenAI inference workloads. Read how we are accelerating our vision for #AI’s future: red.ht/408kJ8K.

0 replies · 28 reposts · 80 likes · 4.3K views
brian stevens reposted
Red Hat AI (@RedHat_AI)
If you are at #NeurIPS2024 this week, stop by the Neural Magic booth #307 and talk to us about the @vllm_project! vLLM core committer @mgoin_ will be there, ready to hear your ideas and share them with the team. The best feature requests always come from in-person chats!
1 reply · 1 repost · 7 likes · 1.1K views
brian stevens reposted
Scale ML (@scaleml)
For our last seminar of the year we will end with Lucas Wilkinson from @neuralmagic presenting!
Machete: a cutting-edge mixed-input GEMM GPU kernel targeting NVIDIA Hopper GPUs
Time: Dec 4, 3pm EST
Sign up via scale-ml.org to join our mailing list for the Zoom link
0 replies · 4 reposts · 17 likes · 1.6K views