Fred Oh

4.2K posts

@fredo_ai

Seasoned product marketing professional specializing in GPU Acceleration, AI/ML, Data Analytics/Integration, Data Storage, and IoT GTM Strategy and Execution

Santa Clara, CA · Joined September 2010
244 Following · 1.1K Followers
Fred Oh retweeted
NVIDIA AI @NVIDIAAI
Happy Friday! We just put DeepSeek-V4-Pro up on build.nvidia.com. It’s the world’s largest open-source model at 1.6T parameters, and you can run it for free on NVIDIA Blackwell GPUs. Try the NVIDIA NIM API → build.nvidia.com/deepseek-ai/de…
97
294
2.6K
200.4K
Fred Oh retweeted
NVIDIA AI Developer @NVIDIAAIDev
Introducing DWDP (“Distributed Weight Data Parallelism”) from #NVIDIAResearch, for MoE LLM inference on NVL72-based clusters. Keep data parallel, offload MoE experts across GPUs, fetch on demand with P2P copies, and drop layer‑wise collectives to accelerate your parallel LLM inference projects. It shows ~9–11% higher output TPS/GPU on DeepSeek‑R1 at similar TPS/user, with bigger wins under imbalanced traffic. 📗 arxiv.org/abs/2604.01621
2
24
137
9.9K
Fred Oh retweeted
Inferact @inferact
We are thrilled to announce that @nvidia is the latest investor in @inferact. We look forward to continuing the momentum driven by our deep collaboration: (1) Engineering velocity: a significant uptick in @nvidia pull requests to the @vllm_project repo. (2) Product synergy: close integration with NVIDIA Dynamo, ModelOpt, Nemotron, and more products! It’s an exciting time for the growth and development of vLLM, the world's AI inference engine!
8
7
82
29.8K
Fred Oh retweeted
Jon Hernandez @JonhernandezIA
📁 Jensen Huang, CEO of Nvidia, says data centers are no longer supercomputers; they are AI factories. They don’t produce answers, they produce tokens. And those factories exist for one obsession: train faster, arrive sooner, lead first. The jump from Blackwell to Rubin delivers a 4x speed increase. One month of training instead of four changes the entire race.
31
58
344
30.1K
Fred Oh retweeted
NVIDIA AI Developer @NVIDIAAIDev
🎊 Congrats to @lmsysorg for advancing DeepSeek V3/R1 inference. ⚡️ On NVIDIA GB200 NVL72, they’re achieving 26k input tokens/s and 13k output tokens/s per GPU, a nearly 4× / 5× speedup vs H100. They achieved this with NVFP4 MoE, FP8 attention, scaling down expert parallelism by offloading weights via a 900GB/s bidirectional CPU-GPU interface, and compute-communication overlap. This work highlights how collaborative AI platforms with active communities speed learning and build stronger, high-performance infrastructure through shared expertise. 👇
LMSYS Org @lmsysorg

🚀 Follow-up to our last breakthrough on DeepSeek V3/R1 inference! On NVIDIA GB200 NVL72, SGLang now achieves 26k input tokens/s and 13k output tokens/s per GPU with FP8 attention + NVFP4 MoE - that’s a 3.8× / 4.8× speedup vs H100 settings. See the details in the 🧵 (1/4)

4
31
198
21.5K
Fred Oh @fredo_ai
✨ NVIDIA DALI just got even better. New features make it easier to integrate with #PyTorch, boost video processing with smarter selective decoding, and optimize memory for flexible CPU ↔️ GPU data flows. Read more in our tech blog ➡️ nvda.ws/4dEH58s
0
0
0
50
Fred Oh retweeted
NVIDIA AI Infrastructure @NVIDIAAIInfra
Scientists and engineers are using NVIDIA #CUDA-X libraries powered by the NVIDIA GB200 and GH200 Superchips to solve the world's toughest challenges. Read our #GTC25 blog to dive into these incredible scientific use cases. ⤵️ nvda.ws/4lCfjgn
2
5
40
1.4K
Fred Oh retweeted
NVIDIA HPC Developer @NVIDIAHPCDev
Discover how #CUDA-X empowers you -- as developers, researchers, and engineers -- to drive scientific discovery, transform #AI, and redefine computation to make real-time digital twins truly possible. Watch the demo 📹 youtube.com/watch?v=CyP0Su… #GTC25
0
3
14
1.1K
Fred Oh @fredo_ai
For 16 years running, NVIDIA technologies have powered every Academy Award-nominated film for Best #VFX. This year, NVIDIA researchers are honored for their groundbreaking work in simulation, denoising, and rendering. #GTC25 bit.ly/4hn3ibf
0
0
2
47
NVIDIA AI Developer @NVIDIAAIDev
Excited to build a high-performance AI workstation. 🙌 @songhan_mit just finished a brand-new system featuring the GeForce RTX 5090. Here are some key learnings they shared about configuration: nvda.ws/40XUAtM
2
14
70
4.3K
Fred Oh @fredo_ai
New #CUDA Toolkit 12.8 delivers NVIDIA Blackwell support with CUDA Graphs, accelerated math and #Python libraries, compiler, and Nsight developer tools enhancements. Learn more ➡️ bit.ly/4hKCnHb
0
0
2
81
Fred Oh retweeted
NVIDIA HPC Developer @NVIDIAHPCDev
Explore a tutorial on implementing high-performance #CUDA math operations in #Python, interoperable with #PyTorch and #CuPy, using nvmath-python. Learn how to fuse epilog operations with matrix multiplication. ➡️ nvda.ws/3CNxspo #SC24
1
10
50
2.2K
Fred Oh retweeted
Casey Aylward @caseyaylward
14/ We had so much fun bringing this special community together IRL for an amazing day of technical talks/hacking. Big thanks to everyone who turned out and our planning committee (here’s an “after” photo taken at midnight). Full videos of the talks to follow soon #cudamodeirl.
1
1
27
2.1K
Fred Oh retweeted
Casey Aylward @caseyaylward
13/ We had 40+ teams submit projects for judging and 10 projects demoed for the entire group. The top five won prizes. The top three were also awarded 4080s signed by Jensen Huang :).
2
1
32
12.1K
Fred Oh retweeted
Casey Aylward @caseyaylward
12/ You may recognize Wen-mei from his famous textbook "Programming Massively Parallel Processors." He signed textbooks in the evening… and even someone’s iPad!
1
1
20
1.5K
Fred Oh retweeted
Casey Aylward @caseyaylward
11/ Last but not least, Wen-mei Hwu from @nvidia shared how to pick a hard problem to work on for a decade.
1
1
21
1K
Fred Oh retweeted
Casey Aylward @caseyaylward
10/ @Tim_Dettmers spoke on how open source can win over closed source. Fun fact: he was the original author of the phrase “CUDA mode.”
1
1
17
1.1K
Fred Oh retweeted
Casey Aylward @caseyaylward
9/ @eqhylxx then kicked off our evening talks, speaking on designing performant speculative decoding solutions in @vLLM.
2
2
20
4.6K
Fred Oh retweeted
Casey Aylward @caseyaylward
6/ @karpathy wrapped up our morning talks with the story of how he built llm.c and gave the audience some great hack ideas. We had the core llm.c devs there too, which was awesome.
1
1
20
1.1K