thom✨

1.1K posts

@gpuwaster

highly performative computing

🇫🇷 Joined December 2017
239 Following · 128 Followers
levi @levidiamode
Day 74/365 of GPU Programming. I always found die shots and SM diagrams beautiful but difficult to map mentally, so I've been trying to find a way to interact with GPUs in 3D. This is what I have so far: a single input that goes through a simplified H100 execution pipeline to show what the silicon is doing at each step, from CPU-side tokenization and embedding lookup, through matmuls on tensor cores, to the final softmax output. My current plan is to make this an interactive playground that lets you zoom in and out through various levels of depth (package → die → GPC → SM → tensor core) while also including step-through examples similar to the bycroft LLM 3D visualization. Ideally this should make exploring the architectural side just as easy as mapping CUDA abstractions onto the actual hardware processes. I'm starting with an H100, but it would be fun to expand this to more GPUs and highlight the differences between generations. This was largely inspired by @srush_nlp's GPU puzzles, @JayAlammar's Illustrated Transformers and @karpathy's makemore series, which made me think about how to study and visualize GPUs from the ground up.
levi @levidiamode

Day 73/365 of GPU Programming. Wanted to understand FP4 better and came across this great @Cohere_Labs talk on Training LLMs with MXFP4 and @juliarturc's amazing series on quantization. So fascinating learning what makes low precision work for LLM training and inference.

thom✨ @gpuwaster
@gpusteve i always see emilio’s posts on linkedin and i love your thumbnails
steve @gpusteve
wanted to write a bit about a pitfall we ran into when using cpp_extensions.load() for our kernel generation benchmarks. you can easily lose a few microseconds if you're not pinning processes correctly or if you're loading too many modules.
[image]
thom✨ retweeted
Chris 🇨🇦 @llm_wizard
Jensen’s self-described “best slide”
[image]
thom✨ @gpuwaster
oh the keynote is still live lets go
thom✨ retweeted
enigmatriz @enigmatriz
cover and opener for Bloomberg Businessweek.
[2 images]
Kirtesh @AKirtesh
@Hi_Mrinal xAI interview on the resume >>> most people's dream offers. King shit fr 😌👀
Mrinal @Hi_Mrinal
Still one of my biggest fumbles
[image]
teo @teodorio
I never did a leetcode problem
Tim @TimurNegru
Someone is selling an 18th-century French estate sitting directly on the Lot River. 27 rooms, 10 beds/8 baths, 487m² (5,200 sq ft) of living space across the main house and 3 gîtes (self-contained guesthouses). 2 swimming pools, tennis court, wine cellar and 1.18 hectares (2.9 acres) of gardens running down to the river. Live in the main house, your friends and family have their own space and nobody's in each other's way. This is Cahors Malbec country by the way, with Bordeaux 2.5 hours away (wine lovers will understand). Asking price: €1.3M ($1.4M). How much would an estate like this cost in your country?
[4 images]
thom✨ @gpuwaster
wtf is a MUFU.RCP
thom✨ @gpuwaster
it's the weekend, me and claude are reading SASS!
9 @5amoljen
I got an offer from CERN :)
thom✨ @gpuwaster
@Leik0w0 if you had no gpus what would u use, prime intellect to rent them?