Vladimir Vlejd Macko

100 posts

@vlejd

I like taking things from 0 to 1. From nothing to something. Sprinkled with ML when necessary.

Joined March 2011
106 Following · 67 Followers
Pinned Tweet
Vladimir Vlejd Macko @vlejd
Unstructured weight #sparsity made practical. 50% unstructured weight sparsity was considered too low for a real GPU speedup without specific hardware support (like @cerebras). With @bozavlado we built MACKO-SpMV - a new matrix format + SpMV kernel to change that. 🧵
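For readers new to the operation: SpMV multiplies a sparse matrix by a dense vector, touching only the stored non-zeros. The sketch below uses the generic CSR (compressed sparse row) layout as a baseline. It is not the MACKO format from the thread, and the names `to_csr` and `spmv_csr` are illustrative assumptions.

```python
def to_csr(dense):
    """Convert a list-of-lists matrix to CSR arrays (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)   # non-zero value
                col_idx.append(j)  # its column position
        row_ptr.append(len(values))  # where the next row starts
    return values, col_idx, row_ptr


def spmv_csr(values, col_idx, row_ptr, x):
    """Compute y = A @ x where A is given in CSR form."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        # Only the stored non-zeros of row i contribute.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


# Toy 3x4 matrix (hypothetical data, for illustration only).
A = [[1.0, 0.0, 2.0, 0.0],
     [0.0, 3.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 4.0]]
x = [1.0, 2.0, 3.0, 4.0]

values, col_idx, row_ptr = to_csr(A)
y = spmv_csr(values, col_idx, row_ptr, x)
print(y)
```

Each non-zero costs one value read plus one index read, which is the overhead a sparse format has to keep small.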
Vladimir Vlejd Macko @vlejd
🛠️ Next step: server GPUs. If you know how to implement a minimal CUDA matvec on H100 that hits ≥95% of cuBLAS 👉 My DMs are open.
Vladimir Vlejd Macko @vlejd
Results @ 50% unstructured sparsity (fp16):
• 1.2×–1.5× faster than cuBLAS (first unstructured GPU method to beat dense at this level)
• 1.5× memory reduction
👉 Unstructured pruning now gives much better benefits.
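For context on the memory figure, here is a rough footprint calculation for a generic CSR layout at the same sparsity level. It assumes fp16 values, 32-bit indices, and a hypothetical 4096×4096 matrix. It says nothing about how MACKO stores its data; it only illustrates why an index-per-non-zero format does not save memory at 50% sparsity.

```python
def dense_bytes(rows, cols, value_bytes=2):
    """Bytes needed to store every entry (fp16 = 2 bytes each)."""
    return rows * cols * value_bytes


def csr_bytes(rows, cols, density, value_bytes=2, index_bytes=4):
    """Bytes for generic CSR: values + column indices + row pointers."""
    nnz = int(rows * cols * density)
    return nnz * (value_bytes + index_bytes) + (rows + 1) * index_bytes


rows = cols = 4096  # assumed matrix size, for illustration only
dense = dense_bytes(rows, cols)
csr = csr_bytes(rows, cols, density=0.5)  # 50% sparsity = 50% density

print(f"dense: {dense} bytes")
print(f"CSR:   {csr} bytes ({csr / dense:.2f}x dense)")
```

Under these assumptions CSR comes out about 1.5× larger than the dense matrix, so a format has to spend far fewer bits per non-zero on position information before it can shrink memory at this sparsity.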