Michael G

7 posts

@M_Gschwind

AI acceleration; created the first general-purpose programmable accelerators (Cell SPE / PlayStation 3, Roadrunner)

Menlo Park, CA · Joined October 2019
1 Following · 0 Followers
Michael G @M_Gschwind
@Quicken your Quicken license verification server is down. I would expect that, as part of paying the license fee, I can actually use the software.
Elon Musk @elonmusk
Please stay tuned while we make adjustments to the uh … “algorithm”
Michael G @M_Gschwind
@karpathy @benjamin_bolte xFormers and FlashAttention custom kernels come standard with PT2, so you can use them off the shelf. Check out the PyTorch nightlies! PT2 has both separate functions for each custom kernel and a generic _scaled_dot_product_attention that picks the best implementation for the specific parameters.
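The dispatching attention entry point mentioned above shipped publicly in PyTorch 2.0 as `torch.nn.functional.scaled_dot_product_attention` (the tweet names the underscore-prefixed nightly-era symbol). A minimal sketch, with illustrative tensor sizes, assuming a PyTorch 2.x install:

```python
import torch
import torch.nn.functional as F

# Batch of 2 sequences, 4 heads, length 16, head dim 32 (illustrative sizes)
q = torch.randn(2, 4, 16, 32)
k = torch.randn(2, 4, 16, 32)
v = torch.randn(2, 4, 16, 32)

# PyTorch dispatches to the best available kernel for these inputs:
# FlashAttention, the memory-efficient kernel, or the math fallback
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # torch.Size([2, 4, 16, 32])
```

The same call works on CPU and GPU; kernel choice depends on device, dtype, and mask arguments.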
Andrej Karpathy @karpathy
having fun optimizing minGPT today
- base: 495ms
- zero_grad(set_to_none=True): 492
- torch.jit.script gelu: 463
- OMP_PROC_BIND=CLOSE: 453
- torch.backends.cuda.matmul.allow_tf32: 143
- torch.autocast(torch.bfloat16): 121
- FlashAttention: 102
now: more fused kernels more better
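Several of the knobs in the list above are one-liners in standard PyTorch. A minimal sketch of `zero_grad(set_to_none=True)`, the TF32 matmul flag, and `torch.autocast` together; the tiny `Linear` model and CPU autocast are illustrative stand-ins (the tweet's numbers are from a GPU run of minGPT):

```python
import torch

model = torch.nn.Linear(8, 8)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

# Allow TF32 matmuls on Ampere+ GPUs (setting the flag is a no-op on CPU)
torch.backends.cuda.matmul.allow_tf32 = True

x = torch.randn(4, 8)
# Mixed-precision forward pass; device_type="cpu" here for portability
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    loss = model(x).square().mean()
loss.backward()
opt.step()

# set_to_none=True frees gradient tensors instead of filling them with zeros
opt.zero_grad(set_to_none=True)
print(all(p.grad is None for p in model.parameters()))  # True
```

`OMP_PROC_BIND=CLOSE` is an environment variable, and `torch.jit.script` on the GELU fuses its elementwise ops; neither needs a code change to the model itself.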
Soumith Chintala @soumithchintala
@karpathy Once you publish the "optimized minGPT" repo, we'll probably send some more patches in.
1. With the latest nightlies, the xformers/flash-attention kernels are in PyTorch core now.
2. We have a matmul autotuner about to land that gives a significant perf boost.
Michael G @M_Gschwind
@JimiDevine Amazing progress for civilization. Dismantle slums and build student housing.
Jimi Devine @JimiDevine
Listening to the horrible sound of chainsaws in People's Park. Watching trees drop. Sad day for Berkeley.