Paul

112 posts

@edchangy

ML and stuff

Joined June 2011
725 Following · 134 Followers
Paul reposted
Dan Alistarh @DAlistarh
Speedrunning GPT-2 is now routine thanks to @karpathy. But can we speedrun GPT3-175B? We attempted to match accuracy on a <$10K budget; while we didn't quite reach it, our first results show that quality data, engineering, and native FP4 can get close. Details in 🧵
4
22
170
12.4K
Paul reposted
Matej Sirovatka @m_sirovatka
What’s the best model you can train in a day if someone hands you a pile of Blackwell GPUs? You can try it out yourself. On April 9 in Paris, @GPU_MODE + @verdacloud + @sestercegroup are hosting a GPU hackathon with a bunch of GPUs to run on and even more of them for the winners.
12
8
160
8.9K
George Grigorev @iamgrigorev
I am thinking of writing the next blogpost about these topics:
- Optimizing training throughput with FP8 (I will show how to write FP8 kernels)
- How to implement DDP
- How to implement FSDP, with distributed Muon
- How to implement TP
- Gradient accumulation
- Gradient checkpointing
I think applying “How to scale your model” to real consumer GPUs connected over PCIe, and writing that from scratch in pure PyTorch, would be really useful.
13
19
357
20.6K
Paul reposted
Dan Alistarh @DAlistarh
🚀 We are releasing state-of-the-art post-training quantization (PTQ) algorithms for Microscaling FP4, together with kernels:
- First study focused on MXFP4/NVFP4 PTQ for LLMs
- New Micro-Rotated (MR) format and GPTQ algorithm
- QuTLASS GPU kernels with up to 3.6x speedups
2
28
153
9.4K
Jack Monas @JackMonas
New leader for the Compression track in the ICCV 1X World Model Challenge! Submission from @DataCrunch_io @antferdom. Final deadline to submit solutions is Sep. 27 AoE.
Daniel Ho @itsdanielho

We at @1x_tech with @JackMonas are excited to announce the ICCV phase of our 1X World Model Challenge: huggingface.co/spaces/1x-tech… Participate in the Compression and Sampling tracks for an $8k prize pool & train generative models for cool robot results like: 1x.tech/discover/redwo…

1
3
13
1.7K
Paul reposted
Verda (formerly DataCrunch) @verdacloud
❗️ We just expanded our capacity of B200 SXM6 180GB servers – available in our Cloud Platform. The best thing is… With DataCrunch, you can deploy the Blackwell platform without approvals. Just sign in and select the instance type: cloud.datacrunch.io/?utm_source=x&…
0
2
1
149
Paul @edchangy
Also pretty cool to see the open-source community building on each other's work!
0
0
0
18
Paul @edchangy
The paper also suggests Group Tied Attention (GTA), which works in the opposite direction and draws inspiration from MLA, incorporating those techniques into GQA.
0
0
0
18
Paul @edchangy
Well, the paper suggests a hybrid method. What about using MLA and adding groups?
0
0
0
12
Paul @edchangy
First of all, a confession! In the blog titled 'Multi-Head Latent Attention: Benefits in Memory and Computation', we didn't tell the whole story: the benchmarking was done on a single GPU. In reality, DeepSeek V3-style models need to be parallelized across GPUs.
3
0
0
37
Paul @edchangy
Instead, one must make a copy of the latent component across GPUs, which feels wasteful.
0
0
0
16
Paul reposted
Verda (formerly DataCrunch) @verdacloud
🆕 Inference APIs for FLUX.1 Kontext [max] & [pro] are now available on DataCrunch! We are an infrastructure partner of @bfl_ml for Kontext, a suite of generative flow matching models for text-to-image and image-to-image editing. Learn more: datacrunch.io/managed-endpoi…
1
2
4
291
Paul reposted
Verda (formerly DataCrunch) @verdacloud
🚨 Summer Inference by Symposium AI is happening next Wednesday, June 4, at 16:00-22:00. 🇫🇮 This event will bring together 250 AI engineers, researchers, and founders under one roof in Helsinki. 🔗 You can still grab one of the last remaining seats: lu.ma/x5hhj79x
0
1
3
128