Sayak Paul

6.8K posts

@RisingSayak

ML at Hugging Face 🤗

Earth · Joined May 2012
124 Following · 23.7K Followers
Sayak Paul @RisingSayak
@sagnikcodes In order for us to get there, we need traces for fine-tuning SLMs.
Sagnik @sagnikcodes
@RisingSayak Skills are fine, I'm just waiting for the day when small SLMs will write good kernels for my GPU locally
Sayak Paul @RisingSayak
The kernels project at Hugging Face has been growing! We want it to be the go-to place for kernel devs and kernel users. We're looking to work w/ folks who're interested in doing agentic kernel dev, providing real optimization value to real models. Reach out if interested :)
Frosty40 @FrostForger
@RisingSayak My brother, this is my lifestyle. How can I be of service
Sayak Paul @RisingSayak
The project isn't specific to CUDA or PyTorch, btw. It has multiple backends beyond CUDA, such as ROCm and `tvm-ffi`.
Sayak Paul @RisingSayak
After working on the v5 release, this is the latest release from the Transformers team at @huggingface.
Sayak Paul @RisingSayak
Some additional things I did:
* Use the XLA Pallas Flash Attention kernel
* Profile with `xprof`

Profiling definitely revealed some of the shortcomings of the Diffusers implementation w.r.t. the XLA-specific aspects, but that was out of scope for the project.
Sayak Paul @RisingSayak
I implemented a PyTorch/XLA variant of Qwen/QwenImage with SPMD to fit on a TPU v6e-8. Inference for 50 steps after compilation takes ~20 sec. Not bad 🔥 Longer ⬇️
Sayak Paul @RisingSayak
We released Diffusers 0.38.0, and it's packed with new pipelines and several library-related improvements 🔥

A bunch of new pipelines, including audio 🎼
* Ace-Step 1.5
* LongCat-AudioDiT
* Ernie-Image
And more!

Next up, we added support for:
* Flash Attention 4
* Loading with FlashPack
* Ring Anything as a new backend for context parallelism

Last but not least, we added an example of how to profile a DiffusionPipeline and potentially improve its performance. Enjoy 🧨
Sayak Paul @RisingSayak
1. Read the post. 2. Contemplate. 3. Repeat 1.
Arthur Zucker @art_zucker

This is going to be a little bit long, but I want to give hope to my fellow anxious ML engineers.

We see a lot of propaganda about how this or that AI one-shotted something, about how incredibly strong the models are getting, and how we don't even need to review PRs and can just ship to production. Although this can be true in some cases, it's also far from representative of all the challenges we have to face.

I started using Claude Code 4 months ago and quickly realized how it really does change the way we work. I can experiment 10x faster, fix small issues without coding, and refactor code without sweating. BUT these tasks were "just" tedious, not hard. The challenge in my day-to-day work is to take research code and integrate it into transformers using our standards. It's challenging because code beauty is abstract and subjective, just like philosophy.

By relying too much on Claude, and on how seemingly good the code it produces looks, I pushed the deepseekv4 integration without realizing that Claude really did not understand the model. I gave it access to `transformers`, the original paper, the original code, the different blog posts, my past chats and skills created for adding a model, a B200 node, and a LOT of tokens, but it did NOT nail it. It did not understand the eager attention path; it did not understand the basics of causal attention. It was even wrong implementing the manifold-constrained hyper-connections. It helped reduce the burden of exploring implementations and debugging, but it did not help reason about the model.

I am not a doomer. I think our job as Software Engineers has never been this great. I am just saying that we still have a job, and we should still be a bit careful when it looks too good to be true 😉
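The "basics of causal attention" the post mentions are simple to state: each token may attend only to itself and to earlier positions, enforced by masking the upper triangle of the score matrix before the softmax. A minimal NumPy sketch of the idea (illustrative only, not the transformers implementation):

```python
import numpy as np

def causal_attention(q, k, v):
    """Scaled dot-product attention with a causal (lower-triangular) mask."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                            # (T, T) similarity scores
    mask = np.triu(np.ones(scores.shape, dtype=bool), k=1)   # True above the diagonal
    scores = np.where(mask, -np.inf, scores)                 # block attention to future tokens
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)           # softmax over allowed positions
    return weights @ v

rng = np.random.default_rng(0)
T, d = 4, 8
q, k, v = rng.standard_normal((3, T, d))
out = causal_attention(q, k, v)
# token 0 can attend only to itself, so its output equals v[0]
```

Because position 0 has every other score masked to -inf, its softmax puts all weight on itself, which is an easy sanity check for any causal-attention implementation.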

Hunter Gon @gonlenidefi
@RisingSayak Alright, you read the title; now you're loop-trapped: eternal learning, zero action
Sayak Paul @RisingSayak
Expansion is a good thing, and may it never run out! Plus it's Japan 🧨