Pinned Tweet
Loïck BOURDOIS
335 posts

Loïck BOURDOIS
@BdsLoick
FAT5 (Flash Attention T5) boy @huggingface Fellow 🤗
France · Joined October 2021
230 Following · 280 Followers

FYI, according to @huggingface's API as of March 3, 2026, @Alibaba_Qwen has had 1,296,972,250 downloads, and even 1,625,639,122 if we count derivative models (fine-tuning, quantization, etc.) since they started releasing open-source models in September 2023.
Junyang Lin@JustinLin610
me stepping down. bye my beloved qwen.
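For anyone wanting to reproduce this kind of count, here is a minimal sketch using the huggingface_hub client. The author name "Qwen" and the use of the downloadsAllTime expand field are assumptions; counting derivative models would additionally require walking the fine-tune/quantization model trees.

```python
from huggingface_hub import HfApi

api = HfApi()

# Sum all-time downloads over every model published by one organization.
# "Qwen" is the assumed org name on the Hub; adjust as needed.
total = 0
for model in api.list_models(author="Qwen", expand=["downloadsAllTime"]):
    total += model.downloads_all_time or 0

print(f"Total downloads: {total:,}")
```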

CuTeDSL is really nice
For those wishing to get into writing kernels in this language, github.com/b-albar/machete can be useful
Boris ALBAR reimplemented Flash Attention, RoPE, RMSNorm, etc.
Everything compatible with HF Transformers (tests on llama3, GLM4.7, Qwen3), TRL, PEFT/LoRA
maharshi@maharshii
CuTeDSL is my new favourite thing: I wrote a kernel for RMS norm after learning about layouts, tiling, copying tensors, reductions, and so on, especially for inference, and it is about 2.13x faster than a Triton fused kernel for the given shape.
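For context on what such a kernel computes, here is a plain PyTorch reference for RMSNorm (a sketch of the math only, not the CuTeDSL or Triton implementation): normalize by the root mean square over the hidden dimension, then apply a learned scale.

```python
import torch

def rms_norm(x: torch.Tensor, weight: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # Normalize by the RMS over the last (hidden) dimension, then scale.
    inv_rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + eps)
    return x * inv_rms * weight

x = torch.randn(4, 128, 4096)  # (batch, seq, hidden)
w = torch.ones(4096)           # learned scale, initialized to 1
y = rms_norm(x, w)
```

A fused kernel computes the same thing in a single pass; the speedup comes mostly from avoiding intermediate memory round-trips.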
Loïck BOURDOIS retweeted

𝗜𝗻𝘁𝗿𝗼𝗱𝘂𝗰𝗶𝗻𝗴 𝗘𝗕-𝗝𝗘𝗣𝗔 ⚡
An open-source library making JEPAs accessible, trainable on a single GPU in hours! 🚀
🔗 Paper: arxiv.org/abs/2602.03604
💻 Code: github.com/facebookresear…


@BdsLoick @OpenMed_AI @huggingface This is just the first 6 months, up to the blog post; I wanted to share what happened so far.
I haven't counted after that. A lot of models, it takes time to count 😆

@gui_penedo I suppose all good things must come to an end. Thank you very much for the high-quality multilingual datasets


Update: I’ve left Hugging Face 🤗
I spent the last ~2.5 years working on large-scale datasets like FineWeb🍷, FineWeb2 🥂, FineTranslations💬, and FinePDFs📄, and also got to contribute to exciting projects like SmolLM🤏 and Open-R1🐳.
It’s pretty incredible to look back at the impact these projects have had, and to see how much the community now relies on them. The reception to FineWeb when it came out, in particular, was wild.
One thing I really appreciated was being able to share what we were learning as we went rather than keeping everything internal. Writing lengthy blog posts (books?) takes a lot of work but ends up being incredibly rewarding. Hugging Face’s culture actively encourages this, and I feel very lucky to have been able to work in that kind of environment.
I’m now starting a new project with @HKydlicek (also leaving HF), still fully focused on data and strongly shaped by what we learned building at scale.
More soon 🫡

@lhoestq @huggingface @mervenoyann @abhi1thakur If you rename the `datasets` library to `nlp` as in early 2020, I'll make sure it passes 700k before the end of the year 👀

Wow, there are now 600,000 public datasets on @huggingface!
There were exactly 600 five years ago 😳
(@mervenoyann @abhi1thakur can testify)
So, to summarize:
- 2020→2025: ×1000
- 2025→2030: ×1000 too??? 😂
The open-source community is crazy!

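A rough way to reproduce such a count with the Hub API (a sketch; this paginates through the full listing, so it is slow, and the number shown on the Hub's search page is quicker to check):

```python
from huggingface_hub import HfApi

api = HfApi()

# Count all public datasets by iterating over the full listing (slow).
count = sum(1 for _ in api.list_datasets())
print(f"{count:,} public datasets")
```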
Loïck BOURDOIS retweeted

Today, we're releasing an updated Gemini 2.5 Flash Native Audio model. Now available via the Live API 🗣
blog.google/products/gemin…

Interesting read from the ARChitects, the 2nd-place team in the @arcprize competition on @kaggle.
Their combination of diffusion LLMs with iterative improvement is quite interesting. It has some ties with the TRM and HRM models.
There is some irony though. They tried something different from their winning solution from last year because they thought it was not successful. The irony is that we won by reusing their last year's solution (with some improvements). The key for us was using better pretraining data.
Jan Disselhoff@JDisselh
ARC Prize 2025 is over: an amazing contest, with amazing people competing. This year our team "the ARChitects" managed to reach second place. We tried a lot of things; some thoughts and an explanation of our approach below!

@MaziyarPanahi "he’s gonna work" 🤔
He has been working on it for at least 2 years and uses the term AMI in his lectures (in the one given at the Collège de France as part of Benoit Sagot's series, he said that he used this name with Joëlle Pineau because of the meaning it also has in French)


Yann revealed what he's gonna work on next: Advanced Machine Intelligence (AMI)
AGI
ASI
SSI
AMI
Maziyar PANAHI@MaziyarPanahi
the man himself, @ylecun
Loïck BOURDOIS retweeted

🚨New Paper @AIatMeta 🚨
You want to train a massively multilingual model, but languages keep interfering and you can't boost performance? Using a dense model is suboptimal when mixing many languages, so what can you do?
You can use our new architecture Mixture of Languages!
🧵1/n

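As background only, here is a toy sketch of the general idea of language-conditioned routing: a shared FFN plus per-language experts selected by a language ID. This is an illustration of the concept, not the architecture from the paper (whose routing and expert design are not reproduced here).

```python
import torch
import torch.nn as nn

class LanguageRoutedFFN(nn.Module):
    """Toy sketch: a shared FFN plus one FFN expert per language,
    selected by a language ID. Illustrative only, not the paper's design."""

    def __init__(self, d_model: int, d_ff: int, n_languages: int):
        super().__init__()
        def ffn():
            return nn.Sequential(
                nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
            )
        self.shared = ffn()
        self.experts = nn.ModuleList(ffn() for _ in range(n_languages))

    def forward(self, x: torch.Tensor, lang_id: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model); lang_id: (batch,) integer language index.
        out = self.shared(x)
        routed = torch.zeros_like(out)
        for i, expert in enumerate(self.experts):
            mask = lang_id == i
            if mask.any():
                routed[mask] = expert(x[mask])
        return out + routed

layer = LanguageRoutedFFN(d_model=512, d_ff=2048, n_languages=8)
x = torch.randn(4, 16, 512)        # (batch, seq, hidden)
lang = torch.tensor([0, 3, 3, 7])  # one language per sequence
y = layer(x, lang)                 # (4, 16, 512)
```

The intuition: the shared expert captures cross-lingual structure while per-language experts absorb what would otherwise interfere in a dense model.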

@antoine_chaffin So much easier than downloading the dataset yourself and then searching through it for what you want

@BdsLoick (You also reminded me we can run SQL queries on the Hub datasets; there goes my afternoon)
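For reference, recent DuckDB versions can query Hub-hosted datasets directly through hf:// paths, so no full download is needed. The repo and file path below are placeholders:

```python
import duckdb

# Read a Hub-hosted Parquet file remotely; only the needed bytes are fetched.
# The repo and file path are placeholders; substitute a real dataset.
duckdb.sql("""
    SELECT *
    FROM 'hf://datasets/some-org/some-dataset/data/train-00000-of-00001.parquet'
    LIMIT 10
""").show()
```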

@antoine_chaffin I haven't checked whether the errors are already in the initial BEIR or were introduced with the Nano version, but earlier this year arxiv.org/abs/2505.16967 found other annotation errors

@BdsLoick FWIW I checked and it does not seem like those faulty documents are linked to any query in the qrels
Still bad, but a bit less bad than if they were
I wonder if those come from the original DBPedia dataset or were introduced in the Nano version
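A sketch of how such a check might look. The repo name, config names, and column names below are assumptions for illustration (NanoBEIR-style datasets typically ship corpus, queries, and qrels subsets), and the faulty document IDs are placeholders:

```python
from datasets import load_dataset

# Assumed repo/config layout for illustration; adjust to the real dataset.
corpus = load_dataset("zeta-alpha-ai/NanoDBPedia", "corpus", split="train")
qrels = load_dataset("zeta-alpha-ai/NanoDBPedia", "qrels", split="train")

faulty_ids = {"<doc-id-1>", "<doc-id-2>"}  # placeholder faulty document IDs
linked = faulty_ids & set(qrels["corpus-id"])
print("Faulty docs referenced in qrels:", linked or "none")
```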

@antoine_chaffin Yeah, the corpus split of this subset.
Note there is also a similar issue for NanoFiQA2018, NanoNQ, and NanoSCIDOCS (so at least 30.7% of the splits have a problem).

@BdsLoick god I knew Quora had issues but this seems very odd for DBPedia
Those are from the document corpus, right?

Loïck BOURDOIS retweeted

We've just published the Smol Training Playbook: a distillation of hard-earned knowledge to share exactly what it takes to train SOTA LLMs ⚡️
Featuring our protagonist SmolLM3, we cover:
🧭 Strategy on whether to train your own LLM and burn all your VC money
🪨 Pretraining, aka turning a mountain of text into a fancy auto-completer
🗿 How to sculpt base models with post-training alchemy
🛠️ The underlying infra and how to debug your way out of NCCL purgatory
Highlights from the post-training chapter in the thread 👇

Loïck BOURDOIS retweeted

📢 Thrilled to introduce ATLAS 🗺️: scaling laws beyond English, for pretraining, finetuning, and the curse of multilinguality.
The largest public multilingual scaling study to date: we ran 774 experiments (10M-8B params, 400+ languages) to answer:
🌍 Are scaling laws different by language?
🧙‍♂️ Can we model the curse of multilinguality?
⚖️ Pretrain from scratch or finetune from a multilingual checkpoint?
🔀 Cross-lingual transfer scores for 1,444 language pairs?
1/🧵


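For background, scaling-law studies of this kind typically fit a Chinchilla-style parametric loss curve in parameter count N and training tokens D (shown here as general background, not necessarily ATLAS's exact parameterization):

```latex
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
```

Per-language differences would then surface as language-specific values of the fitted constants E, A, B, α, and β.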

🤗 Sentence Transformers is joining @huggingface! 🤗
This formalizes the existing maintenance structure, as I've personally led the project for the past two years on behalf of Hugging Face. I'm super excited about the transfer!
Details in 🧵


@tomaarsen Tom, you told me to count timm as part of HF but not ST, because ST was just sponsorship in the form of maintenance from HF 😭
I'm going to have to redo all the graphs in my blog post