Luc Georges

367 posts


@LucSGeorges

Software & ML Engineer @huggingface 🦀

Paris, France · Joined December 2020
470 Following · 1.7K Followers

Pinned Tweet
Luc Georges@LucSGeorges·
we've been pushing commits to transformers discreetly, time to talk about what we've been cooking the last few months:

⚡️ Continuous Batching is in transformers ⚡️

this will simplify, most notably, evaluation and your training loop: no need for extra dependencies or infra to get fast inference, and no need for convoluted code to update your weights

note that speed is currently not on par with the best inference frameworks and servers out there, and probably never will be. the goal is *not* to become as fast: we want to complement the existing landscape with features like these, aiming for transformers to be the toolbox for tinkering with and building models
[image]
14 replies · 19 reposts · 177 likes · 51.6K views
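To make the idea above concrete: continuous batching means a finished sequence immediately frees its slot for the next queued request, instead of the whole batch waiting for its slowest member. The following is a toy Rust scheduler of my own, not the transformers API; `Request`, `run`, and the step counts are all hypothetical illustration.

```rust
use std::collections::VecDeque;

// A toy request: how many decode steps it still needs.
struct Request {
    id: usize,
    steps_left: usize,
}

// Continuous-batching loop: up to `slots` sequences decode in lockstep,
// and whenever one finishes, the next queued request takes its slot
// immediately instead of waiting for the whole batch to drain.
fn run(mut queue: VecDeque<Request>, slots: usize) -> (usize, Vec<usize>) {
    let mut active: Vec<Request> = Vec::new();
    let mut finished = Vec::new();
    let mut steps = 0;

    while !queue.is_empty() || !active.is_empty() {
        // Refill empty slots from the queue (the "continuous" part).
        while active.len() < slots {
            match queue.pop_front() {
                Some(r) => active.push(r),
                None => break,
            }
        }
        // One decode step for every active sequence.
        for r in &mut active {
            r.steps_left -= 1;
        }
        steps += 1;
        // Retire completed sequences, freeing their slots for the next turn.
        active.retain(|r| {
            if r.steps_left == 0 {
                finished.push(r.id);
                false
            } else {
                true
            }
        });
    }
    (steps, finished)
}

fn main() {
    // Six requests needing 1, 2, 3, 1, 2, 3 decode steps, two slots.
    let queue: VecDeque<Request> = (0..6)
        .map(|id| Request { id, steps_left: (id % 3) + 1 })
        .collect();
    let (steps, finished) = run(queue, 2);
    println!("decode steps: {steps}, finished order: {finished:?}");
}
```

With these numbers the continuous loop takes 7 lockstep decode steps, whereas static batches of two (each waiting for its slower member) would take 8; the gap widens as sequence lengths get more skewed.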
elie@eliebakouch·
today is my last day at hugging face. feeling really grateful to have worked with such an amazing team and learned so much along the way. i’m proud of what we accomplished together, especially the smollm series. building that project from scratch, putting so much into it, and getting to iterate on a model and training recipe that pushed the frontier for its size was really rewarding.

i hope i was able to play a part in making model training more accessible and in pushing the open model ecosystem forward. i’m also very thankful to hf for giving me the chance to share my passion for llm research, especially here, and to connect with so many awesome people.

things can get quite intense in this field, but i’m still very excited about the next challenges and about the good this technology can do. but first, taking a few weeks break :)
116 replies · 10 reposts · 745 likes · 32.7K views
Luc Georges@LucSGeorges·
I seem to have found somewhat of a sweet spot: talk into Claude for the ideation phase, write down the plan, and do everything by hand myself, apart from tests maybe. who likes writing tests, amiright

I question / rework / ignore everything written in the plan as it often misses the target, but it does help me think through the problem in great detail. I go from one big plan to smaller in-depth plans for each substep, which works quite nicely.

Co-ideating with Claude keeps the fun alive imo, so long as you ask it to tweak / give feedback on your original ideas and have a vision for what you want to do. It kind of feels like pair programming!
Adam@adamdotdev

Programming was deeply satisfying work to me. Work for hours/days before getting the payoff of the code working well on your machine. I’m feeling so much friction now to open the editor and do this kind of task by hand, but also increasingly depressed with the nature of work in an AI assisted dev workflow. Back and forth prompting seems to eat at my soul. Need to find a balance that brings back some of the toil.

0 replies · 0 reposts · 1 like · 316 views
Georgi Gerganov@ggerganov·
Today, ggml.ai joins Hugging Face. Together we will continue to build ggml, make llama.cpp more accessible, and empower the open-source community. Our joint mission is to make local AI easy and efficient for everyone to use on their own hardware.
Georgi Gerganov@ggerganov

I've started a company: ggml.ai From a fun side project just a few months ago, ggml has now become a useful library and framework for machine learning with a great open-source community

140 replies · 232 reposts · 1.6K likes · 296.5K views
Adrien Carreira@XciD_·
Finally updated the org chart. Yes, Claude gets a @huggingface.co email. No, we're not discussing their compensation.
[image]
3 replies · 1 repost · 21 likes · 7.9K views
Luc Georges retweeted
Lysandre@LysandreJik·
Transformers v5's FINAL, stable release is out 🔥 Transformers' biggest release. The big Ws of this release:
- Performance, especially for MoE (6x-11x speedups)
- No more slow/fast tokenizers -> way simpler API, explicit backends, better performance
- Dynamic weight loading: way faster, and enabling MoE now working w/ {quants, tp, peft, ...}

We have a migration guide on the main branch; please take a look at it in case you run into issues. Come to our GH issues if you still do after reading it 😀
[image]
9 replies · 87 reposts · 434 likes · 75.3K views
Luc Georges@LucSGeorges·
@steeve you clearly recognise the dance moves lol
0 replies · 0 reposts · 0 likes · 19 views
Steeve Morin@steeve·
Okay that one was worth it
3 replies · 0 reposts · 10 likes · 1.8K views
Luc Georges@LucSGeorges·
safetensors save_file on mac go brrrr ⚡️

been working hard these last few weeks on trying to make safetensors loading & writing faster. found that skipping the OS page cache with `F_NOCACHE` for write operations yields about a 30% speed improvement

more coming up, stay tuned
[image]
0 replies · 1 repost · 6 likes · 461 views
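A minimal sketch of the trick in the tweet above: on macOS, the `F_NOCACHE` fcntl command tells the kernel not to route a file descriptor's I/O through the page cache, which helps for large write-once files you won't re-read soon. This is my own illustration, not the safetensors implementation; `save_tensor_bytes` is a hypothetical helper, and the non-macOS branch simply falls back to cached writes.

```rust
use std::fs::File;
use std::io::Write;
use std::os::unix::io::AsRawFd;

// macOS-only: ask the kernel to bypass the page cache for this descriptor.
#[cfg(target_os = "macos")]
fn disable_page_cache(file: &File) -> std::io::Result<()> {
    const F_NOCACHE: i32 = 48; // from <sys/fcntl.h> on macOS
    extern "C" {
        fn fcntl(fd: i32, cmd: i32, arg: i32) -> i32;
    }
    let ret = unsafe { fcntl(file.as_raw_fd(), F_NOCACHE, 1) };
    if ret == -1 {
        return Err(std::io::Error::last_os_error());
    }
    Ok(())
}

// Elsewhere this is a no-op (Linux would reach for O_DIRECT instead,
// which has much stricter alignment requirements than F_NOCACHE).
#[cfg(not(target_os = "macos"))]
fn disable_page_cache(_file: &File) -> std::io::Result<()> {
    Ok(())
}

// Hypothetical save helper: write-once tensor data gains nothing from the
// page cache -- caching it just adds a copy and evicts more useful pages.
fn save_tensor_bytes(path: &str, data: &[u8]) -> std::io::Result<()> {
    let mut file = File::create(path)?;
    disable_page_cache(&file)?;
    file.write_all(data)?;
    file.sync_all()
}

fn main() -> std::io::Result<()> {
    save_tensor_bytes("/tmp/demo.safetensors", &vec![0u8; 1 << 20])?;
    println!("wrote 1 MiB");
    Ok(())
}
```

Whether this wins depends on the workload: for small files or data that is read back immediately, the page cache is doing useful work and skipping it can hurt.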
Jared Palmer@jaredpalmer·
We are sending out a proposal for Stacked Diffs on @GitHub to trusted design partners to gather initial feedback over the next few days. From there we’ll iterate and share the gameplan
[image]
Jared Palmer@jaredpalmer

RE: Stacked Diffs on @GitHub After discussion w/ @ttaylorr_b, we can implement stacked PRs/PR groups already (in fact we kind of do with Copilot), but restacking (automatically fanning out changes from the bottom of the stack upwards) would be wildly inefficient. To do it right, we need to migrate @GitHub to use git reftables instead of packed-refs so that multi-ref updates / restacking will be O(n) instead of ngmi. This will take some time but has been greenlit.

94 replies · 132 reposts · 2.5K likes · 494.3K views
Luc Georges@LucSGeorges·
@silasmarvin2 Well, I think there are multiple things happening at once. I wouldn’t say the loop is “hot” per se, maybe ~160 calls. I think the issue is that the ok_or call was chained to a `&PyBound<PyDict>::get_item` call, in a context where the GIL wasn’t released (no allow_threads)
1 reply · 0 reposts · 1 like · 22 views
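The classic `ok_or` footgun referenced in this thread (a general illustration; the exact PyO3 call chain in the tweet may differ): `ok_or`'s argument is evaluated eagerly on every call, even when the `Option` is `Some`, while `ok_or_else` takes a closure and only builds the error on the miss path. The names below (`expensive_error`, `lookup_eager`, `lookup_lazy`) are my own sketch, with a counter to make the difference observable.

```rust
use std::cell::Cell;

thread_local! {
    // Counts how many times the "expensive" error constructor actually runs.
    static ERR_BUILDS: Cell<usize> = Cell::new(0);
}

// Stand-in for something costly, e.g. formatting a Python exception message.
fn expensive_error() -> String {
    ERR_BUILDS.with(|c| c.set(c.get() + 1));
    format!("key {:?} not found in dict", "some_key")
}

fn lookup_eager(map: &[(&str, i32)], key: &str) -> Result<i32, String> {
    // `ok_or(expensive_error())` evaluates its argument *before* we know
    // whether the lookup succeeded -- the error is built on every call.
    map.iter().find(|(k, _)| *k == key).map(|(_, v)| *v).ok_or(expensive_error())
}

fn lookup_lazy(map: &[(&str, i32)], key: &str) -> Result<i32, String> {
    // `ok_or_else` takes a closure, so the error is only built on a miss.
    map.iter().find(|(k, _)| *k == key).map(|(_, v)| *v).ok_or_else(expensive_error)
}

fn main() {
    let map = [("a", 1), ("b", 2)];
    for _ in 0..1_000 {
        let _ = lookup_eager(&map, "a"); // always hits, yet still pays
    }
    let eager = ERR_BUILDS.with(|c| c.replace(0));
    for _ in 0..1_000 {
        let _ = lookup_lazy(&map, "a");
    }
    let lazy = ERR_BUILDS.with(|c| c.get());
    println!("error built {eager} times eagerly, {lazy} times lazily");
    // eager path builds the error 1000 times, lazy path 0 times
}
```

When the error value allocates (a `String`, a `PyErr`) and the happy path dominates, that eager construction is pure overhead on every iteration, which is how a one-word change "nukes" performance.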
Silas@silasmarvin2·
@LucSGeorges Oh wild! Did you figure out why it added such a large overhead? Was it just in a tiny hot loop?
1 reply · 0 reposts · 0 likes · 34 views
Luc Georges@LucSGeorges·
fun 🦀 fact: you can nuke performance with a misplaced `ok_or`
[image]
2 replies · 1 repost · 11 likes · 1.1K views
Rémi Ouazan@remi_or_·
this is what it looks like when you query an llm api with 500 requests. each white pixel is an actual token, each black pixel is padding

the issue is not that you send too many requests. it's that they are decoding for too long
[image]
1 reply · 2 reposts · 7 likes · 506 views
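The black pixels in that picture come from static batching padding every sequence up to the longest one. A quick back-of-the-envelope sketch (my own illustration, not from the tweet) of how badly a few long decodes can dominate:

```rust
// With static batching, every sequence is padded to the longest one in the
// batch, so wasted work is (max_len * n - sum_of_real_lengths) / (max_len * n).
fn padding_fraction(lengths: &[usize]) -> f64 {
    let max = *lengths.iter().max().expect("non-empty batch");
    let real: usize = lengths.iter().sum();
    let total = max * lengths.len();
    (total - real) as f64 / total as f64
}

fn main() {
    // 499 short answers (20 tokens) and one long one (2000 tokens):
    // the grid is 500 columns wide but 2000 rows tall, almost all padding.
    let mut lengths = vec![20usize; 499];
    lengths.push(2000);
    println!("padding: {:.1}%", padding_fraction(&lengths) * 100.0); // ~98.8%
}
```

This is exactly why the tweet says the problem is sequences "decoding for too long", and why continuous batching (retiring finished sequences early) removes most of the black area.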
Rémi Ouazan@remi_or_·
So humbled to have had the chance to contribute to this amazing release 🚀 V5 of transformers is out, check it out NOW 🤗
2 replies · 0 reposts · 8 likes · 427 views
Luc Georges@LucSGeorges·
LESSSGOOOO kudos team, incredible work everyone 🔥🔥🔥
[image]
0 replies · 3 reposts · 55 likes · 29.8K views
sysls@systematicls·
@LucSGeorges Letting you know I am stealing this
1 reply · 0 reposts · 1 like · 532 views
Luc Georges@LucSGeorges·
when people ask me how HF makes money
[image]
35 replies · 44 reposts · 1.6K likes · 104.2K views