gabriel teston
@GabrielTeston

Solving language @Google Search

Joined December 2023
82 Following · 57 Followers

40 posts

Pinned Tweet
gabriel teston@GabrielTeston·
Training LLMs across multiple datacenters is hard. 🛑 Synchronization demands often cause massive slowdowns as we scale up. If you're at @NeurIPSConf, come see how we tackle this! Our work, "Scaling Laws for DiLoCo," shows how DiLoCo relaxes synchronization without compromising model quality, allowing training to scale incredibly well. Come chat with me and @NovaFallen8: 🗓️ Thu, Dec 4 ⏰ 11 AM – 2 PM PST 📍 Exhibit Hall C,D,E, #811 #NeurIPS2025 #LLMs #DistributedTraining #ScalingLaws
Zachary Charles@MatharyCharles

We just put out a key step for making distributed training work for larger and larger models: Scaling Laws for DiLoCo. TL;DR: We can do LLM training across datacenters in a way that scales incredibly well to larger and larger models!

0 replies · 3 reposts · 10 likes · 6.3K views
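For anyone who hasn't seen DiLoCo before, here is a minimal numpy sketch of the idea behind the poster: each replica trains independently for many inner steps, and only an averaged "outer" delta is synchronized across datacenters. This is an illustration only, not code from the paper; the toy model, the `local_grad` stand-in, and every hyperparameter (K, H, INNER_LR, OUTER_LR, MOMENTUM) are assumptions made up for the example.

```python
# Minimal DiLoCo-style training sketch (illustrative, not the paper's code).
# Assumptions: K replicas, H local steps per round, SGD inner optimizer,
# Nesterov-momentum outer optimizer, all simulated in one process with numpy.
import numpy as np

K, H, ROUNDS = 4, 50, 10                 # replicas, inner steps per round, outer rounds
INNER_LR, OUTER_LR, MOMENTUM = 0.1, 0.7, 0.9

rng = np.random.default_rng(0)
global_params = rng.normal(size=8)       # toy "model": a single weight vector
outer_velocity = np.zeros_like(global_params)

def local_grad(params, rng):
    # Stand-in for a minibatch gradient: pulls params toward zero, plus noise.
    return params + 0.1 * rng.normal(size=params.shape)

for r in range(ROUNDS):
    deltas = []
    for k in range(K):                   # each replica trains with no communication
        params = global_params.copy()
        for _ in range(H):               # H cheap inner steps, fully local
            params -= INNER_LR * local_grad(params, rng)
        deltas.append(global_params - params)   # replica k's "outer gradient"
    outer_grad = np.mean(deltas, axis=0)        # the ONLY synchronization point
    # Nesterov momentum on the outer gradient, applied to the global params.
    outer_velocity = MOMENTUM * outer_velocity + outer_grad
    global_params -= OUTER_LR * (outer_grad + MOMENTUM * outer_velocity)
    print(f"round {r}: |params| = {np.linalg.norm(global_params):.4f}")
```

Note that the only cross-datacenter traffic in this loop is the averaged delta, once per round of H steps, which is why relaxing synchronization this way tolerates slow inter-datacenter links.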
gabriel teston@GabrielTeston·
It is today! ⏰ 11 AM – 2 PM PST 📍 Exhibit Hall C,D,E, #811
gabriel teston@GabrielTeston

[quoted tweet: the pinned tweet above]

0 replies · 0 reposts · 0 likes · 46 views
gabriel teston@GabrielTeston·
Heading to @NeurIPSConf in San Diego. I’ve got some DiLoCo stickers to give away! 👾 ❤️ Come check out our poster. 🗓️ Thu, Dec 4 ⏰ 11 AM – 2 PM PST 📍 Exhibit Hall C,D,E, #811 #NeurIPS2025
[image]
0 replies · 1 repost · 5 likes · 603 views
gabriel teston reposted
Sam Lehman@SPLehman·
Attending @NeurIPSConf and interested in distributed, modular, and/or open AI? I hadn't seen anyone put together a list of poster presentations in this area, so I took it upon myself to thread out who I'm excited to talk to next week 🧵
5 replies · 5 reposts · 48 likes · 4.7K views
Yacine Mahdid@yacinelearning·
Hope it helps! Also, for poster sessions, try to find the ones doing RL on your favorite game and ask them very precise questions about the game (not their work) to test out their skillset.
3 replies · 0 reposts · 16 likes · 811 views
gabriel teston reposted
Dan Advantage@DanAdvantage·
ultimate flex thread; now's your chance to brag without being judged.

> 5 things you accomplished this week

i'll go first:
1) lost ~1 pound, no muscle loss
2) ran 35 miles
3) upped father game, paying extra attention to kids and engaging them
4) @yacinelearning interview
5) $$$$
9 replies · 0 reposts · 33 likes · 1.2K views
Arthur Douillard@Ar_Douillard·
I’ve been promoted to Staff RS. Vain title etc. but feels good to see appreciation for distributed learning in DeepMind ☺️
39 replies · 10 reposts · 439 likes · 40.3K views
Yacine Mahdid@yacinelearning·
what are your 5 year goals folks
[image]
42 replies · 7 reposts · 270 likes · 83.9K views
Arthur Douillard@Ar_Douillard·
Learned today that a startup is using Streaming DiLoCo to train a distributed AlphaFold-like model. Happy :)
2 replies · 0 reposts · 24 likes · 1.7K views
Immalittle.bit@riseofreh·
@VictorTaelin My role was to dig: to hunt for timestamps, snapshots, mirror videos, and the bizarre things that stay hidden from most people. You navigate along one critical curve... I along another...
1 reply · 0 reposts · 0 likes · 248 views
Taelin@VictorTaelin·
At this point everyone, even OpenAI, is annoyed at me posting Codex stuff non-stop. You get it. I know you do. But I'm genuinely having so much fun and progress building HVM4 that I have to contain myself not to post every small thing.

So many things that were previously just not viable due to time constraints are now within reach. For example, HVM1 needed a major refactor of the GC before we could fix the parallel bug. That would take days. I never had that time. HVM4 often needs major refactors to push things forward too, except, now, I can do that by writing a 30-minute prompt, rather than 2 days of coding.

In the last 2 days, I iterated through 5+ completely different approaches to task parallelism on HVM4's runtime. Most attempts went nowhere, but that means I lost minutes, not days, as it would have taken previously.

Writing concurrent code is HARD. So many subtle errors. Should I put a fence here? Should this CAS use seq_rel? Should this field exist? Add a subtle bug and there goes an evening debugging... Now, I can just draft a half-baked prompt that explains what is on my mind in a way that at least makes some sense, and the AI will fill the holes and implement my idea as I browse X. 99% of the time, it just works.

Sure, it is most likely just recalling details it learned from human literature. "Okay, to write an MPMC, I need this atomic here, then this fence here." But applying it to my specific setup in 10 minutes is nothing short of miraculous.

I'm just thinking about the last 2 days and there is absolutely no way I'd have been able to try so many different things in so little time. Now HVM4 has a parallel runtime that works, scales well with cores, all tests pass, near 0 error rate. 2 days ago I had none of that. HVM3 spent a whole year without one!
[image]
33 replies · 19 reposts · 705 likes · 63.8K views
elie@eliebakouch·
Open source release of smol merch coming soon at colm 🤏
[image]
4 replies · 4 reposts · 55 likes · 4.3K views
gabriel teston@GabrielTeston·
Want to learn how to train models across the world, with 400x fewer bits exchanged and a huge latency tolerance? 🌎 I'll be presenting our work on how to efficiently scale distributed training at @COLM_conf. 🗓️ TODAY: Tuesday, 11:00 - 13:00 📍 Room 710 #COLM2025
0 replies · 2 reposts · 10 likes · 4.1K views
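Since "400x fewer bits" naturally raises the question of where the savings come from, here is a back-of-the-envelope sketch of the kinds of tricks that stack up in streamed, low-precision outer synchronization: sync only every H steps, stream one fragment of the model per sync, and quantize the outer delta to a few bits. This is my illustration, not the paper's method or numbers; H, FRAGMENTS, BITS, and the uniform quantizer are assumptions chosen for the example.

```python
# Illustrative bandwidth arithmetic for relaxed-synchronization training
# (assumed numbers, not the paper's). Also shows a simple uniform quantizer
# for an "outer delta" to make the low-bit sync concrete.
import numpy as np

H, FRAGMENTS, BITS = 50, 2, 4            # steps per sync, model split, quant width

def quantize(delta, bits=BITS):
    # Uniform symmetric quantization of a float32 delta to `bits` bits.
    scale = np.abs(delta).max() / (2 ** (bits - 1) - 1)
    q = np.round(delta / scale).astype(np.int8)   # values fit in [-7, 7] for 4 bits
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

delta = np.random.default_rng(0).normal(size=1000).astype(np.float32)
q, scale = quantize(delta)
restored = dequantize(q, scale)
print("max quantization error:", np.abs(delta - restored).max())

# Naive data-parallel sync: 32 bits per parameter, every step.
# Streamed + quantized outer sync: BITS bits per parameter, for 1/FRAGMENTS
# of the model, once every H steps.
reduction = (32 * H * FRAGMENTS) / BITS
print(f"bits exchanged reduced by ~{reduction:.0f}x")
```

With numbers in this ballpark, multiplying the three savings (infrequent syncs, partial streaming, low-bit deltas) easily reaches the hundreds.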