
1/10 We built ADANA, an optimizer that gets better as you scale. It extends AdamW with log-time schedules for momentum and weight decay: same hyperparameter count, no extra engineering. Across models from 45M to 2.6B parameters, it saves ~40% compute vs tuned AdamW, and the gap widens with scale. 🧵
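
To make "log-time schedules" concrete, here is a minimal PyTorch sketch of one way such schedules could be layered on top of AdamW. The thread does not spell out the ADANA update rule, so the functional forms below (log_time_beta1, log_time_weight_decay) and the horizon parameter are illustrative assumptions, not the published algorithm; the point is only that the same AdamW knobs are re-set each step as functions of log(step).

```python
# Hypothetical sketch: "log-time" schedules layered on AdamW.
# The exact ADANA rule is not given in the thread; the schedule forms
# below are illustrative assumptions, not the published algorithm.
import math
import torch

model = torch.nn.Linear(32, 32)  # toy model for illustration
opt = torch.optim.AdamW(model.parameters(), lr=3e-4,
                        betas=(0.9, 0.999), weight_decay=0.1)

def log_time_beta1(step, base=0.9, horizon=10_000):
    # Assumed form: momentum averaging window grows on a log-time scale,
    # so beta1 drifts toward 1 as training proceeds.
    t = max(step, 1)
    return 1.0 - (1.0 - base) / (1.0 + math.log(t) / math.log(horizon))

def log_time_weight_decay(step, base=0.1, horizon=10_000):
    # Assumed form: weight decay shrinks on a log-time scale.
    t = max(step, 1)
    return base / (1.0 + math.log(t) / math.log(horizon))

for step in range(1, 101):
    x = torch.randn(8, 32)
    loss = model(x).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    # Same hyperparameters AdamW already exposes, re-set each step
    # from the schedules -- no new knobs are introduced.
    for group in opt.param_groups:
        group["betas"] = (log_time_beta1(step), group["betas"][1])
        group["weight_decay"] = log_time_weight_decay(step)
    opt.step()
```

Read as a sketch under those assumptions: the schedules only touch existing AdamW hyperparameters per step, which is consistent with the "same hyperparameter count, no extra engineering" claim above.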



