Mathieu Dagréou

378 posts

@Mat_Dag

Ph.D. student at @Inria_Saclay working on Optimization and Machine Learning · @matdag.bsky.social

Paris, France · Joined July 2019
547 Following · 505 Followers
Mathieu Dagréou retweeted
Konstantin Mishchenko @konstmish
Nesterov dropped a new paper last week on which functions can be optimized with gradient descent. The idea is simple: we know GD can optimize both nonsmooth (bounded gradients) and smooth (Lipschitz gradients) functions, but the sum of a smooth and a nonsmooth function satisfies neither property, so what can we do?
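For reference, a minimal NumPy sketch of the two classical regimes the tweet contrasts; this is textbook (sub)gradient descent, not Nesterov's new method:

```python
import numpy as np

# Two classical regimes of (sub)gradient descent mentioned in the tweet.
# Smooth: f(x) = 0.5 * x^2 has a Lipschitz gradient (constant <= L), so a
# fixed step 1/L converges geometrically. Nonsmooth: f(x) = |x| has bounded
# subgradients, so the classical choice is a decreasing step ~ 1/sqrt(t).

L = 10.0                      # conservative smoothness upper bound for the step
x_s, x_ns = 5.0, 5.0
for t in range(1, 1001):
    x_s -= (1.0 / L) * x_s                        # gradient of 0.5 * x^2
    x_ns -= (1.0 / np.sqrt(t)) * np.sign(x_ns)    # subgradient of |x|

print(f"smooth:    {abs(x_s):.3e}")   # geometric decay
print(f"nonsmooth: {abs(x_ns):.3e}")  # oscillates at the step-size scale
```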
Mathieu Dagréou retweeted
Rudy Morel @rdMorel
For evolving unknown PDEs, ML models are trained on next-state prediction. But do they actually learn the time dynamics: the "physics"? Check out our poster (W-107) at #ICML2025 this Wed, Jul 16. Our "DISCO" model learns the physics while staying SOTA on next states prediction!
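For context, a toy sketch of the next-state-prediction setup the tweet questions: fit a one-step map from trajectory data, then judge it by autoregressive rollout. The linear least-squares fit here is only a stand-in for a learned model like DISCO:

```python
import numpy as np

# Fit a one-step map x_{t+1} ~ A x_t from a trajectory, then evaluate it
# by rolling it out autoregressively from the initial state. A model can
# be good at one-step prediction yet drift over long rollouts if it has
# not captured the underlying dynamics.

rng = np.random.default_rng(0)
A_true = np.array([[0.9, -0.2], [0.2, 0.9]])   # the unknown "dynamics"
X = [rng.normal(size=2)]
for _ in range(200):
    X.append(A_true @ X[-1] + 0.01 * rng.normal(size=2))
X = np.array(X)

# Next-state training objective: minimize ||X[1:] - X[:-1] @ W||^2.
W, *_ = np.linalg.lstsq(X[:-1], X[1:], rcond=None)

# Rollout test: does the one-step model track the full time evolution?
x = X[0]
for _ in range(200):
    x = W.T @ x
print("rollout error at t=200:", np.linalg.norm(x - X[-1]))
```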
Mathieu Dagréou retweeted
Mathieu Blondel @mblondel_ml
Back from MLSS Senegal 🇸🇳, where I had the honor of giving lectures on differentiable programming. Really grateful for all the amazing people I got to meet 🙏 My slides are here github.com/diffprog/slide…
Mathieu Dagréou retweeted
Waïss Azizian @wazizian
❓ How long does SGD take to reach the global minimum on non-convex functions? With @FranckIutzeler, J. Malick, P. Mertikopoulos, we tackle this fundamental question in our new ICML 2025 paper: "The Global Convergence Time of Stochastic Gradient Descent in Non-Convex Landscapes"
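A toy 1-D experiment in the spirit of the question (illustration only, not the paper's setting; the gradient-noise scale is exaggerated here so escapes from the spurious basin happen quickly):

```python
import numpy as np

# Start SGD near a *local* minimum of a non-convex f and record the first
# hitting time of a neighborhood of the global minimum.
# f(x) = (x^2 - 1)^2 + 0.3 x has a local min near x ~ 0.96 and a lower,
# global min near x ~ -1.04.

def grad(x):
    return 4 * x * (x**2 - 1) + 0.3

rng = np.random.default_rng(0)
x_star = -1.0366                  # approximate global minimizer
hits = []
for _ in range(100):
    x, t = 1.0, 0                 # initialize in the spurious basin
    while abs(x - x_star) > 0.1 and t < 10**6:
        t += 1
        x -= 0.05 * (grad(x) + 3 * rng.normal())   # noisy gradient step
    hits.append(t)
print("median hitting time (steps):", int(np.median(hits)))
```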
Mathieu Dagréou retweeted
Konstantin Mishchenko @konstmish
I want to address one very common misconception about optimization. I often hear that (approximately) preconditioning with the Hessian diagonal is always a good thing. It's not. In fact, finding a good preconditioner is an open problem, which I think deserves more attention. 1/4
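A minimal counterexample in NumPy: a quadratic whose Hessian diagonal is constant, so diagonal preconditioning changes nothing while the off-diagonal coupling keeps the problem badly conditioned:

```python
import numpy as np

# f(x) = 0.5 * x^T A x. The Hessian diagonal is (1, 1), so the diagonal
# preconditioner is a no-op, yet the eigenvalues of A are 1.99 and 0.01:
# condition number 199, entirely from the off-diagonal structure.

A = np.array([[1.0, 0.99], [0.99, 1.0]])
d = np.diag(A)                                 # diagonal preconditioner = (1, 1)
x = np.array([1.0, -1.0])                      # the 0.01-eigenvector direction

step = 1.0 / 1.99                              # 1 / lambda_max, the safe step
for _ in range(100):
    x -= step * (A @ x) / d                    # "diagonally preconditioned" GD

print("after 100 steps:", np.linalg.norm(x))   # barely moved: rate 1 - 0.01/1.99
print("Newton step residual:", np.linalg.norm(x - np.linalg.solve(A, A @ x)))  # 0
```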
Mathieu Dagréou retweeted
Matthieu Terris @MatthieuTerris
🧵 I'll be at CVPR next week presenting our FiRe work 🔥 TL;DR: We go beyond denoising models in PnP with more general restoration (e.g. deblurring) models! The starting observation is that images are not fixed points of restoration models:
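That starting observation is easy to reproduce with a crude stand-in for a restoration operator (a Gaussian-blur "denoiser" here; the actual FiRe operators are learned restoration networks):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

# A clean image x is generally *not* a fixed point of a restoration
# operator R, i.e. R(x) != x. R below is a toy stand-in.

rng = np.random.default_rng(0)
x = rng.random((64, 64))                     # stand-in for a clean image

def R(img, sigma=1.0):                       # toy "restoration" operator
    return gaussian_filter(img, sigma)

residual = np.linalg.norm(R(x) - x) / np.linalg.norm(x)
print(f"||R(x) - x|| / ||x|| = {residual:.3f}")   # clearly nonzero
```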
Mathieu Dagréou retweeted
Samuel Vaiter @vaiter
📣 New preprint 📣 **Differentiable Generalized Sliced Wasserstein Plans** w/ L. Chapel @rtavenar We propose a Generalized Sliced Wasserstein method that provides an approximated transport plan and which admits a differentiable approximation. arxiv.org/abs/2505.22049 1/5
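For background, the vanilla sliced Wasserstein distance, computed by projecting onto random directions and solving each 1-D OT problem by sorting; the preprint goes further, with generalized slices and a differentiable approximation of the transport plan itself:

```python
import numpy as np

# Sliced W2 between two point clouds of equal size: average the 1-D
# squared Wasserstein distances over random projection directions.
# In 1-D, optimal transport reduces to sorting both samples.

def sliced_w2(x, y, n_proj=200, seed=0):
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_proj):
        theta = rng.normal(size=x.shape[1])
        theta /= np.linalg.norm(theta)
        px, py = np.sort(x @ theta), np.sort(y @ theta)  # 1-D OT = sorting
        total += np.mean((px - py) ** 2)
    return np.sqrt(total / n_proj)

rng = np.random.default_rng(1)
x = rng.normal(size=(500, 2))
y = rng.normal(size=(500, 2)) + np.array([3.0, 0.0])
print(f"SW2 ~ {sliced_w2(x, y):.2f}")   # positive, grows with the shift
```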
Mathieu Dagréou retweeted
Mathurin Massias @mathusmassias
It was received quite enthusiastically here, so time to share it again! Our #ICLR2025 blog post on Flow Matching was published yesterday: iclr-blogposts.github.io/2025/blog/cond… My PhD student Anne Gagneux will present it tomorrow at ICLR, 👉 poster session 4, 3 pm, #549 in Hall 3/2B 👈
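The core regression at the heart of conditional flow matching, as a NumPy sketch (the `model` here is a placeholder for a real velocity network):

```python
import numpy as np

# Conditional flow matching with the linear interpolation path:
# sample t ~ U[0,1], noise x0, data x1, form x_t = (1-t) x0 + t x1,
# and regress the model onto the straight-line velocity x1 - x0.

rng = np.random.default_rng(0)

def model(x_t, t):                 # placeholder: a real v_theta is a neural net
    return np.zeros_like(x_t)

x1 = rng.normal(size=(128, 2)) + 5.0          # a batch of "data"
x0 = rng.normal(size=(128, 2))                # noise samples
t = rng.random((128, 1))                      # t ~ Uniform[0, 1]
x_t = (1 - t) * x0 + t * x1                   # interpolation path
target = x1 - x0                              # conditional velocity
loss = np.mean(np.sum((model(x_t, t) - target) ** 2, axis=1))
print(f"CFM loss for the zero model: {loss:.2f}")
```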
Mathieu Dagréou retweeted
Gabriel Peyré @gabrielpeyre
Optimization algorithms come with many flavors depending on the structure of the problem. Smooth vs non-smooth, convex vs non-convex, stochastic vs deterministic, etc. en.wikipedia.org/wiki/Mathemati…
Mathieu Dagréou retweeted
Alex Hägele @haeggee
A really fun project to work on. Looking at these plots side-by-side still amazes me! How well can **convex optimization theory** match actual LLM runs? My favorite points of our paper on the agreement for LR schedules in theory and practice: 1/n
Fabian Schaipp @FSchaipp

Learning rate schedules seem mysterious? Turns out that their behaviour can be described with a bound from *convex, nonsmooth* optimization. Short thread on our latest paper 🚇 arxiv.org/abs/2501.18965

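The kind of guarantee in play, in its simplest textbook form: for a convex G-Lipschitz function, subgradient descent with steps η_t satisfies f(x̄) − f* ≤ (D² + G² Σ η_t²) / (2 Σ η_t). The papers use refined, last-iterate versions of such bounds; this sketch just evaluates the classical one for two schedules (G = D = 1 assumed):

```python
import numpy as np

# Evaluate the classical convex/nonsmooth subgradient bound for a given
# learning-rate schedule. The refined last-iterate versions of this bound
# are what track LLM loss curves so closely in the papers above.

def bound(etas, G=1.0, D=1.0):
    etas = np.asarray(etas)
    return (D**2 + G**2 * np.sum(etas**2)) / (2 * np.sum(etas))

T = 10_000
base = 0.01
constant = np.full(T, base)
linear_decay = base * (1 - np.arange(T) / T)   # linear decay to zero

print(f"constant:     {bound(constant):.4f}")
print(f"linear decay: {bound(linear_decay):.4f}")
```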
Mathieu Dagréou retweeted
Konstantin Mishchenko @konstmish
Learning rate schedulers used to be a big mystery. Now you can just take a guarantee for *convex non-smooth* problems (from arxiv.org/abs/2310.07831), and it gives you *precisely* what you see in training large models. See this empirical study: arxiv.org/abs/2501.18965 1/3
Mathieu Dagréou retweeted
Théo Uscidda @theo_uscidda
Our work on geometric disentangled representation learning has been accepted to ICLR 2025! 🎊 See you in Singapore if you want to understand this GIF better :)
Théo Uscidda @theo_uscidda

Curious about the potential of optimal transport (OT) in representation learning? Join @CuturiMarco's talk at the UniReps workshop today at 2:30 PM! Marco will notably discuss our latest paper on using OT to learn disentangled representations. Details below ⬇️

Mathieu Dagréou retweeted
Gabriel Peyré @gabrielpeyre
The Mathematics of Artificial Intelligence: In this introductory and highly subjective survey, aimed at a general mathematical audience, I showcase some key theoretical concepts underlying recent advancements in machine learning. arxiv.org/abs/2501.10465
Mathieu Dagréou retweeted
Samuel Vaiter @vaiter
When optimization problems have multiple minima, algorithms favor specific solutions due to their implicit bias. For ordinary least squares (OLS), gradient descent inherently converges to the minimal norm solution among all possible solutions. fa.bianp.net/blog/2022/impl…
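The claim is easy to verify numerically: run gradient descent from the origin on an underdetermined least-squares problem and compare against the pseudoinverse (minimum-norm) solution:

```python
import numpy as np

# On an underdetermined OLS problem, GD initialized at zero stays in the
# row space of A and converges to the minimum-norm interpolating
# solution, i.e. the pseudoinverse solution pinv(A) @ b.

rng = np.random.default_rng(0)
A = rng.normal(size=(10, 50))      # 10 equations, 50 unknowns
b = rng.normal(size=10)

x = np.zeros(50)                   # initialization at the origin matters
step = 1.0 / np.linalg.norm(A, 2) ** 2
for _ in range(20_000):
    x -= step * A.T @ (A @ x - b)  # GD on 0.5 * ||Ax - b||^2

x_minnorm = np.linalg.pinv(A) @ b
print("distance to min-norm solution:", np.linalg.norm(x - x_minnorm))
```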
Mathieu Dagréou retweeted
Pierre Ablin @PierreAblin
🍏🍏🍏 Come work with us at Apple Machine Learning Research! 🍏🍏🍏 Our team focuses on curiosity-based, open research. We work on several topics, including LLMs, optimization, optimal transport, uncertainty quantification, and generative modeling. Infos 👇