Physics of Complex Systems Lab (@pcsl_epfl) - Twitter Profili

Physics of Complex Systems Lab retweetledi

Matthieu wyart@MatthieuWyart·17 Şub

What governs the geometry of time and space embeddings in LLMs? We show it follows from translation symmetry in language statistics. With Dhruva Karkada, @DanKorchinski, Andres Nava, @yasamanbb arxiv.org/abs/2602.15029

English

7

60

352

25.8K

Physics of Complex Systems Lab retweetledi

Matthieu wyart@MatthieuWyart·16 Şub

"Physics" approach to LLMs studied how synthetic languages are parsed after training, but the mechanism of learning how to parse was not known. Which correlations in data are used, and how many data are needed for that? This is answered here for a class of context-free languages.

Francesco Cagnetta@Fraccagnetta

❓ How do LLMs learn hierarchical structure from sentences alone? 🚨 We build PCFG-like synthetic datasets with two knobs---hierarchy + ambiguity---and derive a correlation-based learning mechanism that predicts the sample complexity of deep nets. Results 👇

English

1

10

55

5.6K

Physics of Complex Systems Lab retweetledi

Matthieu wyart@MatthieuWyart·16 Şub

This paper asks: What controls the scaling laws of LLMs? Two key ideas: (i) as the training set size increases, correlations are detected on a longer context scale and (ii) on this scale, LLMs function optimally: the loss is ~ the next-token conditional entropy.

Surya Ganguli@SuryaGanguli

Our new paper "Deriving neural scaling laws from the statistics of natural language" arxiv.org/abs/2602.07488 lead by @Fraccagnetta & @AllanRaventos w/ Matthieu Wyart makes a breakthrough! We can predict data-limited neural scaling law exponents from first principles using the structure of natural language itself for the very first time! If you give us two properties of your natural language dataset: 1) How conditional entropy of the next token decays with conditioning length. 2) How pairwise token correlations decay with time separation. Then we can give you the exponent of the neural scaling law (loss versus data amount) through a simple formula! The key idea is that as you increase the amount of training data, models can look further back in the past to predict, and as long as they do this well, the conditional entropy of the next token, conditioned on all tokens up to this data-dependent prediction time horizon, completely governs the loss! This gets us our simple formula for the neural scaling law!

English

1

7

39

7.7K

Physics of Complex Systems Lab retweetledi

Surya Ganguli@SuryaGanguli·10 Şub

Our new paper "Deriving neural scaling laws from the statistics of natural language" arxiv.org/abs/2602.07488 lead by @Fraccagnetta & @AllanRaventos w/ Matthieu Wyart makes a breakthrough! We can predict data-limited neural scaling law exponents from first principles using the structure of natural language itself for the very first time! If you give us two properties of your natural language dataset: 1) How conditional entropy of the next token decays with conditioning length. 2) How pairwise token correlations decay with time separation. Then we can give you the exponent of the neural scaling law (loss versus data amount) through a simple formula! The key idea is that as you increase the amount of training data, models can look further back in the past to predict, and as long as they do this well, the conditional entropy of the next token, conditioned on all tokens up to this data-dependent prediction time horizon, completely governs the loss! This gets us our simple formula for the neural scaling law!

English

20

117

576

60.2K

Physics of Complex Systems Lab@pcsl_epfl·10 Şub

What sets the neural scaling exponents for natural language? Joint with @Fraccagnetta, @AllanRaventos, and @SuryaGanguli.

Francesco Cagnetta@Fraccagnetta

🚨 We derive data-limited neural scaling exponents directly from measurable corpus statistics. No synthetic data models, only two ingredients: -decay of token-token correlations with separation; -decay of next-token conditional entropy with context length.

English

1

21

917

Physics of Complex Systems Lab@pcsl_epfl·10 Şub

Check out our latest work on how neural language models learn to parse PCFGs from local stats!

Francesco Cagnetta@Fraccagnetta

❓ How do LLMs learn hierarchical structure from sentences alone? 🚨 We build PCFG-like synthetic datasets with two knobs---hierarchy + ambiguity---and derive a correlation-based learning mechanism that predicts the sample complexity of deep nets. Results 👇

English

0

1

6

549

Physics of Complex Systems Lab retweetledi

Daniel Korchinski@DanKorchinski·5 Ara

I’m excited to present my work at #NeurIPS with Dhruva Karkada, @yasamanbb and Matthieu Wyart tomorrow. If you want to understand why analogical reasoning emerges geometrically in simple language models, come check out our poster (# 3209) Friday afternoon at 16:30!

English

2

3

26

3.7K

Physics of Complex Systems Lab retweetledi

Surya Ganguli@SuryaGanguli·15 Kas

We have 14 survey lectures for our @SimonsFdn Collaboration on the Physics of Learning and Neural Computation! All videos available at: physicsoflearning.org/webinar-series Here is the list: @zdeborova: Attention-based models and how to solve them using tools from quadratic networks and matrix denoising @KempeLab: Recent lessons from LLM reasoning @MBarkeshli: Sharpness dynamics in neural network training @KrzakalaF: How Do Neural Networks Learn Simple Functions with Gradient Descent? Michael Douglas: Mathematics, Economics and AI Yuhai Tu: Towards a Physics-based Theoretical Foundation for Deep Learning: Stochastic Learning Dynamics and Generalization @SuryaGanguli: An analytic theory of creativity for convolutional diffusion models Eva Silverstein: Hamiltonian dynamics for stabilizing neural simulation-based inference @adnarim066: Generation with Unified Diffusion Bernd Rosenow: Random matrix analysis of neural networks: distinguishing noise from learned information @jhhalverson Nerual networks and conformal field theory @KempeLab Synthetic data: friend or foe in the age of scaling @WyartMatthieu Learning hierarchical representations with deep architectures @CPehlevan Mean-field theory of deep network learning dynamics and applications to neural scaling laws

English

2

57

250

22.1K

Physics of Complex Systems Lab retweetledi

Alessandro Favero@alesfav·8 Eki

🎓My PhD thesis is now on arXiv! It follows a thread of compositionality in AI: from locality in CNNs, to the 'grammar' diffusion models learn to be creative, to task composition in foundation models. Officially a Dr 🎉 and starting as a Physics-AI Fellow at DAMTP @Cambridge_Uni

Stat.ML Papers@StatMLPapers

The Physics of Data and Tasks: Theories of Locality and Compositionality in Deep Learning ift.tt/9B0HFnC

English

3

1

22

1.2K

Physics of Complex Systems Lab@pcsl_epfl·18 Ağu

Excited to be part of the new @SimonFdn Collaboration on the Physics of Learning and Neural Computation!

Simons Foundation@SimonsFdn

Our new Simons Collaboration on the Physics of Learning and Neural Computation will employ and develop powerful tools from #physics, #math, computer science and theoretical #neuroscience to understand how large neural networks learn, compute, scale, reason and imagine: simonsfoundation.org/2025/08/18/sim…

English

0

8

451

Physics of Complex Systems Lab retweetledi

Francesco Cagnetta@Fraccagnetta·14 Tem

And so it begins @icmlconf ... Tuesday 11 am (east hall) -> We dive into the training process of diffusion models to understand how generalization and combinatorial creativity emerge, with @alesfav. Wednesday 11 am (east hall) -> What Really Drives Neural Scaling Laws? Hierarchical Structure vs. Power-Law Distributions. If you’re interested in hierarchical structure in data, physics of DL, or just want to chat, come say hi!

English

0

3

19

1.2K

Physics of Complex Systems Lab retweetledi

Francesco Cagnetta@Fraccagnetta·27 May

Neural scaling laws are powerful and predictive, but what sets the exponent? Previous work links it to power-law data statistics, echoing classical results of kernel theory. arxiv.org/abs/2505.07067 shows that hierarchical structure matters more. Accepted @icmlconf 2025 🎉

Katie Everett@_katieeverett

There were so many great replies to this thread, let's do a Part 2! For scaling laws between loss and compute, where loss = a * flops ^ b + c, which factors change primarily the constant (a) and which factors can actually change the exponent (b)? x.com/_katieeverett/…

English

1

4

22

1.7K

Physics of Complex Systems Lab retweetledi

Francesco Cagnetta@Fraccagnetta·18 May

@jxmnop What about this arxiv.org/abs/2406.00048? We use a model of hierarchical data to make analytical predictions about the sample size required to reach the n-gram performance as a function of n, then test the predictions of the theory in experiments with standard DL architectures

English

0

3

12

561

Physics of Complex Systems Lab retweetledi

Alessandro Favero@alesfav·5 May

Amazed by language diffusion models like Mercury? So are we 🤯 But how do they actually learn to generate coherent (and creative!) text from scratch? We dig into it using simple formal grammars and statistical physics. Just accepted @icmlconf 2025 🎉

English

1

4

31

2.3K

Physics of Complex Systems Lab retweetledi

Alessandro Favero@alesfav·9 Ara

I’ll be @NeurIPSConf Tuesday-Sunday! Happy to chat about the physics/science of DL, hierarchies and compositionality in images and language, diffusion models, task arithmetic, and model merging. Feel free to reach out! Also, I’m looking for job opportunities starting next fall!

English

0

1

10

405

Physics of Complex Systems Lab retweetledi

Francesco Cagnetta@Fraccagnetta·10 Ara

At @NeurIPSConf until Sunday. Come to my poster Wed, 4:30 PM (neurips.cc/virtual/2024/p…). Also reach out if you want to chat about hierarchical data structures and theory of deep learning!

English

1

6

43

3K

Physics of Complex Systems Lab retweetledi

Antonio Sclocchi@AntSclocchi·10 Ara

I’m @NeurIPSConf until Sunday! Join my talk (Sun, 1:20 PM, at @scifordl) on our work on data structure and diffusion models (neurips.cc/virtual/2024/1…), with @alesfav, N. Levi, @pcsl_epfl. Happy to chat/connect with anyone interested in the science of deep learning! 🤖

English

0

5

10

612

Physics of Complex Systems Lab retweetledi

Stat.ML Papers@StatMLPapers·18 Eki

Probing the Latent Hierarchical Structure of Data via Diffusion Models ift.tt/e5I0Kr7

English

0

8

91

6.4K

Physics of Complex Systems Lab retweetledi

Francesco Cagnetta@Fraccagnetta·2 Eki

Couldn't agree more, and neither could the reviewers---I am glad to announce that my latest work on modeling the hierarchical and compositional structure of text data has been accepted at #NeurIPS2024! Check it at arxiv.org/pdf/2406.00048.

Simons Institute for the Theory of Computing@SimonsInstitute

"The structure of data is the dark matter of theory in deep learning" — @SuryaGanguli during his talk on "Perspectives from Physics, Neuroscience, and Theory" at the Simons Institute's Special Year on Large Language Models and Transformers, Part 1 Boot Camp.

English

2

25

2.2K

Physics of Complex Systems Lab retweetledi

Francesco Cagnetta@Fraccagnetta·2 Tem

Just published in @PhysRevX! Enjoy open access at journals.aps.org/prx/abstract/1… and stay tuned for the follow-up on language modelling!⏲️

Francesco Cagnetta@Fraccagnetta

1/3 Check this ---> arxiv.org/abs/2307.02129. After years of dabbling in machine learning theory, we (finally) go back to our physics roots and introduce an idealised model of data that sheds light on a pressing question of the field: how do deep neural networks work?

English

0

3

28

2.6K

Physics of Complex Systems Lab

Keşfet