Andrew Saxe

737 posts

Andrew Saxe

@SaxeLab

Prof at @GatsbyUCL and @SWC_Neuro, trying to figure out how we learn. Bluesky: @SaxeLab Mastodon: @[email protected]

London, UK · Joined November 2019
378 Following · 5.9K Followers
Pinned Tweet
Andrew Saxe @SaxeLab
Why don’t neural networks learn all at once, but instead progress from simple to complex solutions? And what does “simple” even mean across different neural network architectures? Sharing our new paper @iclr_conf led by Yedi Zhang with Peter Latham arxiv.org/abs/2512.20607
10 replies · 58 retweets · 440 likes · 29.3K views
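For intuition on "simple before complex": in the exact solutions for two-layer linear networks (Saxe, McClelland & Ganguli, 2014), each input–output mode with singular value s is learned along a sigmoidal trajectory; the new paper asks how this picture extends across architectures. A sketch of that classic result, assuming gradient flow from a small balanced initialization u_0:

    u_s(t) = \frac{s\, e^{2st/\tau}}{e^{2st/\tau} - 1 + s/u_0}

Here \tau is inversely proportional to the learning rate. The time for a mode to rise scales as (\tau/2s)\ln(s/u_0), so strong (simple) modes are learned before weak (complex) ones.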
Andrew Saxe retweeted
Jamie Simon @learning_mech
1/ Deep learning is going to have a scientific theory. We can see the pieces starting to come together, and it's looking a lot like physics! We're releasing a paper pulling together these emerging threads and giving them a name: learning mechanics. 🔨 arxiv.org/pdf/2604.21691 🔧
53 replies · 293 retweets · 1.5K likes · 299.9K views
Andrew Saxe retweeted
Stefano Sarao Mannelli @stefsmlab
Two Analytical Connectionism-related updates: 1. ⏰ 1 week left to apply! Interested in language + AI & cognition? Don’t miss it: analytical-connectionism.net/school/2026/ 2. 📜 Lecture notes from the first two editions are finally out: proceedings.mlr.press/v320/
Quoting Stefano Sarao Mannelli @stefsmlab:
📢 We’re now accepting applications for the 2026 School on Analytical Connectionism, dedicated this year to Language Acquisition. 📍 Gothenburg, Sweden 🗓️ August 17–28, 2026 ☠️ Apply by April 17! 🔗 analytical-connectionism.net/school/2026/ 👇 Meet the experts joining us this summer!
0 replies · 5 retweets · 14 likes · 2.5K views
Andrew Saxe @SaxeLab
Postdoc opening! Come work with us on deep learning theory relevant to AI safety. Deadline: 7 Apr 2026. Details and application: ucl.ac.uk/work-at-ucl/se…
0 replies · 36 retweets · 130 likes · 11.6K views
Andrew Saxe @SaxeLab
Very excited by this year's Analytical Connectionism Summer School! A dream lineup of speakers on the topic of language acquisition in minds and machines. Bursaries available to cover costs. Aug 17 – Aug 28, 2026, Gothenburg. Details: analytical-connectionism.net//school/2026/
0 replies · 6 retweets · 34 likes · 3.2K views
Andrew Saxe retweeted
Francis Bach @BachFrancis
Looking for alternatives to quadratic functions for closed-form analysis in optimization? This post explores matrix Riccati dynamics and their applications to neural networks. francisbach.com/closed-form-dy…
0 replies · 23 retweets · 159 likes · 9.3K views
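A hands-on way to see what such closed-form dynamics look like, as a generic sketch rather than the construction in the post: integrate a matrix Riccati ODE dX/dt = C + AX + XAᵀ − XBX (the A, B, C below are arbitrary illustrative choices) and check that it settles at a root of the corresponding algebraic Riccati equation.

    import numpy as np
    from scipy.integrate import solve_ivp

    # Generic matrix Riccati dynamics: dX/dt = C + A X + X A^T - X B X.
    # A, B, C are arbitrary illustrative choices, not taken from the blog post.
    n = 3
    rng = np.random.default_rng(0)
    A = -np.eye(n)
    B = np.eye(n)
    G = rng.standard_normal((n, n))
    C = G @ G.T  # symmetric PSD source term

    def riccati_rhs(t, x_flat):
        X = x_flat.reshape(n, n)
        dX = C + A @ X + X @ A.T - X @ B @ X
        return dX.ravel()

    X0 = 0.01 * np.eye(n)  # small symmetric initialization
    sol = solve_ivp(riccati_rhs, (0.0, 20.0), X0.ravel(), rtol=1e-8)
    X_inf = sol.y[:, -1].reshape(n, n)

    # At a fixed point, the algebraic Riccati equation C + A X + X A^T - X B X = 0 holds.
    residual = C + A @ X_inf + X_inf @ A.T - X_inf @ B @ X_inf
    print("fixed-point residual norm:", np.linalg.norm(residual))

In the scalar case these dynamics reduce to ẋ = c − 2x − x², whose solution is a shifted tanh curve, which is the kind of closed form that makes Riccati dynamics analytically tractable.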
Andrew Saxe retweeted
Andrew Lampinen @AndrewLampinen
What is the relationship between memorization and generalization in AI? Is there a fundamental tradeoff? In a new blog post I’ve reviewed some of the evolving perspectives on memorization & generalization in machine learning, from classic perspectives through LLMs. Link below:
12 replies · 45 retweets · 430 likes · 23.7K views
Andrew Saxe retweeted
Stefano Sarao Mannelli @stefsmlab
📢 We’re now accepting applications for the 2026 School on Analytical Connectionism, dedicated this year to Language Acquisition. 📍 Gothenburg, Sweden 🗓️ August 17–28, 2026 ☠️ Apply by April 17! 🔗 analytical-connectionism.net/school/2026/ 👇 Meet the experts joining us this summer!
1 reply · 15 retweets · 42 likes · 9.3K views
Andrew Saxe @SaxeLab
We’re hiring postdocs/research scientists! Your interests can be anywhere on the spectrum from pure theory to empirically testing predictions relevant to AI safety. Our theoretical work relies on dynamical systems and tools from statistical physics.
2 replies · 2 retweets · 48 likes · 2.9K views
Andrew Saxe @SaxeLab
Excited to launch Principia, a nonprofit research organisation at the intersection of deep learning theory and AI safety. Our goal is to develop theory for modern machine learning that can help us understand network behaviors, including those critical for AI safety.
9 replies · 36 retweets · 303 likes · 18.9K views
Andrew Saxe @SaxeLab
Equipped with this theory, we make new predictions about how network width, data distribution, and initialization affect learning dynamics. For example, increasing the number of attention heads in linear attention shortens the plateaus in learning.
2 replies · 0 retweets · 9 likes · 822 views
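For readers outside the area: linear attention drops the softmax, so each head computes (QKᵀ)V (equivalently Q(KᵀV)), and multiple heads are independent copies whose outputs are concatenated. A minimal numpy sketch under those assumptions; the shapes, the identity feature map, and the lack of normalization are illustrative choices, not the paper's exact setup.

    import numpy as np

    def linear_attention(X, Wq, Wk, Wv):
        # One head of linear attention: the softmax of standard attention
        # is dropped, leaving (Q K^T) V. X: (seq, d); W*: (d, d_head).
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        return (Q @ K.T) @ V  # (seq, d_head)

    def multi_head_linear_attention(X, heads):
        # Concatenate independent heads; `heads` is a list of (Wq, Wk, Wv).
        return np.concatenate([linear_attention(X, *h) for h in heads], axis=-1)

    rng = np.random.default_rng(0)
    seq, d, d_head, n_heads = 8, 16, 4, 4
    X = rng.standard_normal((seq, d))
    heads = [tuple(0.01 * rng.standard_normal((d, d_head)) for _ in range(3))
             for _ in range(n_heads)]
    print(multi_head_linear_attention(X, heads).shape)  # (8, 16)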