Daniel Kunin

112 posts

@KuninDaniel

postdoc @UCB_MillerInst · PhD @ICMEStanford · creator @SeeingTheory

UC Berkeley · Joined December 2020
307 Following · 1.1K Followers
Pinned Tweet
Daniel Kunin @KuninDaniel
For the last few years, a lot of my work has been driven by the feeling that deep learning is not magic — there are principles, mechanisms, and laws waiting to be understood. This paper is our attempt to say that clearly!
Jamie Simon @learning_mech

1/ Deep learning is going to have a scientific theory. We can see the pieces starting to come together, and it's looking a lot like physics! We're releasing a paper pulling together these emerging threads and giving them a name: learning mechanics. 🔨 arxiv.org/pdf/2604.21691 🔧

Daniel Kunin retweeted
Adam Shai @adamimos
A longstanding dream of interp is to decompose activations into distinct, interpretable parts. But when should we expect that to work, and what even are such parts? New from Simplex: transformers factor their world into orthogonal subspaces, even when it costs accuracy.🧵👇
Daniel Kunin @KuninDaniel
For me, this paper is learning mechanics in action! Mech interp first identified that NNs use Fourier features in algebraic tasks (great work @bilalchughtai_ @justanotherlaw @NeelNanda5). Learning mechanics asks why training produced those features, in that order, with that architecture.
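A minimal numpy sketch of the mechanism behind the Fourier-features claim, assuming the standard modular-addition setup from the grokking literature (an illustration, not code from the cited papers): embedding each residue with cos/sin features lets the angle-addition identities turn addition of inputs into multiplication of features.

```python
import numpy as np

p = 97  # prime modulus, as in the grokking literature
k = 5   # an arbitrary nonzero Fourier frequency (hypothetical choice)

def embed(a):
    # Fourier embedding of residue a: a point on the unit circle.
    theta = 2 * np.pi * k * a / p
    return np.cos(theta), np.sin(theta)

a, b = 34, 81
ca, sa = embed(a)
cb, sb = embed(b)

# Angle-addition identities: features of (a + b) from products of
# features of a and b alone.
cos_sum = ca * cb - sa * sb  # cos(theta_a + theta_b)
sin_sum = sa * cb + ca * sb  # sin(theta_a + theta_b)

# Decode (a + b) mod p by matching angles against every residue c:
# scores[c] = cos(2*pi*k*(c - (a + b)) / p), maximized at c = (a + b) mod p.
angles = 2 * np.pi * k * np.arange(p) / p
scores = np.cos(angles) * cos_sum + np.sin(angles) * sin_sum
print(np.argmax(scores), (a + b) % p)  # both print 18
```

Since p is prime, any frequency k ≠ 0 decodes uniquely; the trained networks in the cited work are reported to use a handful of such frequencies in parallel.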
Daniel Kunin @KuninDaniel
Yes, by leveraging associativity. We explicitly construct efficient solutions: RNNs can compose elements sequentially in k steps, while deep MLPs can compose adjacent pairs in parallel in log k layers, and we find preliminary evidence that GD can discover these solutions!
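A quick sketch of the depth argument, using a hypothetical associative operation (permutation composition) to stand in for the task's elements; this is an illustration of the claim, not the paper's construction. Sequential composition takes k - 1 dependent steps (the RNN-style solution); pairing adjacent elements takes only ceil(log2 k) parallel rounds (the deep-MLP-style solution).

```python
from functools import reduce

def compose_sequential(xs, op):
    # k - 1 dependent steps, like an RNN unrolled over the sequence.
    return reduce(op, xs)

def compose_parallel(xs, op):
    # Each round combines adjacent pairs; pairs within a round are
    # independent, so total depth is ceil(log2(len(xs))) rounds.
    while len(xs) > 1:
        xs = [op(xs[i], xs[i + 1]) if i + 1 < len(xs) else xs[i]
              for i in range(0, len(xs), 2)]
    return xs[0]

# Example associative op: composition of permutations on 3 points,
# (p o q)[i] = p[q[i]].
def pcompose(p, q):
    return tuple(p[i] for i in q)

perms = [(1, 2, 0), (2, 0, 1), (1, 0, 2), (0, 2, 1)]
assert compose_sequential(perms, pcompose) == compose_parallel(perms, pcompose)
```

By associativity the two bracketings agree, which is the structural fact both architectures can exploit.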
Daniel Kunin retweeted
Sebastien Bubeck @SebastienBubeck
From "Mathematical theory of deep learning: Can we do it? Should we do it?" to "There Will Be a Scientific Theory of Deep Learning". It's respectively the title of a talk I gave four years ago, and the title of an arxiv paper from four days ago. I really like the "learning mechanics" perspective (think of it as a continuation of "statistical mechanics", "quantum mechanics", and so on). Several of my last academic papers can be viewed under that lens (e.g. Learning threshold neurons via the “edge of stability”; or LEGO). I'm not as optimistic as the authors of the recent arxiv paper that we will EVER be able to reach what the "physics mechanics" field have achieved, but it's certainly worth trying. Talk: youtu.be/3uRD_lg701k?si… Paper: arxiv.org/abs/2604.21691
Daniel Kunin @KuninDaniel
@tydsh Yes, agreed. "Learning mechanics" is a scientific program many researchers have been building toward for years; our goal was to state the evidence for such a theory clearly. Your works are great examples of Section 2.1 (I have read many). We should have cited them and are adding them now for v2.
Yuandong Tian @tydsh
History repeats itself 😀 The concept of "learning mechanics" is not new; it has actually been explored for a very long time. It is human nature to think deeper than a blind belief in the scaling laws. I have been working on rigorously modeling the training dynamics of deep nonlinear models for many years, with many non-trivial solvable examples in nonlinear dynamics that may be interesting for @learning_mech to take a look at. This includes:
1. Contrastive learning (e.g., arxiv.org/abs/2110.09348, arxiv.org/abs/2201.12680, arxiv.org/abs/2206.01342)
2. Non-contrastive learning (e.g., arxiv.org/abs/2102.06810, arxiv.org/abs/2110.04947)
3. Training dynamics in Transformers (e.g., arxiv.org/abs/2310.00535, arxiv.org/abs/2305.16380)
4. Grokking behaviors (arxiv.org/abs/2509.21519)
5. Spontaneous symmetry breaking (arxiv.org/abs/1703.00560)
6. Mechanisms for forming symbolic solutions from gradient descent (arxiv.org/abs/2410.01779)
I am the first/solo author on most of the works listed above. Code is here: github.com/yuandong-tian/…
[Quoting Jamie Simon @learning_mech's announcement thread, quoted in the pinned tweet above]
Daniel Kunin retweeted
Cengiz Pehlevan @CPehlevan
Great perspective on the theory of deep learning from a stellar group of authors! Physics-inspired ideas will play a central role in shaping this field. Congrats to my group alumni @blake__bordelon and @ABAtanasov for their contributions here and across many influential papers.
[Quoting Jamie Simon @learning_mech's announcement thread, quoted in the pinned tweet above]
Daniel Kunin @KuninDaniel
100% agree. Neuroscience embraces studying the brain at multiple levels — computational, algorithmic, and implementational. I’m excited to see deep learning moving toward the same conversation, with theory and interpretability informing each other!
Eric J. Michaud @ericjmichaud_

It's been so heartening to see deep learning theory folks engage seriously with interpretability recently, and I hope these two communities can talk much, much more. We should seek a unified understanding of neural networks across many levels of analysis.

Daniel Kunin @KuninDaniel
@SuryaGanguli @MasonKamb Many of the ideas, works, and intuitions discussed in Sections 2.1–2.5 grew out of your group — I’m very grateful to have been part of the Ganguli gang!