Anirbit
@anirbit_maths

172 posts

Lecturer in ML, The University of Manchester · Action Editor @ TMLR · Associate Editor @ ACM-TOPML

Joined January 2025
107 Following · 52 Followers
Anirbit @anirbit_maths·
@sidgairo18 This is exactly the mess that TMLR solved. There are no scores in round 1 of reviews; there is only a yes/no decision after rebuttals. There are enough reasons why every system needs to converge to this.
Siddhartha Gairola @sidgairo18·
Food for thought 🤔 I've been thinking about this long and hard, having been reviewing for the popular ML / CV conferences (ICML, ICLR, NeurIPS, CVPR, ICCV, ECCV). With the community submitting papers across all of these, it only makes sense to have a uniform reviewer form, guidelines, rules, and format across these conferences.

Personally I have a real hard time recalibrating my scale from 1-10 (ICLR) to 1-6 for CVPR; then comes ICML, which also has 1-6, but there 3/4 are weak reject/accept instead of borderline reject/accept (as for CVPR). This only gets trickier and worse when you add ICCV, ECCV, and NeurIPS into the mix. Then you add the NLP and robotics conferences, making the entire system more and more confusing, with uncalibrated reviewer scores coming in which may or may not truly reflect the reviewers' intentions.

Happy to hear the thoughts of others. cc: @icmlconf @CVPR @NeurIPSConf @iclr_conf @ICCVConference @eccvconf
[image]
Anirbit @anirbit_maths·
@akshayrangamani Hello Akshay 😁 For a start, gradient flow is a PDE! Way too much of modern AI hinges on understanding gradient flows. Then SDEs underlie so much of noisy S/GD, which we understand through their density evolution under the Fokker-Planck PDE.
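For context, a minimal sketch of the standard objects being referenced here (textbook definitions, not tied to any specific paper):

```latex
% Gradient flow: the continuous-time limit of gradient descent on a loss L
\dot{\theta}(t) = -\nabla L(\theta(t))

% Noisy gradient descent is commonly modeled by the Langevin SDE
d\theta_t = -\nabla L(\theta_t)\,dt + \sqrt{2\beta^{-1}}\,dW_t

% whose density \rho(\theta, t) evolves by the Fokker--Planck PDE
\partial_t \rho = \nabla\cdot(\rho\,\nabla L) + \beta^{-1}\Delta\rho
```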
Anirbit @anirbit_maths·
@thegautamkamath My anecdotal evidence is that rebuttals in general (for the majority?) do the authors no good. Are there statistics on how many papers crossed the acceptance threshold after rebuttals?
Gautam Kamath @thegautamkamath·
Suppose one of NeurIPS/ICML/ICLR decided to do away with all rebuttals. Acceptances/rejections would be decided by the reviewers and the ACs, without input from the authors beyond the submissions. Which would you, as an author and a reviewer jointly, prefer?
Anirbit @anirbit_maths·
#MyXAnniversary 😎 It started with posting about our paper on provable training of nets of any size 😁
[image]
Anirbit @anirbit_maths·
@kamalikac Is there a pathway from mechanistic interpretability to task-specific neural architecture search? I would love to know your views on this 🙂
Kamalika Chaudhuri @kamalikac·
I am looking for more topics for blog-posts; please DM your suggestions! Topics can be anything related to AI, privacy/security/safety, generalization, LLMs, career advice. A reminder that I cannot blog about my employers or anything specific/sensitive to them.
Andrew Gordon Wilson @andrewgwils·
Have you ever reversed your position on a strongly held technical belief? What was the belief and what convinced you to change your mind?
Anirbit @anirbit_maths·
Seems our work on using Villani functions to prove convergence of neural training is now among the most-read recent papers in the IMA journal, II! 💥 This is the first (only?) truly "beyond-NTK" proof. Do check our related recent work, arxiv.org/abs/2503.10428…
[image]
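As context, a rough sketch of the definition being invoked (my paraphrase from the hypocoercivity literature; the paper's exact conditions and constants may differ):

```latex
% A Villani function V \in C^\infty(\mathbb{R}^d) is confining:
\lim_{\|x\|\to\infty} V(x) = \infty, \qquad e^{-V} \in L^1(\mathbb{R}^d)

% plus a Lyapunov-type growth condition (up to temperature-dependent
% constants) that yields a spectral gap for the Fokker--Planck operator:
\lim_{\|x\|\to\infty} \big( \|\nabla V(x)\|^2 - \Delta V(x) \big) = \infty
```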
Anirbit @anirbit_maths·
@pfau I think parameterized PDEs provide a natural notion of in/out-of-domain. No amount of data can encompass fluid flow at all possible viscosities. There is always an unseen value of that parameter at which one can ask for predictions - and possibly falter?
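A toy version of this point, as a minimal sketch (the heat equation, parameter ranges, and grid sizes below are illustrative choices, not from the thread):

```python
import numpy as np

def solve_heat(nu, nx=64, nt=200, dt=1e-4):
    """Explicit finite-difference solve of u_t = nu * u_xx on [0, 1],
    periodic boundary, from a fixed sine initial condition."""
    x = np.linspace(0.0, 1.0, nx, endpoint=False)
    dx = x[1] - x[0]
    assert nu * dt / dx**2 <= 0.5, "explicit scheme unstable for this nu"
    u = np.sin(2 * np.pi * x)
    for _ in range(nt):
        u = u + nu * dt / dx**2 * (np.roll(u, 1) - 2 * u + np.roll(u, -1))
    return u

# "In-domain": the viscosities a surrogate model would see in training.
train_nus = np.linspace(0.1, 0.5, 20)
train_data = [(nu, solve_heat(nu)) for nu in train_nus]

# "Out-of-domain": no amount of the data above covers this query.
ood_nu = 5.0
ood_solution = solve_heat(ood_nu, dt=1e-5)  # smaller dt keeps the solver stable
```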
David Pfau @pfau·
This is the key difference between in-domain and out-of-domain generalization, and we still have not truly solved out-of-domain generalization. It just turns out you can build world changing technology by throwing so much data at things that the entire universe is in-domain.
Niels Rogge @NielsRogge

One of the best visual explanations I've ever seen for why scaling Transformers works but is suboptimal, as it's just brute-forcing things, by @YesThisIsLion (co-author of the Transformer) on @MLStreetTalk:

"In the (rejected) paper "Intelligent Matrix Exponentiation", they show the decision boundary of a classic MLP with a ReLU/Tanh activation function on the classic Spiral dataset."

"You can see they both technically solve it with great scores on the test set. Next, they show the decision boundary of the "M-layer" they propose in the paper. And it represents the spiral ... as a spiral!"

"Shouldn't we? If the data is a spiral ... shouldn't we represent it as a spiral?"

"If you look back at the decision boundaries of the MLP, it's clear that you just have these tiny, piecewise separations without learning the concept of a spiral. That's what I mean!"

"If you train these things enough, it can fit the spiral and get high accuracy. But there's no indication that the MLP actually understands a spiral. When you represent it as a spiral, it extrapolates correctly, 'cause the spiral just keeps going out."
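The experiment described is easy to reproduce in spirit. A minimal sketch, assuming the classic two-spirals construction (the network size, noise level, and split here are my choices, not the paper's):

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def two_spirals(n=1000, noise=0.05, turns=2, seed=0):
    """Generate the classic two-spirals binary classification dataset;
    the second arm is the point reflection of the first."""
    rng = np.random.default_rng(seed)
    t = rng.uniform(0.25, 1.0, n) * turns * 2 * np.pi
    sign = rng.integers(0, 2, n) * 2 - 1  # +1 / -1 picks the spiral arm
    x = np.stack([t * np.cos(t), t * np.sin(t)], axis=1) * sign[:, None]
    x += rng.normal(scale=noise * t[:, None], size=x.shape)
    return x, (sign > 0).astype(int)

X, y = two_spirals()
clf = MLPClassifier(hidden_layer_sizes=(64, 64), activation="relu",
                    max_iter=5000, random_state=0).fit(X[:800], y[:800])
print("test accuracy:", clf.score(X[800:], y[800:]))

# High test accuracy, but the learned boundary is piecewise-linear:
# query points further out along the spiral than any training point
# and the predictions are no longer reliable.
```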

Anirbit @anirbit_maths·
This was the slide where I outlined the 2 key questions which I think are foundational to progress with neural operators & #AI4Science. ACM IKDD #CODS2025 gave a platform for such discussions between new academics and subject stalwarts in the audience 💥
[image]
Anirbit @anirbit_maths

Gave my "new faculty highlight" talk at the ACM IKDD #CODS 2025 - where I outlined a vision for neural operator research - and reviewed our 2 #TMLR papers from 2024, in the theme.

Anirbit @anirbit_maths·
Gave my "new faculty highlight" talk at the ACM IKDD #CODS 2025 - where I outlined a vision for neural operator research - and reviewed our 2 #TMLR papers from 2024, in the theme.
[3 images]
Anirbit @anirbit_maths·
@deepcohen As much as I agree with the idea of giving "falsifiable predictions", I feel it would be near-impossible to publish such papers. A lot of theory that is cool to do would go undone under such an official requirement. I hope to be wrong about this!
Jeremy Cohen @deepcohen·
So, we should focus on theories that can reliably predict “the small things” about deep learning, and gradually broaden the scope of what we can predict, until we have theory that can reliably predict “the big things” about deep learning too.
Jeremy Cohen @deepcohen·
The goal of deep learning theory/science is to guide practice. But most practical questions are >1 paper away from being legitimately answered by theory. How, then, can we make progress, without access to the ideal reward signal of “does this theory give us a SOTA algorithm?” …
Anirbit @anirbit_maths·
@PreetumNakkiran Of course one can't rule out that tomorrow a training guarantee might emerge that critically leverages some subnet property. If that happens, it would be a serious bolstering of the mechanistic-interpretability view.
Anirbit @anirbit_maths·
@PreetumNakkiran "understand" means opposite things to these 2 communities. As someone wanting provable training, I am far less concerned about the complexity of my proof. Mechanistics are hoping the otherway, that key features of big models somehow exist in simpler subnets.
Preetum Nakkiran @PreetumNakkiran·
“Theory of deep learning” went through similar discussions about its goals & purpose some ~5yrs ago. Someone should write about the relations between mech-interp & theory: two communities w/ fundamentally similar motivations (“understand neural nets”), but very different methods.
David Bau @davidbau

At the #Neurips2025 mechanistic interpretability workshop I gave a brief talk about Venetian glassmaking, since I think we face a similar moment in AI research today. Here is a blog post summarizing the talk: davidbau.com/archives/2025/…

Anirbit @anirbit_maths·
@gowthami_s David Pfau's results are from years ago. It was clear right then that AI-for-Science works 🙂
Gowthami @gowthami_s·
Tbh, this NeurIPS changed my perspective on "AI for science"! It looks like things are working, and there's also a lot of interest from traditional companies - both in material discovery and biotech. A topic worth exploring for the current generation of PhDs!
Anirbit @anirbit_maths·
A recent paper improved one of my PhD 1st-year results by 0.63. Surprising that our upper bound held for 9 yrs 🤣 These are things only maths people get excited about 😁 Still open: are 2 layers sufficient for a net to compute the maximum of n numbers? 💥
[image]
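For context on the open question, the standard facts are: a single hidden ReLU layer computes the max of 2 numbers exactly, and a binary tree of such gadgets computes the max of n numbers in O(log n) depth; whether 2 layers suffice for general n is the open part. A minimal sketch verifying the construction (plain numpy; variable names are my own):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def max2(a, b):
    """max(a, b) via one hidden ReLU layer:
    max(a, b) = b + ReLU(a - b), with b = ReLU(b) - ReLU(-b)."""
    return relu(a - b) + relu(b) - relu(-b)

def max_n(xs):
    """Binary tree of max2 gadgets: O(log n) ReLU layers deep."""
    xs = list(xs)
    while len(xs) > 1:
        paired = [max2(xs[i], xs[i + 1]) for i in range(0, len(xs) - 1, 2)]
        if len(xs) % 2:
            paired.append(xs[-1])  # odd element passes through
        xs = paired
    return xs[0]

vals = np.random.default_rng(0).normal(size=11)
assert np.isclose(max_n(vals), vals.max())
```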