Muthu Chidambaram

23 posts

@mle_muthu

Joined May 2024
111 Following 92 Followers
Pinned Tweet
Muthu Chidambaram @mle_muthu
Back with some more work on calibration... This time to tell you that, while ECE (and variants) may be fine, reporting only ECE and accuracy when claiming improvements due to a recalibration method is not. 🧵 Paper: arxiv.org/abs/2406.04068 Code: github.com/2014mchidamb/r…
Muthu Chidambaram retweeted
Sitan Chen @sitanch
Guidance is one of the key ingredients behind diffusion models' impressive generation capabilities. But what does it actually do? In new work led by @mle_muthu + Khashayar and joint w/ @oldheneel + Jianfeng, we rigorously pin down its behavior in a simple but rich setting 🧵1/
Muthu Chidambaram retweeted
Preetum Nakkiran @PreetumNakkiran
Our tutorial on diffusion & flows is out! We made every effort to simplify the math, while still being correct. Hope you enjoy! (Link below -- it's long but is split into 5 mostly-self-contained chapters). Lots of fun working with @ArwenBradley @oh_that_hat @advani_madhu on this
Muthu Chidambaram @mle_muthu
@DHolzmueller Thank you for the reference! The ideas in this work are definitely super related and I will update the related work in our paper (I missed some other important references too from the literature on scoring rules that I need to add 😅).
David Holzmüller @DHolzmueller
@mle_muthu Interesting work and important point! You might find the quoted paper relevant, which is making a similar point in a different way: x.com/GaelVaroquaux/…
Gael Varoquaux 🦋 @GaelVaroquaux

🍾 #ICLR2023 paper accepted: Beyond Calibration. Expected calibration error is not enough to control output probabilities of predictors. We introduce the first grouping-loss estimator, thus characterizing the epistemic error of neural networks arxiv.org/abs/2210.16315 1/3

Muthu Chidambaram @mle_muthu
I'll leave the details of the visualization to the paper, as well as this Python package that lets you play with it: pypi.org/project/sharpc…. This is a work-in-progress and I would love to hear any thoughts on better visualizations. Thanks for making it this far!
Muthu Chidambaram @mle_muthu
We also develop theory relating the decomposition of specific proper scoring rules into calibration and "sharpness" to the confidence calibration problem, and propose a way to jointly visualize pointwise calibration and generalization that extends reliability diagrams.
Muthu Chidambaram @mle_muthu
But when we look at additional generalization metrics such as NLL or Brier score, it becomes clear that this recalibration approach is off. In effect, it's throwing away all of the predicted probability information. So don't forget to report these!
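To make the point about NLL and Brier score concrete, here is a minimal sketch on synthetic data. Everything in it is an assumption for illustration: the toy model's confidences are informative and calibrated by construction, mean replacement is implemented as a single constant confidence, and the helper names (`brier_score`, `nll`) are hypothetical rather than taken from the paper's code.

```python
import numpy as np

def brier_score(confidences, correct):
    """Mean squared error between confidence and the 0/1 correctness indicator."""
    return np.mean((confidences - correct) ** 2)

def nll(confidences, correct, eps=1e-12):
    """Binary negative log-likelihood of correctness under the stated confidence."""
    p = np.clip(confidences, eps, 1.0 - eps)
    return -np.mean(correct * np.log(p) + (1.0 - correct) * np.log(1.0 - p))

rng = np.random.default_rng(0)
n = 20000

# Toy model (assumption): each prediction is correct with probability equal to
# its confidence, so the confidences carry real per-example information.
cal_conf = rng.uniform(0.5, 1.0, size=n)
cal_correct = (rng.random(n) < cal_conf).astype(float)
test_conf = rng.uniform(0.5, 1.0, size=n)
test_correct = (rng.random(n) < test_conf).astype(float)

# Mean replacement: every test confidence becomes the calibration-set accuracy.
mrr_conf = np.full(n, cal_correct.mean())

brier_model = brier_score(test_conf, test_correct)
brier_mrr = brier_score(mrr_conf, test_correct)
nll_model = nll(test_conf, test_correct)
nll_mrr = nll(mrr_conf, test_correct)

print(f"Brier: model={brier_model:.4f}  mean-replaced={brier_mrr:.4f}")
print(f"NLL:   model={nll_model:.4f}  mean-replaced={nll_mrr:.4f}")
```

In this toy setup both scores get worse after mean replacement, because a constant confidence cannot distinguish easy examples from hard ones. With a badly miscalibrated starting model the comparison could go either way, which is one more reason to report these metrics alongside ECE.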
Muthu Chidambaram @mle_muthu
We can separate model predictions from model confidence, and then replace model confidence with mean model accuracy on a held-out calibration set ("mean replacement recalibration", MRR). If we only report accuracy and calibration error, this approach looks really good.
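A rough sketch of how mean replacement could look in code, based only on the description in this tweet. The toy overconfident model, the equal-width binned ECE, and the function names are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def binned_ece(confidences, correct, n_bins=10):
    """Equal-width binned expected calibration error on (confidence, correct) pairs."""
    bin_ids = np.minimum((confidences * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            ece += mask.mean() * gap
    return ece

def mean_replacement_recalibrate(cal_correct, n_test):
    """Replace every test confidence with mean accuracy on the calibration set."""
    return np.full(n_test, cal_correct.mean())

rng = np.random.default_rng(0)
n = 2000

# Toy overconfident model (assumption): about 70% accurate,
# but it reports confidences between 0.9 and 1.0.
cal_correct = (rng.random(n) < 0.7).astype(float)
test_correct = (rng.random(n) < 0.7).astype(float)
test_conf = rng.uniform(0.9, 1.0, size=n)

mrr_conf = mean_replacement_recalibrate(cal_correct, n)

ece_before = binned_ece(test_conf, test_correct)
ece_after = binned_ece(mrr_conf, test_correct)
print(f"ECE before: {ece_before:.4f}  ECE after mean replacement: {ece_after:.4f}")
```

The predicted classes are never touched, so accuracy is unchanged, and the ECE after replacement is just the gap between calibration-set and test-set accuracy.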
Muthu Chidambaram @mle_muthu
In the multi-class setting, we typically reduce to binary calibration by replacing labels with accuracy and distributions over classes with the max predicted probability ("confidence calibration"). In this case, the previous example suggests a trivial recalibration strategy...
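The reduction to confidence calibration described here can be sketched in a few lines. The function name `to_confidence_problem` and the three-example data are hypothetical, chosen only to show the mapping from class distributions to (confidence, correct) pairs.

```python
import numpy as np

def to_confidence_problem(class_probs, labels):
    """Reduce multi-class predictions to binary (confidence, correct) pairs."""
    class_probs = np.asarray(class_probs, dtype=float)
    labels = np.asarray(labels)
    predictions = class_probs.argmax(axis=1)          # predicted class per example
    confidences = class_probs.max(axis=1)             # max predicted probability
    correct = (predictions == labels).astype(float)   # 1.0 if right, else 0.0
    return predictions, confidences, correct

# Three examples over three classes (made-up numbers).
class_probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.3, 0.3, 0.4],
])
labels = np.array([0, 2, 2])

predictions, confidences, correct = to_confidence_problem(class_probs, labels)
print(predictions, confidences, correct)
```

After this step the calibration question only involves `confidences` and `correct`, which is exactly why replacing the confidences wholesale becomes possible without changing `predictions`.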
Muthu Chidambaram @mle_muthu
A key issue with calibration metrics is that it is possible for a model to have zero calibration error but be effectively useless. For example, consider binary classification in which both classes appear with equal probability. A predictor that always predicts 1/2 is calibrated.
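The constant-1/2 example is easy to check numerically. The sketch below assumes a standard equal-width binned ECE and a perfectly balanced toy label set; the function name `binned_ece` is illustrative.

```python
import numpy as np

def binned_ece(probs, labels, n_bins=10):
    """Equal-width binned expected calibration error for binary predictions."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    # Assign each prediction to one of n_bins bins on [0, 1].
    bin_ids = np.minimum((probs * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(probs[mask].mean() - labels[mask].mean())
            ece += mask.mean() * gap
    return ece

# Balanced binary labels: half zeros, half ones.
labels = np.array([0, 1] * 500)

# A predictor that ignores its input and always outputs 1/2.
constant_probs = np.full(labels.shape, 0.5)

ece = binned_ece(constant_probs, labels)
accuracy = np.mean((constant_probs > 0.5) == labels)
print(f"ECE: {ece}  accuracy: {accuracy}")
```

The calibration error is exactly zero, yet thresholding the output gives chance-level accuracy, so the predictor carries no information about any individual example.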
Muthu Chidambaram @mle_muthu
We then compare the logit-smoothed ECE, binned ECE, and SmoothECE of various near-SOTA models on CIFAR-10, CIFAR-100, and ImageNet and show that they all behave nearly identically in practice. So the old binned ECE results are probably fine! 🧵(5/6)
Muthu Chidambaram retweeted
Daniel Beaglehole @dbeagleholeCS
Neural collapse (NC) refers to the structure in representations of general neural nets, where data "collapse" to the class means. We find that layer-wise multiplication with the average gradient outer product (AGOP) of a kernel method induces NC in untrained neural networks:
Muthu Chidambaram @mle_muthu
This is what people mean when they say elegant theory