Muthu Chidambaram

23 posts

@mle_muthu

Joined May 2024

111 Following · 92 Followers

Pinned Tweet

Muthu Chidambaram @mle_muthu · Jun 9

Back with some more work on calibration... This time to tell you that, while ECE (and variants) may be fine, reporting only ECE and accuracy when claiming improvements due to a recalibration method is not. 🧵 Paper: arxiv.org/abs/2406.04068 Code: github.com/2014mchidamb/r…

Muthu Chidambaram retweeted

Sitan Chen @sitanch · Sep 23

Guidance is one of the key ingredients behind diffusion models' impressive generation capabilities. But what does it actually do? In new work led by @mle_muthu + Khashayar and joint w/ @oldheneel + Jianfeng, we rigorously pin down its behavior in a simple but rich setting 🧵1/

Muthu Chidambaram retweeted

Preetum Nakkiran @PreetumNakkiran · Jun 14

Our tutorial on diffusion & flows is out! We made every effort to simplify the math, while still being correct. Hope you enjoy! (Link below -- it's long but is split into 5 mostly-self-contained chapters). lots of fun working with @ArwenBradley @oh_that_hat @advani_madhu on this

Muthu Chidambaram @mle_muthu · Jun 13

@DHolzmueller Thank you for the reference! The ideas in this work are definitely super related and I will update the related work in our paper (I missed some other important references too from the literature on scoring rules that I need to add 😅).

David Holzmüller @DHolzmueller · Jun 11

@mle_muthu Interesting work and important point! You might find the quoted paper relevant, which is making a similar point in a different way: x.com/GaelVaroquaux/…

Gael Varoquaux 🦋 @GaelVaroquaux

🍾 #ICLR2023 paper accepted: Beyond Calibration Expected calibration error is not enough to control output probabilities of predictors. We introduce the first grouping-loss estimator, thus characterizing the epistemic error of neural networks arxiv.org/abs/2210.16315 1/3

Muthu Chidambaram @mle_muthu · Jun 9

I'll leave the details of the visualization to the paper, as well as this Python package that lets you play with it: pypi.org/project/sharpc…. This is a work-in-progress and I would love to hear any thoughts on better visualizations. Thanks for making it this far!

Muthu Chidambaram @mle_muthu · Jun 9

We also develop theory relating the decomposition of specific proper scoring rules into calibration and "sharpness" to the confidence calibration problem, and propose a way to jointly visualize pointwise calibration and generalization that extends reliability diagrams.
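As one standard illustration of such a decomposition (the Brier score for binary outcomes; the paper's own definitions of calibration and "sharpness" may differ), write $f(X) \in [0, 1]$ for the predicted probability and $Y \in \{0, 1\}$ for the outcome. The cross term vanishes by the tower property, giving:

```latex
\mathbb{E}\big[(f(X) - Y)^2\big]
  = \underbrace{\mathbb{E}\big[(f(X) - \mathbb{E}[Y \mid f(X)])^2\big]}_{\text{calibration error}}
  + \underbrace{\mathbb{E}\big[(\mathbb{E}[Y \mid f(X)] - Y)^2\big]}_{\text{refinement (smaller for sharper predictors)}}
```

For a constant predictor $f \equiv 1/2$ on balanced classes, the first term is $0$ and the second is $1/4$: perfectly calibrated, but with no sharpness at all.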

Muthu Chidambaram @mle_muthu · Jun 9

But we can see that when we look at additional generalization metrics such as NLL or Brier score, it becomes clear that this recalibration approach is off. In effect it's throwing away all of the predicted probability information. So don't forget to report these!
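A rough sketch of why these metrics expose the problem, under stated assumptions: the data is synthetic, the model's confidences are calibrated by construction, and Brier score and NLL are computed on the binary "prediction was correct" event rather than on full class distributions (the paper may score differently). The helper names `brier` and `nll` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Synthetic, calibrated-by-construction confidences: a prediction made with
# confidence c is correct with probability c.
confidences = rng.uniform(0.5, 1.0, size=n)
correct = (rng.uniform(size=n) < confidences).astype(float)

# First half acts as the held-out calibration set, second half as the test set.
cal, test = slice(0, n // 2), slice(n // 2, n)

# Constant confidence equal to calibration-set accuracy (mean replacement).
mrr_confidences = np.full(n // 2, correct[cal].mean())

def brier(conf, hit):
    """Mean squared error between confidence and the correctness indicator."""
    return np.mean((conf - hit) ** 2)

def nll(conf, hit, eps=1e-12):
    """Binary negative log-likelihood of the correctness indicator."""
    conf = np.clip(conf, eps, 1 - eps)
    return -np.mean(hit * np.log(conf) + (1 - hit) * np.log(1 - conf))

brier_original = brier(confidences[test], correct[test])
brier_mrr = brier(mrr_confidences, correct[test])
nll_original = nll(confidences[test], correct[test])
nll_mrr = nll(mrr_confidences, correct[test])

print("Brier  original:", brier_original, " constant:", brier_mrr)
print("NLL    original:", nll_original, " constant:", nll_mrr)
```

Both confidence vectors are close to calibrated here, yet the constant one scores worse on Brier and NLL because it no longer distinguishes easy examples from hard ones.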

Muthu Chidambaram @mle_muthu · Jun 9

We can separate model predictions from model confidence, and then replace model confidence with mean model accuracy on a held-out calibration set ("mean replacement recalibration", MRR). If we only report accuracy and calibration error, this approach looks really good.
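A minimal sketch of MRR as described in this tweet, not the paper's implementation: the function name `mean_replacement_recalibration` and the toy arrays are hypothetical, and the toy data is chosen so that calibration-set and test-set accuracy coincide.

```python
import numpy as np

def mean_replacement_recalibration(cal_probs, cal_labels, test_probs):
    """Keep argmax predictions; replace every confidence with calibration-set accuracy."""
    cal_predictions = np.asarray(cal_probs).argmax(axis=1)
    cal_accuracy = (cal_predictions == np.asarray(cal_labels)).mean()
    test_predictions = np.asarray(test_probs).argmax(axis=1)
    test_confidences = np.full(test_predictions.shape, cal_accuracy)
    return test_predictions, test_confidences

# Hypothetical held-out calibration set (3 of 4 predictions correct).
cal_probs = np.array([[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]])
cal_labels = np.array([0, 0, 1, 1])

# Hypothetical test set (also 3 of 4 predictions correct).
test_probs = np.array([[0.55, 0.45], [0.2, 0.8], [0.99, 0.01], [0.4, 0.6]])
test_labels = np.array([0, 1, 0, 0])

predictions, mrr_conf = mean_replacement_recalibration(cal_probs, cal_labels, test_probs)
test_accuracy = (predictions == test_labels).mean()

# All examples share one confidence value, so the calibration gap is just
# |test accuracy - calibration-set accuracy|.
calibration_gap = abs(test_accuracy - mrr_conf[0])
print(predictions, mrr_conf, test_accuracy, calibration_gap)
```

Accuracy is untouched because the argmax predictions never change, and the calibration gap is small whenever calibration-set accuracy is close to test accuracy. That is exactly why reporting only those two numbers makes the method look good.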

Muthu Chidambaram @mle_muthu · Jun 9

In the multi-class setting, we typically reduce to binary calibration by replacing labels with accuracy and distributions over classes with the max predicted probability ("confidence calibration"). In this case, the previous example suggests a trivial recalibration strategy...
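The reduction to confidence calibration can be sketched as follows. The helper name `to_confidence_problem` and the probabilities are hypothetical; the only assumption is that the model outputs one probability vector per example.

```python
import numpy as np

def to_confidence_problem(probs, labels):
    """Reduce multi-class outputs to (prediction, confidence, correctness) triples."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    predictions = probs.argmax(axis=1)   # predicted class
    confidences = probs.max(axis=1)      # max predicted probability
    correct = (predictions == labels).astype(float)  # binary "label": was it right?
    return predictions, confidences, correct

# Hypothetical 3-class predictions for three examples.
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.3, 0.3, 0.4],
])
labels = np.array([0, 2, 2])

predictions, confidences, correct = to_confidence_problem(probs, labels)
print(predictions, confidences, correct)
```

After this step, calibration is assessed on the pairs `(confidences, correct)` as if it were a binary problem, and everything else in the probability vector is ignored.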

Muthu Chidambaram @mle_muthu · Jun 9

A key issue with calibration metrics is that it is possible for a model to have zero calibration error but be effectively useless. For example, consider binary classification in which both classes appear with equal probability. A predictor that always predicts 1/2 is calibrated.
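A small sketch of this example, assuming an equal-width binned ECE and hypothetical, exactly balanced labels; `binned_ece` is an illustrative helper, not code from the paper.

```python
import numpy as np

def binned_ece(confidences, outcomes, n_bins=10):
    """Equal-width binned ECE: bin-weighted mean of |mean outcome - mean confidence|."""
    confidences = np.asarray(confidences, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    # Map each confidence in [0, 1] to a bin index; 1.0 falls in the last bin.
    bin_ids = np.minimum((confidences * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(outcomes[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # weight by the fraction of points in the bin
    return ece

# Hypothetical balanced binary labels: exactly half are 1.
labels = np.array([0, 1] * 500)

# A predictor that ignores its input and always outputs 1/2.
constant_probs = np.full(labels.shape, 0.5)

ece_constant = binned_ece(constant_probs, labels)
print(ece_constant)  # prints 0.0
```

The calibration error is zero, yet the predictor says nothing about any individual example, which is the sense in which it is "effectively useless".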

Muthu Chidambaram @mle_muthu · Jun 5

Lastly, I want to mention that this work sprung out of the Random Theory workshop (cosmc.net/rt23/), which I cannot recommend highly enough. Also, code for the paper is available at: github.com/2014mchidamb/h…. Thanks for reading! 🧵(6/6)

Muthu Chidambaram @mle_muthu · Jun 5

We then compare the logit-smoothed ECE, binned ECE, and SmoothECE of various near-SOTA models on CIFAR-10, CIFAR-100, and ImageNet and show that they all behave nearly identically in practice. So the old binned ECE results are probably fine! 🧵(5/6)

Muthu Chidambaram @mle_muthu · Jun 5

Want to calibrate your ML models but worried about erratic behavior in calibration metrics? Good news for you, thanks to joint work with wonderful collaborators @oldheneel, @cosmc, @eigenstate that will be at #ICML 2024: arxiv.org/abs/2402.10046. Short 🧵(1/6)

Muthu Chidambaram @mle_muthu · May 29

@gabrielpeyre @mblondel_ml @GaelVaroquaux There's an open PR for the functionality you want: github.com/scipy/scipy/pu…. You could implement this in pure scipy using just csr_matrix as follows, but not sure if there are better ways

Gabriel Peyré @gabrielpeyre · May 29

@mblondel_ml @GaelVaroquaux Would it be efficient if coded in Python (because each row has a different number of nonzero elements)?

Muthu Chidambaram retweeted

Daniel Beaglehole @dbeagleholeCS · May 28

Neural collapse (NC) refers to the structure in representations of general neural nets, where data "collapse" to the class means. We find that layer-wise multiplication with the average gradient outer product (AGOP) of a kernel method induces NC in untrained neural networks:

Muthu Chidambaram @mle_muthu · May 24

This is what people mean when they say elegant theory
