Sebastian G. Gruber (@SebGGruber)
Postdoc in Uncertainty Quantification (Machine Learning)
Leuven · Joined June 2022
199 Following · 71 Followers · 50 posts

David Pfau (@pfau):
@FrnkNlsn Check out the now-properly-referenced preprint here, if you have a penchant for information geometry, or just want to get a snapshot of what I was thinking about in the ancient history of 2013: arxiv.org/abs/2511.08789
David Pfau (@pfau):
New (sort of) preprint on arXiv today: a generalized bias-variance decomposition for Bregman divergences!
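The engine behind such decompositions is the Bregman "mean" identity: for a random variable X with mean mu and any fixed point q, E[d_F(X, q)] = E[d_F(X, mu)] + d_F(mu, q), splitting the expected divergence into a variance-like and a bias-like term. A minimal numeric check, using the scalar generalized-KL generator F(x) = x log x − x as an illustrative choice (not necessarily the exact setting of the preprint):

```python
import math
import random

def d_genkl(p, q):
    # Bregman divergence generated by F(x) = x*log(x) - x on (0, inf),
    # i.e. the scalar generalized KL divergence
    return p * math.log(p / q) - p + q

random.seed(0)
xs = [random.uniform(0.5, 2.0) for _ in range(10_000)]
mu = sum(xs) / len(xs)   # empirical mean of the samples
q = 1.3                  # arbitrary fixed prediction

# E[d(X, q)] = E[d(X, mu)] + d(mu, q): "variance" term + "bias" term
lhs = sum(d_genkl(x, q) for x in xs) / len(xs)
rhs = sum(d_genkl(x, mu) for x in xs) / len(xs) + d_genkl(mu, q)
print(lhs, rhs)  # equal up to floating-point error
```

The identity is algebraic, so with the empirical mean it holds exactly, not just asymptotically; squared error (F(x) = x²) recovers the textbook bias-variance split.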
David Holzmüller (@DHolzmueller):
@LChoshen @Eugene_Berta @BachFrancis 4) The irreducible loss (due to noise) plus the extra "grouping loss" that arises from having the same prediction f(X_1) = f(X_2) for X_1, X_2 with P(Y|X_1) != P(Y|X_2). (This means you can't separate the two probabilities with a post-hoc transformation g(f(X)) anymore.)
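The grouping-loss point above can be made concrete with two inputs and the Brier score. In this toy sketch the probabilities 0.2 and 0.8 are hypothetical, chosen only to show that a post-hoc map g(f(X)) cannot remove the loss caused by a shared score:

```python
# Two inputs mapped to the same score f(x1) = f(x2) = 0.5,
# but with different true conditional probabilities.
p1, p2 = 0.2, 0.8          # P(Y=1 | X_1), P(Y=1 | X_2) (hypothetical)

# Irreducible (noise) part of the expected Brier score:
noise = 0.5 * (p1 * (1 - p1) + p2 * (1 - p2))

# A post-hoc recalibration g can only move the shared score to ONE
# value; for the Brier score the optimum is the mean of p1 and p2.
g = 0.5 * (p1 + p2)

# Expected Brier score of the best recalibrated predictor:
brier = 0.5 * ((p1 * (1 - g)**2 + (1 - p1) * g**2)
               + (p2 * (1 - g)**2 + (1 - p2) * g**2))

grouping = brier - noise   # loss that no g(f(X)) can remove
print(noise, grouping)     # ≈ 0.16 and ≈ 0.09
```

The residual 0.09 is exactly E[(P(Y|X) − g)²]: it vanishes only if the model assigns distinct scores to inputs with distinct conditional probabilities.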
Sebastian G. Gruber (@SebGGruber):
Looking forward to discussions about kernels, uncertainty, bias-variance decompositions, or my other research topic: calibration! See you on Wed, 24 Jul, 11:30 a.m. CEST at Hall C 4-9.
Sebastian G. Gruber (@SebGGruber):
The decomposition provides a framework for understanding the generalization performance and uncertainty of generative models. Specifically, we outperform other uncertainty baselines at predicting the answer correctness of LLMs: arxiv.org/abs/2310.05833
Sebastian G. Gruber (@SebGGruber):
28 years ago, the bias-variance-covariance decomposition of the mean squared error was introduced. This week at @icmlconf, @BuettnerFlo and I will show that it also holds for kernel-based loss functions, which we use for image, audio, and language generation experiments...
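The classical squared-error case (Ueda–Nakano, 1996) can be verified numerically: for an ensemble average of M members, MSE = (mean bias)² + (1/M)·(mean variance) + (1 − 1/M)·(mean pairwise covariance). A sketch with simulated, correlated members and a fixed target (illustrative numbers, not the paper's kernel-score setting):

```python
import random

random.seed(1)
M, T, y = 3, 5000, 1.0  # ensemble members, trials, fixed target

# Correlated member predictions: shared noise + individual noise.
preds = []
for _ in range(T):
    shared = random.gauss(0, 1)
    preds.append([0.8 + 0.5 * shared + 0.3 * random.gauss(0, 1)
                  for _ in range(M)])

mean = lambda v: sum(v) / len(v)
Ef = [mean([p[i] for p in preds]) for i in range(M)]          # E[f_i]
cov = lambda i, j: mean([(p[i] - Ef[i]) * (p[j] - Ef[j]) for p in preds])

bias = mean([Ef[i] - y for i in range(M)])                    # mean bias
var = mean([cov(i, i) for i in range(M)])                     # mean variance
covar = mean([cov(i, j) for i in range(M)
              for j in range(M) if i != j])                   # mean covariance

ens_mse = mean([(mean(p) - y) ** 2 for p in preds])
decomp = bias ** 2 + var / M + (1 - 1 / M) * covar
print(abs(ens_mse - decomp))  # ~0: exact up to float error
```

With empirical moments the identity is exact; the (1 − 1/M) covariance term is what makes member diversity, not just individual accuracy, matter.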
Sebastian G. Gruber (@SebGGruber):
@BlackHC In my experience, minimal changes to the optimization procedure can have major effects on the temperature scaling results. It is difficult to pinpoint the issue without toying around with the implementation.
Andreas Kirsch 🇺🇦 (@BlackHC):
Is there anything magic I need to do to make netcal's TemperatureScaling work? I just handwrote an LBFGS-based calibration, which seems to work, but TemperatureScaling somehow doesn't do anything at all 😐
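For reference, temperature scaling itself is tiny: fit a single scalar T that divides the logits so as to minimize validation NLL. A minimal, library-free sketch (this is not netcal's API, and it uses a bracketed 1-D search instead of LBFGS as a robust stand-in):

```python
import math

def nll(logits, labels, T):
    # mean negative log-likelihood of softmax(logits / T)
    total = 0.0
    for z, y in zip(logits, labels):
        zs = [v / T for v in z]
        m = max(zs)
        logZ = m + math.log(sum(math.exp(v - m) for v in zs))
        total += logZ - zs[y]
    return total / len(logits)

def fit_temperature(logits, labels, lo=0.05, hi=10.0, iters=60):
    # golden-section search over T on [lo, hi]; in practice the NLL
    # is well-behaved enough in this single parameter
    phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    for _ in range(iters):
        c, d = b - phi * (b - a), a + phi * (b - a)
        if nll(logits, labels, c) < nll(logits, labels, d):
            b = d
        else:
            a = c
    return (a + b) / 2

# Example: logits (ln 3, 0), with 3 of 4 labels matching the argmax
# class; the calibrated probability is 0.75, so the optimal T is 1.
logits = [[math.log(3.0), 0.0]] * 4
labels = [0, 0, 0, 1]
print(fit_temperature(logits, labels))  # ≈ 1.0
```

If a library's fit returns T ≈ 1 on clearly miscalibrated logits, comparing its NLL at the returned T against a coarse grid of temperatures is a quick way to tell whether the optimizer or the data pipeline is at fault.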
Sebastian G. Gruber (@SebGGruber):
@profgavinbrown @csmcr @RiccardoAli1 Conditionally, yes. If you restrict to parametric proper scores, then you have a bijection to exponential-family NLLs and Bregman divergences. However, proper scores are also defined for non-parametric distributions. That's why we had to use functional Bregman divergences in the paper.
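The correspondence being discussed is the classical Savage representation; a sketch in one common convention (signs and orientation vary across papers, so treat this as background rather than the thread's exact statement):

```latex
% For a proper score S, oriented so that higher is better, define the
% convex "entropy" function  G(p) = \mathbb{E}_{Y \sim p}[S(p, Y)].
% Propriety is equivalent to the affine (Savage) representation
\mathbb{E}_{Y \sim p}[S(q, Y)] = G(q) + \langle \nabla G(q),\; p - q \rangle ,
% and the gap to the optimal expected score is a Bregman divergence:
G(p) - \mathbb{E}_{Y \sim p}[S(q, Y)] = d_G(p, q) \;\ge\; 0 .
```

Strict convexity of G gives strict propriety and pins the score down up to affine terms, which is the bijection on the parametric side; for non-parametric distributions G lives on a function space, hence the functional Bregman divergences mentioned above.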
Gavin Brown (@profgavinbrown):
@SebGGruber @csmcr @RiccardoAli1 Yes absolutely. Likewise - love your paper there too - am slowly digesting. Aren’t strictly proper rules / NLL expfam in a bijection tho? Also with Bregman divergences, that we studied?
Gavin Brown (@profgavinbrown):
The most surprising paper I've ever published... “Bias/Variance is not the same as Approximation/Estimation”. openreview.net/forum?id=4TnFb… We figured out the precise connection between two seminal results in ML theory... somehow this was overlooked for 50+ years? @csmcr @RiccardoAli1
Sebastian G. Gruber (@SebGGruber):
@iScienceLuvr If you are interested in more, the MMD can also be used for other tasks, like audio and text generation. There even exists a bias-variance-covariance decomposition for it via kernel scores (= MMD + const): arxiv.org/pdf/2310.05833…
Tanishq Mathew Abraham, Ph.D. (@iScienceLuvr):
Rethinking FID: Towards a Better Evaluation Metric for Image Generation. abs: arxiv.org/abs/2401.09603 This paper from Google Research proposes the CLIP MMD distance (CMMD) as an alternative to FID for text-to-image generation eval. Unlike FID, CMMD makes no normality assumption (an assumption the Inception embeddings violate anyway, causing problems), it is an unbiased estimator, it is sample efficient, it better matches expected trends (e.g., distorting images reliably increases CMMD, unlike FID), and it appears to better match human perception.
Sebastian G. Gruber (@SebGGruber):
@iScienceLuvr I highly enjoy that the MMD is being used more for generative models. However, as mentioned in the paper, the KID already uses the MMD of feature embeddings. So it would be interesting to have an ablation study of the kernel choice and the embedder choice. The paper only compares against FID.
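The estimator behind both KID and CMMD is the unbiased MMD² U-statistic; on embedded samples it is only a few lines. A self-contained sketch with a Gaussian (RBF) kernel, which is one common choice rather than the specific kernels those metrics use:

```python
import math
import random

def rbf(x, y, gamma=1.0):
    # Gaussian kernel between two points given as tuples of floats
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def mmd2_unbiased(X, Y, gamma=1.0):
    # U-statistic estimator of MMD^2: diagonal terms are excluded,
    # which is exactly what makes the estimator unbiased
    m, n = len(X), len(Y)
    xx = sum(rbf(X[i], X[j], gamma)
             for i in range(m) for j in range(m) if i != j)
    yy = sum(rbf(Y[i], Y[j], gamma)
             for i in range(n) for j in range(n) if i != j)
    xy = sum(rbf(x, y, gamma) for x in X for y in Y)
    return xx / (m * (m - 1)) + yy / (n * (n - 1)) - 2 * xy / (m * n)

random.seed(0)
X = [(random.gauss(0, 1),) for _ in range(200)]
Y = [(random.gauss(3, 1),) for _ in range(200)]
print(mmd2_unbiased(X, Y))  # clearly positive for shifted samples
```

Because the estimate can dip slightly below zero for samples from the same distribution, that behaviour is a feature of unbiasedness, not a bug. Swapping the embedding (Inception vs. CLIP) or the kernel changes only the inputs to this function, which is why the kernel/embedder ablation would be cheap to run.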
Alan Jeffares (@Jeffaresalan):
Have we been training deep ensembles with the wrong objective? 😱 Our new #NeurIPS paper investigates why training ensembles *jointly* is almost never observed in practice and uncovers some pretty surprising behaviour… 🧵 [1/N]
Sebastian G. Gruber (@SebGGruber):
@bryancsk In the first book, yes. However, in the sequels, the author goes into great detail about how technology threatens the existence of every neutral actor. Ironically, the only safe solution in the books is to isolate yourself in a bubble of no progress for eternity.
Bryan Cheong (@bryancsk):
The Three Body Problem is an e/acc piece of science fiction because, unlike the others that narrate the dangers of new technology, it describes the civilisation-ending consequences of *delaying* the development of new tech
Sebastian G. Gruber (@SebGGruber):
@BlackHC @ilyasut "allowing the company to be destroyed 'would be consistent with the mission'" ?!?! Does not seem like a sustainable mindset from a stochastic process perspective. You only have to slip once into the impression of 'kill the company' and it's over.
Andreas Kirsch 🇺🇦 (@BlackHC):
It seems @ilyasut was outvoted by the board 1:3 yesterday after removing Sam and Greg. Independent board members with no shares, stakes, or employment care much less about destroying the company than actual employees do, incl. e.g. Ilya... x.com/ilyasut/status…
Sebastian G. Gruber (@SebGGruber):
@tengyuma @Voyage_AI_ Well, coincidentally I am looking for a research internship on a novel calibration project, where I could use your embedders ;)
Tengyu Ma (@tengyuma):
@SebGGruber @Voyage_AI_ Very interesting! Thanks for sharing! Please try Voyage embeddings in your next project :)
Tengyu Ma (@tengyuma):
📢 Introducing Voyage AI @Voyage_AI_! Founded by a talented team of leading AI researchers and me 🚀🚀. We build state-of-the-art embedding models (e.g., better than OpenAI 😜). We also offer custom models that deliver 🎯+10-20% accuracy gain in your LLM products. 🧵
Christoph Molnar 🦋 (christophmolnar.bsky.social):
Is there a good web app to demonstrate underfitting / overfitting? Like some scatterplot (dots = data) with a line (=model) going through and you have a slider for model complexity?
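No pointer to a specific app here, but the computational core of such a demo is small: the "slider" sets a polynomial degree, the line is the least-squares fit. A dependency-free sketch (a real web app would wrap this in a plotting frontend):

```python
import math
import random

def polyfit(xs, ys, degree):
    # Least-squares polynomial fit via the normal equations with
    # Gaussian elimination -- fine for a demo, fragile at high degrees.
    n = degree + 1
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    for col in range(n):                       # forward elimination
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    coef = [0.0] * n
    for r in reversed(range(n)):               # back substitution
        coef[r] = (b[r] - sum(A[r][c] * coef[c]
                              for c in range(r + 1, n))) / A[r][r]
    return coef

def mse(xs, ys, coef):
    pred = lambda x: sum(c * x ** i for i, c in enumerate(coef))
    return sum((pred(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Noisy sine data; moving the degree "slider" walks the model from
# underfitting (degree 0) toward overfitting (high degree).
random.seed(0)
xs = [i / 20 for i in range(40)]
ys = [math.sin(2 * math.pi * x) + random.gauss(0, 0.3) for x in xs]
for degree in (0, 1, 3, 9):
    print(degree, mse(xs, ys, polyfit(xs, ys, degree)))
# training error only decreases with degree; evaluating on a held-out
# split would show the familiar U-shape as the model fits noise
```

Hooking the degree to a slider and redrawing the fitted curve over the scatterplot gives exactly the interaction described above.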