Anna Neufeld

52 posts

@AnnaCNeufeld

Statistics PhD Student @UW. Enthusiastic about mountains and p-values.

Joined October 2020
122 Following · 270 Followers
Anna Neufeld @AnnaCNeufeld
@Spottnik Not for brunch, but I recommend Leny’s in Tangletown!! For next Thursday evening (June 15!)
0
0
3
204
Greg Spotts @Spottnik
I’m looking for ideas for time & location for a #Summerof62 brunch meetup this Sunday, June 11. Talk to me.
6
0
12
3K
Mike Wu @wu_biostat
Best part of being a @UWBiostat & @UWStat student is learning from great Profs. Best part of being a Prof is learning from great students! Thanks @AnnaCNeufeld (@daniela_witten lab) for presenting your work on data thinning to our group!
1
3
24
6.2K
Anna Neufeld @AnnaCNeufeld
@DarwinAwdWinner @daniela_witten @LucyGao We have been exploring this in the context of the negative binomial distribution in an upcoming paper about analyzing scRNA-seq data! We found surprisingly good performance with estimated overdispersion parameters, but there is definitely more to explore!
0
0
3
41
Ryan Thompson @DarwinAwdWinner
@AnnaCNeufeld @daniela_witten @LucyGao After reading it, the main thing I'm wondering is, what happens when you estimate nuisance parameters (like var for Gaussian) from the data itself rather than specifying them a priori? Presumably some kind of overfitting?
2
0
1
57
Anna Neufeld @AnnaCNeufeld
@daniela_witten, @lucygao, Ameer Dharamshi, and I are excited to share our new preprint (arxiv.org/abs/2301.07276). We introduce *data thinning*, a flexible framework that splits a single observation into independent parts, providing an alternative to cross-validation. (1/11)
9
29
108
27.3K
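
To make the idea of splitting a single observation into independent parts concrete, here is a minimal sketch for the Poisson case only. It relies on the classical fact that if X ~ Poisson(lam) and X1 | X ~ Binomial(X, eps), then X1 and X2 = X - X1 are independent Poisson variables with means eps * lam and (1 - eps) * lam. The helper names (`sample_poisson`, `thin_poisson`, `correlation`) and the values of `lam`, `eps`, and `n` are hypothetical choices for illustration, not code from the preprint.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's method (fine for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def thin_poisson(x, eps, rng):
    """Split a count x into (x1, x2): x1 | x ~ Binomial(x, eps), x2 = x - x1."""
    x1 = sum(rng.random() < eps for _ in range(x))
    return x1, x - x1


def correlation(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


# Illustrative (assumed) settings: true mean, thinning fraction, sample size.
rng = random.Random(0)
lam, eps, n = 5.0, 0.3, 50_000

pairs = [thin_poisson(sample_poisson(lam, rng), eps, rng) for _ in range(n)]
x1 = [p[0] for p in pairs]
x2 = [p[1] for p in pairs]

print("mean of x1:", sum(x1) / n, "target:", eps * lam)
print("mean of x2:", sum(x2) / n, "target:", (1 - eps) * lam)
print("correlation(x1, x2):", correlation(x1, x2))
```

In a simulation like this, the two parts should sum back to the original count, have means close to eps * lam and (1 - eps) * lam, and show a sample correlation near zero. One part could then play the role of a training set and the other a test set.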
Ryan Thompson @DarwinAwdWinner
@AnnaCNeufeld @daniela_witten @LucyGao Also, since Poisson = NegBin w/0 disp, if the true dist is NegBin with >0 disp, then using Poisson will underestimate the var and result in >0 correlation between thinned data sets and overfitting, right?
1
0
0
64
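
The sign of the correlation raised in this question can be checked with a short calculation. This is a sketch under stated assumptions, not a result quoted from the thread: Poisson-style thinning is taken to mean $X_1 \mid X \sim \mathrm{Binomial}(X, \epsilon)$ and $X_2 = X - X_1$, and the negative binomial is parameterized (an assumption) by mean $\mu$ and overdispersion $b$, so that $\mathrm{Var}(X) = \mu + \mu^2/b$.

```latex
\begin{align*}
\mathrm{Cov}(X_1, X_2)
  &= \mathbb{E}\big[\mathrm{Cov}(X_1, X_2 \mid X)\big]
   + \mathrm{Cov}\big(\mathbb{E}[X_1 \mid X],\, \mathbb{E}[X_2 \mid X]\big) \\
  &= \mathbb{E}\big[-\epsilon(1-\epsilon)X\big]
   + \mathrm{Cov}\big(\epsilon X,\, (1-\epsilon)X\big) \\
  &= \epsilon(1-\epsilon)\big(\mathrm{Var}(X) - \mathbb{E}[X]\big) \\
  &= \epsilon(1-\epsilon)\,\frac{\mu^2}{b} \;>\; 0 .
\end{align*}
```

Under these assumptions the covariance is zero exactly when the variance equals the mean (the Poisson case) and positive for any finite overdispersion $b$, which matches the direction suggested in the question.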
Anna Neufeld @AnnaCNeufeld
Data thinning is broadly applicable: any time you might perform sample splitting or cross-validation, you can use data thinning instead. And you can use data thinning in settings where sample splitting and cross-validation are NOT applicable (e.g. unsupervised learning). (10/11)
1
2
6
1.6K