Shaoshi Zhang

151 posts

@ZShaoshi

neuroscience, computational models | Computational Brain Imaging Group | Huge fan of Metroidvania and Edward Hopper.

Singapore · Joined May 2020
171 Following · 217 Followers
Pinned Tweet
Shaoshi Zhang @ZShaoshi
For years, we've known that running a standard t-test on cross-validation folds violates sample independence. We wanted to see how widespread this issue actually is. The result? 97% of the studies used an invalid statistical test. 🧵👇
Thomas Yeo @bttyeo

In a meta-analysis of 210 biomedical AI studies that statistically compared models under cross-validation, 97% used invalid statistical tests. Here's our new preprint doi.org/10.64898/2026.… led by @tianchuzeng @kkli20111 @ZShaoshi @ten_photos 1/N
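As a rough illustration of the independence problem mentioned above: cross-validation folds share most of their training data, so per-fold score differences are correlated, and a standard paired t-test tends to understate the variance. The Python sketch below contrasts the naive t statistic with the Nadeau-Bengio corrected resampled t statistic, a well-known adjustment that is not the method proposed in the preprint. The per-fold differences and the train/test sizes are made up for illustration.

```python
import math
import statistics


def naive_t(diffs):
    """Standard paired t statistic; treats the fold differences as independent."""
    n = len(diffs)
    return statistics.mean(diffs) / math.sqrt(statistics.variance(diffs) / n)


def corrected_t(diffs, n_train, n_test):
    """Corrected resampled t statistic (Nadeau-Bengio style).

    The variance term 1/n is inflated to (1/n + n_test/n_train) to account
    for the overlap between training sets across folds.
    """
    n = len(diffs)
    var = statistics.variance(diffs)
    return statistics.mean(diffs) / math.sqrt((1.0 / n + n_test / n_train) * var)


# Hypothetical per-fold accuracy differences (model A minus model B), 10-fold CV
diffs = [0.02, 0.01, 0.03, 0.00, 0.02, 0.01, 0.04, 0.02, 0.01, 0.03]
t_naive = naive_t(diffs)
t_corr = corrected_t(diffs, n_train=900, n_test=100)
print(f"naive t = {t_naive:.2f}, corrected t = {t_corr:.2f}")
```

Because the correction only enlarges the variance term, the corrected statistic is always smaller in magnitude than the naive one for the same differences.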

Shaoshi Zhang retweeted
Tal Golan @TalGolanNeuro
This looks like a straightforward, highly applicable solution to the long-standing problem of valid inference for K-fold CV performance differences. The trade-off is smaller training sets from the split-half step and having to rerun K-fold CV many times.
Thomas Yeo @bttyeo

So we propose SHARP, which involves repeated split-half to generate pairs of independent statistics. There are still 3 unknowns — mean, variance, between-repetition correlation — but the independent pairs provide a 3rd information source to estimate all 3 unknowns. 7/N

Shaoshi Zhang retweeted
Imaging Neuroscience @ImagingNeurosci
New paper in Imaging Neuroscience by Ru Kong, B.T. Thomas Yeo, et al: Network-based near-scalp personalized brain stimulation targets doi.org/10.1162/IMAG.a…
Shaoshi Zhang retweeted
Thomas Yeo @bttyeo
Here are bonus slides on cross-validation tests, separate from our preprint. Covering:
1. paired (sign-flip) permutation test
2. label-swap permutation test
3. sample-level vs fold-averaged stats
4. a common misapplication of the corrected t-test
5. three bootstrap variants
1/N
Thomas Yeo @bttyeo

In a meta-analysis of 210 biomedical AI studies that statistically compared models under cross-validation, 97% used invalid statistical tests. Here's our new preprint doi.org/10.64898/2026.… led by @tianchuzeng @kkli20111 @ZShaoshi @ten_photos 1/N
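For readers unfamiliar with the paired (sign-flip) permutation test named in the slide list, here is a minimal sketch of its mechanics on a vector of paired differences. It is a generic textbook version, not taken from the slides; the function name and the example numbers are hypothetical, and whether sign flips are valid for a given cross-validation design depends on how the differences were produced.

```python
from itertools import product


def sign_flip_p_value(diffs):
    """Exact two-sided paired sign-flip permutation p-value.

    Under the null hypothesis each paired difference is assumed to be
    symmetric around zero, so its sign can be flipped freely. All 2**n
    sign patterns are enumerated, so this is only practical for small n.
    """
    observed = abs(sum(diffs))
    at_least_as_extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        # Small tolerance guards against floating-point round-off
        if stat >= observed - 1e-12:
            at_least_as_extreme += 1
    return at_least_as_extreme / total


# Hypothetical paired differences (model A minus model B)
example_diffs = [1.0, 2.0, 3.0]
p_value = sign_flip_p_value(example_diffs)
print(p_value)
```

With only three differences, the all-positive and all-negative patterns are the only ones as extreme as the observed sum, so the smallest attainable two-sided p-value is 2/8.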

Shaoshi Zhang retweeted
Gary Marcus @GaryMarcus
Biomedical AI may be headed for a replication crisis. (This work below is not about AI-generated reports; it’s about studies of biomedicine that use ML in their methods, and how they are evaluated.)
Thomas Yeo @bttyeo

In a meta-analysis of 210 biomedical AI studies that statistically compared models under cross-validation, 97% used invalid statistical tests. Here's our new preprint doi.org/10.64898/2026.… led by @tianchuzeng @kkli20111 @ZShaoshi @ten_photos 1/N

Shaoshi Zhang retweeted
Lijun AN | 安丽军 @anlijuncn
Proud to participate in this study! We should stay rigorous in AI-biomedical research; we also observe some concerning trends in AI+biomarker studies… Congratulations @tianchuzeng Tian Fang and @ZShaoshi
Thomas Yeo @bttyeo

In a meta-analysis of 210 biomedical AI studies that statistically compared models under cross-validation, 97% used invalid statistical tests. Here's our new preprint doi.org/10.64898/2026.… led by @tianchuzeng @kkli20111 @ZShaoshi @ten_photos 1/N

Shaoshi Zhang retweeted
Thomas Yeo @bttyeo
Once again, @ten_photos came to the rescue - we prayed to him for a better statistical test for k-shot learning (since the corrected t-test is overly conservative in that scenario), and he answered our prayers with a new test that also covers classical cross-validation.
Thomas Yeo @bttyeo

So we propose SHARP, which involves repeated split-half to generate pairs of independent statistics. There are still 3 unknowns — mean, variance, between-repetition correlation — but the independent pairs provide a 3rd information source to estimate all 3 unknowns. 7/N

Shaoshi Zhang retweeted
Sina Mansour L. @Sina_Mansour_L
@bttyeo @tianchuzeng @kkli20111 @ZShaoshi @ten_photos Can't stress this enough 👇 If you use ML to compare predictive models in your research (neuroscience, genetics, you name it), this paper is a must read! 👀 The majority of work in this space (mine included 🙋) misses critical nuances when reporting comparative stats.
Shaoshi Zhang retweeted
Dhurandhar B @bornspectator42
My quibble: This is traditional ML *not* AI in the generative sense it means now till eternity. But yeah this is a thing. Metric chasing brought this on. Reviewers reward higher metric values & not well cross-validated results. We've been told AuC<0.8 not worth submitting. 🙄
Thomas Yeo @bttyeo

In a meta-analysis of 210 biomedical AI studies that statistically compared models under cross-validation, 97% used invalid statistical tests. Here's our new preprint doi.org/10.64898/2026.… led by @tianchuzeng @kkli20111 @ZShaoshi @ten_photos 1/N

Shaoshi Zhang @ZShaoshi
It’s incredible to see this study come to fruition! Shout out to the amazing @tianchuzeng and @kkli20111 who spearheaded this work and huge thank you to all other coauthors!
Shaoshi Zhang retweeted
Hesheng Liu @hesheng3
Lesion network mapping (LNM) has been powerful in linking symptoms and brain functional circuits, but ongoing debates highlight that it is still hard to isolate symptom-specific effects. We came up with a new method, robust LNM (rLNM) — a unified framework combining null models and selective specificity to reveal reliable, symptom-specific networks from background structure. biorxiv.org/content/10.648… @bttyeo @foxmdphd @ndosenbach @club_scan
Shaoshi Zhang retweeted
Nico Dosenbach @ndosenbach
Function & cytoarchitecture don't overlap ... they're orthogonal. Prefrontal cortex is tiled with chains of functional patches mostly known from face processing. Multi-modal parcellations are wrong ... & other insights hidden by group-averaging fMRI data: bsky.app/profile/gordon…