
In a meta-analysis of 210 biomedical AI studies that statistically compared models under cross-validation, 97% used invalid statistical tests. Here's our new preprint doi.org/10.64898/2026.… led by @tianchuzeng @kkli20111 @ZShaoshi @ten_photos 1/N
Shaoshi Zhang

@ZShaoshi
neuroscience, computational models | Computational Brain Imaging Group | Huge fan of Metroidvania and Edward Hopper.


So we propose SHARP, which involves repeated split-half to generate pairs of independent statistics. There are still 3 unknowns — mean, variance, between-repetition correlation — but the independent pairs provide a 3rd information source to estimate all 3 unknowns. 7/N
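To make "repeated split-half" concrete, here is a toy sketch of how such pairs could be generated. It is not the preprint's SHARP procedure or estimator: the function name `split_half_pairs`, the per-sample values, and the use of a plain mean as the statistic are all assumptions made for illustration. Each repetition shuffles the data, cuts it into two disjoint halves, and computes the statistic on each half. The two values in a pair share no samples, while pairs from different repetitions reuse the same data and are therefore correlated.

```python
import random
from statistics import fmean


def split_half_pairs(values, n_repetitions, seed=0):
    """Return one (first_half_stat, second_half_stat) pair per repetition.

    Hypothetical helper for illustration only; the statistic here is a
    plain mean, which is an assumption and not taken from the preprint.
    """
    rng = random.Random(seed)
    values = list(values)
    half = len(values) // 2
    pairs = []
    for _ in range(n_repetitions):
        shuffled = values[:]          # copy so the input order is untouched
        rng.shuffle(shuffled)         # new random split for this repetition
        first = shuffled[:half]       # disjoint halves: no shared samples
        second = shuffled[half:2 * half]
        pairs.append((fmean(first), fmean(second)))
    return pairs


# Toy data standing in for per-sample performance differences (assumed).
toy_rng = random.Random(42)
toy_differences = [toy_rng.gauss(0.1, 1.0) for _ in range(200)]

pairs = split_half_pairs(toy_differences, n_repetitions=50)
print(len(pairs), pairs[0])
```

How the mean, variance, and between-repetition correlation are then estimated from these pairs is described in the preprint and is not reproduced here.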








