Yifei Wang

17 posts

@wang_yifei

PhD student @Westlake_Uni, Human genetics

Joined May 2021
62 Following · 28 Followers
Yifei Wang retweeted
Amber Liu @JIACHENLIU8
My bet: in the near future, 80%⬆️ of CS research will be done by AI in collaboration with humans. However, today's research ecosystem is still built around the human, not the AI scientist.

For example, the 8-page paper PDF is a lossy compression of months of branching exploration into a linear story, optimized for a human reviewer to skim in 30 minutes. It hides two structural taxes:

📖 Storytelling Tax — failures, rejected hypotheses, and dead ends get stripped. On RE-Bench (24,008 runs, 21 frontier models), failed runs account for 90.2% of total compute cost, with a 113× median failed-to-success token ratio. Every lab independently rediscovers the same dead ends.

🔧 Engineering Tax — the gap between reviewer-sufficient prose and agent-sufficient spec. Across 8,921 PaperBench requirements (23 ICML'24 papers), only 45.4% are fully specified in the PDF. The rest is tacit lab knowledge.

Tolerable when readers were human. Critical now that agents read, reproduce, and extend.

We propose ARA: the Agent-Native Research Artifact — replace the narrative PDF with an agent-executable package, in 4 layers:
🧠 structured scientific logic
⚙️ executable code w/ full specs
🌳 exploration graph (every failure preserved)
📊 evidence grounding every claim
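The four-layer package proposed in the tweet above could be sketched as a minimal data structure. This is a hypothetical illustration only: all class and field names (`AgentNativeArtifact`, `Claim`, `ExplorationNode`, `grounded`) are my own, not from the ARA proposal.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the four ARA layers; every name here is
# illustrative, not part of the actual proposal.

@dataclass
class Claim:
    statement: str
    evidence: list[str]             # 📊 pointers that ground the claim (logs, tables)

@dataclass
class ExplorationNode:
    hypothesis: str
    outcome: str                    # e.g. "success", "failure", "abandoned"
    children: list["ExplorationNode"] = field(default_factory=list)

@dataclass
class AgentNativeArtifact:
    logic: list[Claim]              # 🧠 structured scientific logic
    code_spec: dict[str, str]       # ⚙️ executable code + full specs (path -> spec)
    exploration: ExplorationNode    # 🌳 exploration graph, failures preserved

    def grounded(self) -> bool:
        # Every claim must cite at least one piece of evidence.
        return all(c.evidence for c in self.logic)
```

The point of the sketch is that failures and specs are first-class fields rather than material edited out of a narrative PDF.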
Yifei Wang retweeted
Bo Wang @BoWang87
This is probably the best paper I have read about causal reasoning in quite some time. Really a great weekend read!

"Causal Persuasion" (Burkovskaya & Starkov) models how much evidence you need to establish vs. rule out a causal link. The result is stark:

To prove X causes Y: 1-2 well-chosen variables often suffice.
To prove X does NOT cause Y: you must account for every possible common cause. Arbitrarily many confounders. Practically unfalsifiable.

This inverts the Humean intuition: in causal reasoning, positive claims are cheap to sell and negative ones are almost impossible to rebut.

Now think about what this means for Virtual Cell models. Most perturbation datasets cover a thin slice of the combinatorial space — a few hundred gene knockouts, maybe a few contexts. A model trained on that data can confidently "learn" that gene X drives phenotype Y. But if the true structure is X←C→Y, and C was never systematically varied, the model will never see its own confounding. It has no mechanism to distinguish causal signal from correlated noise.

The paper formalizes exactly why: the model is a sophisticated receiver that accepts whatever causal story is consistent with the data it's seen. And if the data omits the right confounders, even a "sophisticated" model is manipulable.

This is the deepest argument for perturbation diversity. Not just more data, but more axes of variation. Vary the context. Vary the genetic background. Vary the timing. You're not just collecting samples; you're systematically eliminating alternative causal explanations. This is why we need to "scale" the training data with more contexts, including cell types and spatial and temporal variation.

Paper: aburkovskaya.com/pdf/causality.…
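The X←C→Y point in the tweet above can be demonstrated with a toy simulation (a minimal sketch, not from the paper; variable names and noise levels are my own assumptions): a hidden common cause C drives both X and Y, so observational data shows a strong X–Y correlation even though X has no effect on Y, and only an intervention on X reveals this.

```python
import random

random.seed(0)
N = 10_000

def corr(pairs):
    """Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs) / n
    vx = sum((x - mx) ** 2 for x, _ in pairs) / n
    vy = sum((y - my) ** 2 for _, y in pairs) / n
    return cov / (vx * vy) ** 0.5

# Observational data from X <- C -> Y: C drives both X and Y;
# X has no direct effect on Y at all.
observational = []
for _ in range(N):
    c = random.gauss(0, 1)        # hidden confounder, never varied or recorded
    x = c + random.gauss(0, 0.3)  # X is driven by C
    y = c + random.gauss(0, 0.3)  # Y is driven by C, not by X
    observational.append((x, y))

# Interventional data, do(X): setting X directly severs the C -> X edge.
interventional = []
for _ in range(N):
    c = random.gauss(0, 1)
    x = random.gauss(0, 1)        # X assigned independently of C
    y = c + random.gauss(0, 0.3)
    interventional.append((x, y))

print(f"observational  corr(X, Y) = {corr(observational):.2f}")   # strong, spurious
print(f"interventional corr(X, Y) = {corr(interventional):.2f}")  # near zero
```

A model trained only on the observational sample would happily "learn" X→Y; it is the interventional axis of variation, not more observational rows, that rules the link out.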
Yifei Wang retweeted
Jian Yang @jyang1981
Excited to see our work in @Nature! We combined our PIGA workflow & a cost-effective sequencing strategy to build the 1000 Chinese Pangenome (1KCP). We hope this methodology & resource help unlock complex variants in human health. @wang_yifei @DuanZhongqu nature.com/articles/s4158…
Yifei Wang @wang_yifei
It has been a long and rewarding journey, and a great pleasure working with @DuanZhongqu, @jyang1981, and all the members of the Yang Lab!
Yifei Wang @wang_yifei
9) Finally, we developed a 1KCP imputation panel, enabling future East Asian association studies to access most kinds of variants (small variants, SVs, TR length, TR motif, nested variants, and HLA alleles).
Yifei Wang @wang_yifei
@Hakha_Most @Nature @jkpritch Congratulations! Brilliant work bridging GWAS and burden tests, and adding a crucial new dimension to gene prioritization!