Yifei Wang

17 posts

@wang_yifei

PhD student @Westlake_Uni, Human genetics

Joined May 2021
62 Following · 28 Followers
Yifei Wang retweeted
Amber Liu @JIACHENLIU8
My bet: in the near future, 80%⬆️ of CS research will be done by AI in collaboration with humans. However, today's research ecosystem is still built around the human, not the AI scientist.

For example, the 8-page paper PDF is a lossy compression of months of branching exploration into a linear story, optimized for a human reviewer to skim in 30 minutes. It hides two structural taxes:

📖 Storytelling Tax — failures, rejected hypotheses, and dead ends get stripped. On RE-Bench (24,008 runs, 21 frontier models), failed runs account for 90.2% of total compute cost, with a 113× median failed-to-success token ratio. Every lab independently rediscovers the same dead ends.

🔧 Engineering Tax — the gap between reviewer-sufficient prose and agent-sufficient spec. Across 8,921 PaperBench requirements (23 ICML'24 papers), only 45.4% are fully specified in the PDF. The rest is tacit lab knowledge.

Tolerable when readers were human. Critical now that agents read, reproduce, and extend.

We propose ARA: the Agent-Native Research Artifact — replace the narrative PDF with an agent-executable package, in 4 layers:
🧠 structured scientific logic
⚙️ executable code w/ full specs
🌳 exploration graph (every failure preserved)
📊 evidence grounding every claim
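The four-layer package proposed in the tweet above could be sketched as a minimal data structure. This is a hypothetical illustration only: all class and field names (`AgentNativeArtifact`, `Claim`, `ExplorationNode`, `grounded`) are my own, not from the ARA proposal.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the four ARA layers; every name here is
# illustrative, not part of the actual proposal.

@dataclass
class Claim:
    statement: str
    evidence: list[str]             # 📊 pointers that ground the claim (logs, tables)

@dataclass
class ExplorationNode:
    hypothesis: str
    outcome: str                    # e.g. "success", "failure", "abandoned"
    children: list["ExplorationNode"] = field(default_factory=list)

@dataclass
class AgentNativeArtifact:
    logic: list[Claim]              # 🧠 structured scientific logic
    code_spec: dict[str, str]       # ⚙️ executable code + full specs (path -> spec)
    exploration: ExplorationNode    # 🌳 exploration graph, failures preserved

    def grounded(self) -> bool:
        # Every claim must cite at least one piece of evidence.
        return all(c.evidence for c in self.logic)
```

The point of the sketch is that failures and specs are first-class fields rather than material edited out of a narrative PDF.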
Yifei Wang retweeted
Bo Wang @BoWang87
This is probably the best paper I have read about causal reasoning in quite some time. Really a great weekend read!

"Causal Persuasion" (Burkovskaya & Starkov) models how much evidence you need to establish vs. rule out a causal link. The result is stark:

To prove X causes Y: 1-2 well-chosen variables often suffice.
To prove X does NOT cause Y: you must account for every possible common cause. Arbitrarily many confounders. Practically unfalsifiable.

This inverts the Humean intuition: in causal reasoning, positive claims are cheap to sell and negative ones are almost impossible to rebut.

Now think about what this means for Virtual Cell models. Most perturbation datasets cover a thin slice of the combinatorial space — a few hundred gene knockouts, maybe a few contexts. A model trained on that data can confidently "learn" that gene X drives phenotype Y. But if the true structure is X←C→Y, and C was never systematically varied, the model will never see its own confounding. It has no mechanism to distinguish causal signal from correlated noise.

The paper formalizes exactly why: the model is a sophisticated receiver that accepts whatever causal story is consistent with the data it's seen. And if the data omits the right confounders, even a "sophisticated" model is manipulable.

This is the deepest argument for perturbation diversity. Not just more data, but more axes of variation. Vary the context. Vary the genetic background. Vary the timing. You're not just collecting samples; you're systematically eliminating alternative causal explanations. This is why we need to "scale" the training data with more contexts, including cell types and spatial and temporal variation.

Paper: aburkovskaya.com/pdf/causality.…
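The X←C→Y point in the tweet above can be demonstrated with a toy simulation (a minimal sketch, not from the paper; variable names and noise levels are my own assumptions): a hidden common cause C drives both X and Y, so observational data shows a strong X–Y correlation even though X has no effect on Y, and only an intervention on X reveals this.

```python
import random

random.seed(0)
N = 10_000

def corr(pairs):
    """Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs) / n
    vx = sum((x - mx) ** 2 for x, _ in pairs) / n
    vy = sum((y - my) ** 2 for _, y in pairs) / n
    return cov / (vx * vy) ** 0.5

# Observational data from X <- C -> Y: C drives both X and Y;
# X has no direct effect on Y at all.
observational = []
for _ in range(N):
    c = random.gauss(0, 1)        # hidden confounder, never varied or recorded
    x = c + random.gauss(0, 0.3)  # X is driven by C
    y = c + random.gauss(0, 0.3)  # Y is driven by C, not by X
    observational.append((x, y))

# Interventional data, do(X): setting X directly severs the C -> X edge.
interventional = []
for _ in range(N):
    c = random.gauss(0, 1)
    x = random.gauss(0, 1)        # X assigned independently of C
    y = c + random.gauss(0, 0.3)
    interventional.append((x, y))

print(f"observational  corr(X, Y) = {corr(observational):.2f}")   # strong, spurious
print(f"interventional corr(X, Y) = {corr(interventional):.2f}")  # near zero
```

A model trained only on the observational sample would happily "learn" X→Y; it is the interventional axis of variation, not more observational rows, that rules the link out.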
Yifei Wang retweeted
Jian Yang @jyang1981
Excited to see our work in @Nature! We combined our PIGA workflow & a cost-effective sequencing strategy to build the 1000 Chinese Pangenome (1KCP). We hope this methodology & resource help unlock complex variants in human health. @wang_yifei @DuanZhongqu nature.com/articles/s4158…
Yifei Wang @wang_yifei
It has been a long and rewarding journey, and a great pleasure working with @DuanZhongqu, @jyang1981, and all the members of the Yang Lab!
Yifei Wang @wang_yifei
9) Finally, we developed a 1KCP imputation panel, enabling future East Asian association studies to access most kinds of variants (small variants, SVs, TR length, TR motif, nested variants, and HLA alleles).
Yifei Wang @wang_yifei
@Hakha_Most @Nature @jkpritch Congratulations! Brilliant work bridging GWAS and burden tests, and adding a crucial new dimension to gene prioritization!