Marwane

227 posts

@GenerateLakatos

moderately locked in PhD student @sangerinstitute @Cambridge_Uni | previously @ICR_London @emblebi

he/him · Joined November 2021
1.2K Following · 112 Followers
Marwane reposted
Mo Lotfollahi @mo_lotfollahi
Excited to share our new work. Over the past decade, single-cell genomics has transformed our ability to map cellular systems. But a major question remains: Can we predict how perturbations reshape cellular trajectories over time? In 2018, we first showed that it is possible to predict cellular responses to perturbations — ranging from disease signals to chemical treatments — even in unseen contexts. In 2022, we introduced CPA (MSB 2022; NeurIPS 2022), extending this idea to predict responses to unseen chemical and genetic perturbations, including their combinations. Since then, the field of perturbation modeling has grown enormously. The community has pushed the space forward with many creative ideas and powerful models. It’s exciting to see how fast things are moving — even though many fundamental challenges remain. One of the biggest is that cells are not static. They move through trajectories during development, immune responses, and disease. Yet most current models still predict perturbation effects within a single state, rather than how early perturbations propagate across future states and reshape downstream outcomes. To address this, we developed PerturbGen, a trajectory-aware generative AI model that predicts how genetic perturbations reshape downstream cellular states. Huge credit to the people who made this work possible. Thanks to co-first authors @lifeisscience_5, @Adib_m_, @Tomo_Isobe, @Amirhossein Vahidi, @delshadveghari & Anthony Rostron. Special recognition to @lifeisscience_5 and @Adib_m_ for driving this work over the finish line. Grateful for our outstanding collaborators from @HaniffaLab, @BertieGottgens lab @GosiaTrynka and many others — a true cross-institute effort across @SCICambridge, @OpenTargets ,@sangerinstitute and @Cambridge_Uni.🎉 PerturbGen learns transcriptional dynamics across cellular trajectories. 
By introducing perturbations at an early source state, it can simulate how these effects propagate into future states along differentiation trajectories. Scaling this across genes enables the creation of dynamic in silico perturbation atlases: maps of how perturbations reshape biological trajectories over time. We explored this idea across three biological questions. First, in a human in vivo LPS immune challenge, PerturbGen predicted that perturbing a transient IL1B signal dampens downstream inflammatory programs in myeloid cells, with pathway changes reversing signatures observed in an independent IL-1β stimulation experiment. Second, in human hematopoiesis, PerturbGen predicted transcriptional responses to CRISPR transcription factor knockouts and enabled construction of perturbation atlases revealing lineage- and age-specific regulatory programs. These programs could also be linked to human genetics and blood diseases, including recapitulation of signatures associated with ETV6-related thrombocytopenia. Finally, we asked whether perturbation modeling could help improve complex tissue models. We built a dynamic perturbation atlas of human skin organoids to identify perturbations that could guide organoid cells toward human fetal skin states. PerturbGen prioritized activation of Wnt signaling via GSK3β inhibition. Experimental validation confirmed the prediction: treatment with CHIR99021 induced stromal gene programs and shifted organoid fibroblasts toward transcriptional states observed in fetal skin stroma. Together, these results show how trajectory-aware perturbation modeling can connect gene perturbations to developmental programs, human genetics, disease mechanisms, and experimental interventions. More broadly, we think these results point toward a future where single-cell atlases become predictive systems.
As atlases expand across tissues, developmental windows, and modalities, models like PerturbGen could enable dynamic, virtual perturbation atlases, allowing us to simulate interventions, generate hypotheses, and design experiments before stepping into the lab. Preprint: shorturl.at/EkisP Code: github.com/Lotfollahi-lab… Excited to see how the community builds on this work.
2
47
178
16.8K
Marwane reposted
Pall Melsted @pmelsted
Excited to share this preprint that describes my latest work on using GPUs to accelerate processing of RNA-seq data. The title says it all: "RNA-seq analysis in seconds using GPUs", now on bioRxiv: biorxiv.org/content/10.648… Figure 1 shows the key result.
[image]
15
119
483
90.5K
Marwane reposted
Mo Lotfollahi @mo_lotfollahi
How do we make conditional generative models that actually generalize to new, unseen conditions—instead of breaking under distribution shift? 🧠⚡️ In real science, we constantly ask models to predict outcomes for conditions we never trained on: 🧬 unseen genetic perturbations 💊 new drugs / compounds 🧫 new experimental settings 📸 new treatment effects in microscopy Most conditional flow-matching models start from the same fixed Gaussian noise every time and only “condition” the dynamics. That often works in-distribution… but can fail badly for OOD conditions 😬. We introduce SP-FM 🚀: a shortest-path flow-matching approach that conditions both ✅ the starting point (a learnable mixture base distribution) ✅ and the flow (the transport dynamics) on the condition descriptor. Intuition: if two conditions are similar, their starting distributions should also be similar—so the model only needs a small correction, making extrapolation much more reliable 🌉📈. Results: SP-FM improves OOD generalization across diverse domains, including 🧬 predicting single-cell transcriptomic responses to unseen perturbations 🧫 modeling treatment effects in high-content microscopy drug screening 🔤 even synthetic benchmarks like unseen rotations 📄 Paper: arxiv.org/abs/2601.11827 💻 Code: github.com/Lotfollahi-lab… Huge team effort 🤝 Andrea Rubbi, Amir AkbarNejad, Sanian*, Ariam Yazdan Parast #FlowMatching #GenerativeAI #OOD #MachineLearning #SingleCell #DrugDiscovery
[image]
4
33
201
11.7K
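The SP-FM post above contrasts a fixed Gaussian starting point with a starting point that depends on the condition. The toy sketch below illustrates only that contrast, using the standard linear-path flow-matching training pair (interpolant and target velocity). It is not the authors' implementation: the function names, the unit-variance Gaussian centred on the condition descriptor, and the use of the descriptor itself as the centre are all assumptions made for illustration, standing in for the learnable mixture base the post describes.

```python
import random


def sample_base(condition, rng, conditioned=True):
    """Draw a source point x0 (hypothetical helper, not from SP-FM's code).

    conditioned=False: fixed standard Gaussian, the same for every condition.
    conditioned=True: Gaussian centred on the condition descriptor, a crude
    stand-in for a learned, condition-dependent base distribution.
    """
    centre = condition if conditioned else [0.0] * len(condition)
    return [c + rng.gauss(0.0, 1.0) for c in centre]


def flow_matching_pair(x0, x1, t):
    """Linear-path interpolant x_t and its constant target velocity x1 - x0."""
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    velocity = [b - a for a, b in zip(x0, x1)]
    return x_t, velocity


if __name__ == "__main__":
    rng = random.Random(0)
    condition = [5.0, 5.0]   # toy condition descriptor
    x1 = [5.5, 4.5]          # toy observed sample for that condition
    x0 = sample_base(condition, rng, conditioned=True)
    x_t, v = flow_matching_pair(x0, x1, t=0.5)
    print("x0:", x0)
    print("x_t:", x_t)
    print("target velocity:", v)
```

When the base already sits near the data for a condition, the target velocity `x1 - x0` is small, which is the "small correction" intuition in the post; with a fixed base the whole displacement has to be carried by the learned flow.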
Marwane reposted
Mo Lotfollahi @mo_lotfollahi
🧬✨ How can we generate genome-wide gene expression in 3D to build 3D virtual organs—without performing spatial transcriptomics on every single tissue section? Today, most spatial transcriptomics is still 2D 🧫: we capture one thin slice at a time. But real biology—development, disease niches, gradients, and cell–cell communication—unfolds in 3D 🧠🫀. Measuring ST across many serial sections is possible, but it’s still too expensive and labor-intensive 💸⏳ for routine studies. We introduce HoloTea 🍵, a flow-matching–based generative AI approach 🤖 that learns from dense serial H&E histology stacks 🧾 plus ST measured on only a subset of slices, then reconstructs the missing slices to produce a 3D-consistent volumetric spatial transcriptomics readout 📍📈🧬. Across multiple datasets (including large serial stacks), HoloTea improves reconstruction accuracy ✅ compared to strong 2D and 3D baselines, while remaining scalable 🚀 to large tissue volumes. We see this as a step toward accurate 3D virtual tissues 🌍🧫—enabling cheaper volumetric molecular maps 💡, faster biomarker discovery 🔍, and deeper insight into 3D tissue organization in health and disease 🏥. 📄 Paper: arxiv.org/abs/2511.14613 🤝 Amazing collaboration with @Muzz_Haniffa @bayraktar_lab @Lastu21 led by @ArshiaHemmat @AmirhVahidi Mohammad Vali Sanian #SpatialTranscriptomics #BioAI #GenAI #DigitalPathology #3DBiology
[image]
3
42
229
28.9K
Marwane reposted
Mo Lotfollahi @mo_lotfollahi
Mixture-of-Experts (MoE) is a powerful way to scale large language models (LLMs): instead of running the full model for every token, a router activates only a few “experts,” giving more capacity at roughly the same compute. But routing is still a sore spot. Most MoE systems use Top-k + Softmax, where expert selection is discrete—so you don’t get clean end-to-end gradients. In practice, this can lead to unstable routing, calibration issues, and uneven expert usage. In our #ICLR2026 paper, we introduce DirMoE — a fully differentiable probabilistic router that separates which experts fire (Bernoulli) from how their weights are assigned (Dirichlet). We also add a simple “sparsity knob” 🎛️ (Simpson-index penalty) to control the expected number of active experts, without relying on load-balancing losses that can homogenize experts. Results: DirMoE matches/exceeds vanilla MoE throughput (no extra bottlenecks), is strong/competitive on zero-shot benchmarks (ARC, BoolQ, PIQA, …), and leads to clearer expert specialization (interpretable domain focus like ArXiv/Books/GitHub code). Led by @HesamAsdz and @AmirhVahidi paper: openreview.net/forum?id=a15cD…
[image]
10
74
424
29.9K
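For readers unfamiliar with the baseline the DirMoE post refers to, here is a minimal sketch of Top-k + Softmax routing, plus the Simpson index of the resulting weights. This is not DirMoE itself: the function names are hypothetical, and how the paper turns the Simpson index into a penalty is not stated in the post. The sketch only relies on the standard definition (sum of squared weights), whose inverse is commonly read as an effective number of active experts.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(logits, k):
    """Keep the k largest router logits and renormalise them with softmax.

    The selection step is discrete: experts outside the top k get weight 0.
    """
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    probs = softmax([logits[i] for i in top])
    weights = [0.0] * len(logits)
    for i, p in zip(top, probs):
        weights[i] = p
    return weights


def simpson_index(weights):
    """Sum of squared weights: 1.0 for one expert, 1/n for n equal experts."""
    return sum(w * w for w in weights)


def effective_experts(weights):
    """Inverse Simpson index, read as an effective number of active experts."""
    return 1.0 / simpson_index(weights)


if __name__ == "__main__":
    router_logits = [2.0, 1.0, 0.5, -1.0]  # toy logits for four experts
    w = top_k_route(router_logits, k=2)
    print("routing weights:", w)
    print("effective experts:", effective_experts(w))
```

Because the top-k cut assigns exactly zero weight to unselected experts, small changes in the logits can flip which experts fire, which is the discreteness the post identifies as the source of routing trouble.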
Marwane reposted
Golnaz Vahedi @golnaz_v
Beyond excited to share the collaborative work with @bfariabi led by stars @YeqiaoZhou and Atishay Jay. Walking along thousands of chromosomes has shown us just how essential it is to measure chromatin fiber geometry to truly understand enhancer biology. tinyurl.com/2st3afwf
5
24
113
12.7K
Marwane @GenerateLakatos
@trollopdaughter my analytical self be damned, i thought it was about clarifying intuitions you already have with language that sounds rigorous
0
0
1
45
Marwane reposted
Lior Pachter @lpachter
Single-cell genomics finally makes it into the clinic. “Can you show me on the UMAP where it hurts?”
[image]
15
92
836
54.1K
Marwane reposted
Surag Nair @suragnair
Excited to share Nona: a unifying multimodal masking framework for functional genomics. Models for DNA have evolved along separate paths: sequence-to-function (AlphaGenome), language models (Evo2), and generative models (DDSM). Can these be unified under a single paradigm? 1/15
[image]
5
50
230
33.7K
Marwane reposted
Samuel King @samuelhking
Many of the most complex and useful functions in biology emerge at the scale of whole genomes. Today, we share our preprint “Generative design of novel bacteriophages with genome language models”, where we validate the first, functional AI-generated genomes 🧵
39
315
1.3K
575.5K
Marwane reposted
Dr. Liana Lareau @lianafaye
This preprint from Helen Sakharova is one of the coolest things to come out of my lab: “Protein language models reveal evolutionary constraints on synonymous codon choice.” Codon choice is a big puzzle in genome information, and we have a new angle. biorxiv.org/content/10.110…
7
38
205
22.7K
Marwane reposted
Yiping Lu @2prime_PKU
Anyone knows adam?
[image]
265
440
4.8K
634.6K
Marwane reposted
Daniel Litt @littmath
My favorite thing like this: a (now deceased) math professor and Soviet émigré, in response to student complaints that he was insufficiently encouraging, recorded and placed on his website an audio clip of him saying “goood jooob…you tried your best…” etc.
Quoting rue yi @ruebytwo:

I respond very well to a, how do you say, eastern european style of instruction. when they straight up tell me i’m bad at something and it’s with zero malice. I’m just being observed, and a fact is being stated. The accent helps too

8
43
690
44K
Marwane reposted
Sergey Ovchinnikov @sokrypton
CASP is getting cut by NIH... 😢 (Anyone with extra funds wanna help support perhaps the most important competition of the century?) science.org/content/articl…
19
130
344
45K
Marwane reposted
Bo Wang @BoWang87
What a mind-blowing week for AI in biology! 🚀 Xaira just dropped the largest genome-wide perturbation dataset ever last week 🧬 Arc unveiled STATE yesterday, the most powerful foundation model for single-cell perturbation prediction 🧠 DeepMind followed up with AlphaGenome, the new state-of-the-art DNA foundation model today! The pace of innovation in this field is absolutely staggering. We’re witnessing biology being reprogrammed in real time. 🌍⚡️
19
94
441
35.5K
Marwane reposted
Jun Cheng @s6juncheng
Excited to share #AlphaGenome, the start of our AlphaGenome-named journey to decipher the regulatory genome! The model matches or exceeds top-performing external models on 24 out of 26 variant evaluations, across a wide range of biological modalities. 1/6
[image]
14
207
912
87.3K