Rex Ma
@RexMa9
67 posts
CS PhD student @ UToronto | AI for biology
Toronto, Ontario · Joined April 2018
646 Following · 125 Followers
Rex Ma reposted
Hani Goodarzi @genophoria
BioReason-Pro, the second model in our BioReason series is here! Congratulations @adibvafa, @arman1sa, @Radii2323, and the entire BioReason team!
2 replies · 16 reposts · 49 likes · 6.4K views
Rex Ma reposted
Arman Seyed-Ahmadi @arman1sa
What if AI could explain why a protein is a kinase, not just tell you it is? We built just that. BioReason-Pro is a multimodal LLM that reasons about protein function — walking through domains, interactions, and biological context to make predictions you can actually evaluate.
3 replies · 9 reposts · 54 likes · 7.5K views
Rex Ma reposted
ChloeXWang @ChloeXWang1
1/7 First of all, big shoutout to the co-authors on modeling (@MKarimzade, @neal_ravindra, @RexMa9, @HAOTIANCUI1, @LeeTaliq), huge appreciation to the data generation team (Lexi, @alerasool, Adam) and the bioinformatics team (@_annhuang), and to leadership for vision and direction (@BoWang87, @inCiChu)! Preprint is now live on bioRxiv: biorxiv.org/content/10.648… All models start from high-quality data.
Quoted: Bo Wang @BoWang87
Our X-Cell is up at @biorxiv_bioinfo! Read our full paper at biorxiv.org/content/10.648… Part of the data and the model weights will be shared soon. Stay tuned!
1 reply · 11 reposts · 33 likes · 6.6K views
Rex Ma reposted
Bo Wang @BoWang87
2026 may be the year AI starts to truly reason about biology. AlphaFold helped close the sequence → structure gap. The next frontier is sequence → function. Today, together with @genophoria and the team at @arcinstitute, we’re releasing BioReason-Pro — the first multimodal reasoning model for protein function prediction.
10 replies · 72 reposts · 292 likes · 56.4K views
Rex Ma reposted
Arc Institute @arcinstitute
Over 250 million protein sequences are known, but fewer than 0.1% have confirmed functions. Today, @genophoria, @BoWang87 & team introduce BioReason-Pro, a multimodal reasoning model that predicts protein function and explains its reasoning like an expert would.
11 replies · 126 reposts · 525 likes · 60.6K views
Rex Ma reposted
Mehran Karimzadeh @MKarimzade
1/ So excited to have had the opportunity to contribute to this magnificent effort! Foundation models of observational transcriptomes often memorize gene co-expression networks without understanding the underlying logic. Genetic perturbation datasets make it possible to …
Quoted: Bo Wang @BoWang87
Today we’re announcing X-Cell — Xaira’s first step toward a virtual cell. 🧬 A foundation model that predicts how gene expression changes under causal perturbations — across cell types, conditions, and even unseen biology. This is not trained on observational atlases. It is trained on interventions. 🧵👇
1 reply · 4 reposts · 14 likes · 2.8K views
Rex Ma reposted
Ci Chu @inCiChu
Next week I’m off to Vienna, Austria for #Perturb2026 to join some of the top thinkers in high-throughput biology and foundational model building. My talk — “Towards the virtual cell: Bridging genome-scale Perturb-seq data and causal AI models” — will put a spotlight on the amazing work the Xaira Therapeutics team is doing, rooted in our core belief: building truly causal AI models requires a foundation of high-quality causal data. We’ll have some very exciting news to share as well. Looking forward to seeing everyone there!
0 replies · 1 repost · 1 like · 62 views
Rex Ma reposted
Andrej Karpathy @karpathy
The next step for autoresearch is that it has to be asynchronously massively collaborative for agents (think: SETI@home style). The goal is not to emulate a single PhD student, it's to emulate a research community of them.

Current code synchronously grows a single thread of commits in a particular research direction. But the original repo is more of a seed, from which could sprout commits contributed by agents on all kinds of different research directions or for different compute platforms.

Git(Hub) is *almost* but not really suited for this. It has a softly built in assumption of one "master" branch, which temporarily forks off into PRs just to merge back a bit later. I tried to prototype something super lightweight that could have a flavor of this, e.g. just a Discussion, written by my agent as a summary of its overnight run: github.com/karpathy/autor… Alternatively, a PR has the benefit of exact commits: github.com/karpathy/autor… but you'd never want to actually merge it... You'd just want to "adopt" and accumulate branches of commits.

But even in this lightweight way, you could ask your agent to first read the Discussions/PRs using GitHub CLI for inspiration, and after its research is done, contribute a little "paper" of findings back.

I'm not actually exactly sure what this should look like, but it's a big idea that is more general than just the autoresearch repo specifically. Agents can in principle easily juggle and collaborate on thousands of commits across arbitrary branch structures. Existing abstractions will accumulate stress as intelligence, attention and tenacity cease to be bottlenecks.
524 replies · 714 reposts · 7.6K likes · 1.1M views
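The read-then-contribute loop Karpathy describes can be sketched with the GitHub CLI. A minimal sketch in Python, assuming `gh` is installed and authenticated; the repo slug, titles, and helper names are illustrative (the tweet's repo URLs are truncated), and since `gh` has no first-class Discussions command, a never-merged PR stands in as the "paper" of findings:

```python
import subprocess

# Illustrative slug; the tweet only names "the autoresearch repo".
REPO = "karpathy/autoresearch"

def build_read_cmd(repo=REPO, limit=20):
    """Command the agent runs first: skim prior PRs for inspiration."""
    return ["gh", "pr", "list", "--repo", repo,
            "--state", "all", "--limit", str(limit)]

def build_contribute_cmd(title, body_file, repo=REPO):
    """Command run after the overnight session: open a PR carrying the
    findings write-up (meant to be read and 'adopted', never merged)."""
    return ["gh", "pr", "create", "--repo", repo,
            "--title", title, "--body-file", body_file]

def run(cmd):
    """Thin wrapper; requires an authenticated gh install to actually run."""
    return subprocess.run(cmd, capture_output=True, text=True).stdout
```

Keeping command construction pure (lists of strings) and execution in a thin wrapper lets the agent log or dry-run its plan before touching the shared repo.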
Rex Ma reposted
Brian Hie @BrianHie
Evo 2, our genome language model that generalizes:
- across biological prediction and design tasks,
- across all modalities of the central dogma,
- across molecular to genome scale, and
- across all domains of life,
is published today in @Nature.
10 replies · 71 reposts · 373 likes · 55.7K views
Rex Ma reposted
Bo Wang @BoWang87
Everyone’s hyped about “AI for Science” in 2025! At the end of the year, please allow me to share my unease and optimism, specifically about AI & biology. After spending another year deep in biological foundation models, healthcare AI, and drug discovery, here are 3 lessons I learned in 2025.

1. Biology is not “just another modality.”
The biggest misconception I still see: “Biology is text + images + graphs. Just scale transformers.” No. Biology is causal, hierarchical, stochastic, and incomplete in ways that language and vision are not. Tokens don’t correspond cleanly to reality. Labels are sparse, biased, and often wrong. Ground truth is conditional, context-dependent, and sometimes unknowable.
We’ve made real progress—single-cell, imaging, genomics, EHRs are finally being modeled jointly—but the hard truth is this: most biological signals are not supervised problems waiting for better loss functions. They are intervention-driven problems. They demand perturbations, counterfactuals, and mechanisms, beyond just prediction.
Scaling obviously helps. But without causal structure, scaling mostly gives you sharper correlations. 2025 reinforced my belief that biological foundation models must be built around perturbation, uncertainty, and actionability, not just representation learning.

2. Benchmarks are holding biology back more than compute is.
Let’s be honest: benchmarking in AI & biology is still broken. Everyone reports SOTA. Everyone picks a different dataset slice. Everyone tunes for a different metric. Everyone avoids prospective validation. We’ve imported the worst habits of ML benchmarking into a domain where the stakes are much higher. In biology and healthcare, a 1% gain that doesn’t transfer is worse than useless—it’s misleading.
What’s missing isn’t more benchmarks. It’s hard benchmarks:
• Prospective, not retrospective
• Perturbation-based, not static
• Multi-site, not single-lab
• Failure-aware, not leaderboard-optimized
If your model only works on the dataset that created it, it’s not a foundation model—it’s a dataset artifact. In 2026, we need fewer flashy plots and more humility, rigor, and negative results.

3. “Reasoning” in biology is not chain-of-thought.
There’s a growing tendency to apply the word “reasoning” directly to biological LLMs. Let’s be careful. Biological reasoning isn’t verbal fluency, longer context windows, or prettier explanations. Those are surface-level improvements. Real reasoning in biology shows up elsewhere: in forming hypotheses, deciding which experiments to run, updating beliefs when perturbations fail, and constantly trading off cost, risk, and uncertainty. A model that explains a pathway beautifully but can’t decide which experiment to run next is not reasoning, it’s narrating.
2025 convinced me that the future lies in agentic biological AI: systems that couple foundation models with experimentation, simulation, and decision-making loops.

Closing thought: AI & biology is not lagging behind AI for code or language. It’s just playing a harder game. The constraints are real. The data is messy. The feedback loops are slow. The consequences matter. If 2025 clarified anything for me, it’s this: we won’t make progress by treating biology like text. We’ll make progress by building AI that behaves more like a scientist: skeptical, iterative, and willing to be wrong. Onward to 2026.
55 replies · 166 reposts · 742 likes · 66.7K views
Rex Ma reposted
Hannes Stark @HannesStaerk
Excited to release BoltzGen which brings SOTA folding performance to binder design! The best part of this project has been collaborating with many leading biologists who tested BoltzGen at an unprecedented scale, showing success on many novel targets and pushing its limits! 🧵..
18 replies · 263 reposts · 991 likes · 299.1K views
Rex Ma @RexMa9
@xingyuchen67
• Evaluated on human enhancer & promoter datasets across 6 cell types.
• Consistently outperforms evolutionary, generative, and RL baselines, improving specificity, motif correlation, and diversity.
1 reply · 0 reposts · 0 likes · 79 views