Cade Gordon

226 posts

Cade Gordon

@CadeGordonML

Helping models grow wise @Anthropic | Hertz Fellow | Prev: LAION-5B & OpenCLIP @UCBerkeley

Berkeley, CA · Joined December 2020
873 Following · 2.4K Followers
Pinned Tweet
Cade Gordon @CadeGordonML
Excited to announce our new work! 🧬 Some highlights are:
- sequence likelihoods predict zero-shot fitness capabilities
- a new method to calculate pLM likelihood in O(1) instead of O(L) forward passes
- providing a causal link between training data and outputs
- suggesting a new finetuning method to improve pLM capabilities
(1/9) biorxiv.org/content/10.110…
2 replies · 32 reposts · 206 likes · 47.1K views
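The O(L)-vs-O(1) distinction above refers to how many forward passes a masked protein LM needs to score one sequence. A minimal sketch of the contrast, assuming a small public ESM-2 checkpoint and the Hugging Face masked-LM interface; the function names are mine, and the single-pass scorer shown is the common unmasked approximation, not necessarily the paper's exact method:

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Illustrative checkpoint choice; any HF masked protein LM would do.
tokenizer = AutoTokenizer.from_pretrained("facebook/esm2_t6_8M_UR50D")
model = AutoModelForMaskedLM.from_pretrained("facebook/esm2_t6_8M_UR50D").eval()

@torch.no_grad()
def pseudo_log_likelihood(seq: str) -> float:
    """Standard pseudo-log-likelihood: mask each residue in turn -> O(L) passes."""
    ids = tokenizer(seq, return_tensors="pt")["input_ids"]
    total = 0.0
    for i in range(1, ids.shape[1] - 1):          # skip <cls> and <eos> tokens
        masked = ids.clone()
        masked[0, i] = tokenizer.mask_token_id
        logits = model(masked).logits             # one forward pass per position
        total += torch.log_softmax(logits[0, i], dim=-1)[ids[0, i]].item()
    return total

@torch.no_grad()
def one_pass_log_likelihood(seq: str) -> float:
    """Approximation: score every position from a single unmasked pass -> O(1)."""
    ids = tokenizer(seq, return_tensors="pt")["input_ids"]
    logp = torch.log_softmax(model(ids).logits, dim=-1)
    pos = torch.arange(1, ids.shape[1] - 1)
    return logp[0, pos, ids[0, pos]].sum().item()
```

The two scores generally differ, since the unmasked pass lets each position attend to itself; that gap is part of why the replies below distinguish plain scoring from ProteinGym's "pseudo-ppl" mode.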
billy bubba @wgrathwohl
Hit Billy with your Austin recs plzzzzzzz
1 reply · 0 reposts · 0 likes · 310 views
Furkan Eris @nappenstance
@DdelAlamo I guess the correct approach would be rerunning the evals in the "pseudo-ppl" mode of ProteinGym. This would take a humongous amount of time. Also, masked models' per-position (later summed) training objectives probably don't help.
1 reply · 0 reposts · 1 like · 336 views
Diego del Alamo @DdelAlamo
I have a concern with this paper and I want someone who knows more than me to confirm if it is founded or not. The title makes a pretty specific claim about epistasis predictions, but the method does not seem sound for masked LMs (1/3)
Biology+AI Daily @BiologyAIDaily

Beyond additivity: zero-shot methods cannot predict impact of epistasis on protein properties and function

1. The study reveals a critical blind spot in modern protein AI: while 95 state-of-the-art zero-shot models can predict single mutations well, they systematically fail when mutations interact epistatically, where the combined effect of mutations deviates from simple additivity.
2. Using 53 MAVE datasets from ProteinGym, the researchers identified epistatic genotypes by comparing observed effects against expected additive effects, accounting for experimental error. For GFP fluorescence and protein thermostability, epistasis is widespread and biologically genuine, not a measurement artifact.
3. The performance gap is stark. Top models like ESCOTT, PoET, and MSA-Transformer achieve Spearman correlations above 0.6 for all genotypes, but collapse to near-zero or negative correlations for epistatic genotypes. Simple linear regression baselines often match or exceed complex deep learning models on epistatic combinations.
4. This exposes a fundamental limitation: protein language models learn evolutionary plausibility from natural sequences, but natural selection only explores functional sequence space. Epistatic combinations, often traversing fitness valleys, lie outside this training distribution, leaving models blind to higher-order mutational interactions.
5. The work highlights that clever feature engineering (evolutionary conservation, structural information) outperforms architectural complexity for epistasis prediction. Yet even structure-aware models like ProSST and ESM-IF1, while top performers on stability, show no consistent advantage across datasets.
6. The implications are profound for protein design and directed evolution. Current zero-shot methods cannot reliably navigate rugged fitness landscapes or predict functional variants along evolutionary paths requiring epistatic mutations. The field urgently needs models trained on multi-mutational data and architectures explicitly modeling non-linear interactions.

💻 Code: github.com/kalininalab/ep…
📜 Paper: biorxiv.org/content/10.648…

#ProteinEngineering #Epistasis #MachineLearning #ProteinGym #VariantEffectPrediction #ComputationalBiology #Bioinformatics #ProteinEvolution #AIforScience #StructuralBiology

5 replies · 7 reposts · 35 likes · 9.3K views
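The additive-expectation test in point 2 reduces to a simple deviation check: a double mutant is called epistatic when its observed effect differs from the sum of the single-mutant effects by more than the experimental error allows. A minimal sketch with made-up numbers and a hypothetical k-sigma threshold (the paper's exact error model may differ):

```python
def epistasis_score(effect_a: float, effect_b: float, effect_ab: float) -> float:
    """Deviation of the double mutant from additivity (all effects vs. wild type)."""
    return effect_ab - (effect_a + effect_b)

def is_epistatic(effect_a: float, effect_b: float, effect_ab: float,
                 sigma: float, k: float = 2.0) -> bool:
    """Flag epistasis when the deviation exceeds k standard errors."""
    return abs(epistasis_score(effect_a, effect_b, effect_ab)) > k * sigma

# Example: two mildly deleterious mutations that rescue each other.
print(epistasis_score(-0.5, -0.4, +0.1))          # +1.0 -> positive epistasis
print(is_epistatic(-0.5, -0.4, +0.1, sigma=0.1))  # True
```

A model that scores a multi-mutant by summing per-mutation log-likelihood ratios is additive by construction, so it cannot capture the deviation this test isolates.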
Cade Gordon retweeted
Mike A. Merrill @Mike_A_Merrill
The Terminal-Bench paper is here! Read it to learn where frontier models still fail and the secrets of how we sourced hundreds of high quality environments from our open source community. 🧵
23 replies · 103 reposts · 461 likes · 101.1K views
near @nearcyan
you wouldnt believe whats inside. of the plushie, that is. i will spare your heart
7 replies · 0 reposts · 131 likes · 8K views
near @nearcyan
prize for being opus 4.5's favorite user?? what
67 replies · 48 reposts · 2.3K likes · 93.9K views
Cade Gordon @CadeGordonML
@bilaltwovec when does the interview drop of you saying you dropped out of harvard for a year to dj?
1 reply · 0 reposts · 2 likes · 290 views
Cade Gordon retweeted
Anthropic @AnthropicAI
Introducing the next generation: Claude Opus 4 and Claude Sonnet 4. Claude Opus 4 is our most powerful model yet, and the world’s best coding model. Claude Sonnet 4 is a significant upgrade from its predecessor, delivering superior coding and reasoning.
932 replies · 3.2K reposts · 20.8K likes · 4.3M views
Cade Gordon @CadeGordonML
Excited to share that I'll be joining @Anthropic to work on pretraining science! I've chosen to defer my Stanford PhD, where I'm honored to be supported by the Hertz Fellowship. There's something special about the science, this place, and these people. Looking forward to joining some of my most brilliant and compassionate colleagues!
42 replies · 10 reposts · 762 likes · 58.7K views
Cade Gordon retweeted
Mike A. Merrill @Mike_A_Merrill
Many agents (Claude Code, Codex CLI) interact with the terminal to do valuable tasks, but do they currently work well enough to deploy en masse? We’re excited to introduce Terminal-Bench: An evaluation environment and benchmark for AI agents on real-world terminal tasks. Tl;dr lots of room for improvement! tbench.ai
16 replies · 61 reposts · 243 likes · 51.7K views
Cade Gordon retweeted
Hertz Foundation @HertzFoundation
🎓🤖 We’re thrilled to welcome @CadeGordonML to the 2025 class of Hertz Fellows! Cade’s AI research is advancing biomedical discovery. A future PhD student at @Stanford, he joins a growing community shaping the future of #science and #tech! 🔗 bit.ly/4daXZLO
2 replies · 2 reposts · 29 likes · 2.5K views
Cade Gordon retweeted
Hertz Foundation @HertzFoundation
👏 Meet the 2025 Hertz Fellows—19 rising leaders in science and tech advancing breakthroughs in robotics, energy, medicine & more. 🔗Learn more: bit.ly/2025HertzFello…
0 replies · 10 reposts · 33 likes · 12.5K views
Cade Gordon @CadeGordonML
I'm still trying to wrangle with this idea in my head too. On one hand, it feels to me that certain regions of protein space shouldn't be this heavily memorized (high log-likelihood / low perplexity), which could be mitigated with better pre-training data mixtures. On the other hand, I'm uncertain whether zero-shot fitness prediction is an intended capability of these models at all, or rather a fun quirk that we've discovered; e.g., there's nothing about the training task to suggest this behavior is intended. Cool to see them agree with our past work, though! biorxiv.org/content/10.110…
1 reply · 0 reposts · 1 like · 26 views
Cade Gordon retweeted
Samarth Jajoo @jajoosam
Documenting and sharing research in real-time is underrated in discussions about open science. @jainhiya_ and I think software can help change problem selection, collaboration, and funding. We write about how and why we should create real-time, open lab notebooks.
5 replies · 9 reposts · 61 likes · 12.1K views
Cade Gordon retweeted
Hiya Jain @jainhiya_
Chinese policy on clinical trial approvals liberalized massively in the mid-2010s. A decade later, the effects of this move are perceptible in where our drugs come from.
1 reply · 3 reposts · 17 likes · 2.8K views
Cade Gordon retweeted
Amy Lu @amyxlu
Arrived in Singapore for ICLR, excited to see old & new friends! I'll also be at the:
- Thursday 3:30-5pm main conference poster session, presenting work led by @CadeGordonML on the subtleties of using protein LM likelihoods for fitness prediction (see 🔗👇)
- GEM workshop, presenting our all-atom latent diffusion work with @nc_frey @KevinKaichuang @wilson1yan & others
(Unrelatedly, I've never been so infatuated with an airport before… check out this gradient descent sculpture.)
5 replies · 4 reposts · 107 likes · 8K views