Ian Quigley
@allmeasures
Cofounder, @LeashBio. Ex-Recursion, Arima Genomics, Nanocellect, Salk, UT Austin, Baylor College of Medicine, Rice. He/him.

Intrinsic dataset features drive mutational effect prediction by protein language models biorxiv.org/content/10.648…

Protein language models are increasingly used to predict protein properties. How accurately do they predict the impact of amino acid substitutions? Evaluation against deep mutational scanning data suggests that their accuracy is only marginally higher than the simplest (mean) prediction for cellular proteins (blue datapoints) and about the same as the simplest (mean) prediction for viral proteins. The results suggest that intrinsic features of the datasets themselves (such as the number of variants and the specific sites mutated) are often more predictive of a model's success than the model's underlying architecture. We may be hitting a ceiling imposed by the quality and nature of our training data rather than by the sophistication of our algorithms.
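For context on how such benchmarks are typically scored: mutational-effect predictions are usually compared against deep mutational scanning (DMS) measurements with a rank correlation such as Spearman's rho. Below is a minimal, dependency-free sketch of that scoring step; the score arrays are hypothetical placeholders, not data from the paper.

```python
from statistics import mean

def ranks(xs):
    """Return 1-based ranks, averaging ranks across ties."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(xs):
        j = i
        # extend j over a block of tied values
        while j + 1 < len(xs) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank of the tie block
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(a, b):
    """Spearman correlation = Pearson correlation of the ranks."""
    ra, rb = ranks(a), ranks(b)
    ma, mb = mean(ra), mean(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    va = sum((x - ma) ** 2 for x in ra) ** 0.5
    vb = sum((y - mb) ** 2 for y in rb) ** 0.5
    return cov / (va * vb)

# Hypothetical example: DMS fitness scores vs. model scores for 5 variants
dms = [0.1, 0.9, 0.4, 0.7, 0.2]
plm = [-3.1, -0.5, -2.0, -1.2, -2.8]
print(round(spearman(dms, plm), 3))  # perfectly monotone here -> 1.0
```

A constant (mean) baseline has no rank ordering at all, which is why a model only "marginally" above it on this metric is a weak result.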



Adversarial Sequence Mutations in AlphaFold and ESMFold Reveal Nonphysical Structural Invariance, Confidence Failures, and Concerns for Protein Design

1. A new adversarial study systematically evaluates AlphaFold 3's robustness by introducing point mutations (up to 70%) and deletions (up to 10%) across 200 proteins, revealing striking structural invariance that raises fundamental questions about the model's biophysical reasoning capabilities.

2. The most concerning finding: AlphaFold 3 maintains virtually identical predicted structures even when 40% of residues are mutated with deliberately destabilizing substitutions, or when 10% of residues are deleted, perturbations that would catastrophically destabilize real proteins.

3. This structural invariance persists even for experimentally validated fold-switching proteins, where specific mutations are known to induce alternative conformations. AlphaFold 3 fails to capture these biologically critical transitions, suggesting limited sequence-structure coupling.

4. Confidence metrics prove unreliable: AlphaFold 3's ranking score selects the most accurate structure only ~25% of the time, and these scores correlate more strongly with template availability in the training set than with actual prediction quality.

5. Comparative analysis with ESMFold reveals that the protein language model-based approach shows significantly greater sensitivity to mutations, with structures diverging more rapidly as sequence perturbations increase, suggesting superior learned sequence-structure relationships despite lower absolute accuracy.

6. The study's template analysis provides quantitative evidence that AlphaFold 3's confidence reflects structural similarity to training-set exemplars (Pearson r=0.39) rather than genuine biophysical assessment, indicating heavy reliance on memorized patterns over learned principles.

7. These findings have profound implications for the entire AlphaFold ecosystem: protein design tools like RFdiffusion, binder design methods like BoltzGen and BindCraft, and drug discovery pipelines may inherit these fundamental limitations, potentially generating non-physical sequences or missing viable candidates.

8. The work identifies critical gaps in current structure prediction: models trained primarily on stable, wild-type proteins lack exposure to destabilized mutants and misfolded states, limiting their ability to generalize beyond the training distribution.

📜Paper: biorxiv.org/content/10.648…

#AlphaFold #AlphaFold3 #ProteinStructurePrediction #StructuralBiology #ProteinDesign #MachineLearning #Bioinformatics #ComputationalBiology #AIforScience #ProteinEngineering #DeepLearning #Biophysics
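The paper's full perturbation pipeline isn't reproduced here, but the core operation, substituting a fixed fraction of residues before re-predicting the structure, can be sketched simply. This assumes uniform random substitutions and a hypothetical wild-type sequence; the study additionally uses targeted destabilizing substitutions and deletions.

```python
import random

# The 20 standard amino acids (one-letter codes)
AA = "ACDEFGHIKLMNPQRSTVWY"

def mutate(seq, rate, seed=0):
    """Substitute `rate` fraction of positions with a different random residue."""
    rng = random.Random(seed)  # seeded for reproducibility
    n = max(1, int(len(seq) * rate))
    positions = rng.sample(range(len(seq)), n)  # distinct positions
    out = list(seq)
    for p in positions:
        # always pick a residue different from the wild type at this position
        out[p] = rng.choice([a for a in AA if a != seq[p]])
    return "".join(out)

wt = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"  # hypothetical 33-residue sequence
mut = mutate(wt, 0.40)
n_changed = sum(a != b for a, b in zip(wt, mut))
print(len(wt), n_changed)  # -> 33 13
```

In the study's setup, sequences perturbed this heavily would then be fed back to AlphaFold 3 / ESMFold, and the predicted structures compared to the wild-type prediction; near-zero structural divergence at 40% mutation is the invariance being criticized.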



