Naail Kashif-Khan

532 posts

@NKhan212

Making proteins in the lab and with AI 🔬💻 Wielder of petri dish and keyboard 🧫⌨️ Love heavy metal and the Arsenal 🎸🔴

London, England · Joined August 2015
356 Following · 220 Followers
Naail Kashif-Khan retweeted
Biology+AI Daily @BiologyAIDaily
Optimizing Molecular Glues Using Free Energy Perturbation and Cofolding Methods
1. This study presents a comprehensive evaluation of Free Energy Perturbation (FEP) and Boltz-2 for predicting the binding affinity of molecular glues to protein complexes. The results show that FEP outperforms Boltz-2 in terms of correlation and RMSE, highlighting the need for more accurate high-throughput methods.
2. Molecular glues are small molecules that induce protein-protein interactions, offering access to new biology and protein targets. However, their rational design and optimization are challenging due to the dynamic nature of their binding sites. This study addresses this challenge by providing a detailed comparison of computational methods.
3. The study assessed 93 compounds across six diverse target/effector complexes, yielding 140 unique protein-compound measurements. This large-scale evaluation provides valuable insights into the capabilities and limitations of FEP and Boltz-2 in the context of molecular glue optimization.
4. FEP demonstrated good absolute predictability with RMSE values within 0.3-1.25 kcal/mol and strong correlations, making it a valuable tool for molecular glue optimization despite its higher computational cost. In contrast, Boltz-2 exhibited poor absolute predictability and generally poor correlations.
5. The poor performance of Boltz-2 suggests it is not suitable for high-throughput screening of molecular glues. This highlights the need for more accurate, high-throughput machine learning methods for pre-FEP screening to accelerate the discovery of molecular glues.
6. The study underscores the importance of accurate computational methods in the challenging field of molecular glue optimization. The findings provide a foundation for future research aimed at developing more efficient and accurate tools for drug discovery.
📜Paper: doi.org/10.26434/chemr… #MolecularGlues #FreeEnergyPerturbation #Boltz2 #DrugDiscovery #ComputationalBiology
Biology+AI Daily tweet media
English
1
35
179
8.7K
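For readers unfamiliar with the two metrics the summary above compares, here is a minimal sketch of how RMSE and Pearson correlation are typically computed between predicted and measured binding free energies. The function names are illustrative and nothing here is taken from the paper; it uses only the Python standard library.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences of numbers."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

A method can correlate well with experiment and still have a high RMSE (for example, if every prediction is offset by a constant), which is why the summary reports both.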
Naail Kashif-Khan @NKhan212
@liambai21 Yup folds fine with the ESM Atlas API, so must be the visualizer being difficult!
English
1
0
2
121
Liam Bai @liambai21
Ah if 166aa doesn’t work something is definitely up. I just tried a sequence and it worked so it might be a transient issue. The API we’re using is the same as ESMAtlas so I’m curious if the sequence can be folded there. If that errors out (has happened to me before) then it’s definitely an issue with the API. Otherwise it’s probably a bug in our visualizer. esmatlas.com/resources?acti…
English
1
0
0
60
Liam Bai @liambai21
Ever wondered how a protein language model sees your favorite protein? Check out our SAE visualizer where you can now search any sequence for activating features.
English
6
44
170
31.8K
Naail Kashif-Khan @NKhan212
@liambai21 I was just testing a few of the example sequences shown, some of which are pretty long! But even trying with a 163aa sequence it still doesn't work for me, it does the loading animation and then gives just a blank space. I'll tinker around and maybe try a different browser!
English
1
0
1
142
Liam Bai @liambai21
@NKhan212 Is your sequence under 400 residues (limit for ESMFold API)? If so, I’ve also experienced temporary hiccups with the API but it usually works if you retry in a bit!
English
1
0
1
200
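The exchange above turns on two things: the 400-residue limit mentioned for the ESMFold API, and whether a failure comes from the API or from the visualizer. Below is a minimal sketch of a pre-flight check followed by a direct request, so the two can be told apart. The endpoint URL and the plain-text POST body are assumptions that should be verified against the ESM Atlas resources page, and `check_sequence` and `fold` are hypothetical helper names.

```python
import urllib.request

# Limit as stated in the tweet above ("limit for ESMFold API").
MAX_RESIDUES = 400
VALID_AA = set("ACDEFGHIKLMNPQRSTVWY")
# Assumed endpoint; confirm on the ESM Atlas resources page before relying on it.
FOLD_URL = "https://api.esmatlas.com/foldSequence/v1/pdb/"


def check_sequence(seq: str) -> str:
    """Return the cleaned sequence, or raise ValueError if it should not be submitted."""
    cleaned = "".join(seq.split()).upper()
    if not cleaned:
        raise ValueError("empty sequence")
    bad = sorted(set(cleaned) - VALID_AA)
    if bad:
        raise ValueError(f"non-standard residues: {''.join(bad)}")
    if len(cleaned) > MAX_RESIDUES:
        raise ValueError(
            f"{len(cleaned)} residues exceeds the {MAX_RESIDUES}-residue limit"
        )
    return cleaned


def fold(seq: str, timeout: float = 60.0) -> str:
    """POST the sequence and return the response body (expected to be PDB text)."""
    cleaned = check_sequence(seq)
    request = urllib.request.Request(
        FOLD_URL, data=cleaned.encode("ascii"), method="POST"
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

If `check_sequence` passes and `fold` returns PDB text, the sequence and the API are both fine, which points to the visualizer; if `fold` raises an HTTP error, retrying after a short wait matches the advice in the thread.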
Naail Kashif-Khan @NKhan212
@DdelAlamo This is so cool! Not only really interesting to see what pLMs are "thinking", but super visually satisfying too 👀
English
0
0
2
123
Naail Kashif-Khan @NKhan212
@DdelAlamo I always just assumed they were LLM generated using the text of the paper lol
English
0
0
4
309
Diego del Alamo @DdelAlamo
I appreciate that this account exists but sometimes wonder where these summaries come from exactly
Biology+AI Daily @BiologyAIDaily

ProtSCAPE: Mapping the landscape of protein conformations in molecular dynamics
1. ProtSCAPE is a deep learning architecture designed to map protein conformations from molecular dynamics (MD) simulations, using a novel combination of learnable geometric scattering with dual attention mechanisms.
2. The model employs geometric scattering to capture both local and global protein structures, representing proteins as graphs, which are then processed by a transformer with dual attention, focusing on residues and amino acids.
3. ProtSCAPE’s latent representations are temporally coherent, allowing it to capture conformational transitions, such as phase changes between open and closed states, and stochastic switching between meta-stable conformations.
4. Unlike conventional MD trajectory analysis that often misses complex transitions, ProtSCAPE excels in generating detailed low-dimensional representations that retain structural and temporal context, enabling enhanced visualization and downstream analysis of protein dynamics.
5. ProtSCAPE effectively generalizes from short to long trajectories and from wild-type to mutant proteins, offering insights into how mutations can affect the protein conformational landscape.
6. The model can interpolate between states to reconstruct intermediate conformations, validated with case studies on proteins like MurD, which showed hinge-like transitions consistent with experimental data.
7. ProtSCAPE outperformed traditional graph-based methods (GNNs) in predicting pairwise distances and dihedral angles, demonstrating its superior ability to decode the dynamics of protein conformations.
8. This tool holds significant promise for studying complex protein functions, such as allostery, binding, and enzymatic catalysis, by providing a comprehensive view of protein motion across various temporal scales.
@egbertcastro @dbhaskar92 @KrishnaswamyLab @Siddharth2814 💻Code: github.com/KrishnaswamyLa… 📜Paper: arxiv.org/abs/2410.20317 #ProteinDynamics #DeepLearning #Bioinformatics #MolecularDynamics #Transformer #ProteinConformation #MachineLearning #ComputationalBiology

English
3
0
17
4.8K
Naail Kashif-Khan @NKhan212
@klausenhauser I guess that's my overall concern here - you can't trust (or even really know) the data that will go into these models, let alone anything else about them (no code or weights, no details on training or experiments) so who will trust these over something like ESM-2?
English
0
0
3
56
Kelvin Lau 🧬🧪💎 @klausenhauser
@NKhan212 If you saw their second announcement, their foundry is now a massive data generation platform. However they’ll do so many different assays that I don’t even know if they have the expertise in them. Even if they don’t analyze, if you don’t trust the generator then it’s GIGO 🗑️🚮
English
1
0
0
127
Naail Kashif-Khan @NKhan212
So Ginkgo has just announced its own protein language model - my first question is, why would anyone want to use a closed-source proprietary pLM over one of the open-source and better understood models already out there?
SynBioBeta @SynBioBeta

@Ginkgo’s protein language model, built on @googlecloud technology, offers unprecedented insights for researchers, accelerating the development of life-saving medicines. #DrugDevelopment #ProteinLLM #AIinBiotech #GinkgoBioworks #GoogleCloud loom.ly/RGh0ORc

English
4
3
35
7.4K
Naail Kashif-Khan @NKhan212
Ah yes it's every protein scientist's favourite amino acid, the shiny golden orb one
Naail Kashif-Khan tweet media
English
0
0
3
175
Diego del Alamo @DdelAlamo
There should be leetcode for comp bio just to brush up our skills on all the random small bullshit we do as part of our jobs. I refuse to believe rosetta partial_thread is the best way to isolate specific subregions of a PDB file from a sequence alignment
English
1
2
19
2.6K
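As an illustration of the kind of small task the tweet above describes, here is a minimal sketch that pulls a residue range out of a PDB file using only the fixed-column layout of ATOM/HETATM records (chain ID in column 22, residue number in columns 23-26). It is not the Rosetta `partial_thread` workflow and does not handle the sequence-alignment step; `extract_region` is a hypothetical helper that assumes the residue numbers have already been worked out.

```python
def extract_region(pdb_text: str, chain: str, start: int, end: int) -> str:
    """Keep ATOM/HETATM records on `chain` with start <= residue number <= end.

    Relies on the fixed-width PDB format. Insertion codes (column 27) and
    alternate locations are ignored in this sketch.
    """
    kept = []
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")) or len(line) < 26:
            continue
        if line[21] != chain:  # column 22: chain identifier
            continue
        residue_number = int(line[22:26])  # columns 23-26: residue sequence number
        if start <= residue_number <= end:
            kept.append(line)
    return "\n".join(kept)
```

Mapping alignment columns to residue numbers would still be needed before calling this, which is arguably the fiddly part the tweet is complaining about.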
Kevin K. Yang 楊凱筌 @KevinKaichuang
New favorite protein structure: alpha helix inside a beta barrel!
Kevin K. Yang 楊凱筌 tweet media
English
18
29
408
30.8K
Naail Kashif-Khan @NKhan212
@sokrypton This is super interesting, is there some code we can play with to experiment with this?
English
0
0
0
1.1K
Sergey Ovchinnikov @sokrypton
Ever wondered how many amino acids you can mutate to alanine while AlphaFold2 still predicts the same structure? 🤔 For the de novo design Top7 (1QYS), in single-sequence mode, it's 60%. (1/2)
Sergey Ovchinnikov tweet media
English
17
97
559
124.3K
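The tweet above does not say how the mutated positions were chosen, so the following is only a sketch of the sequence-side half of such an experiment: mutating a given fraction of positions to alanine, chosen uniformly at random (an assumption). `mutate_to_alanine` is a hypothetical helper; running the structure predictor on the result and comparing structures is left out.

```python
import random


def mutate_to_alanine(seq: str, fraction: float, seed: int = 0) -> str:
    """Replace round(fraction * len(seq)) non-alanine positions with 'A'.

    Positions are drawn uniformly at random with a fixed seed for
    reproducibility. If fewer non-alanine positions exist than requested,
    all of them are mutated.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)
    candidates = [i for i, aa in enumerate(seq) if aa != "A"]
    n_mutations = min(round(fraction * len(seq)), len(candidates))
    chosen = set(rng.sample(candidates, n_mutations))
    return "".join("A" if i in chosen else aa for i, aa in enumerate(seq))
```

Sweeping `fraction` upward and folding each variant would give the kind of curve the tweet describes, though the reported 60% figure may depend on a selection strategy other than random choice.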
Naail Kashif-Khan @NKhan212
@pengzhangzhi1 @LTEnjoy Something just doesn't sit right with me - train an inverse folding model on AF2 structures (which we think might not be a great idea), and then train a big language model on those inverse folded sequences, and also on AF2 predicted structures
English
1
0
1
44
Naail Kashif-Khan @NKhan212
@pengzhangzhi1 @LTEnjoy This is exactly what I'm concerned about. ProteinMPNN is trained only on the PDB and has been widely experimentally validated. In contrast, ESM-IF is trained on the PDB and 12M AF2 structures and no one's got it to work in the literature yet.
English
1
0
2
46
Jin Su @LTEnjoy
Just evaluated the inverse folding ability of the released ESM3 (esm3_sm_open_v1) on the CATH test set (around 1100 proteins). ESM3 performed better than Saprot but surprisingly inferior to ProteinMPNN🧐. PS: The overall predictions took ~1.5h on one A40 GPU.
Jin Su tweet media (2 images)
English
5
9
46
6.5K
Naail Kashif-Khan @NKhan212
@LTEnjoy Cool stuff! Perhaps not so surprising given that ESM3 is trained on a bunch of predicted structures and inverse folded sequences. I reckon there's a load of junk in that data and it's probably not as good as training on just the PDB which are all "real" structures
English
1
0
5
482
Naail Kashif-Khan @NKhan212
@julian_englert @diffuse_bio @EvoscaleAI This is really cool! Protein design is very shiny and flashy but the nitty gritty lab characterization is my favourite part :) Any ideas why the non-binder from the paper looked like it binds in your experiments?
English
0
0
0
99
Julian Englert @julian_englert
Benchmarking AI-designed proteins in our lab
New AI models for designing proteins are coming out at a faster and faster pace! Just in the past two weeks, two new models were released: @diffuse_bio's DSG-1 and @EvoscaleAI's ESM3. As the number of AI models increases, it becomes important for protein designers to know what model actually works best for their application.
At @adaptyvbio we just launched a series of real-world benchmarks to understand how state-of-the-art protein design models perform when tested in the lab. For this first case study, we’re validating some de-novo designed binders from RFdiffusion by @UWproteindesign.
Read more: adaptyvbio.com/blog/rfdiff_il…
Julian Englert tweet media
English
3
23
115
11.2K
Naail Kashif-Khan @NKhan212
@jakublala It also seems that the "open" model isn't actually the best or biggest one they trained so they've definitely nerfed the publicly available stuff to keep the best ones for commercial use I'd imagine
English
1
0
0
6
Naail Kashif-Khan @NKhan212
@jakublala Based on this snippet from their press release I think they're going to pull a DeepMind and keep some secret sauce to themselves for licensing to pharma for drug discovery. I believe the currently available model is non-commercial use only too, but I'd have to double check
Naail Kashif-Khan tweet media
English
1
0
0
19
Jakub Lála 👨🏻‍🍳🥯
so how much will this esm3 api cost? any ideas? and what about the IP associated with the generations?
English
1
0
1
135