Dan Liu

35 posts

@DanLiu_

Computational biologist | bioinformatics, protein language models, virus-host interactions, LLMs 🦠 💻

Glasgow, Scotland · Joined May 2017
365 Following · 135 Followers
Pinned Tweet
Dan Liu (@DanLiu_)
Our PLM-interact is out in @NatureComms! We show that jointly encoding protein pairs using protein language models improves protein–protein interaction prediction performance and enables fine-tuning to predict mutation effects in human PPIs. nature.com/articles/s4146…
Dan Liu retweeted
Ke Yuan (@keyuan1)
PLM-interact is out! We learned a lot along the way, from ColBERT to next sentence prediction for PPI, from zero-shot PPI mutation effect prediction to full model fine-tuning, from not knowing FSDP to burning 30k GPU hours in just a few days. Heroic effort from @DanLiu_
Dan Liu (@DanLiu_)

Our PLM-interact is out in @NatureComms! We show that jointly encoding protein pairs using protein language models improves protein–protein interaction prediction performance and enables fine-tuning to predict mutation effects in human PPIs. nature.com/articles/s4146…

Dan Liu retweeted
Biology+AI Daily (@BiologyAIDaily)
Prediction of virus-host associations using protein language models and multiple instance learning @PLOSCompBiol
1. EvoMIL introduces an innovative method for predicting virus-host associations by combining protein language models (PLMs) and attention-based multiple instance learning (MIL), using only viral sequences.
2. The approach leverages transformer-based embeddings (ESM-1b) to represent viral protein features, achieving significant improvements over traditional sequence composition features, such as k-mers and physicochemical properties.
3. EvoMIL delivers remarkable performance in multi-host prediction tasks, achieving median F1 score improvements of 10.8% and 16.2% for prokaryotic hosts, and 6.6% and 11.5% for eukaryotic hosts, in comparison with traditional methods.
4. The system excels in binary classification, achieving an AUC above 0.95 for all prokaryotic hosts and 0.8–0.9 for eukaryotic hosts, marking a milestone in virus-host prediction accuracy.
5. Attention-based MIL not only enhances prediction accuracy but also identifies key viral proteins that drive host specificity, providing insights into virus-host interactions.
6. In benchmarks, EvoMIL outperformed state-of-the-art methods, such as iPHoP and BLASTn, achieving the highest accuracy (75.27%) on independent datasets of prokaryotic hosts.
7. This model highlights the importance of integrating evolutionary and structural data in computational virology, offering a robust tool for understanding host-pathogen interactions and emerging virus detection.
8. EvoMIL’s interpretability and strong performance make it a valuable resource for virologists, offering both predictive power and biological insight into virus-host specificity.
@keyuan1 @Kieran12Lamb @DanLiu_
💻Code: github.com/liudan111/EvoM…
📜Paper: journals.plos.org/ploscompbiol/a…
#VirusHostPrediction #MachineLearning #ProteinLanguageModels #Virology #Bioinformatics
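As a rough illustration of the attention-based MIL idea in points 1 and 5, here is a minimal sketch in plain Python. It is not EvoMIL's code: the function name `attention_mil_pool`, the projection matrix `V`, the scoring vector `w`, and the toy 2-dimensional embeddings are all assumptions made for this example. It treats a virus as a bag of protein embeddings, scores each protein, and uses softmax weights to form one bag-level vector.

```python
import math


def attention_mil_pool(instances, V, w):
    """Pool a bag of instance vectors into one vector using attention weights.

    instances: list of embedding vectors, one per protein (hypothetical toy data)
    V: projection matrix as a list of rows (hypothetical parameters)
    w: scoring vector applied to the tanh-projected instance (hypothetical)
    Returns (bag_vector, attention_weights).
    """
    scores = []
    for h in instances:
        # Project the instance and squash with tanh, then score it with w.
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wk * uk for wk, uk in zip(w, hidden)))

    # Numerically stable softmax over the instance scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # The bag vector is the attention-weighted sum of the instance vectors.
    dim = len(instances[0])
    bag = [sum(a * h[d] for a, h in zip(weights, instances)) for d in range(dim)]
    return bag, weights


# Toy bag of three "protein embeddings"; values are made up for illustration.
toy_bag = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]]
toy_V = [[1.0, 0.0], [0.0, 1.0]]
toy_w = [1.0, 0.0]
pooled, attn = attention_mil_pool(toy_bag, toy_V, toy_w)
```

The per-instance weights in `attn` are what make this kind of model inspectable: a protein with a larger weight contributes more to the bag-level vector, which is the sense in which attention can point at influential proteins.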
Dan Liu retweeted
Ke Yuan (@keyuan1)
Big news: We just released PLM-interact, a tool for predicting protein-protein interactions, showing a 16-28% improvement over previous methods and even predicting mutation effects on interactions. Here’s the story behind this journey. 🧵👇
Dan Liu (@DanLiu_)

🚀 Our new preprint is out! We show that protein language models can predict protein-protein interactions by jointly encoding protein pairs, leading to significant improvements in PPI prediction. biorxiv.org/content/10.110…

Dan Liu retweeted
Biology+AI Daily (@BiologyAIDaily)
PLM-interact: extending protein language models to predict protein-protein interactions
1. PLM-interact introduces a novel approach to predict protein-protein interactions (PPIs) by jointly encoding protein pairs, leveraging a method similar to “next sentence prediction” in NLP. This approach allows PLM-interact to capture inter-protein contexts, significantly improving PPI prediction across species.
2. The model achieves 16-28% better AUPR scores than current state-of-the-art models on non-human species datasets (e.g., mouse, yeast, E. coli), indicating its robust cross-species predictive power. This enhancement is crucial for applications where PPI data is sparse or costly to obtain experimentally.
3. Beyond general PPI prediction, PLM-interact can identify mutation-driven changes in PPIs, making it valuable for understanding disease-causing mutations and enabling applications in clinical genomics. It detects both mutations that induce interactions and those that disrupt them, showing versatility in mutation impact assessment.
4. PLM-interact also excels in virus-host PPI prediction tasks. When trained on virus-human interaction data, it outperformed other models with improvements in AUPR, F1, and MCC metrics by 5.7%, 10.9%, and 11.9%, respectively. This capability supports virology research, including zoonotic event prediction.
5. This study demonstrates that PLMs can extend beyond single-protein tasks to learn complex biomolecular relationships, setting a new standard for PPI prediction in bioinformatics. The potential of PLM-interact to streamline PPI predictions across various organisms could transform how we approach drug discovery and genomics research.
@keyuan1 @craig_macdonald @Kieran12Lamb
💻Code: github.com/liudan111/PLM-…
📜Paper: biorxiv.org/content/10.110…
#Bioinformatics #ProteinInteraction #MachineLearning #Genomics #ProteinLanguageModels #NLP
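To make the "jointly encoding protein pairs" idea in point 1 concrete, here is a minimal sketch of how two sequences might be packed into one model input, in the style of sentence-pair inputs for next sentence prediction. This is an illustration only, not PLM-interact's tokenization: the function name `build_pair_input`, the special tokens `<cls>` and `<sep>`, and the segment-id scheme are assumptions, and the sequences are toy strings.

```python
def build_pair_input(seq_a, seq_b):
    """Concatenate two protein sequences into one token list with segment ids.

    Each residue becomes one token. The special tokens and the 0/1 segment
    labels are hypothetical and only mirror the common sentence-pair layout.
    """
    tokens = ["<cls>"] + list(seq_a) + ["<sep>"] + list(seq_b) + ["<sep>"]
    # Segment 0 covers <cls>, protein A and its separator; segment 1 covers
    # protein B and the final separator.
    segment_ids = [0] * (len(seq_a) + 2) + [1] * (len(seq_b) + 1)
    return tokens, segment_ids


# Toy sequences, made up for illustration.
pair_tokens, pair_segments = build_pair_input("MKT", "GA")
```

Because both proteins sit in one sequence, a transformer's self-attention can relate residues of one protein to residues of the other, which is the contrast with encoding each protein separately and combining the two embeddings afterwards.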
Dan Liu (@DanLiu_)
🚀 Our new preprint is out! We show that protein language models can predict protein-protein interactions by jointly encoding protein pairs, leading to significant improvements in PPI prediction. biorxiv.org/content/10.110…