Chenchen Han (@ChenchenHa42849) - Twitter Profile

Chenchen Han retweeted

Zhenjun Zhao@zhenjun_zhao·Aug 6

H3R: Hybrid Multi-view Correspondence for Generalizable 3D Reconstruction. Heng Jia, Linchao Zhu, Na Zhao. tl;dr: volumetric latent fusion + camera-aware Transformer; spatial-aligned model > semantic-aligned model. arxiv.org/abs/2508.03118

0

8

58

3.2K

Chenchen Han retweeted

bioRxiv Bioinfo@biorxiv_bioinfo·Jan 6

Decoding the Molecular Language of Proteins with Evola biorxiv.org/cgi/content/sh… #biorxiv_bioinfo

0

9

37

3.1K

Chenchen Han retweeted

fajie yuan@duguyuan·Jan 7

We release our protein ChatGPT, Evola! 🌟 chat-protein.com Evola comes in two versions: 10B & 80B. The 80B model has a 1.3B Saprot encoder & a 70B LLaMA3 decoder. Trained on 546 million protein question-text pairs with 150 billion word tokens! 💡🔬 biorxiv.org/content/10.110…

20

136

618

130.4K

Chenchen Han retweeted

Biology+AI Daily@BiologyAIDaily·Jan 7

Decoding the Molecular Language of Proteins with Evola
1. Evola introduces an 80-billion parameter multimodal protein-language model to decode protein functions, leveraging protein sequences, structures, and user queries.
2. A key innovation is its unprecedented training dataset: 546 million AI-generated protein question-answer pairs with 150 billion word tokens, reflecting immense protein diversity.
3. Evola integrates advanced techniques like Direct Preference Optimization (DPO) for model refinement and Retrieval-Augmented Generation (RAG) for incorporating external knowledge, ensuring high-quality, nuanced responses.
4. The Instructional Response Space (IRS), a novel evaluation framework, showcases Evola’s expert-level performance in protein annotation tasks like enzyme classification and gene ontology prediction.
5. The model outperforms general-purpose LLMs, demonstrating a nearly twofold improvement in generating precise, protein-specific insights compared to GPT-4-like models.
6. With scaling capabilities, Evola demonstrates enhanced performance by leveraging larger datasets and model sizes, culminating in Evola-80B achieving superior generalization on unseen protein data.
7. Evola’s ability to interpret protein molecular mechanisms extends applications to drug discovery, functional genomics, and biomedical research, revolutionizing protein functional understanding.
@duguyuan @LTEnjoy @XibinBayesZhou @ChenchenHa42849 @shiyu_jiang23
📜Paper: biorxiv.org/content/10.110…
#Proteomics #AI #ProteinLanguageModel #FunctionalGenomics #Biotechnology

0

12

75

4K

Chenchen Han retweeted

fajie yuan@duguyuan·Nov 7

Excited to share our AI+cryo-EM work! 🧬 🔬 Cryo-IEF: Foundation model trained on 65M particles 🤖 CryoWizard: automated structure pipeline 🎯 Making cryo-EM accessible to more labs Preprint: biorxiv.org/content/10.110… Code: github.com/westlake-repl/… #CryoEM #AI #StructuralBiology

5

29

105

13.5K

Chenchen Han retweeted

fajie yuan@duguyuan·Aug 2

Pinal demonstrates impressive performance when evaluated using GT-TMscore and ProTrek CLIP score, outperforming ESM-3 on dry-experiment metrics when keywords are used as the prompt. We plan to validate these results with wet experiments.

0

2

5

707

Chenchen Han retweeted

fajie yuan@duguyuan·Sep 4

We've released ColabProTrek, the successor to ColabSaprot. 🔬 Try it out: colab.research.google.com/drive/1On2xQU0… 🆕 We've also expanded ProTrek's search capabilities with additional databases including UniRef50 and PDB. 🧬 Explore: huggingface.co/spaces/westlak… Paper: biorxiv.org/content/10.110…

1

20

58

6K

Chenchen Han retweeted

fajie yuan@duguyuan·Sep 17

🚀 New Update. The latest version of ProTrek is now available on bioRxiv. 🧬 📑 Read it here: biorxiv.org/content/10.110… • Service: huggingface.co/spaces/westlak… • Try it on Colab: colab.research.google.com/drive/1On2xQU0…

1

8

52

16.6K

Chenchen Han retweeted

fajie yuan@duguyuan·Jul 19

Video (training): youtube.com/watch?v=r42z1h… Video (prediction): youtube.com/watch?v=N5VMBw…

0

1

4

401

Chenchen Han retweeted

fajie yuan@duguyuan·Jul 1

My student @LTEnjoy evaluated ESM3 (v1) on the inverse folding task. The results look great! Waiting for more results. Also check out SaprotHub, which has no license limitations: biorxiv.org/content/10.110… Contributions to SaprotHub are welcome, and you can be an author! github.com/westlake-repl/…

Jin Su@LTEnjoy

Just evaluated the inverse folding ability of the released ESM3 (esm3_sm_open_v1) on the CATH test set (around 1100 proteins). ESM3 performed better than Saprot but, surprisingly, worse than ProteinMPNN🧐. PS: The predictions took ~1.5h in total on one A40 GPU.

0

3

5

1.1K

Chenchen Han retweeted

fajie yuan@duguyuan·Jul 3

Zhikai uploaded a 6-min tutorial for SaprotHub! 🚀 Biologists can now easily train & share their protein language models. Join us, be a SaprotHub author! #Bioinformatics #ProteinModeling @LTEnjoy @sokrypton Paper: biorxiv.org/content/10.110… Video: youtube.com/watch?v=r42z1h…

5

11

34

8.5K

Chenchen Han retweeted

fajie yuan@duguyuan·Jul 8

Great news: a wet lab submitted an EYFP fluorescence fitness model to SaprotHub with a Spearman ρ of 0.94, close to wet lab accuracy for double/triple-site mutations. Trained on 100K variants, it's a great 🔧 tool for biologists! @ProteinBoston @ml4proteins @sokrypton @LTEnjoy

fajie yuan@duguyuan

Zhikai uploaded a 6-min tutorial for SaprotHub! 🚀 Biologists can now easily train & share their protein language models. Join us, be a SaprotHub author! #Bioinformatics #ProteinModeling @LTEnjoy @sokrypton Paper: biorxiv.org/content/10.110… Video: youtube.com/watch?v=r42z1h…

2

5

33

5.7K

Chenchen Han retweeted

fajie yuan@duguyuan·Jul 19

Recruited 12 bio students, no coding exp, to use ColabSaprot for re-training, zero-shot mutation, & protein design. They matched AI experts w/o hyper-parameter tuning! With SaprotHub, any biologist can train protein models! @sokrypton @LTEnjoy biorxiv.org/content/10.110…

7

33

212

20.5K

Chenchen Han retweeted

Jin Su@LTEnjoy·Jul 22

We believe in ProTrek's potential to search large databases for proteins with functions of interest. Any suggestions for our evaluation would be appreciated!! ProTrek demo: huggingface.co/spaces/westlak… Paper: biorxiv.org/content/10.110…

2

4

10

877

Chenchen Han retweeted

Jin Su@LTEnjoy·Jul 31

Used SaprotHub to predict mutations for eTDG, a uracil-N-glycosylase variant. 🧬 Lab results: 17 out of top 20 mutations had higher T-to-G editing efficiency than wild type (marked as red), with 3 showing nearly 2x improvement! 🚀

fajie yuan@duguyuan

Recruited 12 bio students, no coding exp, to use ColabSaprot for re-training, zero-shot mutation, & protein design. They matched AI experts w/o hyper-parameter tuning! With SaprotHub, any biologist can train protein models! @sokrypton @LTEnjoy biorxiv.org/content/10.110…

0

4

20

3.3K

Chenchen Han retweeted

Jin Su@LTEnjoy·Jun 4

We are thrilled to release ProTrek, a tri-modal PLM modeling protein sequence, structure and function! ProTrek supports both retrieval (9 tasks) and downstream fine-tuning! 👉Paper: biorxiv.org/content/10.110… 👉Github: github.com/westlake-repl/… 👉Demo: huggingface.co/spaces/westlak…

1

16

81

10.2K

Chenchen Han retweeted

Leo Zang@LeoTZ03·Jun 4

ProTrek: Navigating the Protein Universe through Tri-Modal Contrastive Learning - Aligns sequence-structure, sequence-function, and structure-function pairs by ESM, BERT, and Foldseek - Leverages max-inner product search for rapid retrieval preprint: biorxiv.org/content/10.110…

1

17

84

6.6K

Chenchen Han retweeted

fajie yuan@duguyuan·May 29

Exciting highlights: 1️⃣ Training is super easy: no ML or coding expertise needed! 2️⃣ Biologists can share models on our community store for others to use or retrain. 3️⃣ Join OPMC as a paper author! More contributions are welcome! FAQs: github.com/westlake-repl/… @GoogleColab #OPMC

Sergey Ovchinnikov@sokrypton

Now everyone can customize/share protein language models for their custom task/dataset via @GoogleColab 🤓 Paper: biorxiv.org/content/10.110… Colab: colab.research.google.com/drive/1nxYBed3… Credit: @LTEnjoy, Zhikai Li, @ChenchenHa42849, @BonnieSwt, Junjie Shan, @XibinBayesZhou, Dacheng Ma, @duguyuan

1

17

44

7.1K

Chenchen Han retweeted

fajie yuan@duguyuan·Jun 4

Introducing ProTrek, a 3-modal PLM for protein seq, struc, and func: ✨ Trained on 40M protein-text pairs, 100x larger than ProteinCLIP, ProtST, ProteinCLAP 🚀 30x/60x better accuracy than ProtST, ProteinCLAP ⚡ 100x faster than Foldseek, MMseqs2 for similar function searches

fajie yuan@duguyuan

Excited to share ProTrek, a fast & accurate protein search tool! 30x/60x better seq-func/func-seq retrieval. 100x faster than Foldseek & MMseqs2. 9 tasks: seq-struc, seq-func, struc-func, etc. Beats ESM2 in 9/11 tasks. Thanks to @sokrypton @WChentong biorxiv.org/content/10.110…

7

30

83

11K

Chenchen Han retweeted

fajie yuan@duguyuan·Jun 4

Excited to share ProTrek, a fast & accurate protein search tool! 30x/60x better seq-func/func-seq retrieval. 100x faster than Foldseek & MMseqs2. 9 tasks: seq-struc, seq-func, struc-func, etc. Beats ESM2 in 9/11 tasks. Thanks to @sokrypton @WChentong biorxiv.org/content/10.110…

0

15

57

13.6K
