Chenchen Han

30 posts

@ChenchenHa42849

Joined December 2023
76 Following, 17 Followers
Chenchen Han retweeted
Zhenjun Zhao @zhenjun_zhao
H3R: Hybrid Multi-view Correspondence for Generalizable 3D Reconstruction
Heng Jia, Linchao Zhu, Na Zhao
tl;dr: volumetric latent fusion + camera-aware Transformer; spatial-aligned model > semantic-aligned model
arxiv.org/abs/2508.03118
0
8
58
3.2K
Chenchen Han retweeted
fajie yuan @duguyuan
We release our protein ChatGPT, Evola! 🌟 chat-protein.com Evola comes in two versions: 10B & 80B. The 80B model has a 1.3B Saprot encoder & a 70B LLaMA3 decoder. Trained on 546 million protein question-answer pairs with 150 billion word tokens! 💡🔬 biorxiv.org/content/10.110…
20
136
618
130.4K
Chenchen Han retweeted
Biology+AI Daily @BiologyAIDaily
Decoding the Molecular Language of Proteins with Evola

1. Evola introduces an 80-billion parameter multimodal protein-language model to decode protein functions, leveraging protein sequences, structures, and user queries.
2. A key innovation is its unprecedented training dataset: 546 million AI-generated protein question-answer pairs with 150 billion word tokens, reflecting immense protein diversity.
3. Evola integrates advanced techniques like Direct Preference Optimization (DPO) for model refinement and Retrieval-Augmented Generation (RAG) for incorporating external knowledge, ensuring high-quality, nuanced responses.
4. The Instructional Response Space (IRS), a novel evaluation framework, showcases Evola’s expert-level performance in protein annotation tasks like enzyme classification and gene ontology prediction.
5. The model outperforms general-purpose LLMs, demonstrating a nearly twofold improvement in generating precise, protein-specific insights compared to GPT-4-like models.
6. With scaling capabilities, Evola demonstrates enhanced performance by leveraging larger datasets and model sizes, culminating in Evola-80B achieving superior generalization on unseen protein data.
7. Evola’s ability to interpret protein molecular mechanisms extends applications to drug discovery, functional genomics, and biomedical research, revolutionizing protein functional understanding.

@duguyuan @LTEnjoy @XibinBayesZhou @ChenchenHa42849 @shiyu_jiang23
📜 Paper: biorxiv.org/content/10.110…
#Proteomics #AI #ProteinLanguageModel #FunctionalGenomics #Biotechnology
0
12
75
4K
Chenchen Han retweeted
fajie yuan @duguyuan
Pinal demonstrates impressive performance when evaluated using GT-TMscore and ProTrek CLIP score, outperforming ESM-3 on dry-experiment metrics when keywords are used as the prompt. We plan to validate these results with wet experiments.
0
2
5
707
Chenchen Han retweeted
fajie yuan @duguyuan
My student @LTEnjoy evaluated ESM3 (v1) on the inverse folding task. The results look great! Waiting for more results. Also check out SaprotHub, which has no license restrictions: biorxiv.org/content/10.110… Contributions to SaprotHub are welcome, come be an author! github.com/westlake-repl/…
Jin Su @LTEnjoy

Just evaluated the inverse folding ability of the released ESM3 (esm3_sm_open_v1) on the CATH test set (around 1100 proteins). ESM3 performed better than Saprot but was surprisingly inferior to ProteinMPNN 🧐. PS: The overall predictions took ~1.5h on one A40 GPU.

0
3
5
1.1K
Chenchen Han retweeted
fajie yuan @duguyuan
Great news: a wet lab submitted an EYFP fluorescence fitness model to SaprotHub with a Spearman ρ of 0.94, close to wet lab accuracy for double/triple-site mutations. Trained on 100K variants, it's a great 🔧 tool for biologists! @ProteinBoston @ml4proteins @sokrypton @LTEnjoy
fajie yuan @duguyuan

Zhikai uploaded a 6-min tutorial for SaprotHub! 🚀 Biologists can now easily train & share their protein language models. Join us, be a SaprotHub author! #Bioinformatics #ProteinModeling @LTEnjoy @sokrypton Paper: biorxiv.org/content/10.110… Video: youtube.com/watch?v=r42z1h…

2
5
33
5.7K
Chenchen Han retweeted
fajie yuan @duguyuan
Recruited 12 bio students, no coding exp, to use ColabSaprot for re-training, zero-shot mutation, & protein design. They matched AI experts w/o hyper-parameter tuning! With SaprotHub, any biologist can train protein models! @sokrypton @LTEnjoy biorxiv.org/content/10.110…
7
33
212
20.5K
Chenchen Han retweeted
Jin Su @LTEnjoy
Used SaprotHub to predict mutations for eTDG, a uracil-N-glycosylase variant. 🧬 Lab results: 17 of the top 20 mutations had higher T-to-G editing efficiency than the wild type (marked in red), with 3 showing nearly 2x improvement! 🚀
fajie yuan @duguyuan

Recruited 12 bio students, no coding exp, to use ColabSaprot for re-training, zero-shot mutation, & protein design. They matched AI experts w/o hyper-parameter tuning! With SaprotHub, any biologist can train protein models! @sokrypton @LTEnjoy biorxiv.org/content/10.110…

0
4
20
3.3K
Chenchen Han retweeted
Leo Zang @LeoTZ03
ProTrek: Navigating the Protein Universe through Tri-Modal Contrastive Learning
- Aligns sequence-structure, sequence-function, and structure-function pairs by ESM, BERT, and Foldseek
- Leverages max-inner product search for rapid retrieval
preprint: biorxiv.org/content/10.110…
1
17
84
6.6K
Chenchen Han retweeted
fajie yuan @duguyuan
Exciting highlights:
1️⃣ Training is super easy: no ML or coding expertise needed!
2️⃣ Biologists can share models on our community store for others to use or retrain.
3️⃣ Join OPMC as a paper author! More contributions are welcome!
FAQs: github.com/westlake-repl/… @GoogleColab #OPMC
Sergey Ovchinnikov @sokrypton

Now everyone can customize/share protein language models for their custom task/dataset via @GoogleColab 🤓 Paper: biorxiv.org/content/10.110… Colab: colab.research.google.com/drive/1nxYBed3… Credit: @LTEnjoy, Zhikai Li, @ChenchenHa42849, @BonnieSwt, Junjie Shan, @XibinBayesZhou, Dacheng Ma, @duguyuan

1
17
44
7.1K
Chenchen Han retweeted
fajie yuan @duguyuan
Introducing ProTrek, a 3-modal PLM for protein seq, struc, and func:
✨ Trained on 40M protein-text pairs, 100x larger than ProteinCLIP, ProtST, ProteinCLAP
🚀 30x/60x better accuracy than ProtST, ProteinCLAP
⚡ 100x faster than Foldseek, MMseqs2 for similar function searches
fajie yuan @duguyuan

Excited to share ProTrek, a fast & accurate protein search tool!
30x/60x better seq-func/func-seq retrieval
100x faster than Foldseek & MMseqs2
9 tasks: seq-struc, seq-func, struc-func, etc.
Beats ESM2 in 9/11 tasks
Thanks to @sokrypton @WChentong biorxiv.org/content/10.110…

7
30
83
11K
Chenchen Han retweeted
fajie yuan @duguyuan
Excited to share ProTrek, a fast & accurate protein search tool!
30x/60x better seq-func/func-seq retrieval
100x faster than Foldseek & MMseqs2
9 tasks: seq-struc, seq-func, struc-func, etc.
Beats ESM2 in 9/11 tasks
Thanks to @sokrypton @WChentong biorxiv.org/content/10.110…
0
15
57
13.6K