Bin Zhang

162 posts

@binzmit

Associate professor of chemistry, MIT

East Cambridge, Cambridge · Joined November 2017
289 Following · 1.2K Followers

Bin Zhang retweeted
MIT Chemistry @ChemistryMIT
Congratulations to Professor Bin Zhang, who has been honored as “Committed to Caring”. The C2C program is a student-driven initiative that recognizes outstanding professors who extend this dedication beyond the classroom. chemistry.mit.edu/chemistry-news…

Bin Zhang @binzmit
Pretraining implicit-solvent, or coarse-grained, models is difficult, which limits the transferability of the resulting ML force fields. However, Justin just figured out a novel method for pretraining using protein language models. More can be found at: arxiv.org/abs/2601.05388

Bin Zhang @binzmit
Thrilled to share our new JCP editorial (shorturl.at/6mKdY), co-authored with @tamar_schlick, introducing the special collection “Chromatin Structure and Dynamics: Recent Advancements.” Huge thanks to all the outstanding contributors!

Bin Zhang @binzmit
Hmm, I assume these posts were made by AI? Anyway, very impressive summary of our latest preprint.
Biology+AI Daily @BiologyAIDaily

Protein Language Model Identifies Disordered, Conserved Motifs Driving Phase Separation
1. This study employs ESM2, a cutting-edge protein language model, to analyze intrinsically disordered regions (IDRs) in proteins, uncovering conserved motifs critical for phase separation and membraneless organelle (MLO) formation.
2. A major finding reveals that IDRs involved in phase separation contain conserved “sticker” residues (e.g., Y, W, F) and “spacer” residues (e.g., G, A, P), forming functional sequence motifs under evolutionary pressure.
3. By predicting mutational constraints, ESM2 accurately identifies conserved, functional residues and motifs without relying on sequence alignments, overcoming a key limitation in studying disordered proteins.
4. Experimental validation shows that many conserved motifs identified by ESM2 are essential for phase separation. Mutations within these motifs disrupt MLO formation, highlighting their biological significance.
5. The study introduces a motif-based framework for understanding the “grammar” of IDRs, emphasizing how conserved motifs integrate structural flexibility with functional specificity.
6. This work bridges computational predictions with experimental biology, advancing our understanding of IDRs in phase separation and offering insights into disease-linked protein dysfunction.
7. ESM2 emerges as a powerful tool for investigating the evolutionary and functional landscapes of disordered proteins, with broad implications for molecular biology and synthetic biology.
@binzmit @yumengzhang99
📜Paper: biorxiv.org/content/10.110…
#ProteinScience #Bioinformatics #PhaseSeparation #MachineLearning #IntrinsicallyDisorderedRegions

Bin Zhang @binzmit
Check out our recent manuscript that studies the interactions between small molecules and biomolecular condensates with all-atom simulations: pubs.acs.org/doi/abs/10.102…

Bin Zhang retweeted
Yunrui Qiu @YunruiQ
Our findings shed light on the significant regulatory roles of bio condensate and DNA linker length and help bridge the gap between in vivo and in vitro observations.

Bin Zhang retweeted
Gene Regulation @generegulation
Nucleosome condensate and linker DNA alter chromatin folding pathways and rates [Qiu et al, 2024] biorxiv.org/content/10.110…
▶️ residue-level coarse-grained models; non-Markovian dynamics
▶️ 10n bp DNA linker lengths favor zigzag fibrils
▶️ 10n+5 bp chromatin loses unique conformations

Bin Zhang retweeted
Biology+AI Daily @BiologyAIDaily
Scaling Graph Neural Networks to Large Proteins
1. This paper introduces the DISPEF dataset, specifically designed for benchmarking graph neural networks (GNNs) on large, biologically relevant proteins. DISPEF contains over 200,000 proteins with implicit solvation free energies, a key challenge for molecular modeling.
2. A major innovation is the introduction of a multiscale architecture called “Schake,” which enhances GNN performance on large proteins by incorporating both short-range and long-range interactions, ensuring computational efficiency and transferability across protein sizes.
3. The Schake model leverages two types of message-passing layers: a more accurate SAKE layer for short-range atomic environments and a more efficient SchNet layer for long-range alpha carbon interactions. This mixed design significantly improves both accuracy and speed.
4. DISPEF provides a diverse chemical environment, including both folded and disordered protein regions, making it an ideal dataset for testing GNNs. The inclusion of many-body solvation free energies as targets pushes the limits of model accuracy and generalization.
5. Benchmark results show that Schake consistently outperforms existing GNNs on energy and force predictions for large proteins, while maintaining superior computational efficiency, reducing inference times by up to 2.7× compared to prior models.
6. A key insight from the study is that increasing the cutoff distance in GNNs improves model transferability to larger proteins, but this needs to be balanced against computational cost, which Schake addresses with its hybrid architecture.
7. This work highlights the importance of datasets like DISPEF for driving future GNN innovations and optimizing models for real-world applications like protein folding and drug discovery, where large proteins are common.
8. The paper provides valuable insights for advancing GNN architectures in computational biology, emphasizing the need for both accuracy and efficiency in handling the vast complexity of biomolecular systems.
@binzmit
💻Code: github.com/ZhangGroup-MIT…
📜Paper: arxiv.org/abs/2410.03921

Bin Zhang @binzmit
@Ella_Maru Great question! ChromoGen was trained with only GM12878 data, and we showed that the prediction results for IMR90 cells were equally accurate.

Ella Marushchenko @Ella_Maru
@binzmit Congratulations Dr. Zhang! Have you tested ChromoGen's prediction accuracy across various cell types and tissues beyond the training set?

Bin Zhang @binzmit
I am excited to share our latest #preprint: ChromoGen: Diffusion model predicts single-cell chromatin conformations researchsquare.com/article/rs-463… As the title suggests, we used AI to achieve de novo prediction of 3D chromatin structures from DNA sequence and ATAC-seq data.

Bin Zhang @binzmit
Our review on chromatin organization is online at the Annual Review of Biophysics! Spoiler alert: we took more of a physical chemist's perspective on the problem. annualreviews.org/doi/pdf/10.114…

Bin Zhang @binzmit
Interested in IDPs and biomolecular condensates? Do you ever wonder why IDPs adopt low-complexity domains? How do you map a given protein sequence into stickers and spacers? We attempt to address these questions in our latest preprint. biorxiv.org/content/10.110…

Bin Zhang retweeted
Xin Zhang @kfzhangxin
A fruitful collaboration with Bin Zhang @binzmit reveals the nanometer scale interfacial environment of phase separated condensates. In @eLife: Frustrated Microphase Separation Produces Interfacial Environment within Biological Condensates doi.org/10.7554/eLife.…

Bin Zhang @binzmit
We welcome our new postdoc, Joe Paggi, and grad students Camryn Carter and Ivan Riveros to the group. I am very much looking forward to working with you all.