Benjamin Perry

226 posts

@bots_and_bits

Designing enzymes with bots and bits! | Romero Lab at Duke

Durham, NC · Joined May 2024
206 Following · 273 Followers
Benjamin Perry @bots_and_bits
@samsinai Any suggestions for benchmarks that can reveal these trends?
Sam Sinai @samsinai
It's much worse in unsupervised biological models, particularly sequence-only. So much baseless story-telling about (most) models that have at best learned to play nearest neighbors and get lucky by recombining additive motifs.
François Chollet @fchollet

This is more evidence that current frontier models remain completely reliant on content-level memorization, as opposed to higher-level generalizable knowledge (such as metalearning knowledge, problem-solving strategies...)

Pranam Chatterjee @pranamanam
We're super excited to have BranchSBM published at #ICLR2026!! 🌳🧫🇧🇷 I am so proud of the team! 📷 Camera-Ready Paper: arxiv.org/abs/2506.09007 💻 GitHub: github.com/sophtang/Branc… 📹 Sophia's Presentation: youtube.com/watch?v=inVYA0… As you may remember, @_sophia_tang_ (alongside our lab's FIRST ever PhD graduate, @yinuo_z98!! 👩‍🎓) elegantly showed that by learning diverging velocity fields and growth dynamics (via decomposing the transport into multiple unbalanced Schrödinger bridges), we can get probability mass to split across branches so a single initial state (like a progenitor cell type) can generate complex multi-modal trajectories (i.e., that of terminally differentiated states). Sophia does a wonderful job explaining the new results that we're presenting in our camera-ready version below! 👇 Please come and support her and the team at our poster in Brazil! 🇧🇷
Sophia Tang @_sophia_tang_

Our paper, “Branched Schrödinger Bridge Matching” (BranchSBM), has been accepted as a main conference paper at #ICLR2026 in Rio! 🌳🧫🇧🇷 In the camera-ready version, we include a new experiment scaling BranchSBM to 11 branches on cell differentiation data! 📷 Check out our freshly updated project page and GitHub repo below 👇🏻
🌳 Project Page: sophtang.github.io/branch-sbm
📄 Camera-Ready Paper: arxiv.org/abs/2506.09007
💻 GitHub: github.com/sophtang/Branc…
📹 Reading Group Presentation: youtu.be/inVYA0pQ4Wg?si…
Branching is ubiquitous in many dynamical systems, including cell differentiation into distinct fates, diverging cellular responses to drug perturbations, and population dynamics. 🧫 But existing flow matching and SBM frameworks approximate multi-modal distributions by simulating many independent particle trajectories, which are susceptible to mode collapse, with particles concentrating on dominant high-density modes or traversing only low-energy intermediate paths. To address this challenge, we introduce 🌳 BranchSBM 🌳, a framework that learns a set of diverging velocity fields to reconstruct multi-modal target distributions while simultaneously learning growth networks that allocate mass across branches.
🌳 Our key idea was to define the Branched Schrödinger Bridge Problem as the sum of unbalanced generalized Schrödinger bridge problems, where the weight determines the redistribution of mass across each branch over time.
🌳 We introduce a multi-stage training algorithm to learn the optimal branching drift and growth fields that transport mass along a branched trajectory. This allows BranchSBM to capture diverging, energy-minimizing dynamics without requiring intermediate-time supervision and to generate the full branched evolution from a single initial sample.
🌳 We demonstrate the unique capability of BranchSBM to model dynamic branching trajectories in real-world settings, from differentiating single-cell population dynamics (up to 11 branches!) to simulating diverging cellular responses to drug perturbation.
On an unrelated note, I wanted to take this post to congratulate my inspiring and endlessly supportive research mentor, @yinuo_z98, who just defended her PhD and is officially a PhD graduate!! 👩🏻‍🎓 We’re super excited to present BranchSBM in Rio this April 🇧🇷, along with new workshop papers to be announced! And of course, very grateful for the support from @AlexanderTong7 and @pranamanam 💫

Benjamin Perry @bots_and_bits
Multi-objective phage-assisted continuous evolution. Being able to generate large parallel datasets to map and engineer complex fitness landscapes at scale is BIG. Check out @bffswithbiology's work!🧪
Ryan Boileau @bffswithbiology

Aaaand it’s online ahhhhh!!! 🥳🥳 So excited!! The first glimpse of my postdoc work with @chorye @dukecagt. Here, @stefanmgolas and I developed TurboPRANCE, an open-source robotics platform for rapid and scaled phage-assisted continuous evolutions. 🧪Tweetorial party!👇1/n

Yunha Hwang @Micro_Yunha
For a typical microbial genome, all-vs-all PPI prediction with AlphaFold3 would take hundreds of GPU-years. With FlashPPI, we can scale molecular interaction prediction across diverse, non-model microbial genomes, unlocking truly scalable discovery. We deployed FlashPPI on Seqhub.org for intuitive and rapid exploration of PPI networks. Give it a spin!
Yunha Hwang @Micro_Yunha
Protein–protein interactions (PPIs) are key to discovering and interpreting new biological functions. We’re excited to introduce 𝑭𝒍𝒂𝒔𝒉𝑷𝑷𝑰: a new application of gLM2 that uses genomic language modeling to predict proteome-wide PPIs in microbial genomes in minutes.
Benjamin Perry @bots_and_bits
@DhuviKarthikey1 sounds like you should use a claude cowork scheduled task to brief your lab on AI tool updates
Benjamin Perry retweeted
dhuv.io @DhuviKarthikey1
Gave a presentation last week to the lab on using AI tools and it’s half outdated alr 🫠🥴
Benjamin Perry retweeted
Christian Dallago @sacdallago
Five years ago, we released FLIP. The core question was: can ML models for protein fitness prediction generalize in the ways that actually matter for protein engineering, i.e. low data, extrapolation to more mutations, out-of-distribution sequences?
fajie yuan @duguyuan
@bots_and_bits After training on Colab, one can easily share their model on Hugging Face with just one click. Others can directly use these shared models on our Colab platform, or re-train them with their own data, and then share them back to the Hub. Everything can be done with a few clicks.
fajie yuan @duguyuan
Want to fine-tune protein language models but don't have ML experience? 💻❌ We've got you covered! 📢 ✅ Previously: ColabSaprot & ColabSeprot (ESM1/2, ProTrek, ProtBert) 🆕 Now Available: ColabESMC & ColabESM3 Links➡️github.com/westlake-repl/… Tutorial➡️youtube.com/watch?v=nmLtjl…
fajie yuan @duguyuan

ColabSaprot & SaprotHub are now in @NatureBiotech! 🧬 A user-friendly, no-code platform for training, sharing, and collaborating on protein language models. We also provide ColabSeprot, integrating ESM1b, ESM2, ProTrek, and ProtBert for the community. nature.com/articles/s4158…

Andrew White 🐦‍⬛ @andrewwhite01
After a few years of procrastination, I've updated my textbook. Changes:
1. TensorFlow -> PyTorch
2. Dark mode
3. Added scaffold split section
4. Fixed many typos
Vince Tran @tranvinq
Excited to see this out! Grateful to Patrick for his mentorship and support throughout the arc of this story. We look forward to seeing what the community will engineer with MULTI-evolve!
Patrick Hsu @pdhsu

Delighted to share new @arcinstitute work from our group on AI-accelerated lab-in-the-loop, in @ScienceMagazine today. One of the most remarkable things about biology is that it's digital. DNA, RNA, proteins: these are all sequences, and their function is directly encoded in their sequence of letters. But a protein of length N has 20^N possible variants and the vast majority are non-functional. Evolution spent billions of years finding the functional needles in this haystack through random exploration and natural selection. For modern biomedicine, we need to solve this in days to weeks.

Benjamin Perry @bots_and_bits
@kabirhbiswas @romerolab1 Hi Kabir! Good question. We separate MSA + Folding in our method; however, folding is still needed. So any complex that would cause VRAM issues in the default AF3 folding pipeline would encounter the same issues here.
Kabir H Biswas, PhD @kabirhbiswas
@romerolab1 Congratulations to you and the team! Any chance that these improvements will also allow prediction of larger structures/complexes with the same GPU RAM?
Romero lab @romerolab1
AlphaFold 3 is a game-changer for biomolecular modeling, but the CPU-bound MSA bottleneck is a major hurdle for high-throughput discovery. Today, Romero lab introduces AlphaFast: our new framework that delivers a 22.8x speedup in AF3 inference on a single GPU. 🚀 1/5
Benjamin Perry @bots_and_bits
@QosmosChem Yes. We wanted to start with AF3 but the general framework should apply to any folding model!
Benjamin Perry @bots_and_bits
AlphaFold 3 just got a massive speed boost. 🚀 We’re introducing AlphaFast: a GPU-accelerated framework that cuts AF3 inference from >10 mins to ~25 seconds on a single GPU (a 22.8x speedup) without losing structural accuracy. More details below! 1/6 🧵