Pinned Tweet
tbepler
131 posts

tbepler
@tbepler1
Scientist and Group Leader of the Simons Machine Learning Center @SEMC_NYSBC. Co-founder and CEO of https://t.co/YcCw0bpHZo. Opinions are my own.
Joined October 2018
152 Following · 645 Followers
tbepler retweeted

We’re excited to announce our expanded partnership with Boehringer Ingelheim. Together, we are building the future of AI‑driven antibody discovery and optimization.
openprotein.ai/strategic-part…

@PatrickSoga @openprotein Yes, we will probably take 1-2 interns next year. If you're interested feel free to send us your CV

@tbepler1 @openprotein Are you taking any interns this spring/summer?

New job openings @openprotein across protein foundation model research, computational protein design, and cloud platform engineering
openprotein.ai/careers

@RippaSatss @openprotein We'll consider anyone with relevant skills who's motivated. Send us your CV and tell us more about what you've worked on!

Quick question: Do you consider non-PhD candidates with competition-proven protein design experience?
- 50 proteins designed and ranked #386/681 (Nipah Binder Competition - ProteinBase) with only a laptop.
- #207/337 globally (ARC Virtual Cell Challenge)
- Drug discovery platforms: 1,192+ molecules validated
I know I don't have the PhD, but I've built what you're building. Thoughts?
You need builders who ship. I ship. Let's talk.

@AllThingsApx proceedings.neurips.cc/paper_files/pa… <- from NeurIPS 2023.
Also check out openreview.net/forum?id=GZ7gw… <- from ICML this year

Quietly, this paper from earlier this month reset the trajectory of protein language modeling.
Retrieval-at-inference (MSAs) + pair-representation beats 10–100× larger single-sequence LMs on contacts, PPIs, variant effects.
Not just a benchmark win - a blueprint: query-biased MSAs, explicit pair geometry, stress-tested reliability.
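
For intuition, here is a minimal Python sketch of the retrieval-at-inference pattern described above, under stated assumptions: `search_homologs` stands in for a real MSA search tool (e.g. a jackhmmer wrapper) and `score` for a retrieval-conditioned protein language model's conditional log-likelihood. Both names are hypothetical, not any paper's actual API.

```python
from typing import Callable, Dict, List

def seq_identity(a: str, b: str) -> float:
    """Fraction of matching positions over the shorter sequence (crude proxy)."""
    n = min(len(a), len(b))
    return sum(x == y for x, y in zip(a, b)) / n if n else 0.0

def score_variants(
    query: str,
    variants: List[str],
    search_homologs: Callable[[str], List[str]],  # hypothetical: e.g. a jackhmmer wrapper
    score: Callable[[List[str], str], float],     # hypothetical: context-conditioned log-likelihood
) -> Dict[str, float]:
    """Score variants conditioned on homologs retrieved for *this* query at inference."""
    # Retrieval happens at inference time, so sequences discovered after
    # training can still enter the model's context.
    homologs = search_homologs(query)
    # Query-biased context: closest homologs first, duplicates dropped.
    context = sorted(set(homologs), key=lambda s: seq_identity(query, s), reverse=True)
    # Each variant is scored conditioned on the same retrieved context.
    return {v: score(context, v) for v in variants}
```

The key design point is that the context is rebuilt per query rather than fixed at training time, which is what lets a small model punch above its parameter count.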

PoET-2 is open source on GitHub: github.com/OpenProteinAI/…
Thanks to the @openprotein team!

Our preprint on sequence-to-property learning and zero-shot fitness prediction with PoET-2 is live: arxiv.org/abs/2508.04724
tbepler retweeted

Understanding Protein Function with a Multimodal Retrieval-Augmented Foundation Model
1. PoET-2, a new protein language model, achieves state-of-the-art performance in predicting the effects of mutations on protein function, especially for challenging cases like insertions/deletions and higher-order mutations. This model combines sequence, structure, and evolutionary information in a novel way to improve protein understanding and design capabilities.
2. The model incorporates a hierarchical transformer encoder and dual decoders with both causal and masked language modeling objectives. This dual training approach allows PoET-2 to excel in both generative tasks (like sequence generation) and bidirectional representation learning, making it versatile for various protein-related tasks.
3. PoET-2 leverages retrieval augmentation, which enables it to learn from context and incorporate new sequences not present in the original training data. This feature enhances its ability to adapt to different protein families and their specific evolutionary constraints, leading to more accurate predictions.
4. In zero-shot variant effect prediction, PoET-2 outperforms previous models significantly, especially on datasets involving multiple mutations and indels. It also shows superior performance in supervised settings with limited data, demonstrating excellent data efficiency and generalization ability.
5. The model's architecture includes a structure-based attention bias mechanism, which integrates structural information directly into the attention operations. This helps the model capture 3D structural relationships, improving performance on tasks tied to protein structure and function (a minimal sketch of the idea follows after this thread).
6. PoET-2 is compact, with only 182 million parameters. Despite its small size, it matches or exceeds the performance of much larger models, making it practical and scalable for real-world protein engineering and design.
7. The authors demonstrate PoET-2's effectiveness across various benchmarks, including deep mutational scanning and clinical datasets. The model's ability to predict the fitness effects of mutations accurately can accelerate the development of new therapeutics and enhance our understanding of disease mechanisms.
📜Paper: arxiv.org/abs/2508.04724
#ProteinEngineering #AIinBiology #MachineLearning #ProteinFunction #MutationPrediction
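
For readers who want the mechanics behind item 5, here is a minimal PyTorch sketch of a structure-biased attention layer. It is one plausible instance of the general idea, not PoET-2's exact design: pairwise Cα distances are featurized with radial basis functions and projected to a per-head additive bias on the attention logits.

```python
import torch
import torch.nn as nn

class DistanceBiasedAttention(nn.Module):
    """Self-attention with an additive bias derived from pairwise Ca distances.

    A sketch of the general technique, assuming RBF distance features and one
    learned scalar bias per attention head.
    """

    def __init__(self, dim: int, n_heads: int, n_rbf: int = 16, d_max: float = 20.0):
        super().__init__()
        self.n_heads, self.d_head = n_heads, dim // n_heads
        self.qkv = nn.Linear(dim, 3 * dim)
        self.out = nn.Linear(dim, dim)
        self.register_buffer("centers", torch.linspace(0.0, d_max, n_rbf))
        self.bias_proj = nn.Linear(n_rbf, n_heads)

    def forward(self, x: torch.Tensor, coords: torch.Tensor) -> torch.Tensor:
        # x: (B, L, dim) residue embeddings; coords: (B, L, 3) Ca coordinates
        B, L, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q, k, v = (t.view(B, L, self.n_heads, self.d_head).transpose(1, 2)
                   for t in (q, k, v))

        # Pairwise distances -> RBF features -> per-head additive bias
        dist = torch.cdist(coords, coords)                          # (B, L, L)
        rbf = torch.exp(-((dist[..., None] - self.centers) ** 2))   # (B, L, L, n_rbf)
        bias = self.bias_proj(rbf).permute(0, 3, 1, 2)              # (B, H, L, L)

        attn = (q @ k.transpose(-2, -1)) / self.d_head ** 0.5 + bias
        y = (attn.softmax(dim=-1) @ v).transpose(1, 2).reshape(B, L, -1)
        return self.out(y)
```

A full model would also need to mask padding and handle residues with missing coordinates; the point here is only that structure enters as a bias on attention logits rather than as extra tokens.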

PoET-2 (and more) is available now on the @openprotein cloud.
We also open sourced PoET-2 on github: github.com/OpenProteinAI/…
Thanks to @timt1630 and the rest of the @openprotein team!

Learn more in our pre-print: arxiv.org/abs/2508.04724
or blog post: openprotein.ai/a-multimodal-f…

If you're interested in the PLM used here, check out PoET-2 from @openprotein! It allows controllable protein generation with sequence (homologue) and optional structure conditioning and is available right now. 🦾
Boris Power @BorisMPower
At @OpenAI, we believe that AI can accelerate science and drug discovery. An exciting example is our work with @RetroBiosciences, where a custom model designed improved variants of the Nobel-prize winning Yamanaka proteins. Today we published a closer look at the breakthrough. ⬇️

Nice article about @lab_berger's new work on interpreting PLMs!
news.mit.edu/2025/researche…
tbepler retweeted

@sokrypton @ferruz_noelia @ruben_weitzman @NotinPascal @yoakiyama I got the 0.45 number from this table

@tbepler1 @ferruz_noelia @ruben_weitzman @NotinPascal @yoakiyama See SI. Even when we break it down by category type, MSA Pairformer appears to do better than MSA Transformer and ESM2.

I’ve thoroughly enjoyed reading two (VERY!) recent papers that model protein sequences by retrieving evolutionary information (dynamically) at inference time, and there's a lot to unpack!
[1] arxiv.org/abs/2506.08954
[2] biorxiv.org/content/10.110…
(1/n)

@sokrypton @ferruz_noelia @ruben_weitzman @NotinPascal @yoakiyama I'm not critiquing your internal comparison here, but rather pointing out that your Avg. is not comparable with the main number on the ProteinGym benchmark, since it isn't aggregated the same way

@sokrypton @ferruz_noelia @ruben_weitzman @NotinPascal @yoakiyama The average Spearman in ProteinGym is a macro average over the Spearman correlations by function, since some functions have significantly more datasets
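
To make the aggregation difference concrete, here is a minimal Python sketch of that macro average using scipy; the `results` layout (dataset → function category, predictions, labels) is illustrative, not ProteinGym's actual file format.

```python
from collections import defaultdict
from scipy.stats import spearmanr

def macro_avg_spearman(results):
    """results: {dataset_name: (function_category, preds, labels)} (illustrative layout)."""
    by_function = defaultdict(list)
    for category, preds, labels in results.values():
        rho, _ = spearmanr(preds, labels)  # per-dataset Spearman correlation
        by_function[category].append(rho)
    # Average within each function category first, then across categories,
    # so functions with many datasets don't dominate the headline number.
    per_function = [sum(rs) / len(rs) for rs in by_function.values()]
    return sum(per_function) / len(per_function)
```

A plain mean over all datasets (a micro average) would weight function categories by their dataset counts, which is exactly the mismatch being pointed out here.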

@tbepler1 @ferruz_noelia @ruben_weitzman @NotinPascal @yoakiyama Can you expand on this? What do you mean by properly averaging?
I believe we computed the values the same across the methods we show, so it should be a fair comparison. Unless we made a mistake?
