Felipe Engelberger
@fengel97
972 posts

PhD Student at @MeilerLab @UniLeipzig | Member of the PB³ Lab @pb3_lab | Biochem @Uchile | Founder @DataRoot_CL

Joined October 2011
2K Following · 707 Followers

Pinned Tweet
Felipe Engelberger @fengel97:
I am thrilled to share my first article as a first author. Last semester, while working as a T.A., we implemented a set of 12 tutorials on different bioinformatics topics within Jupyter Notebooks running on Google Colab. pubs.acs.org/doi/10.1021/ac… (1/4)
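
For a flavor of what a Colab-hosted tutorial setup can look like, here is a minimal sketch of the usual first-cell pattern (Colab VMs start clean, so dependencies install at runtime). The package names below are illustrative assumptions, not the article's actual dependency list.

# First cell of a Colab bioinformatics tutorial (illustrative sketch).
# Colab VMs start from a clean image, so dependencies are installed at runtime.
# The specific packages below are assumptions, not the article's actual list.
import subprocess, sys

def pip_install(*packages):
    """Install packages into the running Colab kernel."""
    subprocess.check_call([sys.executable, "-m", "pip", "install", "-q", *packages])

pip_install("biopython", "py3Dmol")

from Bio import SeqIO   # sequence parsing
import py3Dmol          # in-notebook 3D structure viewing

print("Environment ready.")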

Felipe Engelberger retweeted
Brian L Trippe @brianltrippe:
🚨New paper! Generative models are often “miscalibrated”. We calibrate diffusion models, LLMs, and more to meet desired distributional properties. E.g. we finetune protein models to better match the diversity of natural proteins. arxiv.org/abs/2510.10020 github.com/smithhenryd/cgm
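
As a toy illustration of distributional calibration in general (not the paper's actual method), here is a minimal moment-matching sketch: exponentially tilt model samples so the mean of a chosen statistic, e.g. a diversity score, matches a target value. All numbers here are made up for the demo.

# Toy sketch: reweight samples from a generative model so the mean of a
# chosen statistic matches a target value (exponential tilting / moment
# matching). This is a generic illustration, not the paper's method.
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(0)
s = rng.normal(loc=0.5, scale=1.0, size=10_000)  # statistic of each model sample
target_mean = 1.2                                # desired E[s] under natural data

def tilted_mean(lam):
    w = np.exp(lam * (s - s.max()))  # subtract max for numerical stability
    w /= w.sum()
    return np.sum(w * s)

# Solve for the tilt lambda with E_w[s] = target_mean.
lam = brentq(lambda l: tilted_mean(l) - target_mean, -10, 10)
weights = np.exp(lam * (s - s.max())); weights /= weights.sum()
print(f"lambda={lam:.3f}, reweighted mean={np.sum(weights * s):.3f}")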

Felipe Engelberger retweeted
Yo Akiyama @yoakiyama:
Excited to share work with @ZhidianZ, Milot Mirdita, Martin Steinegger, and @sokrypton biorxiv.org/content/10.110… TLDR: We introduce MSA Pairformer, a 111M parameter protein language model that challenges the scaling paradigm in self-supervised protein language modeling 🧵

Felipe Engelberger retweeted
Simón Vidal @SimonVidalV:
@ydeigin @OpenAI @RetroBio_ For every new technology (in this case engineered proteins) you have to first make a PoC.

Taelin @VictorTaelin:
This is definitely not how human knowledge evolves. The brain wasn't designed by gradient descent, it was designed by continuous symbolic evolution of a list of 4-bit tokens. And we don't use gradient descent to think and learn either. We only know one success story of a human-like intelligence being evolved, and it was a discrete algorithm.

Taelin @VictorTaelin:
This tweet will probably be deleted in 512ms because I'm most likely wrong and I don't want to upset people, but I feel like what might ultimately be happening in the AI space is that people (including very smart people) are affected by the illusion that a technology inherently incapable of reasoning will eventually do it. That illusion is fueled by the difficulty the human brain has in grasping large scales: something that has essentially memorized the entire internet is statistically very likely to answer your question intelligently by pure recall, because you are yourself very predictable, and the things you can ask it are most likely close to a space of ideas another human has already had.

This is pushing the AGI labs in this weird "reasoning" direction, which also seems to work because the models are suddenly able to nail math benchmarks. But, again, that's an illusion: even if these questions aren't directly in the dataset (and they probably are), they still lie inside this small space of human ideas. The problem is that we're trying to make models reason precisely because we want them to expand science, and expanding science requires precisely the one thing LLMs can't do: explore a whole new, unexpected space of ideas that doesn't connect to anything we've discussed before. A few years before quantum physics was discovered, its core ideas were completely outside human discourse and thought, and no amount of circling the same box (which is what reasoning models do) would get us there.

So we keep trying to make these models do something they'll never do - invent new science - and that's frustrating because it in turn makes LLMs worse at what they excel at, which is (sorry, but...) being a glorified auto-complete: a bot that, given the human-provided reasoning, goes on to produce the actual boring work. Sonnet is really effective for me precisely because it is very deterministic; it isn't trying to be too smart, and it will just do exactly as I ask. If my instruction is wrong, it will be wrong too, and that's actually a feature. o1, on the other hand, tries to be too smart, which makes it completely chaotic and unreliable when you just want it to follow instructions. Now, probably as a response to o1, I'm almost sure Sonnet-3.6 incorporated some kind of "mini reasoning", which makes it slightly less good for me. I hope Anthropic doesn't keep going in that direction and instead makes Sonnet-4 a natural extension of whatever they did with the original Sonnet-3.5, because a fully deterministic Sonnet-4 with 10x the effective context size would be absolutely groundbreaking for my own work, and certainly far more useful to me than a model that takes a lot of time to spit out objectively worse code.

Tigran Sloyan @TigranSloyan:

o1 pro's math skills are very impressive 😮. Here is o1 pro solving Q3 (the hardest question) from IMO 2006 in 6 minutes and 48 seconds. For contrast, in 2006, out of roughly 500 of the top math kids under 19 in the whole world, only 28 were able to fully solve it... and they had four and a half hours to do so... and no one from the 6-person US team could do it. I've tried this question with every other model (including o1), and this is the first time I've seen an AI model get the answer correct. P.S. Obviously this is a very summarized solution, so I did ask it to show its work, especially on steps 4/5. The expanded thought process was just as impressive.

Felipe Engelberger retweeted
Susana Vazquez Torres @SusanaVazTor:
Thrilled to have shared my personal and scientific journey on the Baker Lab podcast! I hope it serves as an inspiration for aspiring scientists in Latin America.

The Baker Lab Podcast @BakerLabPodcast:

NEW EPISODE 🎧 Susana Vazquez Torres crossed continents to become a scientist. In our lab, she's used AI to create new antitoxins for snakebites. Apple: podcasts.apple.com/us/podcast/the… Spotify: open.spotify.com/episode/580T5E…

Felipe Engelberger retweeted
Sergey Ovchinnikov @sokrypton:
Started teaching again! This time I decided to try using #claude (@AnthropicAI) and @codesandbox for hosting to implement an interactive GREMLIN (Potts) model (w = coevolution, b = conservation) to show students how you go from MSA to contacts! 9kssnq.csb.app
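
For readers who want the gist in code: a minimal NumPy sketch of the MSA-to-contacts idea. GREMLIN proper fits the Potts model (fields b for conservation, couplings w for coevolution) by pseudolikelihood; the cheap approximation below instead inverts a regularized covariance of the one-hot-encoded MSA (mean-field DCA style) and scores residue pairs with a Frobenius norm plus APC.

# Minimal mean-field sketch of MSA -> contact scores. GREMLIN proper fits the
# Potts model (fields b = conservation, couplings w = coevolution) by
# pseudolikelihood; inverting a regularized covariance is a cheap approximation.
import numpy as np

A = 21  # 20 amino acids + gap

def contacts_from_msa(msa):
    """msa: (N_sequences, L_positions) integer array with values in [0, A)."""
    n, L = msa.shape
    onehot = np.eye(A)[msa].reshape(n, L * A)     # (n, L*A) one-hot encoding
    f = onehot.mean(0)                            # single-site frequencies
    cov = (onehot.T @ onehot) / n - np.outer(f, f)
    cov += 0.1 * np.eye(L * A)                    # ridge regularization
    w = -np.linalg.inv(cov)                       # coupling estimate
    w = w.reshape(L, A, L, A)
    # Frobenius norm of each (i, j) coupling block, excluding gap states.
    raw = np.sqrt((w[:, :20, :, :20] ** 2).sum((1, 3)))
    np.fill_diagonal(raw, 0.0)
    # Average-product correction (APC) removes phylogenetic/entropy bias.
    m = raw.mean(0)
    return raw - np.outer(m, m) / raw.mean()

scores = contacts_from_msa(np.random.randint(0, A, size=(500, 30)))
print(scores.shape)  # (30, 30); high-scoring pairs suggest 3D contacts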

Bingqing Cheng @ChengBingqing:
Bothered by the lack of long-range interactions in ML potentials? Meet Latent Ewald Summation—our solution to fix "shortfalls" in short-ranged ML potentials for electrostatic and dielectric systems, with only a modest computational cost! arxiv.org/abs/2408.15165
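
For context, a minimal sketch of the classical long-range piece in question: the reciprocal-space Ewald sum for point charges in a cubic box (Gaussian units). Latent Ewald Summation itself applies this machinery to learned latent charges, which this toy does not attempt.

# Minimal reciprocal-space Ewald term for point charges in a cubic box
# (Gaussian units, 4*pi*eps0 = 1). This is the classical long-range piece
# that short-ranged ML potentials truncate; LES uses learned latent charges,
# which this sketch does not attempt.
import numpy as np

def ewald_recip(pos, q, box, alpha=0.3, kmax=6):
    """pos: (N,3) positions, q: (N,) charges, box: cubic box length."""
    V = box ** 3
    ks = 2 * np.pi / box * np.array(
        [[i, j, k] for i in range(-kmax, kmax + 1)
                   for j in range(-kmax, kmax + 1)
                   for k in range(-kmax, kmax + 1)
                   if (i, j, k) != (0, 0, 0)]
    )
    k2 = (ks ** 2).sum(1)
    # Structure factor S(k) = sum_i q_i exp(i k . r_i)
    S = (q[None, :] * np.exp(1j * ks @ pos.T)).sum(1)
    return (2 * np.pi / V) * np.sum(np.exp(-k2 / (4 * alpha ** 2)) / k2 * np.abs(S) ** 2)

rng = np.random.default_rng(0)
pos = rng.uniform(0, 10.0, (8, 3))
q = np.array([1.0, -1.0] * 4)          # charge-neutral system
print(ewald_recip(pos, q, box=10.0))   # long-range electrostatic energy term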

Felipe Engelberger retweeted
Yunha Hwang @Micro_Yunha:
7/ In particular, we showcase gLM2's ability to directly learn coevolutionary signal in protein-protein interfaces with no supervision! The learned contact maps can be extracted using @ZhidianZ et al.'s categorical Jacobian method.
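
A minimal sketch of the categorical Jacobian idea under stated assumptions: `model` below is a hypothetical stand-in for any language model that returns per-position logits. Substitute every amino acid at position i, record how the logits everywhere else move, then reduce each (i, j) block with a Frobenius norm plus APC, mirroring the Potts-style contact extraction above.

# Sketch of the categorical Jacobian idea: how much do the logits at position j
# move when you substitute each amino acid at position i? `model` below is a
# hypothetical stand-in for any protein/genomic LM returning per-position logits.
import numpy as np

A = 20  # amino-acid alphabet size

def model(tokens):
    """Hypothetical LM: (L,) int tokens -> (L, A) logits. Replace with a real one."""
    rng = np.random.default_rng(int(tokens.sum()))  # deterministic dummy
    return rng.normal(size=(len(tokens), A))

def categorical_jacobian(seq):
    L = len(seq)
    base = model(seq)                         # (L, A) reference logits
    J = np.zeros((L, A, L, A))
    for i in range(L):
        for a in range(A):
            mutant = seq.copy(); mutant[i] = a
            J[i, a] = model(mutant) - base    # logit shifts at every position j
    scores = np.sqrt((J ** 2).sum((1, 3)))    # Frobenius norm per (i, j) block
    np.fill_diagonal(scores, 0.0)
    m = scores.mean(0)                        # average-product correction
    return scores - np.outer(m, m) / scores.mean()

print(categorical_jacobian(np.random.randint(0, A, size=12)).shape)  # (12, 12)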

Felipe Engelberger retweeted
Yunha Hwang @Micro_Yunha:
4/ Metagenomes feature significant bias and redundancy, but deduplicating large genomic databases can be computationally expensive. We implement *genomic SemDeDup*, an embedding-based deduplication for genomic sequences, enabling tunable balancing and pruning of the corpus.
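
A minimal sketch of the SemDeDup recipe the tweet adapts: cluster precomputed sequence embeddings with k-means, then within each cluster drop items whose cosine similarity to an already-kept item exceeds a threshold; that threshold is the tunable pruning knob. Cluster count and threshold here are illustrative.

# Minimal SemDeDup-style sketch: cluster embeddings with k-means, then within
# each cluster drop items too cosine-similar to an already-kept item. The
# similarity threshold is the tunable pruning knob the tweet refers to.
# Assumes sequence embeddings have already been computed by some encoder.
import numpy as np
from sklearn.cluster import KMeans

def semdedup(emb, n_clusters=10, threshold=0.95, seed=0):
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)  # unit-normalize
    labels = KMeans(n_clusters=n_clusters, random_state=seed, n_init=10).fit_predict(emb)
    keep = []
    for c in range(n_clusters):
        kept = []
        for i in np.where(labels == c)[0]:
            # Keep i only if it is not a near-duplicate of anything kept so far.
            if not kept or (emb[i] @ emb[kept].T).max() < threshold:
                kept.append(i)
        keep.extend(kept)
    return np.sort(np.array(keep))

emb = np.random.default_rng(0).normal(size=(1000, 64))
kept = semdedup(emb, threshold=0.9)
print(f"kept {len(kept)} / 1000 sequences")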

Felipe Engelberger retweeted
Daniella Pretorius @daniellapret:
Amazing time at the @ImperialX_AI open day - super cool facilities and great content 😆. Excited to receive a poster prize at the end! Electronics inspired by @bradyajohnston, who gave great advice on making sure the iPad stayed stuck! Animation code taken from @sokrypton 😎

Felipe Engelberger retweeted
Moritz Ertelt @ErteltMoritz:
We recently created some new tools in Rosetta around ML protein design methods (including ProteinMPNN, ESM2, MIF-ST). You can run all of this using `docker run -it rosettacommons/rosetta:ml` (no Python involved). We then benchmarked the different methods: biorxiv.org/content/10.110…

Felipe Engelberger retweeted
Ramith Hettiarachchi @ramith__:
Had a great time at the ML for Drug Discovery Summer School & MoML conference in Montréal (my first time in 🇨🇦!). Kudos to the organizers for arranging such a great lineup of talks & labs! Thanks to the hackathon, I got some hands-on experience with the concepts taught 👨🏼‍💻

Lauren Porter @Lauren_L_Porter:
Good question 👇

Shubhendu Trivedi @_onionesque:
We are working on custom CUDA implementations of the work of Passaro & Zitnick, with the Gaunt product next in progress. I will post if and when there are updates.