Felipe Engelberger
@fengel97
972 posts

PhD Student at @MeilerLab @UniLeipzig | Member of the PB³ Lab @pb3_lab | Biochem @Uchile | Founder @DataRoot_CL

Joined October 2011
2K Following · 707 Followers

Pinned Tweet
Felipe Engelberger @fengel97:
I am thrilled to share my first article as a first author. Last semester, while working as a T.A., we implemented a set of 12 tutorials on different bioinformatics topics within Jupyter Notebooks running on Google Colab. pubs.acs.org/doi/10.1021/ac… (1/4)
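
For a flavor of what a Colab-hosted tutorial setup can look like, here is a minimal sketch of the usual first-cell pattern (Colab VMs start clean, so dependencies install at runtime). The package names below are illustrative assumptions, not the article's actual dependency list.

# First cell of a Colab bioinformatics tutorial (illustrative sketch).
# Colab VMs start from a clean image, so dependencies are installed at runtime.
# The specific packages below are assumptions, not the article's actual list.
import subprocess, sys

def pip_install(*packages):
    """Install packages into the running Colab kernel."""
    subprocess.check_call([sys.executable, "-m", "pip", "install", "-q", *packages])

pip_install("biopython", "py3Dmol")

from Bio import SeqIO   # sequence parsing
import py3Dmol          # in-notebook 3D structure viewing

print("Environment ready.")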

Felipe Engelberger retweeted
Brian L Trippe @brianltrippe:
🚨New paper! Generative models are often “miscalibrated”. We calibrate diffusion models, LLMs, and more to meet desired distributional properties. E.g. we finetune protein models to better match the diversity of natural proteins. arxiv.org/abs/2510.10020 github.com/smithhenryd/cgm
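
As a toy illustration of distributional calibration in general (not the paper's actual method), here is a minimal moment-matching sketch: exponentially tilt model samples so the mean of a chosen statistic, e.g. a diversity score, matches a target value. All numbers here are made up for the demo.

# Toy sketch: reweight samples from a generative model so the mean of a
# chosen statistic matches a target value (exponential tilting / moment
# matching). This is a generic illustration, not the paper's method.
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(0)
s = rng.normal(loc=0.5, scale=1.0, size=10_000)  # statistic of each model sample
target_mean = 1.2                                # desired E[s] under natural data

def tilted_mean(lam):
    w = np.exp(lam * (s - s.max()))  # subtract max for numerical stability
    w /= w.sum()
    return np.sum(w * s)

# Solve for the tilt lambda with E_w[s] = target_mean.
lam = brentq(lambda l: tilted_mean(l) - target_mean, -10, 10)
weights = np.exp(lam * (s - s.max())); weights /= weights.sum()
print(f"lambda={lam:.3f}, reweighted mean={np.sum(weights * s):.3f}")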

Felipe Engelberger retweeted
Yo Akiyama @yoakiyama:
Excited to share work with @ZhidianZ, Milot Mirdita, Martin Steinegger, and @sokrypton biorxiv.org/content/10.110… TLDR: We introduce MSA Pairformer, a 111M parameter protein language model that challenges the scaling paradigm in self-supervised protein language modeling 🧵

Felipe Engelberger retweeted
Simón Vidal @SimonVidalV:
@ydeigin @OpenAI @RetroBio_ For every new technology (in this case engineered proteins) you have to first make a PoC.

Taelin @VictorTaelin:
This is definitely not how human knowledge evolves. The brain wasn't designed by gradient descent, it was designed by continuous symbolic evolution of a list of 4-bit tokens. And we don't use gradient descent to think and learn either. We only know one success story of a human-like intelligence being evolved, and it was a discrete algorithm.

Taelin @VictorTaelin:
This tweet will probably be deleted in 512ms because I'm most likely wrong and I don't want to upset people, but I feel like what might ultimately be happening in the AI space is that people (including very smart people) are affected by the illusion that a technology inherently incapable of reasoning will eventually do it. That illusion is fueled by the difficulty the human brain has in grasping large scales: something that has essentially memorized the entire internet is statistically very likely to answer your question intelligently by pure recall, because you are yourself very predictable, and the things you can ask it are most likely close to a space of ideas another human has already had.

This is pushing the AGI labs in this weird "reasoning" direction, which also seems to work because the models are suddenly able to nail math benchmarks. But, again, that's an illusion: even if these questions aren't directly in the dataset (and they probably are), they still lie inside this small space of human ideas. The problem is that we're trying to make models reason precisely because we want them to expand science, and expanding science requires precisely the one thing LLMs can't do: explore a whole new, unexpected space of ideas that doesn't connect to anything we've discussed before. A few years before quantum physics was discovered, its core ideas were completely outside human discourse and thought, and no amount of circling the same box (which is what reasoning models do) would get us there.

So we keep trying to make these models do something they'll never do - invent new science - and that's frustrating because it in turn makes LLMs worse at what they excel at, which is (sorry, but...) being a glorified auto-complete: a bot that, given the human-provided reasoning, goes on to produce the actual boring work. Sonnet is really effective for me precisely because it is very deterministic; it isn't trying to be too smart, and it will just do exactly as I ask. If my instruction is wrong, it will be wrong too, and that's actually a feature. o1, on the other hand, tries to be too smart, which makes it completely chaotic and unreliable when you just want it to follow instructions. Now, probably as a response to o1, I'm almost sure Sonnet-3.6 incorporated some kind of "mini reasoning", which makes it slightly less good for me. I hope Anthropic doesn't keep going in that direction and instead makes Sonnet-4 a natural extension of whatever they did with the original Sonnet-3.5, because a fully deterministic Sonnet-4 with 10x the effective context size would be absolutely groundbreaking for my own work, and certainly far more useful to me than a model that takes a lot of time to spit out objectively worse code.

Tigran Sloyan @TigranSloyan:

o1 pro's math skills are very impressive 😮. Here is o1 pro solving Q3 (the hardest question) from IMO 2006 in 6 minutes and 48 seconds. For contrast, in 2006, out of roughly 500 of the top math kids under 19 in the whole world, only 28 were able to fully solve it... and they had four and a half hours to do so... and no one from the 6-person US team could do it. I've tried this question with every other model (including o1), and this is the first time I've seen an AI model get the answer correct. P.S. Obviously this is a very summarized solution, so I did ask it to show its work, especially on steps 4/5. The expanded thought process was just as impressive.

Felipe Engelberger retweeted
Susana Vazquez Torres @SusanaVazTor:
Thrilled to have shared my personal and scientific journey on the Baker Lab podcast! I hope it serves as an inspiration for aspiring scientists in Latin America.

The Baker Lab Podcast @BakerLabPodcast:

NEW EPISODE 🎧 Susana Vazquez Torres crossed continents to become a scientist. In our lab, she's used AI to create new antitoxins for snakebites. Apple: podcasts.apple.com/us/podcast/the… Spotify: open.spotify.com/episode/580T5E…

Felipe Engelberger retweeted
Sergey Ovchinnikov @sokrypton:
Started teaching again! This time I decided to try using #claude (@AnthropicAI) and @codesandbox for hosting to implement an interactive GREMLIN (Potts) model (w = coevolution, b = conservation) to show students how you go from MSA to contacts! 9kssnq.csb.app
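
For readers who want the gist in code: a minimal NumPy sketch of the MSA-to-contacts idea. GREMLIN proper fits the Potts model (fields b for conservation, couplings w for coevolution) by pseudolikelihood; the cheap approximation below instead inverts a regularized covariance of the one-hot-encoded MSA (mean-field DCA style) and scores residue pairs with a Frobenius norm plus APC.

# Minimal mean-field sketch of MSA -> contact scores. GREMLIN proper fits the
# Potts model (fields b = conservation, couplings w = coevolution) by
# pseudolikelihood; inverting a regularized covariance is a cheap approximation.
import numpy as np

A = 21  # 20 amino acids + gap

def contacts_from_msa(msa):
    """msa: (N_sequences, L_positions) integer array with values in [0, A)."""
    n, L = msa.shape
    onehot = np.eye(A)[msa].reshape(n, L * A)     # (n, L*A) one-hot encoding
    f = onehot.mean(0)                            # single-site frequencies
    cov = (onehot.T @ onehot) / n - np.outer(f, f)
    cov += 0.1 * np.eye(L * A)                    # ridge regularization
    w = -np.linalg.inv(cov)                       # coupling estimate
    w = w.reshape(L, A, L, A)
    # Frobenius norm of each (i, j) coupling block, excluding gap states.
    raw = np.sqrt((w[:, :20, :, :20] ** 2).sum((1, 3)))
    np.fill_diagonal(raw, 0.0)
    # Average-product correction (APC) removes phylogenetic/entropy bias.
    m = raw.mean(0)
    return raw - np.outer(m, m) / raw.mean()

scores = contacts_from_msa(np.random.randint(0, A, size=(500, 30)))
print(scores.shape)  # (30, 30); high-scoring pairs suggest 3D contacts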

Bingqing Cheng @ChengBingqing:
Bothered by the lack of long-range interactions in ML potentials? Meet Latent Ewald Summation—our solution to fix "shortfalls" in short-ranged ML potentials for electrostatic and dielectric systems, with only a modest computational cost! arxiv.org/abs/2408.15165
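
For context, a minimal sketch of the classical long-range piece in question: the reciprocal-space Ewald sum for point charges in a cubic box (Gaussian units). Latent Ewald Summation itself applies this machinery to learned latent charges, which this toy does not attempt.

# Minimal reciprocal-space Ewald term for point charges in a cubic box
# (Gaussian units, 4*pi*eps0 = 1). This is the classical long-range piece
# that short-ranged ML potentials truncate; LES uses learned latent charges,
# which this sketch does not attempt.
import numpy as np

def ewald_recip(pos, q, box, alpha=0.3, kmax=6):
    """pos: (N,3) positions, q: (N,) charges, box: cubic box length."""
    V = box ** 3
    ks = 2 * np.pi / box * np.array(
        [[i, j, k] for i in range(-kmax, kmax + 1)
                   for j in range(-kmax, kmax + 1)
                   for k in range(-kmax, kmax + 1)
                   if (i, j, k) != (0, 0, 0)]
    )
    k2 = (ks ** 2).sum(1)
    # Structure factor S(k) = sum_i q_i exp(i k . r_i)
    S = (q[None, :] * np.exp(1j * ks @ pos.T)).sum(1)
    return (2 * np.pi / V) * np.sum(np.exp(-k2 / (4 * alpha ** 2)) / k2 * np.abs(S) ** 2)

rng = np.random.default_rng(0)
pos = rng.uniform(0, 10.0, (8, 3))
q = np.array([1.0, -1.0] * 4)          # charge-neutral system
print(ewald_recip(pos, q, box=10.0))   # long-range electrostatic energy term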

Felipe Engelberger retweeted
Yunha Hwang @Micro_Yunha:
7/ In particular, we showcase gLM2's ability to directly learn coevolutionary signal in protein-protein interfaces with no supervision! The learned contact maps can be extracted using @ZhidianZ et al.'s categorical Jacobian method.
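
A minimal sketch of the categorical Jacobian idea under stated assumptions: `model` below is a hypothetical stand-in for any language model that returns per-position logits. Substitute every amino acid at position i, record how the logits everywhere else move, then reduce each (i, j) block with a Frobenius norm plus APC, mirroring the Potts-style contact extraction above.

# Sketch of the categorical Jacobian idea: how much do the logits at position j
# move when you substitute each amino acid at position i? `model` below is a
# hypothetical stand-in for any protein/genomic LM returning per-position logits.
import numpy as np

A = 20  # amino-acid alphabet size

def model(tokens):
    """Hypothetical LM: (L,) int tokens -> (L, A) logits. Replace with a real one."""
    rng = np.random.default_rng(int(tokens.sum()))  # deterministic dummy
    return rng.normal(size=(len(tokens), A))

def categorical_jacobian(seq):
    L = len(seq)
    base = model(seq)                         # (L, A) reference logits
    J = np.zeros((L, A, L, A))
    for i in range(L):
        for a in range(A):
            mutant = seq.copy(); mutant[i] = a
            J[i, a] = model(mutant) - base    # logit shifts at every position j
    scores = np.sqrt((J ** 2).sum((1, 3)))    # Frobenius norm per (i, j) block
    np.fill_diagonal(scores, 0.0)
    m = scores.mean(0)                        # average-product correction
    return scores - np.outer(m, m) / scores.mean()

print(categorical_jacobian(np.random.randint(0, A, size=12)).shape)  # (12, 12)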

Felipe Engelberger retweeted
Yunha Hwang @Micro_Yunha:
4/ Metagenomes feature significant bias and redundancy, but deduplicating large genomic databases can be computationally expensive. We implement *genomic SemDeDup*, an embedding-based deduplication for genomic sequences, enabling tunable balancing and pruning of the corpus.
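
A minimal sketch of the SemDeDup recipe the tweet adapts: cluster precomputed sequence embeddings with k-means, then within each cluster drop items whose cosine similarity to an already-kept item exceeds a threshold; that threshold is the tunable pruning knob. Cluster count and threshold here are illustrative.

# Minimal SemDeDup-style sketch: cluster embeddings with k-means, then within
# each cluster drop items too cosine-similar to an already-kept item. The
# similarity threshold is the tunable pruning knob the tweet refers to.
# Assumes sequence embeddings have already been computed by some encoder.
import numpy as np
from sklearn.cluster import KMeans

def semdedup(emb, n_clusters=10, threshold=0.95, seed=0):
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)  # unit-normalize
    labels = KMeans(n_clusters=n_clusters, random_state=seed, n_init=10).fit_predict(emb)
    keep = []
    for c in range(n_clusters):
        kept = []
        for i in np.where(labels == c)[0]:
            # Keep i only if it is not a near-duplicate of anything kept so far.
            if not kept or (emb[i] @ emb[kept].T).max() < threshold:
                kept.append(i)
        keep.extend(kept)
    return np.sort(np.array(keep))

emb = np.random.default_rng(0).normal(size=(1000, 64))
kept = semdedup(emb, threshold=0.9)
print(f"kept {len(kept)} / 1000 sequences")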

Felipe Engelberger retweeted
Daniella Pretorius @daniellapret:
Amazing time at the @ImperialX_AI open day - super cool facilities and great content 😆. Excited to receive a poster prize at the end! Electronics inspired by @bradyajohnston, who gave great advice on making sure the iPad stayed stuck! Animation code taken from @sokrypton 😎

Felipe Engelberger retweeted
Moritz Ertelt @ErteltMoritz:
We recently created some new tools in Rosetta around ML protein design methods (including ProteinMPNN, ESM2, MIF-ST). You can run all of this using `docker run -it rosettacommons/rosetta:ml` (no Python involved). We then benchmarked the different methods: biorxiv.org/content/10.110…

Felipe Engelberger retweeted
Ramith Hettiarachchi @ramith__:
Had a great time at the ML for Drug Discovery Summer School & MoML conference in Montréal (my first time in 🇨🇦!). Kudos to the organizers for arranging such a great lineup of talks & labs! Thanks to the hackathon, I got some hands-on experience with the concepts taught 👨🏼‍💻

Lauren Porter @Lauren_L_Porter:
Good question 👇

Shubhendu Trivedi @_onionesque:
We are working on custom CUDA implementations of the work of Passaro & Zitnick, with the Gaunt product next in progress. I will post if and when there are updates.