Daniel Shao
@DanielStupid
45 posts
Joined May 2015
218 Following · 30 Followers
Daniel Shao retweeted
Biology+AI Daily @BiologyAIDaily
AbRank: A Benchmark Dataset and Metric-Learning Framework for Antibody–Antigen Affinity Ranking

1. AbRank introduces a large-scale benchmark for antibody–antigen (Ab–Ag) affinity prediction, reframing the task as pairwise ranking rather than regression. This design improves generalization and robustness by focusing on relative binding preferences instead of noisy absolute values.
2. The dataset comprises over 380,000 Ab–Ag binding measurements aggregated from nine public sources, spanning highly diverse antibodies and antigens across multiple experimental conditions and affinity metrics (Kd, IC50, escape fractions).
3. AbRank introduces "m-confident ranking" by training only on pairs with at least an m-fold difference in affinity, filtering out ambiguous comparisons and emphasizing biologically meaningful distinctions.
4. Three standard train-test splits assess generalization: (i) Balanced, (ii) Hard Ab (novel antibodies), and (iii) Hard Ag (novel antigens). These splits test performance under increasing distribution shift.
5. Two benchmarking scenarios are supported: the Unrelated Complex Benchmark (diverse Ab–Ag pairs) and the Local Perturbation Benchmark (closely related variants). This dual setup evaluates both broad generalization and fine-grained affinity shifts (e.g., from mutations).
6. Structures for all antibodies and antigens were predicted using efficient models (IgFold, Boltz-1), enabling scalable structure-aware learning without requiring known complex structures.
7. The authors propose WALLE-Affinity, a graph-based method combining pretrained embeddings (AntiBERTy for Abs, ESM-2 for Ags) with structural graphs to predict pairwise affinity rankings.
8. WALLE-Affinity trained with a ranking loss consistently outperforms regression-based variants and other baselines (ANTIPASTI, GearBind, PBEE, FoldX), especially under hard generalization settings.
9. The model performs inference using only individual Ab and Ag structures, avoiding complex structure prediction while remaining fast (~10 sec/complex) and accurate.
10. Ranking-based supervision consistently yields better generalization than regression, particularly for unseen-antigen scenarios, supporting the hypothesis that pairwise comparison is more robust to noise and label uncertainty.
11. Despite its scalability and robustness, the model's performance declines on local perturbation tasks, reflecting the challenge of predicting subtle changes from minor sequence edits.
12. AbRank offers a unified platform for evaluating Ab–Ag affinity models under realistic and challenging scenarios, designed to catalyze progress in therapeutic antibody design, affinity maturation, and immune escape prediction.

💻 Code: github.com/biochunan/AbRa…
📜 Paper: arxiv.org/abs/2506.17857…

#AntibodyDesign #ProteinInteraction #MachineLearning #Bioinformatics #GraphNeuralNetworks #Ranking #Benchmark #ComputationalBiology
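The pairwise-ranking setup described above, with training restricted to m-confident pairs, can be pictured with a minimal sketch. Everything below is a hypothetical illustration, not the AbRank code: a filter that keeps only pairs whose measured affinities differ by at least m-fold, plus a logistic pairwise ranking loss over model scores.

```python
import math

def m_confident_pairs(measurements, m=10.0):
    """Keep only Ab-Ag pairs whose measured affinities (e.g. Kd) differ
    by at least an m-fold ratio -- the 'm-confident ranking' filter
    described in the thread. `measurements` is a list of
    (complex_id, affinity) tuples; lower Kd means tighter binding.
    Hypothetical helper, not the authors' code."""
    pairs = []
    for i, (id_a, aff_a) in enumerate(measurements):
        for id_b, aff_b in measurements[i + 1:]:
            if max(aff_a, aff_b) / min(aff_a, aff_b) >= m:
                # order each pair so the tighter binder comes first
                winner, loser = (id_a, id_b) if aff_a < aff_b else (id_b, id_a)
                pairs.append((winner, loser))
    return pairs

def pairwise_ranking_loss(score_winner, score_loser):
    """Logistic pairwise ranking loss: small when the model scores the
    tighter binder higher, large when it gets the order wrong."""
    return math.log(1.0 + math.exp(-(score_winner - score_loser)))

data = [("ab1", 1e-9), ("ab2", 5e-9), ("ab3", 2e-7)]
print(m_confident_pairs(data))
# -> [('ab1', 'ab3'), ('ab2', 'ab3')]: the 5-fold ab1/ab2 pair is dropped
```

Training on filtered pairs like these optimizes relative order directly, which is why noisy absolute affinity values matter less than in a regression setup.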
Daniel Shao retweeted
Michael Moor @Michael_D_Moor
🧵1/ ✨New preprint✨

LLMs are getting better at answering medical questions. However, they still struggle to spot and fix errors in their own reasoning. That's a big problem in medicine, where the stakes are high and a mistake at any step could be critical.

To address this, we introduce Med-PRM, a process reward model that evaluates each reasoning step using clinical guidelines and high-quality medical sources. Evaluated on 7 benchmarks, Med-PRM improves accuracy by up to +13.5%, enabling the first open 8B-parameter model to surpass 80% on MedQA. We hope this work takes the field one step toward trustworthy and verified medical LLMs.

📄 Paper: arxiv.org/abs/2506.11474
🔗 Page: med-prm.github.io
🧠 Model: huggingface.co/dmis-lab/llama…
📚 Dataset: huggingface.co/datasets/dmis-…
💻 Code: github.com/eth-medical-ai…

Great collab with: Jaehoon Yun*, Jiwoong Sohn* (@de_Jiung), Jungwoo Park*, Hyunjae Kim, Xiangru Tang (@XiangruTang), Daniel Shao (@DanielStupid), Yong Hoe Koo, Ko Minhyeok, Qingyu Chen (@qingyu_qc), Mark Gerstein (@MarkGerstein), Jaewoo Kang# (@jkang101).
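As a rough illustration of how a process reward model can be used at inference time, here is a minimal sketch. Both helpers and the min-aggregation are my assumptions, not the Med-PRM implementation: each step of a sampled reasoning chain gets a reward, a chain is scored by its weakest step, and the answer from the best-scoring chain wins.

```python
def score_reasoning_chain(step_rewards):
    """Aggregate per-step PRM rewards into one chain-level score.
    Taking the minimum (a chain is only as good as its weakest step)
    is one common choice -- the paper's aggregation may differ."""
    return min(step_rewards)

def select_best_answer(candidates):
    """candidates: list of (answer, step_rewards) for sampled reasoning
    chains; return the answer whose chain scores highest.
    Hypothetical re-ranking helper, not the Med-PRM code."""
    best_answer, _ = max(candidates, key=lambda c: score_reasoning_chain(c[1]))
    return best_answer

chains = [
    ("A", [0.9, 0.8, 0.4]),   # one shaky step drags the whole chain down
    ("B", [0.7, 0.7, 0.7]),
]
print(select_best_answer(chains))  # -> B
```

The point of step-level rewards, as opposed to a single outcome reward, is exactly this: chain A's confident-looking answer is rejected because one intermediate step is judged unreliable.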
CaoHe @Shinichi_Izumm
@andrewwhite01 Great work! We also explore LLM reasoning in chemistry with Beyond Chemical QA (arxiv.org/pdf/2505.21318), a 22-task, 1500-sample benchmark for step-by-step evaluation. We're expanding it into a larger long-CoT dataset for better RL training. Exciting times ahead! 🚀
Andrew White 🐦‍⬛ @andrewwhite01
At FutureHouse, we've noticed scientific agents are good at applying average intelligence across tasks. They always seem to make the obvious choices, which is good, but discovery sometimes requires more intuition and insight than average.

We've made a first step today towards superhuman insight by training a reasoning model for a specific domain of science: designing drug-like molecules. We're releasing a 24B open-weights reasoning model called 𝚎𝚝𝚑𝚎𝚛𝟶. 𝚎𝚝𝚑𝚎𝚛𝟶 has been trained with reinforcement learning to exceed frontier models and human experts across a range of molecular design tasks. It takes in natural language, reasons in English, and outputs a new molecule. 𝚎𝚝𝚑𝚎𝚛𝟶 is now a tool for our chemistry design agent, Phoenix, which can call upon it to design molecules.

Training a reasoning model for a scientific domain like chemistry, rather than math or programming, required a number of small technical advances. For example, we developed an iterative method of splitting specialist models and aggregating their reasoning traces. Another example: we used LLMs to rewrite questions that were partially solved.

A major finding from this work is that we can train with >10x efficiency per experimental measurement when using a reasoning model rather than fine-tuning. We also found that reasoning models can learn new tasks developed specifically for this paper that do not appear in pretraining corpora. We even saw a task sit at 0% performance until 100 steps into RL, at which point it was randomly solved once. This, along with the change in modality from natural language to molecules, bodes well for applying reasoning models far from natural language.

Reasoning models in science are the future. Scientific tasks naturally come with verifiable rewards: the physical world is the ultimate arbiter of accuracy, rather than human contractors. The data-efficiency gain and the ability to exceed frontier models with relatively few parameters and little compute mean we should expect more scientific reasoning models soon.

Congrats to the team @SidN137, James, @Ryan__Rhys, Albert, @GWellawatte, @maykcaldas, @ludomitch, and @SGRodriques. Thanks to @VoltagePark, @nvidia, and @huggingface for supporting us, and huge thanks to @ericschmidt for funding @FutureHouseSF. The model weights, reward model, and new benchmark are open source. You can also read more about scientific reasoning models in our exclusive with Nature.
Daniel Shao retweeted
Jiayi Zhang @didiforx
No fortress, purely open ground. Manus 👋. We open-sourced its core feature in 2 hours after dinner. Check it out 👇: github.com/mannaandpoem/O… 1/4
Daniel Shao retweeted
MetaGPT @MetaGPT_
20 months: 0 → 7 papers (2 ICLR orals) & 40+ institution collabs. With a clear vision, we're building the open-source foundation for tomorrow's agents. We also release MGX (mgx.dev) and commit to open-sourcing its core soon. Check the thread for what we've built! 1/8
Daniel Shao retweeted
Jiayi Zhang @didiforx
Reasoning models lack atomic thought ⚛️ Unlike humans, who reason in independent units, they store full histories 🤔 Introducing Atom of Thoughts (AOT): it lifts gpt-4o-mini to 80.6% F1 on HotpotQA, surpassing o3-mini and DeepSeek-R1! The best part? It plugs into ANY framework 🔌 1/5
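From the description above, the AOT loop might look roughly like this. This is a toy sketch under my own reading of the thread: `decompose`, `solve_atom`, and `contract` stand in for LLM calls, and the arithmetic demo is entirely hypothetical; the actual method's prompts and decomposition differ.

```python
def atom_of_thoughts(question, decompose, solve_atom, contract, max_iters=5):
    """Sketch of the AOT idea: peel off independent atomic subquestions,
    solve each in isolation, then contract their answers into a smaller
    question -- instead of carrying the full reasoning history."""
    current = question
    for _ in range(max_iters):
        atoms = decompose(current)
        if not atoms:                       # nothing left to peel off
            break
        answers = [solve_atom(a) for a in atoms]
        current = contract(answers)         # the question shrinks
    return solve_atom(current)

# Toy demo: a total-price question decomposed into independent lookups.
PRICES = {"apple": 2, "pear": 3}

def decompose(q):
    return [w for w in q.split() if w in PRICES]  # independent atoms

def solve_atom(q):
    if q in PRICES:
        return PRICES[q]
    if isinstance(q, str) and q.startswith("add"):
        return sum(int(x) for x in q.split()[1:])
    return q

def contract(answers):
    return "add " + " ".join(str(a) for a in answers)

print(atom_of_thoughts("price of apple and pear",
                       decompose, solve_atom, contract))  # -> 5
```

The key contrast with chain-of-thought is that each iteration replaces the question rather than appending to a transcript, so the context stays small.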
Daniel Shao retweeted
Rob Tang 🦞 @XiangruTang
Excited to share our latest work, BC-Design, a new framework for highly accurate inverse protein folding that achieves an unprecedented 88.37% sequence recovery rate (previous SOTA: ~67%)! 🧬

To put this in perspective, the field's progress on the CATH 4.2 benchmark:
- ProteinMPNN (2022): ~51%
- ESM-IF1 (2023): ~55%
- SPDesign (2024): ~67%
- BC-Design: 88.37%
A quantum leap! 📈

BC-Design uses a novel architecture combining:
- a Struct-Encoder for backbone structure
- a BC-Encoder for biochemical features
- a BC-Fusion module to integrate both signals
- all optimized with contrastive learning 🧪

Key innovation: we represent biochemical properties (hydrophobicity & charge) as distributions in 3D space rather than per-residue features, a more natural way to capture their spatial distribution. 🔬

Particularly proud of the robust generalization: consistently high performance across all major CATH fold classes 💪 and across proteins of different sizes (50-500 residues) and structural complexities. 📈

Code and models are available! Looking forward to seeing how the community builds on this work to advance protein design! 😃😃🧑‍🔬

📜: biorxiv.org/content/10.110…

#StructuralBiology #DeepLearning #ProteinDesign
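The "properties as distributions in 3D space" idea can be pictured with a small sketch: instead of attaching a scalar hydrophobicity to each residue, smear each residue's value over space and sample the resulting field at query points. The Gaussian kernel and sigma are my assumptions for illustration, not the paper's parametrization.

```python
import math

def property_field(residues, query_points, sigma=4.0):
    """Sample a Gaussian-smeared biochemical property field at 3D query
    points. `residues` is a list of ((x, y, z), value) pairs, e.g. one
    hydrophobicity value per residue; sigma is a smoothing length in
    Angstroms. Illustrative only -- not the BC-Design encoder."""
    field = []
    for qx, qy, qz in query_points:
        total = 0.0
        for (x, y, z), value in residues:
            d2 = (qx - x) ** 2 + (qy - y) ** 2 + (qz - z) ** 2
            total += value * math.exp(-d2 / (2 * sigma ** 2))
        field.append(total)
    return field

# One hydrophobic residue at the origin: the field peaks there and
# decays smoothly with distance, giving a spatial (not per-residue) signal.
near, far = property_field([((0.0, 0.0, 0.0), 1.0)], [(0, 0, 0), (20, 0, 0)])
```

A field like this can be sampled on a grid or point cloud and fed to a 3D encoder, which is the kind of representation the thread's "distributions in 3D space" phrasing suggests.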
Premier League @premierleague
Who gets your vote? 🗳️
Jim Fan @DrJimFan
While we are waiting for World Cup finals, here are DeepMind’s AI bots playing soccer in simulation! The agents don’t communicate with each other and only try to maximize their own incentive. But teamwork and complex strategies *emerge* through repeated competition!
LiveScore @livescore
Which Premier League side is this? 🤔