Caleb Ellington

575 posts

Caleb Ellington

@probablybots

Scientist @genbioai | PhD @CMUCompBio | Creator/maintainer https://t.co/E3h3NJS6S7 | multi-task learning, graphical models, and personalized medicine

San Francisco, CA 가입일 Aralık 2017

356 팔로잉778 팔로워

고정된 트윗

Caleb Ellington@probablybots·27 May

Honored to share a major thread of my PhD research, out now in PNAS. We address a core issue with how models are used for scientific discovery. Models are so important that they define the entire scientific process... 1/n

English

321

61.3K

Caleb Ellington 리트윗함

GenBio AI@genbioai·3d

Many “virtual cell” efforts restrict themselves to cell-level assays like scRNA-seq. To build a true world model for biology, we need to move beyond the individual cell and model the tissue context as well. GenBio-PathFM is a new histopathology foundation model from GenBio AI. It is the only SOTA model trained without using proprietary image archives, and the strongest open-weight model to date. Highlights: - SOTA performance on public pathology benchmarks (THUNDER, HEST, PathoROB - shown below). - Unprecedented data efficiency, requiring 5x-15x fewer WSIs for training. - Novel two-stage pretraining strategy combining DINO and JEPA. Blog post: genbio.ai/genbio-pathfm Paper: genbio.ai/papers/genbio-… GitHub: github.com/genbio-ai/genb…

English

6.1K

Caleb Ellington 리트윗함

Jonathan Gorard@getjonwithit·6d

I think one of the conclusions we should draw from the tremendous success of LLMs is how much of human knowledge and society exists at very low levels of Kolmogorov complexity. We are entering an era where the minimal representation of a human cultural artifact... (1/12)

English

188

494

4.5K

744.1K

Caleb Ellington 리트윗함

Haohan Wang@HaohanWang·25 Şub

Ever since I was a teenager, I have been wondering why Google can make such a huge amount of money, and one reason I now believe is that it serves as a gatekeeper between user and the massive information online. Nowadays, we are witnessing a quick shift of this gatekeeper from Google-style search engine to large language models. Therefore, what used to matter a lot in the search engine context will soon start to matter in LLM context. One example would be the items ranked by search engine (so-called search engine optimization) and now by LLM. Therefore, we introduce this (one of the first) solutions to answer this question: "How can I write my product descriptions, so that it will be ranked at the top when a user asks an LLM to recommend similar things to buy" Here comes our recent work: 🚀 Controlling Output Rankings in Generative Engines for LLM-based Search 🚀 With a solution, a benchmark, and a demo. Check out our project page: llm-recommendation.vercel.app Or directly play with the demo to feel the power: ivonne-code.github.io/AI-recommendat…

English

1.3K

Caleb Ellington 리트윗함

owl@owl_posting·24 Şub

new @NOETIK_ai blog post discussing a new cancer model we've been working on: TARIO noetik.blog/p/scaling-beha… TARIO is an autoregressive transformer, trained on one of the largest sets of tumor spatial transcriptomics datasets in the world in the post, we discuss some scaling experiments we ran on TARIO across parameter count, context length, and data tokens. we find that we are able to reliably scale the model on all 3 of these axes, and have yet to find some upper limit. we establish some interesting takeaways, including that... 1. models trained on one cancer subtype do not reliably transfer to new cancer subtypes 2. model parameter count must be scaled with context size 3. most interesting of all: pan-cancer models are better than 'specialist' cancer models on all axes, even on whatever the specialist model is trained on and more! we cannot reveal much about how this particular model transfers to patient response prediction (which, at the end of the day, is what matters most) due to privacy concerns, but internal results show scale helping out there as well. we are currently working on a version of this model that can operate in H&E-only regimes, and we expect to be able to more deeply share + interrogate those results in a future post. finally: we are always looking for new ML/engineering/wet-lab talent, and are open to remote candidates! feel free to reach out to me here, or in the email in the article

English

15.7K

Caleb Ellington@probablybots·22 Şub

@anshulkundaje Predicting within the margin of experimental error is still a major step forward, and hopefully the methods help realign the field. But we are far from done, as @anshulkundaje laid out in a great thread here. x.com/anshulkundaje/…

Anshul Kundaje@anshulkundaje

@probablybots Anyway, this is a good study. I just don't think it's great news for the current paradigm. Folks should take a good hard look at the results presented here and other recent work & think about where to invest efforts.

English

307

Caleb Ellington@probablybots·22 Şub

@anshulkundaje We didn't forget; it's one of three main results in the paper. For perturbation prediction, knowledge base pretraining wins. We agree. But the title/motivation addresses a major open question about whether FMs improve over simple baselines at all. Answer: yes, especially KG FMs.

English

234

Caleb Ellington@probablybots·21 Şub

There's a lot of noise about what does/doesn't work for perturbation modeling (and how to evaluate virtual cells). Excited to share some answers from benchmarking 600+ model variants covering the entire field. Some methods are almost within the margin of experimental error!

GenBio AI@genbioai

Predicting how cells respond to genetic or chemical changes is a fundamental challenge in drug discovery. While the potential of biological Foundation Models (FMs) has been widely discussed, their actual superiority over simple statistical baselines has remained a subject of significant debate in the field. In our latest preprint, we provide a definitive evaluation of FMs for perturbation prediction. By benchmarking over 600 model variants, we demonstrate that FMs, when trained on the right modalities and integrated effectively, provide a significant leap in predictive accuracy. Our findings confirm that FMs are not just a theoretical improvement, but a practical tool for building accurate, actionable simulations of cellular behavior. Preprint: biorxiv.org/content/10.648… Code and data: github.com/genbio-ai/foun… Blog post: genbio.ai/foundation-mod…

English

4.1K

Caleb Ellington@probablybots·22 Şub

@anshulkundaje But your point is about the field's priorities, and yes the popular focus on single cell pretraining is absolutely putting the cart before the horse for perturbation modeling. We're glad to have a definitive answer: bang for buck, knowledge base pretraining wins on pert modeling.

English

456

Caleb Ellington@probablybots·22 Şub

@anshulkundaje Which is why the AIDO.Cell scaling on these tasks is still important. There's still runway for this approach and it still applies when we don't have known targets and curated knowledge.

English

534

Caleb Ellington 리트윗함

GenBio AI@genbioai·20 Şub

Don’t miss GenBio AI Co-Founder & Chief Scientific Advisor @Prof_Lundberg at #NVIDIAGTC: 🧬Scaling Laws in Biology: Why Bigger Models Alone Aren’t Enough [S81652] March 18, 10:00–10:40 AM PT An in-person panel on breaking the data wall in Bio x AI through at-scale data generation and new scaling laws. Save your seat → nvda.ws/3OAKT1T 🎟️ Register → nvda.ws/4cCOiY2 v/ @NVIDIAHealth

English

Caleb Ellington 리트윗함

Haohan Wang@HaohanWang·5 Şub

Celebrating the #ICLR2026 acceptance of our paper SIPDO: Closed-Loop Prompt Optimization via Synthetic Data Feedback 🚀 But what really matters is not the acceptance—it's the question that kicked everything off. A few months back, I kept feeling like prompt optimization was strangely familiar. Then it clicked: we're replaying 40 years of neural network parameter optimization... compressed into just ~3 years.🔂 ➡️Parameter side (1980s–2000s): Genetic algorithms → plain SGD (the big breakthrough moment) → Adam, momentum, adaptive rates, second-order tricks. ➡️Prompt side (2022–2025): Evolutionary search (GPS, EvoPrompt) → textual gradients (ProTeGi, TextGrad—the "SGD moment") → what comes next? We think SIPDO is a solid step toward the answer. Instead of passively optimizing against a fixed dataset, SIPDO closes the loop: 🌟A synthetic data generator actively crafts challenging examples to expose the current prompt's exact weaknesses 🌟The optimizer refines the prompt based on those failures 🌟Difficulty ramps up progressively (curriculum-style) 🌟The improved prompt feeds back to generate even harder data It's inspired by adversarial training + curriculum learning, leading to faster convergence and dramatically more robust prompts—no extra human annotations needed. We laid out this full "parallel evolution" framing in our recent blog post, tracing the arc from early genetic methods through textual gradients to where we believe Phase 3 (closed-loop, adaptive, history-aware systems like SIPDO) is headed next.If you're working on prompts, synthetic data, or LLM robustness, this historical lens might spark some ideas: the next real leap could be asking, “What would Adam (or even second-order methods) look like for prompts?”

English

770

Caleb Ellington 리트윗함

Kexin Huang@KexinHuang5·3 Şub

Today we’re launching Phylo, a research lab studying agentic biology, backed by a $13.5M seed round co-led by @a16z and @MenloVentures / Anthology Fund @AnthropicAI. We’re also introducing a research preview of Biomni Lab, the first Integrated Biology Environment (IBE), where we’re imagining a new way biologists work. Biomni Lab uses agents to orchestrate hundreds of biological databases, software tools, molecular AI models, expert workflows, and even external research services in one workspace, supporting research end-to-end from question to experiment to result. Agents handle the mechanics, while you define the question, then review, steer, and decide. Scientists end up spending more time on science: asking questions, understanding mechanisms, and eliminating diseases. Phylo (@phylo_bio) is a spin-out of @ProjectBiomni, where we will maintain the open-source community and push open-science research. I’m grateful to continue building with my co-founders @YuanhaoQ @jure @lecong and the dream founding team @serena2z @TianweiShe @huangzixin20151 @gm2123 @margaretwhua @malayhgandhi. We’re also fortunate to be advised by leading scientists @zhangf, Carolyn Bertozzi, and @fabian_theis, and supported by an amazing group of investors including @JorgeCondeBio @zakdoric Matt Kraning @ZettaVentures @dreidco @conviction @saranormous @svangel @valkyrie_vc and others. Biomni Lab is available for free today: biomni.phylo.bio Learn more in our launch post: phylo.bio/blog/company-f… We are also hosting launch events - join us at South San Francisco: luma.com/n8k8qb0n Virtual: luma.com/l5ryjaij We’re also hiring! phylo.bio/careers

English

112

242

1.7K

434.7K

Caleb Ellington@probablybots·30 Oca

@genbioai Product Engineer jobs.lever.co/genbio/9a86c23… Research Engineer Intern jobs.lever.co/genbio/97d1b07… Research Scientist Intern jobs.lever.co/genbio/7037a80…

English

205

Caleb Ellington@probablybots·30 Oca

If you're excited about AI scientists and biology simulators, we're looking for FTEs and interns @genbioai. Come work with an elite team of nobel laureates and titans of science+engineering on products that both people and agents use to accelerate biomedical research. DM or email

English

906

Caleb Ellington 리트윗함

Ido Salomon@idosal1·15 Oca

Building AgentCraft v1 with AgentCraft v0 is 🤌 Managed up to 9 Claude Code agents with the RTS interface so far. There's a lot to explore, but it feels right. v1 coming soon

English

193

159

2.6K

328.8K

Caleb Ellington 리트윗함

Eric Xing@ericxing·19 Ara

K2-V2 is a 70B parameter 360-open #LLM built from scratch as a superior base for reasoning adaptation, in addition to functions such as conversation and knowledge retrieval from general LLMs. It stands as the strongest fully open model ... (1/2) arxiv.org/abs/2512.06201

English

1.7K

탐색

@NOETIK_ai @anshulkundaje @Prof_Lundberg @NVIDIAHealth @a16z @MenloVentures @AnthropicAI @phylo_bio