
Kenny Workman
1.6K posts

@kenbwork
cto @latchbio, data infrastructure for biology



🧵 We ran the largest head-to-head benchmark of protein binder design methods in the wet lab. Project page: research.nvidia.com/labs/genair/pr… 1 million designs. 127 targets. RFdiffusion, BindCraft, BoltzGen, and Proteina-Complexa — all tested side by side.👇








Full footage from our systems x biology reading group with researchers from Arc Institute + FutureHouse.

2:33 LData: Building a distributed filesystem on Postgres and S3 (LatchBio)
26:10 BINSEQ: High-performance binary formats for DNA sequences (Noam Teyssier, Arc Institute)
49:00 Data Flywheels: Reinforcement learning algorithms for scientific AI (James Braza, FutureHouse)
1:08:07 Scaling Deep Learning to 1B+ Single Cells (Abhinav Adduri, Arc Institute)
1:36:30 Shreya Shekha, Greylock; Closing

Biology is still a greenfield space for systems work. As molecular datasets continue to scale, engineering challenges will emerge at every layer of the stack, e.g. file systems, storage + ML infra.





If you think about it, machine-guided data analysis, especially in biology, is likely the next frontier after agentic SWE. Verifiable tools that help scientists with strong existing understanding of the domain do work with higher quality + speed (raise the ceiling, not the floor) will have the most impact. Intermediate artifacts and transparent thinking prevent errors from compounding. It's then possible that outputs can be used to drive expensive business and scientific decisions, or be placed in publications.



Math textbooks are written in a pointlessly obtuse way. Gemini does an incomparably better job. My professional opinion is that all undergrads learning real analysis should give up reading baby Rudin, and simply learn analysis from Gemini instead



How good are frontier models at analyzing single cell data? scBench, 394 verifiable problems from real scRNA-seq workflows, shows the best model (Opus 4.6) gets 53% accuracy. Better than spatial, but the best agents still fail roughly every other routine analysis task:



We're excited to announce the second Berkeley BioML seminar of the semester happening next Tuesday 2/17! Join us for a talk by Kenny Workman (@kenbwork) from LatchBio about the performance of agents for spatial biology analysis. luma.com/f3xa3dst


@ledflyd @jrkelly I don't think compute infra is comparable to the monstrous costs of running modern wetlab experiments. A single antibody is like 100hrs on an HPC. Typing is free and uni infra is paid for by larger grants with shared use across many researchers. Code cheaper than molecules always

