
I am a physician and a researcher. I also build software. And the longer I have spent in all three of those worlds, the more one thing has become impossible to ignore: methodology is the most important part of research, and it is the least developed as infrastructure. In software, we solved this problem 20 years ago. Code is versioned. Pipelines are automated. Tools compose. You can call a function from a terminal, chain it into a workflow, test it, diff it, deploy it. The entire discipline is built on the principle that process should be reproducible by design, not by memory.

Science has never had that. A protocol is a document. It lives in a methods section written in passive-voice prose, in a PDF nobody can query, in a Word file on someone's desktop. More than 70% of researchers have tried and failed to reproduce another lab's experiment. That failure is not because scientists are careless. It is because the tools we use to describe methodology were never designed to be instructions. They were designed to satisfy journal reviewers.

ReplicateScience started as an answer to a simpler problem: take open-access papers, pull the methods, and turn them into something a person can actually follow. Structured steps, evidence quotes from the original text, equipment mapped to real suppliers. That part exists today, across 1,529 protocols from 639 papers.

But the reason I keep building it is the bigger problem. I want science methodology to become programmable infrastructure. Not a UI you browse, but a protocol layer you can query from a terminal, integrate with ML pipelines, version like code, and trigger from automated systems. The kind of thing where a behavioral rig can advance a protocol step based on sensor output, or where a lab can diff their actual procedure against the canonical one and log the deviation automatically. That is what software engineering already is. Science deserves the same primitives.
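To make the "diff a protocol like code" idea concrete, here is a minimal sketch of what that primitive could look like. This is not the ReplicateScience implementation; every name here (`Step`, `Protocol`, `diff_protocols`) is hypothetical. The point is only that once a protocol is structured data rather than prose, a deviation log falls out of a few lines of comparison:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Step:
    """One step of a protocol: an action plus its parameters."""
    index: int
    action: str
    parameters: dict = field(default_factory=dict)

@dataclass
class Protocol:
    """A versioned, machine-readable protocol (hypothetical schema)."""
    name: str
    version: str
    steps: list

def diff_protocols(canonical: Protocol, actual: Protocol) -> list:
    """Compare a lab's actual procedure against the canonical protocol.

    Returns tuples of (step index, field, canonical value, actual value)
    for every deviation -- the raw material for an automatic deviation log.
    """
    deviations = []
    for canon, act in zip(canonical.steps, actual.steps):
        if canon.action != act.action:
            deviations.append((canon.index, "action", canon.action, act.action))
        for key, value in canon.parameters.items():
            if act.parameters.get(key) != value:
                deviations.append((canon.index, key, value, act.parameters.get(key)))
    if len(actual.steps) != len(canonical.steps):
        deviations.append((None, "step_count", len(canonical.steps), len(actual.steps)))
    return deviations

# Example: a PCR-style protocol where the lab annealed at 55 C, not 58 C.
canonical = Protocol("pcr_amplification", "1.0.0", [
    Step(1, "denature", {"temp_c": 95, "seconds": 30}),
    Step(2, "anneal", {"temp_c": 58, "seconds": 20}),
])
actual = Protocol("pcr_amplification", "1.0.0-lab", [
    Step(1, "denature", {"temp_c": 95, "seconds": 30}),
    Step(2, "anneal", {"temp_c": 55, "seconds": 20}),
])

print(diff_protocols(canonical, actual))  # [(2, 'temp_c', 58, 55)]
```

A sketch like this is deliberately naive (steps are matched by position, not aligned), but it shows the shape of the primitive: the same diff that a behavioral rig, a CI job, or a terminal query could consume.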













