Pinned Tweet
Benjamin Chang
185 posts

Benjamin Chang
@benjamin0chang
Automating discovery | ML PhD @OxfordStats
San Francisco, CA · Joined July 2013
459 Following · 1.5K Followers

@catherinehyeo @evatuecke @GreylockVC @neo @BoxGroup @Liquid2V @JeffDean Congrats Catherine and Eva!!
Benjamin Chang retweeted

Introducing Altara: the scientific intelligence platform for the physical world.
Today @evatuecke and I are excited to announce our $7M seed led by @GreylockVC, joined by @Neo, @BoxGroup, @Liquid2V, and angel investors including @JeffDean and leadership from OpenAI & AMD.
We’re already working with early customers in semiconductors, batteries, and advanced materials. More below.
Benjamin Chang retweeted

This is exciting; I expect we are going to see a lot more things like this and it will be one of the most important impacts of AI. Congrats to the Future House team.
edisonscientific.com/articles/annou…

This post made me so happy to read!
"It is an understatement to say I was impressed...Kosmos is causing me to re-imagine what my career will look like."
I've had a few moments like this myself while working on Kosmos and I'm grateful to hear that it's helping scientists already 🙂
Zachary Flamholz @zflam94
Just published a post on my first experience with Kosmos, the new AI-scientist from FutureHouse / Edison Scientific. Genuinely felt like sci-fi. open.substack.com/pub/zacharyfla… @FutureHouseSF
Benjamin Chang retweeted

Today, we’re announcing Kosmos, our newest AI Scientist, available to use now.
Users estimate Kosmos does 6 months of work in a single day. One run can read 1,500 papers and write 42,000 lines of code. At least 79% of its findings are reproducible. Kosmos has made 7 discoveries so far, which we are releasing today, in areas ranging from neuroscience to materials science and clinical genetics, in collaboration with our academic beta testers. Three of these discoveries reproduced unpublished findings; four are net new, validated contributions to the scientific literature. AI-accelerated science is here.
Our core innovation in Kosmos is the use of a structured, continuously-updated world model. As described in our technical report, Kosmos’ world model allows it to process orders of magnitude more information than could fit into the context of even the longest-context language models, allowing it to synthesize more information and pursue coherent goals over longer time horizons than Robin or any of our other prior agents. In this respect, we believe Kosmos is the most compute-intensive language agent released so far in any field, and by far the most capable AI Scientist available today. The use of a persistent world model also enables single Kosmos trajectories to produce highly complex outputs that require multiple significant logical leaps. As with all of our systems, Kosmos is designed with transparency and verifiability in mind: every conclusion in a Kosmos report can be traced through our platform to the specific lines of code or the specific passages in the scientific literature that inspired it, ensuring that Kosmos’ findings are fully auditable at all times.
We are also using this opportunity to announce the launch of Edison Scientific, a new commercial spinout of FutureHouse, which will be focused on commercializing our agents and applying them to automate scientific research in drug discovery and beyond. Edison will be taking over management of the FutureHouse platform, where you can access Kosmos alongside our Literature, Molecules, and Precedent agents (previously Crow, Phoenix, and Owl). Edison will continue to offer free tier usage for casual users and academics, while also offering higher rate limits and additional features for users who need them. You can read more about this spinout on our blog, below.
A few important notes if you’re going to try Kosmos. Firstly, Kosmos is different from many other AI tools you might have played with, including our other agents. It is more similar to a Deep Research tool than it is to a chatbot: it takes some time to figure out how to prompt it effectively, and we have tried to include guidelines on this to help (see below). It costs $200/run right now (200 credits per run, and $1/credit), with some free tier usage for academics. This is heavily discounted; people who sign up for Founding Subscriptions now can lock in the $1/credit price indefinitely, but the price ultimately will probably be higher. Again, this is less chatbot and more research tool, something you run on high-value targets as needed.
Some caveats are also warranted. Firstly, we find that 80% of Kosmos findings are reproducible, which also means 20% are not -- some things it says will be wrong. Also, Kosmos certainly does produce outputs that are the equivalent of several months of human labor, but it also often goes down rabbit holes or chases statistically significant yet scientifically irrelevant findings. We often run Kosmos multiple times on the same objective in order to sample the various research avenues it can take. There are still a bunch of rough edges on the UI and such, which we are working on. Finally, we are aware that the 6-month figure is much greater than estimates by other AI labs, like METR, about the length of tasks that AI agents can currently perform. You can read discussion about this in our blog post.
Huge congratulations to our team that put this together, led by @ludomitch and @michaelathinks: Angela Yiu, @benjamin0chang, @sidn137, Edwin Melville-Green, Albert Bou, @arvissulovari, Oz Wassie, @jonmlaurent. A particular shout out to @m_skarlinski and his team that rebuilt the platform for this launch, especially Andy Cai @notAndyCai, Richard Magness, Remo Storni, Tyler Nadolski @_tnadolski, Mayk Caldas @maykcaldas, Sam Cox @samcox822 and more.
This work would not have been possible without significant contributions from academic collaborators @mathieubourdenx, @EricLandsness, @bdanubius, @physicistnevans, Tonio Buonassisi, @BGomes_1905, Shriya Reddy, @marthafoiani, and @RandallBateman3.
We also want to thank our numerous supporters, especially @ericschmidt, who has been a tremendous ally. We will have more to say about our supporters soon!

Kosmos is available to use today on the Edison Scientific Platform! I’m very excited to see how you’ll use Kosmos and the science you may discover.
A huge shoutout to @ludomitch, @michaelathinks, @andrewwhite01, @SGRodriques, Angela, and the incredible team at Edison for transforming Kosmos from a whiteboard sketch into a real system that scientists can use.

7⃣ Key Limitations
🔨 Kosmos runs require scientists to specify a research objective, and the output strongly depends on the quality of the objective, which is arguably the hardest part of research!
🔍 It is quite difficult to know what makes a good discovery — even the best scientists often disagree. Therefore, identifying truly valuable discoveries from a Kosmos run is time intensive and usually requires lots of expert feedback.

💡 Excited to introduce Kosmos, an AI scientist system for data-driven discovery!
Kosmos is a multi-agent system designed around a central “world model” to coordinate information across hundreds of scientific agent instances. Given an open-ended objective and dataset, Kosmos can perform up to 12 hours of research to explore, analyze, and complete the objective.
We present 7 expert-validated discoveries that Kosmos generated or reproduced across scientific disciplines, including:
🧠 A novel mechanism of ENT neuron vulnerability with aging
🔥 Identifying a critical determinant for perovskite performance
📊 Evidence that high SOD2 levels may causally reduce myocardial fibrosis
📄 Technical report: arxiv.org/abs/2511.02824







