Doug Downey

118 posts

@_DougDowney

Researching AI for Science @allen_ai, Prof @northwesterncs

Joined May 2020

266 Following · 421 Followers

Doug Downey retweeted

Ai2 @allen_ai · 9 Mar

🚨 The best AI gets built in the open. Next week, we’re bringing that message to #NVIDIAGTC — with panels, demos, and a window into what fully open models can do. Here's where to find us 🧵👇

Doug Downey retweeted

Pao Siangliulue @Siangliulue · 14 Mar

Are you a researcher in CS or a CS-adjacent field curious about how an AI agent can help you with your research project? Want to try a new tool for your research support in a paid user study ($100, 2 hr)? Limited spot numbers. See details and sign up here: forms.gle/JzLtkAhe7Ttvui…

Doug Downey @_DougDowney · 13 Mar

TL;DR: Evaluating Deep Research systems is hard. We discuss why and call out the importance of fine-grained metrics, annotator expertise, and subjectivity. Enjoyed this collaboration led by @JenaHwang2, with mentorship from @SergeyFeldman and contributions from a great team.

Ai2 @allen_ai

🔎 Deep research agents like Asta ScholarQA and OpenAI Deep Research are transforming how we perform literature review. But how do we know if the way we evaluate them is actually meaningful? Announcing our new paper: “Deep Research, Shallow Evaluation: A Case Study in Meta-Evaluation for Long-Form QA Benchmarks” 🧵

Doug Downey @_DougDowney · 28 Feb

Releasing the Asta Interaction Dataset: large-scale logs of real interactions with LLM-powered scientific research tools. Analysis led by Dany Haddad reveals how scientists use these systems in practice: longer, more complex queries and treating results as persistent artifacts. Special shout-out to one of his favorite figures: this Sankey diagram tracing section expansion (Si = section i expanded).

Ai2 @allen_ai

We analyzed 250K+ queries & 430K+ clickstream interactions from Asta, our AI-powered research assistant—and today we're releasing the full dataset. How do researchers actually use AI science tools? Here's what we found. 🧵

Doug Downey @_DougDowney · 25 Feb

Can today’s agents anticipate future scientific collaborations, ideas, and impact? Introducing PreScience, a large-scale AI benchmark for scientific forecasting. Careful dataset construction led by @anirudhajith42, with @aps6992, @jaydepun, @Hoper_Tom and collaborators.

Ai2 @allen_ai

Can AI predict what scientists will do next—not just one piece, but the whole research process? PreScience is our new model eval for forecasting how science unfolds end-to-end, from how research teams form to a paper's eventual impact. Built with @UChicago, supported by @NSF.

Doug Downey retweeted

Ai2 @allen_ai · 12 Feb

Knowing which questions to ask is often the hardest part of science. Today we're releasing AutoDiscovery in AstaLabs, an AI system that starts with your data and generates its own hypotheses. 🧪

Doug Downey retweeted

Ai2 @allen_ai · 28 Jan

Introducing Theorizer: Turning thousands of papers into scientific laws 📚➡️📜 Most automated discovery systems focus on experimentation. Theorizer tackles the other half of science: theory building—compressing scattered findings into structured, testable claims. 🧵

Doug Downey retweeted

Ai2 @allen_ai · 27 Jan

Introducing Ai2 Open Coding Agents—starting with SERA, our first-ever coding models. Fast, accessible agents (8B–32B) that adapt to any repo, including private codebases. Train a powerful specialized agent for as little as ~$400, & it works with Claude Code out of the box. 🧵

Doug Downey @_DougDowney · 18 Dec

Big usability upgrade to Asta's report-writing experience.

Ai2 @allen_ai

🆕 New in Asta: multi-turn report generation. You can now have back-and-forth conversations with Asta, our agentic platform for scientific research, to refine long-form, fully cited reports instead of relying on single-shot prompts.

Doug Downey retweeted

Kyle Lo @kylelostat · 17 Dec

olmo 3 paper finally on arxiv 🫡 thx to our teammates esp folks who chased additional baselines thx to arxiv-latex-cleaner and overleaf feature for chasing latex bugs thx for all the helpful discussions after our Nov release, best part of open science is progressing together!

Doug Downey retweeted

Ai2 @allen_ai · 16 Dec

Last year Molmo set SOTA on image benchmarks + pioneered image pointing. Millions of downloads later, Molmo 2 brings Molmo’s grounded multimodal capabilities to video 🎥—and leads many open models on challenging industry video benchmarks. 🧵

Doug Downey retweeted

Ai2 @allen_ai · 8 Dec

Update: DataVoyager, which we launched in Preview early this fall, is now available in Asta. 🎉 You can upload real datasets, ask complex research questions in natural language, & get back reproducible answers + visualizations. 🔍📊

Doug Downey retweeted

Ai2 @allen_ai · 20 Nov

Announcing Olmo 3, a leading fully open LM suite built for reasoning, chat, & tool use, and an open model flow—not just the final weights, but the entire training journey. Best fully open 32B reasoning model & best 32B base model. 🧵

Doug Downey retweeted

Ai2 @allen_ai · 18 Nov

Today we’re releasing Deep Research Tulu (DR Tulu)—the first fully open, end-to-end recipe for long-form deep research, plus an 8B agent you can use right away. Train agents that plan, search, synthesize, & cite across sources, making expert research more accessible. 🧭📚

Doug Downey retweeted

Jonathan Bragg @turingmusician · 6 Nov

Agent benchmarks don't measure true *AI* advances
We built one that's hard & trustworthy
👉 AstaBench tests agents w/ *standardized tools* on 2400+ scientific research problems
👉 SOTA results across 22 agent *classes*
👉 AgentBaselines agents suite
🆕 arxiv.org/abs/2510.21652 🧵👇

Doug Downey @_DougDowney · 8 Oct

New project led by Shriya Atmakuri in collaboration with @aps6992: Ai2's Asta system now reports weekly which papers its research summaries have cited. The aim is to give credit to the work that powers the reports, and provide a dataset for studying how AI systems cite science.

Ai2 @allen_ai

📊 Today we're releasing data showing which scientific papers our AI research tool Asta cites most frequently. Think of it as creating citation counts for the AI era—tracking which research is actually powering AI answers across thousands of queries. 🧵

Doug Downey retweeted

Ai2 @allen_ai · 1 Oct

Introducing Asta DataVoyager—our new AI capability in Asta that turns structured data into transparent, reproducible insights. Built for scientists, grounded in open, inspectable workflows. 🧵

Doug Downey retweeted

Ai2 @allen_ai · 29 Sep

A few new challengers enter SciArena—including DeepSeek-V3.2-Exp and Claude Sonnet 4.5 🔬

Doug Downey retweeted

Ai2 @allen_ai · 26 Aug

As part of Asta, our initiative to accelerate science with trustworthy AI agents, we built AstaBench—the first comprehensive benchmark to compare them. ⚖️

Doug Downey retweeted

Ai2 @allen_ai · 26 Aug

Introducing Asta—our bold initiative to accelerate science with trustworthy, capable agents, benchmarks, & developer resources that bring clarity to the landscape of scientific AI + agents. 🧵
