Bassel Al Omari retweetledi

Excited to share AIRA₂ — our next-generation AI Research Agents for ML that address key bottlenecks to scaling.
AIRA₂ achieves SoTA on real-world ML tasks from MLE-bench-30 (81.5% vs 72.7%), exceeds human SoTA on 6/20 diverse AI research tasks from AIRS-Bench (and hacks another 5), while exhibiting strong, predictable scaling properties.
To push the frontier of AI Research, we need systems that scale well. Developing AIRA₂, we learned a lot about the bottlenecks and what it takes to resolve them — insights already driving our next iteration:
1/

English




