This week we launched the Open Benchmarks Grant with a $3M initial commitment from @SnorkelAI + partner support from @huggingface @togethercompute @PrimeIntellect @PyTorch @harborframework & others, to close the evaluation gap in AI.
Our ability to measure AI has been outpaced by our ability to develop it - and open benchmarks are one of several critical, complementary tools to fix this.
We're particularly interested in novel benchmarks that push and probe the frontier along three key vectors:
(1) Environment complexity
--> E.g. complex, domain-specific context and tool/action spaces, human interaction, world modeling
(2) Autonomy horizon
--> E.g. long horizon, non-stationary goals
(3) Output complexity
--> E.g. complex outputs with nuanced, rubric-based evaluation / reward signals
Check out more details + the link to apply here! benchmarks.snorkel.ai