Orin Labs

@0rinlabs

Joined October 2025

2 Following · 86 Followers

Orin Labs @0rinlabs · Jun 17

Today we're launching Horizon: the first long-horizon learning benchmark made from real agent logs. Read more below ⬇️⬇️

Bryan @bryan_houlton

Introducing Horizon from @0rinlabs: the first long-horizon learning benchmark made from real agent logs
- SOTA is 21% on the hardest section
- 7-35M tokens of real agent history per task
- Models are hardly getting better on the hardest tasks
- Humans can score 100%
(1/7)

