Orin Labs

1 post

@0rinlabs

Joined October 2025
2 Following · 86 Followers
Orin Labs @0rinlabs
Today we're launching Horizon: the first long-horizon learning benchmark made from real agent logs. Read more below ⬇️⬇️
Bryan @bryan_houlton

Introducing Horizon from @0rinlabs: the first long-horizon learning benchmark made from real agent logs
- SOTA is 21% on the hardest section
- 7-35M tokens of real agent history per task
- Models are hardly getting better on the hardest tasks
- Humans can score 100%
(1/7)
