
We got 74.4% on TerminalBench2 with Opus 4.6 simply by improving Terminus 2. That's up from 62.9% on "Terminus 2 + Opus 4.6" (+11.5 points), putting Opus 4.6 on par with "Simple Codex + GPT-5.3-Codex". We'll share a short technical blog post and open-source the modified Terminus soon. Stay tuned 😉













