
We’ve been testing Sonnet 4.6, and it has been potent in our agent, Maestro. Our primary eval has the agent implement a long list of features across a diverse set of use cases, working iteratively across codebases and building on prior work. The result: it completed features faster, cheaper, and with a higher benchmark pass rate.




