
Last week, I spent 90 minutes with @RobinNewhouse, Senior SWE, Applied AI at @cline, discussing agent testing. He's the person building evals infrastructure for one of the most-used open-source coding agents in the world. Here are a few things from the talk that stayed with me:
