Peter Farago retweeted

Ask Claude to build you a financial model in Excel.
You'll get back a reasonable structure, plausible assumptions, and formulas that link together correctly.
Now you have to check it.
Do you open every cell and inspect every formula? If you do that, you might as well have built it yourself. If you don't, you're trusting a junior employee who works at superhuman speed but might have encoded some very strange assumptions that didn't stand out at first glance.
Validating agent-generated work is the problem nobody is talking about.
Agents have made creation cheap. They haven't made it any easier to know whether what was created is actually right.
The bottleneck used to be writing the code, building the model, drafting the document. Now it's checking the output. And our tools — spreadsheets, code review, document editors — were all designed for a world where humans did the creating. None of them are built for the volume or the speed agents produce at.
@profjoeyg and I wrote about this, and what we think validation actually has to look like going forward: open.substack.com/pub/frontierai…