Shann³@shannholmberg
how to use autoreason for ad creative
you run Meta ads, your best creative has a 1.8% CTR and $19 CPA, and you want both numbers to improve. most marketers would prompt an LLM with "write me 10 ad variations" and test whatever comes back.
the problem with asking AI to rewrite its own work is that it never says "this is already good." it invents problems that don't exist, drifts from the original angle with every pass, and keeps changing things even when the output was fine two rounds ago.
autoreason tries to fix that with adversarial isolation. every role in the loop is a fresh agent that can't see what the others wrote.
your current best ad is incumbent A, the one that's live and spending.
the critic
a fresh agent reads your knowledge layer and tears incumbent A apart. it pulls from:
> your last 50 ads with performance data, what patterns separate the 1.8% CTR creatives from the 0.4% ones
> audience language from reviews, support tickets, reddit threads, how customers describe the problem in their own words
> competitor ads from Meta Ad Library and TikTok Creative Center
> a swipe file of 20-30 high performing ads in your vertical, the ones you know have been spending heavily, because sustained spend is the strongest public signal that a creative is working
the critic doesn't rewrite anything, it just produces a teardown. what's weak in the hook, what's generic in the body, what's missing compared to ads that convert better.
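a hypothetical sketch of how the critic's prompt could be assembled from the knowledge layer. the function name, field names, and instructions are illustrative, not autoreason's actual API:

```python
# hypothetical prompt assembly for the critic role; field names are
# assumptions, not autoreason's real interface
def build_critic_prompt(ad, knowledge):
    sections = [
        "you are a critic. tear this ad apart. do not rewrite anything.",
        f"ad under review:\n{ad}",
        f"your account's winning patterns:\n{knowledge['winners']}",
        f"customer language (reviews, tickets, reddit):\n{knowledge['voice_of_customer']}",
        f"swipe file of high performers in the vertical:\n{knowledge['swipe_file']}",
        "output a teardown: weak hook, generic body, gaps vs the swipe file.",
    ]
    return "\n\n".join(sections)

prompt = build_critic_prompt(
    ad="hook: tired of $40 CPAs?",
    knowledge={
        "winners": "questions in hooks beat statements 2:1",
        "voice_of_customer": '"i just want ads that stop bleeding money"',
        "swipe_file": "competitor X leads with a customer quote",
    },
)
```

the key constraint is in the first and last lines of the prompt: critique only, never rewrite.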
author B
a fresh agent that has never seen the original ad. it only gets the critic's teardown and the knowledge layer, then writes a rival version from scratch.
if author B could see incumbent A it would just rearrange the same words. keeping it blind is what forces a different angle instead of a tweak.
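the isolation can be sketched with a generic llm callable. the message shape mimics common chat APIs, but `fresh_agent` and `write_rival` are illustrative names, not autoreason's interface:

```python
# fresh context per role: every call builds a brand-new message list,
# so there is no shared history to leak from (names are illustrative)
def fresh_agent(system, user, llm):
    return llm([
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ])

def write_rival(teardown, knowledge, llm):
    # note what's absent: incumbent A is never passed in
    return fresh_agent(
        "write one ad from scratch. you have not seen any existing ad.",
        f"critic's teardown of an unnamed ad:\n{teardown}\n\nknowledge layer:\n{knowledge}",
        llm,
    )

# a stub llm that records what it was shown, to check the isolation
seen = []
def stub_llm(messages):
    seen.extend(m["content"] for m in messages)
    return "rival ad copy"

rival = write_rival("hook is generic", 'customers say "bleeding money"', stub_llm)
```

the stub confirms author B only ever sees two things: the teardown and the knowledge layer.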
the synthesizer
a fresh agent takes both and produces three candidates:
> A unchanged (keeping the original is always an option, this is how the system knows when to stop)
> AB merged, pulling the strongest hook from one and the strongest body from the other
> B the rival as-is
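the three candidates above can be sketched as a dict; `merge_fn` stands in for whatever combines the strongest hook and body, and the toy merge here is just for illustration:

```python
# sketch of the synthesizer's output: always exactly three candidates,
# and keeping A unchanged is always on the ballot
def make_candidates(incumbent_a, rival_b, merge_fn):
    return {
        "A": incumbent_a,                    # original unchanged: the "stop" option
        "AB": merge_fn(incumbent_a, rival_b),
        "B": rival_b,                        # rival as written
    }

# toy merge: A's first line (hook) over B's remaining lines (body)
def toy_merge(a, b):
    return a.splitlines()[0] + "\n" + "\n".join(b.splitlines()[1:])

cands = make_candidates("hook A\nbody A", "hook B\nbody B", toy_merge)
```

because A unchanged is always a candidate, the judges can effectively vote "stop changing things," which is the signal the loop's stopping rule reads.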
the judge panel
3 blind agents score all three using borda count. they don't know which is original, which is rival, which is merge. they look at hook strength, clarity of the offer, social proof, CTA friction, and how each version stacks up against the swipe file.
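borda count itself is simple: each judge ranks the candidates best-first, a candidate gets n-1 points for a first-place rank down to 0 for last, and the highest total wins. a minimal sketch (the example rankings are made up):

```python
from collections import defaultdict

def borda_winner(rankings):
    """rankings: one best-first list per judge; returns (winner, scores)."""
    n = len(rankings[0])
    scores = defaultdict(int)
    for ranking in rankings:
        for position, candidate in enumerate(ranking):
            scores[candidate] += n - 1 - position  # first place gets n-1
    return max(scores, key=scores.get), dict(scores)

# three blind judges rank A (incumbent), AB (merge), B (rival)
judges = [
    ["AB", "B", "A"],
    ["AB", "A", "B"],
    ["B", "AB", "A"],
]
winner, scores = borda_winner(judges)
# AB: 2+2+1 = 5, B: 1+0+2 = 3, A: 0+1+0 = 1
```

borda rewards broad agreement across judges rather than one judge's strong preference, which suits a panel that can't confer.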
the loop
winner becomes the new incumbent. if A wins twice in a row (k=2), the system stops because further iterations would just be scope creep. otherwise it loops back with fresh agents for every role.
nothing leaks between rounds. the judges have none of the context that produced the revisions, which is why their scoring stays clean.
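the outer loop reduces to a streak counter. `run_round` stands in for one full critic → author → synthesizer → judges pass with fresh agents; the stub sequence is illustrative:

```python
# sketch of the outer loop with the k=2 stopping rule
def run_autoreason(incumbent, run_round, k=2, max_rounds=10):
    streak = 0  # consecutive rounds the incumbent has survived
    for _ in range(max_rounds):
        winner = run_round(incumbent)
        if winner == incumbent:
            streak += 1
            if streak >= k:
                break  # stable for k rounds: stop before scope creep
        else:
            incumbent, streak = winner, 0  # new champ, reset the streak
    return incumbent

# stub: round 1 crowns a merge, then the new incumbent holds twice
outcomes = iter(["v1+rival", "v1+rival", "v1+rival"])
final = run_autoreason("v1", lambda inc: next(outcomes))
```

any win by a challenger resets the streak, so the system only halts after the same ad survives k full adversarial passes in a row.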
the knowledge layer is one of the things that makes this work for ad creative instead of just generic prompt chaining:
> full ad performance history, CTR, CPA, ROAS per creative
> winning and losing patterns across your account
> customer language from reviews, tickets, reddit
> competitor ads from Meta Ad Library and TikTok Creative Center
> swipe file of high performing ads in your vertical
> your brand voice and positioning
without reference points for what a 3%+ CTR ad looks like in your space, the critic has no benchmark and the author writes copy that could be for any product. pull 20-30 ads from competitors and adjacent brands that you know are spending heavy, screenshot them, feed them in.
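one way to structure that history as plain records, with the winner/loser split the critic benchmarks against. the field names and the 1.5% CTR threshold are assumptions for illustration:

```python
# illustrative ad-history records for the knowledge layer
ad_history = [
    {"id": "ad_014", "ctr": 0.018, "cpa": 19.0, "roas": 2.4},
    {"id": "ad_022", "ctr": 0.031, "cpa": 12.5, "roas": 3.1},
    {"id": "ad_007", "ctr": 0.004, "cpa": 41.0, "roas": 0.9},
]

def split_by_ctr(ads, threshold=0.015):
    """partition creatives into winners and losers for pattern mining."""
    winners = [a for a in ads if a["ctr"] >= threshold]
    losers = [a for a in ads if a["ctr"] < threshold]
    return winners, losers

winners, losers = split_by_ctr(ad_history)
```

the split is what lets the critic talk about patterns ("what do the 1.8%+ creatives share that the 0.4% ones don't") instead of critiquing in a vacuum.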
you still run the winner through real A/B testing in Meta. autoreason just narrows the search space so you're testing strong candidates instead of random shots.
results feed back every run so the next one is sharper
paper + code: SHL0MS / Nous Research