Kris Puckett: "You don't need a 20x Max plan or 2k of token budget a month to do real good work"

Kris Puckett @krispuckett · Jun 8

You don't need a 20x Max plan or 2k of token budget a month to do real good work. The "design loops that prompt your agents" energy is real, and it rips. But it can read like you need infinite tokens to play. You don't. Here's what makes every token count:

1. Right model for the job. Sonnet for most of it, Opus for hard architecture, Haiku for grunt work. Don't pay Opus rates to rename a variable.
2. Be specific or pay for it. "Fix line 42 in auth.ts" costs pennies. "Something's off with login" makes it read half your repo. Precision is the cheapest optimization there is.
3. One task per chat. Clear context between jobs. Stale context taxes every message after it.
4. Send the heavy lifting to subagents. Tests, logs, doc-fetching, big searches. The noise stays there. Only the summary comes home.
5. Plan in Opus, build in Sonnet. Pay once for the thinking, cheap for the doing.
6. Tiny CLAUDE.md, plus a .claudeignore. Stop it re-reading junk every time. Set it once, win every session.
7. Leave yourself a note. Dump decisions and next steps to a markdown file. Load it tomorrow instead of re-explaining everything.
8. Watch /usage. Spend the expensive model on the moments that earn it.

Constraints make you more precise, if you let them.
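
Tips 1 and 5 boil down to a routing rule, which could be sketched roughly like this. Everything here is an assumption for illustration: the tier labels, the keyword lists, and the `pick_tier` helper are hypothetical and not part of any real tool, and a real setup would likely route on more than keywords.

```python
# Hypothetical tier labels; map them to whatever model names your plan exposes.
TIERS = {"grunt": "haiku", "default": "sonnet", "architecture": "opus"}

# Assumed keyword hints, chosen only for this sketch.
GRUNT_HINTS = ("rename", "format", "typo", "lint")
ARCHITECTURE_HINTS = ("architecture", "design", "migrate")


def pick_tier(task: str) -> str:
    """Pick the cheapest tier whose hints match the task description."""
    text = task.lower()
    # Check the expensive tier first so hard tasks are never under-served.
    if any(hint in text for hint in ARCHITECTURE_HINTS):
        return TIERS["architecture"]
    if any(hint in text for hint in GRUNT_HINTS):
        return TIERS["grunt"]
    # Everything else goes to the mid-priced default.
    return TIERS["default"]
```

With these assumed hints, a rename request lands on the cheap tier and a specific bug fix stays on the default, so the expensive model is only reached when the task description actually asks for design work.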

Peter Steinberger 🦞@steipete

Here’s your monthly reminder that you shouldn’t be prompting coding agents anymore. You should be designing loops that prompt your agents.

PrimeLine @PrimeLineAI · Jun 8

#7 is the one everyone does worst. it's manual... you have to remember to write the note AND load the right one tomorrow. i stopped trusting myself to do that. a retrieval-backed memory layer surfaces the relevant past decision the moment a task matches it, so nothing rides on me recalling which markdown to open. that's the upgrade that makes the other 7 compound across sessions instead of resetting.
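
A toy version of that idea, assuming notes are plain markdown strings keyed by filename. The `best_note` helper and the word-overlap scoring are hypothetical stand-ins; an actual retrieval layer would more likely use embeddings or a search index, and nothing here reflects a specific product.

```python
from __future__ import annotations

import re


def tokens(text: str) -> set[str]:
    """Lowercase word set used for a crude overlap score."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def best_note(task: str, notes: dict[str, str]) -> str | None:
    """Return the name of the note sharing the most words with the task.

    Returns None when no note shares any word with the task.
    """
    task_words = tokens(task)
    scored = [(len(task_words & tokens(body)), name) for name, body in notes.items()]
    score, name = max(scored, default=(0, None))
    return name if score > 0 else None


# Hypothetical notes left behind by earlier sessions.
notes = {
    "auth.md": "Decided to keep session tokens in httpOnly cookies for login",
    "billing.md": "Stripe webhook retries use idempotency keys",
}
match = best_note("fix login session expiry", notes)
```

The point of the sketch is only that the lookup is driven by the task text, so nobody has to remember which file to open.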
