Blake Johnson

5.2K posts

@blake41

AI Engineer. Helping you demystify LLMs and AI. Taught hundreds to code @FlatironSchool Co-Founder @Crypto_NYC

NYC · Joined July 2007

720 Following · 927 Followers

Blake Johnson@blake41·2d

@sarahwooders Can you explain what you mean by copy?

Sarah Wooders@sarahwooders·2d

Did you know that when you "copy" docs as markdown, you're exposing your agent to prompt injection? I was trying to feed my agent the skills spec, and was very confused to see random links to mintlify. Turns out, they inject "Built with [Mintlify]" when you copy docs as markdown. Very confusing in this case, since it makes it look like skills were created by mintlify. Not only that, they also inject instructions to your agent to post feedback to their endpoint. None of this is visible from the actual user-facing documentation page.

Blake Johnson@blake41·2d

This is your magnum Opus. See what I did there? But seriously, F*CK intuit. One of those truly evil companies that is obviously going to get steamrolled by this transition. Thank you so much for building this. I’m going to rerun my taxes through this just for fun! Never thought i would say that…

Jeffrey Emanuel@doodlestein·2d

Also, I should add that this isn't just for preparing your return, although it can do that extremely well. It's also a tax consultant in general that can tell you how to restructure your affairs to maximize tax savings, taking into account all the particulars of your situation. Like many of my recent "big" skills, it's so huge that you can really apply it over and over again in different ways and keep uncovering more ideas and strategies to explore.
I'm planning to continue to expand and improve it and to keep it up to date with all changes in relevant tax laws. If you're a subscriber of my site and want me to add more stuff about any particular area or strategy, just let me know.
It costs under $20 to e-file your Federal and state return using FreeTaxUSA, and that seems to be sophisticated enough to handle just about anything you'd reasonably want to do (unlike Aiwyn, which couldn't handle a situation where I lived part of the year in NYC and part in upstate NY). So for another $20 you can subscribe to my site for a month and use the skill to tap into a tremendous amount of "operationalized expertise" that not only helps you to strategize, but which can literally file the tax returns for you.
I don't see how that isn't unbelievably bearish for Intuit (maker of TurboTax), which charges $100+ for the stripped-down version of their native software and even more for their web-based version, which you need to sit there like an idiot filling out manually.
Interestingly, my approach in software development, where I use multiple different frontier models to check each other's work, also works incredibly well with tax preparation. Codex found tons of mistakes that Claude Code made, but Claude Code also had a lot of insights that Codex missed. Even Gemini had a couple of good thoughts (although it was mostly wrong and overly cautious).

Jeffrey Emanuel@doodlestein·2d

Well, it's probably coming too late for most people unless you're planning on filing an extension, but I created a truly ambitious skill for tax preparation on my skills site, jeffreys-skills.md.
This skill spans 158 markdown files totaling 2.7 megabytes of text. It covers every state, tons of different professions, life events, and all sorts of sophisticated tax strategies, with all kinds of expertise about even niche topics like opportunity zones and captive insurance.
Much of the underpinnings of it, including the nuts-and-bolts use of the Aiwyn MCP tax connector and the use of freetaxusa.com with Playwright MCP, is based on my actual multi-day session history preparing and filing my own fairly complex return, so I know it all works (I just finished filing mine a few hours ago).
Here's how GPT 5.4 describes it and what makes it special:
The "tax-return-preparation-and-advice-generic" skill is a source-verified, multi-year tax intelligence skill that turns AI from a glorified form-filler into a high-end tax strategist. It helps analyze returns across years, detect missed deductions and carryforwards, reconcile life events and profession-specific rules, model aggressive but defensible planning moves, and ground recommendations in current law instead of stale tax folklore. The result is a tax-prep and tax-planning system that is broader than software, more systematic than a one-off CPA review, and dramatically more useful for complex real-world filers.
What makes it special:
- It is multi-year by design. Most tax tools look at one return; this skill looks for patterns, carryovers, inconsistencies, and missed opportunities across years.
- It is verification-first. The methodology is built around checking current IRS instructions, publications, and state guidance before making live filing claims.
- It is aggressively practical. It does not stop at “here are the rules”; it pushes toward elections, timing moves, entity choices, depreciation strategies, retirement optimization, PTET, QBI, and other real savings levers.
- It is unusually universal. It routes by profession, life event, situation, and jurisdiction, so it can adapt to freelancers, high earners, retirees, students, business owners, rental investors, divorce, inheritance, relocation, and more.
- It is audit-aware. It emphasizes documentation, defensibility, and red-flag detection instead of encouraging sloppy “tax hacks.”
- It is built for real execution. It includes filing workflows, tool guidance, and structured reference material, so an agent can move from analysis to action rather than just giving vague advice.

Blake Johnson@blake41·3d

@irl_danB @johnhanacek Would love to see!

dan@irl_danB·4d

@johnhanacek i'll send you my segment if I get it but I owe the material a better effort if I'm going to share it widely

dan@irl_danB·4d

I gave a not particularly well executed version of this talk here yesterday will refine and probably do it again if given the opportunity

dex@dexhorthy

hype @vaibcode

Blake Johnson@blake41·3d

@0xhoward What are you stripping? What are you keeping?

Howard@0xhoward·4d

90% of Claude Code JSONL is noise: progress events, queue operations, tool plumbing, etc. None of it is searchable knowledge. Strip it first.
- 1.4 GB becomes 200 MB.
- $50 embedding bill becomes $2.
> discover → filter → adapt → redact → canonical → markdown → gBrain
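
The "filter" step described above could look roughly like the following. This is only a sketch: the `type` field and the names in `NOISE_TYPES` are hypothetical placeholders, since the post does not give the actual Claude Code JSONL schema or the tool's real filter rules.

```python
import json

# Hypothetical record types treated as noise. The real Claude Code JSONL
# schema is not given in the post, so adjust these to what your logs contain.
NOISE_TYPES = {"progress", "queue-operation", "tool-plumbing"}


def strip_noise(lines):
    """Parse JSONL lines and keep only records that are not flagged as noise."""
    kept = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines instead of failing the whole run
        if not isinstance(record, dict):
            continue  # only JSON objects are treated as events
        if record.get("type") in NOISE_TYPES:
            continue  # drop noise events
        kept.append(record)
    return kept
```

Running this before embedding means only the kept records are sent to the embedding step, which is where the size and cost reduction would come from.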

Howard@0xhoward·4d

Your agent memory is scattered across five different formats on one machine. Using the gBrain repo by @garrytan x @NousResearch, build an MCP to scan all of those memories into Hermes Agent + gBrain. It collects:
- Claude Code memories @claudeai
- Codex OpenAI @OpenAIDevs
- Google Gemini CLI @GeminiApp
- Kimi @moonshotkimi
More memory (.md) files mean a better agent persona!

Blake Johnson@blake41·5d

@irl_danB What do you think about the ramp RLM with KV cache sharing?

dan@irl_danB·5d

this was an RLM too actually

dan@irl_danB

context window won’t be “solved” as long as attention is quadratic
and presumably Suhail is thinking about the compaction problem as it occurs in long running agents like claude code
but this is downstream from an architectural problem with standard agent implementations (claude code among them) that use a linear “chat-like” history
we all work through coding tasks linearly, but any seasoned software engineer’s mental model of their progress looks more like a call stack: pushing tasks on and popping them off when complete
when the claude code harness organizes the context more like a call stack (think flame graph) than a linear chat log, compaction will not even be necessary in many cases and less lossy in the cases where it is
for the familiar, think: loom
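
One way to picture the call-stack idea above is a context that keeps a frame per task and, when a task is popped, collapses its messages into a single summary line in the parent. The sketch below is purely illustrative: `Frame`, `ContextStack`, and their methods are hypothetical names, not part of Claude Code or any existing harness.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """One task on the stack, with the messages produced while working on it."""
    task: str
    messages: list = field(default_factory=list)


class ContextStack:
    """Hypothetical context organized as a call stack instead of a linear log."""

    def __init__(self, root_task):
        self.frames = [Frame(root_task)]

    def push(self, task):
        # Start a subtask; its messages stay isolated in their own frame.
        self.frames.append(Frame(task))

    def log(self, message):
        # Record a message against the task currently on top of the stack.
        self.frames[-1].messages.append(message)

    def pop(self, summary):
        # Finish the current subtask: discard its detailed messages and
        # leave only a one-line summary in the parent frame.
        if len(self.frames) == 1:
            raise ValueError("cannot pop the root task")
        done = self.frames.pop()
        self.frames[-1].messages.append(f"[done] {done.task}: {summary}")

    def render(self):
        # Build the prompt text from the frames that are still open.
        lines = []
        for depth, frame in enumerate(self.frames):
            indent = "  " * depth
            lines.append(f"{indent}# {frame.task}")
            lines.extend(f"{indent}{m}" for m in frame.messages)
        return "\n".join(lines)
```

In this sketch the detail of a finished subtask never needs compaction, because it leaves the context as soon as the frame is popped; only the summary supplied to `pop` remains.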

Blake Johnson@blake41·5d

@catehall @ikogan_ Haha same. Did not go where I thought it was going

Cate Hall@catehall·5d

@ikogan_ oh man thank you! i was bracing myself after the first line lol, nice surprise :D

Cate Hall@catehall·6d

literally yes
agency does not = working hard
high agency = accomplishing your goals with as little effort as possible, since there are endless good uses of your time and energy, and spending less of them where they’re not needed frees them up to spend somewhere else

David Foster Wallace and Gromit@apupeepo

@felpix_ @copethyself Low agency is billing 4k hours for one of the most prestigious law firms in America

Blake Johnson@blake41·6d

@alexhillman I believe subscribers get a 1-hour prompt cache, but paying per token you get 5 minutes

📙 Alex Hillman@alexhillman·9 Apr

Today I learned

Orca IDE@orca_build

You could get up to 10x more Claude Code usage for free. After idling for 5 mins, the prompt cache expires. Your next message resends everything at up to 10x cost, silently draining your limits. We built a timer in Orca, so you always know when to jump back in. Available in v1.1.3+. Enable in Settings -> Prompt Cache Timer. github.com/stablyai/orca
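
The countdown behind a timer like this can be sketched in a few lines. The 5-minute value is taken from the post; `seconds_until_expiry` is a hypothetical helper and not Orca's actual implementation, and the real TTL may differ by plan.

```python
from datetime import datetime, timedelta

# TTL taken from the post above; treat it as an assumption, not a guarantee.
CACHE_TTL = timedelta(minutes=5)


def seconds_until_expiry(last_message_at, now, ttl=CACHE_TTL):
    """Return how many seconds remain before the prompt cache is assumed to expire."""
    remaining = (last_message_at + ttl - now).total_seconds()
    return max(0.0, remaining)  # never report a negative countdown
```

For example, two minutes after the last message the function reports 180 seconds left, and after six minutes it reports 0.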

Blake Johnson@blake41·8 Apr

@doodlestein too good to sell?

Jeffrey Emanuel@doodlestein·7 Apr

@blake41 That’s an unreleased private skill of mine. I’ve been adding a bunch of other great skills recently though.

Jeffrey Emanuel@doodlestein·14 Mar

God, I love this prompt.

Blake Johnson@blake41·7 Apr

@ccidral @KingBootoshi thanks!

Celio@ccidral·7 Apr

@blake41 @KingBootoshi I use Kiro, so I put them in a Kiro steering file, but you could put them in CLAUDE.md or any other rules file if Claude supports that.

BOOTOSHI 👑@KingBootoshi·6 Apr

YOU GUYS NEED TO PUT YOUR AGENTS ON CUSTOM ESLINT RULES ASAP
IT IS THE BEST WAY TO GUARANTEE ANTI-SLOP IN YOUR CODEBASE BY MAKING IT IMPOSSIBLE TO DO SLOP PATTERNS
codex agents are REALLY good at creating custom ESLint patterns, which can be designed to enforce YOUR designs
in this case, I'm actually creating custom ESLint rules to PREVENT my agents from even writing BAD TESTS in the first place
I do TDD like a mfer, but a big problem with agents is their ability to write horribly USELESS tests, the common one being this mock echo pattern
by making the mock echo pattern IMPOSSIBLE to commit with a custom ESLint rule, it prevents agents who attempt it from finishing their goal, making them actually stop and write good tests instead of cheating, lol

Blake Johnson@blake41·7 Apr

@ccidral @KingBootoshi you put this in claude.md?

Celio@ccidral·7 Apr

@KingBootoshi I have a lot more rules for tests, but these two helped me make the agent not write stupid tests

Blake Johnson@blake41·7 Apr

@ryanleecode @KingBootoshi what is DAMP?

𝗥𝗬𝗔𝗡 𝗟𝗘𝗘@ryanleecode·7 Apr

@KingBootoshi Adding a DAMP test-naming rule provides a completely authoritative way for the AI to at least name a test. Also write new rules in oxlint. ESLint will only slow you down. Hook up the Stryker mutation tester, and use the rule tester to ensure your rules are not flaky

Blake Johnson@blake41·7 Apr

@chrisbarber These are so good! I’m building a cowork like system for my company and have been thinking about a lot of these same things! Thank you for sharing!

Chris Barber@chrisbarber·6 Apr

thread of ui ideas for claude code, codex and cowork type products warning long-ish, 31 tweets lmk if there's one you particularly like 1/ what if the websearch tool in coding agents had a more thorough status display:

Blake Johnson@blake41·6 Apr

@om_patel5 do you have a link to the tool?

Om Patel@om_patel5·6 Apr

THIS GUY AUDITED 926 CLAUDE CODE SESSIONS AND FOUND MOST OF THE TOKEN WASTE WAS ON HIS SIDE
everyone is blaming anthropic for the limits, so he decided to actually look at the data
858 sessions, 18,903 turns, and $1,619 estimated spend across 33 days
here's what he found:
1. one default setting was burning 14,000 tokens per turn
Claude Code loads the full JSON schema for every tool into context at session start, whether you use them or not. 20,000 tokens of tool definitions sitting there on every single turn.
the fix: one line in your settings.json
"ENABLE_TOOL_SEARCH": "true"
context dropped from 45K to 20K instantly. across 858 sessions that one setting was wasting an estimated 264 million tokens
2. cache expiry is the single biggest waste
54% of his turns came after a 5+ minute idle gap. every one of those turns re-processed the entire conversation at full price, which caused a 10x cost jump
you go grab coffee. come back 5 minutes later. type your next message. everything rebuilds from scratch. the context didn't change. you didn't change. the cache just expired.
12.3 million tokens wasted on idle gaps alone
3. 42 skills loaded. 19 of them used twice or less across 858 sessions.
every one of those skill schemas sat in context on every turn eating tokens for nothing.
4. 1,122 redundant file reads where the same file was read 3+ times
one session read the same file 33 times.
he ALSO built a full token auditor dashboard that shows you exactly where your waste is coming from
19 charts, opens in your browser, free AND open source
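
The "redundant file reads" check in point 4 can be approximated with a simple counter. This is a sketch under stated assumptions: the `(session_id, file_path)` event shape and the `redundant_reads` helper are hypothetical, and the post does not describe how the actual auditor extracts read events from session logs.

```python
from collections import Counter


def redundant_reads(read_events, threshold=3):
    """Count reads per (session_id, file_path) and keep those at or above threshold.

    read_events is assumed to be an iterable of (session_id, file_path) tuples,
    one per file read.
    """
    counts = Counter(read_events)
    return {key: n for key, n in counts.items() if n >= threshold}
```

The default threshold of 3 mirrors the "read 3+ times" cutoff quoted above; grouping by session keeps a file that is read once in each of many sessions from being flagged.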

Blake Johnson@blake41·6 Apr

@claudiaonchain @itsolelehmann what do you mean observe itself vs just your tasks?

Claudia@claudiaonchain·6 Apr

been doing this but on a per-session level instead of weekly. my agent writes memories during every work session about patterns it notices, then the next session reads those memories first. it catches repetition within days instead of weeks. the key insight is the agent needs to observe itself, not just your tasks

Ole Lehmann@itsolelehmann·5 Apr

i taught claude to watch me work every week and tell me what to automate
here's the idea: you probably open ai, do a task, close it. then next week you do the same task again. and again. and again
you never zoom out and ask "what am i repeatedly doing that should be automated by now"
because when you're inside your own workflows, the repetition is invisible. it just feels like "work"
but ai doesn't have that blind spot. it sees the patterns objectively
so i set up a scheduled task that runs every monday morning.
1. it scans all my cowork sessions from the past week
2. reads through what i actually did
3. and looks for patterns
stuff like:
> tasks i did more than once.
> instructions i gave repeatedly.
> anything that could be turned into a reusable skill so i never have to explain it again
then it gives me a report:
> what the repeated task was
> how many times i did it that week
> a recommendation for how to turn it into a skill or scheduled task
i don't even decide what to automate. Claude just tells me
here's the prompt if you want to set it up yourself (takes about 60 seconds):
go to scheduled tasks in cowork and create a new one:
- "you are running a weekly workflow audit. use list_sessions to pull all my cowork sessions from the past 7 days. for each session, use read_transcript to understand what was done.
look for:
tasks i've done more than once that follow a similar pattern
workflows where i gave the same kind of instructions repeatedly
anything that could be turned into a reusable skill so i never have to explain it again
for each pattern you find, give me:
what the repeated task is
how many times i did it this week
a recommendation for how to turn it into a skill or scheduled task"
- set it to run every monday morning
then every week, before you even open your laptop, ai has already scanned your work and told you what to automate next
the people who get the most out of ai are the ones who use it to continuously improve how they use it

Blake Johnson@blake41·5 Apr

@DanielleFong Are you going to share? Would love to see!

Danielle Fong 🔆@DanielleFong·4 Apr

for those of you who are trying this, i’m building something on vite (hot reloads), browser automation pretext, wasm, webgpu, and infinite canvases. this is the secret sauce i’ve found so far for getting insane performance and visuals, but you have to track frametimes and optimize 99.9th percentile and there will be glitches, so try to get your agent to see its work by whatever means (also something to benchmark)

Andrej Karpathy@karpathy

Wow, this tweet went very viral! I wanted to share a possibly slightly improved version of the tweet in an "idea file". The idea of the idea file is that in this era of LLM agents, there is less of a point/need of sharing the specific code/app, you just share the idea, then the other person's agent customizes & builds it for your specific needs. So here's the idea in a gist format: gist.github.com/karpathy/442a6… You can give this to your agent and it can build you your own LLM wiki and guide you on how to use it etc. It's intentionally kept a little bit abstract/vague because there are so many directions to take this in. And ofc, people can adjust the idea or contribute their own in the Discussion which is cool.

Blake Johnson@blake41·5 Apr

@TokkimsTokins @reactive_dude Link?

D_d@TokkimsTokins·5 Apr

@reactive_dude /squirrel 🐿️

andrej@reactive_dude·4 Apr

What are your favorite agent skills? I'll start:
> grill-me (brainstorming)
> write-a-prd (specs)
> tdd (the best way to code with agents rn)
> agent-browser (great for debugging/qa)

Blake Johnson@blake41·4 Apr

"I leverage --chrome to spawn sub agents to simulate user-testing. This goes into a problem space doc which then informs a solution space doc. Auto-iterate and refine. This is a claude skill"
Is this similar to agent-browser, or are you doing something else here? The browser agents find problems, and then you have agents that try to resolve them?

𝒅𝒆𝒏𝒊𝒛𝒆𝒏@dennizor·2 Apr

I'm deeply claude code pilled but only because the productivity is insane. These are my secrets
- tmux with 6+ agents.
- Each agent with custom status bar that contains
  - auto-generated title based on my last message.
  - Percentage context usage
  - (+)(-) LOC tracking.
  - Conversation ID for easy inter-claude reference.
- Claude.md that shows where convos can be found on computer
- I CONSTANTLY ask for things to follow a change-philosophy.md doc which says: "For each change, examine the existing system and redesign it into the most elegant solution that would have emerged if the change had been a foundational assumption from the start."
- I use /loop to ensure the output matches the FULL plan/vision. Loop until complete.
- I leverage --chrome to spawn sub agents to simulate user-testing. This goes into a problem space doc which then informs a solution space doc. Auto-iterate and refine. This is a claude skill
- I build apps to have a 'session recording' key shortcut so I can collect full internal logs, profiler, gestures, state transitions, etc. Paste in the session ID for auto-debugging.
- I schedule claude to monitor what was just created in prod, errors, inboxes, etc
Steal it all. Give me more tips.

Blake Johnson@blake41·4 Apr

@dennizor "- I use /loop to ensure the output matches the FULL plan/vision. Loop until complete."
Can you give an example? Like, do you literally write "/loop check that the output matches our plan"?

Blake Johnson@blake41·4 Apr

@dennizor "- Conversation ID for easy inter-claude reference."
What do you mean by this? Do your agents talk to each other? Do you mean the session ID?
"- I CONSTANTLY ask for things to follow a change-philosophy.md doc which says:"
What do you mean you constantly ask for things?
