Adam Wolff

1.6K posts


@dmwlff

Claude Code @AnthropicAI 🤖 Avid cook, dedicated snow person, yoga enthusiast

Joined February 2009

615 Following 21.6K Followers

Pinned Tweet

Adam Wolff@dmwlff·28 Feb

I ended my time at @Meta as a director. But I started as an engineer on FB Chat. Everything about it was broken, and we had to rewrite it. And while the effort to fix it is one of the projects that led to @reactjs, the most important fix was far simpler... Here’s the full story:

I worked on Facebook Chat for several years, both on the front end and the infrastructure. Before the major effort to redo the UI, FB Chat was super broken and we had no idea why.

We got tons of bug reports about Chat being broken every day, but we noticed an odd pattern in the data: the volume of reports didn’t match the volume of usage. It was time-shifted from the peaks we’d see in the US.

We didn’t know what was wrong, but we knew the code was a mess. We set about rewriting both the front-end and the back-end in an effort to fix it. The front-end rewrite pulled in a whole team of amazing engineers and became one of the big threads that led to @reactjs.

In the public eye, we portrayed this project as the one that ultimately fixed Chat. And the way I’ve usually told it, fixing Facebook Chat and the birth of React are the same story. But no framework was going to fix the worst problem with Chat.

During the time we were working on the Chat rewrite, we were also replacing the original Erlang backend with one written in C++. This was probably a good move, but the problem wasn’t with Erlang either.

Our initial spec for the new backend didn’t say much about observability, but it was an important feature, and the rewrite forced us to rebuild it. Little did we know this would lead us to the root cause of our problems…

When we finally gained insight into our deliverability data, we were able to cut it by region. We noticed Chat was really popular in India. This was before WhatsApp, at a time when SMS wasn’t reliable. Eventually we pinpointed a region in India where one specific DNS provider was giving out the wrong IP addresses for our Chat servers.

So when people went to use Chat, they would sometimes get a notification that they had a message, and then it would disappear. Or they’d send a message and it would get lost. All because they were connecting to the wrong IP address. That was it! None of the sexy new tech we were working on was going to solve that problem. Ever.

Instead, the solution was to build observability that allowed us to track end-to-end message delivery. In the end, we could start with a broad cut of our data by country or web browser, and then zoom all the way in to look at what happened to a specific message for a specific user.

Once we pinpointed that the problem was with a DNS server, the matter was resolved with a quick phone call. I don’t know what they did, but I imagine it was something like turning it off and turning it on again.

We sometimes talk about observability as if it’s enough to buy a product like Datadog and just look at the pretty graphs. Sure, that’s a start. But true observability is a feature that needs to be built: painstakingly, iteratively, by definition starting with a shot in the dark.

These days, it has become fashionable to poo-poo the idea of being data-driven. People point out that measurement can distort the phenomenon that is being observed. They want to make processes “data-informed.” But this seems like silly backlash against the only rigorous standard in all of software engineering: that we hold ourselves to an objective standard. We measure how long things take, how many errors we encounter, how often a process successfully runs to completion.

So here’s what this experience taught me about observability: When an issue happens in production, time-box the investigation. Sure, take a few hours to try and figure it out by looking in the logs and inspecting the code. But if you’re coming to the end of the day and you still don’t have a fix, then push a PR that adds logging.

The first one may be just a guess, but it will begin a process that leads to the truth. And that is what we should all ultimately be striving for.

For more engineering tips and stories, follow me @dmwlff
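The post describes the workflow (a broad cut by country or browser, then a zoom into one message) but not the tooling behind it. As a rough illustration only, here is a minimal Python sketch of that idea. The event shape, the stage names "sent" and "delivered", and both helper functions are assumptions made up for this example, not anything taken from Facebook's actual system.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DeliveryEvent:
    """One log record for a message (hypothetical schema)."""
    message_id: str
    user_id: str
    stage: str      # "sent" or "delivered" (assumed stage names)
    country: str
    browser: str


def delivery_rate_by(events, dimension):
    """Broad cut: share of sent messages that were delivered, per value of `dimension`."""
    sent = defaultdict(set)
    delivered = defaultdict(set)
    for e in events:
        key = getattr(e, dimension)
        if e.stage == "sent":
            sent[key].add(e.message_id)
        elif e.stage == "delivered":
            delivered[key].add(e.message_id)
    return {k: len(delivered[k] & ids) / len(ids) for k, ids in sent.items()}


def trace(events, message_id):
    """Zoom in: every recorded stage for one message, in log order."""
    return [e.stage for e in events if e.message_id == message_id]


# Made-up sample data: one message from IN never reaches "delivered".
events = [
    DeliveryEvent("m1", "u1", "sent", "US", "chrome"),
    DeliveryEvent("m1", "u1", "delivered", "US", "chrome"),
    DeliveryEvent("m2", "u2", "sent", "IN", "firefox"),
    DeliveryEvent("m3", "u3", "sent", "IN", "chrome"),
    DeliveryEvent("m3", "u3", "delivered", "IN", "chrome"),
]

print(delivery_rate_by(events, "country"))  # the broad cut
print(trace(events, "m2"))                  # the single-message view
```

The point of the sketch is the shape of the data: once every stage is logged against a message ID, the same records answer both the aggregate question and the single-message question.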

582

4.1K

608.6K

Adam Wolff@dmwlff·3d

This whole thread is great, but this is the headline.

Simon Last@simonlast

2/ Think bigger. This is the most common mistake I see: tasks scoped too small. At this point you want to be aiming for work that would take a good engineer multiple weeks.

9.5K

Adam Wolff@dmwlff·9 May

@Lon @xen_studio @trq212 We are working on refining the controls for this. For now, you can opt out by setting CLAUDE_CODE_FORK_SUBAGENT=0

Lon()@Lon·9 May

I will ask again - how do we opt out of "experiments"?
- Why are we using A/B testing approaches on dev tools where users are actively building and have known-good workflows?
- What metrics can possibly measure whether one of these "experiments" is a success?

I did this kind of A/B work for years at Bing. I would never do this to users in DevDiv or AWS Aurora, AWS Spatial, AWS LumberYard or a dozen other dev related things I've shipped.

Thariq@trq212·29 Apr

@Lon CC @dmwlff

721

Adam Wolff@dmwlff·9 May

@Lon @xen_studio @trq212 Thank you for this feedback. While forked subagents are mostly useful, there are definitely cases (like adversarial review) where you don't want it. You should be able to prompt Claude to use a non-forked subagent.

Lon()@Lon·9 May

I didn't want to gripe about more than one thing at a time, but this is a huge behavior change to shove into subagents. I launch subagents to do targeted research that isn't polluted by the parent agent's context. What good does it do to have an agent investigate without supposition or assumption and it is carrying 200k tokens of existing context baggage?

Adam Wolff@dmwlff·8 May

That is not accurate. Background/foreground does not affect cache at all. Forked agents (which are coupled with this experiment) should be _more_ cache efficient because they share cache with the parent and require less context to be passed as output tokens from the parent and unique input tokens to the subagent.

xenstudio@xen_studio·8 May

@dmwlff @Lon @trq212 How is cache handled for background agents vs foreground agents? Seemed like background agents absolutely tanked quota. CC suggested it was because background agents couldn't use the cache of the primary agent, whereas foreground agents did. Is this accurate/expected behavior?

Adam Wolff@dmwlff·8 May

@Lon @xen_studio @trq212 The efficiency problem was fixed. Let me investigate the issue with permissions.

Lon()@Lon·8 May

@dmwlff @xen_studio @trq212 I don't understand why this experiment was paused and then put right back without solving these issues. PERMISSIONS DO NOT RELIABLY PROPAGATE! You worked on React. How appropriate is it for Dev Tools to actively change underneath you as you are trying to build using them?

Adam Wolff@dmwlff·8 May

It seems like a century ago in AI time, but last fall I gave a talk about some of the early design decisions and mistakes I made in Claude Code. Some of the points here are already a little dated (like opening your editor!) but hopefully still entertaining. infoq.com/presentations/…

18.9K

Adam Wolff@dmwlff·6 May

@tonyalphaseeker Wasn't me. This is the github user dmwlff. I'm wolffiex on github: github.com/wolffiex

459

Adam Wolff@dmwlff·1 May

@xen_studio @Lon @trq212 You need to set CLAUDE_CODE_FORK_SUBAGENT=0 to disable it. We have paused this experiment while we investigate ways to make it more efficient.

201

xenstudio@xen_studio·1 May

@Lon @trq212 @dmwlff If u do end up finding more appropriate way 2 remediate, if u remember give me tag. If u want to ensure CC session has it sourced, ask it to `printenv | grep CLAUDE` Again, not trying 2 assume u don't know this; but lots of beginners joining in, didn't want to miss instructions.
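For anyone scripting around this, the opt-out and the `printenv | grep CLAUDE` check from the thread can be sketched in Python. The variable name and the values 0 and 1 come from the posts above. The helper names are hypothetical, and how the CLI treats any other value is not stated in the thread.

```python
import os

FORK_VAR = "CLAUDE_CODE_FORK_SUBAGENT"  # variable name taken from the thread


def with_fork_disabled(environ):
    """Return a copy of `environ` with forked subagents opted out (value "0")."""
    env = dict(environ)
    env[FORK_VAR] = "0"
    return env


def claude_vars(environ):
    """Rough equivalent of `printenv | grep CLAUDE`: entries whose NAME=value text contains CLAUDE."""
    return {k: v for k, v in environ.items() if "CLAUDE" in f"{k}={v}"}


env = with_fork_disabled(os.environ)
print(claude_vars(env))
# To launch the CLI with this environment (assumes a `claude` binary on PATH):
#   subprocess.run(["claude"], env=env)
```

Building a copy of the environment rather than mutating `os.environ` keeps the opt-out scoped to the process you launch with it.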

175

Adam Wolff@dmwlff·28 Apr

@dani_avila7 @patricksrail Yeah! Test with CLAUDE_CODE_FORK_SUBAGENT=1 claude and let me know how it goes! Works with -p as of 2.1.121 as well.

124

Daniel San@dani_avila7·28 Apr

@dmwlff @patricksrail Excellent! Thanks, Adam. If you need a beta tester, I’m available! 🙋🏽‍♂️ We’re running some interesting workflows with subagents and skills, and experimenting with frontmatter

119

Daniel San@dani_avila7·28 Apr

Why is fork mode (CLAUDE_CODE_FORK_SUBAGENT=1) an env var instead of a frontmatter field on subagent definitions? The docs say it only intercepts the general-purpose subagent path, so I can't define a custom subagent that forks the parent context by design.

Skills already support context: fork in frontmatter. Why not subagents?

CC @trq212 @dmwlff @amorriscode

Daniel San@dani_avila7

x.com/i/article/2048…

149

29.8K

Adam Wolff@dmwlff·28 Apr

@noseratio @LonelyWolfsh Yes, we haven't yet brought this change to Windows. That still uses ripgrep.

Andrew Nosenko 🇦🇺 🇺🇦@noseratio·28 Apr

@LonelyWolfsh @dmwlff Not when PowerShell is enforced, which is possible by banning bash completely via claude config.

Adam Wolff@dmwlff·23 Apr

After 2.1.117, you may notice that Claude doesn't call its Grep or Glob Tool anymore. YES!!! It only took four months. It's faster than ever and it's all Bash. It's so much harder to take things away than to add them. Enjoy.

1.8K

346K

Adam Wolff@dmwlff·28 Apr

Thanks for digging in here. Skill context fork isn't quite the same thing because it doesn't have perfect cache sharing. Here are some new docs: code.claude.com/docs/en/sub-ag… (anchor: #fork-the-current-conversation)

The new Forked Subagent experience is a couple things bundled together:
- default general agent is forked, with guaranteed cache hit
- all subagents are backgrounded
- new subagent UI below prompt input
- /fork slash command

In addition to the opt-in env var, I'm starting an experiment today. I'll let you know how it goes! If you have feature requests, bug reports, or questions, please let me know!

104

Daniel San@dani_avila7·28 Apr

@patricksrail In this article I explain how the skill with fork works. Lmk if it makes sense

Daniel San@dani_avila7

x.com/i/article/2041…

618

Adam Wolff@dmwlff·25 Apr

@backnotprop @dexhorthy Yes, we had a crash on startup. We rolled back our pointers to 2.1.119 last night. Sorry folks! We'll be back Monday.

Michael Ramos@backnotprop·25 Apr

@dexhorthy @dmwlff used this to roll back. reddit.com/r/ClaudeAI/com…

Adam Wolff@dmwlff·23 Apr

@59thProfile @AnthonyOdie @noahzweben this is the goal, remove the harness completely

349

Spanky McDoob@59thProfile·23 Apr

can you guys get to the logical conclusion which is just to only give agent the shell and a vfs? and some md files. then it can make its own tools using unix primitives, use the files as essentially tools. less api calls. its actually unimaginable the gains you would get. idk why no one has done it yet

424

Adam Wolff@dmwlff·23 Apr

@dexhorthy In most cases we can recognize find and grep commands as read-only.

274

dex@dexhorthy·23 Apr

okay i have not dug in but my immediate take on claude allegedly removing Grep/Glob tools: this is great for anyone who uses --dangerously-skip-permissions or maybe the use-a-model-to-auto-approve-things

but for everyone else, now things that would always be allowed like Glob,Grep are now bash calls that need to be manually reviewed/approved, and last I checked the "yes and don't ask again" is pretty unreliable when bash commands contain pipelines like `| head` - `| tail`

like i said, haven't dug in. I would absolutely LOVE to be wrong about this

Adam Wolff@dmwlff

24.8K

Adam Wolff@dmwlff·23 Apr

@AnthonyOdie @eduferreyraok @noahzweben That was the hard part!! Claude's new grep is actually faster than ripgrep ;)

English

Anthony Odie@AnthonyOdie·23 Apr

@eduferreyraok @dmwlff @noahzweben Most dedicated Grep tools are built on ripgrep including Claude Code’s

Adam Wolff@dmwlff·23 Apr

@AnthonyOdie @noahzweben Claude is really good at find and grep and sometimes forgets to use the custom tools. This is more natural for Claude and generally simpler is better. Our permission system is now sophisticated enough to recognize when bash commands that Claude uses are read-only.

8.3K

Anthony Odie@AnthonyOdie·23 Apr

@dmwlff @noahzweben Why do this? Isn’t Grep/Glob better for permissions/speed/context efficiency? Just curious.

18.9K

Adam Wolff retweeted