Adam Wolff

1.6K posts

@dmwlff

Claude Code @AnthropicAI 🤖 Avid cook, dedicated snow person, yoga enthusiast

Joined February 2009
615 Following · 21.6K Followers
Pinned Tweet
Adam Wolff
Adam Wolff@dmwlff·
I ended my time at @Meta as a director. But I started as an engineer on FB Chat. Everything about it was broken, and we had to rewrite it. And while the effort to fix it is one of the projects that led to @reactjs, the most important fix was far simpler... Here’s the full story:

I worked on Facebook Chat for several years, both on the front end and the infrastructure. Before the major effort to redo the UI, FB Chat was super broken and we had no idea why. We got tons of bug reports about Chat being broken every day, but we noticed an odd pattern in the data: the volume of reports didn’t match the volume of usage. It was time-shifted from the peaks we’d see in the US.

We didn’t know what was wrong, but we knew the code was a mess. We set about rewriting both the front-end and the back-end in an effort to fix it. The front-end rewrite pulled in a whole team of amazing engineers and became one of the big threads that led to @reactjs.

In the public eye, we portrayed this project as the one that ultimately fixed Chat. And the way I’ve usually told it, fixing Facebook Chat and the birth of React are the same story. But no framework was going to fix the worst problem with Chat.

During the time we were working on the Chat rewrite, we were also replacing the original Erlang backend with one written in C++. This was probably a good move, but the problem wasn’t with Erlang either. Our initial spec for the new backend didn’t say much about observability, but it was an important feature, and the rewrite forced us to rebuild it. Little did we know this would lead us to the root cause of our problems…

When we finally gained insight into our deliverability data, we were able to cut it by region. We noticed Chat was really popular in India. This was before WhatsApp, at a time when SMS wasn’t reliable. Eventually we pinpointed a region in India where one specific DNS provider was giving out the wrong IP addresses for our Chat servers.

So when people went to use Chat, they would sometimes get a notification that they had a message, and then it would disappear. Or they’d send a message and it would get lost. All because they were connecting to the wrong IP address. That was it! None of the sexy new tech we were working on was going to solve that problem. Ever.

Instead, the solution was to build observability that allowed us to track end-to-end message delivery. In the end, we could start with a broad cut of our data by country or web browser, and then zoom all the way in to look at what happened to a specific message for a specific user. Once we pinpointed that the problem was with a DNS server, the matter was resolved with a quick phone call. I don’t know what they did, but I imagine it was something like turning it off and turning it on again.

We sometimes talk about observability as if it’s enough to buy a product like Datadog and just look at the pretty graphs. Sure, that’s a start. But true observability is a feature that needs to be built: painstakingly, iteratively, and by definition starting with a shot in the dark.

These days, it has become fashionable to pooh-pooh the idea of being data-driven. People point out that measurement can distort the phenomenon that is being observed. They want to make processes “data-informed.” But this seems like a silly backlash against the only rigorous standard in all of software engineering: that we hold ourselves to an objective standard. We measure how long things take, how many errors we encounter, how often a process successfully runs to completion.

So here’s what this experience taught me about observability: When an issue happens in production, time-box the investigation. Sure, take a few hours to try and figure it out by looking in the logs and inspecting the code. But if you’re coming to the end of the day and you still don’t have a fix, then push a PR that adds logging. The first one may be just a guess, but it will begin a process that leads to the truth. And that is what we should all ultimately be striving for.

For more engineering tips and stories, follow me @dmwlff
English
95
582
4.1K
608.6K
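The pinned thread above describes delivery tracking that starts with a broad cut (by country or browser) and zooms in to a single message. A minimal sketch of that idea follows, assuming a hypothetical `DeliveryEvent` record with `region` and `stage` fields. None of these names come from Facebook's actual system; they are illustrative only.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DeliveryEvent:
    # All fields are hypothetical; a real pipeline would carry many more.
    message_id: str
    user_id: str
    region: str  # broad dimension, e.g. country or DNS provider
    stage: str   # "sent" or "delivered"


def delivery_rate_by_region(events):
    """Broad cut: fraction of sent messages that were delivered, per region."""
    sent = defaultdict(set)
    delivered = defaultdict(set)
    for e in events:
        if e.stage == "sent":
            sent[e.region].add(e.message_id)
        elif e.stage == "delivered":
            delivered[e.region].add(e.message_id)
    return {
        region: len(delivered[region] & ids) / len(ids)
        for region, ids in sent.items()
    }


def trace_message(events, message_id):
    """Zoom in: the stages recorded for one message, in logged order."""
    return [e.stage for e in events if e.message_id == message_id]


events = [
    DeliveryEvent("m1", "u1", "US", "sent"),
    DeliveryEvent("m1", "u1", "US", "delivered"),
    DeliveryEvent("m2", "u2", "IN", "sent"),
    DeliveryEvent("m3", "u3", "IN", "sent"),
    DeliveryEvent("m3", "u3", "IN", "delivered"),
]
rates = delivery_rate_by_region(events)
```

A region whose rate sits well below the others is the cue to call `trace_message` on one of its undelivered messages and see where the trail stops.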
Adam Wolff
Adam Wolff@dmwlff·
@Lon @xen_studio @trq212 We are working on refining the controls for this. For now, you can opt out by setting CLAUDE_CODE_FORK_SUBAGENT=0
English
0
0
1
61
Lon()
Lon()@Lon·
I will ask again - how do we opt out of "experiments"? - Why are we using A/B testing approaches on dev tools where users are actively building and have known-good workflows? - What metrics can possibly measure whether one of these "experiments" is a success? I did this kind of A/B work for years at Bing. I would never do this to users in DevDiv or AWS Aurora, AWS Spatial, AWS LumberYard or a dozen other dev related things I've shipped.
English
1
0
0
25
Adam Wolff
Adam Wolff@dmwlff·
@Lon @xen_studio @trq212 Thank you for this feedback. While forked subagents are mostly useful, there are definitely cases (like adversarial review) where you don't want it. You should be able to prompt Claude to use a non-forked subagent.
English
1
0
2
48
Lon()
Lon()@Lon·
I didn't want to gripe about more than one thing at a time, but this is a huge behavior change to shove into subagents. I launch subagents to do targeted research that isn't polluted by the parent agent's context. What good does it do to have an agent investigate without supposition or assumption if it is carrying 200k tokens of existing context baggage?
English
1
0
0
20
Adam Wolff
Adam Wolff@dmwlff·
That is not accurate. Background/foreground does not affect cache at all. Forked agents (which are coupled with this experiment) should be _more_ cache efficient because they share cache with the parent and require less context to be passed as output tokens from the parent and unique input tokens to the subagent.
English
5
0
3
33
xenstudio
xenstudio@xen_studio·
@dmwlff @Lon @trq212 How is cache handled for background agents vs foreground agents? Seemed like background agents absolutely tanked quota. CC suggested it was because background agents couldn't use the cache of the primary agent, whereas foreground agents did. Is this accurate/expected behavior?
English
1
0
0
26
Lon()
Lon()@Lon·
@dmwlff @xen_studio @trq212 I don't understand why this experiment was paused and then put right back without solving these issues. PERMISSIONS DO NOT RELIABLY PROPAGATE! You worked on React. How appropriate is it for Dev Tools to actively change underneath you as you are trying to build using them?
English
1
0
2
41
Adam Wolff
Adam Wolff@dmwlff·
It seems like a century ago in AI time, but last fall I gave a talk about some of the early design decisions and mistakes I made in Claude Code. Some of the points here are already a little dated (like opening your editor!) but hopefully still entertaining. infoq.com/presentations/…
English
3
4
54
18.9K
Adam Wolff
Adam Wolff@dmwlff·
@xen_studio @Lon @trq212 You need to set CLAUDE_CODE_FORK_SUBAGENT=0 to disable it. We have paused this experiment while we investigate ways to make it more efficient.
English
2
0
3
201
xenstudio
xenstudio@xen_studio·
@Lon @trq212 @dmwlff If u do end up finding more appropriate way 2 remediate, if u remember give me tag. If u want to ensure CC session has it sourced, ask it to `printenv | grep CLAUDE` Again, not trying 2 assume u don't know this; but lots of beginners joining in, didn't want to miss instructions.
English
1
0
0
175
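For anyone launching Claude Code from a script rather than an interactive shell, the opt-out and the `printenv | grep CLAUDE` check from the replies above can be sketched in Python. The variable name `CLAUDE_CODE_FORK_SUBAGENT` comes from the thread. The helper names `env_with_fork_flag` and `claude_vars` are hypothetical, and this sketch only prepares and inspects an environment; it does not start `claude`.

```python
import os

FORK_FLAG = "CLAUDE_CODE_FORK_SUBAGENT"  # variable name as given in the thread


def env_with_fork_flag(enabled, base=None):
    """Return a copy of an environment with the flag set to "1" or "0"."""
    env = dict(os.environ if base is None else base)
    env[FORK_FLAG] = "1" if enabled else "0"
    return env


def claude_vars(env=None):
    """Rough analogue of `printenv | grep CLAUDE` for a given environment."""
    env = os.environ if env is None else env
    return {k: v for k, v in env.items() if "CLAUDE" in k or "CLAUDE" in v}


# Example with an explicit base so the result does not depend on this machine.
child_env = env_with_fork_flag(False, base={"PATH": "/usr/bin"})
```

The resulting dictionary could then be passed as `env=child_env` to `subprocess.run(["claude"], ...)`, assuming the `claude` executable is on `PATH`.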
Adam Wolff
Adam Wolff@dmwlff·
@dani_avila7 @patricksrail Yeah! Test with CLAUDE_CODE_FORK_SUBAGENT=1 claude and let me know how it goes! Works with -p as of 2.1.121 as well.
English
0
0
1
124
Daniel San
Daniel San@dani_avila7·
@dmwlff @patricksrail Excellent! Thanks, Adam. If you need a beta tester, I’m available! 🙋🏽‍♂️ We’re running some interesting workflows with subagents and skills, and experimenting with frontmatter
English
1
0
1
119
Daniel San
Daniel San@dani_avila7·
Why is fork mode (CLAUDE_CODE_FORK_SUBAGENT=1) an env var instead of a frontmatter field on subagent definitions? The docs say it only intercepts the general-purpose subagent path, so I can't define a custom subagent that forks the parent context by design Skills already support context: fork in frontmatter. Why not subagents? CC @trq212 @dmwlff @amorriscode
Daniel San@dani_avila7

x.com/i/article/2048…

English
6
13
149
29.8K
Adam Wolff
Adam Wolff@dmwlff·
After 2.1.117, you may notice that Claude doesn't call its Grep or Glob Tool anymore. YES!!! It only took four months. It's faster than ever and it's all Bash. It's so much harder to take things away than to add them. Enjoy.
English
63
63
1.8K
346K
Adam Wolff
Adam Wolff@dmwlff·
Thanks for digging in here. Skill context fork isn't quite the same thing because it doesn't have perfect cache sharing. Here are some new docs: code.claude.com/docs/en/sub-ag… (#fork-the-current-conversation)
The new Forked Subagent experience is a couple things bundled together:
- default general agent is forked, with guaranteed cache hit
- all subagents are backgrounded
- new subagent UI below prompt input
- /fork slash command
In addition to the opt-in env var, I'm starting an experiment today. I'll let you know how it goes! If you have feature requests, bug reports, or questions, please let me know!
English
1
0
3
104
Adam Wolff
Adam Wolff@dmwlff·
@backnotprop @dexhorthy Yes, we had a crash on startup. We rolled back our pointers to 2.1.119 last night. Sorry folks! We'll be back Monday.
English
1
0
3
97
Spanky McDoob
Spanky McDoob@59thProfile·
can you guys get to the logical conclusion which is just to only give agent the shell and a vfs? and some md files. then it can make its own tools using unix primitives, use the files as essentially tools. less api calls. its actually unimaginable the gains you would get. idk why no one has done it yet
English
2
0
1
424
Adam Wolff
Adam Wolff@dmwlff·
@dexhorthy In most cases we can recognize find and grep commands as read-only.
English
0
0
1
274
dex
dex@dexhorthy·
okay i have not dug in but my immediate take on claude allegedly removing Grep/Glob tools: this is great for anyone who uses --dangerously-skip-permissions or maybe the use-a-model-to-auto-approve-things but for everyone else, now things that would always be allowed like Glob,Grep are now bash calls that need to be manually reviewed/approved, and last I checked the "yes and don't ask again" is pretty unreliable when bash commands contain pipelines like `| head` - `| tail` like i said, haven't dug in. I would absolutely LOVE to be wrong about this
Adam Wolff@dmwlff

After 2.1.117, you may notice that Claude doesn't call its Grep or Glob Tool anymore. YES!!! It only took four months. It's faster than ever and it's all Bash. It's so much harder to take things away than to add them. Enjoy.

English
14
4
99
24.8K
Adam Wolff
Adam Wolff@dmwlff·
@AnthonyOdie @noahzweben Claude is really good at find and grep and sometimes forgets to use the custom tools. This is more natural for Claude and generally simpler is better. Our permission system is now sophisticated enough to recognize when bash commands that Claude uses are read-only.
English
3
0
49
8.3K
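To make the idea of "recognizing when bash commands are read-only" concrete, here is a deliberately conservative sketch. It is not Claude Code's permission logic, which the thread does not describe. The allow-list, the unsafe `find` flags, and the function name `is_read_only` are all assumptions chosen for illustration. It accepts only pipelines whose every stage is an allow-listed command, and rejects redirection, command chaining, substitution, and `find` actions that write or execute.

```python
import shlex

# Hypothetical allow-list; a production system would need far more care.
READ_ONLY_COMMANDS = {"cat", "find", "grep", "head", "ls", "tail", "wc"}
# find actions that can modify files or run other programs.
UNSAFE_FIND_FLAGS = {"-delete", "-exec", "-execdir", "-ok", "-okdir",
                     "-fprint", "-fprint0", "-fprintf", "-fls"}
OPERATOR_CHARS = set("();<>|&")


def is_read_only(command):
    """Return True only if every pipeline stage is an allow-listed command."""
    # Conservative: never approve substitutions or multi-line input.
    if any(ch in command for ch in "$`\n"):
        return False
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    lexer.commenters = ""
    try:
        tokens = list(lexer)
    except ValueError:  # e.g. unbalanced quotes
        return False

    segments, current = [], []
    for tok in tokens:
        if tok == "|":
            # A quoted lone '|' also lands here; that only causes a rejection.
            segments.append(current)
            current = []
        elif tok and set(tok) <= OPERATOR_CHARS:
            return False  # redirection, &&, ||, ;, subshells
        else:
            current.append(tok)
    segments.append(current)

    for seg in segments:
        if not seg or seg[0] not in READ_ONLY_COMMANDS:
            return False
        if seg[0] == "find" and UNSAFE_FIND_FLAGS & set(seg[1:]):
            return False
    return True
```

Under these assumptions, `grep -rn TODO src | head -20` is accepted while `find . -name '*.tmp' -delete` and `grep foo file > out.txt` are not. The design choice is to err toward asking the user: anything the sketch cannot parse with confidence is treated as not read-only.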
Anthony Odie
Anthony Odie@AnthonyOdie·
@dmwlff @noahzweben Why do this? Isn’t Grep/Glob better for permissions/speed/context efficiency? Just curious.
English
3
0
41
18.9K
Adam Wolff retweeted
ClaudeDevs
ClaudeDevs@ClaudeDevs·
For the developers building with Claude, a direct line from the team. Follow for changelogs, API releases, community updates, and deep dives.
English
618
1.6K
21.5K
8.9M
JeffMo
JeffMo@lbljeffmo·
@dmwlff Anything I could do to provide insights or feedback or debug info when I experience it?
English
1
0
0
46