Rihard Jarc@RihardJarc
People are bearish on memory, but the leaked Claude Code source code reveals additional memory demand that the market hasn't priced in, IMO.
1. The market thinks of AI memory demand as a server-side story: HBM on H100s/B200s for inference. What the bug reports in this code reveal is that the client side of AI coding agents is also extraordinarily memory-hungry: idle Claude Code processes growing to 15GB each, active sessions hitting 93-129GB. This matters because the feature flag pipeline (DAEMON, PROACTIVE, CRON) points toward future always-on background agents. If a developer has a persistent daemon agent running alongside their active sessions, you're looking at a baseline memory consumption of 15-30GB+ for Claude Code alone on a developer workstation - before they even open their IDE, browser, or anything else. That means either enterprise IT needs a big upgrade cycle to higher-RAM workstations, or even more memory-hungry workloads move to the cloud.
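To make the client-side arithmetic concrete, here's a back-of-envelope sketch using the figures from the bug reports above (15GB idle process, 93-129GB active session). The function name and all numbers are illustrative assumptions, not anything from the leaked code:

```python
# Rough client-side RAM floor for Claude Code on a developer workstation.
# Figures are the reported observations; the model itself is a sketch.

IDLE_DAEMON_GB = 15       # one idle Claude Code process (reported)
ACTIVE_SESSION_GB = 93    # low end of the reported active-session range

def baseline_ram_gb(idle_daemons: int, active_sessions: int) -> int:
    """RAM consumed by Claude Code alone, before IDE/browser/anything else."""
    return idle_daemons * IDLE_DAEMON_GB + active_sessions * ACTIVE_SESSION_GB

print(baseline_ram_gb(1, 0))  # one persistent daemon, nothing active: 15 GB
print(baseline_ram_gb(2, 0))  # two idle daemons already exceed a 16GB laptop: 30 GB
print(baseline_ram_gb(1, 1))  # daemon + one active session: 108 GB
```

Even the idle-only cases land in the 15-30GB band the post describes, which is where the workstation-upgrade-vs-cloud fork comes from.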
2. The Auto Dream consolidation feature runs background Claude sessions to clean up memory files. One observed consolidation took 8-9 minutes to process 913 sessions. In other words, a meaningful fraction of Anthropic's token consumption is the system managing its own memory, not the user doing productive work. As memory systems get more sophisticated (team sync, cross-session event buses, memory consolidation), this overhead grows. It's a recursive cost - more memory features require more inference to manage that memory. I don't think anyone is modeling this as a distinct line item in token consumption estimates.
3. 1M token context windows for Claude Code. Moving from 200K to 1M tokens of context is a 5x increase in KV cache memory per session on the server side. Combine that with multi-agent workflows (5-15x per user) and the proactive/daemon features (sessions that persist for hours or days instead of minutes), and you get a compounding memory demand curve that's steeper than the linear adoption growth many analysts model.
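The compounding in point 3 can be sketched in a few lines. KV cache memory scales roughly linearly with context length, so 200K to 1M is a 5x per-session factor, and the agent-count factor multiplies on top. The persistence factor is a placeholder assumption (longer-lived sessions hold their cache longer), not a number from the source:

```python
# Compounding server-side memory multipliers per user, relative to a
# single 200K-context session. All factors beyond the 5x context ratio
# are illustrative assumptions taken from the ranges in the post.

CONTEXT_FACTOR = 1_000_000 / 200_000   # KV cache scales ~linearly: 5x
AGENT_FACTOR_RANGE = (5, 15)           # concurrent agents per user (post's range)

def per_user_multiplier(agent_factor: float, persistence_factor: float = 1.0) -> float:
    """Memory demand per user vs. one short-lived 200K-context session."""
    return CONTEXT_FACTOR * agent_factor * persistence_factor

lo = per_user_multiplier(AGENT_FACTOR_RANGE[0])
hi = per_user_multiplier(AGENT_FACTOR_RANGE[1])
print(f"{lo:.0f}x - {hi:.0f}x per user")   # 25x - 75x, before persistence
```

Even before assigning any value to session persistence, the per-user multiplier is 25-75x, which is why the curve outpaces headcount growth.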
Memory demand per active user is increasing faster than user count, because each user's sessions are getting longer, wider (more agents), and deeper (larger context windows).