
Recently spent time updating an older analysis of where AI demand is actually going, and I came away still thinking we're massively short on compute (~8–50x) for consumer inference alone. Big range (the future is humbling), but even the low end makes the point. I dropped a link to the fuller write-up below for anyone inclined over a slow week; it also hits a few popular debates plus my steelman AI bear case. Some of this may be optimistic (or wrong). I'm a dreamer, so be kind :)

Consumer is easiest to parameterize. If we're massively short just on that, you start to understand why the biggest players are building so aggressively.

Framework: tokens are the kWh of knowledge work. Demand scales as price drops: new workloads emerge, and we move from 100-token prompts to agentic loops + multimodal + "robotic episodes" that consume orders of magnitude more tokens per task. (Toy demand math in the first sketch below.)

Supply: we've installed mid-teens GW of frontier compute using Jensen's rule of thumb; other accounts suggest it may already be closer to mid-20s GW. Either way, it sounds huge until you realize cluster-level effective performance is ~5–10% of chip specs once you net out site power overhead, MFU, and fleet mix. (Derating chain in the second sketch below.)

Steelman bear: AI creates a massive shadow output gap, but much of it is competed away or shows up as deflation/consumer surplus rather than immediate EPS gains.

More detail in the write-up: drive.google.com/file/d/1n8WcKs…

Appendix (topics covered):
• TPU vs GPU
• China/Huawei
• Robotics + world models

$nvda $orcl $crwv $nbis
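Two back-of-envelope sketches for the framework and supply paragraphs above. First, the demand side: every number here (user count, tokens per workload, frequencies, assumed supply) is an illustrative assumption of mine, not an input from the write-up; the point is only how agentic/multimodal/robotic workloads blow up token demand relative to 100-token prompts.

```python
# Toy model: consumer inference demand as workloads shift from short
# prompts to agentic/multimodal episodes. All numbers are illustrative
# assumptions, not figures from the write-up.

DAILY_USERS = 1e9  # assumed daily consumer users of AI

# (tokens per interaction, interactions per user per day) -- assumptions
workloads = {
    "chat prompt":        (1e2, 10),   # ~100-token prompts
    "agentic loop":       (1e5, 2),    # multi-step tool use
    "multimodal session": (1e6, 0.5),  # image/video in and out
    "robotic episode":    (1e7, 0.1),  # continuous perception + planning
}

daily_tokens = sum(
    DAILY_USERS * tokens * freq for tokens, freq in workloads.values()
)
print(f"Implied demand: {daily_tokens:.2e} tokens/day")

# Against an assumed supply S (tokens/day), the shortfall multiple is
# demand / S. With these placeholder numbers it lands at ~17x, i.e.
# inside the post's 8-50x range -- by construction, not by evidence.
ASSUMED_SUPPLY = 1e14  # tokens/day, placeholder
print(f"Shortfall multiple: {daily_tokens / ASSUMED_SUPPLY:.1f}x")
```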

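Second, the supply side: how "~5–10% of chip specs" can fall out of a few multiplicative haircuts. Each factor below is an assumed illustrative value (the write-up may slice these differently):

```python
# Derating chain from chip-spec performance to cluster-level effective
# performance. Each factor is an illustrative assumption, not a measurement.

INSTALLED_GW = 15.0        # mid-teens GW of frontier compute (per the post)

usable_it_power = 0.70     # assumed: net of cooling, networking, non-GPU site overhead
mfu = 0.30                 # assumed model FLOPs utilization in production serving
frontier_fleet_mix = 0.40  # assumed share of the fleet at frontier-chip spec

effective_fraction = usable_it_power * mfu * frontier_fleet_mix
print(f"Effective fraction of spec: {effective_fraction:.1%}")  # ~8.4%, inside 5-10%

effective_gw = INSTALLED_GW * effective_fraction
print(f"Spec-equivalent effective compute: {effective_gw:.2f} GW")  # ~1.26 GW
```

Three modest-looking haircuts multiply down to under a tenth of nameplate, which is why the installed-GW headline overstates usable supply.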










