
Victoria Dickson
Deep|AI Infra 2026: Shifting from "Brain Power" Competition to "Whole-Body" Evolution

This is one of our most important reports this year, and our entire team invested significant time and effort in it. We observed that the OCS ratio in Scale-Up scenarios is still rising rapidly, and we found that $MRVL is involved not only in TPU but also in LPU.

In 2026, the focus of AI development has pivoted from chasing high benchmark scores to pursuing AI Agents capable of multi-step reasoning and autonomous action. This infrastructure arms race is undergoing a transformation akin to biological evolution. If an AI system is viewed as an evolving organism: the GPU/TPU is the calculating brain; memory and storage are the carriers of experience and context; the CPU is the hands coordinating tasks; and optics and networking are the limbs supporting system-wide data flow and responsiveness.

Under the framework of the Agent Scaling Law, the core bottleneck is no longer just the FLOPS of a single chip (brain power), but rather communication efficiency (limbs), the memory wall (memory), and Total Cost of Ownership (TCO).

- The "Brain" Idle Crisis: Even with the most powerful compute cores, if the "limbs" (communication) are underdeveloped, chips sit idle more than a third of the time waiting for data.
- The "Memory" Retrieval Bottleneck: Long-sequence reasoning for Agents places rigorous demands on KV Cache management; the performance of memory and storage components has become the deciding factor for an Agent's logical depth.
- Dimensional Evolution of the "Limbs": To overcome the communication bottlenecks inherent in MoE architectures, infrastructure is moving from 3D Torus toward high-dimensional topologies (up to 10D). Networking investment now matches or even surpasses that in compute chips.
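The KV Cache pressure behind the "Memory" bottleneck can be illustrated with a standard back-of-the-envelope sizing formula (2 tensors, K and V, per layer per token). The model dimensions below are hypothetical examples, not figures from the report:

```python
# Back-of-the-envelope KV cache sizing for long-context agent reasoning.
# Model parameters are illustrative assumptions, not taken from the report.

def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, dtype_bytes=2):
    """Per-sequence KV cache: 2 (K and V) x layers x KV heads x head_dim x tokens."""
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * dtype_bytes

# Hypothetical 70B-class model with grouped-query attention, 128k-token context:
size = kv_cache_bytes(num_layers=80, num_kv_heads=8, head_dim=128, seq_len=128_000)
print(f"{size / 2**30:.1f} GiB per sequence")  # → 39.1 GiB per sequence
```

Even with aggressive head grouping, a single long-context sequence can consume tens of GiB of fast memory, which is why KV Cache management bounds an agent's practical reasoning depth.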
This report outlines the bottlenecks facing AI Agents and recent TPU progress, specifically exploring how Google TPU optimizes "whole-body" coordination through vertical integration. We argue that:

- Networking is the new core battlefield: To solve MoE All-to-All bottlenecks, Google is significantly expanding scale-out bandwidth and shifting from 3D Torus to higher dimensions.
- Unlocking TCO and allocation efficiency: Through proprietary architecture and vertical integration, the TPU v7 rack cost is significantly lower than that of the NVIDIA GB200. This efficiency gain frees up CapEx for growth in optical communications and memory.

$LITE $NOK $CRDO

Detailed report: fundaai.substack.com/p/deepai-infra…
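The scale of the MoE All-to-All bottleneck can likewise be sketched with a rough per-device traffic estimate under uniform expert routing; all parameters below (token count, hidden size, top-k, parallelism degree) are illustrative assumptions:

```python
# Rough per-device egress for one MoE expert-parallel all-to-all dispatch.
# Assumes uniform routing: a fraction (N - 1) / N of routed tokens go to
# remote devices. All numbers are illustrative, not from the report.

def alltoall_bytes_per_device(tokens, hidden_dim, top_k, num_devices, dtype_bytes=2):
    """Bytes each device sends out per dispatch: every local token's hidden
    state is forwarded to top_k experts, most of which live on other devices."""
    routed = tokens * top_k * hidden_dim * dtype_bytes
    return routed * (num_devices - 1) // num_devices

# Hypothetical: 8k tokens/device, 7168 hidden dim, top-8 routing, 64-way expert parallelism
gib = alltoall_bytes_per_device(8192, 7168, 8, 64) / 2**30
print(f"{gib:.2f} GiB egress per device per dispatch")  # → 0.86 GiB
```

Because this traffic recurs at every MoE layer in both forward and backward passes, aggregate all-to-all volume scales quickly with depth and batch size, which is why scale-out bandwidth and topology dimensionality have become first-order design variables.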




@winnie_the_punk I don't know, imagine you run into some kind of trouble, which would you prefer? Being forced to sell assets or take out a loan (which you're not even sure would be approved)? Or simply pulling cash out of your savings?