Jukan@jukan05
Today's biggest semiconductor industry news is that Gerard has founded a new ARM CPU company, Nuvacore, five years after Nuvia was acquired by Qualcomm. Even the name closely echoes the predecessor.
Starting a datacenter CPU company at this point in time means catching what is arguably the best decade for CPUs in recent memory. The shortage wave that AI agents will usher in is already faintly visible, and several AWS customers have gone so far as to request entire blocks of Graviton ARM CPU capacity wholesale.
The pull this news exerts on Silicon Valley's semiconductor crowd is enormous. Nuvacore's lineup is stacked exclusively with proven star players: the original Nuvia founding team reunited, backed by Sequoia, building general-purpose ARM CPUs aimed at AI infrastructure and agentic computing. When Nuvia started, the ARM server direction was entirely unvalidated, yet it ended in resounding success. Now ARM CPU servers sit squarely in the eye of the storm. The scale of the outlook and the imagination it invites are incomparable to Nuvia in 2019.
Last time Gerard poached architectural heavyweights from Google's and Apple's platform architecture teams; this time his pitch will land far harder. A $240M raise, a proven playbook, a reassembled founding team, and the next wave of growth visible straight ahead — all of it conspires to make Nuvacore the hottest, most sought-after semiconductor startup in the Bay Area, and by some distance. This is, after all, a chance at financial freedom right in front of you with an exceptionally favorable risk/reward profile.
The first-generation Nuvia CPU announcement landed in the Apple M2 era and was genuinely shocking. Nuvia pushed Qualcomm's CPU benchmark scores forward by a full three generations in a single year, with single-core scores climbing from 2300 to 3200, decisively outpacing the Apple M2 Max.
The disappointment was that Nuvia's Phoenix core took far too long to get from announcement to actual launch. In that interval Apple kept squeezing out incremental generations, shipping M3 and M4 back to back. By the time Nuvia's CPU finally shipped, it was being benchmarked against the M4 rather than the M2, and it went from anticipated protagonist to mere background.
Nuvia's foresight back then was well ahead of the field. In 2019, ARM CPU server market share was essentially zero, and Nuvia set out to crack this market from scratch. After the $1.4B acquisition by Qualcomm in 2021, Qualcomm pledged unlimited resources, went on an aggressive hiring expansion, and offered compensation at the very top of the market.
Unfortunately, the external environment deteriorated sharply in 2022, and on top of that, Qualcomm's leadership displayed strategic myopia of the worst kind. Just as the ARM server ecosystem was beginning to stretch its limbs, Qualcomm dissolved its own Nuvia CPU server team for the sake of share price and cost reduction (the second time — they had already dismantled an ARM server team back in 2015).
Only in 2025 did Qualcomm belatedly and tentatively restart its ARM server project. By then NVIDIA's Grace ARM CPU had been announced four years earlier, the Vera ARM CPU had long been in internal development, and Amazon's Graviton accounted for nearly 50% of new server CPU shipments.
So Gerard probably walked away from a senior Qualcomm position to regroup the original founding team and do this himself because Qualcomm's leadership has been so short-sighted that it kept missing the window. Qualcomm's commitments to Nuvia's ARM server ambitions turned into bad checks the moment the macro turned, and after the acquisition the project was cancelled outright and redirected toward laptop and smartphone chips.
Further, Qualcomm is bracing for a meaningful hit to smartphone sales this year from the historic surge in memory and storage prices (the market is contracting roughly 30%), and expansion budgets are constrained, so the resources available internally are limited.
A startup also offers faster decisions, a more focused team, full authority over product definition, and a cleaner capital story than a large platform like Qualcomm can. That is why, with the window already validated, the original members have been reassembled.
But the more fundamental reason is probably that the CPU outlook in the AI era has genuinely become so vast that jumping back in is worth it. Gerard hasn't changed — the external market has.
Since entering 2025, the arrival of AI agents has been quietly turning the CPU back into a bottleneck. CPU servers have re-entered a growth trajectory, and the latent potential is considerable. A few factors:
1. As the inference era arrives, GPUs have evolved into inference-specialized system-level architectures, while CPUs have become the perpetually busy orchestrator. The pursuit of token throughput multiplies the number of heterogeneous compute stages, batch sizes grow, scheduling / routing / dataflow complexity rises, and orchestration requirements scale accordingly.
So in system-level heterogeneous inference architectures, the CPU-to-GPU ratio has turned far more aggressive: from the previous 1:4, to 1:2 in Grace Blackwell, and likely heading toward 1:1. Google TPU pairs with Axion, Amazon Trainium with Graviton, NVIDIA Rubin with its own Vera CPU.
I wrote about this in last November's semiconductor year-end review, and by 2026 it has essentially become consensus. That said, this piece is mostly captured by each company's in-house AI chip efforts, so it's hard to count as pure CPU server demand, and it's ambiguous whether it constitutes an opportunity for the external CPU server market.
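To make the ratio shift in point 1 concrete, here is a minimal sketch of how many host CPUs a fixed pool of GPUs implies at each ratio mentioned above. The 72-GPU pool size is an illustrative assumption, not a figure from this post, and the helper name `cpus_needed` is hypothetical.

```python
import math
from fractions import Fraction

# CPU-to-GPU ratios named in the post, expressed as CPUs per GPU.
RATIOS = {
    "1:4 (previous)": Fraction(1, 4),
    "1:2 (Grace Blackwell)": Fraction(1, 2),
    "1:1 (possible direction)": Fraction(1, 1),
}


def cpus_needed(gpus: int, cpus_per_gpu: Fraction) -> int:
    """Host CPUs required for a GPU pool, rounded up to whole CPUs."""
    return math.ceil(gpus * cpus_per_gpu)


GPUS = 72  # assumed pool size, for illustration only

for label, ratio in RATIOS.items():
    print(f"{label}: {cpus_needed(GPUS, ratio)} CPUs for {GPUS} GPUs")
```

Under these assumptions, moving from 1:4 to 1:1 quadruples the CPU count attached to the same number of GPUs, before any growth in the GPU fleet itself.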
2. Also from that same year-end review:
Viewed from the CPU angle, agentic workloads run routing and tool handling entirely on CPU. Profile popular agentic frameworks like SWE-Agent, LangChain, and Toolformer and you'll find that CPU can account for up to 90% of end-to-end latency; throughput bottlenecks land on the CPU more often than not; and CPU energy consumption can exceed 40% of total energy draw.
Agentic AI, today, is more of a CPU-bottlenecked workload. Agents manage numerous CPUs, and they need to spin up sandboxes frequently — this has a very strong chance of triggering a fresh recovery cycle in CPU demand.
Looking back now, the latent potential of this thesis I wrote last year is substantial. But it hadn't materialized at scale at the start of this year. The CPU growth and the "CPU shortage wave" that various players have been talking about early this year aren't directly tied to this logic — they're more about backfilling the traditional CPU server under-investment that accumulated from years of over-allocating capex to GPUs.
In H2, and certainly into 2027, agents will spread much more broadly. Smart shopping assistants and customer service agents, for instance, already accounted for a significant share of Amazon's 1M CPU procurement late last year, and growth in this segment is very fast.
The first two threads are essentially the mainstream consensus when this year's market talks about CPU potential. But there are two more threads I think are underappreciated:
3. The main thread that makes CPU server potential bigger and longer-lived may not be directly tied to agents themselves. It's a byproduct of code agents.
As the barrier to coding has dropped and its speed has risen dramatically, the entire cost of "building software + connecting software + calling software + automating software" has collapsed by an order of magnitude. On the software supply side, the Jevons paradox kicks in: the world is pushed toward higher software density and higher API density, which in turn translates directly into a linear increase in traditional CPU workloads.
Since late 2025, coding agents have undergone a qualitative shift. Claude Code has seen explosive growth, with token revenue tripling in three months. The next phase will inevitably be a 10x increase in code volume and an explosion in the number of apps.
Even at the large enterprise level, daily consumption of 1M tokens is merely average, and per-capita coding output will inevitably at least double (at small companies, 10x). The surge in code supply doesn't stay inside repos — it gradually becomes long-running software assets, meaning more long-lived features, more products, more microservices, more APIs.
Over the long run, the total production cost and production cycle of every app and API collapse to about 10% of their original levels, and APIs become extraordinarily abundant. API usage then rises in bulk, which drives a bulk rise in traditional CPU workloads — in CPU seconds. This doesn't even have a direct tie to "agentic."
On the time axis, this thread is not short-dated. Claude Code's inflection is only a few months old, and the product releases, microservice releases, and API releases that follow are necessarily delayed further out in time. Software getting cheaper doesn't mean society uses less software — it means more work gets shifted onto software.
So in H2 or later, we'll probably see a phenomenon where traditional CPU cloud demand mysteriously rises again — one that on the surface doesn't even appear directly linked to AI agents.
4. CPU is technically very hard to deflate. Unlike memory/storage, where compression algorithms can shrink the per-task footprint, CPU work has no equivalent lever. Growth in CPU workloads translates into real hardware demand growth.
For example, KV cache sees new compression techniques each year. An older one, grouped-query attention (GQA), in which a group of query heads shares a single KV head, delivers roughly a 4x compression. Last year's TurboQuant-style techniques added several more multiples of compression. Add data precision dropping from FP16 toward FP4 as the next step, and the precision reduction compresses the KV cache further. That is technical deflation on the storage side.
But on CPU, the technical deflation headroom is very thin. Every increment in today's agentic CPU workload (CPU seconds) converts into real hardware demand, essentially in full. The only deflationary factor amounts to the 10–15% benchmark score uplift per generation per year. What about other deflationary factors, like cloud's 5–6x over-sell? No — because CPU has always been over-sold to begin with. The over-sell / underutilization technical deflation on CPU will not widen further from here, and every incremental CPU second converts linearly into hardware with no meaningful discount.
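A rough sketch of the asymmetry in point 4, under loudly stated assumptions: the model shape (32 layers, 32 query heads, head dimension 128, 8192 tokens) and the 2x workload growth are made up for illustration, and the function names are hypothetical. Only the 4x KV-head sharing, the FP16-to-FP4 step, and the 10–15% yearly uplift come from the text above.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, tokens, bytes_per_value):
    """KV cache size: K and V tensors per layer, one head_dim vector per KV head per token."""
    return 2 * layers * kv_heads * head_dim * tokens * bytes_per_value


# Assumed model shape, for illustration only.
LAYERS, QUERY_HEADS, HEAD_DIM, TOKENS = 32, 32, 128, 8192

# Baseline: one KV head per query head, FP16 (2 bytes per value).
baseline = kv_cache_bytes(LAYERS, QUERY_HEADS, HEAD_DIM, TOKENS, 2.0)
# Every 4 query heads share one KV head (the ~4x case from the text).
shared = kv_cache_bytes(LAYERS, QUERY_HEADS // 4, HEAD_DIM, TOKENS, 2.0)
# Same sharing, with values stored in FP4 (0.5 bytes per value).
shared_fp4 = kv_cache_bytes(LAYERS, QUERY_HEADS // 4, HEAD_DIM, TOKENS, 0.5)

kv_compression = baseline / shared_fp4


def cpu_hardware_multiplier(workload_growth, uplift_per_year, years=1):
    """Hardware needed relative to today when only per-generation uplift offsets growth."""
    return workload_growth / (1 + uplift_per_year) ** years


# Assumed 2x growth in CPU seconds, offset by the 15% upper-end yearly uplift.
cpu_multiplier = cpu_hardware_multiplier(2.0, 0.15)

print(f"KV cache footprint shrinks {kv_compression:.0f}x")
print(f"CPU hardware still grows {cpu_multiplier:.2f}x")
```

With these numbers, head sharing and lower precision stack multiplicatively on the memory side, while a doubling of CPU seconds still needs roughly 1.7x the hardware even at the optimistic end of the uplift range.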
ARM's guidance is that the CPU supply-demand gap could widen to 30%+. Stack these factors, add the fact that AI servers are cannibalizing CPU server capacity and order slots, and the gap could grow further, with hyperscalers' responses likely running a step behind.
At the same time as aggregate CPU demand potential is rising, ARM server CPU is living through its best era in history.
Roughly 50% of the new traditional server CPUs hyperscalers are deploying for cost optimization are ARM — Google's Axion, Amazon's Graviton, Microsoft's Cobalt. Graviton's 2026 capacity is already fully sold, with the bottleneck having shifted to capacity itself.
Google TPU pairs with Axion, Amazon Trainium with Graviton, NVIDIA Rubin with its own Vera. The collective rotation of these CPUs toward ARM is driven by more than cost. As inference systems push token throughput to the limit, batches get larger and more complex, and under those conditions, ARM CPU in-house development plus system-level HW-SW co-design is much more convenient. NVIDIA, for instance, uses Dynamo to orchestrate Vera–Rubin cooperation.
From Nuvacore's plan, it doesn't look content to stop at IP — it is going after finished products. The hiring page already lists validation engineer roles.
That said, the challenges Nuvacore faces this time are not small. The start is far later. Both the market and the technology are far more competitive. CPU servers are vastly more complex than seven years ago — the contest is no longer at the individual CPU level but at rack-system complexity.
Building a CPU now for a 2028–2029 launch that is competitive at the rack level requires much greater scale. You basically have to assemble dozens of chiplets and 500+ cores while simultaneously working out how to handle AI agentic workloads. The workload has grown noticeably larger than before. The bar for a startup is meaningfully higher than seven years ago.
Last time, Nuvia was successfully sold for $1.4B just two years after founding. The heat in this market is orders of magnitude hotter than five years ago. So what does the road ahead look like for Nuvacore?
On the acquisition route, the pool of potential acquirers has not really expanded versus five years ago. Over these five years, Google got Axion, Microsoft got Cobalt, Amazon got Graviton, and NVIDIA's in-house Vera CPU has already taken shape. Even ARM has broken its 35-year IP-only tradition and begun building its own AGI CPU chips.
The most plausible acquirer is the SoftBank camp. SoftBank has been building deep positions in the ARM CPU server ecosystem for years, and acquired Ampere for $6.5B. Adding Nuvacore on top would be a natural follow-on, and the imagination this market affords more than justifies that kind of price tag.
Another possibility is Meta. Among the major internet companies, Meta is the only one without a reliably in-house CPU server offering, and what limited resources it has are going into MTIA on the AI accelerator side.
The problem with Meta, though, is that it is extremely unstable. Decisions shift month to month, its attention span is very short, and projects can be killed at any time. For Nuvacore, it would be a very poor acquirer, one that wouldn't let the company realize any of its potential.
Taken as a whole, Nuvacore's option set is considerably wider than it was five years ago. The consensus around ARM CPU server potential is now firmly in place. Funding difficulty is meaningfully lower. The friction for operating and scaling independently is much reduced versus the past. And partners are more willing to engage, buoyed by expectations of the future.
It is entirely reasonable to scale well beyond where Nuvia was before considering an exit. There is no need to rush.
$QCOM $ARM