Connor Ingleson (@connoringleson) - Twitter Profile

Connor Ingleson @connoringleson · Apr 16

It’s interesting to think about the evolution of the AI business model. The promise is that AI will get 10x cheaper and create unlimited economic value. Gartner predicts that the cost of inference will decrease 90% by 2027, yet overall AI costs are still increasing. If you look at the price per token of the most in-demand frontier models, including the most advanced reasoning models, it stays relatively flat over time. GPT-4 launched in 2023 at $60 per million output tokens, and today Opus 4.6 sits around $75 per million output tokens. The stark reality is that people are cognitively greedy. They want the best model: the supercar, not the Honda Civic. Most users are not thinking about the price-to-performance curve. Few consumers or employees would see the benefit of cheaper models such as Magistral, which sits closer to $1 per million output tokens. And even if you trade performance for price, cheaper models often require more iteration (more input and output) to complete a given set of tasks. The same is true of reasoning models: they tend to think, think, think. This can be seen in the fact that the length of AI tasks is doubling every six months. What used to output 1K tokens now returns 100K tokens. This exponential increase in token consumption will be exacerbated by the shift towards agentic systems (agents, tools, models), which require up to 30 times more tokens per task than a standard chat use case. So even as unit costs fall, consumption is exploding. For tech companies and research labs (profit-maximizing companies), this foreshadows a potentially dramatic shift in the industry’s primary business model. It’s clear the per-user subscription SaaS model is breaking in many scenarios. Anthropic rolled back its unlimited subscription. Windsurf was sold for parts. Builder.ai filed for an epic bankruptcy. Yet companies continue chasing share now on the promise of higher future margins.
This is especially true in the all-you-can-eat consumer model, more so than in the enterprise pay-as-you-go (PAYG) API approach. We may also see more effort focused on model routing. I can imagine a world where you give a system a task and a budget and let it decide how to allocate work across models to stay economically efficient. Consumers should stay greedy until providers roll back unlimited subscriptions or put new pricing in place. For founders, this changes the game. You’re no longer building high-margin SaaS with near-zero marginal cost; you’re running a variable-cost engine where usage directly impacts your P&L. That means you have to design your product and pricing with cost in mind from day one. I wonder if the VC analogy still applies: fuel for a rocket ship? Growth over profit? For the enterprise, this will dramatically transform operations: a shift from cloud FinOps to AI FinOps, with the right governance in place across the stack. This includes thoughtful capacity planning and model selection, whether through provisioned throughput or reservations, alongside strong monitoring with token telemetry, usage dashboards, and trend analysis. It also means tighter control over model access: defining which models teams can use, what defaults are set, and how environments are separated. Enterprises will need to enforce token limits per model or deployment, apply rate limits to prevent agents from spiking usage unexpectedly, and introduce quotas at the caller or agent level, tracking consumption across apps and services. On top of that, identity-based controls and agent registration will be critical, separating human and non-human usage. Over time, expect this to extend into standardized agent blueprints with known cost profiles, along with clear constraints around tool access. The bottom line is that the winners won’t just build great AI products; they’ll build economically intelligent systems that balance cost, quality, and speed in real time.
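
To make the "task plus budget" routing idea concrete, here is a minimal sketch in Python. It is only an illustration, not how any provider or router actually works: the model names, the quality scores, and the `route` helper are all hypothetical, and the two prices are simply the rough $75 and $1 per million output tokens mentioned above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Model:
    name: str                             # hypothetical label, not a real model ID
    usd_per_million_output_tokens: float  # price figure taken from the post
    quality: float                        # made-up 0-1 score, for illustration only


def estimate_cost(model: Model, expected_output_tokens: int) -> float:
    """Estimated spend in USD for one task, counting output tokens only."""
    return model.usd_per_million_output_tokens * expected_output_tokens / 1_000_000


def route(models: Sequence[Model], expected_output_tokens: int,
          budget_usd: float) -> Optional[Model]:
    """Pick the highest-quality model whose estimated cost fits the budget.

    Returns None when no model is affordable, so the caller can decide
    whether to shrink the task or raise the budget.
    """
    affordable = [
        m for m in models
        if estimate_cost(m, expected_output_tokens) <= budget_usd
    ]
    if not affordable:
        return None
    return max(affordable, key=lambda m: m.quality)


# Assumed catalogue: one "supercar" and one "Honda Civic".
CATALOGUE = [
    Model("frontier-model", 75.0, quality=0.95),
    Model("budget-model", 1.0, quality=0.70),
]

if __name__ == "__main__":
    # A 100K-token task with a $1 budget falls back to the cheaper model.
    choice = route(CATALOGUE, expected_output_tokens=100_000, budget_usd=1.00)
    print(choice.name if choice else "no model fits the budget")
```

A real router would also need input-token prices, a way to estimate token counts before running the task, and some measured notion of quality per task type; all of that is left out here on purpose.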
