
Sabr


@HedgieMarkets Nah, they just got greedy because they know centralized AIs are A. unmanageable in the long haul and B. the personal/local AI era is coming and only the strong will survive.

🦔Microsoft canceled its internal Claude Code licenses this week after token-based billing made the cost untenable, even for a company with effectively infinite cloud resources. Uber's CTO sent an internal memo warning the company burned through its entire 2026 AI budget in just four months. American AI software prices have jumped 20% to 37%, and GitHub (owned by Microsoft) is dropping flat-rate plans for usage-based billing across its products.
My Take
The AI subsidy era is ending in real time. The same company that put $13 billion into OpenAI and built the Azure infrastructure powering most of Anthropic's compute just looked at the bill from a competitor's coding tool and decided it was not worth paying. That is not a productivity failure on Anthropic's end. Token-based pricing is forcing every enterprise customer to confront the actual cost of running these models at scale, and the number turns out to be far higher than the flat-rate experiments suggested.
This ties directly to my Gemini Flash post yesterday. Anthropic, OpenAI, and Google all raised effective prices in the last six months. Enterprises that built workflows assuming AI costs would keep falling are now watching annual budgets evaporate in months. Two outcomes look likely from here. Either enterprises scale back AI usage to fit budgets, which slows the revenue ramp the labs need to justify their valuations ahead of IPOs, or the labs cut prices and absorb the losses, which makes the unit economics worse at exactly the wrong moment. Both paths land in the same place: the numbers stop working, and somebody has to take the writedown.
Hedgie🤗



Me starting with LLMs:
"bigger GPU, more VRAM = faster inference"
Me now:
- VRAM bandwidth
- KV cache behaviour
- memory latency
- cache locality
- PCIe bottlenecks
- kernel efficiency
- quantization tradeoffs
- memory movement
Modern AI inference is basically systems engineering disguised as matrix multiplication.
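A rough back-of-envelope sketch of why bandwidth matters more than raw compute for single-stream decoding. It assumes a dense model at batch size 1, where every weight is read from VRAM once per generated token, and it ignores compute time, PCIe transfers, and kernel efficiency entirely. All function names and numbers below are hypothetical illustrations, not measurements of any real GPU or model.

```python
def weight_bytes(n_params, bits_per_weight):
    """Bytes needed to hold the model weights at a given quantization."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem=2):
    """Bytes for the KV cache: one K and one V tensor per layer, per token."""
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


def decode_tokens_per_s_upper_bound(bandwidth_bytes_per_s, n_params,
                                    bits_per_weight, kv_bytes=0):
    """Upper bound on decode speed if memory traffic is the only cost.

    Assumption: each new token requires streaming all weights plus the
    whole KV cache from VRAM exactly once.
    """
    bytes_per_token = weight_bytes(n_params, bits_per_weight) + kv_bytes
    return bandwidth_bytes_per_s / bytes_per_token


# Hypothetical setup: 8e9 parameters, 1e12 bytes/s of VRAM bandwidth.
BANDWIDTH = 1e12
N_PARAMS = 8e9

fp16 = decode_tokens_per_s_upper_bound(BANDWIDTH, N_PARAMS, bits_per_weight=16)
int4 = decode_tokens_per_s_upper_bound(BANDWIDTH, N_PARAMS, bits_per_weight=4)

# Hypothetical attention shape: 32 layers, 8 KV heads, head_dim 128, 8192 tokens.
kv = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, context_len=8192)
int4_long_ctx = decode_tokens_per_s_upper_bound(BANDWIDTH, N_PARAMS, 4, kv_bytes=kv)

print(f"fp16 weights:            {fp16:.1f} tok/s upper bound")
print(f"4-bit weights:           {int4:.1f} tok/s upper bound")
print(f"4-bit + 8k-token KV:     {int4_long_ctx:.1f} tok/s upper bound")
```

Under these assumptions, quantizing from 16 to 4 bits raises the ceiling about 4x without touching FLOPs, and a long context pulls it back down because the KV cache has to be streamed too. Real throughput sits below this bound.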

Hey @grok is this sentence true?
"If signed: XRP gets permanent commodity status."
Eleanor Terrett@EleanorTerrett
🚨NEW: @BankingGOP Committee markup of the Clarity Act set for Thursday, May 14 at 10:30 AM EST.


@leftcurvedev_ afaik both use llama.cpp in the backend, and ollama especially is doing some very weird stuff.

One minute of silence for all the memory banks in this NL datacenter right now :X
youtu.be/32BI99W5LFE?si…











