
Gartner now says inference costs will fall 90% by 2030. Total AI spend will still rise. The reason: token consumption is growing faster than prices fall, and agentic loops hit an LLM 10-20 times per user task. Falling unit cost plus exploding volume means the invoice dispute surface area is expanding, not shrinking. Cheap tokens don't fix the problem if nobody agrees on how many were consumed.
hpcwire.com/aiwire/2026/03…
English