Tech giants face AI cost reality as usage bills eclipse human labour expenses
Major technology firms are confronting a financial bottleneck: the aggregate cost of agentic AI systems is rising faster than unit prices fall, complicating widespread deployment plans.
Microsoft has reportedly cancelled most direct Claude Code licences for its engineers, shifting the workforce to GitHub Copilot CLI after high usage volumes made the technology more expensive than human employees. The reversal came just six months after the firm encouraged thousands of developers, project managers, and designers to experiment with the tool. The decision does not affect Microsoft’s broader Foundry deal, which includes a $5 billion investment in Anthropic and access to Claude models for Foundry customers, alongside Anthropic’s $30 billion commitment to purchase Azure compute capacity.
The move follows similar financial strain at Uber, where the company exhausted its entire 2026 AI coding tools budget in just four months. Uber’s chief technology officer, Praveen Neppalli Naga, confirmed the budget exhaustion to The Information in April, noting that the rapid consumption occurred despite the firm actively incentivising adoption through internal leaderboards that ranked teams by AI tool usage.
This trend highlights a growing disconnect between the promise of AI productivity and the economics of token-based pricing. As companies push employees to maximise usage, every additional task adds to the bill. Meta employees have created a leaderboard named “Claudeonomics” to track AI usage, while Amazon is encouraging staff to “tokenmaxx,” or maximise AI token usage. Under token-based pricing, however, higher usage volumes translate directly into higher costs, even when the tools make individual employees more efficient.
Industry analysts warn that while unit costs for AI tokens may decline, the aggregate cost for enterprises will likely rise due to the demands of agentic AI systems. Goldman Sachs forecasted that agentic AI could drive a 24-fold increase in token consumption by 2030, reaching 120 quadrillion tokens per month. This surge in consumption threatens to outpace any reductions in the price of individual tokens, complicating plans for widespread AI deployment across the sector.
Bryan Catanzaro, vice president of applied deep learning at Nvidia, stated that compute costs for his team far exceed employee costs, underscoring the financial reality of current AI infrastructure. Gartner predicts that by 2030, inference on a one-trillion-parameter LLM will cost nearly 90% less than it did in 2025, yet aggregate costs may still rise. The research firm noted that agentic models require significantly more tokens per task than standard models, and that providers may not fully pass lower costs through to customers, keeping inference expenses high for enterprises.
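The arithmetic behind that warning can be sketched in a few lines. The example below is a rough illustration, not a forecast. It assumes the Goldman Sachs consumption figure (read here as 24 times today's volume) and the Gartner price figure (rounded to a 90% fall) can be paired, even though they come from different firms and measure different things. It also assumes an enterprise's token usage grows in line with the industry total. The function names are hypothetical.

```python
"""Back-of-envelope check: does a falling token price offset rising token volume?"""


def aggregate_cost_multiplier(consumption_growth: float, unit_price_drop: float) -> float:
    """Return the factor by which total token spend changes.

    consumption_growth: future volume as a multiple of today's (24.0 = 24 times as many tokens).
    unit_price_drop: fractional fall in price per token (0.90 = 90% cheaper).
    """
    if consumption_growth <= 0:
        raise ValueError("consumption_growth must be positive")
    if not 0.0 <= unit_price_drop <= 1.0:
        raise ValueError("unit_price_drop must be between 0 and 1")
    return consumption_growth * (1.0 - unit_price_drop)


def break_even_price_drop(consumption_growth: float) -> float:
    """Return the fractional price fall needed to keep total spend flat."""
    if consumption_growth <= 0:
        raise ValueError("consumption_growth must be positive")
    return 1.0 - 1.0 / consumption_growth


# Illustrative inputs from the figures quoted in the article.
# Pairing them is an assumption made for this sketch only.
GROWTH = 24.0      # Goldman Sachs: 24-fold token consumption by 2030, read as 24x
PRICE_DROP = 0.90  # Gartner: "nearly 90%" cheaper inference, rounded to 90%

if __name__ == "__main__":
    multiplier = aggregate_cost_multiplier(GROWTH, PRICE_DROP)
    needed = break_even_price_drop(GROWTH)
    print(f"Spend multiplier: {multiplier:.2f}x")
    print(f"Price drop needed to hold spend flat: {needed:.1%}")
```

On these assumptions, total spend would rise to roughly 2.4 times its current level despite the cheaper tokens, and unit prices would need to fall by about 95.8% just to keep the bill flat.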


