Tech

METR study reveals developers’ refusal to work without AI masks declining code quality

While AI tools accelerate coding speed, independent research and corporate data suggest a trade-off in maintenance costs and bug rates, prompting warnings from experts about long-term software integrity.

Author

Owen Mercer

Markets and Finance Editor

Published

Draft

Source: TechCrunch · original

Artificial Intelligence Research


Tech giants Amazon and Uber report wasted spending as ‘tokenmaxxing’ trends fail to deliver productivity gains

In February 2026, AI research lab METR published findings indicating that the majority of developers now refuse to work without artificial intelligence assistance, even for minor tasks. This dependency has effectively prevented the replication of METR’s 2025 productivity studies, which had found that although AI generated code faster, it ultimately slowed developers down because of the time required to fix errors and steer the tools. Unable to conduct controlled experiments, METR shifted to a survey in May, in which technical employees self-reported that AI made them twice as valuable to their organisations, a claim that recent corporate data and independent research have begun to challenge.

The disconnect between perceived productivity and actual output is evident in the spending habits of major technology firms. Amazon recently discontinued its internal token-tracking leaderboard, known as Kirorank, after employees were found to be gaming the system by using AI agents excessively, thereby inflating costs without delivering results. Similarly, Uber exhausted its 2026 AI budget within the first four months of the year. Chief Operating Officer Andrew Macdonald confirmed that this expenditure did not result in measurable increases in project output or overall productivity. Both cases illustrate the industry trend of “tokenmaxxing,” in which token usage is mistakenly treated as a proxy for efficiency.

Independent analysis suggests that the speed of AI-generated code comes at a steep price in maintenance and quality. Aiswarya Sankar, founder and CEO of Entelligence AI, noted that companies are spending 44% of their tokens on bug fixes generated by AI. Furthermore, code-review tool provider CodeRabbit reported that its analysis of open-source pull requests found AI-generated code contained 1.7 times more problems than human-written code. These findings align with a viral argument by programmer and author James Shore, who warned that trading speed for increased maintenance costs leads to “permanent indenture” for development teams.

Academic research supports the view that AI integration introduces significant long-term risks. In April, researchers from Singapore Management University published a report warning that AI-generated code can introduce substantial maintenance costs into real-world software projects. The study emphasised that developers must understand the limitations of AI tools and implement robust quality assurance systems specifically designed for AI-generated content. Experts advise that programmers should review AI output with the same scrutiny applied to a junior developer, rather than treating the technology as a fully autonomous solution.

Despite the push for automation, industry leaders acknowledge that AI currently lacks the capability to handle complex, high-level tasks independently. Scott Wu, founder and CEO of Cognition and creator of the AI coding agent Devin, admitted that his tool currently operates at the skill level of a junior to mid-level programmer. Consequently, experts recommend that human developers retain responsibility for critical functions such as software architecture and security design. The consensus among researchers is that while AI can assist with routine tasks, it cannot replace the need for rigorous human oversight and strategic technical leadership.

