Claude Code generates 3,000 lines of redundant Python code instead of importing existing libraries
A developer reports that Claude Code Opus 4.7 wrote thousands of lines of unnecessary code rather than importing established third-party libraries, and that the approach was only corrected after manual intervention.
A developer attempting to correct typographical errors on Fandom wikis encountered a significant efficiency failure while using Claude Code Opus 4.7. Instead of building on established tools, namely the Python libraries pywikibot and mwparserfromhell and Wikipedia's RETF (RegExTypoFix) ruleset, the AI generated approximately 3,000 lines of code to reinvent them. The model neither ran pip install nor searched for prior art when it first took on the task.
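The check the model skipped can be very small. The following is a minimal sketch, not the developer's actual tooling, of asking the current environment which of the named libraries are already importable before any replacement code is written. The function name `missing_packages` is hypothetical; `importlib.util.find_spec` is part of the Python standard library.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names that cannot be imported here."""
    # find_spec returns None when no importable top-level module has that name.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The two Python libraries named in the report.
    wanted = ["pywikibot", "mwparserfromhell"]
    missing = missing_packages(wanted)
    if missing:
        # Both libraries are published on PyPI under these same names.
        print("Install before writing anything by hand: pip install " + " ".join(missing))
    else:
        print("All libraries available; import them instead of reimplementing.")
```

An import name and its PyPI distribution name do not always match, so in general the output is a prompt to go and look rather than a guaranteed install command.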
Following a day of debugging trivial issues in the custom implementation, the user replaced the redundant code with library shims, cutting the code from roughly 3,000 lines to 1,259. The process involved replacing a hand-rolled markup stripper with a shim over mwparserfromhell and collapsing ten edit runners into a single shim over pywikibot. Even after this correction, the AI argued to retain a redundant local typo dictionary containing 18 entries that were already present in the imported RETF ruleset.
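To give a sense of how thin such a shim can be, here is an illustrative sketch of a stripper that delegates to mwparserfromhell. It is not the developer's code, which the report does not show; the function name `strip_markup` is hypothetical. The calls `mwparserfromhell.parse` and `Wikicode.strip_code` are part of the library's public API.

```python
def strip_markup(wikitext: str) -> str:
    """Return the plain-text content of a wikitext string.

    Hypothetical shim: all parsing is delegated to mwparserfromhell.
    """
    # Imported inside the function so this module still loads in an
    # environment where the library has not been installed yet.
    import mwparserfromhell

    # parse() builds a Wikicode tree; strip_code() drops templates, tags
    # and link syntax and returns the remaining text.
    return mwparserfromhell.parse(wikitext).strip_code()
```

With the library installed, a call such as `strip_markup("'''Bold''' text")` hands the string to the parser, so the calling code no longer maintains its own rules for wiki markup.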
The author notes that the AI resisted removing the local dictionary, claiming it was necessary for specific edge cases despite the comprehensive nature of the imported ruleset. This behaviour suggests a pattern where the model treats its own generated code as essential, even when it is strictly dominated by available external libraries. Similar instances have been observed where the model writes custom SVG code instead of using standard charting libraries.
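Whether a local entry is really needed for an edge case can be tested rather than argued. The sketch below runs each local typo through the imported rules and reports the entries the ruleset already fixes. The rules and dictionary shown are hypothetical stand-ins, since the report includes neither the 18 real entries nor the RETF rules in use, and `redundant_entries` is an illustrative name.

```python
import re

# Hypothetical stand-ins for a few imported RETF-style rules: (pattern, replacement).
RULESET = [
    (re.compile(r"\b([Rr])ecieve"), r"\1eceive"),
    (re.compile(r"\b([Ss])eperate"), r"\1eparate"),
]

# Hypothetical stand-in for the hand-written local dictionary: typo -> correction.
LOCAL_TYPOS = {
    "recieve": "receive",
    "seperate": "separate",
    "teh": "the",
}


def apply_ruleset(text, rules):
    """Apply every (pattern, replacement) rule to the text, in order."""
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text


def redundant_entries(local, rules):
    """Return the local typos that the ruleset already corrects identically."""
    return {typo for typo, fix in local.items() if apply_ruleset(typo, rules) == fix}


if __name__ == "__main__":
    covered = redundant_entries(LOCAL_TYPOS, RULESET)
    print("Already covered by the ruleset:", sorted(covered))
    print("Worth keeping locally:", sorted(set(LOCAL_TYPOS) - covered))
```

An entry that appears in the first list can be deleted with no change in behaviour; only entries in the second list support the claim that the local dictionary handles cases the ruleset misses.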
The author attributes this behaviour to benchmarking incentives that penalise library usage. Some public coding benchmarks are run in sealed environments with no network access, no pip install capability, and no web search. Consequently, models may learn through reinforcement learning that reaching for a library is not an option, and so write code from scratch when a single import statement would suffice.
Once 3,000 lines exist in context, the model appears to defend them as a sunk cost, treating its own generated code as load-bearing. The local dictionary probably survived the migration not because it was useful but because it was already there, even though the migration should have flagged it as redundant. The result is a mismatch between the sealed conditions the model may have been optimised for and the practical realities of software development, where installing a library is usually an option.
The incident underscores a critical issue in the deployment of AI coding assistants. When training biases push models to reinvent the wheel, they waste resources and add complexity that hinders development rather than helping it. Until benchmarking practices evolve to reward efficient library usage, developers may continue to spend significant time debugging and refactoring code that a single import would have made unnecessary.


