Tech

Expert argues C and C++ code contains pervasive undefined behaviour

A 30-year veteran of the industry contends that virtually all non-trivial C and C++ code contains undefined behaviour, citing the C23 standard and testing on OpenBSD.

Author

Owen Mercer

Markets and Finance Editor

Published

Draft

Source: Hacker News · original



Blog post suggests manual verification is insufficient and AI supervision is now essential for compliance

A blog post published on 20 May 2026 asserts that virtually all non-trivial C and C++ code contains undefined behaviour (UB), rendering manual verification insufficient. The author, citing 30 years of experience, argues that UB is pervasive due to subtle issues such as unaligned pointers, signed character handling in standard library functions, and null pointer assumptions. The piece suggests that writing C/C++ without AI supervision to detect UB is irresponsible and potentially a Sarbanes-Oxley violation, noting that even meticulously maintained projects like OpenBSD contain such errors.
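The unaligned-pointer issue mentioned above can be illustrated with a minimal sketch. It is not taken from the blog post; the function name `load_u32` is hypothetical. Casting a byte pointer to `uint32_t *` and dereferencing it is undefined if the address is not suitably aligned, whereas copying the bytes with `memcpy` is well defined at any offset.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: reads a 32-bit value, in host byte order, from a
 * buffer that may start at any byte offset.
 *
 * The tempting alternative, *(const uint32_t *)p, is undefined behaviour
 * when p is not correctly aligned for uint32_t, even on hardware that
 * tolerates unaligned loads.
 */
uint32_t load_u32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v); /* byte-wise copy, no alignment requirement */
    return v;
}
```

Mainstream optimising compilers typically turn a fixed-size `memcpy` like this into a single load where the target allows it, so the defined version need not cost anything.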

The author, who has written C and C++ on an almost daily basis for three decades, references specific sections of the C23 standard, including 6.3.2.3 on pointer conversions and 7.4p1, which restricts the argument values accepted by character-classification functions such as isxdigit. The post claims the C23 standard contains 283 uses of the word "undefined" and adds that integer promotion rules are difficult to apply at code-skimming speed. The author notes that while some machines historically used non-zero NULL pointers, the C standard defines an abstract machine rather than specific hardware addresses, making assumptions about memory layout risky.
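Two of the pitfalls named above are easy to miss when reading code quickly. The sketch below is illustrative only and does not come from the blog post; the function names `count_hex_digits` and `mul_u16` are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Counts hexadecimal digits in a NUL-terminated string.
 *
 * The <ctype.h> functions take an int whose value must be representable
 * as unsigned char or equal EOF. Where plain char is signed, a byte such
 * as 0xE9 becomes negative, and passing it directly is undefined
 * behaviour. Converting to unsigned char first avoids that.
 */
size_t count_hex_digits(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (isxdigit((unsigned char)*s))
            ++n;
    }
    return n;
}

/*
 * Multiplies two 16-bit values.
 *
 * Written as a * b, both operands are promoted to (signed) int on
 * platforms with 32-bit int, and 65535 * 65535 overflows it, which is
 * undefined. Converting one operand to uint32_t first keeps the
 * arithmetic in a type wide enough to hold the result.
 */
uint32_t mul_u16(uint16_t a, uint16_t b)
{
    return (uint32_t)a * b;
}
```

In both cases the unsafe spelling differs from the safe one by a single cast, which is part of why such issues are hard to spot in review.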

The author ran an LLM over OpenBSD code, found UB, and submitted a patch for an out-of-bounds write, but did not submit patches for every issue identified. The author mentions that OpenBSD has historically been unreceptive to bug reports, which influenced the decision to limit direct contributions. The piece suggests that while x86 architectures are forgiving about cache coherency subtleties, ARM and RISC-V may behave differently, raising concerns about future architectures and cross-platform compatibility.

The author argues that UB does not merely mean the compiler can take advantage of sloppiness. Rather, the compiler may assume the code is valid, so the intention behind invalid code may not be expressible between compiler stages or modules. The author also recalls a prominent post from a decade ago on whether C++ usage could constitute a Sarbanes-Oxley (SOX) violation, and says the argument has come to seem more true over time.
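A common illustration of a compiler assuming validity is the signed overflow check. The following is a minimal sketch rather than an example from the blog post, and the function names are hypothetical. A test written as `x + 1 < x` only works if overflow has already occurred; because signed overflow is undefined, an optimiser may assume it never happens and treat the test as always false. Checking against `INT_MAX` before doing the arithmetic avoids the problem.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Returns true if computing x + 1 would overflow int.
 * The comparison happens before any arithmetic, so no overflow occurs.
 */
bool increment_would_overflow(int x)
{
    return x == INT_MAX;
}

/*
 * Increments x, saturating at INT_MAX instead of overflowing.
 * x + 1 is evaluated only when it is known to be representable.
 */
int saturating_increment(int x)
{
    return increment_would_overflow(x) ? INT_MAX : x + 1;
}
```

The design choice here is to test the precondition rather than the result: once the undefined operation has executed, no later check can be relied upon.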

The author suggests that while the industry cannot throw away its C/C++ code bases, leaving them inherently broken is not an option. They propose that AI supervision is necessary to fix UB at scale without committing AI slop or overwhelming human reviewers. The author notes that expert humans are needed to confirm AI findings but are often busy with other tasks, and describes the work as janitorial yet too subtle for junior programmers. The post concludes that if OpenBSD cannot weed out UB from its code base in 30+ years, the rest of the industry has little chance without automated assistance.
