Tech

Anthropic releases open-source framework for autonomous AI vulnerability discovery

Anthropic has published a reference harness to help security teams build custom pipelines for finding and fixing code vulnerabilities, though the company notes that autonomous triage remains an open technical challenge.

Author

Owen Mercer

Markets and Finance Editor

Published

Draft

Source: Hacker News

New GitHub repository offers reference implementation for Claude-based security pipelines

Anthropic has published an open-source reference implementation on GitHub, titled 'defending-code-reference-harness', to help security teams build custom pipelines that autonomously discover and remediate code vulnerabilities. The framework uses the Claude AI model for threat modelling, scanning, triage, and patching, and draws on lessons from partnerships with security teams during the launch of the 'Claude Mythos Preview'.

The repository provides a structured approach for organisations to move from interactive skills to autonomous operations. The recommended implementation follows a phased timeline, beginning with interactive threat modelling and static scanning on the first day, followed by an autonomous run on a known-vulnerable open-source library on the second day. Subsequent days involve customising the harness for specific targets, with the second week introducing an outer loop for continuous scanning and triage.

While the reference pipeline is optimised for finding memory-safety vulnerabilities in C and C++ code, its architecture is generic enough to be ported to other languages or vulnerability classes. The system includes mechanisms to deduplicate findings across multiple runs and to recalibrate severity ratings against a custom threat model. It can reach Claude through several API access points, including Bedrock, Vertex, and Azure.
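To make the deduplication and severity recalibration steps concrete, here is a minimal Python sketch. It is not taken from the repository: the `Finding` fields, the fingerprint scheme, the four-level severity scale, and the path-prefix threat model are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
import hashlib

# Hypothetical severity scale, lowest to highest (an assumption, not the harness's own).
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


@dataclass(frozen=True)
class Finding:
    file: str
    function: str
    bug_class: str  # e.g. "heap-buffer-overflow"
    severity: str
    run_id: str


def fingerprint(finding: Finding) -> str:
    """Stable key that ignores run_id and severity, so the same bug
    reported by different runs maps to the same value."""
    key = f"{finding.file}|{finding.function}|{finding.bug_class}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]


def deduplicate(findings: list[Finding]) -> list[Finding]:
    """Keep the first finding seen for each fingerprint."""
    unique: dict[str, Finding] = {}
    for finding in findings:
        unique.setdefault(fingerprint(finding), finding)
    return list(unique.values())


def recalibrate(finding: Finding, threat_model: dict[str, int]) -> str:
    """Shift severity up or down by the adjustment the (hypothetical)
    threat model assigns to the first matching path prefix."""
    shift = 0
    for prefix, delta in threat_model.items():
        if finding.file.startswith(prefix):
            shift = delta
            break
    index = SEVERITY_ORDER.index(finding.severity) + shift
    index = max(0, min(len(SEVERITY_ORDER) - 1, index))  # clamp to the scale
    return SEVERITY_ORDER[index]


if __name__ == "__main__":
    findings = [
        Finding("src/parser.c", "parse_header", "heap-buffer-overflow", "high", "run-1"),
        Finding("src/parser.c", "parse_header", "heap-buffer-overflow", "high", "run-2"),
        Finding("tools/demo.c", "main", "use-after-free", "high", "run-2"),
    ]
    # Example threat model: network-facing parser code matters more, demo tools less.
    threat_model = {"src/": 1, "tools/": -2}
    for f in deduplicate(findings):
        print(f.file, f.bug_class, recalibrate(f, threat_model))
```

Keying on location and bug class rather than on the report text is one simple design choice; a real pipeline would likely need a fuzzier match, since the same flaw can surface under different function names or crash sites.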

The framework enforces strict isolation: the autonomous reference pipeline must run inside a gVisor sandbox unless that requirement is explicitly overridden. Interactive skills, by contrast, are limited to read and write operations and can run unsandboxed if tool use is approved. The pipeline is designed to verify its own findings, and the triage component routes them to component owners after collapsing duplicates across runs.
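A sandbox-by-default rule of this kind can be sketched in a few lines of Python. The sketch assumes Docker with gVisor's `runsc` runtime installed and registered; how the harness actually invokes gVisor is not covered here, and the function name, image name, and `allow_unsandboxed` flag are hypothetical.

```python
def build_run_command(
    image: str,
    pipeline_cmd: list[str],
    allow_unsandboxed: bool = False,
) -> list[str]:
    """Build a `docker run` argument list that uses the gVisor runtime
    unless the caller explicitly opts out."""
    cmd = ["docker", "run", "--rm"]
    if not allow_unsandboxed:
        # gVisor registers its runtime with Docker under the name "runsc".
        cmd.append("--runtime=runsc")
    cmd.append(image)
    cmd.extend(pipeline_cmd)
    return cmd


if __name__ == "__main__":
    # Hypothetical image and entry point, for illustration only.
    print(build_run_command("harness:latest", ["python", "scan.py"]))
    print(build_run_command("harness:latest", ["python", "scan.py"], allow_unsandboxed=True))
```

Making the sandboxed path the default and requiring a named argument to disable it means an unsandboxed run has to be a deliberate choice rather than an omission.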

Anthropic explicitly states that the repository is not actively maintained and does not accept external contributions. The company notes that autonomous triage and patching remain open technical challenges, and that verified patches are not always suitable for upstreaming. For organisations seeking a managed solution, Anthropic directs users to 'Claude Security', a hosted product that provides a multi-stage verification pipeline to reduce false positives and manage findings through their lifecycle.
