Anthropic launched a security-guidance plugin for Claude Code on May 27, 2026, enabling the AI to identify and remediate software vulnerabilities in real time as developers write code. The update from the San Francisco-based AI firm integrates security checks directly into the development session, allowing the tool to suggest specific fixes for common security flaws before code is committed to a repository.
The release coincides with the announcement of a self-hosted sandbox for Claude Managed Agents, which Anthropic showcased at its “Code w/ Claude” event in London. The new infrastructure lets enterprise teams run AI-driven code execution in isolated environments they control, addressing the data privacy concerns that have historically slowed AI adoption in regulated industries. Together, the two releases represent a push by Anthropic to position its AI as a specialized security auditor as well as a general assistant.
Industry analysts see this move as a direct response to the increasing complexity of software supply chain attacks. By embedding security guidance into the CLI-based Claude Code tool, Anthropic is attempting to “shift left,” a DevOps philosophy that prioritizes catching bugs and vulnerabilities at the earliest possible stage of the development lifecycle. The approach is particularly relevant as fraudulent recovery schemes and exploits continue to plague the broader tech and decentralized finance sectors.
How the Claude Code security guidance plugin identifies flaws
The new security-guidance plugin functions as an automated peer reviewer that monitors code changes for patterns associated with known vulnerabilities. When a developer writes a function that might be susceptible to SQL injection, cross-site scripting (XSS), or buffer overflows, the plugin flags the issue within the same terminal or IDE interface. It doesn’t just point out the error; it provides a direct code suggestion to resolve the risk.
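To make the SQL injection case concrete, the sketch below shows the kind of flaw and fix described above. It is an illustration only, not output from the plugin: the function names, the `users` table, and the demo database are all hypothetical, and it uses Python's standard `sqlite3` module.

```python
import sqlite3


def make_demo_db():
    # Hypothetical in-memory database used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn, username):
    # Vulnerable: user input is concatenated straight into the SQL text,
    # so a crafted value can change the meaning of the query.
    query = "SELECT id, name FROM users WHERE name = '" + username + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, username):
    # Safer: a placeholder keeps the input as data, never as SQL syntax.
    query = "SELECT id, name FROM users WHERE name = ?"
    return conn.execute(query, (username,)).fetchall()
```

Passing the string `x' OR '1'='1` to `find_user_unsafe` returns every row in the table, while `find_user_safe` treats the same string as a literal name and returns nothing. A reviewer, human or automated, would typically suggest the parameterized form.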
This proactive approach differs from traditional Static Application Security Testing (SAST) tools, which often produce long lists of vulnerabilities after a full scan. Claude’s implementation is designed to be conversational and contextual. Because it has context on the codebase it is working within, it can offer fixes that are more likely to compile and function without breaking existing features.
The timing matters as digital assets and infrastructure come under greater scrutiny. As platforms shift toward greater transparency, the underlying code must be demonstrably secure to maintain user trust. Anthropic’s new tools aim to provide that baseline level of security for developers across industries.
Project Glasswing and the defensive AI coalition
Beyond the individual plugin, Anthropic has also formalized its broader defensive strategy through Project Glasswing. This coalition includes major industry partners like Cisco, with Anthony Grieco, Senior Vice President and Chief Security & Trust Officer at Cisco, confirming the company’s participation. The goal is to leverage AI-powered discovery tools specifically for the benefit of defenders rather than attackers.
Project Glasswing aims to create a feedback loop where AI models are trained to stay ahead of the “offensive” use of AI by bad actors. By sharing datasets and vulnerability findings among coalition members, Anthropic hopes to create a more resilient ecosystem. This collaborative model mirrors the way legislative progress like the CLARITY Act seeks to bring order and safety to emerging digital markets through standardized rules.
Advancements in the Claude Mythos Preview model
The company also teased capabilities from its Claude Mythos Preview, a frontier model that reportedly outperforms existing models on cybersecurity benchmarks. In head-to-head testing conducted by security researcher Motasem Hamdan, Claude’s latest iterations demonstrated a higher success rate than rival models in identifying subtle logic flaws. These flaws are often the hardest to catch because they don’t follow the predictable patterns found in traditional vulnerability databases.
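A small example shows why logic flaws evade pattern matching. The sketch below is hypothetical and is not drawn from the testing described above: the `Account` class and both transfer functions are invented for illustration. The flawed version contains no dangerous API call or tainted string, yet it mishandles a case the author never considered.

```python
from dataclasses import dataclass


@dataclass
class Account:
    owner: str
    balance: int  # balance in cents


def transfer_flawed(src, dst, amount):
    # Logic flaw: the funds check passes for any negative amount,
    # so a negative transfer moves money from dst back to src.
    if src.balance < amount:
        raise ValueError("insufficient funds")
    src.balance -= amount
    dst.balance += amount


def transfer_checked(src, dst, amount):
    # Fix: reject non-positive amounts before checking funds.
    if amount <= 0:
        raise ValueError("amount must be positive")
    if src.balance < amount:
        raise ValueError("insufficient funds")
    src.balance -= amount
    dst.balance += amount
```

Calling `transfer_flawed(a, b, -50)` increases the sender's balance at the recipient's expense. Catching this requires reasoning about what the function is supposed to do, not matching a known signature.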
Expanding the self-hosted sandbox for enterprise security
For many large organizations, the primary barrier to using Claude Code has been the risk of sending proprietary source code to a third-party cloud for processing. The new self-hosted sandbox for Claude Managed Agents addresses this by allowing the AI’s execution environment to reside within the client’s own virtual private cloud or on-premises servers, so that sensitive data stays within the corporate perimeter.
During the London event, Anthropic engineers demonstrated how this sandbox environment can safely execute code, run tests, and perform data analysis without external exposure. This feature is expected to be a major selling point for those in the finance, healthcare, and government sectors, where strict data residency laws often prohibit the use of public AI sandboxes.
The security-guidance plugin is currently available through the official Anthropic marketplace for users of Claude Code on the web. As the tool moves out of its research preview phase, Anthropic plans to integrate more real-time telemetry, allowing the AI to learn from the specific coding styles and common mistakes of individual development teams to provide even more tailored security advice.
