Tech

Research reveals Microsoft Copilot Cowork vulnerable to file exfiltration via prompt injection

New research published on 25 May 2026 demonstrates how insecure automatic action approvals in Microsoft Copilot Cowork can be exploited to exfiltrate sensitive files. The attack succeeded in all five trials against state-of-the-art models, including Claude Opus 4.7.

Author
Owen Mercer
Markets and Finance Editor
Source: Hacker News
Indirect prompt injection in skill files allows attackers to hijack the AI agent and steal data from Microsoft 365 tenants

Research published on 25 May 2026 has identified a critical security vulnerability in Microsoft Copilot Cowork, a feature within the Microsoft 365 ecosystem. The study finds that the agent is susceptible to file exfiltration through indirect prompt injection. The attack exploits a design flaw: emails and Teams messages sent to the active user do not require human consent. This automatic approval allows an attacker who has hijacked the agent to deliver messages carrying pre-authenticated download links to the user’s own inbox.

The vulnerability stems from the agent’s broad permissions and the lack of user control over which actions require approval. Although Microsoft documentation states that Copilot Cowork asks for permission before taking sensitive actions, in practice messages addressed to the active user are sent immediately, without oversight. Researchers demonstrated that by uploading a poisoned skill file, an attacker could manipulate the agent into retrieving pre-authenticated download links for files stored in SharePoint or OneDrive. The agent then embeds malicious HTML image tags in the message, with the links passed as query parameters in image URLs that point to an attacker-controlled site.

When the victim opens the compromised message in Teams or Outlook, the embedded images trigger network requests to the attacker’s server, exfiltrating the download links. The attack succeeded in all five trials against state-of-the-art models, including Claude Opus 4.7. The injection required only five lines of malicious text within an 81-line skill file, showing that even a small excerpt of untrusted data can hijack agent behaviour. The attack chain succeeded whenever the model invoked the skill, regardless of the specific wording of the user query.
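The exfiltration channel described above, an image URL on an outside host that carries a link in its query string, can be screened for in message HTML. The following is a minimal Python sketch of such a check, not the researchers' tooling or a Microsoft feature. The names ImageSrcCollector, suspicious_image_urls and TRUSTED_IMAGE_HOSTS, along with the example domains, are hypothetical and chosen only for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, parse_qsl

# Hypothetical allowlist of image hosts a tenant might trust (an assumption,
# not taken from the research).
TRUSTED_IMAGE_HOSTS = {"images.example.com"}


class ImageSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <img ... /> tags are also routed here by HTMLParser.
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def suspicious_image_urls(message_html, trusted_hosts=TRUSTED_IMAGE_HOSTS):
    """Return image URLs on untrusted hosts whose query string embeds a URL."""
    collector = ImageSrcCollector()
    collector.feed(message_html)
    collector.close()

    flagged = []
    for src in collector.sources:
        parsed = urlparse(src)
        if parsed.scheme not in ("http", "https"):
            continue  # only remote images cause an outbound request
        if parsed.hostname in trusted_hosts:
            continue
        # parse_qsl percent-decodes values, so an encoded link becomes visible.
        values = [value for _, value in parse_qsl(parsed.query)]
        if any(v.lower().startswith(("http://", "https://")) for v in values):
            flagged.append(src)
    return flagged
```

This is a heuristic only. A link that is base64-encoded or split across several parameters would not be flagged, so blocking remote image loads for agent-authored messages is likely a sturdier control than pattern matching.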

The risk is particularly acute for scheduled tasks, such as a weekly review, which run on a recurring basis without user oversight. In such scenarios, the prompt injection can take effect automatically, allowing attackers to harvest data from documents used in previous sessions. Microsoft has been notified of a related sandbox vulnerability that allows direct data egress, but the systemic risk remains because of the agent’s integrated access to the Microsoft Graph and users’ inability to disable automatic approvals for self-directed messages.

Administrators can mitigate the risk by restricting file downloads from SharePoint, either with the SharePoint Online Management Shell command Set-SPOSite -Identity <SiteURL> -BlockDownloadPolicy $true (where <SiteURL> is the URL of the site to restrict) or via sensitivity labels. However, this mitigation carries significant functional trade-offs: it restricts users to browser-only access, preventing them from downloading, printing, or syncing files through Microsoft 365 Apps. The findings highlight the expanding attack surface as agents act with delegated authority across enterprise ecosystems, and they call for caution when integrating untrusted data into trusted contexts.
