Tech

Google DeepMind launches $10m fund for multi-agent AI safety research

Partnership with Schmidt Sciences, ARIA, the Cooperative AI Foundation, and Google.org aims to study the behaviour of millions of interacting agents and develop safeguards against scams, prompt injections, and cyberattacks.

Author
Mara Ellison
Science and Space Editor
Published
Draft
Source: MIT Technology Review
Google DeepMind is worried about what happens when millions of agents start to interact
New initiative seeks to establish dedicated field of research before widespread deployment creates unmanageable risks

Google DeepMind has announced a $10 million funding initiative to support external research into the risks posed by multi-agent AI systems. The fund, established in partnership with Schmidt Sciences, ARIA, the Cooperative AI Foundation, and Google.org, aims to study the behaviour of millions of interacting agents and develop safeguards against scenarios such as scams, prompt injections, and cyberattacks.

Rohin Shah, director of AGI safety and alignment research at Google DeepMind, stated that the goal is to establish a dedicated field of research for multi-agent safety before widespread deployment creates unmanageable risks. Shah noted that the mass-market arrival of agents capable of operating without human oversight and following instructions from other agents introduces a new class of risk.

The initiative seeks to kick-start research outside of major technology companies. Shah highlighted that the strength of academia lies in its ability to look far into the future and conduct work that may not be the immediate priority for industry labs. He emphasised that there is currently no established field of research dedicated to multi-agent safety, and that the fund is designed to fill that gap.

The specific threats targeted by the research include supercharged versions of existing internet dangers. These range from sophisticated scams and cyberattacks to prompt injections, where an AI agent is fed malicious instructions, effectively turning it into a self-guiding piece of malware. James Fox, who leads the Science of Trustworthy AI program at Schmidt Sciences, described the objective as ensuring the digital commons does not descend into anarchy.

To understand these risks, researchers plan to drop AI agents into sandboxes and run realistic simulations. Fox noted that one cannot predict outcomes by studying single agents or small groups in isolation, nor can one assume that AI agents underpinned by large language models will always act rationally. The complexity arises from the sheer volume of interactions occurring simultaneously.

Google DeepMind is not the only firm raising alarms; other industry players are also adapting. Anthropic recently published guidelines for deploying AI agents based on a zero-trust cybersecurity approach, which assumes that systems are vulnerable and breaches are inevitable. Refael Angel, cofounder and CTO of cybersecurity firm Akeyless, welcomed the new funding but cautioned that safety researchers must not overlook existing problems in favour of more exotic hypotheticals.

The urgency of the situation is underscored by the rapid pace of technological adoption. According to Stanford’s 2026 AI Index, artificial intelligence is advancing quickly, often outpacing regulatory and safety frameworks. Shah indicated that while widespread economic collapse is not a threat within the next few months, the window to establish robust safety standards before agents are deployed throughout the economy is narrowing.
