AWS retools cloud infrastructure for the age of AI agents
As non-human internet activity surges, Amazon and competitors redesign systems to accommodate AI workloads that spike without warning and idle without notice.

Amazon Web Services has launched a new generation of OpenSearch Serverless, a fully managed search and vector database engineered specifically for AI agent workloads. The update decouples compute resources from storage, a significant shift in cloud architecture. This structural change allows the infrastructure to scale up instantly to handle sudden bursts of agent traffic and scale down to zero when idle, so customers incur no costs for unused compute capacity.
The move addresses the fundamental mismatch between legacy cloud infrastructure, which was designed for steady human behaviour such as searching and streaming, and the erratic patterns of AI agents. Unlike human users, agents often spin up multiple sub-agents to query databases and call APIs in seconds before disappearing. Tia White, general manager for Amazon OpenSearch Service, noted that previous serverless versions required at least one instance to remain operational because compute and storage were coupled, forcing enterprises to pay for idle resources.
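To illustrate why a mandatory always-on instance matters for bursty agent traffic, the following is a minimal sketch of the billing arithmetic, not AWS's actual pricing model. The price per compute-unit-hour, the workload figures, and the names `Workload`, `always_on_cost` and `scale_to_zero_cost` are all hypothetical, and storage charges are left out because they would apply in both cases.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    hours_in_period: float  # length of the billing period, in hours
    busy_hours: float       # hours in which agents are actually querying
    peak_units: int         # compute units needed while a burst is running


def always_on_cost(w: Workload, price_per_unit_hour: float, min_units: int = 1) -> float:
    """Compute cost when at least `min_units` must stay up, even when idle."""
    idle_hours = w.hours_in_period - w.busy_hours
    busy_unit_hours = w.busy_hours * max(w.peak_units, min_units)
    idle_unit_hours = idle_hours * min_units
    return price_per_unit_hour * (busy_unit_hours + idle_unit_hours)


def scale_to_zero_cost(w: Workload, price_per_unit_hour: float) -> float:
    """Compute cost when capacity drops to zero whenever there is no traffic."""
    return price_per_unit_hour * w.busy_hours * w.peak_units


# Hypothetical example: a 30-day month with 20 hours of agent bursts
# needing 4 compute units each, at an assumed 0.25 per unit-hour.
workload = Workload(hours_in_period=720, busy_hours=20, peak_units=4)
price = 0.25  # assumed price, not a published AWS rate

coupled = always_on_cost(workload, price)        # 195.0
decoupled = scale_to_zero_cost(workload, price)  # 20.0
print(f"always-on minimum: {coupled:.2f}, scale-to-zero: {decoupled:.2f}")
```

Under these assumed numbers, most of the always-on bill comes from the 700 idle hours rather than from the bursts themselves, which is the cost the decoupled design is meant to remove.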
This technical evolution reflects a broader industry realignment as providers adapt to the growing volume of machine-generated internet traffic. Cloudflare reports that bots accounted for 31 per cent of overall HTTP traffic over the past six months, with AI crawlers and assistants comprising roughly a quarter of those requests. Lai Yi Ohlsen, senior product manager at Cloudflare, stated that non-human traffic is predicted to exceed human traffic in the first half of 2027.
At launch, OpenSearch Serverless will integrate natively with AI development platforms Vercel and Kiro. This integration allows developers to deploy production-ready search and vector backends for agents without managing the underlying infrastructure. The system is aimed at enterprises whose search capabilities must absorb traffic spikes that arrive without warning, without paying for empty compute instances in between.
The shift is evident across the wider cloud sector. Microsoft has rolled out updates to Azure to handle AI agent bursts and share memory between agents, while Databricks and Snowflake are repositioning themselves as AI memory and retrieval systems for enterprise data. As AI agents move from experimentation into production, these infrastructure changes aim to make machine-generated workloads cheaper and easier to deploy at scale.


