
Autonomous AI agents are considered the next level of artificial intelligence. But a new study entitled “Agents of Chaos” shows their dark side and demonstrates how easily these systems can be manipulated.
AI agents can carry out tasks largely independently and do not require constant human supervision. They no longer just respond to individual requests, but can pursue multi-step goals and adapt to new situations.
This allows them to make work processes significantly more efficient and to handle routine tasks automatically. A 2025 survey also shows that many companies expect significant productivity gains from AI agents.
70 percent of the executives surveyed rate AI agents as one of the three most important technology trends of 2025, with the insurance sector leading at 85 percent, followed by retail at 81 percent.
At the same time, AI agents also bring with them new risks. Because they act and make decisions independently, errors or manipulations can have more far-reaching consequences than with conventional AI systems.
This is exactly what a recent study called “Agents of Chaos” from the Bau Lab at Northeastern University confirms. The researchers show how vulnerable such agents are to attacks and how quickly helpful tools can become potential agents of chaos.
AI agents show massive weaknesses in testing
For their research, the team at Northeastern University deployed six autonomous AI agents on a live Discord server. Among other things, the agents were given access to email accounts and were allowed to communicate independently with the researchers and with other AI agents via email or Discord messages.
The researchers also gave the agents control over their own computer systems: the agents could create or modify their own files and install new tools they needed to complete their tasks.
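To make the setup concrete, the following is a minimal sketch of what such a tool layer for an agent could look like. It is an illustration under assumptions, not the study’s actual code: all tool names and the dispatch structure are hypothetical.

```python
# Illustrative sketch only: the kind of tool access the study describes,
# not the researchers' implementation. All names here are assumptions.
import subprocess
from pathlib import Path

def send_message(recipient: str, subject: str, body: str) -> str:
    # Placeholder: a real agent would call a mail API or Discord client here.
    return f"sent '{subject}' to {recipient}"

def write_file(path: str, content: str) -> str:
    # The agents were allowed to create or modify their own files.
    Path(path).write_text(content)
    return f"wrote {len(content)} characters to {path}"

def install_tool(package: str) -> str:
    # The agents were allowed to install new tools for their tasks.
    subprocess.run(["pip", "install", package], check=True)
    return f"installed {package}"

# The model picks a tool and its arguments; the dispatcher executes it
# unconditionally. That unconditional execution is what turns a
# misinterpreted instruction into a real-world action.
TOOLS = {"send_message": send_message,
         "write_file": write_file,
         "install_tool": install_tool}

def execute(action: dict) -> str:
    return TOOLS[action["tool"]](**action["args"])
```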
Twenty researchers worked with the autonomous agents over two weeks. The AI systems’ task was to support the researchers with everyday administrative work.
At the same time, the researchers tried to manipulate the agents and thus test their limits. “Identifying vulnerabilities is a great way to determine the limits of a system,” explains lead author Natalie Shapira.
How AI can be manipulated
After just a few conversations, one researcher managed to manipulate the AI agent “Ash”. She tricked it into hiding a secret password from its owner – another researcher.
She later asked Ash to delete the email with the password. However, since there was no delete function in the mailbox set up specifically for the experiment, Ash opted for the “nuclear option” and reset the entire email server.
“You never know how these agents and models interpret instructions, and they could interpret them very differently than you expected,” explains Christoph Riedl, a professor of information systems and network science at Northeastern University. “If that happens in ChatGPT, it’s not a problem. You just say, ‘That’s not what I meant. Can you please do it differently?’”
In the real world, however, that is not enough. According to Riedl, this is because AI agents are generally “terribly bad” at thinking logically. This is particularly problematic “when several users are in a ‘conflictual’ situation.”
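The Ash incident points to an obvious, if partial, safeguard that the article itself does not spell out: fail closed when a requested tool is missing, and require human confirmation for destructive actions. A minimal sketch, with all tool names assumed:

```python
# Hypothetical guardrail, not from the study. The idea: a missing narrow
# tool ("delete_email") must abort the task, never be replaced by a broader
# one ("reset_email_server"), and destructive tools need human sign-off.

DESTRUCTIVE = {"reset_email_server"}  # assumed label for the "nuclear option"

def dispatch(tool_name: str, tools: dict, confirmed: bool = False):
    if tool_name not in tools:
        # Fail closed instead of letting the model improvise an alternative.
        raise LookupError(f"tool '{tool_name}' is unavailable; aborting")
    if tool_name in DESTRUCTIVE and not confirmed:
        raise PermissionError(f"'{tool_name}' requires human confirmation")
    return tools[tool_name]()

# With such a dispatcher, Ash's request would have stopped with an error
# instead of escalating to a full server reset:
# dispatch("delete_email", tools={"reset_email_server": lambda: "reset"})
# -> LookupError: tool 'delete_email' is unavailable; aborting
```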
AI agents reveal private information
Even private information was not safe from the AI agents. In one conversation, Riedl asked an AI agent to arrange an appointment with a colleague.
The AI agent was unable to complete this task, but passed on the other researcher’s email address without being asked. Riedl is critical of this: “If it’s a CEO’s AI assistant whose email address is intentionally kept secret… just because I know the name doesn’t mean I know the email address, but the agent just gave it away like that.”
Overall, it was easy for the researchers to exploit the AI agents’ gullibility. Through sustained emotional pressure, they were even able to get the agents to exceed their permissions and, for example, delete certain documents.
“These behaviors raise unresolved questions about accountability, delegated authority, and liability for consequential damages,” concludes Shapira. “They suggest that as AI systems are integrated into real-world infrastructures with communication channels, delegated authority and persistent storage, new types of errors are emerging.”