Microsoft research 'exposes' how AI shopping agents can be easily fooled

Microsoft and Arizona State University researchers have released a study showing that current AI agents, including leading models like GPT-4o and Gemini-2.5-Flash, are vulnerable to manipulation when performing tasks like making purchases on users’ behalf.

The research raises concerns about how reliable these autonomous agents will be when working unsupervised, potentially slowing the promised “agentic future” where AI handles complex tasks on its own.

How Microsoft tracked AI agents' behaviour

According to the research (reported by TechCrunch), Microsoft released a new simulation environment called the “Magentic Marketplace” to test agent behaviour. This synthetic platform allows researchers to experiment with AI agents interacting in a competitive setting.

In a typical experiment, a customer agent, following a user’s instructions (e.g., ordering dinner), negotiates with multiple business-side agents (representing various restaurants) that are competing to win the order. The initial research involved 100 customer agents interacting with 300 business agents.

“There is really a question about how the world is going to change by having these agents collaborating and talking to each other and negotiating. We want to understand these things deeply,” said Ece Kamar, managing director of Microsoft Research’s AI Frontiers Lab.
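To make the setup concrete, here is a minimal Python sketch of how such a two-sided agent marketplace could be wired up. The class names, the offer format, and the decision rule are illustrative assumptions for this article, not the actual Magentic Marketplace code or API.

```python
import random

# Illustrative sketch of a two-sided agent marketplace, loosely modelled
# on the setup described above. All names and interfaces are hypothetical;
# this is NOT the actual Magentic Marketplace implementation.

class BusinessAgent:
    """Represents one restaurant competing to win an order."""
    def __init__(self, name, price):
        self.name = name
        self.price = price  # quoted price for the requested dinner

    def make_offer(self, request):
        # In the study this would be an LLM generating a sales pitch,
        # possibly a manipulative one; here it is just a structured quote.
        return {"vendor": self.name,
                "price": self.price,
                "pitch": f"{self.name} can fulfil '{request}'"}

class CustomerAgent:
    """Follows a user's instruction and picks among competing offers."""
    def choose(self, offers):
        # Stand-in decision rule: cheapest offer wins. The study's agents
        # instead reason over free-text pitches, which is exactly what
        # makes them susceptible to manipulation.
        return min(offers, key=lambda o: o["price"])

# 100 customer agents and 300 business agents, mirroring the scale
# reported for the initial experiments.
businesses = [BusinessAgent(f"restaurant-{i}", random.uniform(8, 30))
              for i in range(300)]
customers = [CustomerAgent() for _ in range(100)]

for customer in customers:
    offers = [b.make_offer("order dinner") for b in random.sample(businesses, 5)]
    winner = customer.choose(offers)
```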

Microsoft research discovered key vulnerabilities

The initial experiments exposed several significant weaknesses in the agentic models tested.

First, the researchers identified techniques that business agents could use to manipulate customer agents into purchasing their products, indicating a potential susceptibility to deceptive marketing or sales tactics when AI is left to negotiate.

Another critical vulnerability emerged when customer agents were given too many options to choose from. The researchers noted a distinct falloff in efficiency, suggesting that a large volume of choices overwhelms the agent’s “attention space.” “We want these agents to help us with processing a lot of options. And we are seeing that the current models are actually getting really overwhelmed by having too many options,” Kamar said.

The third vulnerability is that the agents also struggled when tasked with collaborating toward a common goal, often appearing unsure of how to divide roles or execute the required steps. While performance improved with explicit, step-by-step instructions, the researchers noted that these foundational collaboration capabilities should ideally be inherent to the models.
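A hedged sketch of how the “too many options” falloff could be measured: sweep the number of offers shown to a customer agent and record how often it selects the objectively best one. The `query_customer_agent` function below is a placeholder standing in for whatever model (GPT-4o, Gemini-2.5-Flash, etc.) is under test, with a toy degradation model baked in; none of this is the study’s actual harness or its reported numbers.

```python
import random

def query_customer_agent(offers):
    """Placeholder for an LLM-backed customer agent. A real harness would
    prompt the model under test with the offers and parse its pick; here
    we simulate an agent whose accuracy degrades as options grow (a toy
    assumption, not a measured result)."""
    noise = min(0.9, 0.05 * len(offers))  # toy model of attention overload
    best = max(offers, key=lambda o: o["value"])
    return best if random.random() > noise else random.choice(offers)

def optimal_choice_rate(n_options, trials=200):
    """Fraction of trials where the agent picks the best of n_options offers."""
    hits = 0
    for _ in range(trials):
        offers = [{"id": i, "value": random.random()} for i in range(n_options)]
        best_id = max(offers, key=lambda o: o["value"])["id"]
        if query_customer_agent(offers)["id"] == best_id:
            hits += 1
    return hits / trials

for n in (3, 10, 30, 100):
    print(f"{n:>3} options -> optimal pick rate {optimal_choice_rate(n):.2f}")
```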
