AI agents are becoming ubiquitous — and, as it turns out, increasingly attractive targets for attackers.

In fact, they represent a new frontier in the eternal battle between the hacker and the hack-ee.

According to Microsoft, almost a quarter of a million organizations use AI agents (including 90% of the Fortune 500) — and that figure only covers Copilot Studio usage. This is good news for user productivity, but bad news for security teams.

“AI assistants, copilots, and agents significantly expand the enterprise attack surface in ways that traditional security architectures were not designed to handle,” Todd Thiemann, a cybersecurity analyst at research firm Omdia, told TechRepublic. “Prompt injections and other AI-targeted exploits represent a new class of attacks that use text-based payloads that manipulate machine reasoning rather than human behavior.”

Agents can bypass email security safeguards

Email security is largely built around the detection of suspicious links, fake domains, malware-ridden attachments, and other red flags. Regular email phishing remains a go-to vector used heavily by cybercriminals. IBM’s Cost of a Data Breach report noted that it accounts for 16% of all breaches. But AI assistants and AI agents are now in the firing line.

Instead of hoping that an inattentive user clicks on a malicious URL or attachment, cybercriminals are going after agents and AI assistants. They do this by using questions or commands in text or code form (i.e., prompts) that trick agents into producing the desired response or persuade them to execute specific tasks.

This bypasses most defenses as these malicious prompts are hidden. They use invisible text, special formatting, and other forms of subterfuge to trick generative AI tools into doing risky actions. In extreme cases, this has resulted in data being exfiltrated and in-place security checks being ignored.

Email messages, after all, contain far more than the visible text, the sender, the recipient, and the subject. They include headers, metadata, and content in both plain text and HTML, creating many places where hidden instructions can be buried to escape detection. Traditional filters, malware signature files, and other defenses are not likely to detect that anything is amiss.

“There are cases in recent attacks where the HTML and plain text version were completely different,” Daniel Rapp, Chief AI and Data Officer at Proofpoint, told TechRepublic. “Invisible plain text might contain a prompt injection that can be picked up and acted on by an AI system.”

The situation is worsened by the fact that AI agents can sometimes be far more gullible than humans. While a human might become suspicious about being told to buy $2,000 of gift cards and send them to a stranger, an AI agent may carry out the task rapidly and without question.

“The literal nature of AI agents makes them susceptible to phishing and other social engineering tricks,” said Rapp.

Now factor in AI agents being set up to access inboxes before users have a chance to view them. Some users arrange their agents to send automatic acknowledgements, direct certain messages to spam, and prioritize others. As these AI assistants have access to the inbox, they can automatically act on emails the moment they arrive.

Preventing agent scamming

Proofpoint Prime Threat Protection includes features designed to thwart hackers’ attempts to subvert AI agents’ actions. It does this by scanning email messages for potential threats before they reach any inbox. Its software picks up messages inline when they are traveling from the originator to the recipient.

“We have placed detection capabilities directly in the delivery path, which means latency and efficiency are critical,” said Rapp.

This is accomplished using AI models that are much smaller than typical large language models (LLMs) such as ChatGPT. To maintain accuracy, its models are updated every couple of days using the last few hundred million emails it has processed. AI is used to interpret the message’s intent, not just scan for indicators. It can spot prompt injections, hidden instructions, and other exploits before an agent can act upon them.

“Security tooling must evolve from detecting known bad indicators to interpreting intent for humans, machines, and AI agents,” said Thiemann. “Approaches that identify malicious instructions or manipulative prompts pre-delivery, ideally using distilled AI models for low-latency inline protection, address a significant gap in today’s defenses.”

Also read: Gmail-linked credentials exposed in massive breach, detailing how stolen passwords can spread through underground networks and be reused in phishing and credential-stuffing attacks.

Share.
Leave A Reply

Exit mobile version