Agent-based AI systems face growing threats from zero-click and one-click exploits

Summary

At Black Hat USA, security firm Zenity unveiled a series of zero-click and one-click exploit chains, dubbed “AgentFlayer,” that target some of the most widely used enterprise AI platforms.

According to Zenity, these attacks impact ChatGPT, Copilot Studio, Cursor (with Jira MCP), Salesforce Einstein, Google Gemini, and Microsoft Copilot. What sets these exploits apart is their use of indirect prompts hidden in seemingly innocuous resources, which can be triggered with little or no user interaction.

Known as prompt injection, this technique has plagued LLM systems for years, and attempts to stop it haven’t solved the issue. As agent-based AI becomes more common, these vulnerabilities are only getting worse. Even OpenAI CEO Sam Altman has warned users not to trust new ChatGPT agents with sensitive data.

Salesforce Einstein: Rerouting customer contacts through attacker domains

In a demo, Zenity co-founder Michael Bargury showed how attackers could exploit Salesforce Einstein by planting specially crafted CRM records. Einstein allows companies to automate tasks like updating contact details or integrating with Slack. Attackers can create trap cases that look harmless, then wait for a sales rep to ask a routine LLM query such as “What are my latest cases?” triggering the exploit.

Ad

The LLM agent scans the CRM content, interprets the hidden instructions as legitimate, and acts on its own. In this scenario, Einstein automatically replaced all customer email addresses with an attacker-controlled domain, silently redirecting all future communications. The original addresses remained in the system as encoded aliases, so the attacker could track where messages were meant to go.

Salesforce confirmed to SecurityWeek that the vulnerability was fixed on July 11, 2025, and the exploit is no longer possible.

Another zero-click exploit targets the developer tool Cursor when used with Jira. In Zenity’s “Ticket2Secret” demo, a seemingly harmless Jira ticket can execute code in the Cursor client without any user action, allowing attackers to extract sensitive data like API keys or credentials straight from the victim’s local files or repositories.

Zenity also previously demonstrated a proof-of-concept attack using an invisible prompt (white text, font size 1) hidden in a Google Doc to make ChatGPT leak data. The exploit abused OpenAI’s “Connectors” feature, which links ChatGPT to services like Gmail or Microsoft 365.

Recommendation

Continue Reading