Zero-click prompt injection turned an AI assistant into a data leak path. Learn practical controls and AI-driven detection to protect enterprise assistants.

Zero-Click AI Assistants: Stop Data Leaks Before They Start
A zero-click data leak is the kind that makes security teams lose sleep: no phishing link, no macro, no “user error” to point at. Just a normal-looking document—shared like every other doc—quietly turning your AI assistant into an exfiltration path.
That’s the big lesson from GeminiJack, a recently fixed vulnerability in Gemini Enterprise that allowed attackers to plant hidden instructions inside common Workspace artifacts (Docs, email, Calendar invites). Later, when an employee asked Gemini something routine—like “show me our budgets”—the assistant could pull the poisoned content into its retrieval context and follow those instructions. No click required.
This post is part of our AI in Cybersecurity series, and I’m going to take a firm stance: if your organization is rolling out AI assistants with broad access to corporate data, you must treat them like privileged infrastructure—not like a helpful plugin. The good news is that the same category of technology that introduces this new risk—AI—also gives defenders the best tools to detect and prevent it.
What GeminiJack proves about AI assistants and enterprise risk
AI assistants aren’t “just chat.” They’re a new access layer across your data. Gemini Enterprise (like similar tools) can retrieve information from Gmail, Docs, Calendar, and other sources you’ve authorized. That retrieval layer—often implemented using RAG (retrieval-augmented generation)—is exactly where the risk concentrates.
Here’s the shift: classic security controls assume data access happens through well-known apps (email client, file browser, web app). With enterprise AI assistants, a single prompt can trigger a cross-system search and assemble an answer that mixes content from multiple repositories.
That’s powerful for productivity. It’s also powerful for attackers.
Why “no-click” changes the incident model
No-click exploits remove the only control many companies still rely on: user judgment. Security awareness training is useful, but it’s not designed for an attack where:
- the user doesn’t open anything suspicious,
- the user doesn’t approve a permission prompt,
- the user doesn’t see a warning banner,
- and the “action” happens inside an assistant’s retrieval pipeline.
If your mental model is still “the user must do something wrong,” you’ll miss what matters: the assistant did exactly what it was designed to do—retrieve relevant content—while being manipulated about what “relevant” means and what to do with it.
How indirect prompt injection becomes data exfiltration
Indirect prompt injection works because AI systems treat untrusted content as instructions. In the GeminiJack scenario described by researchers, an attacker embedded malicious instructions inside an ordinary-looking Google Doc, email, or Calendar invite.
When an employee later searched with Gemini Enterprise for something normal (budgets, finance terms, acquisition planning), the assistant:
- Retrieved the attacker’s “poisoned” document as part of context gathering
- Interpreted the hidden instructions as legitimate guidance
- Queried across Workspace data sources it had permission to access
- Returned output that included an external resource request (an image URL) that quietly transmitted the extracted data
The painful part is how normal it looks operationally. There’s no malware execution on the endpoint. There may be no unusual login. And if the exfiltration is encoded into an outbound request that blends in with everyday web traffic, traditional DLP and perimeter tooling may not flag it.
The real root cause: trust boundaries collapsed inside RAG
The deeper problem isn’t “prompt injection” as a novelty. It’s that RAG pipelines often collapse trust boundaries:
- Trusted instructions (system prompts, tool policies)
- Semi-trusted enterprise content (internal docs)
- Untrusted external contributions (shared docs, outside email)
If those sources get merged into a single context window without strong separation, the assistant can’t reliably tell facts from commands. Attackers take advantage of that ambiguity.
Why AI-driven threat detection matters more than ever
If AI assistants create machine-speed data access, defenders need machine-speed detection. That’s where AI in cybersecurity stops being a buzzword and becomes practical.
I’ve found that the strongest programs don’t try to “patch the human.” They build detection and prevention around the assistant’s behavior and data flows.
What AI detection can catch that traditional tools miss
A modern AI-driven threat detection program can look for patterns that aren’t obvious in isolation:
- Anomalous retrieval behavior: an assistant suddenly pulling from many repositories for a simple query
- Suspicious output construction: responses containing unusual external references or structured payload-like strings
- Cross-domain correlation: a newly shared external doc + a spike in assistant queries + outbound HTTP requests to unfamiliar hosts
- Semantic intent mismatch: the user asked for “Q4 budget summary,” but the assistant is assembling content tagged “acquisition,” “HR comp,” and “board deck”
These are the kinds of signals that rule-based controls struggle with because each event can look legitimate alone. AI models (and good analytics engineering) are built to evaluate relationships across events.
The security team’s new telemetry: “assistant activity”
Most orgs already monitor:
- IdP and SSO events
- endpoint events
- email security logs
- SaaS audit logs
Now add a new pillar: AI assistant audit telemetry. If you don’t have it, you’re blind.
At minimum, you want to log and retain:
- user prompt metadata (not necessarily full content, but enough for investigation)
- documents retrieved (IDs, labels, trust source, external vs internal)
- tool calls and connectors used
- outputs that contain external resource references
- policy decisions (blocked/allowed, reason)
Once you have this data, AI-driven detection becomes realistic: anomaly detection, clustering, and supervised detection can work with it.
Practical mitigations you can apply this quarter
You don’t need to abandon enterprise AI assistants to be safe—but you do need guardrails that assume compromise. Here’s what works when you translate “AI is a new access layer” into security controls.
1) Reduce the blast radius with least-privilege connectors
Treat every connector (Docs, Gmail, Calendar, ticketing systems, CRM) like a privileged integration.
- Start with narrow, role-based access (finance users don’t need R&D docs; sales doesn’t need legal M&A)
- Prefer read-only access where possible
- Use separate service identities with scoped permissions, not broad “everyone” access
One-liner worth repeating: If the assistant can read it, the attacker can try to make the assistant repeat it.
2) Classify and segment data for AI retrieval
Most teams classify data for compliance, then forget to enforce it in AI retrieval.
- Keep “restricted” repositories out of assistant retrieval by default
- Enforce rules like: “Only summarize restricted docs; never quote verbatim”
- Segment retrieval indexes so sensitive collections don’t mix with general corp content
3) Add output controls where exfiltration actually happens
In GeminiJack, exfiltration rode out via an external request embedded in output. That’s a reminder: the output channel is part of your security boundary.
Implement controls like:
- block or sanitize external URLs in assistant output (especially image/resource calls)
- detect and strip hidden markup patterns
- enforce “no external calls” mode for sensitive workflows
4) Human-in-the-loop for actions, not for reading
A common mistake: forcing manual approval for everything, which kills adoption and gets bypassed.
A better stance:
- allow low-risk summarization/search
- require explicit approval for state-changing actions (sending email, sharing files, updating calendars, creating tickets)
- require approval for high-sensitivity retrieval (executive docs, M&A, payroll, security incident reports)
5) Red team the assistant like a real attacker would
If you’re only testing direct prompt injection in chat, you’re missing the main risk.
Run scenarios that include:
- externally shared Docs with hidden instructions
- calendar invites from outside domains
- email threads with embedded “assistant directives”
- “benign” prompts that cause broad retrieval (“show me our budgets”, “pull customer pricing exceptions”)
Your goal isn’t to prove the assistant is unsafe. It’s to map where trust boundaries blur and put controls there.
“People also ask” questions security leaders are raising
Is this only a Google Gemini problem?
No. The pattern applies to any enterprise assistant using RAG across mixed-trust repositories. Vendor fixes help, but the class of issue remains.
Why didn’t DLP stop it?
Traditional DLP is strongest when it can see clear “file leaving the org” patterns (email attachment, upload). Assistant-mediated exfiltration can appear as ordinary web requests or generated text—harder to classify without context.
What’s the single most effective control?
Least-privilege access for assistant connectors is the fastest way to shrink impact. The second is assistant activity logging plus anomaly detection, because you need a way to catch what you didn’t anticipate.
Where AI in cybersecurity goes next: defending the access layer
Enterprise adoption of AI assistants is accelerating into 2026 budgets, and the seasonal reality is blunt: Q1 is when many teams expand pilots into production, because they want productivity wins early in the year. That’s also when attackers look for repeatable patterns across companies deploying the same tools.
The takeaway from the Gemini Enterprise no-click flaw isn’t “don’t use AI.” It’s this: AI assistants concentrate data access, so your detection and governance must concentrate there too.
If you’re responsible for security outcomes, your next step should be practical and measurable: inventory which assistants and agents have access to which repositories, turn on assistant audit logs, and put AI-driven anomaly detection on top of that telemetry.
When your assistant becomes a new access layer, what’s your plan for monitoring it like one?