Operational guide

How to Detect Shadow AI Across an Organization

Shadow AI detection is a repeatable process for finding signals of unregistered AI use, validating the business context, and routing confirmed uses into inventory and proportionate governance. Effective detection combines identity, SaaS, endpoint, network, browser, finance, procurement, data-protection, vendor, survey, and interview evidence without treating any one signal as a verdict.

Direct answer

Detection produces candidates, not accusations

Shadow AI is detected when technical and organisational signals are combined with human validation. A domain visit, expense, browser extension, or data-loss alert may indicate use, but it rarely explains purpose, account type, data, outputs, integration, business dependency, or approval status. Those facts determine the response.

A broader shadow AI assessment tests how detection fits the organisation's wider ownership, control, and evidence baseline.

This guide focuses on the discovery and validation workflow. The shadow AI assessment pillar addresses the wider organisational exposure and response. Monitoring must have a defined purpose, lawful basis, access controls, retention, employee communication, and appropriate privacy, security, employment, and consultation review.

Discovery

Use several partial views instead of one surveillance tool

No source provides a complete population. Single sign-on and SaaS management reveal managed accounts; endpoint and browser data expose applications and extensions; secure web gateway, SSE, or CASB records show domains and traffic; finance and procurement reveal purchased services; DLP alerts indicate sensitive-data movement; vendor catalogues expose newly enabled AI features; surveys and interviews reveal purpose and workflow.

Govern the sources before querying them. For each source, document the purpose, data fields used, permitted analysts, retention, employee communication, query or rule version, and escalation route. Security telemetry collected for one purpose should not automatically become a broad employee-monitoring dataset. A proportionate design improves trust and evidence quality because investigators can explain how a candidate was generated and what the signal can support.
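The documentation items listed above can be kept as one structured record per source. The following is a minimal sketch, not a prescribed schema: the class name `DetectionSource`, its field names, and the helper `governance_gaps` are all hypothetical and simply mirror the items named in the paragraph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionSource:
    """Governance record for one detection source (field names are illustrative)."""
    name: str
    purpose: str
    data_fields: tuple[str, ...]
    permitted_analysts: tuple[str, ...]
    retention_days: int
    employee_communication: str  # e.g. where the monitoring notice is published
    rule_version: str
    escalation_route: str


def governance_gaps(source: DetectionSource) -> list[str]:
    """Return the names of governance fields that are empty or unset."""
    # An empty string, empty tuple, or zero retention counts as undocumented.
    return [name for name, value in vars(source).items() if not value]
```

A source whose record still reports gaps would not be queried until the missing items are documented, which keeps the "govern before querying" rule checkable rather than aspirational.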

Start with high-confidence, low-intrusion sources and add more intrusive telemetry only where the visibility problem justifies it. Procurement, enterprise accounts, vendor feature inventories, and voluntary disclosure can establish a substantial baseline before content-sensitive monitoring is considered. Document why each additional source is necessary, which blind spot it closes, and when its continued use will be reviewed.

Detection sources create different evidence

Source	Evidence produced	Best operational use
Browser telemetry	Visits, extensions, sessions, domains, and policy events associated with a managed browser	Finding web-based services and extensions that may not appear in approved application records
Procurement records	Suppliers, contracts, purchase orders, card spend, renewals, and commercial owners	Finding paid AI services, pilots, and accountable commercial relationships
SSO logs	Enterprise applications, assigned users, authentication activity, and account ownership	Confirming managed access and comparing active services with the approved inventory
Endpoint telemetry	Installed applications, local models, command-line tools, processes, and managed-device activity	Finding locally deployed tools and use that does not require a browser or enterprise login
Vendor disclosures	Release notes, feature inventories, AI notices, contractual updates, and administrator settings	Finding AI added to software the organisation already licenses
Employee reporting	Purpose, workflow, data, outputs, value, dependency, account type, and unmet need	Explaining actual use and surfacing activity that technical and commercial sources cannot observe

The detection design should state source authority, access, confidence, known blind spots, and who validates each signal.

Coverage blind spots by source

Source	What it commonly misses	Compensating evidence
Browser telemetry	Desktop applications, local models, API calls, embedded vendor features, and activity on unmanaged devices	Endpoint, network, vendor, identity, and employee evidence
Procurement records	Free services, personal subscriptions, bundled features, trials, and internally developed models	Browser, endpoint, expense, architecture, and employee evidence
SSO logs	Personal accounts, shared credentials, non-federated tools, unused enabled features, and actual purpose	Browser, endpoint, validation interviews, and administrator configuration
Endpoint telemetry	Use on unmanaged devices, server-side services, embedded SaaS features, and business purpose	Network, cloud, vendor, identity, and employee evidence
Vendor disclosures	Locally enabled configuration, actual users, data entered, workflow dependency, and unannounced behaviour	Administrator settings, telemetry, contracts, and business validation
Employee reporting	Unknown tools, forgotten experiments, undisclosed use, and technical configuration details	Independent technical, commercial, inventory, and supplier reconciliation

Coverage is a reasoned view across sources, not the detection count from whichever tool is easiest to query.
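One way to make that reasoning explicit is to record what each source can observe and list what remains unobserved for a given set of enabled sources. The sketch below is an assumption-heavy simplification of the two tables above: the category labels and the `OBSERVES` mapping are hypothetical, and a real design would use the organisation's own taxonomy.

```python
# Hypothetical activity categories, loosely derived from the tables above.
OBSERVES: dict[str, set[str]] = {
    "Browser telemetry": {"web_services", "browser_extensions"},
    "Procurement records": {"paid_services"},
    "SSO logs": {"managed_accounts"},
    "Endpoint telemetry": {"desktop_applications", "local_models"},
    "Vendor disclosures": {"embedded_vendor_features"},
    "Employee reporting": {"business_purpose", "personal_accounts"},
}

ALL_CATEGORIES: set[str] = set().union(*OBSERVES.values())


def uncovered_categories(enabled_sources: set[str]) -> list[str]:
    """List the categories that none of the enabled sources can observe."""
    unknown = enabled_sources - set(OBSERVES)
    if unknown:
        raise ValueError(f"Unknown sources: {sorted(unknown)}")
    covered: set[str] = set()
    for source in enabled_sources:
        covered |= OBSERVES[source]
    return sorted(ALL_CATEGORIES - covered)
```

Calling `uncovered_categories({"SSO logs", "Procurement records"})` would list every category outside managed accounts and paid services, which is the kind of blind-spot statement the detection design should carry alongside any detection count.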

Human validation

Ask what work is being done and why

Signal-confidence matrix

Signal	Confidence	Validation required
Single visit to an AI service domain	Low	Confirm embedded content, research activity, account, recurrence, purpose, and whether any work data was processed
Recurring authenticated activity from an identified employee	Medium	Confirm feature, business use, data, outputs, approval, account terms, and operational dependency
Purchase, contract, or expense tied to an AI supplier	Medium	Confirm deployment, users, intended purpose, enabled features, and whether the service is active
Endpoint evidence of a running local model or AI application	Medium to high	Confirm installer, user, model, data access, purpose, network connections, and approval status
DLP event showing sensitive data sent to an AI service	High for data transfer; incomplete for governance status	Validate the source record, account, purpose, recipient service, permitted use, impact, and containment need
Owner interview corroborated by configuration and workflow evidence	High	Confirm scope, lifecycle status, linked inventory record, controls, exceptions, and retained decision

Confidence describes what the signal supports, not how serious the use is. Materiality is assessed separately after the business context is known.
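The matrix can be held as a lookup so that every candidate carries its confidence label and the checks still outstanding. This is a sketch only: the dictionary keys are hypothetical identifiers, only four of the six rows are shown, and the check names are shortened from the table.

```python
# Keys are hypothetical identifiers for rows of the matrix above (subset only).
SIGNAL_MATRIX: dict[str, tuple[str, tuple[str, ...]]] = {
    "single_domain_visit": (
        "low",
        ("embedded content", "research activity", "account",
         "recurrence", "purpose", "work data processed"),
    ),
    "recurring_authenticated_activity": (
        "medium",
        ("feature", "business use", "data", "outputs",
         "approval", "account terms", "operational dependency"),
    ),
    "supplier_purchase": (
        "medium",
        ("deployment", "users", "intended purpose",
         "enabled features", "service active"),
    ),
    "local_model_on_endpoint": (
        "medium to high",
        ("installer", "user", "model", "data access",
         "purpose", "network connections", "approval status"),
    ),
}


def review_status(signal_type: str, confirmed: set[str]) -> tuple[str, list[str]]:
    """Return the confidence label and the validation checks not yet confirmed."""
    confidence, required = SIGNAL_MATRIX[signal_type]
    outstanding = [check for check in required if check not in confirmed]
    return confidence, outstanding
```

The function deliberately returns no severity: materiality is a separate judgement made once the outstanding checks are complete.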

A short validation interview should confirm the tool and feature, account type, users, purpose, input data, outputs, decisions influenced, integrations, automation, approval status, alternatives, and operational dependency. Keep the tone factual. Teams are more likely to disclose useful activity when the process distinguishes discovery from discipline and can offer a workable route to approval.

Give validators a controlled set of outcomes: false positive, available but unused feature, sanctioned use, approved exception, candidate requiring review, confirmed unmanaged use, or prohibited use requiring authorised containment. This vocabulary prevents every signal from becoming an incident and gives management a denominator for programme reporting. It also shows where embedded vendor AI, personal accounts, and local experiments enter through different channels.
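The controlled vocabulary above maps naturally onto an enumeration, and counting outcomes against the total number of validated candidates gives the denominator mentioned. The following is a minimal sketch; the names `Outcome` and `outcome_shares` are illustrative, not part of any established tool.

```python
from collections import Counter
from enum import Enum


class Outcome(Enum):
    """Validation outcomes, taken from the list in the paragraph above."""
    FALSE_POSITIVE = "false positive"
    AVAILABLE_UNUSED = "available but unused feature"
    SANCTIONED = "sanctioned use"
    APPROVED_EXCEPTION = "approved exception"
    NEEDS_REVIEW = "candidate requiring review"
    CONFIRMED_UNMANAGED = "confirmed unmanaged use"
    PROHIBITED = "prohibited use requiring authorised containment"


def outcome_shares(outcomes: list[Outcome]) -> dict[Outcome, float]:
    """Share of validated candidates per outcome; outcomes never seen are omitted."""
    total = len(outcomes)
    if total == 0:
        return {}
    counts = Counter(outcomes)
    return {outcome: counts[outcome] / total for outcome in Outcome if counts[outcome]}
```

Because validators can only choose from the enumeration, programme reports compare like with like from one period to the next.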

Validation should also capture the unmet business need. A team may be trying to translate documents, search a knowledge base, summarise meetings, generate test data, or automate repetitive analysis. Recording that need helps the organisation choose an approved alternative or design a faster review route. Removing access without addressing the workflow often moves the same use to a less visible service.

Confirmed uses should become governed records in the AI system inventory template, including a discovery source, accountable owner, review status, and any temporary restrictions.

Decision

Prioritise the use case, not the popularity of the tool

Triage priority should increase when the use involves confidential or personal data, external communication, consequential decisions, production code, privileged access, automated action, high volume, vulnerable people, contractual commitments, or material process dependency. A widely used drafting tool may warrant only standard controls; an obscure automation with payment authority may require immediate containment.
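A simple way to keep popularity out of the ordering is to rank validated use cases by the triage factors they involve and ignore user counts entirely. The sketch below assumes an unweighted count of factors, which is a deliberate simplification: the factor identifiers and the `UseCase` structure are hypothetical, and a real programme may weight some factors or escalate on specific combinations.

```python
from dataclasses import dataclass, field

# Factor identifiers are illustrative labels for the list in the paragraph above.
TRIAGE_FACTORS = frozenset({
    "confidential_or_personal_data", "external_communication",
    "consequential_decisions", "production_code", "privileged_access",
    "automated_action", "high_volume", "vulnerable_people",
    "contractual_commitments", "material_process_dependency",
})


@dataclass
class UseCase:
    name: str
    active_users: int  # recorded for context, never used for ranking
    factors: set[str] = field(default_factory=set)

    def __post_init__(self) -> None:
        unknown = self.factors - TRIAGE_FACTORS
        if unknown:
            raise ValueError(f"Unknown triage factors: {sorted(unknown)}")


def triage_order(use_cases: list[UseCase]) -> list[UseCase]:
    """Order use cases by number of triage factors, most first."""
    return sorted(use_cases, key=lambda use: len(use.factors), reverse=True)
```

Under this ordering, a single-user automation with privileged access and automated action is reviewed before a drafting assistant with hundreds of users and no triage factors.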

Operational detection and response cycle

01. Authorise the detection model
Define purpose, sources, lawful and ethical safeguards, access, retention, oversight, employee communication, and escalation.
02. Generate candidate signals
Record source, date, rule version, confidence, owner, duplicates, and known blind spots.
03. Validate business context
Confirm tool, feature, account, users, purpose, data, outputs, integrations, dependency, and approval.
04. Triage material exposure
Prioritise data sensitivity, external effect, consequential decisions, privileges, autonomy, scale, and dependency.
05. Choose a proportionate response
Register, approve, add controls, migrate, restrict, suspend, retire, or escalate according to validated facts.
06. Verify and learn
Confirm the action, monitor recurrence, analyse root causes, and improve approved alternatives or governance cycle time.

A case is not closed when an email is sent. Closure requires evidence that the agreed access, migration, control, or retirement action took effect.
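The closure rule can be enforced in case tracking by making verification evidence, not notification, the condition for closing. The following is a sketch under assumptions: the `Case` fields and the `can_close` helper are hypothetical, and the action names are taken from the response step of the cycle.

```python
from dataclasses import dataclass, field

# Response actions as named in the response step of the cycle above.
RESPONSE_ACTIONS = (
    "register", "approve", "add controls", "migrate",
    "restrict", "suspend", "retire", "escalate",
)


@dataclass
class Case:
    case_id: str
    agreed_action: str
    notification_sent: bool = False
    # References to evidence that the action took effect, e.g. a config export.
    verification_evidence: list[str] = field(default_factory=list)


def can_close(case: Case) -> bool:
    """A case may close only when evidence shows the agreed action took effect."""
    if case.agreed_action not in RESPONSE_ACTIONS:
        raise ValueError(f"Unknown response action: {case.agreed_action!r}")
    # Sending a notification is intentionally not sufficient for closure.
    return len(case.verification_evidence) > 0
```

Keeping `notification_sent` as a separate field makes the distinction visible in reporting: a case can be notified and still open.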

Management view

Measure visibility and response quality, not raw detections

Useful indicators include source coverage, candidate-to-confirmed ratio, validation time, unknown owners, material uses discovered, approval cycle time, repeat findings, migration completion, stale exceptions, and incidents linked to unregistered use. A rising detection count may indicate worsening exposure, better visibility, or both; management needs the accompanying narrative and the denominator to tell which.
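Several of these indicators can be derived from the same candidate records. The sketch below computes three of them under stated assumptions: the `Candidate` fields are hypothetical, the candidate-to-confirmed ratio is expressed as candidates per confirmed use, and validation time is summarised as a median number of days.

```python
from dataclasses import dataclass
from datetime import date
from statistics import median
from typing import Optional


@dataclass
class Candidate:
    detected_on: date
    validated_on: Optional[date] = None  # None while validation is pending
    confirmed: bool = False
    owner: Optional[str] = None


def programme_metrics(candidates: list[Candidate]) -> dict[str, Optional[float]]:
    """Summarise visibility and response quality rather than raw detections."""
    validated = [c for c in candidates if c.validated_on is not None]
    confirmed = [c for c in validated if c.confirmed]
    days = [(c.validated_on - c.detected_on).days for c in validated]
    return {
        "candidates": len(candidates),
        "confirmed": len(confirmed),
        # None when nothing is confirmed, to avoid dividing by zero.
        "candidates_per_confirmed": (
            len(candidates) / len(confirmed) if confirmed else None
        ),
        "median_validation_days": median(days) if days else None,
        "confirmed_without_owner": sum(1 for c in confirmed if c.owner is None),
    }
```

Reporting the candidate total alongside each derived figure keeps the denominator in view when the detection count changes.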

Analyse root causes alongside cases. Repeated use of one consumer service may indicate missing enterprise functionality; repeated personal accounts may indicate procurement or access friction; repeated unreviewed vendor features may indicate weak change notification. Correcting the operating cause often reduces exposure more effectively than expanding blocking rules, especially when employees are solving legitimate workflow problems.

Use the shadow AI risk guide to assess confirmed material uses, then the shadow AI policy framework to address recurring root causes such as unclear rules, unavailable alternatives, or a review process people cannot navigate.

FAQ

Frequently asked questions

Which data sources are most useful for detecting shadow AI?

Combine identity and SaaS records, endpoint and browser data, network or SSE/CASB signals, finance and procurement, DLP alerts, vendor feature inventories, surveys, disclosures, and interviews. Each source has blind spots and requires contextual validation.

Does visiting an AI website prove shadow AI use?

No. It is a candidate signal. The visit may come from an embedded approved feature, research, a dormant session, or personal activity. Validate the account, purpose, data, outputs, workflow, and approval status before reaching a conclusion.

How can organisations detect shadow AI without excessive employee surveillance?

Define a proportionate purpose, use the minimum necessary sources, restrict access, set retention, involve privacy, legal, HR, security, and employee representatives where appropriate, communicate the process, and validate signals fairly before action.

Which shadow AI cases should be investigated first?

Prioritise sensitive data, consequential decisions, external outputs, production systems, privileged access, automated action, high volume, vulnerable stakeholders, and material business dependency.

What should happen after shadow AI is confirmed?

Assign an owner and decide whether to register and approve, add controls, migrate to an enterprise service, restrict use, suspend access, retire the workflow, or escalate. Preserve the rationale and verify completion.

How should false positives be handled?

Record the validation source, rationale, system relationship, reviewer, and closure date. Use false-positive patterns to refine detection rules without deleting the audit trail that explains the decision.

What metrics show whether a shadow AI programme is working?

Measure source coverage, validation time, confirmed material uses, unknown ownership, repeat findings, approval cycle time, migration or remediation completion, exception age, and incidents, rather than detections alone.
