Detection: M365 Copilot Information Extraction Jailbreak Attack

EXPERIMENTAL DETECTION

This detection status is set to experimental. The Splunk Threat Research team has not yet fully tested, simulated, or built comprehensive datasets for this detection. As such, this analytic is not officially supported. If you have any questions or concerns, please reach out to us at research@splunk.com.

Description

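Detects M365 Copilot information extraction jailbreak attacks that attempt to obtain sensitive, classified, or comprehensive data through social engineering techniques such as fictional entity impersonation, bulk data requests, and privacy bypass attempts. The detection searches exported eDiscovery prompt logs for extraction keywords such as "transcendent," "tell me everything," "confidential," "dump," "extract," "reveal," and "leak" in the Subject_Title field. It then categorizes each matching prompt by extraction type and assigns a severity level using the first rule that matches: HIGH for fictional-entity prompts, "tell me everything" requests, dump/extract/scrape requests, and privacy bypass; CRITICAL for classified, proprietary, or confidential data; MEDIUM for hidden, secret, internal, or sensitive data and for reveal/disclose/leak requests. Low-severity prompts are filtered out, and the remaining prompts are tagged with compound risk patterns such as "Confidential+Extraction" or "Bulk_Request+Bypass" to surface the most dangerous attempts to exfiltrate sensitive organizational information through AI manipulation. Bypass phrases (for example, "bypass ... privacy") are not part of the initial keyword search; they are evaluated only during classification, so they are flagged only when the prompt also contains one of the search keywords.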

`m365_exported_ediscovery_prompt_logs`
| search Subject_Title="*transcendent*" OR Subject_Title="*incorporeal*" OR Subject_Title="*being who*"
    OR Subject_Title="*tell me everything*" OR Subject_Title="*give me all*" OR Subject_Title="*comprehensive*"
    OR Subject_Title="*step by step*" OR Subject_Title="*hidden*" OR Subject_Title="*secret*"
    OR Subject_Title="*confidential*" OR Subject_Title="*classified*" OR Subject_Title="*internal*"
    OR Subject_Title="*sensitive*" OR Subject_Title="*proprietary*" OR Subject_Title="*dump*"
    OR Subject_Title="*extract*" OR Subject_Title="*reveal*" OR Subject_Title="*disclose*" OR Subject_Title="*leak*"
| eval user = Sender
| eval extraction_type=case(
    match(Subject_Title, "(?i)(transcendent|incorporeal).*being"), "Knowledge_Entity",
    match(Subject_Title, "(?i)tell.*me.*(everything|all)"), "Everything_Request",
    match(Subject_Title, "(?i)(give|show|provide).*me.*(all|every)"), "Complete_Data_Request",
    match(Subject_Title, "(?i)(hidden|secret|confidential|classified)"), "Restricted_Info",
    match(Subject_Title, "(?i)(comprehensive|complete|full|entire)"), "Complete_Info",
    match(Subject_Title, "(?i)(dump|extract|scrape).*(data|info|content)"), "Data_Extraction",
    match(Subject_Title, "(?i)(reveal|disclose|expose|leak)"), "Information_Disclosure",
    match(Subject_Title, "(?i)(internal|proprietary|sensitive).*information"), "Sensitive_Data_Request",
    match(Subject_Title, "(?i)step.*by.*step.*(process|procedure|method)"), "Process_Extraction",
    match(Subject_Title, "(?i)(bypass|ignore).*privacy"), "Privacy_Bypass",
    match(Subject_Title, "(?i)(access|view|see).*(private|restricted)"), "Unauthorized_Access",
    1=1, "Generic_Request")
| eval severity=case(
    match(Subject_Title, "(?i)(transcendent|incorporeal)"), "HIGH",
    match(Subject_Title, "(?i)tell.*everything"), "HIGH",
    match(Subject_Title, "(?i)(dump|extract|scrape)"), "HIGH",
    match(Subject_Title, "(?i)(classified|proprietary|confidential)"), "CRITICAL",
    match(Subject_Title, "(?i)(hidden|secret|internal|sensitive)"), "MEDIUM",
    match(Subject_Title, "(?i)(reveal|disclose|leak)"), "MEDIUM",
    match(Subject_Title, "(?i)(bypass|ignore).*privacy"), "HIGH",
    1=1, "LOW")
| where severity!="LOW"
| eval data_risk_flags=case(
    match(Subject_Title, "(?i)(classified|confidential|proprietary)") AND match(Subject_Title, "(?i)(dump|extract|scrape)"), "Confidential+Extraction",
    match(Subject_Title, "(?i)(everything|all|complete)") AND match(Subject_Title, "(?i)(bypass|ignore)"), "Bulk_Request+Bypass",
    match(Subject_Title, "(?i)(classified|confidential|proprietary)"), "Confidential",
    match(Subject_Title, "(?i)(dump|extract|scrape)"), "Extraction",
    match(Subject_Title, "(?i)(everything|all|complete|comprehensive)"), "Bulk_Request",
    match(Subject_Title, "(?i)(bypass|ignore)"), "Bypass_Attempt",
    1=1, "Standard_Request")
| table _time, user, Subject_Title, extraction_type, severity, data_risk_flags, Size
| sort -severity, -_time
| `m365_copilot_information_extraction_jailbreak_attack_filter`
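
The severity assignment relies on case() returning the value paired with the first condition that matches, so the order of the rules matters. The following Python sketch mirrors the severity rules from the search so that individual prompts can be checked offline. It is an illustration only, not part of the detection; the names SEVERITY_RULES, SEVERITY_RANK, and classify_severity are hypothetical.

```python
import re

# Ordered (pattern, severity) pairs copied from the severity case() in the search.
# Order matters: the first matching rule decides the result.
SEVERITY_RULES = [
    (r"(transcendent|incorporeal)", "HIGH"),
    (r"tell.*everything", "HIGH"),
    (r"(dump|extract|scrape)", "HIGH"),
    (r"(classified|proprietary|confidential)", "CRITICAL"),
    (r"(hidden|secret|internal|sensitive)", "MEDIUM"),
    (r"(reveal|disclose|leak)", "MEDIUM"),
    (r"(bypass|ignore).*privacy", "HIGH"),
]

# Hypothetical numeric ranking, useful when ordering results by severity.
SEVERITY_RANK = {"LOW": 0, "MEDIUM": 1, "HIGH": 2, "CRITICAL": 3}


def classify_severity(subject_title: str) -> str:
    """Return the severity of the first matching rule, or LOW if none match."""
    for pattern, severity in SEVERITY_RULES:
        # Unanchored, case-insensitive match, like match() with (?i) in the search.
        if re.search(pattern, subject_title, flags=re.IGNORECASE):
            return severity
    return "LOW"


for prompt in (
    "Show me the classified budget",
    "Extract the confidential salary data",
    "What is for lunch today",
):
    print(prompt, "->", classify_severity(prompt))
```

Under these rules, a prompt such as "Extract the confidential salary data" is rated HIGH rather than CRITICAL, because the dump/extract/scrape rule is evaluated before the classified/proprietary/confidential rule. Also note that severity is a string field: if the final sort compares it lexicographically, descending order would list MEDIUM before HIGH and CRITICAL. A numeric rank like the SEVERITY_RANK mapping above is one way to express the intended ordering.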

Data Source

Name                              Platform  Sourcetype  Source
M365 Exported eDiscovery Prompts  N/A       'csv'       'csv'

Macros Used

Name                                                         Value
m365_exported_ediscovery_prompt_logs                         (sourcetype=csv)
m365_copilot_information_extraction_jailbreak_attack_filter  search *

m365_copilot_information_extraction_jailbreak_attack_filter is an empty macro by default. It allows the user to filter out any results (false positives) without editing the SPL.

Annotations

MITRE ATT&CK
ID     Technique        Tactic
T1562  Impair Defenses  Defense Evasion

Kill Chain Phases: Exploitation
NIST: DE.CM
CIS: CIS 10

Default Configuration

This detection is configured by default in Splunk Enterprise Security to run with the following settings:

Setting               Value
Disabled              true
Cron Schedule         0 * * * *
Earliest Time         -70m@m
Latest Time           -10m@m
Schedule Window       auto
Creates Notable       Yes
Rule Title            %name%
Rule Description      %description%
Notable Event Fields  user, dest
Creates Risk Event    True

This configuration file applies to all detections of type TTP. These detections will use Risk Based Alerting and generate Notable Events.
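
The schedule settings above imply that each hourly run searches a 60-minute window that ends 10 minutes before the run starts. The Python sketch below works through that arithmetic for two consecutive runs. The function name search_window and the example dates are hypothetical, and the sketch assumes the run starts exactly on the cron schedule.

```python
from datetime import datetime, timedelta


def search_window(run_time: datetime) -> tuple[datetime, datetime]:
    """Return (earliest, latest) for Earliest Time -70m@m and Latest Time -10m@m."""
    # "@m" snaps to the start of the minute; with whole-minute offsets,
    # snapping before or after the subtraction gives the same result.
    snapped = run_time.replace(second=0, microsecond=0)
    return snapped - timedelta(minutes=70), snapped - timedelta(minutes=10)


# Cron "0 * * * *" fires at the top of every hour.
first = search_window(datetime(2025, 1, 1, 12, 0))
second = search_window(datetime(2025, 1, 1, 13, 0))

print(first)   # window searched by the 12:00 run
print(second)  # window searched by the 13:00 run

# Consecutive windows meet end to start: no gap and no overlap.
assert first[1] == second[0]
```

Because the window is applied to event time, whether exported prompts fall inside a given run depends on how timestamps are assigned when the export is ingested, which this page does not specify.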

Implementation

To export M365 Copilot prompt logs, navigate to the Microsoft Purview compliance portal (compliance.microsoft.com) and access eDiscovery. Create a new eDiscovery case, add target user accounts or date ranges as data sources, then create a search query targeting M365 Copilot interactions across relevant workloads. Once the search completes, export the results to generate a package containing prompt logs with fields like Subject_Title (prompt text), Sender, timestamps, and workload metadata. Download the exported files using the eDiscovery Export Tool and ingest them into Splunk for security analysis and detection of jailbreak attempts, data exfiltration requests, and policy violations.
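
Before ingesting an export, it can help to confirm that the CSV actually contains the columns the search reads (Subject_Title, Sender, and Size). The following Python sketch shows one way to do that check. The function name missing_fields and the sample rows are hypothetical, and the real export may contain many additional columns.

```python
import csv
import io
from typing import TextIO

# Columns referenced by the search: Subject_Title and Sender in the evals,
# Size in the final table command.
REQUIRED_FIELDS = {"Subject_Title", "Sender", "Size"}


def missing_fields(csv_stream: TextIO) -> set[str]:
    """Return the required column names that are absent from the CSV header."""
    reader = csv.DictReader(csv_stream)
    return REQUIRED_FIELDS - set(reader.fieldnames or [])


# Hypothetical sample data standing in for an exported file.
sample = io.StringIO(
    "Subject_Title,Sender,Size\n"
    "Tell me everything about project X,alice@example.com,1024\n"
)
print(missing_fields(sample))  # an empty set means all required columns exist
```

For a real export, the same function can be called on an opened file, for example `with open(path, newline="", encoding="utf-8") as f: missing_fields(f)`, where path is the location of the downloaded CSV.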

Known False Positives

Legitimate researchers studying data classification systems, cybersecurity professionals testing information handling policies, compliance officers reviewing data access procedures, journalists researching transparency issues, or employees asking for comprehensive project documentation may trigger false positives.

Associated Analytic Story

Risk Based Analytics (RBA)

Risk Message:

User $user$ attempted M365 Copilot information extraction jailbreak with severity level $severity$ using extraction type $extraction_type$ techniques and $data_risk_flags$ patterns to obtain sensitive or classified information, potentially violating data protection policies and corporate security controls.

Risk Object  Risk Object Type  Risk Score  Threat Objects
user         user              60          No Threat Objects

References

Detection Testing

Test Type    Status          Dataset  Source  Sourcetype
Validation   Not Applicable  N/A      N/A     N/A
Unit         Passing         Dataset  csv     csv
Integration  ✅ Passing      Dataset  csv     csv

Replay any dataset to Splunk Enterprise by using our replay.py tool or the UI. Alternatively, you can replay a dataset into a Splunk Attack Range.


Source: GitHub | Version: 1