Detection: Detect DGA domains using pretrained model in DSDL

EXPERIMENTAL DETECTION

This detection status is set to experimental. The Splunk Threat Research team has not yet fully tested, simulated, or built comprehensive datasets for this detection. As such, this analytic is not officially supported. If you have any questions or concerns, please reach out to us at research@splunk.com.

Description

The following analytic identifies Domain Generation Algorithm (DGA) generated domains using a pre-trained deep learning model. It leverages the Network Resolution data model to analyze domain names and detect unusual character sequences indicative of DGA activity. This behavior is significant as adversaries often use DGAs to generate numerous domain names for command-and-control servers, making it harder to block malicious traffic. If confirmed malicious, this activity could enable attackers to maintain persistent communication with compromised systems, evade detection, and execute further malicious actions.

| tstats `security_content_summariesonly` values(DNS.answer) as IPs min(_time) as firstTime max(_time) as lastTime from datamodel=Network_Resolution by DNS.src, DNS.query
| `drop_dm_object_name(DNS)`
| rename query AS domain
| fields IPs, src, domain, firstTime, lastTime
| apply pretrained_dga_model_dsdl
| rename pred_dga_proba AS dga_score
| where dga_score>0.5
| `security_content_ctime(firstTime)`
| `security_content_ctime(lastTime)`
| table src, domain, IPs, firstTime, lastTime, dga_score
| `detect_dga_domains_using_pretrained_model_in_dsdl_filter`
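The scoring tail of the search (rename pred_dga_proba to dga_score, then keep rows where dga_score>0.5) can be illustrated outside Splunk. The following is a minimal Python sketch of that filtering logic only, not of the model itself. The function name filter_scored and the sample rows are hypothetical and made up for illustration.

```python
DGA_THRESHOLD = 0.5  # taken from `where dga_score>0.5` in the search


def filter_scored(rows):
    """Rename pred_dga_proba to dga_score and keep rows strictly above the threshold."""
    kept = []
    for row in rows:
        row = dict(row)  # copy so the caller's data is not modified
        row["dga_score"] = row.pop("pred_dga_proba")
        if row["dga_score"] > DGA_THRESHOLD:
            kept.append(row)
    return kept


# Hypothetical rows standing in for the output of `apply pretrained_dga_model_dsdl`.
sample_rows = [
    {"src": "10.0.0.5", "domain": "example.com", "pred_dga_proba": 0.02},
    {"src": "10.0.0.5", "domain": "xkqzjvhw.example", "pred_dga_proba": 0.97},
]

for hit in filter_scored(sample_rows):
    print(hit["src"], hit["domain"], hit["dga_score"])
```

Because the comparison is strict, a row scoring exactly 0.5 is not reported.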

Data Source

No data sources specified for this detection.

Macros Used

Name                                                      Value
security_content_ctime                                    convert timeformat="%Y-%m-%dT%H:%M:%S" ctime($field$)
detect_dga_domains_using_pretrained_model_in_dsdl_filter  search *
detect_dga_domains_using_pretrained_model_in_dsdl_filter is an empty macro by default. It allows the user to filter out any results (false positives) without editing the SPL.

Annotations

MITRE ATT&CK
ID         Technique                     Tactic
T1568.002  Domain Generation Algorithms  Command And Control

Kill Chain Phases: KillChainPhase.COMMAND_AND_CONTROL
NIST: NistCategory.DE_AE
CIS: Cis18Value.CIS_13
Threat Actors: APT41, TA551

Default Configuration

This detection is configured by default in Splunk Enterprise Security to run with the following settings:

Setting             Value
Disabled            true
Cron Schedule       0 * * * *
Earliest Time       -70m@m
Latest Time         -10m@m
Schedule Window     auto
Creates Risk Event  True
This configuration file applies to all detections of type anomaly. These detections will use Risk Based Alerting.
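With the cron schedule 0 * * * * and the time range -70m@m to -10m@m, each hourly run covers a 60-minute window that ends 10 minutes before the run. The following is a minimal Python sketch of that arithmetic, assuming the usual Splunk reading of the modifiers (subtract N minutes, then snap down to the start of the minute). The function name search_window and the example run time are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Tuple


def search_window(run_time: datetime) -> Tuple[datetime, datetime]:
    """Return (earliest, latest) for one run, mirroring -70m@m and -10m@m."""

    def minutes_ago_snapped(minutes: int) -> datetime:
        # "-Nm" subtracts N minutes; "@m" snaps down to the start of that minute.
        shifted = run_time - timedelta(minutes=minutes)
        return shifted.replace(second=0, microsecond=0)

    return minutes_ago_snapped(70), minutes_ago_snapped(10)


# Hypothetical run that starts 30 seconds after the top of the hour.
earliest, latest = search_window(datetime(2024, 1, 1, 12, 0, 30))
print(earliest, "to", latest)  # 2024-01-01 10:50:00 to 2024-01-01 11:50:00
```

Consecutive hourly runs therefore cover adjacent windows without gaps or overlap, and the 10-minute lag leaves time for late-arriving events to be indexed.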

Implementation

Steps to deploy the DGA detection model into the Splunk App for DSDL:

This detection depends on the Splunk App for Data Science and Deep Learning (DSDL), available at https://splunkbase.splunk.com/app/4607/, and on the Network Resolution data model, available at https://splunkbase.splunk.com/app/1621/. The detection uses a pre-trained deep learning model that must be deployed in the DSDL app. Follow the deployment steps at https://github.com/splunk/security_content/wiki/How-to-deploy-pre-trained-Deep-Learning-models-for-ESCU.

  • Download the artifacts .tar.gz file from https://seal.splunkresearch.com/pretrained_dga_model_dsdl.tar.gz
  • Download the pretrained_dga_model_dsdl.ipynb Jupyter notebook from https://github.com/splunk/security_content/notebooks
  • Log in to the Jupyter Lab for the pretrained_dga_model_dsdl container. This container should be listed on the Containers page of the DSDL app.
  • Perform the following steps inside Jupyter Lab:
  • Upload the pretrained_dga_model_dsdl.tar.gz file into the app/model/data path using the upload option in Jupyter Lab.
  • Untar the artifact pretrained_dga_model_dsdl.tar.gz using tar -xf app/model/data/pretrained_dga_model_dsdl.tar.gz -C app/model/data
  • Upload pretrained_dga_model_dsdl.ipynb into the Jupyter Lab notebooks folder using the upload option in Jupyter Lab.
  • Save the notebook using the save option in Jupyter Lab.
  • Upload pretrained_dga_model_dsdl.json into the notebooks/data folder.
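The untar step above can also be run from a notebook cell instead of a terminal. The following is a minimal sketch using Python's standard tarfile module, assuming the archive has already been uploaded to app/model/data as described. The function name extract_artifacts is hypothetical, and the archive should come from a trusted source because extraction writes whatever paths the archive contains.

```python
import tarfile
from pathlib import Path
from typing import List


def extract_artifacts(archive: Path, dest: Path) -> List[str]:
    """Extract a (possibly gzip-compressed) tar archive into dest and return member names."""
    dest.mkdir(parents=True, exist_ok=True)
    # "r:*" lets tarfile detect the compression, like `tar -xf` does.
    with tarfile.open(archive, "r:*") as tar:
        names = tar.getnames()
        tar.extractall(dest)
    return names


# Path taken from the steps above; skipped when the archive has not been uploaded yet.
archive = Path("app/model/data/pretrained_dga_model_dsdl.tar.gz")
if archive.exists():
    for name in extract_artifacts(archive, archive.parent):
        print(name)
```

This mirrors tar -xf app/model/data/pretrained_dga_model_dsdl.tar.gz -C app/model/data and prints the extracted member names so the result can be checked.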

Known False Positives

False positives may occur when a legitimate domain name resembles DGA-generated domains.

Associated Analytic Story

Risk Based Analytics (RBA)

Risk Message: A potential connection to a DGA domain $domain$ was detected from host $src$, kindly review.
Risk Score: 63
Impact: 70
Confidence: 90

The Risk Score is calculated using the following formula: Risk Score = (Impact * Confidence) / 100. The initial Confidence and Impact are set by the analytic author.
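As a quick check of the formula against the values listed for this detection (Impact 70, Confidence 90), here is a minimal Python sketch. The function name risk_score is hypothetical.

```python
def risk_score(impact: int, confidence: int) -> float:
    """Risk Score = (Impact * Confidence) / 100."""
    return impact * confidence / 100


print(risk_score(70, 90))  # 63.0, matching the Risk Score of 63 listed above
```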

References

Detection Testing

Test Type    Status          Dataset  Source  Sourcetype
Validation   Not Applicable  N/A      N/A     N/A
Unit         ❌ Failing      N/A      N/A     N/A
Integration  ❌ Failing      N/A      N/A     N/A

Replay any dataset to Splunk Enterprise by using our replay.py tool or the UI. Alternatively, you can replay a dataset into a Splunk Attack Range.


Source: GitHub | Version: 3