Kubernetes Anomalous Traffic on Network Edge

THIS IS AN EXPERIMENTAL DETECTION

This detection has been marked experimental by the Splunk Threat Research team. This means we have not been able to test, simulate, or build datasets for this detection. Use at your own risk. This analytic is NOT supported.

Description

This detection identifies network traffic volume anomalies between workloads in a microservices-hosted application, or between a workload and the outside world if the workload is shown as (unknown). It relies on Network Performance Monitoring metrics collected by an OTEL collector and pulled from Splunk Observability Cloud using the Splunk Infrastructure Monitoring Add-on (https://splunkbase.splunk.com/app/5247). The detection compares the tcp.bytes, tcp.new_sockets, tcp.packets, udp.bytes, and udp.packets metrics between workloads over the last 1 hour with the average of those metrics over the last 30 days in order to detect any anomalously high inbound or outbound network activity. Unexpected spikes in network traffic may signify unauthorized data transfers or abnormal behavior within the microservices ecosystem, such as data exfiltration or unauthorized lateral movement. If a bad actor is responsible for this traffic, they could compromise additional services or extract sensitive data, potentially leading to data breaches.

Type: Anomaly
Product: Splunk Enterprise, Splunk Enterprise Security, Splunk Cloud
Last Updated: 2024-01-10
Author: Matthew Moore, Splunk
ID: 886c7e51-2ea1-425d-8705-faaca5a64cc6

Annotations

ATT&CK

ID	Technique	Tactic
T1204	User Execution	Execution

Kill Chain Phase

Installation

NIST

DE.AE

CIS20

CIS 13

CVE

Search

| mstats avg(tcp.*) as tcp.* avg(udp.*) as udp.* where `kubernetes_metrics` AND earliest=-1h by k8s.cluster.name source.workload.name dest.workload.name span=10s 
| eval key='source.workload.name' + ":" + 'dest.workload.name' 
| join type=left key [ mstats avg(tcp.*) as avg_tcp.* avg(udp.*) as avg_udp.* stdev(tcp.*) as stdev_tcp.* stdev(udp.*) as stdev_udp.* where `kubernetes_metrics` AND earliest=-30d latest=-1h by source.workload.name dest.workload.name 
| eval key='source.workload.name' + ":" + 'dest.workload.name' ] 
| eval anomalies = "" 
| foreach stdev_* [ eval anomalies =if( '<<MATCHSTR>>' > ('avg_<<MATCHSTR>>' + 3 * 'stdev_<<MATCHSTR>>'), anomalies + "<<MATCHSTR>> higher than average by " + tostring(round(('<<MATCHSTR>>' - 'avg_<<MATCHSTR>>')/'stdev_<<MATCHSTR>>' ,2)) + " Standard Deviations. <<MATCHSTR>>=" + tostring('<<MATCHSTR>>') + " avg_<<MATCHSTR>>=" + tostring('avg_<<MATCHSTR>>') + " 'stdev_<<MATCHSTR>>'=" + tostring('stdev_<<MATCHSTR>>') + ", " , anomalies) ] 
| fillnull 
| eval anomalies = split(replace(anomalies, ",\s$$$$", "") ,", ") 
| where anomalies!="" 
| stats count(anomalies) as count values(anomalies) as anomalies by k8s.cluster.name source.workload.name dest.workload.name 
| rename service as k8s.service 
| where count > 5 
| rename k8s.cluster.name as host 
| `kubernetes_anomalous_traffic_on_network_edge_filter` 
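
The thresholding step in the search above can be illustrated outside Splunk. The following is a minimal Python sketch, not the detection itself: it flags a metric when its current value exceeds the baseline average plus three standard deviations, mirroring the foreach clause. The function name find_anomalies, the sigmas parameter, and the guard that skips metrics with zero or undefined standard deviation are assumptions made for this illustration and are not part of the SPL.

```python
from statistics import mean, stdev


def find_anomalies(current, baseline, sigmas=3.0):
    """Return one message per metric whose current value exceeds avg + sigmas * stdev.

    current:  dict of metric name -> latest value (e.g. a 10s average)
    baseline: dict of metric name -> list of historical values
    Both structures are hypothetical stand-ins for the mstats results.
    """
    messages = []
    for metric, value in current.items():
        history = baseline.get(metric, [])
        if len(history) < 2:
            continue  # sample standard deviation needs at least two points
        avg = mean(history)
        sd = stdev(history)
        if sd == 0:
            continue  # assumption: skip flat baselines to avoid dividing by zero
        if value > avg + sigmas * sd:
            z = round((value - avg) / sd, 2)
            messages.append(
                f"{metric} higher than average by {z} Standard Deviations. "
                f"{metric}={value} avg_{metric}={avg} stdev_{metric}={sd}"
            )
    return messages


baseline = {"tcp.bytes": [100, 110, 90, 105, 95]}
print(find_anomalies({"tcp.bytes": 200}, baseline))  # one message
print(find_anomalies({"tcp.bytes": 120}, baseline))  # empty list
```

In the search itself, the same comparison runs per workload pair and per 10-second span, and a pair is reported only when more than 5 anomalies are counted.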

Macros

The SPL above uses the following Macros:

kubernetes_metrics

kubernetes_anomalous_traffic_on_network_edge_filter is an empty macro by default. It allows the user to filter out any results (false positives) without editing the SPL.

Required fields

List of fields required to use this analytic.

tcp.*
udp.*
k8s.cluster.name
source.workload.name
dest.workload.name
udp.packets

How To Implement

To gather NPM metrics, deploy the OpenTelemetry Collector to the Kubernetes cluster and enable Network Performance Monitoring according to the instructions in Splunk Docs (https://docs.splunk.com/observability/en/infrastructure/network-explorer/network-explorer-setup.html#network-explorer-setup). To access those metrics from within Splunk Enterprise and ES, the Splunk Infrastructure Monitoring add-on must be installed and configured on a Splunk Search Head. Once installed, first configure the add-on with your O11y Cloud Org ID and Access Token. Lastly, set up the add-on to ingest metrics from O11y Cloud using the following settings, leaving all other settings at their defaults:

Name sim_npm_metrics_to_metrics_index
Org ID <Your O11y Cloud Org Id>
Signal Flow Program data('tcp.packets').publish(label='A'); data('tcp.bytes').publish(label='B'); data('tcp.new_sockets').publish(label='C'); data('udp.packets').publish(label='D'); data('udp.bytes').publish(label='E')
Metric Resolution 10000

Known False Positives

unknown

Associated Analytic Story

Abnormal Kubernetes Behavior using Splunk Infrastructure Monitoring

RBA

Risk Score	Impact	Confidence	Message
25.0	50	50	Kubernetes Anomalous Traffic on Network Edge in kubernetes cluster $host$

The Risk Score is calculated by the following formula: Risk Score = (Impact * Confidence / 100). Initial Confidence and Impact are set by the analytic author.
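
As a worked example of the formula, the values in the table above (Impact 50, Confidence 50) give a Risk Score of 25.0. A minimal Python sketch follows; the function name risk_score is hypothetical and used only for illustration.

```python
def risk_score(impact, confidence):
    """Risk Score = Impact * Confidence / 100, as stated in the RBA section."""
    return impact * confidence / 100


print(risk_score(50, 50))  # 25.0
```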

Reference

https://github.com/signalfx/splunk-otel-collector-chart

Test Dataset

Replay any dataset to Splunk Enterprise by using our replay.py tool or the UI. Alternatively, you can replay a dataset into a Splunk Attack Range.

source | version: 1
