Detection: Ollama Possible Memory Exhaustion Resource Abuse

EXPERIMENTAL DETECTION

This detection status is set to experimental. The Splunk Threat Research team has not yet fully tested, simulated, or built comprehensive datasets for this detection. As such, this analytic is not officially supported. If you have any questions or concerns, please reach out to us at research@splunk.com.

Updated Date: 2025-10-05
ID: ca96297f-e82e-4749-8cc9-d1ab555abb57
Author: Rod Soto
Type: Anomaly
Product: Splunk Enterprise Security

Description

Detects abnormal memory allocation patterns and excessive runner operations in Ollama that may indicate resource exhaustion attacks, memory abuse through malicious model loading, or attempts to degrade system performance by overwhelming GPU/CPU resources. Adversaries may deliberately load multiple large models, trigger repeated model initialization cycles, or exploit memory allocation mechanisms to exhaust available system resources, causing denial of service conditions or degrading performance for legitimate users.

Search

`ollama_server` ("*llama_kv_cache*" OR "*compute buffer*" OR "*llama runner started*" OR "*loaded runners*")
| rex field=_raw "count=(?<runner_count>\d+)"
| rex field=_raw "size\s*=\s*(?<memory_mb>[\d\.]+)\s+MiB"
| rex field=_raw "started in\s*(?<load_time>[\d\.]+)\s*seconds"
| rex field=_raw "source=(?<code_source>[^\s]+)"
| bin _time span=5m
| stats count as operations, sum(runner_count) as total_runners, dc(code_source) as unique_sources, values(code_source) as code_sources, avg(memory_mb) as avg_memory, max(memory_mb) as max_memory, sum(memory_mb) as total_memory, avg(load_time) as avg_load_time, max(load_time) as max_load_time by _time, host
| where operations > 5 OR total_runners > 0 OR max_memory > 400 OR total_memory > 500
| eval avg_memory=round(avg_memory, 2)
| eval max_memory=round(max_memory, 2)
| eval total_memory=round(total_memory, 2)
| eval avg_load_time=round(avg_load_time, 2)
| eval severity=case( max_memory > 500 OR total_memory > 1000, "critical", max_memory > 400 OR operations > 20, "high", operations > 10, "medium", 1=1, "low" )
| eval attack_type="Resource Exhaustion / Memory Abuse"
| sort -_time
| table _time, host, operations, total_runners, unique_sources, avg_memory, max_memory, total_memory, avg_load_time, max_load_time, severity, attack_type
| `ollama_possible_memory_exhaustion_resource_abuse_filter`
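The four rex extractions in the search can be checked outside Splunk before deploying the detection. The following is a minimal Python sketch, not part of the detection itself: it reuses the same regular expressions (rewritten with Python's `(?P<name>...)` group syntax) and applies them to a few sample lines. The sample lines, the `PATTERNS` table, and the `extract_fields` helper are hypothetical and only shaped to match the patterns; they are not captured from a real Ollama server.

```python
import re

# Same patterns as the four rex commands in the search, using Python's
# (?P<name>...) syntax for named capture groups.
PATTERNS = {
    "runner_count": re.compile(r"count=(?P<runner_count>\d+)"),
    "memory_mb": re.compile(r"size\s*=\s*(?P<memory_mb>[\d\.]+)\s+MiB"),
    "load_time": re.compile(r"started in\s*(?P<load_time>[\d\.]+)\s*seconds"),
    "code_source": re.compile(r"source=(?P<code_source>[^\s]+)"),
}


def extract_fields(raw):
    """Return the fields the search would extract from one raw event."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(raw)  # first match only, like rex
        if match:
            fields[name] = match.group(name)
    return fields


# ASSUMPTION: illustrative lines only, written to exercise the patterns.
SAMPLE_EVENTS = [
    'level=INFO source=sched.go:100 msg="loaded runners" count=2',
    "llama_kv_cache_init: KV buffer size = 512.00 MiB",
    'level=INFO source=server.go:200 msg="llama runner started in 1.53 seconds"',
]

for raw in SAMPLE_EVENTS:
    print(extract_fields(raw))
```

Each event typically populates only one or two of the fields, which is why the stats step aggregates them per 5-minute bucket and host. If the source value in your logs carries a line-number suffix, as in the sample lines here, `unique_sources` counts distinct file and line pairs rather than distinct files.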

Data Source

Name	Platform	Sourcetype	Source
Ollama Server	Other	`ollama:server`	`server.log`

Macros Used

Name	Value
ollama_server	`(sourcetype="ollama:server")`
ollama_possible_memory_exhaustion_resource_abuse_filter	`search *`

ollama_possible_memory_exhaustion_resource_abuse_filter is an empty macro by default. It allows the user to filter out any results (false positives) without editing the SPL.

Annotations

MITRE ATT&CK

ID	Technique	Tactic
T1499	Endpoint Denial of Service	Impact

Kill Chain Phases: Actions on Objectives

NIST: DE.AE

CIS: CIS 10

Default Configuration

This detection is configured by default in Splunk Enterprise Security to run with the following settings:

Setting	Value
Disabled	true
Cron Schedule	`0 * * * *`
Earliest Time	`-70m@m`
Latest Time	`-10m@m`
Schedule Window	`auto`
Creates Risk Event	True

This configuration file applies to all detections of type anomaly. These detections will use Risk Based Alerting.
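To make the scheduling settings concrete, the sketch below computes the time range one run would cover. It assumes the time modifiers are evaluated at the top of the hour given by the cron schedule, and it models `-70m@m` and `-10m@m` as "subtract the offset, then snap down to the minute". The `search_window` helper is hypothetical and only illustrates the arithmetic.

```python
from datetime import datetime, timedelta


def search_window(run_time):
    """Return (earliest, latest) for the modifiers -70m@m and -10m@m."""

    def relative(minutes):
        # Subtract the offset, then snap down to the start of the minute.
        shifted = run_time - timedelta(minutes=minutes)
        return shifted.replace(second=0, microsecond=0)

    return relative(70), relative(10)


# ASSUMPTION: an example run evaluated at 14:00:00.
run = datetime(2025, 10, 5, 14, 0, 0)
earliest, latest = search_window(run)
print("earliest:", earliest)
print("latest:  ", latest)
```

Under these assumptions each hourly run covers a 60-minute range that trails the run time by 10 minutes, so consecutive runs cover adjacent ranges without overlap, and the `bin _time span=5m` step in the search splits each range into twelve 5-minute buckets.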

Implementation

Ingest Ollama logs with the Splunk TA-ollama add-on by configuring file monitoring inputs that point to your Ollama server log directories (sourcetype: ollama:server), or enable the HTTP Event Collector (HEC) for real-time API telemetry and prompt analytics (sourcetypes: ollama:api, ollama:prompts). This detection searches only the ollama:server sourcetype, as defined by the ollama_server macro. The add-on provides CIM compatibility through the Web datamodel for standardized security detections.

Known False Positives

Legitimate high-volume production workloads processing multiple concurrent requests, users loading large language models (7B+ parameters) that naturally require substantial memory allocation, simultaneous multi-model deployments during system scaling, batch processing operations, or initial system startup sequences may generate similar memory allocation patterns during normal operations.
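The thresholds in the search show why such workloads trigger this detection. The sketch below mirrors the `where` clause and the `case()` severity expression in Python. The helper names and the input values are hypothetical. Fields with no matching events are modeled as `None` and treated as failing every comparison, which is an assumption about how null values behave in these expressions.

```python
def exceeds(value, threshold):
    """Comparison that treats a missing (None) field as not exceeding."""
    return value is not None and value > threshold


def passes_where(operations, total_runners, max_memory, total_memory):
    """Mirror of the `where` clause that keeps or drops a 5-minute bucket."""
    return any([
        exceeds(operations, 5),
        exceeds(total_runners, 0),
        exceeds(max_memory, 400),
        exceeds(total_memory, 500),
    ])


def severity(operations, max_memory, total_memory):
    """Mirror of the case() expression; the first matching branch wins."""
    if exceeds(max_memory, 500) or exceeds(total_memory, 1000):
        return "critical"
    if exceeds(max_memory, 400) or exceeds(operations, 20):
        return "high"
    if exceeds(operations, 10):
        return "medium"
    return "low"


# ASSUMPTION: hypothetical bucket containing one 512 MiB buffer allocation.
print(passes_where(1, None, 512.0, 512.0), severity(1, 512.0, 512.0))

# ASSUMPTION: hypothetical bucket containing one "loaded runners" line, count=1.
print(passes_where(1, 1, None, None), severity(1, None, None))
```

In this sketch a single buffer allocation above 500 MiB is enough to rate a bucket as critical, and any bucket with a nonzero runner count passes the `where` clause at low severity. Environments that routinely load large models will likely need to raise these thresholds or exclude known hosts through the filter macro.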

Associated Analytic Story

Suspicious Ollama Activities

Risk Based Analytics (RBA)

Risk Message:

Potential resource exhaustion attack detected on $host$ with $operations$ memory operations in 5 minutes, utilizing $max_memory$ MiB peak memory and $total_runners$ runners, indicating possible attempts to exhaust system resources through excessive model loading or memory abuse.

Risk Object	Risk Object Type	Risk Score	Threat Objects
host	system	10	No Threat Objects

References

https://github.com/rosplk/ta-ollama

Detection Testing

Test Type	Status	Dataset	Source	Sourcetype
Validation	Not Applicable	N/A	N/A	N/A
Unit	✅ Passing	Dataset	`server.log`	`ollama:server`
Integration	✅ Passing	Dataset	`server.log`	`ollama:server`

Replay any dataset to Splunk Enterprise by using our replay.py tool or the UI. Alternatively, you can replay a dataset into a Splunk Attack Range.

Source: GitHub | Version: 1