baselines/baseline_of_kubernetes_process_resource_ratio.yml

name: Baseline Of Kubernetes Process Resource Ratio
id: 427f81cf-ce6a-4a24-a73d-70c50171ea66
version: 2
date: '2024-09-24'
author: Matthew Moore, Splunk
type: Baseline
status: production
description: This baseline rule calculates the average and standard deviation of the
  ratio of various process resources in a Kubernetes environment. It uses metrics
  from the Kubernetes API and the Splunk Infrastructure Monitoring Add-on. The rule
  generates a lookup table with the average and standard deviation of the resource
  ratios for each process. This baseline can be used to detect anomalies in process
  resource utilization, which may indicate security threats such as resource exhaustion
  attacks, cryptojacking, or compromised process behavior.
search: "| mstats avg(process.*) as process.* where `kubernetes_metrics` by host.name
  k8s.cluster.name k8s.node.name process.executable.name span=10s | eval cpu:mem =
  'process.cpu.utilization'/'process.memory.utilization' | eval cpu:disk = 'process.cpu.utilization'/'process.disk.operations'
  | eval mem:disk = 'process.memory.utilization'/'process.memory.utilization' | eval
  cpu:threads = 'process.cpu.utilization'/'process.threads' | eval disk:threads =
  'process.disk.operations'/'process.threads' | eval key = 'k8s.cluster.name' + \"\
  :\" + 'host.name' + \":\" + 'process.executable.name' | fillnull | stats avg(cpu:mem)
  as avg_cpu:mem stdev(cpu:mem) as stdev_cpu:mem avg(cpu:disk) as avg_cpu:disk stdev(cpu:disk)
  as stdev_cpu:disk avg(mem:disk) as avg_mem:disk stdev(mem:disk) as stdev_mem:disk
  avg(cpu:threads) as avg_cpu:threads stdev(cpu:threads) as stdev_cpu:threads avg(disk:threads)
  as avg_disk:threads stdev(disk:threads) as stdev_disk:threads count latest(_time)
  as last_seen by key | outputlookup k8s_process_resource_ratio_baseline"
how_to_implement: "To implement this detection, follow these steps: 1. Deploy the
  OpenTelemetry Collector (OTEL) to your Kubernetes cluster. 2. Enable the hostmetrics/process
  receiver in the OTEL configuration. 3. Ensure that the process metrics, specifically
  Process.cpu.utilization and process.memory.utilization, are enabled. 4. Install
  the Splunk Infrastructure Monitoring (SIM) add-on.(ref: https://splunkbase.splunk.com/app/5247)
  5. Configure the SIM add-on with your Observability Cloud Organization ID and Access
  Token. 6. Set up the SIM modular input to ingest Process Metrics. Name this input
  \"sim_process_metrics_to_metrics_index\". 7. In the SIM configuration, set the Organization
  ID to your Observability Cloud Organization ID. 8. Set the Signal Flow Program to
  the following: data('process.threads').publish(label='A'); data('process.cpu.utilization').publish(label='B');
  data('process.cpu.time').publish(label='C'); data('process.disk.io').publish(label='D');
  data('process.memory.usage').publish(label='E'); data('process.memory.virtual').publish(label='F');
  data('process.memory.utilization').publish(label='G'); data('process.cpu.utilization').publish(label='H');
  data('process.disk.operations').publish(label='I'); data('process.handles').publish(label='J');
  data('process.threads').publish(label='K') 9. Set the Metric Resolution to 10000.
  10. Leave all other settings at their default values."
known_false_positives: none
references: []
tags:
  analytic_story:
  - Abnormal Kubernetes Behavior using Splunk Infrastructure Monitoring


  detections:
  - Kubernetes Process with Resource Ratio Anomalies
  product:
  - Splunk Enterprise
  - Splunk Enterprise Security
  - Splunk Cloud
  security_domain: network
deployment:
  scheduling:
    cron_schedule: 0 2 * * 0
    earliest_time: -30d@d
    latest_time: -1d@d
    schedule_window: auto