Kubernetes OTel Integration #30815

Merged
merged 9 commits into from
Aug 8, 2025
15 changes: 10 additions & 5 deletions config/_default/menus/main.en.yaml
@@ -840,31 +840,36 @@ menu:
identifier: otel_kafka_metrics
parent: otel_integrations
weight: 809
- name: Kubernetes Metrics
url: opentelemetry/integrations/kubernetes_metrics/
identifier: otel_kubernetes_metrics
parent: otel_integrations
weight: 810
- name: MySQL Metrics
url: opentelemetry/integrations/mysql_metrics/
identifier: otel_mysql_metrics
parent: otel_integrations
weight: 810
weight: 811
- name: NGINX Metrics
url: opentelemetry/integrations/nginx_metrics/
identifier: otel_nginx_metrics
parent: otel_integrations
weight: 811
weight: 812
- name: Podman Metrics
url: opentelemetry/integrations/podman_metrics/
identifier: otel_podman_metrics
parent: otel_integrations
weight: 812
weight: 813
- name: Runtime Metrics
url: opentelemetry/integrations/runtime_metrics/
identifier: otel_runtime_metrics
parent: otel_integrations
weight: 813
weight: 814
- name: Trace Metrics
url: opentelemetry/integrations/trace_metrics/
identifier: otel_trace_metrics
parent: otel_integrations
weight: 814
weight: 815
- name: Troubleshooting
url: opentelemetry/troubleshooting/
identifier: otel_troubleshooting
3 changes: 3 additions & 0 deletions content/en/opentelemetry/integrations/_index.md
@@ -48,6 +48,7 @@ Gain insights into your containerized environments and host systems:

- [Docker Metrics][5] - Monitor Docker container performance
- [Host Metrics][6] - Track system metrics such as CPU, disk, and memory usage
- [Kubernetes Metrics][18] - Monitor Kubernetes cluster health and performance
- [Podman Metrics][16] - Monitor Podman container performance

### Web servers and proxies
@@ -93,3 +94,5 @@ Monitor big data processing frameworks:
[15]: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/dockerstatsreceiver/metadata.yaml
[16]: /opentelemetry/integrations/podman_metrics/
[17]: /opentelemetry/integrations/datadog_extension/
[18]: /opentelemetry/integrations/kubernetes_metrics/

207 changes: 207 additions & 0 deletions content/en/opentelemetry/integrations/kubernetes_metrics.md
@@ -0,0 +1,207 @@
---
title: Kubernetes Metrics
further_reading:
- link: "/opentelemetry/setup/"
  tag: "Documentation"
  text: "Send OpenTelemetry Data to Datadog"
- link: "https://docs.datadoghq.com/getting_started/tagging/unified_service_tagging/"
  tag: "Documentation"
  text: "Unified Service Tagging"
- link: "https://github.com/DataDog/opentelemetry-examples/tree/main/guides/kubernetes"
  tag: "GitHub"
  text: "Example Collector Configurations"
---

<div class="alert alert-info">The OpenTelemetry Kubernetes integration is in Preview. To request access, contact your Datadog account team.</div>

## Overview

Collect Kubernetes metrics using the OpenTelemetry Collector to gain comprehensive insights into your cluster's health and performance. This integration uses a combination of OpenTelemetry receivers to gather data, which populates the [Kubernetes - Overview][1] dashboard.

{{< img src="/opentelemetry/collector_exporter/kubernetes_metrics.png" alt="The 'Kubernetes - Overview' dashboard, showing metrics for containers, including status and resource usage of your cluster and its containers." style="width:100%;" >}}

This integration requires the [`kube-state-metrics`][8] service, which generates detailed metrics about the state of Kubernetes objects such as deployments, nodes, and pods. It gathers data with two separate OpenTelemetry Collectors:
- A Cluster Collector, deployed as a Kubernetes Deployment, gathers cluster-wide metrics (for example, the total number of deployments).
- A Node Collector, deployed as a Kubernetes DaemonSet, runs on each node to collect node-specific metrics (for example, CPU and memory usage per node).

This approach ensures that cluster-level metrics are collected only once, preventing data duplication, while node-level metrics are gathered from every node in the cluster.

## Setup

To collect Kubernetes metrics with OpenTelemetry, deploy `kube-state-metrics` and configure both the Cluster Collector and the Node Collector in your cluster.

### Prerequisites

* **Helm**: The setup uses Helm to deploy resources. To install Helm, see the [official Helm documentation][2].
* **Collector Image**: This guide uses the `otel/opentelemetry-collector-contrib:0.130.0` image or newer.

Review comment (Contributor): In context, should a user usually know which version of the collector image they're using? If not, maybe we could add a link where they could check/update it if necessary?

### Installation

#### 1. Install kube-state-metrics

Run the following commands to add the `prometheus-community` Helm repository and install `kube-state-metrics`:
```sh
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install kube-state-metrics prometheus-community/kube-state-metrics
```
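
Before continuing, it can help to confirm that `kube-state-metrics` is running. The following is a minimal sketch, not part of the original guide. It assumes the release was installed into the current namespace with the release name `kube-state-metrics`, so that the chart creates a Deployment of the same name. The function name `check_kube_state_metrics` is illustrative only.

```sh
# Hypothetical helper: waits until the kube-state-metrics Deployment is ready.
# Assumes the Helm release name "kube-state-metrics" and the current namespace.
check_kube_state_metrics() {
  kubectl rollout status deployment/kube-state-metrics --timeout=120s
}
```

Run `check_kube_state_metrics` after the Helm install completes. A non-zero exit status suggests the Deployment did not become ready within the timeout.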

#### 2. Create a Datadog API Key Secret

Create a Kubernetes secret to store your Datadog API key securely.
```sh
export DD_API_KEY="<YOUR_DATADOG_API_KEY>"
kubectl create secret generic datadog-secret --from-literal=api-key="${DD_API_KEY}"
```
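
To check that the secret exists and contains the expected key without printing the API key itself, something like the following sketch may be useful. The function name `check_datadog_secret` is hypothetical; `kubectl describe secret` lists key names and byte sizes rather than values.

```sh
# Hypothetical helper: shows the keys stored in the secret and their sizes.
# The secret value is not printed; expect a key named "api-key" in the output.
check_datadog_secret() {
  kubectl describe secret datadog-secret
}
```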

#### 3. Install the OpenTelemetry Collectors

1. Add the OpenTelemetry Helm chart repository:
```sh
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo update
```

1. Download the configuration files for the two Collectors:
- [cluster-collector.yaml][3]
- [daemonset-collector.yaml][4]

1. Set your cluster name as an environment variable and use Helm to deploy both the Cluster and Node Collectors. Make sure the paths to the YAML files are correct.

```bash
# Set your cluster name
export K8S_CLUSTER_NAME="<YOUR_CLUSTER_NAME>"

# Install the Node Collector (DaemonSet)
helm install otel-daemon-collector open-telemetry/opentelemetry-collector \
  -f daemonset-collector.yaml \
  --set image.repository=otel/opentelemetry-collector-contrib \
  --set image.tag=0.130.0 \
  --set-string "config.processors.resource.attributes[0].key=k8s.cluster.name" \
  --set-string "config.processors.resource.attributes[0].value=${K8S_CLUSTER_NAME}"

# Install the Cluster Collector (Deployment)
helm install otel-cluster-collector open-telemetry/opentelemetry-collector \
  -f cluster-collector.yaml \
  --set image.repository=otel/opentelemetry-collector-contrib \
  --set image.tag=0.130.0 \
  --set-string "config.processors.resource.attributes[0].key=k8s.cluster.name" \
  --set-string "config.processors.resource.attributes[0].value=${K8S_CLUSTER_NAME}"
```
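
After both installs, a quick check of the two releases can catch misconfigured paths or image tags early. This is a sketch under stated assumptions: it uses the release names from the commands above and assumes the chart labels its pods with `app.kubernetes.io/instance=<release name>`, which is a common Helm convention but is not confirmed by this guide. The function name `check_collectors` is illustrative.

```sh
# Hypothetical helper: reports the status of both Collector releases and their pods.
# Assumes the release names used in the install commands above and that the
# chart sets the app.kubernetes.io/instance label to the release name.
check_collectors() {
  for release in otel-daemon-collector otel-cluster-collector; do
    helm status "$release"
    kubectl get pods -l "app.kubernetes.io/instance=${release}"
  done
}
```

If the label assumption holds, the Node Collector should show one pod per node and the Cluster Collector a single pod.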

## Metric metadata configuration

Some metrics require manual metadata updates in Datadog to ensure they are interpreted and displayed correctly.

To edit a metric's metadata:
1. Go to **[Metrics > Summary][6]**.
1. Select the metric you want to edit.
1. Click **Edit** in the side panel.
1. Edit the metadata as needed.
1. Click **Save**.

Repeat this process for each of the metrics listed in the following table:

| Metric Name | Metric Type | Unit |
|--------------------------|-------------|------------------------------------------|
| `k8s.pod.cpu.usage` | `Gauge` | `core` |
| `k8s.pod.network.io` | `Gauge` | `byte_in_binary_bytes_family per second` |
| `k8s.pod.network.errors` | `Gauge` | `byte_in_binary_bytes_family per second` |
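
The same edits could likely be scripted instead of made in the UI. The sketch below uses the Datadog "edit metric metadata" endpoint (`PUT /api/v1/metrics/{metric_name}`). Several details are assumptions rather than facts from this guide: the `DD_APP_KEY` variable name, the US1 site `api.datadoghq.com`, and the API unit names `byte` and `second` as equivalents of the UI label `byte_in_binary_bytes_family per second`. The function name `set_metric_metadata` is hypothetical.

```sh
# Hypothetical helper: sets a metric's type to gauge and updates its unit.
# Usage: set_metric_metadata <metric_name> <unit> [per_unit]
# Assumes DD_API_KEY and DD_APP_KEY are exported and the US1 site is used.
set_metric_metadata() {
  metric="$1"
  unit="$2"
  per_unit="${3:-}"
  # Include per_unit only when a third argument is given.
  if [ -n "$per_unit" ]; then
    body="{\"type\":\"gauge\",\"unit\":\"${unit}\",\"per_unit\":\"${per_unit}\"}"
  else
    body="{\"type\":\"gauge\",\"unit\":\"${unit}\"}"
  fi
  curl -sS -X PUT "https://api.datadoghq.com/api/v1/metrics/${metric}" \
    -H "Content-Type: application/json" \
    -H "DD-API-KEY: ${DD_API_KEY}" \
    -H "DD-APPLICATION-KEY: ${DD_APP_KEY}" \
    -d "$body"
}
```

For example, `set_metric_metadata k8s.pod.cpu.usage core` would cover the first row of the table, and `set_metric_metadata k8s.pod.network.io byte second` the second. Verify the result in **Metrics > Summary**, since the accepted unit names are an assumption here.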

## Correlating traces with infrastructure metrics

To correlate your APM traces with Kubernetes infrastructure metrics, Datadog uses [unified service tagging][7]. This requires setting three standard resource attributes on telemetry from both your application and your infrastructure. Datadog automatically maps these OpenTelemetry attributes to the standard Datadog tags (`env`, `service`, and `version`) used for correlation.

Review comment (Contributor): I clicked the unified service tagging link and it went to an OTel specific section, so I was wondering if the first instance of Datadog here should say OpenTelemetry?

Reply (Contributor Author): Understandable confusion here. Unified Service Tagging is a Datadog feature. Datadog uses this system to correlate data by mapping standard OpenTelemetry resource attributes to standard Datadog tags (env, service, version). I have a separate PR which should make the OTel nuance more clear (which I'll swap the link to when it's ready). But it is correct to say Datadog here.

Reply (Contributor): Ah, makes sense, thank you for sating my curiosity!

Reply (Contributor Author): Thanks for your review! Great feedback!


The required OpenTelemetry attributes are:

- `service.name`
- `service.version`
- `deployment.environment.name` (formerly `deployment.environment`)

This ensures that telemetry from your application is consistently tagged, allowing Datadog to link traces, metrics, and logs to the same service.

### Application configuration

Set the following environment variables in your application's container specification to tag outgoing telemetry:

```yaml
spec:
  containers:
    - name: my-container
      env:
        - name: OTEL_SERVICE_NAME
          value: "<SERVICE_NAME>"
        - name: OTEL_SERVICE_VERSION
          value: "<SERVICE_VERSION>"
        - name: OTEL_ENVIRONMENT
          value: "<ENVIRONMENT>"
        - name: OTEL_RESOURCE_ATTRIBUTES
          value: "service.name=$(OTEL_SERVICE_NAME),service.version=$(OTEL_SERVICE_VERSION),deployment.environment.name=$(OTEL_ENVIRONMENT)"
```
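
Kubernetes expands each `$(VAR)` reference using variables defined earlier in the same `env` list, so the order of the entries matters. The shell sketch below reproduces that substitution to show the string the application ends up with. The values `checkout`, `1.4.2`, and `staging` are placeholders chosen for illustration, and shell `${VAR}` syntax stands in for the Kubernetes `$(VAR)` syntax.

```sh
# Placeholder values; substitute your own service name, version, and environment.
OTEL_SERVICE_NAME="checkout"
OTEL_SERVICE_VERSION="1.4.2"
OTEL_ENVIRONMENT="staging"

# Shell ${VAR} substitution stands in for the Kubernetes $(VAR) expansion.
OTEL_RESOURCE_ATTRIBUTES="service.name=${OTEL_SERVICE_NAME},service.version=${OTEL_SERVICE_VERSION},deployment.environment.name=${OTEL_ENVIRONMENT}"

echo "$OTEL_RESOURCE_ATTRIBUTES"
# prints: service.name=checkout,service.version=1.4.2,deployment.environment.name=staging
```

The resulting comma-separated list carries the three attributes that Datadog maps to `service`, `version`, and `env`.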

### Infrastructure configuration

Add the corresponding annotations to your Kubernetes `Deployment` metadata. The `k8sattributes` processor in the Collector uses these annotations to enrich infrastructure metrics with service context.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  annotations:
    # Use resource.opentelemetry.io/ for the k8sattributes processor
    resource.opentelemetry.io/service.name: "<SERVICE_NAME>"
    resource.opentelemetry.io/service.version: "<SERVICE_VERSION>"
    resource.opentelemetry.io/deployment.environment.name: "<ENVIRONMENT>"
spec:
  template:
    metadata:
      annotations:
        resource.opentelemetry.io/service.name: "<SERVICE_NAME>"
        resource.opentelemetry.io/service.version: "<SERVICE_VERSION>"
        resource.opentelemetry.io/deployment.environment.name: "<ENVIRONMENT>"
# ... rest of the manifest
```
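
Once the manifest is applied, the pod template annotations can be inspected to confirm they were set as intended. This is a minimal sketch; the function name `show_pod_annotations` is hypothetical, and `my-app` is the example Deployment name from the manifest above.

```sh
# Hypothetical helper: prints the pod-template annotations of a Deployment.
# Usage: show_pod_annotations <deployment_name>
show_pod_annotations() {
  kubectl get deployment "$1" -o jsonpath='{.spec.template.metadata.annotations}'
}
```

Calling `show_pod_annotations my-app` should list the three `resource.opentelemetry.io/` keys if the annotations were applied to the pod template.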

## Data collected

This integration collects metrics using several OpenTelemetry receivers.

### kube-state-metrics (using Prometheus receiver)

Metrics scraped from the `kube-state-metrics` endpoint provide information about the state of Kubernetes API objects.

### Kubelet stats receiver

The `kubeletstatsreceiver` collects metrics from the Kubelet on each node, focusing on pod, container, and volume resource usage.

{{< mapping-table resource="kubeletstats.csv">}}

### Kubernetes cluster receiver

The `k8sclusterreceiver` collects cluster-level metrics, such as the status and count of nodes, pods, and other objects.

{{< mapping-table resource="k8scluster.csv">}}

### Host metrics receiver

The `hostmetricsreceiver` gathers system-level metrics from each node in the cluster.

{{< mapping-table resource="host.csv">}}

See [OpenTelemetry Metrics Mapping][5] for more information.

## Further reading

{{< partial name="whats-next/whats-next.html" >}}

[1]: https://app.datadoghq.com/dash/integration/86/kubernetes---overview
[2]: https://helm.sh/docs/intro/install/
[3]: https://github.com/DataDog/opentelemetry-examples/blob/main/guides/kubernetes/configuration/cluster-collector.yaml
[4]: https://github.com/DataDog/opentelemetry-examples/blob/main/guides/kubernetes/configuration/daemonset-collector.yaml
[5]: /opentelemetry/schema_semantics/metrics_mapping/
[6]: https://app.datadoghq.com/metric/summary
[7]: /getting_started/tagging/unified_service_tagging/?tab=kubernetes#opentelemetry
[8]: https://github.com/kubernetes/kube-state-metrics