kubernetes_events input seems to create high cpu usage #9787

Open
applike-ss opened this issue Jan 2, 2025 · 0 comments

Bug Report

Describe the bug
I have deployed fluent-bit via a Deployment whose only job is to gather kubernetes_events and output them somewhere.
From time to time, for periods ranging from a few minutes to multiple hours, the CPU usage of this fluent-bit goes to 1 (100% of one core).
The Deployment only has a request of >1, no limit set, and the node has plenty of spare CPU capacity (32-core system).
My other fluent-bit instances, which gather logs and send them to the same output, do not seem to have this issue.
There are no custom parsers in custom_parsers.conf.
I use the fluent-bit Helm chart with these values:

    kind: Deployment
    autoscaling:
      vpa:
        enabled: true
    config:
      hotReload:
        enabled: true
      inputs: |
        [Input]
            Name    kubernetes_events
            db      /var/sync/db
            kube_retention_time 15m
            Tag     k8s-events
      customParsers: ""
      filters: ""
      outputs: |
        [Output]
            Name    forward
            Match    k8s-events
            Retry_Limit    5
            Host    my-external-fluentd-hostname
            Port    15000
    extraVolumes:
      - name: sync
        persistentVolumeClaim:
          claimName: fluent-bit-k8s-events-sync
    extraVolumeMounts:
      - name: sync
        mountPath: /var/sync
    image:
      tag: 3.2.4
    rbac:
      create: true
      eventsAccess: true
    replicaCount: 1
    serviceMonitor:
      enabled: true
    updateStrategy:
      type: RollingUpdate
      rollingUpdate:
        maxUnavailable: 1
        maxSurge: 1

I can also see on the node that fluent-bit itself is causing the CPU usage, not the config watcher or the hot-reload mechanism:

    # ps aux | grep fluent
    root        3751  0.0  0.0 1226304 2164 ?        Ssl  09:50   0:00 /fluent-bit/bin/fluent-bit-watcher
    root        3778  0.1  0.1 125872 19508 ?        Sl   09:50   0:02 /fluent-bit/bin/fluent-bit --enable-hot-reload -c /fluent-bit/etc/fluent-bit.conf
    root       54668 99.6  0.5 295496 92072 ?        Ssl  10:14   5:25 /fluent-bit/bin/fluent-bit --workdir=/fluent-bit/etc --config=/fluent-bit/etc/conf/fluent-bit.conf
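
For anyone trying to reproduce this, here is a minimal sketch of how the busy process could be picked out of `ps aux` output automatically. The helper names (`parse_ps_aux`, `busy_processes`) and the 90% threshold are illustrative assumptions, not part of fluent-bit or the Helm chart; the sample text is the output shown above.

```python
def parse_ps_aux(text):
    """Parse `ps aux` output into dicts; lines that are not process rows are skipped."""
    rows = []
    for line in text.splitlines():
        # ps aux has 11 columns; COMMAND is last and may contain spaces.
        parts = line.split(None, 10)
        if len(parts) < 11 or not parts[1].isdigit():
            continue
        rows.append({
            "user": parts[0],
            "pid": int(parts[1]),
            "cpu": float(parts[2]),   # %CPU column
            "time": parts[9],         # accumulated CPU time
            "command": parts[10],
        })
    return rows


def busy_processes(text, name="fluent-bit", threshold=90.0):
    """Return rows whose command contains `name` and whose %CPU is at or above `threshold`."""
    return [r for r in parse_ps_aux(text)
            if name in r["command"] and r["cpu"] >= threshold]


# Sample taken from the output above (assumed to be representative).
SAMPLE = """\
# ps aux | grep fluent
root        3751  0.0  0.0 1226304 2164 ?        Ssl  09:50   0:00 /fluent-bit/bin/fluent-bit-watcher
root        3778  0.1  0.1 125872 19508 ?        Sl   09:50   0:02 /fluent-bit/bin/fluent-bit --enable-hot-reload -c /fluent-bit/etc/fluent-bit.conf
root       54668 99.6  0.5 295496 92072 ?        Ssl  10:14   5:25 /fluent-bit/bin/fluent-bit --workdir=/fluent-bit/etc --config=/fluent-bit/etc/conf/fluent-bit.conf
"""

for row in busy_processes(SAMPLE):
    print(f"pid {row['pid']} at {row['cpu']}% CPU: {row['command']}")
```

Note that the %CPU column of `ps aux` is an average over the lifetime of the process, so a freshly restarted process that spins from the start shows up quickly, while a long-running one that only recently started spinning may stay below the threshold for a while.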

To Reproduce

  • Run fluent-bit with the given chart values for a few days (make sure to create the PVC fluent-bit-k8s-events-sync first)
  • Observe the CPU usage

Expected behavior
CPU usage should correlate with the number of events produced.

Screenshots
Screenshot 2025-01-02 at 11 23 44

Your Environment

  • Version used: 3.2.4
  • Configuration: as shown above; manually create a PVC named fluent-bit-k8s-events-sync to hold the sync db
  • Environment name and version (e.g. Kubernetes? What version?): Kubernetes - AWS EKS v1.31.1-eks-1b3e656
  • Server type and version: AWS EC2 Instance
  • Operating System and version: Bottlerocket OS 1.29.0 (aws-k8s-1.31)
  • Filters and plugins: none

Additional context
It seems that fluent-bit is still processing events and writing them to the output, but I haven't checked whether they are complete.
I see this behavior across all our clusters, except those where the output runs inside the same cluster (in that case the output's hostname is an internal Kubernetes service).
