How we investigated high container memory with Fluent Bit filesystem buffering #11672
jmtt89 started this conversation in Show and tell
Hi,
I want to share a debugging session from a Kubernetes cluster because the result was surprising and ended up changing our initial hypothesis. It may be useful to others who run Fluent Bit with filesystem buffering and see very high container memory in `kubectl top`.

## TL;DR
We had a Fluent Bit DaemonSet where some pods showed more than `1 GiB` in `kubectl top`, but the Fluent Bit process RSS was small (tens of MiB). After digging into it, we found that:

- Almost all of that memory was kernel/slab, dominated by `dentry` and `xfs_inode` objects.
- The churn driving it came from short-lived chunk files in `/fluent-bit/storage/tail.0`.

In our rollout, this happened because the new collector was still tailing almost the whole node and then dropping most records later in the pipeline with a label-based filter.
So this did not look like a memory leak in Fluent Bit itself. In our case, it was mainly the result of how our current pipeline shape interacted with filesystem buffering.
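A quick way to see this split, before reaching for any tracing tools, is to compare the process RSS with the pod's cgroup accounting. This is a sketch: the cgroup path assumes cgroup v2 mounted at `/sys/fs/cgroup`, and `self` stands in for the real Fluent Bit PID.

```shell
# Process view: heap/RSS of the collector (swap "self" for the real PID).
pid=self
grep VmRSS "/proc/$pid/status"

# Cgroup v2 view of the same pod: total vs slab accounting.
for f in memory.current memory.stat; do
  if [ -r "/sys/fs/cgroup/$f" ]; then
    echo "== $f =="
    # memory.stat has per-category lines; memory.current is a single number.
    grep -E '^slab' "/sys/fs/cgroup/$f" 2>/dev/null || head -n 1 "/sys/fs/cgroup/$f"
  fi
done
```

If `VmRSS` is tiny while `memory.current` is huge and `slab_reclaimable` dominates `memory.stat`, the memory is kernel-side, not in the process heap.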
## Environment

- Container runtime: `containerd`
- Node filesystem: `xfs`
- `tail` input with `storage.type filesystem`

We were comparing two Fluent Bit DaemonSets on the same node: a legacy collector (`v1`) and a new collector (`v2`). Both were tailing the same node log tree, `/var/log/containers/*.log`.

## What looked surprising
On one hot node:
- `v2` collector: about `~1.8Gi`
- `v1` collector: about `~200Mi`

At first glance, this looked like a Fluent Bit memory problem.
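For context, the input shape involved in both collectors is roughly the following. This is a sketch, not our real config: the storage path is inferred from the chunk directory seen later, and the tag is an assumption.

```
[SERVICE]
    storage.path      /fluent-bit/storage

[INPUT]
    Name              tail
    Path              /var/log/containers/*.log
    Tag               kube.*
    storage.type      filesystem
```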
## What we measured

### 1. Process RSS was small
Inside the hot `v2` pod:

- `VmRSS`: about `22Mi`

So the process heap was clearly not where the `~1.8Gi` was going.

### 2. Cgroup memory was almost entirely kernel memory
From the hot `v2` pod cgroup:

- `memory.current`: about `1.98Gi`
- `kernel`: about `1.96Gi`
- `slab_reclaimable`: about `1.959Gi`

On the same node, the legacy collector looked like this:

- `memory.current`: about `222Mi`
- `kernel`: about `183Mi`
- `slab_reclaimable`: about `181Mi`

So both collectors showed the same kind of accounting pattern, but `v2` had roughly `10x` more of it.

### 3. Node-level slab was dominated by filesystem metadata
On hot nodes:
- `dentry`: around `~10M` objects
- `xfs_inode`: hundreds of thousands of objects

On a cool node:

- `dentry`: around `122k`

That pushed us away from a Fluent Bit heap issue and toward filesystem metadata churn.
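For anyone reproducing this, here is a sketch of how the two numbers above can be pulled out. The inline samples are illustrative stand-ins (real reads of `memory.stat` and `/proc/slabinfo` need access on the node, and `/proc/slabinfo` has extra header lines and columns that are simplified away here); the sample values mirror what we saw on the hot pod/node.

```shell
# Pod cgroup (v2): how much of memory.current is reclaimable slab?
# Sample value mirrors the hot pod (~1.959 Gi, expressed in bytes).
cat > /tmp/memory.stat.sample <<'EOF'
slab_reclaimable 2103460233
slab_unreclaimable 9437184
EOF
awk '$1 == "slab_reclaimable" { printf "slab_reclaimable: %.3f Gi\n", $2 / (1024^3) }' \
    /tmp/memory.stat.sample

# Node level: which slab caches hold the most objects?
# Simplified to "name object-count"; sample mirrors a hot node.
cat > /tmp/slabinfo.sample <<'EOF'
dentry 10000000
xfs_inode 430000
kmalloc-64 120000
EOF
sort -k2 -rn /tmp/slabinfo.sample | head -3
```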
### 4. It was not explained by open log files or inotify watchers
On the same hot node, comparing the real PIDs:
- `48` inotify watches
- `48` log file FDs

So the difference was not simply that `v2` was watching more files.

### 5. The key runtime difference was chunk file churn
We traced filesystem syscalls for both processes on the same node.
In a `30s` window:

- `v2`: `1678` ops on `/fluent-bit/storage/tail.0` (`839 openat`, `839 unlink`, `55.9 ops/s`)
- legacy collector: `539` ops on `/fluent-bit/storage/tail.0` (`270 openat`, `269 unlink`, `18.0 ops/s`)

The paths looked like this:
So the excess churn was specifically in chunk file creation and deletion inside the filesystem storage engine.
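The per-second rates above are just the raw counts over the trace window. The exact tracing tool is not shown in this writeup, so treat the collection step as an assumption (something like `strace -f -e trace=openat,unlink -c -p <pid>` would produce comparable counts); the arithmetic itself is:

```shell
# Ops/sec from the 30s trace window (counts from the measurements above).
awk 'BEGIN {
  printf "v2:     %.1f ops/s\n", 1678 / 30   # 839 openat + 839 unlink
  printf "legacy: %.1f ops/s\n",  539 / 30   # 270 openat + 269 unlink
}'
```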
### 6. The `.flb` files themselves looked normal

We inspected chunk files from both collectors:

- Both contained records tagged `kube.var.log.containers.*`
- There was no sign that `v2` was generating a different chunk format

The important difference was not the file format, but the rate: `v2` was creating many more short-lived chunk files.

## Why this is happening in our case
In our current rollout, `v2` still tails almost the whole node, but later drops most records with a `grep` filter based on a Kubernetes label. The current order is effectively:

1. `tail`
2. `kubernetes`
3. `grep` on label

So right now `v2` ingests far more than it keeps. That greatly amplifies chunk churn in `storage/tail.0`.

An important detail here is that this explains our current rollout state, but not necessarily the final steady state once all traffic is moved to `v2`.

## Relevant source code references
These files were useful when mapping behavior to runtime:
- `plugins/in_tail/tail.c`
- `src/flb_input_chunk.c`
- `lib/chunkio/src/cio_file.c`

Two points stood out:

- `input_chunk_append_raw()` applies filters before writing the final chunk content, and if the result is empty the chunk can be destroyed.
- The chunk I/O layer manages the chunk file lifecycle: `mmap`, open/down/up transitions, and deletion on close.
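As a toy illustration of that lifecycle (this is not Fluent Bit code; directory and file naming are made up), the create → write → unlink pattern looks like the loop below. Run at tens of ops per second, this is exactly the pattern that inflates `dentry` slab:

```shell
# Simulate the short-lived chunk pattern: create, write, unlink.
d=$(mktemp -d)
for i in $(seq 1 100); do
  f="$d/$i.flb"
  printf 'payload' > "$f"   # openat + write, like a freshly created chunk
  rm -f "$f"                # unlink, like an emptied chunk being destroyed
done
rmdir "$d"
echo "created and unlinked 100 short-lived files"
```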
That matches the runtime pattern we observed: short-lived `.flb` files, a high `openat`/`unlink` rate, and growing `dentry`/`xfs_inode` slab.

## Practical takeaway
Our main takeaway is that, with `storage.type filesystem`, high container memory in `kubectl top` may be dominated by kernel slab related to filesystem metadata churn, not by Fluent Bit process RSS.

In our case, the biggest amplifier was pipeline shape: the new collector was still tailing almost the whole node and only dropping records later, after chunk creation had already happened. That made the problem look like a Fluent Bit memory issue when it was really a consequence of filesystem buffering plus late filtering.
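To make the shape concrete, the problematic pipeline looks roughly like the sketch below. Names and the label key are assumptions, not our real config; the point is that everything the `grep` filter drops has already been written to a filesystem chunk by the `tail` input:

```
[INPUT]
    Name          tail
    Path          /var/log/containers/*.log
    Tag           kube.*
    storage.type  filesystem

[FILTER]
    Name          kubernetes
    Match         kube.*
    Labels        On

[FILTER]
    Name          grep
    Match         kube.*
    Regex         $kubernetes['labels']['log-collector'] v2
```

One mitigation we plan to test is narrowing what `tail` reads in the first place (e.g. tighter `Path` globs or `Exclude_Path` patterns), so unwanted records are dropped before a chunk is ever created rather than after.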
Our next step is to reduce how much data reaches the storage layer before it gets discarded, and then compare that with the final steady state once `v2` is the only collector on the node.

I'm sharing this in case it helps others debugging similar symptoms, and I'd be interested to hear from maintainers whether this matches the expected behavior of filesystem buffering on XFS/containerd, or whether there are recommended patterns to reduce this kind of chunk churn.