I'm running into an issue where fluent-bit deadlocks and stops processing logs when reading about 2 GB of log files via the tail input and sending them to Elasticsearch. The memory footprint steadily grows towards the limit set by Mem_Buf_Limit (which is 1 GB here). It then hovers around that limit for maybe a minute while still shipping logs, with the tail input constantly being paused and resumed due to hitting Mem_Buf_Limit, until it suddenly stops altogether.
The CPU utilization drops to 0, while the memory footprint stays at 1 GB.
I did some debugging that gave me some insights, but not enough to find the issue:
The majority of allocations at the time of the deadlock come from the tail input plugin
There are no retries in the Elasticsearch output plugin
cb_es_flush simply stops being called
The program seems to just wait on an event; there is no infinite loop or anything like that going on.
So I suspect that either the file contents read by the tail input are not properly released after the Elasticsearch output has finished sending them, or that there is some kind of bug where the data read by tail can't be handed over to the Elasticsearch output because no memory is left under Mem_Buf_Limit.
I am running on Windows. This issue seems to be very similar to #3148.
This is my configuration:
[SERVICE]
    flush 1
    daemon Off
    log_level info
    http_server Off
    http_listen 0.0.0.0
    http_port 2020
    storage.metrics on

[INPUT]
    name tail
    Path O:\log_folder\*.log
    Read_From_Head On
    Mem_Buf_Limit 1g

[OUTPUT]
    name es
    match *
    host 127.0.0.1
    index gb30-fluentd-test-index
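For what it's worth, a variant of the same setup that buffers chunks on the filesystem instead of purely in memory would look roughly like the snippet below. I have not verified whether this avoids the deadlock, and the storage.path is just a placeholder:

[SERVICE]
    flush 1
    log_level info
    storage.metrics on
    # Buffer chunks on disk instead of keeping everything in memory.
    storage.path O:\flb-storage\
    storage.sync normal
    storage.backlog.mem_limit 128M

[INPUT]
    name tail
    Path O:\log_folder\*.log
    Read_From_Head On
    # With filesystem buffering, Mem_Buf_Limit no longer governs this input;
    # back-pressure is controlled by the storage settings instead.
    storage.type filesystem

[OUTPUT]
    name es
    match *
    host 127.0.0.1
    index gb30-fluentd-test-index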
This is the log output with info severity right before it deadlocks:
(Note the write: No error messages. I couldn't find where this message is printed in the code. I suspect the No error part is a bug where the wrong error code is printed on Windows, as mentioned in #3146. So unfortunately I have no idea where this error happens or what it is.)
And this is the log output with debug severity right before it deadlocks (from a different run, obviously):
For reproduction: I am reading 5.7 million lines of NDJSON logs, totaling 2.4 GB across 267 files, into a local Elasticsearch (7.x) in its default configuration. The deadlock happens after processing approximately 5 million lines.
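To watch what the buffers are doing while reproducing, one option (not something I had enabled above) is to turn on the built-in HTTP monitoring server, roughly:

[SERVICE]
    http_server On
    http_listen 0.0.0.0
    http_port 2020
    storage.metrics on

With storage.metrics on, chunk counts should then be visible via the monitoring API on port 2020 (the /api/v1/storage endpoint).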