Log lost when task_id is occupied #9997
Comments
@edsiper @patrick-stephens @cosmo0920 it would be much appreciated if you could help check this issue.
Just an update: 2> As mentioned in #9966 (comment), question 2, the curl command shows that total_chunks is 2099 this time, after about 40h without output. As the total file size (data/tail.1/*) is only about 132M, I am still confused by the mismatch between the number of task_ids and the local chunk files. 3> Could we add a new configurable parameter, such as max_flb_task, in the service section? The default value would be 2048, as defined in https://github.com/fluent/fluent-bit/blob/v3.2.6/include/fluent-bit/flb_config.h#L291, and it could be changed as needed. The actual number of task_ids would then depend on storage.total_limit_size and max_flb_task, whichever is reached first. Any concerns or suggestions about this change?
Any update on this one?
It is being investigated, the issue will be updated once there is some info. The suspicion is as indicated above that there is a limit which is being hit before the configured total storage limit.
Thanks @patrick-stephens for the update, looking forward to any further solution.
dis a doozy
Bug Report
Describe the bug
We have Loki deployed as the output of FluentBit. Owing to maintenance of the K8S cluster, the Loki stack was down as well (for roughly 36 hours). After bringing Loki back online this morning, we found that some of the logs were lost.
Below is the configuration in our environment:
Please correct me if there is any misunderstanding:
We noticed that the task_id stopped at 2047 even though new incoming chunks were still being created. This appears to be by design: https://github.com/fluent/fluent-bit/blob/v3.2.6/include/fluent-bit/flb_config.h#L291. Each chunk is processed under its own task_id, so, taking the chunk size into consideration, the maximum buffered data should be 2048 * 2MB = 4096MB (4G) or even smaller, even though storage.total_limit_size is set to 5G or larger.
Similar issues:
#8503
#8395
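The arithmetic in the report can be written out as a small sketch. This only models the reporter's hypothesis, not Fluent Bit's actual code: the function name `effective_buffer_limit_mb` is hypothetical, the 2048 task limit comes from the linked `flb_config.h` line, and the 2MB chunk size is the reporter's estimate.

```python
# Hypothetical model of the buffering ceiling described in this report.
# Assumptions: one task_id per chunk, 2048 task slots, ~2MB per chunk.
MAX_TASKS = 2048       # task slot count, per the flb_config.h line linked above
CHUNK_SIZE_MB = 2      # approximate chunk size assumed in the report


def effective_buffer_limit_mb(total_limit_mb: int,
                              max_tasks: int = MAX_TASKS,
                              chunk_size_mb: int = CHUNK_SIZE_MB) -> int:
    """Return the smaller of the configured storage limit and the
    ceiling implied by the number of available task slots."""
    task_ceiling_mb = max_tasks * chunk_size_mb
    return min(total_limit_mb, task_ceiling_mb)


if __name__ == "__main__":
    # storage.total_limit_size = 5G, expressed in MB
    print(effective_buffer_limit_mb(5 * 1024))
```

Under these assumptions, a 5G storage.total_limit_size would never be reached, because the task slots run out first at 4096MB. With smaller chunks the ceiling would be lower still.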
To Reproduce
Expected behavior
FluentBit should buffer as much log data as the storage.total_limit_size setting allows while the output service is down.
Your Environment
Version used: 3.2.6
Environment name and version (e.g. Kubernetes? What version?): Linux
Server type and version: Linux
Operating System and version: RHEL 8.0