in_tail: Increase BYTES_TO_READ from 8KB to 64KB for better I/O performance on Windows#5278

Merged: kenhys merged 4 commits into fluent:master from Watson1978:in_tail on Mar 17, 2026
Conversation

@Watson1978 commented Mar 14, 2026

Contributor

Which issue(s) this PR fixes:
Fixes #

What this PR does / why we need it:
A larger read buffer reduces system call overhead during file reading, which speeds up in_tail on Windows.
In our benchmark, changing BYTES_TO_READ from 8 KB to 64 KB reduced the processing time by ~14% without increasing memory usage (since the iobuf is reused).
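
To illustrate why a larger chunk size lowers the number of read calls while memory stays flat, here is a minimal sketch. It is not fluentd's code: `count_reads` and the temporary file are hypothetical, and the sketch only assumes a `readpartial`-style loop that reuses one destination buffer.

```ruby
require "tempfile"

# Hypothetical helper (not part of fluentd): read a file in fixed-size chunks,
# reusing a single String as the destination buffer, and count the read calls.
def count_reads(path, bytes_to_read)
  iobuf = String.new(capacity: bytes_to_read) # one buffer, reused for every read
  reads = 0
  total = 0
  File.open(path, "rb") do |io|
    begin
      loop do
        io.readpartial(bytes_to_read, iobuf) # replaces the contents of iobuf
        reads += 1
        total += iobuf.bytesize
      end
    rescue EOFError
      # readpartial raises EOFError at end of file; nothing more to read
    end
  end
  [reads, total]
end

# Demo on a 1 MiB temporary file.
Tempfile.create("in_tail_sketch") do |f|
  f.binmode
  f.write("x" * (1024 * 1024))
  f.flush
  [8 * 1024, 64 * 1024].each do |size|
    reads, total = count_reads(f.path, size)
    puts "#{size / 1024} KB chunks: #{reads} reads for #{total} bytes"
  end
end
```

Because the same `iobuf` is passed to every call, the chunk size bounds the buffer rather than growing with the amount of data read.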

As the benchmark below shows, the performance improvement plateaus around 32 KB to 64 KB.
Increasing the buffer beyond 64 KB offers diminishing returns: 128 KB is only about 1 second faster than 64 KB.

I think 64 KB is a standard I/O chunk size that provides the best balance between speed and minimal memory footprint.

Out of curiosity, I looked into the historical background of why BYTES_TO_READ was set to 8192 (8KB).
According to #2418, it was originally chosen based on the BUFSIZ in stdio.h on Ubuntu and Debian at the time.

While this was a reasonable and solid choice back then, circumstances have changed over the years. Modern structured logs (like JSON) frequently exceed 8KB per line.

Increasing the buffer size to 64KB is a necessary update for modern environments.
It drastically reduces the number of system calls and reallocation overhead, saving significant CPU time on Windows, while still keeping memory usage well within acceptable limits.
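
As a rough illustration of how long lines interact with the chunk size, the sketch below splits a stream into lines and counts how often a read ends in the middle of a line, so that the partial line has to be carried over and extended. This is an assumed, simplified model, not in_tail's implementation; `split_lines` and the 20 KB line size are hypothetical.

```ruby
require "stringio"

# Hypothetical sketch: split a stream into lines using fixed-size reads and
# count how many reads leave an incomplete line behind (a "carry-over").
def split_lines(io, bytes_to_read)
  iobuf = String.new    # reused destination buffer for each read
  pending = String.new  # incomplete trailing line kept between reads
  lines = []
  carry_overs = 0
  while io.read(bytes_to_read, iobuf)
    pending << iobuf
    # Extract every complete line currently available.
    while (idx = pending.index("\n"))
      lines << pending.slice!(0, idx + 1)
    end
    carry_overs += 1 unless pending.empty?
  end
  lines << pending unless pending.empty? # final line without a newline
  [lines, carry_overs]
end

# Three lines of 20,000 bytes each, longer than an 8 KB chunk.
data = ("x" * 19_999 + "\n") * 3
[8 * 1024, 64 * 1024].each do |size|
  lines, carry_overs = split_lines(StringIO.new(data), size)
  puts "#{size / 1024} KB chunks: #{lines.size} lines, #{carry_overs} carry-overs"
end
```

With lines longer than the chunk size, every line needs several reads and the pending buffer grows repeatedly; a chunk larger than the typical line avoids most of that work.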

Benchmark

The following table shows the results of running rake benchmark:run:in_tail with a 10 GB dataset:

| BYTES_TO_READ | 8 KB | 16 KB | 32 KB | 64 KB | 128 KB |
|---|---|---|---|---|---|
| Processing Time [sec] | 65.94 | 57.80 | 56.39 | 56.56 | 55.56 |
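
The relative improvement over the 8 KB baseline can be derived from these numbers with a few lines of Ruby. The constant and method names below are only for illustration; the timings are the ones from the table.

```ruby
# Processing times in seconds from the benchmark table, keyed by buffer size in KB.
BENCHMARK_SECONDS = { 8 => 65.94, 16 => 57.80, 32 => 56.39, 64 => 56.56, 128 => 55.56 }.freeze

# Percentage reduction in processing time relative to the baseline.
def improvement_percent(baseline, value)
  ((baseline - value) / baseline * 100).round(1)
end

baseline = BENCHMARK_SECONDS[8]
BENCHMARK_SECONDS.each do |kb, sec|
  pct = improvement_percent(baseline, sec)
  puts format("%3d KB: %.2f s (%.1f%% less time than 8 KB)", kb, sec, pct)
end
```

For 64 KB this gives roughly 14%, and the step from 64 KB to 128 KB adds only about 1.5 percentage points.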

Soak test

The following graph shows the memory usage during a simple soak test using 50 GB of data:

[Chart: memory usage during the 50 GB soak test]

Environment

  • OS: Windows 11 Home (25H2) x86_64
  • CPU: Intel(R) Core(TM) Ultra 9 275HX (24) @ 3.07 GHz
  • Memory: 63.31 GiB
  • Ruby: ruby 4.0.1 (2026-01-13 revision e04267a14b) +PRISM [x64-mingw-ucrt]

Docs Changes:
N/A

Release Note:

  • in_tail: Improve I/O performance significantly on Windows by reducing system call overhead.

@Watson1978 changed the title from "in_tail: Increase BYTES_TO_READ from 8KB to 64KB for better I/O performance" to "in_tail: Increase BYTES_TO_READ from 8KB to 64KB for better I/O performance on Windows" Mar 14, 2026
@Watson1978 added this to the v1.20.0 milestone Mar 14, 2026
@Watson1978 requested a review from kenhys March 14, 2026 07:43
@Watson1978 marked this pull request as draft March 14, 2026 07:57
…rmance

Signed-off-by: Shizuo Fujita <fujita@clear-code.com>
Signed-off-by: Shizuo Fujita <fujita@clear-code.com>
Signed-off-by: Shizuo Fujita <fujita@clear-code.com>
@Watson1978 marked this pull request as ready for review March 16, 2026 06:04
Comment thread test/plugin/test_in_tail.rb Outdated
Signed-off-by: Shizuo Fujita <fujita@clear-code.com>
@kenhys merged commit 357c753 into fluent:master Mar 17, 2026
21 checks passed
@Watson1978 deleted the in_tail branch March 17, 2026 02:46
@Watson1978
Contributor Author

Thanks
