Skip to content

test_in_tail: fix flaky throttling test#5381

Merged
kenhys merged 1 commit into
fluent:masterfrom
Watson1978:test_in_tail
Jun 8, 2026
Merged

test_in_tail: fix flaky throttling test#5381
kenhys merged 1 commit into
fluent:masterfrom
Watson1978:test_in_tail

Conversation

@Watson1978

Copy link
Copy Markdown
Contributor

Which issue(s) this PR fixes:
Fixes #

What this PR does / why we need it:
The test "lines collected with throttling" was intermittently failing, especially on macOS CI runners.

4) Failure: test: lines collected with throttling(TailInputTest::throttling logs at in_tail level):
  elapsed_seconds 1.81999524999992 is out of allowed range:
    lower: 1.88 [sec]
    upper: 3.3200000000000003 [sec].
  <true> expected but was
  <false>
/Users/runner/work/fluentd/fluentd/test/plugin/test_in_tail.rb:2793:in 'block (3 levels) in <class:TailInputTest>'
     2790:           sleep(sleep_interval) while d.emit_count <= emit_count
     2791:           elapsed_seconds = Fluent::Clock.now - start_time
     2792:           if emit_count > 0
  => 2793:             assert_true(elapsed_seconds > lower_interval && elapsed_seconds < upper_interval,
     2794:                         "elapsed_seconds #{elapsed_seconds} is out of allowed range:\n" +
     2795:                         "  lower: #{lower_interval} [sec]\n" +
     2796:                         "  upper: #{upper_interval} [sec]")

The failure occurred because the test was measuring the test thread's observation delay rather than the actual interval between emit events.

Previously, start_time was recorded at the beginning of the whileloop. If the OS thread scheduling was delayed, the time spent waiting was excluded from the current cycle and bled into the next, causing elapsed_seconds to falsely fall below the lower_interval threshold.

By moving start_time = Fluent::Clock.now outside the loop and updating it immediately after confirming an emit, we accurately measure the duration between consecutive emits. This makes the test resilient to CI scheduling jitter.

Docs Changes:
N/A

Release Note:
N/A

The `test "lines collected with throttling"` was intermittently failing, especially on macOS CI runners.

```
4) Failure: test: lines collected with throttling(TailInputTest::throttling logs at in_tail level):
  elapsed_seconds 1.81999524999992 is out of allowed range:
    lower: 1.88 [sec]
    upper: 3.3200000000000003 [sec].
  <true> expected but was
  <false>
/Users/runner/work/fluentd/fluentd/test/plugin/test_in_tail.rb:2793:in 'block (3 levels) in <class:TailInputTest>'
     2790:           sleep(sleep_interval) while d.emit_count <= emit_count
     2791:           elapsed_seconds = Fluent::Clock.now - start_time
     2792:           if emit_count > 0
  => 2793:             assert_true(elapsed_seconds > lower_interval && elapsed_seconds < upper_interval,
     2794:                         "elapsed_seconds #{elapsed_seconds} is out of allowed range:\n" +
     2795:                         "  lower: #{lower_interval} [sec]\n" +
     2796:                         "  upper: #{upper_interval} [sec]")
```

The failure occurred because the test was measuring the test thread's observation delay rather than the actual interval between emit events.

Previously, `start_time` was recorded at the beginning of the `while`loop.
If the OS thread scheduling was delayed, the time spent waiting was excluded from the current cycle and bled into the next, causing `elapsed_seconds` to falsely fall below the `lower_interval` threshold.

By moving `start_time = Fluent::Clock.now` outside the loop and updating it immediately after confirming an emit, we accurately measure the duration between consecutive emits. This makes the test resilient to CI scheduling jitter.

Signed-off-by: Shizuo Fujita <fujita@clear-code.com>
@Watson1978 Watson1978 added CI Test/CI issues backport to v1.19 We will backport this fix to the LTS branch labels Jun 6, 2026
@Watson1978 Watson1978 requested a review from kenhys June 6, 2026 06:27
@Watson1978 Watson1978 added this to the v1.20.0 milestone Jun 6, 2026

@kenhys kenhys left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@kenhys kenhys merged commit 3e53f56 into fluent:master Jun 8, 2026
36 of 37 checks passed
@Watson1978 Watson1978 deleted the test_in_tail branch June 8, 2026 04:33
Watson1978 added a commit that referenced this pull request Jun 15, 2026
**Which issue(s) this PR fixes**: 
Backport #5381
Fixes #

**What this PR does / why we need it**: 
The `test "lines collected with throttling"` was intermittently failing,
especially on macOS CI runners.

```
4) Failure: test: lines collected with throttling(TailInputTest::throttling logs at in_tail level):
  elapsed_seconds 1.81999524999992 is out of allowed range:
    lower: 1.88 [sec]
    upper: 3.3200000000000003 [sec].
  <true> expected but was
  <false>
/Users/runner/work/fluentd/fluentd/test/plugin/test_in_tail.rb:2793:in 'block (3 levels) in <class:TailInputTest>'
     2790:           sleep(sleep_interval) while d.emit_count <= emit_count
     2791:           elapsed_seconds = Fluent::Clock.now - start_time
     2792:           if emit_count > 0
  => 2793:             assert_true(elapsed_seconds > lower_interval && elapsed_seconds < upper_interval,
     2794:                         "elapsed_seconds #{elapsed_seconds} is out of allowed range:\n" +
     2795:                         "  lower: #{lower_interval} [sec]\n" +
     2796:                         "  upper: #{upper_interval} [sec]")
```

The failure occurred because the test was measuring the test thread's
observation delay rather than the actual interval between emit events.

Previously, `start_time` was recorded at the beginning of the
`while`loop. If the OS thread scheduling was delayed, the time spent
waiting was excluded from the current cycle and bled into the next,
causing `elapsed_seconds` to falsely fall below the `lower_interval`
threshold.

By moving `start_time = Fluent::Clock.now` outside the loop and updating
it immediately after confirming an emit, we accurately measure the
duration between consecutive emits. This makes the test resilient to CI
scheduling jitter.

**Docs Changes**:
N/A

**Release Note**: 
N/A

Signed-off-by: Shizuo Fujita <fujita@clear-code.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Shizuo Fujita <fujita@clear-code.com>
@Watson1978 Watson1978 added the backported "backport to LTS" is done label Jun 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport to v1.19 We will backport this fix to the LTS branch backported "backport to LTS" is done CI Test/CI issues

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants