@see-quick
Member

@see-quick see-quick commented Nov 11, 2025

Type of change

  • Enhancement / new feature

Description

This PR adds a performance report; see [1] for an example (ignore the numbers there; only the format matters). The report would always be appended to the PR as a comment when performance tests are triggered.

Currently, I am using only one agent (i.e., ubuntu-latest) for testing purposes on my fork, but the plan is to use both x64 and arm-based agents.

[1] - see-quick#15 (comment)

Checklist

  • Write tests
  • Make sure all tests pass
  • Update documentation

@see-quick see-quick added this to the 0.50.0 milestone Nov 11, 2025
@see-quick see-quick self-assigned this Nov 11, 2025
@see-quick see-quick requested review from a team and Frawless November 11, 2025 12:15
@codecov

codecov bot commented Nov 11, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.78%. Comparing base (4503119) to head (42a8dd1).
⚠️ Report is 19 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #12124      +/-   ##
============================================
- Coverage     74.81%   74.78%   -0.04%     
- Complexity     6619     6624       +5     
============================================
  Files           377      377              
  Lines         25329    25349      +20     
  Branches       3394     3398       +4     
============================================
+ Hits          18951    18957       +6     
- Misses         4991     5007      +16     
+ Partials       1387     1385       -2     

see 12 files with indirect coverage changes


${{ steps.generate_report.outputs.summary }}
- name: Add performance report to job summary
Member

This is cool. Maybe it could also be useful for the common STs execution?
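For context on the step under review: a job summary in GitHub Actions is written by appending Markdown to the file named by the `GITHUB_STEP_SUMMARY` environment variable. The following is only a minimal sketch of how a Node script could do that; `appendToJobSummary` is a hypothetical helper and is not taken from this PR.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

/**
 * Append a Markdown report to the GitHub Actions job summary.
 * Hypothetical helper: returns false when GITHUB_STEP_SUMMARY is not set
 * (for example during a local run), true after the report was appended.
 */
function appendToJobSummary(markdown, env = process.env) {
  const summaryFile = env.GITHUB_STEP_SUMMARY;
  if (!summaryFile) return false;
  fs.appendFileSync(summaryFile, `${markdown}\n`);
  return true;
}

// Usage: simulate the runner by pointing the variable at a temporary file.
const summaryPath = path.join(
  fs.mkdtempSync(path.join(os.tmpdir(), 'summary-')),
  'summary.md'
);
const written = appendToJobSummary('## Performance Test Results', {
  GITHUB_STEP_SUMMARY: summaryPath,
});
console.log(written, fs.readFileSync(summaryPath, 'utf8'));
```

Because the helper only needs a Markdown string, the same approach could in principle be reused by other pipelines that produce a report.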

@see-quick
Member Author

/gha run pipeline=performance

@github-actions

github-actions bot commented Nov 18, 2025

⏳ System test verification started: link

The following 2 job(s) will be executed:

  • performance-amd64 (oracle-vm-8cpu-32gb-x86-64)
  • performance-arm64 (oracle-vm-8cpu-32gb-arm64)

Tests will start after successful build completion.

@github-actions

❌ System test verification failed: link

@see-quick
Member Author

/gha run pipeline=performance

@github-actions

github-actions bot commented Nov 19, 2025

⏳ System test verification started: link

The following 2 job(s) will be executed:

  • performance-amd64 (oracle-vm-8cpu-32gb-x86-64)
  • performance-arm64 (oracle-vm-8cpu-32gb-arm64)

Tests will start after successful build completion.

@github-actions

❌ System test verification failed: link

@see-quick
Member Author

/gha run pipeline=performance

@github-actions

github-actions bot commented Nov 20, 2025

⏳ System test verification started: link

The following 2 job(s) will be executed:

  • performance-amd64 (oracle-vm-8cpu-32gb-x86-64)
  • performance-arm64 (oracle-vm-8cpu-32gb-arm64)

Tests will start after successful build completion.

@github-actions

🎉 System test verification passed: link

Signed-off-by: see-quick <[email protected]>
@see-quick
Member Author

/gha run pipeline=performance

@github-actions

github-actions bot commented Nov 21, 2025

⏳ System test verification started: link

The following 2 job(s) will be executed:

  • performance-amd64 (oracle-vm-8cpu-32gb-x86-64)
  • performance-arm64 (oracle-vm-8cpu-32gb-arm64)

Tests will start after successful build completion.

@github-actions

🎉 System test verification passed: link

@see-quick
Member Author

/gha run pipeline=performance

@github-actions

github-actions bot commented Nov 24, 2025

⏳ System test verification started: link

The following 2 job(s) will be executed:

  • performance-amd64 (oracle-vm-8cpu-32gb-x86-64)
  • performance-arm64 (oracle-vm-8cpu-32gb-arm64)

Tests will start after successful build completion.

@github-actions

🎉 System test verification passed: link

@see-quick see-quick requested a review from Frawless November 24, 2025 13:41
/**
* Find the latest timestamped results directory
*/
function findLatestResultsDir(baseDir) {
Member

Are there multiple results from one run, so that you need to find the latest?

Member Author

Also, this was mainly for local testing where I had multiple directories and runs:

├── 2025-11-20-15-57-41
│   └── user-operator
│       └── latencyUseCase
│           ├── users-1000-tp--cache--bq--bb-100-bt-100-utp-
│           ├── users-2000-tp--cache--bq--bb-100-bt-100-utp-
│           └── users-3000-tp--cache--bq--bb-100-bt-100-utp-
├── 2025-11-20-16-10-15
│   └── user-operator
│       └── latencyUseCase
│           ├── users-1000-tp--cache--bq--bb-100-bt-100-utp-
│           ├── users-2000-tp--cache--bq--bb-100-bt-100-utp-
│           └── users-3000-tp--cache--bq--bb-100-bt-100-utp-
├── 2025-11-20-16-35-19
│   └── user-operator
│       └── latencyUseCase
│           ├── users-1000-tp--cache--bq--bb-100-bt-100-utp-
│           ├── users-2000-tp--cache--bq--bb-100-bt-100-utp-
│           └── users-3000-tp--cache--bq--bb-100-bt-100-utp-
├── 2025-11-20-17-02-58
│   └── user-operator
│       └── latencyUseCase
│           ├── users-1000-tp--cache--bq--bb-100-bt-100-utp-
│           ├── users-2000-tp--cache--bq--bb-100-bt-100-utp-
│           └── users-3000-tp--cache--bq--bb-100-bt-100-utp-
├── 2025-11-20-17-15-26
│   └── user-operator
│       └── latencyUseCase
│           ├── users-1000-tp--cache--bq--bb-100-bt-100-utp-
│           ├── users-2000-tp--cache--bq--bb-100-bt-100-utp-
│           └── users-3000-tp--cache--bq--bb-100-bt-100-utp-
├── 2025-11-21-09-50-34
│   ├── topic-operator
│   │   └── scalabilityUseCase
│   │       ├── max-batch-size-100-max-linger-time-100-with-clients-false-number-of-topics-2
│   │       └── max-batch-size-100-max-linger-time-100-with-clients-false-number-of-topics-3
│   └── user-operator
│       ├── latencyUseCase
│       │   ├── users-10-tp--cache--bq--bb-100-bt-100-utp-
│       │   ├── users-20-tp--cache--bq--bb-100-bt-100-utp-
│       │   └── users-30-tp--cache--bq--bb-100-bt-100-utp-
│       └── scalabilityUseCase
│           ├── users-10-tp--cache--bq--bb-100-bt-100-utp-
│           ├── users-12-tp--cache--bq--bb-100-bt-100-utp-
│           ├── users-14-tp--cache--bq--bb-100-bt-100-utp-
│           └── users-16-tp--cache--bq--bb-100-bt-100-utp-
└── 2025-11-24-12-16-22
    └── user-operator
        └── latencyUseCase
            ├── users-1000-tp--cache--bq--bb-100-bt-100-utp-
            ├── users-2000-tp--cache--bq--bb-100-bt-100-utp-
            └── users-3000-tp--cache--bq--bb-100-bt-100-utp-

So it will always pick the latest. I think I can rename that method (to something like `findTimestampedResultsDir`)? But in general, per architecture there should be only ONE timestamp. If more architectures are run, then we handle it differently (we don't care about that here).
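The body of `findLatestResultsDir` is not shown in this thread, so the following is only a sketch of one way to pick the latest directory from a layout like the one above. The name pattern, the string sort, and returning `null` when nothing matches are all assumptions.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Assumed directory name format, e.g. 2025-11-20-15-57-41.
const TIMESTAMP_DIR = /^\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2}$/;

/**
 * Find the latest timestamped results directory.
 * Returns its full path, or null when no matching directory exists.
 */
function findLatestResultsDir(baseDir) {
  const candidates = fs
    .readdirSync(baseDir, { withFileTypes: true })
    .filter((entry) => entry.isDirectory() && TIMESTAMP_DIR.test(entry.name))
    .map((entry) => entry.name)
    // Zero-padded timestamps sort chronologically as plain strings.
    .sort();
  return candidates.length > 0
    ? path.join(baseDir, candidates[candidates.length - 1])
    : null;
}

// Usage: build a throwaway layout and pick the newest run.
const baseDir = fs.mkdtempSync(path.join(os.tmpdir(), 'perf-results-'));
for (const name of [
  '2025-11-20-15-57-41',
  '2025-11-24-12-16-22',
  '2025-11-21-09-50-34',
  'notes', // ignored: does not match the timestamp pattern
]) {
  fs.mkdirSync(path.join(baseDir, name));
}
const latestDir = findLatestResultsDir(baseDir);
console.log(path.basename(latestDir)); // prints 2025-11-24-12-16-22
```

With a single timestamped directory per architecture, as expected in CI, the function simply returns that directory.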

@see-quick
Member Author

/gha run pipeline=performance

@github-actions

github-actions bot commented Dec 1, 2025

⏳ System test verification started: link

The following 2 job(s) will be executed:

  • performance-amd64 (oracle-vm-8cpu-32gb-x86-64)
  • performance-arm64 (oracle-vm-8cpu-32gb-arm64)

Tests will start after successful build completion.

@github-actions

github-actions bot commented Dec 1, 2025

🎉 System test verification passed: link

Member

@Frawless Frawless left a comment

The changes are fine from my POV. I am not sure about 450 lines of JS to generate the report, especially when we already have something similar in the tests. AFAIU it is not trivial to reuse it. I wonder what others think about it. Otherwise I am fine with it as long as it will be used :)

@see-quick see-quick requested review from a team and im-konge December 2, 2025 13:55
@see-quick
Member Author

Basically, there are two approaches. The first one (where we would have +500 LOC to merge two or more results from different architectures) is:

Performance Test Results

Test Run: 2025-11-12 20:47

Topic Operator

Use Case: scalabilityUseCase

Configuration:

  • MAX QUEUE SIZE: 2147483647
  • MAX BATCH SIZE (ms): 100
  • MAX BATCH LINGER (ms): 100
  • PROCESS TYPE: TOPIC-CONCURRENT

Results:

| # | NUMBER OF TOPICS | NUMBER OF EVENTS | Reconciliation interval (ms) [AMD64] | Reconciliation interval (ms) [ARM64] |
| --- | --- | --- | --- | --- |
| 1 | 2 | 8 | 10229 | 10167 |
| 2 | 32 | 98 | 11505 | 10504 |
| 3 | 125 | 375 | 42367 | 41202 |
| 4 | 250 | 750 | 74596 | 72361 |

User Operator

Use Case: scalabilityUseCase

Configuration:

  • WORK_QUEUE_SIZE: 1024
  • BATCH_MAXIMUM_BLOCK_SIZE: 100
  • BATCH_MAXIMUM_BLOCK_TIME_MS: 100

Results:

| # | NUMBER OF KAFKA USERS | Reconciliation interval (ms) [AMD64] | Reconciliation interval (ms) [ARM64] |
| --- | --- | --- | --- |
| 1 | 10 | 10472 | 10797 |
| 2 | 100 | 33036 | 33851 |
| 3 | 200 | 54940 | 55822 |
| 4 | 500 | 133782 | 135474 |

Use Case: latencyUseCase

Configuration:

  • WORK_QUEUE_SIZE: 2048
  • BATCH_MAXIMUM_BLOCK_SIZE: 100
  • BATCH_MAXIMUM_BLOCK_TIME_MS: 100

Results:

| # | NUMBER OF KAFKA USERS | Min Latency (ms) [AMD64] | Min Latency (ms) [ARM64] | Max Latency (ms) [AMD64] | Max Latency (ms) [ARM64] | Average Latency (ms) [AMD64] | Average Latency (ms) [ARM64] | P50 Latency (ms) [AMD64] | P50 Latency (ms) [ARM64] | P95 Latency (ms) [AMD64] | P95 Latency (ms) [ARM64] | P99 Latency (ms) [AMD64] | P99 Latency (ms) [ARM64] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 110 | 12 | 14 | 69 | 103 | 27.78 | 25.03 | 26 | 22 | 39 | 45 | 54 | 79 |
| 2 | 200 | 11 | 15 | 75 | 66 | 29.93 | 27.13 | 28 | 25 | 48 | 45 | 75 | 59 |
| 3 | 300 | 10 | 12 | 61 | 98 | 26.0 | 25.53 | 26 | 23 | 41 | 41 | 50 | 89 |
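To illustrate what the merge step in this first approach involves, here is a minimal sketch that combines per-architecture rows into one side-by-side Markdown table. It is not the PR's script: `mergeResultsTable`, its input shape, and the `n/a` placeholder are hypothetical, and the sample values are the first two rows of the User Operator scalability table above.

```javascript
/**
 * Merge per-architecture result rows into one Markdown table.
 * `resultsByArch` maps an architecture label to an array of row objects.
 * Rows are matched by `key`; every other column is emitted once per
 * architecture. Hypothetical helper, not the PR's implementation.
 */
function mergeResultsTable(resultsByArch, key) {
  const archs = Object.keys(resultsByArch);
  const metrics = Object.keys(resultsByArch[archs[0]][0]).filter(
    (column) => column !== key
  );

  const header = ['#', key];
  for (const metric of metrics) {
    for (const arch of archs) header.push(`${metric} [${arch}]`);
  }

  // Index rows by key so architectures do not need the same row order.
  const byKey = new Map();
  for (const arch of archs) {
    for (const row of resultsByArch[arch]) {
      if (!byKey.has(row[key])) byKey.set(row[key], {});
      byKey.get(row[key])[arch] = row;
    }
  }

  const lines = [
    `| ${header.join(' | ')} |`,
    `| ${header.map(() => '---').join(' | ')} |`,
  ];
  let index = 1;
  for (const [keyValue, perArch] of byKey) {
    const cells = [index++, keyValue];
    for (const metric of metrics) {
      for (const arch of archs) {
        // A row missing for one architecture is rendered as "n/a".
        cells.push(perArch[arch] ? perArch[arch][metric] : 'n/a');
      }
    }
    lines.push(`| ${cells.join(' | ')} |`);
  }
  return lines.join('\n');
}

const table = mergeResultsTable(
  {
    AMD64: [
      { 'NUMBER OF KAFKA USERS': 10, 'Reconciliation interval (ms)': 10472 },
      { 'NUMBER OF KAFKA USERS': 100, 'Reconciliation interval (ms)': 33036 },
    ],
    ARM64: [
      { 'NUMBER OF KAFKA USERS': 10, 'Reconciliation interval (ms)': 10797 },
      { 'NUMBER OF KAFKA USERS': 100, 'Reconciliation interval (ms)': 33851 },
    ],
  },
  'NUMBER OF KAFKA USERS'
);
console.log(table);
```

Matching rows by key rather than by position keeps the table correct even when the architectures report their rows in a different order.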

@see-quick
Member Author

see-quick commented Dec 2, 2025

Or the second one (with no need for the +450 LOC of JavaScript), but at the price of a lot of redundancy from my POV:

Performance Test Results

AMD64

Test Run: 2025-11-18 14:43

Topic Operator

Use Case: latencyUseCase

Configuration:

  • WORK_QUEUE_SIZE: 2048
  • BATCH_MAXIMUM_BLOCK_SIZE: 100
  • BATCH_MAXIMUM_BLOCK_TIME_MS: 100

Results:

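| # | NUMBER OF KAFKA USERS | Min Latency (ms) | Max Latency (ms) | Average Latency (ms) | P50 Latency (ms) | P95 Latency (ms) | P99 Latency (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 110 | 14 | 75 | 27.51 | 24 | 51 | 73 |
| 2 | 300 | 14 | 51 | 24.24 | 22 | 38 | 49 |
| 3 | 200 | 13 | 72 | 25.56 | 23 | 44 | 57 |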

Use Case: scalabilityUseCase

Configuration:

  • MAX QUEUE SIZE: 2147483647
  • MAX BATCH SIZE (ms): 100
  • MAX BATCH LINGER (ms): 100
  • PROCESS TYPE: TOPIC-CONCURRENT

Results:

| # | NUMBER OF TOPICS | NUMBER OF EVENTS | Reconciliation interval (ms) |
| --- | --- | --- | --- |
| 1 | 2 | 8 | 10130 |
| 2 | 32 | 98 | 10441 |
| 3 | 125 | 375 | 41369 |
| 4 | 250 | 750 | 71977 |

User Operator

Use Case: latencyUseCase

Configuration:

  • WORK_QUEUE_SIZE: 2048
  • BATCH_MAXIMUM_BLOCK_SIZE: 100
  • BATCH_MAXIMUM_BLOCK_TIME_MS: 100

Results:

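| # | NUMBER OF KAFKA USERS | Min Latency (ms) | Max Latency (ms) | Average Latency (ms) | P50 Latency (ms) | P95 Latency (ms) | P99 Latency (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 110 | 14 | 75 | 27.51 | 24 | 51 | 73 |
| 2 | 200 | 13 | 72 | 25.56 | 23 | 44 | 57 |
| 3 | 300 | 14 | 51 | 24.24 | 22 | 38 | 49 |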

Use Case: scalabilityUseCase

Configuration:

  • WORK_QUEUE_SIZE: 1024
  • BATCH_MAXIMUM_BLOCK_SIZE: 100
  • BATCH_MAXIMUM_BLOCK_TIME_MS: 100

Results:

| # | NUMBER OF KAFKA USERS | Reconciliation interval (ms) |
| --- | --- | --- |
| 1 | 10 | 10641 |
| 2 | 100 | 33107 |
| 3 | 200 | 54157 |
| 4 | 500 | 134269 |

ARCH (another arch)

Test Run: 2025-11-18 14:49

Topic Operator

Use Case: latencyUseCase

Configuration:

  • WORK_QUEUE_SIZE: 2048
  • BATCH_MAXIMUM_BLOCK_SIZE: 100
  • BATCH_MAXIMUM_BLOCK_TIME_MS: 100

Results:

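| # | NUMBER OF KAFKA USERS | Min Latency (ms) | Max Latency (ms) | Average Latency (ms) | P50 Latency (ms) | P95 Latency (ms) | P99 Latency (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 110 | 14 | 75 | 27.51 | 24 | 51 | 73 |
| 2 | 300 | 14 | 51 | 24.24 | 22 | 38 | 49 |
| 3 | 200 | 13 | 72 | 25.56 | 23 | 44 | 57 |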

Use Case: scalabilityUseCase

Configuration:

  • MAX QUEUE SIZE: 2147483647
  • MAX BATCH SIZE (ms): 100
  • MAX BATCH LINGER (ms): 100
  • PROCESS TYPE: TOPIC-CONCURRENT

Results:

| # | NUMBER OF TOPICS | NUMBER OF EVENTS | Reconciliation interval (ms) |
| --- | --- | --- | --- |
| 1 | 2 | 8 | 10130 |
| 2 | 32 | 98 | 10441 |
| 3 | 125 | 375 | 41369 |
| 4 | 250 | 750 | 71977 |

User Operator

Use Case: latencyUseCase

Configuration:

  • WORK_QUEUE_SIZE: 2048
  • BATCH_MAXIMUM_BLOCK_SIZE: 100
  • BATCH_MAXIMUM_BLOCK_TIME_MS: 100

Results:

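| # | NUMBER OF KAFKA USERS | Min Latency (ms) | Max Latency (ms) | Average Latency (ms) | P50 Latency (ms) | P95 Latency (ms) | P99 Latency (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 110 | 14 | 75 | 27.51 | 24 | 51 | 73 |
| 2 | 200 | 13 | 72 | 25.56 | 23 | 44 | 57 |
| 3 | 300 | 14 | 51 | 24.24 | 22 | 38 | 49 |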

Use Case: scalabilityUseCase

Configuration:

  • WORK_QUEUE_SIZE: 1024
  • BATCH_MAXIMUM_BLOCK_SIZE: 100
  • BATCH_MAXIMUM_BLOCK_TIME_MS: 100

Results:

| # | NUMBER OF KAFKA USERS | Reconciliation interval (ms) |
| --- | --- | --- |
| 1 | 10 | 10641 |
| 2 | 100 | 33107 |
| 3 | 200 | 54157 |
| 4 | 500 | 134269 |
