-
Notifications
You must be signed in to change notification settings - Fork 22
Adding Pinned host buffer benchmarks #576
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Adding Pinned host buffer benchmarks #576
Conversation
|
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
Signed-off-by: niranda perera <[email protected]>
Signed-off-by: niranda perera <[email protected]>
Signed-off-by: niranda perera <[email protected]>
f6a2faf to
ddc5452
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good but we need to run a smoke test in CI: https://github.com/rapidsai/rapidsmpf/blob/f67d8ecc3a5ff83d3d0bf92dc10f2be2a499ea24/ci/run_cpp_benchmark_smoketests.sh
Signed-off-by: niranda perera <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The priming benchmark is measuring very misleading timings. Please also cull all the LLM-produced comments that just explain in words exactly what the next line of code says in code.
| auto latency_to_first = std::chrono::duration_cast<std::chrono::nanoseconds>( | ||
| first_allocation_time - start_time | ||
| ) | ||
| .count(); | ||
| auto first_round_duration_ns = | ||
| std::chrono::duration_cast<std::chrono::nanoseconds>( | ||
| first_round_end - start_time | ||
| ) | ||
| .count(); | ||
| auto second_round_duration_ns = | ||
| std::chrono::duration_cast<std::chrono::nanoseconds>( | ||
| second_round_end - first_round_end | ||
| ) | ||
| .count(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These timings are at least partly nonsense. The latency to first makes sense. The first_round_duration kind of doesn't because we've synced after the very first allocation, but ok. I guess it makes kind of sense.
The second_round_duration includes the time to deallocate all of the allocations from the first round. This makes no sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah. Good point. I was simply following this
https://github.com/rapidsai/rmm/blob/branch-25.12/cpp/benchmarks/async_priming/async_priming_bench.cpp
I guess the same issue is here as well.
Co-authored-by: Lawrence Mitchell <[email protected]>
Co-authored-by: Lawrence Mitchell <[email protected]>
Signed-off-by: niranda perera <[email protected]>

This PR adds benchmarks for pinned host buffer
Depends on #549