-
Notifications
You must be signed in to change notification settings - Fork 713
cloud_storage/tests: use better containers in s3_imposter #29512
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR replaces standard library containers in s3_imposter with more memory-efficient alternatives to prevent allocation failures during high-volume testing. The motivation stems from a test failure where ~800KB could not be allocated due to container growth in s3_imposter.
Changes:
- Replaced
std::vectorwithchunked_vectorfor various collections in s3_imposter and related test utilities - Replaced
std::map/std::multimapwithabsl::btree_map/absl::btree_multimapfor ordered collections - Updated function signatures and call sites throughout test code to use the new container types
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| src/v/cloud_io/tests/s3_imposter.h | Updated s3_imposter interface to use chunked_vector and absl::btree_multimap |
| src/v/cloud_io/tests/s3_imposter.cc | Implemented container type changes in s3_imposter, replacing std containers |
| src/v/cloud_io/tests/BUILD | Added dependencies for chunked_vector and absl::btree containers |
| src/v/cloud_storage/tests/util.h | Changed return types of make_imposter_expectations to chunked_vector |
| src/v/cloud_storage/tests/util.cc | Updated implementations to use chunked_vector and std::move semantics |
| src/v/cloud_storage/tests/BUILD | Added chunked_vector dependency |
| src/v/cloud_storage/tests/remote_test.cc | Changed local variables to const references to avoid copies |
| src/v/cloud_storage/tests/remote_partition_test.cc | Updated expectation handling to use chunked_vector with std::move |
| src/v/cloud_storage/tests/async_manifest_view_test.cc | Changed _expectations field to chunked_vector and added std::move |
| src/v/cloud_storage/tests/topic_mount_handler_test.cc | Refactored expectation construction to use chunked_vector |
| src/v/datalake/coordinator/tests/iceberg_snapshot_remover_test.cc | Updated remove_expectations call to use chunked_vector::single |
| src/v/iceberg/tests/table_io_test.cc | Updated add_expectations call to use chunked_vector::single |
| src/v/lsm/io/tests/persistence_test.cc | Updated remove_expectations call to use chunked_vector::single |
| // Build new vector excluding manifest expectation | ||
| chunked_vector<cloud_storage_fixture::expectation> expectations; | ||
| for (auto& e : all_expectations) { | ||
| if (e.url != manifest_url) { | ||
| expectations.push_back(std::move(e)); | ||
| } | ||
| } |
Copilot
AI
Feb 3, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The loop constructs a new chunked_vector by copying/moving elements from all_expectations except the manifest. This is less efficient than removing the manifest element directly. Consider using std::remove_if or erase-remove idiom on all_expectations instead of creating a new vector.
| s3_imposter_fixture::req_pred_t predicate) const { | ||
| std::vector<http_test_utils::request_info> matching_requests; | ||
| matching_requests.reserve(_requests.size()); | ||
| chunked_vector<http_test_utils::request_info> matching_requests; |
Copilot
AI
Feb 3, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reserve() call was removed when converting from std::vector to chunked_vector. While chunked_vector has different memory characteristics, pre-allocating capacity when the upper bound is known (_requests.size()) could still improve performance by reducing reallocations.
| chunked_vector<http_test_utils::request_info> matching_requests; | |
| chunked_vector<http_test_utils::request_info> matching_requests; | |
| matching_requests.reserve(_requests.size()); |
Saw an instance where we failed to allocate ~800KB because one of the containers in s3_imposters grew to be large. It isn't unreasonable to expect a test to bombard s3_imposter with requests, especially when trying to stress certain subsystems. This replaces the containers in s3_imposter with chunked_vector and absl::btree_multimap, which are more aligned with best practices of our codebase.
Saw one instance of this test using ~1GB which contributed to it crashing.
2b04ff8 to
faf9722
Compare
CI test resultstest results on build#80060
|
| } | ||
|
|
||
| void listen() { set_expectations_and_listen(_expectations); } | ||
| void listen() { set_expectations_and_listen(std::move(_expectations)); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this ok? just happens once and then no more uses of expectations member var?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, this is only used to set up expectations at the beginning of certain tests
Saw an instance where we failed to allocate ~800KB because one of the containers in s3_imposters grew to be large. It isn't unreasonable to expect a test to bombard s3_imposter with requests, especially when trying to stress certain subsystems.
This replaces the containers in s3_imposter with chunked_vector and absl::btree_multimap, which are more aligned with best practices of our codebase.
Backports Required
Release Notes