fix: scan_from backtrack across sibling subtrees (#850)#851
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (1)
WalkthroughAdds allocator support to the core ART db, fixes iterator seek backtracking in core and OLC variants when GTE/LTE child lookups fail, and adds a regression test that exercises the corrected backtrack behavior. ChangesAllocator Support and Seek Backtracking Fixes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@olc_art.hpp`:
- Around line 1041-1042: The deferred destroy currently enqueues default_destroy
(which calls free_aligned) causing freed OLC memory to use the global allocator;
change the deferred callback passed to alloc.defer_dealloc in places using
inode_ptr / leaf_ptr so it captures and calls the owning allocator's
dealloc/destroy path instead of default_destroy. Concretely, replace uses of
default_destroy in the calls around alloc.defer_dealloc(inode_ptr,
sizeof(INode), &default_destroy, alloc.ctx) and the similar leaf case (lines
referencing inode_ptr/leaf_ptr and get_db().get_allocator()) with an
allocator-aware destroy thunk that forwards the pointer, size and context to
alloc.dealloc or alloc.destroy (or a small wrapper that calls alloc.dealloc with
alloc.ctx), so retired nodes are reclaimed through alloc rather than the global
free. Ensure the thunk references the same allocator instance (alloc) obtained
from this->get_db().get_allocator().
- Around line 175-176: The change removed QSBR lifetime protection by aliasing
value_view to unodb::value_view and returning raw value_view from OLC methods;
restore a guard-carrying wrapper so readers keep the QSBR guard while they hold
views. Update the type alias and get()/iterator/scan return types (e.g., replace
using value_view = unodb::value_view and using get_result =
std::optional<value_type> with a guarded wrapper type) so get(), the
iterator/scan read functions, and any related methods return a QSBR-guarded view
(or a pair/struct that carries the guard) rather than a plain unodb::value_view,
ensuring the QSBR guard is acquired before reading and held for the lifetime of
the returned view.
In `@qsbr.cpp`:
- Around line 550-571: qsbr_defer_dealloc currently ignores the destroy_callback
and ctx, which causes deferred frees to bypass allocator-specific destruction
and call qsbr::deallocate (free_aligned) later; fix by capturing and queuing the
full deferred action (ptr, size, destroy_callback, ctx) instead of discarding
callback/ctx: extend or overload this_thread().on_next_epoch_deallocate (or add
a new on_next_epoch_deferred_action) to accept and store the
destroy_callback_type and ctx alongside ptr/size, and ensure the reclamation
path invokes that stored destroy_callback(ctx, ptr) or uses the stored allocator
context when freeing so olc_default_allocator behavior is preserved for deferred
frees and qsbr::deallocate is not hardcoded to free_aligned for queued entries.
In `@test/CMakeLists.txt`:
- Around line 95-96: The DEPENDS list for the tests_for_coverage target is
missing the test_olc_no_qsbr dependency, so ensure the tests_for_coverage target
will build that test before running ctest by adding test_olc_no_qsbr to its
DEPENDS list (i.e., include test_olc_no_qsbr alongside test_key_encode_decode,
test_art, test_art_key_view, etc. in the DEPENDS clause that defines
tests_for_coverage).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: a1668594-faf8-46b1-bf6f-30c791b0cd0c
📒 Files selected for processing (13)
.github/workflows/build.ymlCMakeLists.txtart.hppart_allocator.hppart_internal_impl.hppmutex_art.hppolc_art.hppqsbr.cppqsbr.hpptest/CMakeLists.txttest/test_art_allocator.cpptest/test_art_scan.cpptest/test_olc_no_qsbr.cpp
| using value_view = unodb::value_view; | ||
| using get_result = std::optional<value_type>; |
There was a problem hiding this comment.
Don't strip QSBR lifetime protection from OLC value_view reads.
For Value = unodb::value_view, these changes return a plain value_view from both get() and iterator/scan reads after the read-side protection has been released. A concurrent erase can now leave callers holding a dangling view into a retired leaf. Keep a guard-carrying QSBR wrapper here, or this OLC surface is no longer memory-safe.
Also applies to: 415-416, 3740-3756
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@olc_art.hpp` around lines 175 - 176, The change removed QSBR lifetime
protection by aliasing value_view to unodb::value_view and returning raw
value_view from OLC methods; restore a guard-carrying wrapper so readers keep
the QSBR guard while they hold views. Update the type alias and
get()/iterator/scan return types (e.g., replace using value_view =
unodb::value_view and using get_result = std::optional<value_type> with a
guarded wrapper type) so get(), the iterator/scan read functions, and any
related methods return a QSBR-guarded view (or a pair/struct that carries the
guard) rather than a plain unodb::value_view, ensuring the QSBR guard is
acquired before reading and held for the lifetime of the returned view.
| const auto& alloc = this->get_db().get_allocator(); | ||
| alloc.defer_dealloc(inode_ptr, sizeof(INode), &default_destroy, alloc.ctx); |
There was a problem hiding this comment.
Deferred reclamation is freeing OLC memory through the wrong allocator.
Both deleters queue default_destroy, which always ends in free_aligned(). If an olc_db instance was built with a custom allocator, retired nodes and leaves will be deallocated through the global heap instead of alloc.dealloc, which is an invalid-free bug as soon as delete/clear/destruction hits deferred reclamation. Route the destroy callback back through the owning allocator instead of hardcoding the default one.
Also applies to: 1067-1068
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@olc_art.hpp` around lines 1041 - 1042, The deferred destroy currently
enqueues default_destroy (which calls free_aligned) causing freed OLC memory to
use the global allocator; change the deferred callback passed to
alloc.defer_dealloc in places using inode_ptr / leaf_ptr so it captures and
calls the owning allocator's dealloc/destroy path instead of default_destroy.
Concretely, replace uses of default_destroy in the calls around
alloc.defer_dealloc(inode_ptr, sizeof(INode), &default_destroy, alloc.ctx) and
the similar leaf case (lines referencing inode_ptr/leaf_ptr and
get_db().get_allocator()) with an allocator-aware destroy thunk that forwards
the pointer, size and context to alloc.dealloc or alloc.destroy (or a small
wrapper that calls alloc.dealloc with alloc.ctx), so retired nodes are reclaimed
through alloc rather than the global free. Ensure the thunk references the same
allocator instance (alloc) obtained from this->get_db().get_allocator().
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
Remove incorrect pop() before backtrack loop in seek(). When gte_key_byte() or lte_key_byte() returns nothing, the current node was never pushed onto the stack, so the stack top is already the first ancestor to check for a sibling. The spurious pop() discarded that ancestor, causing seek to return end instead of finding the next subtree. Fixes both FWD and REV paths in art.hpp and olc_art.hpp. Adds regression test: scanFromBacktrackAcrossSiblingSubtrees.
18e0214 to
a0f312c
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/test_art_scan.cpp`:
- Around line 1091-1117: Add symmetric assertions for scan_range to mirror the
existing scan_from checks: after the forward scan_from block (using
db.scan_from(250,...)), add a forward db.scan_range call that covers the same
keys (e.g., range starting at 250 up to a value that includes 300,400,500)
collecting decoded keys into a results vector with the same lambda (using
unodb::visitor and decode) and assert results size==3 and values 300,400,500;
then add a reverse call db.scan_range(..., false) over the corresponding range,
collect decoded keys, and assert size==2 and values 200,100 to ensure range-path
regression is covered.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 0bd80479-4a97-4512-a483-c447b2c4d157
📒 Files selected for processing (3)
art.hppolc_art.hpptest/test_art_scan.cpp
Adds forward scan_range(250, 501) and reverse scan_range(250, 99) assertions to verify the fix also works through the range API path. Suggested-by: CodeRabbit
|
Actionable comments posted: 0 |
|
@laurynas-biveinis This one looks ready to merge. The. last CI job is internally complete and appears stuck in cleanup. |
scan_from (and scan_range) fails to find keys when the seek target falls beyond all children in a subtree, requiring backtrack to a sibling subtree at a higher level.
Reproducer: Insert 5 keys {100, 200, 300, 400, 500} as 8-byte big-endian uint64. The ART branches at byte 6: child 0x00 holds {100, 200}, child 0x01 holds {300, 400, 500}. scan_from(250) enters child 0x00, finds no child ≥ 0xFA at byte 7, and should backtrack to byte 6 to try child
0x01. Instead it returns end (no results).
Root cause: In seek(), when gte_key_byte() (FWD) or lte_key_byte() (REV) returns nothing, the backtrack code did if (!empty()) pop(); before the while-loop. The current node was never pushed onto the stack, so the stack top is already the first ancestor to check for a sibling. The
spurious pop() discarded that ancestor.
Fix: Remove the incorrect pop() in 4 places — FWD and REV paths in both art.hpp and olc_art.hpp. Add regression test covering both directions.
Fixes #850
Summary by CodeRabbit
New Features
Bug Fixes
Tests