Skip to content

fix: scan_from backtrack across sibling subtrees (#850)#851

Merged
laurynas-biveinis merged 2 commits into
unodb-dev:masterfrom
thompsonbry:fix/scan-from-backtrack-850
May 29, 2026
Merged

fix: scan_from backtrack across sibling subtrees (#850)#851
laurynas-biveinis merged 2 commits into
unodb-dev:masterfrom
thompsonbry:fix/scan-from-backtrack-850

Conversation

@thompsonbry
Copy link
Copy Markdown
Contributor

@thompsonbry thompsonbry commented May 21, 2026

scan_from (and scan_range) fails to find keys when the seek target falls beyond all children in a subtree, requiring backtrack to a sibling subtree at a higher level.

Reproducer: Insert 5 keys {100, 200, 300, 400, 500} as 8-byte big-endian uint64. The ART branches at byte 6: child 0x00 holds {100, 200}, child 0x01 holds {300, 400, 500}. scan_from(250) enters child 0x00, finds no child ≥ 0xFA at byte 7, and should backtrack to byte 6 to try child
0x01. Instead it returns end (no results).

Root cause: In seek(), when gte_key_byte() (FWD) or lte_key_byte() (REV) returns nothing, the backtrack code did if (!empty()) pop(); before the while-loop. The current node was never pushed onto the stack, so the stack top is already the first ancestor to check for a sibling. The
spurious pop() discarded that ancestor.

Fix: Remove the incorrect pop() in 4 places — FWD and REV paths in both art.hpp and olc_art.hpp. Add regression test covering both directions.

Fixes #850

Summary by CodeRabbit

  • New Features

    • Added support for providing a custom memory allocator via a new constructor and accessor.
  • Bug Fixes

    • Corrected iterator/scan behavior when backtracking across sibling subtrees so forward and reverse range scans return the correct sequences.
  • Tests

    • Added a regression test covering cross-subtree scanning and backtracking scenarios.

Review Change Stack

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 21, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: b18bfa9e-a36c-45af-9e74-2cb48c47d8bb

📥 Commits

Reviewing files that changed from the base of the PR and between a0f312c and 6e6884a.

📒 Files selected for processing (1)
  • test/test_art_scan.cpp

Walkthrough

Adds allocator support to the core ART db, fixes iterator seek backtracking in core and OLC variants when GTE/LTE child lookups fail, and adds a regression test that exercises the corrected backtrack behavior.

Changes

Allocator Support and Seek Backtracking Fixes

Layer / File(s) Summary
ART db allocator constructor and accessor
art.hpp
Adds art_allocator.hpp, a constexpr explicit db(const allocator_type& alloc) noexcept constructor that asserts allocator function pointers, get_allocator() accessor, and an allocator_ member initialized from detail::default_allocator.
Iterator seek backtracking fix in core ART
art.hpp
Removes an incorrect pop() in db::iterator::seek() forward/reverse paths when gte_key_byte/lte_key_byte finds no candidate; clarifies the node where lookup failed was never pushed so backtracking uses the existing ancestor stack entries.
Iterator seek backtracking fix in OLC variant
olc_art.hpp
Applies the same stack-unwinding correction to olc_db::iterator::try_seek() forward/reverse failure paths; updates comments and keeps unwinding from the correct ancestor.
Seek backtracking regression test
test/test_art_scan.cpp
Adds ARTScanTest.scanFromBacktrackAcrossSiblingSubtrees that inserts {100,200,300,400,500} and asserts scan_from(250) and scan_range visit expected keys in forward and reverse directions.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

A byte looked left and then it stalled,
No pop where none was pushed — the stack stood tall.
Allocators snug, the seekers roam anew,
Sibling doors open; scans find what is true. ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 51.61% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed Title accurately summarizes the main fix: removing incorrect pop() logic to enable proper backtracking across sibling subtrees in scan_from.
Linked Issues check ✅ Passed PR fully addresses issue #850: fixes seek() backtrack logic in both art.hpp and olc_art.hpp (forward and reverse paths), removes the spurious pop(), and adds comprehensive regression test.
Out of Scope Changes check ✅ Passed All changes directly support the fix: allocator support added as prerequisite infrastructure, iterator seek logic corrected per spec, and test added. No unrelated modifications detected.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@olc_art.hpp`:
- Around line 1041-1042: The deferred destroy currently enqueues default_destroy
(which calls free_aligned) causing freed OLC memory to use the global allocator;
change the deferred callback passed to alloc.defer_dealloc in places using
inode_ptr / leaf_ptr so it captures and calls the owning allocator's
dealloc/destroy path instead of default_destroy. Concretely, replace uses of
default_destroy in the calls around alloc.defer_dealloc(inode_ptr,
sizeof(INode), &default_destroy, alloc.ctx) and the similar leaf case (lines
referencing inode_ptr/leaf_ptr and get_db().get_allocator()) with an
allocator-aware destroy thunk that forwards the pointer, size and context to
alloc.dealloc or alloc.destroy (or a small wrapper that calls alloc.dealloc with
alloc.ctx), so retired nodes are reclaimed through alloc rather than the global
free. Ensure the thunk references the same allocator instance (alloc) obtained
from this->get_db().get_allocator().
- Around line 175-176: The change removed QSBR lifetime protection by aliasing
value_view to unodb::value_view and returning raw value_view from OLC methods;
restore a guard-carrying wrapper so readers keep the QSBR guard while they hold
views. Update the type alias and get()/iterator/scan return types (e.g., replace
using value_view = unodb::value_view and using get_result =
std::optional<value_type> with a guarded wrapper type) so get(), the
iterator/scan read functions, and any related methods return a QSBR-guarded view
(or a pair/struct that carries the guard) rather than a plain unodb::value_view,
ensuring the QSBR guard is acquired before reading and held for the lifetime of
the returned view.

In `@qsbr.cpp`:
- Around line 550-571: qsbr_defer_dealloc currently ignores the destroy_callback
and ctx, which causes deferred frees to bypass allocator-specific destruction
and call qsbr::deallocate (free_aligned) later; fix by capturing and queuing the
full deferred action (ptr, size, destroy_callback, ctx) instead of discarding
callback/ctx: extend or overload this_thread().on_next_epoch_deallocate (or add
a new on_next_epoch_deferred_action) to accept and store the
destroy_callback_type and ctx alongside ptr/size, and ensure the reclamation
path invokes that stored destroy_callback(ctx, ptr) or uses the stored allocator
context when freeing so olc_default_allocator behavior is preserved for deferred
frees and qsbr::deallocate is not hardcoded to free_aligned for queued entries.

In `@test/CMakeLists.txt`:
- Around line 95-96: The DEPENDS list for the tests_for_coverage target is
missing the test_olc_no_qsbr dependency, so ensure the tests_for_coverage target
will build that test before running ctest by adding test_olc_no_qsbr to its
DEPENDS list (i.e., include test_olc_no_qsbr alongside test_key_encode_decode,
test_art, test_art_key_view, etc. in the DEPENDS clause that defines
tests_for_coverage).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: a1668594-faf8-46b1-bf6f-30c791b0cd0c

📥 Commits

Reviewing files that changed from the base of the PR and between 36e4770 and 18e0214.

📒 Files selected for processing (13)
  • .github/workflows/build.yml
  • CMakeLists.txt
  • art.hpp
  • art_allocator.hpp
  • art_internal_impl.hpp
  • mutex_art.hpp
  • olc_art.hpp
  • qsbr.cpp
  • qsbr.hpp
  • test/CMakeLists.txt
  • test/test_art_allocator.cpp
  • test/test_art_scan.cpp
  • test/test_olc_no_qsbr.cpp

Comment thread olc_art.hpp Outdated
Comment on lines +175 to +176
using value_view = unodb::value_view;
using get_result = std::optional<value_type>;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

Don't strip QSBR lifetime protection from OLC value_view reads.

For Value = unodb::value_view, these changes return a plain value_view from both get() and iterator/scan reads after the read-side protection has been released. A concurrent erase can now leave callers holding a dangling view into a retired leaf. Keep a guard-carrying QSBR wrapper here, or this OLC surface is no longer memory-safe.

Also applies to: 415-416, 3740-3756

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@olc_art.hpp` around lines 175 - 176, The change removed QSBR lifetime
protection by aliasing value_view to unodb::value_view and returning raw
value_view from OLC methods; restore a guard-carrying wrapper so readers keep
the QSBR guard while they hold views. Update the type alias and
get()/iterator/scan return types (e.g., replace using value_view =
unodb::value_view and using get_result = std::optional<value_type> with a
guarded wrapper type) so get(), the iterator/scan read functions, and any
related methods return a QSBR-guarded view (or a pair/struct that carries the
guard) rather than a plain unodb::value_view, ensuring the QSBR guard is
acquired before reading and held for the lifetime of the returned view.

Comment thread olc_art.hpp
Comment on lines +1041 to +1042
const auto& alloc = this->get_db().get_allocator();
alloc.defer_dealloc(inode_ptr, sizeof(INode), &default_destroy, alloc.ctx);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

Deferred reclamation is freeing OLC memory through the wrong allocator.

Both deleters queue default_destroy, which always ends in free_aligned(). If an olc_db instance was built with a custom allocator, retired nodes and leaves will be deallocated through the global heap instead of alloc.dealloc, which is an invalid-free bug as soon as delete/clear/destruction hits deferred reclamation. Route the destroy callback back through the owning allocator instead of hardcoding the default one.

Also applies to: 1067-1068

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@olc_art.hpp` around lines 1041 - 1042, The deferred destroy currently
enqueues default_destroy (which calls free_aligned) causing freed OLC memory to
use the global allocator; change the deferred callback passed to
alloc.defer_dealloc in places using inode_ptr / leaf_ptr so it captures and
calls the owning allocator's dealloc/destroy path instead of default_destroy.
Concretely, replace uses of default_destroy in the calls around
alloc.defer_dealloc(inode_ptr, sizeof(INode), &default_destroy, alloc.ctx) and
the similar leaf case (lines referencing inode_ptr/leaf_ptr and
get_db().get_allocator()) with an allocator-aware destroy thunk that forwards
the pointer, size and context to alloc.dealloc or alloc.destroy (or a small
wrapper that calls alloc.dealloc with alloc.ctx), so retired nodes are reclaimed
through alloc rather than the global free. Ensure the thunk references the same
allocator instance (alloc) obtained from this->get_db().get_allocator().

Comment thread qsbr.cpp Outdated
Comment thread test/CMakeLists.txt Outdated
@codecov
Copy link
Copy Markdown

codecov Bot commented May 22, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Remove incorrect pop() before backtrack loop in seek(). When
gte_key_byte() or lte_key_byte() returns nothing, the current node
was never pushed onto the stack, so the stack top is already the
first ancestor to check for a sibling. The spurious pop() discarded
that ancestor, causing seek to return end instead of finding the
next subtree.

Fixes both FWD and REV paths in art.hpp and olc_art.hpp.
Adds regression test: scanFromBacktrackAcrossSiblingSubtrees.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/test_art_scan.cpp`:
- Around line 1091-1117: Add symmetric assertions for scan_range to mirror the
existing scan_from checks: after the forward scan_from block (using
db.scan_from(250,...)), add a forward db.scan_range call that covers the same
keys (e.g., range starting at 250 up to a value that includes 300,400,500)
collecting decoded keys into a results vector with the same lambda (using
unodb::visitor and decode) and assert results size==3 and values 300,400,500;
then add a reverse call db.scan_range(..., false) over the corresponding range,
collect decoded keys, and assert size==2 and values 200,100 to ensure range-path
regression is covered.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 0bd80479-4a97-4512-a483-c447b2c4d157

📥 Commits

Reviewing files that changed from the base of the PR and between 18e0214 and a0f312c.

📒 Files selected for processing (3)
  • art.hpp
  • olc_art.hpp
  • test/test_art_scan.cpp

Comment thread test/test_art_scan.cpp
Adds forward scan_range(250, 501) and reverse scan_range(250, 99)
assertions to verify the fix also works through the range API path.

Suggested-by: CodeRabbit
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 28, 2026

Actionable comments posted: 0

@thompsonbry
Copy link
Copy Markdown
Contributor Author

@laurynas-biveinis This one looks ready to merge. The. last CI job is internally complete and appears stuck in cleanup.

@laurynas-biveinis laurynas-biveinis self-assigned this May 29, 2026
@laurynas-biveinis laurynas-biveinis merged commit fa7a2a3 into unodb-dev:master May 29, 2026
209 of 213 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug: db::scan_from fails to backtrack across sibling subtrees for GTE/LTE seek.

2 participants