fix: scan_from backtrack across sibling subtrees (#850) by thompsonbry · Pull Request #851 · unodb-dev/unodb

thompsonbry · 2026-05-21T22:37:28Z

scan_from (and scan_range) fails to find keys when the seek target falls beyond all children in a subtree, requiring backtrack to a sibling subtree at a higher level.

Reproducer: Insert 5 keys {100, 200, 300, 400, 500} as 8-byte big-endian uint64. The ART branches at byte 6: child 0x00 holds {100, 200}, child 0x01 holds {300, 400, 500}. scan_from(250) enters child 0x00, finds no child ≥ 0xFA at byte 7, and should backtrack to byte 6 to try child
0x01. Instead it returns end (no results).

Root cause: In seek(), when gte_key_byte() (FWD) or lte_key_byte() (REV) returns nothing, the backtrack code did if (!empty()) pop(); before the while-loop. The current node was never pushed onto the stack, so the stack top is already the first ancestor to check for a sibling. The
spurious pop() discarded that ancestor.

Fix: Remove the incorrect pop() in 4 places — FWD and REV paths in both art.hpp and olc_art.hpp. Add regression test covering both directions.

Fixes #850

Summary by CodeRabbit

New Features
- Added support for providing a custom memory allocator via a new constructor and accessor.
Bug Fixes
- Corrected iterator/scan behavior when backtracking across sibling subtrees so forward and reverse range scans return the correct sequences.
Tests
- Added a regression test covering cross-subtree scanning and backtracking scenarios.

coderabbitai · 2026-05-21T22:37:38Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: b18bfa9e-a36c-45af-9e74-2cb48c47d8bb

📥 Commits

Reviewing files that changed from the base of the PR and between a0f312c and 6e6884a.

📒 Files selected for processing (1)

test/test_art_scan.cpp

Walkthrough

Adds allocator support to the core ART db, fixes iterator seek backtracking in core and OLC variants when GTE/LTE child lookups fail, and adds a regression test that exercises the corrected backtrack behavior.

Changes

Allocator Support and Seek Backtracking Fixes

Layer / File(s)	Summary
ART db allocator constructor and accessor `art.hpp`	Adds `art_allocator.hpp`, a `constexpr explicit db(const allocator_type& alloc) noexcept` constructor that asserts allocator function pointers, `get_allocator()` accessor, and an `allocator_` member initialized from `detail::default_allocator`.
Iterator seek backtracking fix in core ART `art.hpp`	Removes an incorrect `pop()` in `db::iterator::seek()` forward/reverse paths when `gte_key_byte`/`lte_key_byte` finds no candidate; clarifies the node where lookup failed was never pushed so backtracking uses the existing ancestor stack entries.
Iterator seek backtracking fix in OLC variant `olc_art.hpp`	Applies the same stack-unwinding correction to `olc_db::iterator::try_seek()` forward/reverse failure paths; updates comments and keeps unwinding from the correct ancestor.
Seek backtracking regression test `test/test_art_scan.cpp`	Adds `ARTScanTest.scanFromBacktrackAcrossSiblingSubtrees` that inserts {100,200,300,400,500} and asserts `scan_from(250)` and `scan_range` visit expected keys in forward and reverse directions.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

A byte looked left and then it stalled,
No pop where none was pushed — the stack stood tall.
Allocators snug, the seekers roam anew,
Sibling doors open; scans find what is true. ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 51.61% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	Title accurately summarizes the main fix: removing incorrect pop() logic to enable proper backtracking across sibling subtrees in scan_from.
Linked Issues check	✅ Passed	PR fully addresses issue `#850`: fixes seek() backtrack logic in both art.hpp and olc_art.hpp (forward and reverse paths), removes the spurious pop(), and adds comprehensive regression test.
Out of Scope Changes check	✅ Passed	All changes directly support the fix: allocator support added as prerequisite infrastructure, iterator seek logic corrected per spec, and test added. No unrelated modifications detected.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@olc_art.hpp`:
- Around line 1041-1042: The deferred destroy currently enqueues default_destroy
(which calls free_aligned) causing freed OLC memory to use the global allocator;
change the deferred callback passed to alloc.defer_dealloc in places using
inode_ptr / leaf_ptr so it captures and calls the owning allocator's
dealloc/destroy path instead of default_destroy. Concretely, replace uses of
default_destroy in the calls around alloc.defer_dealloc(inode_ptr,
sizeof(INode), &default_destroy, alloc.ctx) and the similar leaf case (lines
referencing inode_ptr/leaf_ptr and get_db().get_allocator()) with an
allocator-aware destroy thunk that forwards the pointer, size and context to
alloc.dealloc or alloc.destroy (or a small wrapper that calls alloc.dealloc with
alloc.ctx), so retired nodes are reclaimed through alloc rather than the global
free. Ensure the thunk references the same allocator instance (alloc) obtained
from this->get_db().get_allocator().
- Around line 175-176: The change removed QSBR lifetime protection by aliasing
value_view to unodb::value_view and returning raw value_view from OLC methods;
restore a guard-carrying wrapper so readers keep the QSBR guard while they hold
views. Update the type alias and get()/iterator/scan return types (e.g., replace
using value_view = unodb::value_view and using get_result =
std::optional<value_type> with a guarded wrapper type) so get(), the
iterator/scan read functions, and any related methods return a QSBR-guarded view
(or a pair/struct that carries the guard) rather than a plain unodb::value_view,
ensuring the QSBR guard is acquired before reading and held for the lifetime of
the returned view.

In `@qsbr.cpp`:
- Around line 550-571: qsbr_defer_dealloc currently ignores the destroy_callback
and ctx, which causes deferred frees to bypass allocator-specific destruction
and call qsbr::deallocate (free_aligned) later; fix by capturing and queuing the
full deferred action (ptr, size, destroy_callback, ctx) instead of discarding
callback/ctx: extend or overload this_thread().on_next_epoch_deallocate (or add
a new on_next_epoch_deferred_action) to accept and store the
destroy_callback_type and ctx alongside ptr/size, and ensure the reclamation
path invokes that stored destroy_callback(ctx, ptr) or uses the stored allocator
context when freeing so olc_default_allocator behavior is preserved for deferred
frees and qsbr::deallocate is not hardcoded to free_aligned for queued entries.

In `@test/CMakeLists.txt`:
- Around line 95-96: The DEPENDS list for the tests_for_coverage target is
missing the test_olc_no_qsbr dependency, so ensure the tests_for_coverage target
will build that test before running ctest by adding test_olc_no_qsbr to its
DEPENDS list (i.e., include test_olc_no_qsbr alongside test_key_encode_decode,
test_art, test_art_key_view, etc. in the DEPENDS clause that defines
tests_for_coverage).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: a1668594-faf8-46b1-bf6f-30c791b0cd0c

📥 Commits

Reviewing files that changed from the base of the PR and between 36e4770 and 18e0214.

📒 Files selected for processing (13)

.github/workflows/build.yml
CMakeLists.txt
art.hpp
art_allocator.hpp
art_internal_impl.hpp
mutex_art.hpp
olc_art.hpp
qsbr.cpp
qsbr.hpp
test/CMakeLists.txt
test/test_art_allocator.cpp
test/test_art_scan.cpp
test/test_olc_no_qsbr.cpp

coderabbitai · 2026-05-21T22:44:48Z

+  using value_view = unodb::value_view;
+  using get_result = std::optional<value_type>;


⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

Don't strip QSBR lifetime protection from OLC value_view reads.

For Value = unodb::value_view, these changes return a plain value_view from both get() and iterator/scan reads after the read-side protection has been released. A concurrent erase can now leave callers holding a dangling view into a retired leaf. Keep a guard-carrying QSBR wrapper here, or this OLC surface is no longer memory-safe.

Also applies to: 415-416, 3740-3756

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@olc_art.hpp` around lines 175 - 176, The change removed QSBR lifetime protection by aliasing value_view to unodb::value_view and returning raw value_view from OLC methods; restore a guard-carrying wrapper so readers keep the QSBR guard while they hold views. Update the type alias and get()/iterator/scan return types (e.g., replace using value_view = unodb::value_view and using get_result = std::optional<value_type> with a guarded wrapper type) so get(), the iterator/scan read functions, and any related methods return a QSBR-guarded view (or a pair/struct that carries the guard) rather than a plain unodb::value_view, ensuring the QSBR guard is acquired before reading and held for the lifetime of the returned view.

coderabbitai · 2026-05-21T22:44:48Z

+    const auto& alloc = this->get_db().get_allocator();
+    alloc.defer_dealloc(inode_ptr, sizeof(INode), &default_destroy, alloc.ctx);


⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

Deferred reclamation is freeing OLC memory through the wrong allocator.

Both deleters queue default_destroy, which always ends in free_aligned(). If an olc_db instance was built with a custom allocator, retired nodes and leaves will be deallocated through the global heap instead of alloc.dealloc, which is an invalid-free bug as soon as delete/clear/destruction hits deferred reclamation. Route the destroy callback back through the owning allocator instead of hardcoding the default one.

Also applies to: 1067-1068

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@olc_art.hpp` around lines 1041 - 1042, The deferred destroy currently enqueues default_destroy (which calls free_aligned) causing freed OLC memory to use the global allocator; change the deferred callback passed to alloc.defer_dealloc in places using inode_ptr / leaf_ptr so it captures and calls the owning allocator's dealloc/destroy path instead of default_destroy. Concretely, replace uses of default_destroy in the calls around alloc.defer_dealloc(inode_ptr, sizeof(INode), &default_destroy, alloc.ctx) and the similar leaf case (lines referencing inode_ptr/leaf_ptr and get_db().get_allocator()) with an allocator-aware destroy thunk that forwards the pointer, size and context to alloc.dealloc or alloc.destroy (or a small wrapper that calls alloc.dealloc with alloc.ctx), so retired nodes are reclaimed through alloc rather than the global free. Ensure the thunk references the same allocator instance (alloc) obtained from this->get_db().get_allocator().

codecov · 2026-05-22T00:46:08Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Remove incorrect pop() before backtrack loop in seek(). When gte_key_byte() or lte_key_byte() returns nothing, the current node was never pushed onto the stack, so the stack top is already the first ancestor to check for a sibling. The spurious pop() discarded that ancestor, causing seek to return end instead of finding the next subtree. Fixes both FWD and REV paths in art.hpp and olc_art.hpp. Adds regression test: scanFromBacktrackAcrossSiblingSubtrees.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/test_art_scan.cpp`:
- Around line 1091-1117: Add symmetric assertions for scan_range to mirror the
existing scan_from checks: after the forward scan_from block (using
db.scan_from(250,...)), add a forward db.scan_range call that covers the same
keys (e.g., range starting at 250 up to a value that includes 300,400,500)
collecting decoded keys into a results vector with the same lambda (using
unodb::visitor and decode) and assert results size==3 and values 300,400,500;
then add a reverse call db.scan_range(..., false) over the corresponding range,
collect decoded keys, and assert size==2 and values 200,100 to ensure range-path
regression is covered.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 0bd80479-4a97-4512-a483-c447b2c4d157

📥 Commits

Reviewing files that changed from the base of the PR and between 18e0214 and a0f312c.

📒 Files selected for processing (3)

art.hpp
olc_art.hpp
test/test_art_scan.cpp

Adds forward scan_range(250, 501) and reverse scan_range(250, 99) assertions to verify the fix also works through the range API path. Suggested-by: CodeRabbit

coderabbitai · 2026-05-28T17:55:52Z

Actionable comments posted: 0

thompsonbry · 2026-05-29T12:46:51Z

@laurynas-biveinis This one looks ready to merge. The. last CI job is internally complete and appears stuck in cleanup.

thompsonbry temporarily deployed to coverage May 21, 2026 22:37 — with GitHub Actions Inactive

thompsonbry had a problem deploying to claude May 21, 2026 22:37 — with GitHub Actions Failure

thompsonbry had a problem deploying to sonarcloud May 21, 2026 22:37 — with GitHub Actions Failure

coderabbitai Bot reviewed May 21, 2026

View reviewed changes

thompsonbry force-pushed the fix/scan-from-backtrack-850 branch from 18e0214 to a0f312c Compare May 28, 2026 12:27

thompsonbry had a problem deploying to sonarcloud May 28, 2026 12:28 — with GitHub Actions Failure

thompsonbry had a problem deploying to claude May 28, 2026 12:28 — with GitHub Actions Failure

thompsonbry temporarily deployed to coverage May 28, 2026 12:28 — with GitHub Actions Inactive

coderabbitai Bot reviewed May 28, 2026

View reviewed changes

Comment thread test/test_art_scan.cpp

test: add scan_range assertions to backtrack regression test

6e6884a

Adds forward scan_range(250, 501) and reverse scan_range(250, 99) assertions to verify the fix also works through the range API path. Suggested-by: CodeRabbit

thompsonbry had a problem deploying to claude May 28, 2026 17:52 — with GitHub Actions Failure

thompsonbry temporarily deployed to coverage May 28, 2026 17:52 — with GitHub Actions Inactive

thompsonbry had a problem deploying to sonarcloud May 28, 2026 17:52 — with GitHub Actions Failure

laurynas-biveinis self-assigned this May 29, 2026

laurynas-biveinis merged commit fa7a2a3 into unodb-dev:master May 29, 2026
209 of 213 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: scan_from backtrack across sibling subtrees (#850)#851

fix: scan_from backtrack across sibling subtrees (#850)#851
laurynas-biveinis merged 2 commits into
unodb-dev:masterfrom
thompsonbry:fix/scan-from-backtrack-850

thompsonbry commented May 21, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 21, 2026 •

edited

Loading

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 21, 2026

Uh oh!

coderabbitai Bot May 21, 2026

Uh oh!

Uh oh!

Uh oh!

codecov Bot commented May 22, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

coderabbitai Bot commented May 28, 2026

Uh oh!

thompsonbry commented May 29, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		using value_view = unodb::value_view;
		using get_result = std::optional<value_type>;

		const auto& alloc = this->get_db().get_allocator();
		alloc.defer_dealloc(inode_ptr, sizeof(INode), &default_destroy, alloc.ctx);

Conversation

thompsonbry commented May 21, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

codecov Bot commented May 22, 2026

Codecov Report

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot commented May 28, 2026

Uh oh!

thompsonbry commented May 29, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

thompsonbry commented May 21, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 21, 2026 •

edited

Loading