Skip to content

Replace custom ctz() with C++20 std::countr_zero#813

Merged
laurynas-biveinis merged 1 commit into
masterfrom
c++20-ctz
Dec 2, 2025
Merged

Replace custom ctz() with C++20 std::countr_zero#813
laurynas-biveinis merged 1 commit into
masterfrom
c++20-ctz

Conversation

@laurynas-biveinis
Copy link
Copy Markdown
Collaborator

@laurynas-biveinis laurynas-biveinis commented Nov 25, 2025

Remove the custom platform-specific count trailing zeros implementation
and use std::countr_zero from the header instead.

This also enables upgrading shared_len() and get_shared_length() to
full constexpr, since std::countr_zero is constexpr on all platforms
including MSVC.

🤖 Generated with Claude Code

Co-Authored-By: Claude noreply@anthropic.com

Summary by CodeRabbit

  • Refactor
    • Replaced custom bit-utility implementations with C++20 standard equivalents for improved portability and maintainability.
  • Chore
    • Removed an internal trailing-zero helper, added a standard header, and introduced a small overload for consistent behavior across internal APIs.

✏️ Tip: You can customize this high-level summary in your review settings.

@laurynas-biveinis laurynas-biveinis self-assigned this Nov 25, 2025
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Nov 25, 2025

Walkthrough

Replaced the project-local trailing-zero utility with C++20 std::countr_zero, added <bit> includes, removed the custom ctz template, made a get_shared_length overload for ArtKey, and changed one get_shared_length declaration to constexpr; no other behavior or API-signature removals.

Changes

Cohort / File(s) Summary
Header: art_common.hpp
art_common.hpp
Added #include <bit> and replaced detail::ctz(...) with std::countr_zero(...) in lexicographic_successor.
Implementation: art_internal_impl.hpp
art_internal_impl.hpp
Added #include <bit>; replaced detail::ctz(...) with std::countr_zero(...) across shared-length and inode paths; changed key_prefix_snapshot::get_shared_length(std::uint64_t) to constexpr; added key_prefix::get_shared_length(ArtKey) overload delegating to the u64 overload.
Portability utilities: portability_builtins.hpp
portability_builtins.hpp
Removed the ctz template and its MSVC/non-MSVC specializations; retained popcount and bit_cast.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Pay special attention to:
    • Any remaining call sites referencing detail::ctz.
    • Behavior of std::countr_zero on zero inputs compared to the removed ctz.
    • The new constexpr declaration and the ArtKey overload for correctness and ABI/inline effects.

Possibly related PRs

Poem

Bits tumble, headers sing,
Countr_zero now does the ring.
Constexpr tightened, overloads chimed,
Old ctz left far behind. 🎉

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 22.22% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and clearly describes the main objective of the changeset: replacing a custom ctz() implementation with C++20's std::countr_zero.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch c++20-ctz

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 487514d and 819c14a.

📒 Files selected for processing (3)
  • art_common.hpp (1 hunks)
  • art_internal_impl.hpp (12 hunks)
  • portability_builtins.hpp (0 hunks)
💤 Files with no reviewable changes (1)
  • portability_builtins.hpp
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Codacy Static Code Analysis
  • GitHub Check: Codacy Static Code Analysis
  • GitHub Check: Codacy Static Code Analysis
  • GitHub Check: Codacy Static Code Analysis
  • GitHub Check: Codacy Static Code Analysis
  • GitHub Check: Codacy Static Code Analysis
  • GitHub Check: Codacy Static Code Analysis
  • GitHub Check: Codacy Static Code Analysis
  • GitHub Check: Codacy Static Code Analysis
  • GitHub Check: Codacy Static Code Analysis
🔇 Additional comments (6)
art_internal_impl.hpp (6)

8-13: Explicit <bit> include is appropriate

Given the new uses of std::countr_zero and other bit utilities in this file, pulling in <bit> here instead of relying on transitive includes is correct and keeps dependencies explicit.


582-602: key_prefix_snapshot::shared_len correctly guards std::countr_zero

shared_len() computes:

  • diff = k1 ^ k2;
  • clamped = diff | (1ULL << (clamp_byte_pos * 8U));

With clamp_byte_pos < 8 (asserted) this guarantees clamped != 0, so std::countr_zero(clamped) never sees zero and the standard precondition is satisfied. Shifting by >> 3U to convert bit position to byte index matches the previous ctz‑based behaviour, and making get_shared_length() constexpr is safe given the purely bitwise logic.


1472-1507: N4 find_child SIMD paths keep std::countr_zero safe

In both the x86‑64 and AArch64 implementations, you only call std::countr_zero when the mask (bit_field / masked_pos) is non‑zero:

  • x86‑64: guarded by if (bit_field != 0).
  • AArch64: early‑return on if (masked_pos == 0).

That preserves the original ctz semantics and avoids the zero‑argument UB.


1894-1937: N16 find_child SIMD paths also respect std::countr_zero’s preconditions

Similarly for N16:

  • x86‑64: bit_field is masked and then checked if (bit_field != 0) before std::countr_zero(bit_field).
  • AArch64: masked_pos is ANDed with the appropriate mask; if (masked_pos == 0) returns early before std::countr_zero(masked_pos).

So all new std::countr_zero calls here are on non‑zero unsigned masks; no new UB is introduced.


2036-2072: get_sorted_key_array_insert_position handles the zero‑mask case explicitly

The x86‑64 branch now does:

const auto bit_field = ...;
const auto result =
    (bit_field != 0)
        ? static_cast<std::uint8_t>(std::countr_zero(bit_field))
        : static_cast<std::uint8_t>(children_count_);

This exactly mirrors the prior ctz logic while ensuring std::countr_zero never sees zero; fallback to children_count_ when bit_field == 0 is unchanged.


2208-2310: N48 add_to_nonfull: confirm mask assumptions around std::countr_zero

In all three SIMD variants of add_to_nonfull:

  • SSE4.2: cmp_mask is computed from _mm_movemask_epi8(vec_cmp) and you only call std::countr_zero(cmp_mask) when cmp_mask != 0.
  • AVX2: cmp_mask comes from _mm256_movemask_epi8(vec_cmp) and is used in std::countr_zero(cmp_mask) only after _mm256_testz_si256(...) has verified a non‑zero comparison result.
  • AArch64: scalar_pos is derived from narrowed comparison vectors and std::countr_zero(scalar_pos) is only executed when scalar_pos != 0.

So every std::countr_zero call is guarded against a zero argument. The loop still relies on the existing invariant that children_count_ < capacity implies at least one nullptr slot in children, which was already required by the original ctz‑based code.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link
Copy Markdown

codecov Bot commented Nov 27, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.39%. Comparing base (61773d1) to head (819c14a).
⚠️ Report is 2 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #813      +/-   ##
==========================================
+ Coverage   98.30%   98.39%   +0.09%     
==========================================
  Files          35       35              
  Lines        6497     6494       -3     
==========================================
+ Hits         6387     6390       +3     
+ Misses        110      104       -6     
Flag Coverage Δ
Debug 98.35% <100.00%> (+0.26%) ⬆️
Release 97.70% <100.00%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@laurynas-biveinis laurynas-biveinis force-pushed the c++20-ctz branch 3 times, most recently from 69c2f59 to 487514d Compare November 30, 2025 20:01
Remove the custom platform-specific count trailing zeros implementation
and use std::countr_zero from the <bit> header instead.

This also enables upgrading shared_len() and get_shared_length() to
full constexpr, since std::countr_zero is constexpr on all platforms
including MSVC.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented Dec 1, 2025

@laurynas-biveinis laurynas-biveinis merged commit 04f0be7 into master Dec 2, 2025
236 of 239 checks passed
@laurynas-biveinis laurynas-biveinis deleted the c++20-ctz branch December 2, 2025 10:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant