feat: Support `u32` indices for `HashJoinExec` #16434

jonathanc-n · 2025-06-18T01:18:47Z

Which issue does this PR close?

Closes Support u32 indices in HashJoinExec #16179.

Rationale for this change

We can use u32 indices instead of u64 indices when the hashmap is built from fewer than u32::MAX rows. This is a memory optimization.
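As a rough illustration of the saving, here is a minimal sketch that assumes the chained index list holds exactly one entry per build-side row. The helper names `chain_bytes` and `bytes_saved` are hypothetical and are not part of DataFusion.

```rust
use std::mem::size_of;

/// Bytes needed for a chain with one index entry per build-side row.
/// Hypothetical helper for illustration only.
fn chain_bytes<T>(num_rows: usize) -> usize {
    num_rows * size_of::<T>()
}

/// Bytes saved by storing the chain as u32 instead of u64.
fn bytes_saved(num_rows: usize) -> usize {
    chain_bytes::<u64>(num_rows) - chain_bytes::<u32>(num_rows)
}
```

Under that assumption the index chain is half the size: 10 million build-side rows need 40 MB of u32 indices instead of 80 MB of u64 indices.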

What changes are included in this PR?

During HashJoinExec we construct the JoinLeftData with a Box<dyn JoinHashMapType>, choosing between a u32-indexed and a u64-indexed JoinHashMap.

I changed JoinHashMapType to hold update_from_iter, get_matched_indices, and get_matched_indices_with_limit_offset, and split JoinHashMap into JoinHashMapU32 and JoinHashMapU64.

I deliberately did not expose a generic parameter on the trait or on the JoinHashMap struct. Doing so would force every calling function to carry the generic as well, which is not possible because the JoinHashMapType is chosen at runtime.
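The runtime selection described above can be sketched roughly as follows. This is a simplified stand-in, not the PR's code: the trait `IndexChain`, the structs `ChainU32` and `ChainU64`, and the function `new_chain` are hypothetical names, and the exact threshold comparison is an assumption based on the rationale.

```rust
/// Hypothetical, simplified stand-in for a trait like JoinHashMapType.
/// Callers only see u64 values, so they need no generic parameter.
trait IndexChain: Send + Sync {
    /// Store `prev` as the chain entry for `row`.
    fn set_next(&mut self, row: usize, prev: u64);
    /// Read the chain entry for `row`.
    fn next_of(&self, row: usize) -> u64;
    /// Width in bytes of each stored index.
    fn index_width_bytes(&self) -> usize;
}

struct ChainU32 {
    next: Vec<u32>,
}

struct ChainU64 {
    next: Vec<u64>,
}

impl IndexChain for ChainU32 {
    fn set_next(&mut self, row: usize, prev: u64) {
        // Panics if the value does not fit; the constructor below is
        // meant to rule that out by checking the row count first.
        self.next[row] = u32::try_from(prev).expect("index fits in u32");
    }
    fn next_of(&self, row: usize) -> u64 {
        u64::from(self.next[row])
    }
    fn index_width_bytes(&self) -> usize {
        4
    }
}

impl IndexChain for ChainU64 {
    fn set_next(&mut self, row: usize, prev: u64) {
        self.next[row] = prev;
    }
    fn next_of(&self, row: usize) -> u64 {
        self.next[row]
    }
    fn index_width_bytes(&self) -> usize {
        8
    }
}

/// Pick the index width once, at runtime, from the build-side row count.
fn new_chain(num_rows: usize) -> Box<dyn IndexChain> {
    if num_rows < u32::MAX as usize {
        Box::new(ChainU32 { next: vec![0; num_rows] })
    } else {
        Box::new(ChainU64 { next: vec![0; num_rows] })
    }
}
```

Because the width is hidden behind the boxed trait object, code that holds the `Box<dyn IndexChain>` stays non-generic; the cost is a dynamic dispatch per call.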

Are these changes tested?

Yes, I added a test that checks the hashmap created using u32 indices.

jonathanc-n · 2025-06-18T01:21:26Z

cc @Dandandan

alamb · 2025-06-18T21:31:25Z

🤖 ./gh_compare_branch.sh Benchmark Script Running
Linux aal-dev 6.11.0-1015-gcp #15~24.04.1-Ubuntu SMP Thu Apr 24 20:41:05 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing support-u32-hashmap (732fa21) to 056f546 diff
Benchmarks: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

alamb · 2025-06-18T22:11:26Z

🤖: Benchmark completed

Comparing HEAD and support-u32-hashmap
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ support-u32-hashmap ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │  1891.38 ms │          1886.58 ms │     no change │
│ QQuery 1     │   697.56 ms │           704.74 ms │     no change │
│ QQuery 2     │  1381.91 ms │          1361.41 ms │     no change │
│ QQuery 3     │   669.87 ms │           634.92 ms │ +1.06x faster │
│ QQuery 4     │  1327.15 ms │          1339.32 ms │     no change │
│ QQuery 5     │ 14912.55 ms │         14887.37 ms │     no change │
│ QQuery 6     │  2044.07 ms │          2059.57 ms │     no change │
│ QQuery 7     │  1784.47 ms │          1873.16 ms │     no change │
│ QQuery 8     │   797.99 ms │           802.46 ms │     no change │
└──────────────┴─────────────┴─────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 25506.95ms │
│ Total Time (support-u32-hashmap)   │ 25549.53ms │
│ Average Time (HEAD)                │  2834.11ms │
│ Average Time (support-u32-hashmap) │  2838.84ms │
│ Queries Faster                     │          1 │
│ Queries Slower                     │          0 │
│ Queries with No Change             │          8 │
│ Queries with Failure               │          0 │
└────────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ support-u32-hashmap ┃       Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ QQuery 0     │    15.40 ms │            15.56 ms │    no change │
│ QQuery 1     │    33.85 ms │            33.67 ms │    no change │
│ QQuery 2     │    81.80 ms │            80.27 ms │    no change │
│ QQuery 3     │    94.51 ms │            98.91 ms │    no change │
│ QQuery 4     │   580.34 ms │           587.97 ms │    no change │
│ QQuery 5     │   818.51 ms │           852.27 ms │    no change │
│ QQuery 6     │    22.18 ms │            23.65 ms │ 1.07x slower │
│ QQuery 7     │    36.35 ms │            36.60 ms │    no change │
│ QQuery 8     │   840.07 ms │           846.70 ms │    no change │
│ QQuery 9     │  1163.06 ms │          1126.59 ms │    no change │
│ QQuery 10    │   256.44 ms │           253.13 ms │    no change │
│ QQuery 11    │   284.27 ms │           275.93 ms │    no change │
│ QQuery 12    │   855.55 ms │           892.86 ms │    no change │
│ QQuery 13    │  1240.05 ms │          1256.85 ms │    no change │
│ QQuery 14    │   788.52 ms │           799.55 ms │    no change │
│ QQuery 15    │   756.94 ms │           759.38 ms │    no change │
│ QQuery 16    │  1588.12 ms │          1573.61 ms │    no change │
│ QQuery 17    │  1596.56 ms │          1595.53 ms │    no change │
│ QQuery 18    │  2844.62 ms │          2904.45 ms │    no change │
│ QQuery 19    │    82.55 ms │            85.96 ms │    no change │
│ QQuery 20    │  1133.70 ms │          1187.01 ms │    no change │
│ QQuery 21    │  1273.35 ms │          1310.45 ms │    no change │
│ QQuery 22    │  2112.94 ms │          2171.74 ms │    no change │
│ QQuery 23    │  7356.55 ms │          7416.88 ms │    no change │
│ QQuery 24    │   427.69 ms │           444.19 ms │    no change │
│ QQuery 25    │   299.39 ms │           302.95 ms │    no change │
│ QQuery 26    │   433.22 ms │           449.51 ms │    no change │
│ QQuery 27    │  1555.11 ms │          1553.18 ms │    no change │
│ QQuery 28    │ 11668.95 ms │         11887.97 ms │    no change │
│ QQuery 29    │   519.95 ms │           508.95 ms │    no change │
│ QQuery 30    │   768.44 ms │           778.51 ms │    no change │
│ QQuery 31    │   799.04 ms │           818.27 ms │    no change │
│ QQuery 32    │  2409.37 ms │          2374.14 ms │    no change │
│ QQuery 33    │  3138.93 ms │          3153.14 ms │    no change │
│ QQuery 34    │  3162.98 ms │          3160.19 ms │    no change │
│ QQuery 35    │  1212.81 ms │          1231.29 ms │    no change │
│ QQuery 36    │   123.25 ms │           125.52 ms │    no change │
│ QQuery 37    │    57.19 ms │            57.98 ms │    no change │
│ QQuery 38    │   127.02 ms │           124.89 ms │    no change │
│ QQuery 39    │   195.28 ms │           197.94 ms │    no change │
│ QQuery 40    │    47.18 ms │            48.68 ms │    no change │
│ QQuery 41    │    45.15 ms │            42.97 ms │    no change │
│ QQuery 42    │    38.79 ms │            39.72 ms │    no change │
└──────────────┴─────────────┴─────────────────────┴──────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 52886.01ms │
│ Total Time (support-u32-hashmap)   │ 53485.49ms │
│ Average Time (HEAD)                │  1229.91ms │
│ Average Time (support-u32-hashmap) │  1243.85ms │
│ Queries Faster                     │          0 │
│ Queries Slower                     │          1 │
│ Queries with No Change             │         42 │
│ Queries with Failure               │          0 │
└────────────────────────────────────┴────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Query        ┃      HEAD ┃ support-u32-hashmap ┃    Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ QQuery 1     │ 100.68 ms │            99.40 ms │ no change │
│ QQuery 2     │  21.39 ms │            21.69 ms │ no change │
│ QQuery 3     │  32.64 ms │            32.28 ms │ no change │
│ QQuery 4     │  18.13 ms │            18.58 ms │ no change │
│ QQuery 5     │  48.89 ms │            49.42 ms │ no change │
│ QQuery 6     │  11.87 ms │            11.86 ms │ no change │
│ QQuery 7     │  87.37 ms │            83.45 ms │ no change │
│ QQuery 8     │  23.84 ms │            23.90 ms │ no change │
│ QQuery 9     │  53.64 ms │            53.53 ms │ no change │
│ QQuery 10    │  42.80 ms │            42.86 ms │ no change │
│ QQuery 11    │  11.21 ms │            11.22 ms │ no change │
│ QQuery 12    │  34.76 ms │            34.72 ms │ no change │
│ QQuery 13    │  26.15 ms │            26.24 ms │ no change │
│ QQuery 14    │   9.82 ms │             9.88 ms │ no change │
│ QQuery 15    │  19.41 ms │            19.59 ms │ no change │
│ QQuery 16    │  18.81 ms │            18.85 ms │ no change │
│ QQuery 17    │  96.08 ms │            94.52 ms │ no change │
│ QQuery 18    │ 193.90 ms │           189.03 ms │ no change │
│ QQuery 19    │  25.54 ms │            25.12 ms │ no change │
│ QQuery 20    │  33.31 ms │            31.69 ms │ no change │
│ QQuery 21    │ 148.25 ms │           146.30 ms │ no change │
│ QQuery 22    │  15.18 ms │            15.31 ms │ no change │
└──────────────┴───────────┴─────────────────────┴───────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 1073.66ms │
│ Total Time (support-u32-hashmap)   │ 1059.45ms │
│ Average Time (HEAD)                │   48.80ms │
│ Average Time (support-u32-hashmap) │   48.16ms │
│ Queries Faster                     │         0 │
│ Queries Slower                     │         0 │
│ Queries with No Change             │        22 │
│ Queries with Failure               │         0 │
└────────────────────────────────────┴───────────┘

jonathanc-n · 2025-06-18T22:36:42Z

Those benchmarks make sense; this change only saves memory.

Copilot

Pull Request Overview

Adds support for using 32-bit row indices in the hash join executor when the build side has fewer than u32::MAX rows to save memory.

Introduce JoinHashMapType trait with JoinHashMapU32 and JoinHashMapU64 implementations
Change all callers and constructors of the old JoinHashMap to use Box<dyn JoinHashMapType>
Update memory estimation to pick u32 or u64 variant based on row count and add corresponding tests

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
joins/utils.rs	Remove obsolete `JoinHashMap` export, keep only `JoinHashMapType`
joins/symmetric_hash_join.rs	Update `get_matched_indices` calls to accept boxed iterator
joins/stream_join_utils.rs	Implement `JoinHashMapType` for `PruningJoinHashMap` using new helpers
joins/join_hash_map.rs	Define `JoinHashMapType`, `JoinHashMapU32`, `JoinHashMapU64`, and generic helpers
joins/hash_join.rs	Construct boxed hash map variant, update memory estimation, and adjust tests

Comments suppressed due to low confidence (3)

datafusion/physical-plan/src/joins/join_hash_map.rs:38

[nitpick] Clarify this comment to indicate that the chained list uses either Vec<u32> or Vec<u64> rather than both. E.g., “stored as either Vec<u32> or Vec<u64> based on size requirements.”

/// The indices (values) are stored in a separate chained list stored as `Vec<u32>` `Vec<u64>`.

datafusion/physical-plan/src/joins/hash_join.rs:3581

[nitpick] The test name now differs from the u32 variant (which uses collisions plural). For consistency, consider renaming both tests to match a common pattern, e.g., join_with_hash_collision_u32 and join_with_hash_collision_u64.

    fn join_with_hash_collisions_u64() -> Result<()> {

datafusion/physical-plan/src/joins/join_hash_map.rs:93

[nitpick] Add a doc comment to JoinHashMapType explaining its purpose and when each method should be used; this will help maintainers understand the runtime‐selected index strategy.

pub trait JoinHashMapType: Send + Sync {

datafusion/physical-plan/src/joins/hash_join.rs

Copilot · 2025-06-19T11:34:53Z

datafusion/physical-plan/src/joins/stream_join_utils.rs

+        // Flatten the deque
+        let next: Vec<u64> = self.next.iter().copied().collect();
+        get_matched_indices::<u64>(&self.map, &next, iter, deleted_offset)


[nitpick] Collecting the VecDeque into a new Vec<u64> on every call can be expensive. Consider reusing a buffer or accessing the deque’s contiguous slice when possible to reduce allocations.

Suggested change

-        // Flatten the deque
-        let next: Vec<u64> = self.next.iter().copied().collect();
-        get_matched_indices::<u64>(&self.map, &next, iter, deleted_offset)
+        // Access the deque's slices directly
+        let (head, tail) = self.next.as_slices();
+        get_matched_indices::<u64>(&self.map, head, tail, iter, deleted_offset)

This seems a little difficult to do without changing the function parameters.
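For context on the trade-off discussed here, a minimal sketch of reading a `VecDeque<u64>` through its two slices without copying it. The helpers `get_from_slices` and `sum_without_collect` are hypothetical; they only illustrate why a function written against a single `&[u64]` would need a different signature to accept both halves.

```rust
use std::collections::VecDeque;

/// Look up logical position `i` across the two halves of a deque.
/// Hypothetical helper, not part of DataFusion.
fn get_from_slices(head: &[u64], tail: &[u64], i: usize) -> Option<u64> {
    if i < head.len() {
        head.get(i).copied()
    } else {
        tail.get(i - head.len()).copied()
    }
}

/// Visit every element in logical order without allocating a Vec.
fn sum_without_collect(next: &VecDeque<u64>) -> u64 {
    let (head, tail) = next.as_slices();
    head.iter().chain(tail.iter()).sum()
}
```

The standard library also offers `VecDeque::make_contiguous`, which returns a single slice, but it takes `&mut self` and may move elements, so it only helps where the deque can be borrowed mutably.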

feat: Support u32 indices for HashJoinExec

18e8ad8

github-actions bot added the physical-plan Changes to the physical-plan crate label Jun 18, 2025

Merge branch 'main' into support-u32-hashmap

2575a3e

fix: Fix NullEquality conflicts

732fa21

alamb requested review from Dandandan and Copilot June 19, 2025 11:32

Copilot AI reviewed Jun 19, 2025

jonathanc-n added 2 commits June 19, 2025 09:26

Merge branch 'main' into support-u32-hashmap

7b40015

fixes

5e24b78
