Skip to content

Custom key comparison #108

@matthew-mcallister

Description

@matthew-mcallister

Would it be desirable and feasible to support user-supplied key comparison? Basically an equivalent feature to comparators in RocksDB.

Byte-wise comparison is great for many use cases, particularly fixed-size keys and UTF-8 data. However, it leaves something to be desired when working with variable-length binary keys.

Example 1

Let's say my key type is (String, u8) and we want to store two keys, ("a", 255) and ("ab", 0). Under lexicographical ordering, ("a", 255) comes before ("ab", 0). If we concatenate the two components into a Vec<u8>, then we have the encodings

("a", 255) => 61 FF
("ab", 0)  => 61 62 00

You can see how this flips the order of the two keys. I can work around this by using null-terminated strings, so the keys become

("a", 255) => 61 00 FF
("ab", 0)  => 61 62 00 FF

which gives the correct ordering at the cost of converting my Rust strings to C strings and not supporting UTF-8 null.

Example 2

Let's say the key type is (Vec<u8>, u8). Here, we're in even more trouble, because we can't use the null terminator trick. Take (b"a", 255) and (b"a\0b", 255) as examples. The first key should come before the second, but there's no cheap way to ensure this. We have to re-encode the byte strings into another format where the null-terminator trick works, such as null-terminated hex strings.

Custom comparison

With custom comparisons, you can encode keys using any fast, zero-copy representation you want and still have precise control over ordering. In the examples above, this would look like

fn my_comparison_operator(left: &[u8], right: &[u8]) -> std::cmp::Ordering {
    std::cmp::Ord::cmp(
        &(&left[..left.len() - 1], left[left.len() - 1]),
        &(&right[..right.len() - 1], right[right.len() - 1])
    )
}

Metadata

Metadata

Assignees

No one assigned

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions