Skip to content

Add ((un)checked_)exact_div methods for integer types #337

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
newpavlov opened this issue Feb 17, 2024 · 23 comments
Closed

Add ((un)checked_)exact_div methods for integer types #337

newpavlov opened this issue Feb 17, 2024 · 23 comments
Labels
ACP-accepted API Change Proposal is accepted (seconded with no objections) api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api

Comments

@newpavlov
Copy link

newpavlov commented Feb 17, 2024

Proposal

Problem statement

In some cases it desirable to get division result and ensure that the division has no remainder. For example, a function which works with memory mappings may accept size in bytes, but it requires that the size must be multiple of OS page size. Today we have to insert separate checks (e.g. assert_eq!(size / OS_PAGE_SIZE, 0)) somewhere near the division. It makes code less concise and not great at expressing intent.

Motivating examples or use cases

In our code we have code which works with pages and accepts size in bytes. It requires that the size should be multiple of the page size. The size is provided in bytes for user convenience and because page size can depend on OS and constants.

So we have a bunch of code like this:

const PAGE_SIZE: usize = 1 << 14;
const OS_PAGE_SIZE: usize = 1 << 12;

// somewhere else in the project
let os_pages: usize = pages_len * (PAGE_SIZE / OS_PAGE_SIZE);
// the constants defined in a separate faraway module,
// so to make correctness reasoning local, we assert it near the division
assert_eq(PAGE_SIZE % OS_PAGE_SIZE, 0);

// another place
let pages_to_map: usize = cfg.mapping_size / PAGE_SIZE;
if cfg.mapping_size / PAGE_SIZE {
    return Err(Error::NotPageMultiple);
}
let pages_to_map: u32 = pages_to_map.try_into().map_err(|_| Error::TooBig)?;

To simplify the code we can introduce a helper extension trait and implement it for necessary integer types, but we need to introduce the trait and import it every time the methods are used. Also in a multi-crate projects it becomes even more annoying. We either have to define the trait in each crate where it needed or introduce a whole separate crate for this small functionality.

Solution sketch

Introduce the following methods to all integers through int_impl! and uint_impl! macros:

/// Checked integer division without remainder. Computes `self / rhs`,
/// returning `None` if `rhs == 0` or if division remainder is not zero.
pub const fn checked_exact_div(self, rhs: Self) -> Option<Self> { ... }

/// Integer division without remainder. Computes `self / rhs`,
/// panics if `rhs == 0` or if division remainder is not zero.
pub const fn exact_div(self, rhs: Self) -> Self { ... }

/// Integer division without remainder.
/// Unchecked version of `exact_div`.
pub unsafe const fn unchecked_exact_div(self, rhs: Self) -> Self { ... }

For signed integer the conditions also include potential overflow when MIN is divided by -1.

Alternatives

Users may use explicit checks based on the remainder operator or custom traits as shown in the examples section. Arguably, the functionality is too small for a separate crate.

Links and related work

rust-lang/rust#116632

@newpavlov newpavlov added api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api labels Feb 17, 2024
@pitaj
Copy link

pitaj commented Feb 17, 2024

For the const case, as in your example, it's best to do that assertion at compile time:

const PAGE_SIZE: usize = 1 << 14;
const OS_PAGE_SIZE: usize = 1 << 12;
const _: () = {
    assert!(PAGE_SIZE % OS_PAGE_SIZE == 0);
};

I recommend you provide a different example that shows how this method would be beneficial at runtime.

@newpavlov
Copy link
Author

newpavlov commented Feb 17, 2024

The point is that constant definitions with static assertions (we use them in our code, but they are not relevant here) can be quite far away from operations which depend on the assert. The motivation for the proposed methods is to clearly express that we expect that two numbers divide without remainder without any additional clutter associated with asserts or branches.

Also, the second part of the example is for runtime values.

@pitaj
Copy link

pitaj commented Feb 17, 2024

In that case, you should provide a const PAGES_PER_OS_PAGE or something that does the division (and remainder check) ahead of time, and use that everywhere instead.

I missed the runtime example cfg.mapping_size, but even in that case maybe you should do the division when you load the config and store the result in the cfg struct rather than having divisions all throughout your codebase.

All that said, this seems pretty niche. Also for the name, I'm not a fan of norem, maybe exact or factor instead?

@pitaj
Copy link

pitaj commented Feb 18, 2024

Here's another alternative. If we had a std version of div_rem then you could match on the result:

if let (quotient, 0) = whatever.div_rem(PAGE_SIZE) {
    // divides evenly
} else {
    // there was a remainder
}

@agausmann
Copy link

agausmann commented Dec 13, 2024

One usecase for this is a subproblem of Advent of Code 2024 day 13 (spoiler ahead).

The problem overall deals with integer systems of equations, and it is interested in whether or not the solution values are integers.

At the final step of my solver, it has integer equations of the form A*x = B and C*y = D where x,y are unknowns. The solution is x = B / A and y = D / C, but first it has to check B % A == 0 and D % C == 0.

if b % a == 0 && d % c == 0 {
    Some((b / a, d / c))
} else {
    None
}

With a function like checked_norem_div, it could be written simply as

b.checked_norem_div(a).zip(d.checked_norem_div(c))

Before finding this issue, the name that I thought of is exact_div - only return the result of a / b if it is exactly an integer (i.e., if a is exactly a multiple of b)

@tgross35
Copy link

Before finding this issue, the name that I thought of is exact_div - only return the result of a / b if it is exactly an integer (i.e., if a is divisible by b)

The name that popped into my head was div_exact, maybe this bikeshed is on to something :)

@agausmann
Copy link

agausmann commented Dec 13, 2024

At first, I did think of div_exact, but every math operation has the modifier before the root (checked_, overflowing_, wrapping_, strict_, etc)

@agausmann
Copy link

*Except for _euclid and _signed

@kennytm
Copy link
Member

kennytm commented Dec 14, 2024

We already have std::intrinsics::exact_div at home1. The stable API name should better follow this.

Footnotes

  1. it is used to support as_chunks_unchecked and align_offset

@shepmaster
Copy link
Member

My AoC solution used:

fn evenly_divide(n: u64, d: u64) -> Option<u64> {
    (n % d == 0).then_some(n / d)
}

@scottmcm
Copy link
Member

Note that the exact_div intrinsic is unsafe because we tell LLVM it's guaranteed.

Makes me wonder if it makes sense to have a safe -> Option<T> version and an unsafe -> T version.

(And I wonder if it's make sense to have at least the unsafe version take T, NonZero<T> to help emphasize the extra unsafety condition.)

@RalfJung
Copy link
Member

We already have std::intrinsics::exact_div at home1. The stable API name should better follow this.

Unstable intrinsics are not a precedent for stable names by any means. They typically get added without much thought about the name.

So, we can take it as one data point, but it's completely fine if the stable API has a different name -- this happens all the time. Sometimes we rename the intrinsic, many times we do not bother.

@abgros
Copy link

abgros commented Mar 31, 2025

So, we can take it as one data point, but it's completely fine if the stable API has a different name -- this happens all the time. Sometimes we rename the intrinsic, many times we do not bother.

I think exact_div is a great name because it describes exactly what the operation does without being cumbersome to write. And at least for me, it takes a second to read "norem" as "no remainder" (but maybe I could get used to it...).

@newpavlov
Copy link
Author

I don't have any particular attachment to the norem name, so I am fine with renaming the methods to checked_exact_div/unchecked_exact_div/exact_div.

@newpavlov newpavlov changed the title Add (checked_)norem_div methods for integer types Add ((un)checked_)exact_div methods for integer types Mar 31, 2025
@Amanieu Amanieu added the I-libs-api-nominated Indicates that an issue has been nominated for discussion during a team meeting. label Apr 13, 2025
@Amanieu
Copy link
Member

Amanieu commented Apr 13, 2025

I just encountered a use case for this where a SIMD search returns a bit mask which always contains an even number of one bits. The bit count needs to be divided by 2, but this is always an exact division due to the way the mask is calculated. By using exact_div the compiler can eliminate a right shift instruction by folding it into a later left shift when the result is used as an array offset. This is only possible when the bit shifted out is 0.

unsafe fn search64(keys: *const i64, search: i64) -> usize {
    unsafe {
        let search = _mm256_set1_epi64x(search);
        let cmp = |offset: usize| {
            let vec = _mm256_load_si256(keys.cast::<__m256i>().add(offset));
            _mm256_cmpgt_epi64(search, vec)
        };
        let a = cmp(0);
        let b = cmp(1);
        let c = cmp(2);
        let d = cmp(3);
        let tmp1 = _mm256_blend_epi32(a, b, 0b01010101);
        let tmp2 = _mm256_blend_epi32(c, d, 0b01010101);
        let tmp3 = _mm256_packs_epi32(tmp1, tmp2);
        let mask = _mm256_movemask_epi8(tmp3);
        mask.count_ones() as usize / 2
    }
}

Regarding naming, I don't see much use in the panicking version of exact_div. I would instead prefer only have 2 methods:

pub const fn exact_div(self, rhs: Self) -> Option<Self> { ... }
pub unsafe const fn exact_div_unchecked(self, rhs: Self) -> Self { ... }

I've put the _unchecked at the back since this is the convention for unchecked versions of methods returning Option.

@newpavlov
Copy link
Author

I don't see much use in the panicking version of exact_div

I disagree. As I wrote in the OP, there is a plenty of cases where we know that we divide without remainder and it would be a bug if this assumption is wrong for some reason. We would have to add a bunch of annoying unwraps everywhere in such cases. Plus your proposal would not be consistent with other methods.

@BurntSushi
Copy link
Member

I don't have a strong opinion on whether we should have a panicking version of this or not (although I do weakly think we should), but I do somewhat strongly feel that if we have an exact division method returning an Option<Self> on an integer type, then its name should start with checked_. As of today, every single method on integer types (as far as I can see) that returns an Option starts with a checked_ prefix. I'm not one for a foolish consistency, but I think it makes sense here.

Overall +1 on adding these. Seems like we can probably mark this as accepted and hash out the naming later.

@pitaj
Copy link

pitaj commented Apr 13, 2025

Maybe the panicking version should be called strict_exact_div for similar reasons.

@Amanieu
Copy link
Member

Amanieu commented Apr 15, 2025

We discussed this in the libs-api meeting. We see value in both checked and unchecked versions of this. There were some concerns about the panicking exact_div and how useful it would be, but these can be deferred until stabilization.

This ACP can be closed once a tracking issue has been created.

@Amanieu Amanieu added ACP-accepted API Change Proposal is accepted (seconded with no objections) and removed I-libs-api-nominated Indicates that an issue has been nominated for discussion during a team meeting. labels Apr 15, 2025
@scottmcm
Copy link
Member

David: the optimization that it enables is that x / 3 == 0 optimizes to x < 3, but with exact div it optimizes to x == 0.

(The unsafe intrinsic version was originally added for slice iterators, so that the (end - start) / SIZE calculation of ESI::len optimized to start == end in ESI::is_empty.)

@newpavlov
Copy link
Author

Tracking issue: rust-lang/rust#139911

@TDecking
Copy link

As for naming these functions, I want to point out that on the bignum side, the malachite and rug crates provide div_exact, but not exact_div.

@newpavlov
Copy link
Author

On the other hand we have the exact_div intrinsic and a bunch of existing *_div methods. The only exception is the *_div_euclid methods.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ACP-accepted API Change Proposal is accepted (seconded with no objections) api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api
Projects
None yet
Development

No branches or pull requests