Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking Issue for slice::array_chunks #74985

Open
5 tasks
lcnr opened this issue Jul 31, 2020 · 73 comments
Open
5 tasks

Tracking Issue for slice::array_chunks #74985

lcnr opened this issue Jul 31, 2020 · 73 comments
Labels
A-array Area: `[T; N]` A-const-generics Area: const generics (parameters and arguments) A-slice Area: `[T]` C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC F-generic_const_exprs `#![feature(generic_const_exprs)]` I-libs-api-nominated Nominated for discussion during a libs-api team meeting. Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@lcnr
Copy link
Contributor

lcnr commented Jul 31, 2020

The feature gate for the issue is #![feature(array_chunks)].

Also tracks as_(r)chunks(_mut) in #![feature(slice_as_chunks)].

About tracking issues

Tracking issues are used to record the overall progress of implementation.
They are also uses as hubs connecting to other relevant issues, e.g., bugs or open design questions.
A tracking issue is however not meant for large scale discussion, questions, or bug reports about a feature.
Instead, open a dedicated issue for the specific matter and add the relevant feature gate label.

Steps

Unresolved Questions

Implementation history

@lcnr lcnr added C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC A-const-generics Area: const generics (parameters and arguments) F-const_generics `#![feature(const_generics)]` T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Jul 31, 2020
@KodrAus KodrAus added Libs-Tracked Libs issues that are tracked on the team's project board. A-slice Area: `[T]` labels Aug 6, 2020
bors added a commit to rust-lang-ci/rust that referenced this issue Sep 12, 2020
Add `slice::array_chunks_mut`

This follows `array_chunks` from rust-lang#74373 with a mutable version, `array_chunks_mut`. The implementation is identical apart from mutability. The new tests are adaptations of the `chunks_exact_mut` tests, plus an inference test like the one for `array_chunks`.

I reused the unstable feature `array_chunks` and tracking issue rust-lang#74985, but I can separate that if desired.

r? `@withoutboats`
cc `@lcnr`
tmandry added a commit to tmandry/rust that referenced this issue Sep 16, 2020
Add array_windows fn

This mimicks the functionality added by array_chunks, and implements a const-generic form of
`windows`. It makes egregious use of `unsafe`, but by necessity because the array must be
re-interpreted as a slice of arrays, and unlike array_chunks this cannot be done by casting the
original array once, since each time the index is advanced it needs to move one element, not
`N`.

I'm planning on adding more tests, but this should be good enough as a premise for the functionality.
Notably: should there be more functions overwritten for the iterator implementation/in general?

~~I've marked the issue as rust-lang#74985 as there is no corresponding exact issue for `array_windows`, but it's based of off `array_chunks`.~~

Edit: See Issue rust-lang#75027 created by @lcnr for tracking issue

~~Do not merge until I add more tests, please.~~

r? @lcnr
JohnTitor added a commit to JohnTitor/rust that referenced this issue Oct 26, 2020
…lbertodt

Add [T]::as_chunks(_mut)

Allows getting the slices directly, rather than just through an iterator as in `array_chunks(_mut)`.  The constructors for those iterators are then written in terms of these methods, so the iterator constructors no longer have any `unsafe` of their own.

Unstable, of course. rust-lang#74985
@joshlf
Copy link
Contributor

joshlf commented Feb 10, 2021

Is there any intention of providing the reverse versions of these functions?

@thomcc
Copy link
Member

thomcc commented Apr 25, 2021

Is there any intention of providing the reverse versions of these functions?

What would that look like?

@joshlf
Copy link
Contributor

joshlf commented Apr 26, 2021

Same API, but iterating backwards. I have a use case in which I need to convert either a prefix or a suffix of a slice to an array reference. This API is perfect for the prefix use case, and I'd love to be able to use it for the suffix use case as well.

@newpavlov
Copy link
Contributor

@joshlf
Both slice of arrays and custom iterator type would implement the DoubleEndedIterator, so your use case does not need a separate reverse method.

@joshlf
Copy link
Contributor

joshlf commented Apr 26, 2021

That's only true if N evenly divides the length of the slice (see array_chunks docs). Otherwise, the remaining elements are only accessible via the remainder method, and won't be iterated over in the DoubleEndedIterator implementation.

@newpavlov
Copy link
Contributor

Ah, I misunderstood your use case, you want [0, 1, 2, 3, 4, 5, 6] to be split into [4, 5, 6], [1, 2, 3], [0], not into [6], [3, 4, 5], [0, 1, 2]. It indeed requires a separate method.

@joshlf
Copy link
Contributor

joshlf commented Apr 26, 2021

Yep, exactly 😃

@thomcc
Copy link
Member

thomcc commented Apr 27, 2021

Yeah, array_rchunks is definitely worth having.

@sffc
Copy link

sffc commented May 12, 2021

Also useful would be the inverse operation: [[0, 1, 2], [3, 4, 5]] to [0, 1, 2, 3, 4, 5]. (from &[[T; 3]] to &[T])

@ChayimFriedman2
Copy link
Contributor

Do we need to wait until const_evaluatable_unchecked is stabilized so we can deny N = 0 at compile time to stabilize this feature? Perhaps we can just mention in the documentation that currently it panics but it may be uplifted to a compile time error in the future.

@fzyzcjy
Copy link

fzyzcjy commented Oct 7, 2021

Hi, is there any alternatives before this is stable? Thanks!

@CryZe
Copy link
Contributor

CryZe commented Oct 7, 2021

Hi, is there any alternatives before this is stable? Thanks!

Yeah bytemuck::cast_slice and bytemuck::cast_slice_mut can do mostly the same (with a Pod requirement though)

@fzyzcjy
Copy link

fzyzcjy commented Oct 7, 2021

Thanks!

@traviscross
Copy link
Contributor

traviscross commented Feb 24, 2025

Maybe neither here nor there, but note that the following works:

const fn divide<const D: u64>(n: u64) -> u64 {
    const { assert!(D > 0) }; // <--- Post-mono assertion.
    n / D
}

const fn divide_checked<const D: u64>(n: u64) -> Option<u64> {
    match D {
        0 => None,
        1.. => Some(divide::<D>(n)),
    }
}

fn main() {
    dbg!({const { divide_checked::<1>(1) }});
    dbg!({const { divide_checked::<0>(1) }}); // <--- Look here.
    dbg!(divide_checked::<1>(1));
}

Playground link

That is, if the function within which we want to compile-time assert is parametric over a type or const generic, is const, and we're always in a const context when we'd see the "bad" generic argument for it, then something like what we want here does work currently.

(The question of whether we'll guarantee to not later disallow this is #122301.)

Interestingly, this is the odd case of a call that we allow in a const context but disallow outside of one.

@the8472
Copy link
Member

the8472 commented Feb 25, 2025

From the options in #74985 (comment) we would like a combination of point 2 and 3.

Runtime panics are not great because they fail it compiles, it works or make illegal states non-representatble principles, for precisely the thing that we're making legible to the type system.

The current CTFE approach, i.e. post-mono errors, is suboptimal because it would not be forwards-compatible with pattern types or where bounds. It would also bet difficult to work with for a more-generic caller. The const-trick from #122301 ¹² would help with the latter issue but not with the former. It's also a somewhat obscure trick.

So we'd like where N > 0 or where N: usize is 1.. bounds on array_chunks. That's the first language feature we'd need. But on its own that would be insufficient for generic callers because they wouldn't have any way to generalize to all N via some fallback.

So the second desired feature is a way for generic callers to locally discharge the restriction. E.g. const if, inline where expressions or unifying const generic impls

// std function
fn chunks<const N: usize>()
where
    const { N > 0 },
{
    // ...
}

// user code
fn use_chunks<const N: usize>()  {
    where N > 0 {
       chunks<N>();
    } else {
       fallback();
    }
}

¹ That issue was actually motivated by this one, but it turned out insufficient.
² Additional prior discussion: rust-lang/rfcs#3582 (comment)

@Noratrieb
Copy link
Member

Noratrieb commented Feb 26, 2025

I agree with you that this would be a nice way to implement this. But I think this feature is so incredibly far off that we really shouldn't be blocking this very useful helper (that replaced chunks_exact, which has a lot more room for bugs) for literal decades because of one of its weird error edge cases.
Passing 0 is a mistake, and I don't think a lot of people will run into it. It's an obscure edge case, and I think both suggested ways of detecting it (post mono error or runtime error) are perfectly fine in practice, so I really think we should stabilize it with one of the two (whatever you prefer) instead of blocking it for decades.

@Noratrieb
Copy link
Member

I would suggest going with a post mono error, as that has significant advantages. The only disadvantage is that it may impede flexibility. After stabilizing, in the post mono error, we could link to a tracking issue where people can complain if their use case was blocked by this inflexibility, and we should be able to make it a runtime error in case it's a problem (since that's backwards compatible as long as we document that people shouldn't rely on it being a post mono error).

@workingjubilee
Copy link
Member

Would any of the people involved in the current state of the relevant const generic bounds involving pattern (types?)... I'm not entirely sure if I should be asking @oli-obk or @BoxyUwU... be able to weigh in on the projected development cycle for such a feature as libs-api is asking? @Noratrieb has projected "literal decades" and I have no reason to doubt such a guess, currently, but are we talking 1 or 10?

@oli-obk
Copy link
Contributor

oli-obk commented Feb 26, 2025

Uh, please do not put pattern types into the consideration for any API decisions. They are an internal language feature and will stay so for the forseeable future. Getting them anywhere production ready is definitely off far enough that waiting for them is at present like waiting for specialization to be stable. If you'd like to know more, join https://rust-lang.zulipchat.com/#narrow/channel/481660-t-lang.2Fpattern-types

@BoxyUwU
Copy link
Member

BoxyUwU commented Feb 26, 2025

I think there is basically no universe in which we get N > 0 bounds in the next 2-3 years, it wouldn't surprise me if it took much longer than that even. Supporting fn foo<const N: NonZeroU32> is likely to be available much sooner (maybe the next year or so?), though the ability to take a generic const N: usize and pass that to a const M: NonZeroU32 is once again very far off. rust-lang/project-const-generics#26 I also do not see happening for a long time, it would definitely be blocked on -Znext-solver being stable, and I don't know that anyone has really sat down and properly thought out how it should even be implemented (so lets also say definitely not happening in the next 2-3 years for this too).

One thing that could be interesting is adding an unstable attribute for const generics that requires the argument to be fully concrete, i.e. forbid foo::<N> and only allow foo::<1> or foo::<{ MY_CONST_ITEM }>. This combined with PMEs would be fully forwards compatible (ish?) with whatever design we wind up with down the road for bounding const parameters (though it would rule out using const N: NonZeroUsize in the future, but possible we could add some hacks to the compiler if that winds up being the desired route). This wouldn't be as flexible on stabilization as desired but it seems more flexible than an error saying the function is unstable.

@the8472
Copy link
Member

the8472 commented Feb 26, 2025

Ok then our options are

Uh, please do not put pattern types into the consideration for any API decisions.

I was gesturing at a general category of features to solve our problems by limiting the range of valid const values. If NonZeroUsize is easier than pattern types or where bounds that's useful information. If there are any other options then we'd like to hear them too.

@oli-obk
Copy link
Contributor

oli-obk commented Feb 26, 2025

I think just a mono time error seems fine. There are enough APIs that we would like to improve with better const checks, this one isn't particularly special. We can just ship it and handle it however we'll handle these in the future. Even a panic doesn't seem bad, we do that for lots of APIs in libstd

@the8472
Copy link
Member

the8472 commented Feb 26, 2025

There are enough APIs that we would like to improve with better const checks

Which stable ones are you thinking of? There are a bunch of array ones like this one that have been kept unstable due to this issue.

We can just ship it and handle it however we'll handle these in the future.

I don't think anyone has mentioned a plan how to handle these other than living with the limitation.

@crumblingstatue
Copy link
Contributor

One thing that could be interesting is adding an unstable attribute for const generics that requires the argument to be fully concrete,

I think this might be worth looking into, if it's not too difficult to implement.
It would allow maximum forwards compatibility, and may also be helpful in stabilizing other const generics dependent APIs, like array_windows.

@BoxyUwU
Copy link
Member

BoxyUwU commented Feb 26, 2025

request a lang feature to require concrete calls so we can maybe upgrade later, but it'll be much later too

for the record I don't expect the lang feature for concrete calls to take very long. it's the kind of thing that could probably be implemented in a day or so. but the transitioning off it to a proper type system feature for requiring N > 0 would be much later

@traviscross
Copy link
Contributor

@rustbot labels -I-lang-nominated

We discussed this in lang triage today. In particular, it had been asserted and then repeated in this thread that the lang team prefers runtime panics over post-mono errors. That's not the case. We like errors to happen as soon as possible. So we prefer pre-mono errors over post-mono errors when possible, but similarly we prefer post-mono errors over runtime errors when that is possible and practical.

It's of course for libs-api to decide what's practical in this case.

Regarding what counsel we can provide on the language direction and timeline here, probably all of us want something in the spirit of const if, but as has been discussed on this thread, that or other similar solutions along this axis could take awhile.

@rustbot rustbot removed the I-lang-nominated Nominated for discussion during a lang team meeting. label Feb 26, 2025
@workingjubilee
Copy link
Member

workingjubilee commented Feb 26, 2025

Sounds good.

It seems like our preferred paths are the slightly hacky #[rustc_your_argument_must_be_this_concrete_to_pass] or the monomorphizing error.

As the compiler magic that prevents jamming in random expressions is

  • forward-compat with moving to a monomorphizing error
  • forward-compat with moving to a panic
  • NOT forward-compat with moving to a NonZeroUint type unless...
  • ...with coercion from uint types to NonZeroUint, where it is forward-compat

It seems like the logical first move would be the enforced concrete arguments, then?
It feels to me that we would not want NonZeroUint for this argument without something that could coerce values found to be in the, er, nonzero range? Because I expect we'll see something like this a lot in code: data.array_chunks::<16>()

@crumblingstatue
Copy link
Contributor

crumblingstatue commented Feb 27, 2025

One of the truly magical things about array_chunks (and similar APIS, like array_windows) is that the length can be inferred, so you can write very ergonomic code like

for [a, b, c] in slice.array_chunks() {
    // do stuff with a, b, and c
}

If we want a solution that enforces a non zero uint, we should take care that it doesn't break inference.

EDIT: Also, the inferred length should count as "concrete", if we implement a concrete enforcing attribute.

@scottmcm
Copy link
Member

I would say that lang wants compile-time checks where feasible, but also wants to avoid nuisance compilation failures that you can't suppress even though the thing it's complaining about is only in dead code.

TBH I'm tempted to just give this a runtime assert and a custom lint that's deny-by-default if it notices N is ever always zero. That'll catch the more likely mistake of somehow unintentionally inferring it as zero (maybe you typed if let ([[]], tail) instead of if let ([], tail), or something) without being annoying for the "look, I have a runtime check for N != 0 before calling it" case. And it's a really useful helper (though I'm of course biased) so I'd rather it be stable than perfect.

I'm reminded of the runtime assert in https://doc.rust-lang.org/nightly/std/primitive.slice.html#method.as_flattened, where I just think it's fine, and better than blocking arrays-of-ZSTs entirely in post-mono. (Though admittedly that one requires the combination of two unlikely things, not just one.)

@sffc
Copy link

sffc commented Feb 27, 2025

Maybe just give chunk size 0 a well-defined behavior such as never evaluating the loop expression?

And there can be a runtime panic in debug mode like for integer overflow.

I'd like to avoid a runtime panic being unconditionally compiled into code for N > 0. If we can assume that the compiler will eliminate it, maybe that's fine.

@GoldsteinE
Copy link
Contributor

It’s an if on a const expression, so surely it will be optimized out.

@Qix-
Copy link

Qix- commented Mar 3, 2025

Just a thought, could array chunks simply return an Option and document that it returns None if the window size is 0? It could even be extended such that if the windows size is 0 and the slice length is 0, that's well defined, returning Some.

Since the window size is constant, then in most cases it'd return Some unconditionally, which I assume would make any downstream users subject to constant folding optimizations. Also makes the behavior explicit, without needing any unstable syntax stuff.

@its-the-shrimp
Copy link
Contributor

Just a thought, could array chunks simply return an Option and document that it returns None if the window size is 0? It could even be extended such that if the windows size is 0 and the slice length is 0, that's well defined, returning Some.

Since the window size is constant, then in most cases it'd return Some unconditionally, which I assume would make any downstream users subject to constant folding optimizations. Also makes the behavior explicit, without needing any unstable syntax stuff.

This'd make the runtime value depend on the compile-time argument, which is not too good in terms of developer experience

@sffc
Copy link

sffc commented Mar 3, 2025

Maybe just give chunk size 0 a well-defined behavior such as never evaluating the loop expression?

Thought: with this approach, the loop expression becomes dead code, so it will generate a compiler warning.

@Kixunil
Copy link
Contributor

Kixunil commented Mar 4, 2025

Because I expect we'll see something like this a lot in code: data.array_chunks::<16>()

This is true. In a large-ish codebase I co-maintain we have 10 instances of chunks_exact. Every single one of them has a constant argument. Getting array_chunks there would be great.

Honestly I can't even think of a case when it'd make sense to need if N == { fallback() }. I don't think it's realistic that anyone will ever need it. This method is generally used for very specific purposes and there's no way to fulfill that purpose if your argument is 0. So postponing stabilization just to allow this fringe use cases doesn't make sense.

Regarding NonZeroUsize, I'd very much like to see that with coercion. It's indeed true that 9 out of those 10 cases I found use a literal, the other one uses a constant. Even having a general mechanism for these coercions could be very useful for other things as well:

pub struct EvenLen(usize); // useful for hex stuff?

impl EvenLen {
    #[const_coerce_literals] // this function gets implicitly called in const context when a literal is coerced
    pub const fn from_usize_unwrap(len: usize) -> Self {
        assert_eq!(len % 2, 0);
        EvenLen(len)
    }
}

But still, this can be added later. My point is that the functions really are used with literals so having NonZeroUsize without coercion might be even worse than runtime panic.

So in conclusion, I think the proposal requiring the generic argument to be concrete makes a lot of sense, is already very useful and and I don't see anything wrong with it. I'm even going to give a shot writing a helper extension trait to have array_chunks in stable.

Also what @crumblingstatue said about inferred values is interesting but ultimately if it's going to delay stabilization I think it's not worth it since spelling the constant is not that big deal.

@scottmcm
Copy link
Member

Come to think of it, the fact that we now have split_first_chunk and friends would help avoid a bunch of the places where .as_chunks was particularly helpful for conditional things. .as_chunks()[0] is no longer worth doing, since .first_chunk().unwrap() is better -- and supports N == 0 just fine.


To address something from the notes, I do think that inferring the length is useful, even if there was a restriction that it needs to be concrete.

For example, it's handy to

let (chunks, tail) = foo.as_chunks();
for x in chunks {
    bar(usize::from_le_bytes(*x));
}

which feel really nicely magical with the inferred length.

So it'd be nice if that can still work, even if a generic parameter as the length were blocked for a bit.


@Kixunil Out of curiosity, of your chunks_exact calls, how many of them would prefer array_chunks and how many would be as_chunks? It would be interesting to know which things are better when (cc #76354).

@Kixunil
Copy link
Contributor

Kixunil commented Mar 14, 2025

@scottmcm good points!

All of the cases use iterators but that's survivorship bias - chunks_exact doesn't support random access so if we need random access anywhere we are not using the method there but something else. However I can't think of a good regexp that would find those places and I also can't think of where we would need it. Everything is simple enough to use iteration.

Except splitting/subslicing of arrays.

We have a bunch of places where we need to split arrays, not slices into sub-arrays of known length. So the operation is statically known to not fail in theory but not directly provable to the compiler. I wrote an extension trait for arrays that allows such splitting (also compile-time asserting that the length of the sub-arrays sums up to the original array) and even found a bug in our code. Pretty cool. So code like let foo: &[u8; 8] = bar[0..8].try_into().expect("statically known len"); becomes let foo = bar.split_array::<8, 8>(); if bar: [u8; 16] with the failed check causing post-mono error. I think this would be likely useful addition to core as well but more hairy than as_chunks/array_chunks because, as opposed to the .. operator, this one takes two lengths of the arrays and also one of the lengths is in principle useless. So I think long-term core will want to have just one parameter and stabilizing a method with two parameters is not great.

Side note: I've implemented as_chunks rather than array_chunks so that once its iterated it has the unstable TrustedLen implemented to get access to optimizations. However now that I think about it I could've just added chunk_arrays (too) returning core::slice::Iter<'_, [T; N]>. But it does inform about #76354 a bit: in one case there was a check for remainder being empty which got much simpler (though could've been that simple with chunks_exact too) and in all other cases we were iterating within array where typing the extra .0 was not too obnoxious (we were also passing it into zip, so no .iter() was needed. But it also reminds me we probably want to have something like array_split_divisible which compile-time-asserts that remainder is empty.

@fintelia
Copy link
Contributor

In my code, the two versions that would be useful are as_chunks for cases where I want to process the remainder, and a different method that returned only the chucks but panicked if the slice length wasn't divisible by the chunk size. For instance, when working with RGBA data represented as a byte slice, you probably don't want to continuously add assertions about the slice length but it feels a bit ugly to use array_chunks and just silently ignore the the remainder.

@Kixunil
Copy link
Contributor

Kixunil commented Mar 14, 2025

@fintelia sounds like you may want to pass around &[[u8; 4]], proving divisibility, instead of &[u8] so that you don't need to check (and panic) and only perform the check in the top-most function. I wouldn't be surprised if this also has additional performance benefits.

@fintelia
Copy link
Contributor

@Kixunil yeah that could help for certain cases. Though something like this part of the png crate couldn't really benefit. The problem is that there's multiple different pixel formats supported, with some code generically handling them all as byte slices and other parts matching on each of the possible sizes and doing array operations on them.

@Kixunil
Copy link
Contributor

Kixunil commented Mar 15, 2025

@fintelia I'd use enum Pixels<'a> { One(&'a [u8]), Two(&[[u8; 2]]), Three(&[[u8; 3]], ...) instead of having BytesPerPixel and the pixels separated. I also see a lot of opportunity to reduce duplication in that code.

@Noratrieb
Copy link
Member

This is getting really off topic, please stay on topic on tracking issues to keep them organized and avoid sending out too many notifications to people that are subscribed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-array Area: `[T; N]` A-const-generics Area: const generics (parameters and arguments) A-slice Area: `[T]` C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC F-generic_const_exprs `#![feature(generic_const_exprs)]` I-libs-api-nominated Nominated for discussion during a libs-api team meeting. Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests