Skip to content

add array_permutations() #1013

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 8 additions & 0 deletions src/lazy_buffer.rs
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
use alloc::vec::Vec;
use core::array::from_fn;
use std::iter::Fuse;
use std::ops::Index;

Expand Down Expand Up @@ -63,6 +64,13 @@ where
pub fn get_array<const K: usize>(&self, indices: [usize; K]) -> [I::Item; K] {
indices.map(|i| self.buffer[i].clone())
}

pub fn get_array_from_fn<const K: usize>(
&self,
mut f: impl FnMut(usize) -> usize,
) -> [I::Item; K] {
from_fn(|i| self.buffer[f(i)].clone())
}
}

impl<I, J> Index<J> for LazyBuffer<I>
Expand Down
53 changes: 49 additions & 4 deletions src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -124,7 +124,7 @@ pub mod structs {
pub use crate::peek_nth::PeekNth;
pub use crate::peeking_take_while::PeekingTakeWhile;
#[cfg(feature = "use_alloc")]
pub use crate::permutations::Permutations;
pub use crate::permutations::{ArrayPermutations, Permutations};
#[cfg(feature = "use_alloc")]
pub use crate::powerset::Powerset;
pub use crate::process_results_impl::ProcessResults;
Expand Down Expand Up @@ -1792,6 +1792,51 @@ pub trait Itertools: Iterator {
combinations_with_replacement::combinations_with_replacement(self, k)
}

/// Return an iterator adaptor that iterates over all `K`-permutations of the
/// elements from an iterator.
///
/// Iterator element type is `[Self::Item; K]`. The iterator
/// produces a new array per iteration, and clones the iterator elements.
///
/// If `K` is greater than the length of the input iterator, the resultant
/// iterator adaptor will be empty.
/// ```
/// use itertools::Itertools;
///
/// let perms = (5..8).array_permutations();
/// itertools::assert_equal(perms, vec![
/// [5, 6],
/// [5, 7],
/// [6, 5],
/// [6, 7],
/// [7, 5],
/// [7, 6],
/// ]);
/// ```
///
/// Note: Permutations does not take into account the equality of the iterated values.
///
/// ```
/// use itertools::Itertools;
///
/// let it = vec![-2, -2].into_iter().array_permutations();
/// itertools::assert_equal(it, vec![
/// [-2, -2], // Note: these are the same
/// [-2, -2], // Note: these are the same
/// ]);
/// ```
///
/// Note: The source iterator is collected lazily, and will not be
/// re-iterated if the permutations adaptor is completed and re-iterated.
#[cfg(feature = "use_alloc")]
fn array_permutations<const K: usize>(self) -> ArrayPermutations<Self, K>
where
Self: Sized,
Self::Item: Clone,
{
permutations::array_permutations(self)
}

/// Return an iterator adaptor that iterates over all k-permutations of the
/// elements from an iterator.
///
Expand Down Expand Up @@ -1823,10 +1868,10 @@ pub trait Itertools: Iterator {
/// ```
/// use itertools::Itertools;
///
/// let it = vec![2, 2].into_iter().permutations(2);
/// let it = vec![-2, -2].into_iter().permutations(2);
/// itertools::assert_equal(it, vec![
/// vec![2, 2], // Note: these are the same
/// vec![2, 2], // Note: these are the same
/// vec![-2, -2], // Note: these are the same
/// vec![-2, -2], // Note: these are the same
/// ]);
/// ```
///
Expand Down
92 changes: 73 additions & 19 deletions src/permutations.rs
Original file line number Diff line number Diff line change
@@ -1,29 +1,39 @@
use alloc::boxed::Box;
use alloc::vec::Vec;
use core::marker::PhantomData;
use std::fmt;
use std::iter::once;
use std::iter::FusedIterator;

use super::lazy_buffer::LazyBuffer;
use crate::size_hint::{self, SizeHint};

/// Iterator for `Vec` valued permutations returned by
/// [`.permutations()`](crate::Itertools::permutations)
pub type ArrayPermutations<I, const K: usize> = PermutationsGeneric<I, [<I as Iterator>::Item; K]>;
/// Iterator for const generic permutations returned by
/// [`.array_permutations()`](crate::Itertools::array_permutations)
pub type Permutations<I> = PermutationsGeneric<I, Vec<<I as Iterator>::Item>>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having the second generic parameter Vec<<I as Iterator>::Item> seems odd when PermutationsGeneric has [usize; K].

[Array]Combinations have Vec<usize> and [usize; K], respectively. Unless there's a good reason, we should adopt this pattern.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was an error, array_permutations() was only working on Iterator<Item = usize> before. Fixed now.


/// An iterator adaptor that iterates through all the `k`-permutations of the
/// elements from an iterator.
///
/// See [`.permutations()`](crate::Itertools::permutations) for
/// See [`.permutations()`](crate::Itertools::permutations) and
/// [`.array_permutations()`](crate::Itertools::array_permutations) for
/// more information.
#[must_use = "iterator adaptors are lazy and do nothing unless consumed"]
pub struct Permutations<I: Iterator> {
pub struct PermutationsGeneric<I: Iterator, Item> {
vals: LazyBuffer<I>,
state: PermutationState,
_item: PhantomData<Item>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the decision Vec<usize> vs [usize; K] should be propagated onto PermutationState.

Copy link
Contributor Author

@ronnodas ronnodas Jan 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the current algorithm, PermutationState::Loaded.cycles could have that type, but that field is not directly being used to index in the buffer, so it's a "coincidence".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, using a PoolIndex for cycles itself is wrong, but we already exploit the fact that the length is k (https://docs.rs/itertools/latest/src/itertools/permutations.rs.html#112). So it should possibly be PoolIdx<usize>::Item.

}

impl<I> Clone for Permutations<I>
impl<I, Item> Clone for PermutationsGeneric<I, Item>
where
I: Clone + Iterator,
I::Item: Clone,
{
clone_fields!(vals, state);
clone_fields!(vals, state, _item);
}

#[derive(Clone, Debug)]
Expand All @@ -41,34 +51,44 @@ enum PermutationState {
End,
}

impl<I> fmt::Debug for Permutations<I>
impl<I, Item> fmt::Debug for PermutationsGeneric<I, Item>
where
I: Iterator + fmt::Debug,
I::Item: fmt::Debug,
{
debug_fmt_fields!(Permutations, vals, state);
}

pub fn array_permutations<I: Iterator, const K: usize>(iter: I) -> ArrayPermutations<I, K> {
PermutationsGeneric {
vals: LazyBuffer::new(iter),
state: PermutationState::Start { k: K },
_item: PhantomData,
}
}

pub fn permutations<I: Iterator>(iter: I, k: usize) -> Permutations<I> {
Permutations {
PermutationsGeneric {
vals: LazyBuffer::new(iter),
state: PermutationState::Start { k },
_item: PhantomData,
}
}

impl<I> Iterator for Permutations<I>
impl<I, Item> Iterator for PermutationsGeneric<I, Item>
where
I: Iterator,
I::Item: Clone,
Item: PermItem<I::Item>,
{
type Item = Vec<I::Item>;
type Item = Item;

fn next(&mut self) -> Option<Self::Item> {
let Self { vals, state } = self;
let Self { vals, state, _item } = self;
match state {
PermutationState::Start { k: 0 } => {
*state = PermutationState::End;
Some(Vec::new())
Some(Item::extract_start(vals, 0, 0))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see calling extract_start with 0 here has the merit to save one function, but it requires us to account for the special case len==0 in extract_start, right? Would unconditionally working on indices simplify and get rid of special cases?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that whatever code goes here needs to generically produce a [T; K] even though we know it will only ever be called if K == 0. We could make this a separate function that does whatever arbitrary thing for K > 0 (say buf.get_array([0; K])) since that code will never get called. The first thing I tried that doesn't work is the following:

impl PoolIndex for [usize; K]{
    ...
    fn empty_item() -> Option<Self::Item> {
        if K == 0 {
            Some([])
        } else {
            None
        }
    }
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, this attempt won't work. However, we could introduce PoolIndex::from_fn to do whatever we want and return either a Vec or an array.

}
&mut PermutationState::Start { k } => {
vals.prefill(k);
Expand All @@ -77,14 +97,11 @@ where
return None;
}
*state = PermutationState::Buffered { k, min_n: k };
Some(vals[0..k].to_vec())
Some(Item::extract_start(vals, k, k - 1))
}
PermutationState::Buffered { ref k, min_n } => {
if vals.get_next() {
let item = (0..*k - 1)
.chain(once(*min_n))
.map(|i| vals[i].clone())
.collect();
let item = Item::extract_start(vals, *k, *min_n);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a separate variant Buffered prevents us from using PoolIndex. Maybe always constructing indices might allow PoolIndex here, simplifying implementation and improving maintenance.

*min_n += 1;
Some(item)
} else {
Expand All @@ -99,7 +116,7 @@ where
return None;
}
}
let item = vals.get_at(&indices[0..*k]);
let item = Item::extract_from_prefix(vals, &indices[0..*k]);
*state = PermutationState::Loaded { indices, cycles };
Some(item)
}
Expand All @@ -110,14 +127,14 @@ where
return None;
}
let k = cycles.len();
Some(vals.get_at(&indices[0..k]))
Some(Item::extract_from_prefix(vals, &indices[0..k]))
}
PermutationState::End => None,
}
}

fn count(self) -> usize {
let Self { vals, state } = self;
let Self { vals, state, _item } = self;
let n = vals.count();
state.size_hint_for(n).1.unwrap()
}
Expand All @@ -130,10 +147,11 @@ where
}
}

impl<I> FusedIterator for Permutations<I>
impl<I, Item> FusedIterator for PermutationsGeneric<I, Item>
where
I: Iterator,
I::Item: Clone,
Item: PermItem<I::Item>,
{
}

Expand Down Expand Up @@ -184,3 +202,39 @@ impl PermutationState {
}
}
}

/// A type that can be picked out from a pool or buffer of items from an inner iterator
/// and in a generic way given their indices.
pub trait PermItem<T> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this new trait really necessary? Couldn't we re-use/generalize/extend PoolIndex?

Copy link
Contributor Author

@ronnodas ronnodas Jan 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is what I started with but it seems that the operations required for ArrayPermutations are different from the ones for {Array}Combinations. Even in the existing Permutations implementation you can see that there are as many instances of directly indexing the buffer as calling buf.get_at(). We could create an intermediate slice/array of indices and call versions of these methods moved to the PoolIndex trait but that seems like an unnecessary level of indirection.

fn extract_start<I: Iterator<Item = T>>(buf: &LazyBuffer<I>, len: usize, last: usize) -> Self;

fn extract_from_prefix<I: Iterator<Item = T>>(buf: &LazyBuffer<I>, indices: &[usize]) -> Self;
}

impl<T: Clone, const K: usize> PermItem<T> for [T; K] {
fn extract_start<I: Iterator<Item = T>>(buf: &LazyBuffer<I>, len: usize, last: usize) -> Self {
assert_eq!(len, K);
buf.get_array_from_fn(|i| if i + 1 < len { i } else { last })
}

fn extract_from_prefix<I: Iterator<Item = T>>(buf: &LazyBuffer<I>, indices: &[usize]) -> Self {
buf.get_array_from_fn(|i| indices[i])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this essentially buf.get_array(indices)? (Would support my theory that PermItem should actually be part of PoolIndex.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that when this function is called, there does not exist an instance of [usize; K] to call buf.get_array() on. We could do something like buf.get_array(indices.try_into().unwrap()) instead if that seems better.

}
}

impl<T: Clone> PermItem<T> for Vec<T> {
fn extract_start<I: Iterator<Item = T>>(buf: &LazyBuffer<I>, len: usize, last: usize) -> Self {
if len == 0 {
Vec::new()
} else {
(0..len - 1)
.chain(once(last))
.map(|i| buf[i].clone())
.collect()
}
Comment on lines +227 to +234
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Special cases that might go away if we always worked on indices.

}

fn extract_from_prefix<I: Iterator<Item = T>>(buf: &LazyBuffer<I>, indices: &[usize]) -> Self {
buf.get_at(indices)
}
}
10 changes: 10 additions & 0 deletions tests/laziness.rs
Original file line number Diff line number Diff line change
Expand Up @@ -207,6 +207,11 @@ must_use_tests! {
tuple_combinations3 {
let _ = Panicking.tuple_combinations::<(_, _, _)>();
}
array_combinations {
let _ = Panicking.array_combinations::<0>();
let _ = Panicking.array_combinations::<1>();
let _ = Panicking.array_combinations::<2>();
}
combinations {
let _ = Panicking.combinations(0);
let _ = Panicking.combinations(1);
Expand All @@ -217,6 +222,11 @@ must_use_tests! {
let _ = Panicking.combinations_with_replacement(1);
let _ = Panicking.combinations_with_replacement(2);
}
array_permutations {
let _ = Panicking.array_permutations::<0>();
let _ = Panicking.array_permutations::<1>();
let _ = Panicking.array_permutations::<2>();
}
permutations {
let _ = Panicking.permutations(0);
let _ = Panicking.permutations(1);
Expand Down
Loading