Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SplitAt (split a sequence in two at a specified index) #315

Open
atifaziz opened this issue Jun 13, 2017 · 3 comments
Open

SplitAt (split a sequence in two at a specified index) #315

atifaziz opened this issue Jun 13, 2017 · 3 comments
Assignees

Comments

@atifaziz
Copy link
Member

atifaziz commented Jun 13, 2017

I'd like to propose a new operator that splits a sequence in two at a given index. The first sequence will contain all the items that appear before the index in the source sequence and the second will contain those items that appear at and after the index. A negative index would be allowed and would have the equivalent behavior of specifying an index of 0.

The follow table illustrates the splits that SplitAt will produce for a sample of indexes on a given source:

Source Index Split 1 Split 2
[1,2,3,4,5,6,7,8,9,10] 0 [] [1,2,3,4,5,6,7,8,9,10]
[1,2,3,4,5,6,7,8,9,10] 2 [1,2] [3,4,5,6,7,8,9,10]
[1,2,3,4,5,6,7,8,9,10] 5 [1,2,3,4,5] [6,7,8,9,10]
[1,2,3,4,5,6,7,8,9,10] 10 [1,2,3,4,5,6,7,8,9,10] []
[1,2,3,4,5,6,7,8,9,10] 20 [1,2,3,4,5,6,7,8,9,10] []
[1,2,3,4,5,6,7,8,9,10] -5 [] [1,2,3,4,5,6,7,8,9,10]

Prototype

public static TResult SplitAt<T, TResult>(
    this IEnumerable<T> source, int index,
    Func<IEnumerable<T>, IEnumerable<T>, TResult> resultSelector) =>
    source.Index()
          .Partition(e => e.Key < index,
                     (xs, ys) => resultSelector(from x in xs select x.Value,
                                                from y in ys select y.Value));

Example

var xs = Enumerable.Range(1, 10);
var results =
    from i in new[] { 0, 2, 5, 10, 20, -5 }
    select xs.SplitAt(i, (h, t) => new
    {
        Index = i,
        Head  = "[" + string.Join(",", h) + "]",
        Tail  = "[" + string.Join(",", t) + "]",
    });

foreach (var result in results)
    Console.WriteLine(result);

Output

{ Index = 0, Head = [], Tail = [1,2,3,4,5,6,7,8,9,10] }
{ Index = 2, Head = [1,2], Tail = [3,4,5,6,7,8,9,10] }
{ Index = 5, Head = [1,2,3,4,5], Tail = [6,7,8,9,10] }
{ Index = 10, Head = [1,2,3,4,5,6,7,8,9,10], Tail = [] }
{ Index = 20, Head = [1,2,3,4,5,6,7,8,9,10], Tail = [] }
{ Index = -5, Head = [], Tail = [1,2,3,4,5,6,7,8,9,10] }
@atifaziz atifaziz added this to the 2.6.0 milestone Jun 14, 2017
@atifaziz atifaziz self-assigned this Jun 14, 2017
@fsateler
Copy link
Member

fsateler commented Jun 15, 2017

We circle back a bit to the discussion in #306 . What to do when the index is out of range? Here you propose to be permissive. In general I argue for strictness, because I prefer my program tell me when instructions are unclear instead of making guesses for me. Of course, I'm not that strict on strictness (ha!), so I guess it depends on what the common use-cases would be. What use-cases do you see for this operator, that make permissiveness be a more sensible option?

@atifaziz
Copy link
Member Author

atifaziz commented Jun 16, 2017

We circle back a bit to the discussion in #306 . What to do when the index is out of range?

True. SplitAt is permissive in the same way that Take and Skip are. Perhaps the problem here is using the word index rather than count and having at in the name. I am not sure if calling it SplitHeads would have helped 😆 where the argument would have been the count of head elements desired in the first part followed by remaining tail elements in the second.

What use-cases do you see for this operator, that make permissiveness be a more sensible option?

If you look at each part of the split as simple buckets then permissiveness is acceptable. If you run a Max on each bucket and one of them turned out empty (because index was out of range so to speak) then Max will throw. You get your strictness based on what you do with the buckets next. A Count, for example, would be happy with empty buckets.

@atifaziz atifaziz removed this from the 2.6.0 milestone Jun 30, 2017
@atifaziz
Copy link
Member Author

atifaziz commented Apr 5, 2018

Based on @leandromoh's feedback in PR #316, I am inclined to revise this so that instead of splitting a sequence in two and returning a pivoted result of the two parts, it returns a sequence of sequences where the sub-sequences start at the given offset into the source sequence and stop before the next. Below is an initial draft of what this could look like:

public static IEnumerable<IEnumerable<T>> SplitAt<T>(
    this IEnumerable<T> source, params int[] offsets)
{
    using (var oe = offsets.Concat(int.MaxValue).GetEnumerator())
    {
        oe.MoveNext();
        var offset = oe.Current;

        List<T> list = null;
        foreach (var e in source.Index())
        {
            retry:
            if (e.Key < offset)
            {
                if (list == null)
                    list = new List<T>();
                list.Add(e.Value);
            }
            else
            {
                yield return list ?? Enumerable.Empty<T>();
                offset = oe.MoveNext() ? oe.Current : -1;
                list = null;
                goto retry;
            }
        }

        if (list != null)
            yield return list;

        while (oe.MoveNext())
            yield return Enumerable.Empty<T>();
    }
}

This approach would be lazier and also support a source sequence that's infinite.

The initial example would now read as:

var xs = Enumerable.Range(1, 10);
var results =
    from i in new[] { 0, 2, 5, 10, 20, -5 }
    select xs.SplitAt(i).Fold((h, t) => new
    {
        Index = i,
        Head = "[" + string.Join(",", h) + "]",
        Tail = "[" + string.Join(",", t) + "]",
    });

foreach (var result in results)
    Console.WriteLine(result);

Note that Fold is used to effectively get convenient access to the two parts as before.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants