improve TimeSeries split performance#3933

Open
samirromdhani wants to merge 21 commits into main from fix/1634-timeseries-split-quadratic-performance-for-many-small-chunks

@samirromdhani

@samirromdhani samirromdhani commented May 29, 2026

Contributor

Please check if the PR fulfills these requirements

  • The commit message follows our guidelines
  • Tests for the changes have been added (for bug fixes / features)
  • Docs have been added / updated (for bug fixes / features)
  • A PR or issue has been opened in all impacted repositories (if any)

Does this PR already have an issue describing the problem?

Fixes #1634

What kind of change does this PR introduce?

What is the current behavior?

What is the new behavior (if this is a feature change)?

Does this PR introduce a breaking change or deprecate an API?

  • Yes
  • No

If yes, please check if the following requirements are fulfilled

  • The Breaking Change or Deprecated label has been added
  • The migration steps are described in the following section

What changes might users need to make in their application due to this PR? (migration steps)

  • The default behavior of the split method was changed to improve both execution time and memory consumption.
  • A new method, toCompactArray, was introduced; split now uses it.
  • toArray is still available; users can choose toCompactArray for lower memory usage.
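
To illustrate why a compact array can save memory, here is a minimal sketch. It does not use the powsybl classes: the chunk layout (arrays of offsets and values) is a hypothetical stand-in, and the semantics given to toCompactArray (only the span between the first and last stored point) are an assumption, not the confirmed implementation.

```java
import java.util.Arrays;

// Hypothetical stand-in for a stored series, not the powsybl classes:
// chunk i starts at offsets[i] and holds values[i].
final class CompactArraySketch {

    // Dense view: one slot per point of the index, gaps filled with NaN.
    static double[] toArray(int pointCount, int[] offsets, double[][] values) {
        double[] result = new double[pointCount];
        Arrays.fill(result, Double.NaN);
        for (int i = 0; i < offsets.length; i++) {
            System.arraycopy(values[i], 0, result, offsets[i], values[i].length);
        }
        return result;
    }

    // Compact view (assumed semantics): only the span between the first and
    // the last stored point; gaps inside that span are still NaN.
    static double[] toCompactArray(int[] offsets, double[][] values) {
        if (offsets.length == 0) {
            return new double[0];
        }
        int start = Integer.MAX_VALUE;
        int end = Integer.MIN_VALUE;
        for (int i = 0; i < offsets.length; i++) {
            start = Math.min(start, offsets[i]);
            end = Math.max(end, offsets[i] + values[i].length);
        }
        double[] result = new double[end - start];
        Arrays.fill(result, Double.NaN);
        for (int i = 0; i < offsets.length; i++) {
            System.arraycopy(values[i], 0, result, offsets[i] - start, values[i].length);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] offsets = {10};
        double[][] values = {{1d, 2d, 3d}};
        System.out.println(toArray(100_000, offsets, values).length);   // dense length
        System.out.println(toCompactArray(offsets, values).length);     // compact length
    }
}
```

With a single three-point chunk in a 100000-point index, the dense view allocates one slot per index point while the compact view allocates only the stored span.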

Other information:

Results from powsybl-benchmark with tsSize = 100000:

Benchmark     Mode  Cnt        Score   Error  Units
splitV0       avgt       3276192,839          us/op       <- split before performance improvement
split         avgt          3236,725          us/op       <- split after performance improvement

splitV0:gc.alloc.rate.norm  avgt       40016480045,333            B/op  <- split before performance improvement
split:gc.alloc.rate.norm    avgt           6801644,438            B/op <- split after performance improvement
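
As a rough reading of these figures, the ratios can be computed directly. The sketch below assumes the scores use a comma as the decimal separator (locale formatting), so 3276192,839 is read as 3276192.839 us/op; the class and method names are illustrative only.

```java
// Computes before/after ratios from the benchmark scores quoted above.
// Assumption: the comma in the scores is a decimal separator.
final class BenchmarkRatios {

    static double ratio(double before, double after) {
        return before / after;
    }

    public static void main(String[] args) {
        double timeRatio = ratio(3_276_192.839, 3_236.725);           // us/op
        double allocRatio = ratio(40_016_480_045.333, 6_801_644.438); // B/op
        System.out.printf("time: ~%.0fx faster, allocation: ~%.0fx smaller%n",
                timeRatio, allocRatio);
    }
}
```

Under that reading, the new split is roughly three orders of magnitude faster and allocates several thousand times less memory per operation.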

Signed-off-by: Samir Romdhani <samir.romdhani_externe@rte-france.com>
CalculatedTimeSeries is not affected by the compact array: calc split returns copies and does not create NaN values

@samirromdhani samirromdhani changed the base branch from main to fix/1609-split-time-series-and-toarray-wastes-a-lot-memory May 29, 2026 15:47
@samirromdhani samirromdhani force-pushed the fix/1634-timeseries-split-quadratic-performance-for-many-small-chunks branch 3 times, most recently from 46fb6ab to 6893ad2 Compare May 29, 2026 16:02
@samirromdhani samirromdhani force-pushed the fix/1634-timeseries-split-quadratic-performance-for-many-small-chunks branch 5 times, most recently from fc02f7c to 2865686 Compare June 1, 2026 14:17
@samirromdhani samirromdhani self-assigned this Jun 1, 2026
@samirromdhani samirromdhani marked this pull request as ready for review June 1, 2026 15:26
@samirromdhani samirromdhani changed the title WIP: improve TimeSeries split performance improve TimeSeries split performance Jun 1, 2026
@samirromdhani samirromdhani marked this pull request as draft June 2, 2026 09:28
@samirromdhani samirromdhani force-pushed the fix/1609-split-time-series-and-toarray-wastes-a-lot-memory branch from 64b7a54 to 08453cf Compare June 2, 2026 13:40
@samirromdhani samirromdhani force-pushed the fix/1634-timeseries-split-quadratic-performance-for-many-small-chunks branch 2 times, most recently from 6811406 to 1a27bd3 Compare June 3, 2026 12:38
@samirromdhani samirromdhani marked this pull request as ready for review June 3, 2026 12:44
@samirromdhani samirromdhani marked this pull request as draft June 4, 2026 07:24
@samirromdhani samirromdhani force-pushed the fix/1634-timeseries-split-quadratic-performance-for-many-small-chunks branch from 1a27bd3 to 57690a9 Compare June 4, 2026 07:25
@samirromdhani samirromdhani marked this pull request as ready for review June 4, 2026 07:26
Base automatically changed from fix/1609-split-time-series-and-toarray-wastes-a-lot-memory to main June 4, 2026 07:31
@sonarqubecloud


@rolnico rolnico left a comment

Member

I'm not sure we are going the right way on this.

What if we tried to change the method AbstractTimeSeries.split(int) instead?

    public List<T> split(int newChunkSize) {
        if (chunks.isEmpty()) {
            return List.of();
        }

        int minOffset = getMinOffset();

        // Sort chunks by offset
        List<C> sortedChunks = getSortedChunks();

        // Map from bucket index -> list of chunk pieces that fall in that bucket.
        Map<Integer, List<C>> bucketMap = new LinkedHashMap<>();

        for (C chunk : sortedChunks) {
            int chunkStart = chunk.getOffset();
            int chunkEnd = chunkStart + chunk.getLength() - 1;

            int firstBucket = (chunkStart - minOffset) / newChunkSize;
            int lastBucket = (chunkEnd - minOffset) / newChunkSize;

            for (int b = firstBucket; b <= lastBucket; b++) {
                int bucketStart = minOffset + b * newChunkSize;
                int bucketEnd = bucketStart + newChunkSize - 1;

                // Intersection of chunk and bucket
                int intersectStart = Math.max(chunkStart, bucketStart);
                int intersectEnd = Math.min(chunkEnd, bucketEnd);

                // Trim the chunk to [intersectStart, intersectEnd]
                C slice = chunk;
                if (intersectStart > chunkStart) {
                    slice = slice.splitAt(intersectStart).getChunk2();
                }
                int sliceEnd = intersectStart + slice.getLength() - 1;
                if (sliceEnd > intersectEnd) {
                    slice = slice.splitAt(intersectEnd + 1).getChunk1();
                }

                bucketMap.computeIfAbsent(b, k -> new ArrayList<>()).add(slice);
            }
        }

        // Build one time series per non-empty bucket
        List<T> result = new ArrayList<>(bucketMap.size());
        for (List<C> pieces : bucketMap.values()) {
            result.add(createTimeSeries(pieces));
        }
        return result;
    }

With something like this, we would avoid getting chunks filled with NaN. However, there are still issues because the method TimeSeries.split(List, int) expects a chunkCount based on the index, so we would have to either change that or generate empty chunks.

What do you think of it?
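
The bucket arithmetic in the proposal above can be tried in isolation. The following is a minimal sketch on plain {offset, length} pairs rather than real data chunks; the class name and the int[] representation are hypothetical, and only the intersection logic mirrors the proposed method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Bucket slicing on plain ranges. Each chunk is {offset, length}, sorted by offset.
// Returns bucket index -> list of inclusive [start, end] ranges falling in that bucket.
final class BucketSliceSketch {

    static Map<Integer, List<int[]>> bucketize(int[][] chunks, int newChunkSize) {
        Map<Integer, List<int[]>> buckets = new LinkedHashMap<>();
        if (chunks.length == 0) {
            return buckets;
        }
        int minOffset = chunks[0][0];
        for (int[] chunk : chunks) {
            int chunkStart = chunk[0];
            int chunkEnd = chunkStart + chunk[1] - 1;
            int firstBucket = (chunkStart - minOffset) / newChunkSize;
            int lastBucket = (chunkEnd - minOffset) / newChunkSize;
            for (int b = firstBucket; b <= lastBucket; b++) {
                int bucketStart = minOffset + b * newChunkSize;
                int bucketEnd = bucketStart + newChunkSize - 1;
                // Intersection of the chunk and the bucket
                int start = Math.max(chunkStart, bucketStart);
                int end = Math.min(chunkEnd, bucketEnd);
                buckets.computeIfAbsent(b, k -> new ArrayList<>()).add(new int[]{start, end});
            }
        }
        return buckets;
    }

    public static void main(String[] args) {
        // Chunks covering points 0..2 and 6..7, split into buckets of size 2.
        int[][] chunks = {{0, 3}, {6, 2}};
        bucketize(chunks, 2).forEach((b, ranges) ->
                ranges.forEach(r -> System.out.println("bucket " + b + " -> " + Arrays.toString(r))));
    }
}
```

With chunks covering points 0..2 and 6..7 and a bucket size of 2, no entry is created for bucket 2 (points 4..5), which is the gap case raised in the question above.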

}

@Override
public List<DoubleTimeSeries> split(int newChunkSize) {

Member

This code would potentially generate chunks/time series with only NaN values if there is a gap between the initial chunks:

    @Test
    void testSplitIssue() {
        RegularTimeSeriesIndex index = RegularTimeSeriesIndex.create(Interval.parse("2015-01-01T00:00:00Z/2015-01-01T01:45:00Z"), Duration.ofMinutes(15));
        TimeSeriesMetadata metadata = new TimeSeriesMetadata("ts1", TimeSeriesDataType.DOUBLE, index);
        UncompressedDoubleDataChunk chunk1 = new UncompressedDoubleDataChunk(0, new double[]{1d, 2d, 3d});
        UncompressedDoubleDataChunk chunk2 = new UncompressedDoubleDataChunk(6, new double[]{7d, 8d});
        StoredDoubleTimeSeries timeSeries = new StoredDoubleTimeSeries(metadata, chunk1, chunk2);

        // Split on multiple sizes
        List<DoubleTimeSeries> split2TimeSeries = timeSeries.split(2);
        List<DoubleTimeSeries> split3TimeSeries = timeSeries.split(3);
        List<DoubleTimeSeries> split4TimeSeries = timeSeries.split(4);

        assertEquals(4, split2TimeSeries.size());
        assertEquals(3, split3TimeSeries.size());
        assertEquals(2, split4TimeSeries.size());

        assertArrayEquals(new double[]{1d, 2d}, split2TimeSeries.get(0).toCompactArray(), 0d);
        assertArrayEquals(new double[]{3d, NaN}, split2TimeSeries.get(1).toCompactArray(), 0d);
        assertArrayEquals(new double[]{NaN, NaN}, split2TimeSeries.get(2).toCompactArray(), 0d);
        assertArrayEquals(new double[]{7d, 8d}, split2TimeSeries.get(3).toCompactArray(), 0d);
    }

Do we want this?

Contributor Author

You're right. I think that if the range is entirely a gap, it should return an empty series.

Contributor Author

I will keep an empty series instead of either skipping it or filling it with NaN values. This preserves the positional contract of split(List, int) and the memory goal (e.g. splitTestHuge).

@finchello


Hi @rolnico, thanks for looping me in — happy to share a view.

I like the bucket approach: slicing each chunk into aligned [minOffset + b*newChunkSize, …] buckets is much easier to reason about than the recursive merge logic, and it makes the #3941 behaviour fall out for free — a gap is just an empty bucket, so there's no "merge across a gap" case and no Chunks are not successive exception.

On your open question, I think it's the real constraint and it points to the answer. TimeSeries.split(List, int) (TimeSeries.java ~L189-197) zips positionally:

int chunkCount = computeChunkCount(index, newChunkSize); // ceil(pointCount / newChunkSize)
for (int i = 0; i < chunkCount; i++) {
    splitList.get(i).add(split.get(i));
}

so it needs every series' split(int) to return exactly chunkCount pieces, one per window, in order — bucket b must sit at position b. If we build the result only from the non-empty bucketMap.values(), series with gaps return fewer pieces and that positional zip misaligns / throws.
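
A minimal sketch of that positional contract, assuming a simplified computeChunkCount that takes the point count directly (the real method takes the index) and pieces represented as plain strings:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins, not the powsybl API: pieces are labels, and
// computeChunkCount takes a point count instead of a TimeSeriesIndex.
final class ChunkCountSketch {

    // ceil(pointCount / newChunkSize) using integer arithmetic
    static int computeChunkCount(int pointCount, int newChunkSize) {
        return (pointCount + newChunkSize - 1) / newChunkSize;
    }

    // Positional zip: window i collects piece i of every series.
    static List<List<String>> zip(List<List<String>> splitsPerSeries, int chunkCount) {
        List<List<String>> windows = new ArrayList<>();
        for (int i = 0; i < chunkCount; i++) {
            List<String> window = new ArrayList<>();
            for (List<String> split : splitsPerSeries) {
                window.add(split.get(i)); // throws if a series returned fewer pieces
            }
            windows.add(window);
        }
        return windows;
    }

    public static void main(String[] args) {
        int chunkCount = computeChunkCount(8, 2); // 8 points, windows of 2
        List<String> full = List.of("a0", "a1", "a2", "a3");
        List<String> gapped = List.of("b0", "b1", "b3"); // gap window skipped
        System.out.println(zip(List.of(full, full), chunkCount).size());
        try {
            zip(List.of(full, gapped), chunkCount);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("positional zip failed for the gapped series");
        }
    }
}
```

For the 8-point index used in the test above, the chunk counts for sizes 2, 3 and 4 are 4, 3 and 2, matching the assertions in that test.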

So instead of "change split(List, int)" or "fill with NaN", maybe a third option: emit all chunkCount buckets in order, and represent an empty bucket as a data-less series (empty chunk list) rather than a NaN-filled chunk. That keeps the positional contract, still avoids allocating the NaN fill (the memory win you're after), and resolves #3941 structurally. toArray() on an empty-chunk series already yields the correct all-NaN window via getCheckedChunks's gap fill, so consumers shouldn't see a difference.

Two things I'd check first: createTimeSeries/the constructor path needs to accept an empty chunk list for a bucket (today it takes a single chunk), and whether any caller relies on each split piece being non-empty.
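
The "emit every bucket in order" idea could look roughly like the sketch below. It is only an illustration: pieces are generic placeholders, and toDenseGrid is a hypothetical helper, not an existing powsybl method.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Turns a sparse bucket map into a dense, positional list of chunkCount entries.
// An empty bucket becomes an empty piece list instead of a NaN-filled chunk.
final class DenseGridSketch {

    static <T> List<List<T>> toDenseGrid(Map<Integer, List<T>> sparse, int chunkCount) {
        List<List<T>> dense = new ArrayList<>(chunkCount);
        for (int b = 0; b < chunkCount; b++) {
            dense.add(sparse.getOrDefault(b, List.of()));
        }
        return dense;
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> sparse = Map.of(
                0, List.of("[0,1]"),
                1, List.of("[2,2]"),
                3, List.of("[6,7]"));
        List<List<String>> dense = toDenseGrid(sparse, 4);
        for (int b = 0; b < dense.size(); b++) {
            System.out.println("bucket " + b + " -> " + dense.get(b));
        }
    }
}
```

Bucket b always sits at position b, so a positional zip keeps working, at the cost of one list entry per window even when the window holds no data.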

If that direction sounds right, it would also mean #3941 is fixed by this PR directly — happy to drop my separate adjacency-guard patch and instead add a few gapped-chunk test cases for the bucket version (incl. the split(4) repro). Thanks!

@rolnico

rolnico commented Jun 26, 2026

Member

Hi @rolnico, thanks for looping me in — happy to share a view.
[...]
If that direction sounds right, it would also mean #3941 is fixed by this PR directly — happy to drop my separate adjacency-guard patch and instead add a few gapped-chunk test cases for the bucket version (incl. the split(4) repro). Thanks!

It seems like an interesting idea worth testing. Could you open a new PR with your proposal, so that we can have a look, compare and test it?

@finchello


Quick update — I prototyped the bucket version locally and ran it against the time-series suite. Testing surfaced the real trade-off behind your "change split(List,int) or generate empty chunks?" question:

  • If split(int) emits all chunkCount buckets (empty ones as data-less series), the positional split(List, int) zip stays correct and #3941 ([TimeSeries] split handle non-successive chunks inconsistently) is fixed, but it allocates O(pointCount / newChunkSize) series regardless of data sparsity. The existing splitTestHuge (pointCount ~1e8, split(2)) makes it concrete: ~3 series today → ~50M, which works against this PR's memory goal.
  • If split(int) emits only non-empty buckets (your original sketch), memory scales with actual data — but the positional zip in split(List, int) (split.get(i) for i in 0..chunkCount) misaligns.

So the crux is the one you flagged: to keep buckets sparse and alignable, split(List, int) would need to align by bucket index rather than list position (each piece tagged with its bucket index, or split returning a Map<bucketIndex, series>). That gets both the memory win and consistent gap handling.
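
A minimal sketch of index-based alignment, assuming each series returns its pieces as a sparse map keyed by bucket index (pieces are plain labels here, and alignByBucket is a hypothetical helper):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Aligns sparse per-series splits by bucket index instead of list position.
final class IndexAlignedZipSketch {

    static Map<Integer, List<String>> alignByBucket(List<Map<Integer, String>> sparseSplits) {
        Map<Integer, List<String>> windows = new TreeMap<>(); // sorted by bucket index
        for (Map<Integer, String> split : sparseSplits) {
            for (Map.Entry<Integer, String> piece : split.entrySet()) {
                windows.computeIfAbsent(piece.getKey(), k -> new ArrayList<>())
                       .add(piece.getValue());
            }
        }
        return windows;
    }

    public static void main(String[] args) {
        Map<Integer, String> seriesA = new LinkedHashMap<>();
        seriesA.put(0, "a0");
        seriesA.put(3, "a3");
        Map<Integer, String> seriesB = new LinkedHashMap<>();
        seriesB.put(3, "b3");
        System.out.println(alignByBucket(List.of(seriesA, seriesB)));
    }
}
```

Only buckets that contain data appear in the result, so memory follows the data rather than the index size, and pieces from different series still meet in the same window.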

Before I open the PR: which contract do you prefer — (a) dense grid of chunkCount pieces (simpler, but O(pointCount/chunkSize) memory), or (b) sparse buckets + index-based alignment in split(List, int) (more change, keeps the perf win)? Happy to implement either — I have (a) working and can adapt.

(Small aside: bucket alignment should be to absolute index 0, not minOffset, otherwise series with different minOffsets won't line up in split(List, int).)
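
The aside can be checked with two lines of arithmetic. In this sketch (names are illustrative), the same point gets different bucket numbers in two series when buckets are anchored at each series' minOffset, and the same number when anchored at index 0:

```java
// Compares bucket numbering relative to a series' minOffset with
// numbering relative to absolute index 0.
final class BucketAnchorSketch {

    static int bucketRelative(int offset, int minOffset, int newChunkSize) {
        return (offset - minOffset) / newChunkSize;
    }

    static int bucketAbsolute(int offset, int newChunkSize) {
        return offset / newChunkSize;
    }

    public static void main(String[] args) {
        int point = 4;
        int size = 2;
        // Series A starts at offset 0, series B starts at offset 3.
        System.out.println("relative: A=" + bucketRelative(point, 0, size)
                + ", B=" + bucketRelative(point, 3, size));
        System.out.println("absolute: A=" + bucketAbsolute(point, size)
                + ", B=" + bucketAbsolute(point, size));
    }
}
```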

@samirromdhani

samirromdhani commented Jun 30, 2026

Contributor Author

Hello, thanks for the feedback,

I agree that the bucket slicing is clearer than the current recursive split, but it changes the split semantics. As @rolnico noted, there are two options:

  • Generate empty chunks: this breaks some tests that don't treat a gap as a chunk (for example splitTest and splitTestHuge).
  • Change TimeSeries.split(List, int).

I'm not against that direction, but I'd rather keep it separate.

What do you think?

@samirromdhani samirromdhani marked this pull request as draft July 1, 2026 12:34
…ct chunk view

- Rewrite AbstractTimeSeries.split(int) with compact chunk
- Remove recursive split and splitChunk helper (unused)
- Add tests

Signed-off-by: Samir Romdhani <samir.romdhani_externe@rte-france.com>
@samirromdhani samirromdhani marked this pull request as ready for review July 1, 2026 14:41
@samirromdhani

Contributor Author

I'm not sure if we go the right way on this.

[...]

What do you think of it?

After testing some cases, it seems that the bucket approach needs remerging, not only alignment with TimeSeries.split(List, int):

  • The bucket split keeps one slice per source chunk (see splitMultiChunkTimeSeriesTest()): a bucket ends up with [2.0] + [3.0] even though they are successive, so adjacent slices have to be merged back into one chunk. Doing that merge here would add complexity.
  • That is the reason for the compact approach (toCompactChunk + splitAt): the data is already in one continuous block, so there is no merge step, and a full gap yields an empty series, which covers the existing behaviours.
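
To make the remerge step concrete, here is a minimal sketch of what merging successive slices inside one bucket would involve. The Slice class and mergeSuccessive are hypothetical and do not correspond to powsybl types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Merges slices that are successive (next.offset == previous end) into one slice.
final class MergeSlicesSketch {

    static final class Slice {
        final int offset;
        final double[] values;

        Slice(int offset, double[] values) {
            this.offset = offset;
            this.values = values;
        }
    }

    // Input must be sorted by offset.
    static List<Slice> mergeSuccessive(List<Slice> sorted) {
        List<Slice> merged = new ArrayList<>();
        for (Slice slice : sorted) {
            Slice last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
            if (last != null && last.offset + last.values.length == slice.offset) {
                // Successive: append the values to the previous slice.
                double[] joined = Arrays.copyOf(last.values, last.values.length + slice.values.length);
                System.arraycopy(slice.values, 0, joined, last.values.length, slice.values.length);
                merged.set(merged.size() - 1, new Slice(last.offset, joined));
            } else {
                merged.add(slice);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Slice> bucket = List.of(new Slice(2, new double[]{2.0}), new Slice(3, new double[]{3.0}));
        for (Slice s : mergeSuccessive(bucket)) {
            System.out.println(s.offset + " -> " + Arrays.toString(s.values));
        }
    }
}
```

This extra pass, with its array copies, is the complexity that the compact approach avoids by starting from one continuous block.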

I'm fine with moving further changes to a separate PR if they're considered out of the perf scope. Still, these changes can be justified by the points discussed in this issue:

  • Gaps are now handled cleanly (empty series, no NaN-filled chunks), with no exceptions from split(List, int).
  • The recursive split is removed.

@sonarqubecloud

sonarqubecloud Bot commented Jul 1, 2026


@finchello


Thanks @samirromdhani — that compact-chunk approach is neat. Compacting to one continuous block before splitAt sidesteps exactly the remerge problem I hit prototyping the bucket version (adjacent slices from different source chunks landing in the same window, as in splitMultiChunkTimeSeriesTest), and getting an empty series for full gaps for free is a clean way to keep existing behaviour while dropping the recursive split. Nice.
+1 on the scoping: keep this PR as the perf fix (+ the cleaner gap handling that falls out of it), and move the bucket-based split with the split(List, int) index-alignment change to a separate PR — that one carries the real contract change and deserves its own review.
If your updated split now returns an empty series for a gapped window and no longer throws Chunks are not successive, that effectively resolves #3941 here. Happy to contribute the gapped-chunk regression tests (the #3941 repro incl. split(4), plus a split(List, int) alignment case) so that behaviour is locked in — then I can close my separate #3941 patch. Just say the word and I'll open a small test-only PR against your branch (or paste the cases here). Thanks for driving this!


Development

Successfully merging this pull request may close these issues.

Timeseries split quadratic performance for many small chunks

3 participants