From 888985aa75389079409e57654d6013b431a6d53f Mon Sep 17 00:00:00 2001 From: Daryle Walker Date: Wed, 30 Apr 2025 20:59:06 -0600 Subject: [PATCH 1/6] Add methods scanning already-sorted sequences Add a method for already-sorted sequences that returns each unique value, paired with the count of each value. Add another method that returns each unique value. Have both eager and lazy variants for each method. --- CHANGELOG.md | 6 + Guides/README.md | 2 + Guides/SortedDuplicates.md | 65 ++++ .../Documentation.docc/Selecting.md | 16 + Sources/Algorithms/SortedDuplicates.swift | 287 ++++++++++++++++++ .../SortedDuplicatesTests.swift | 82 +++++ 6 files changed, 458 insertions(+) create mode 100644 Guides/SortedDuplicates.md create mode 100644 Sources/Algorithms/SortedDuplicates.swift create mode 100644 Tests/SwiftAlgorithmsTests/SortedDuplicatesTests.swift diff --git a/CHANGELOG.md b/CHANGELOG.md index fb673393..c39be4df 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -12,6 +12,12 @@ This project follows semantic versioning. - Bidirectional collections have a new `ends(with:)` method that matches the behavior of the standard library's `starts(with:)` method. ([#224]) +- Sequences that are already sorted can use the `countSortedDuplicates` and + `withoutSortedDuplicates` methods, with eager and lazy versions. + The former returns each unique value paired with the count of + that value's occurances. + The latter returns each unique value, + turning a possibly non-decreasing sequence to a strictly-increasing one. diff --git a/Guides/README.md b/Guides/README.md index d4894882..8a245609 100644 --- a/Guides/README.md +++ b/Guides/README.md @@ -32,6 +32,7 @@ These guides describe the design and intention behind the APIs included in the ` - [`suffix(while:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Suffix.md): Returns the suffix of a collection where all element pass a given predicate. 
- [`trimmingPrefix(while:)`, `trimmingSuffix(while)`, `trimming(while:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Trim.md): Returns a slice by trimming elements from a collection's start, end, or both. The mutating `trim...` methods trim a collection in place. - [`uniqued()`, `uniqued(on:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Unique.md): The unique elements of a collection, preserving their order. +- [`withoutSortedDuplicates()`, `withoutSortedDuplicates(by:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/SortedDuplicates.md): Given an already-sorted sequence and the sorting predicate, reduce all runs of a unique value to a single element each. Has eager and lazy variants. - [`minAndMax()`, `minAndMax(by:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/MinMax.md): Returns the smallest and largest elements of a sequence. #### Partial sorting @@ -42,6 +43,7 @@ These guides describe the design and intention behind the APIs included in the ` - [`adjacentPairs()`](https://github.com/apple/swift-algorithms/blob/main/Guides/AdjacentPairs.md): Lazily iterates over tuples of adjacent elements. - [`chunked(by:)`, `chunked(on:)`, `chunks(ofCount:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Chunked.md): Eager and lazy operations that break a collection into chunks based on either a binary predicate or when the result of a projection changes or chunks of a given count. +- [`countSortedDuplicates()`, `countSortedDuplicates(by:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/SortedDuplicates.md): Given an already-sorted sequence and the sorting predicate, return each unique value, pairing each with the number of occurrences. Has eager and lazy variants. - [`firstNonNil(_:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/FirstNonNil.md): Returns the first non-`nil` result from transforming a sequence's elements. 
- [`grouped(by:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Grouped.md): Group up elements using the given closure, returning a Dictionary of those groups, keyed by the results of the closure. - [`indexed()`](https://github.com/apple/swift-algorithms/blob/main/Guides/Indexed.md): Iterate over tuples of a collection's indices and elements. diff --git a/Guides/SortedDuplicates.md b/Guides/SortedDuplicates.md new file mode 100644 index 00000000..1c6a3e7d --- /dev/null +++ b/Guides/SortedDuplicates.md @@ -0,0 +1,65 @@ +# Sorted Duplicates +[[Source](https://github.com/apple/swift-algorithms/blob/main/Sources/Algorithms/SortedDuplicates.swift) | + [Tests](https://github.com/apple/swift-algorithms/blob/main/Tests/SwiftAlgorithmsTests/SortedDuplicatesTests.swift)] + +Given a sequence that is already sorted, recognize each run of +identical values. +Use that to determine the length of each run of +identical values, +or filter out the duplicates by removing all occurrences of +a given value besides the first. 
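As a rough illustration of the run-counting idea described above, here is a minimal standalone sketch. It does not use the package's API: `runLengths(ofSorted:by:)` is a hypothetical helper written only for this example, and it assumes the input is already sorted by the supplied predicate.

```swift
// Hypothetical helper for illustration only; not part of the package's API.
// Assumes `source` is already sorted according to `areInIncreasingOrder`.
func runLengths<S: Sequence>(
  ofSorted source: S,
  by areInIncreasingOrder: (S.Element, S.Element) -> Bool
) -> [(value: S.Element, count: Int)] {
  var result: [(value: S.Element, count: Int)] = []
  for element in source {
    // In sorted input, an element extends the current run unless it is
    // ordered strictly after the first element of that run.
    if let last = result.last, !areInIncreasingOrder(last.value, element) {
      result[result.count - 1].count += 1
    } else {
      result.append((value: element, count: 1))
    }
  }
  return result
}

let numbers = [0, 1, 2, 2, 2, 3, 5, 6, 6, 9, 10, 10]
let counted = runLengths(ofSorted: numbers, by: <)
// counted.map(\.value) == [0, 1, 2, 3, 5, 6, 9, 10]
// counted.map(\.count) == [1, 1, 3, 1, 1, 2, 1, 2]
```

Dropping the counts (`counted.map(\.value)`) gives the deduplicated sequence, which matches how the eager duplicate-removing method in this patch is built on top of the counting one.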
+ +```swift +// Put examples here +``` + +## Detailed Design + +```swift +extension Sequence { + public func countSortedDuplicates( + by areInIncreasingOrder: (Element, Element) throws -> Bool + ) rethrows -> [(value: Element, count: Int)] + + public func withoutSortedDuplicates( + by areInIncreasingOrder: (Element, Element) throws -> Bool + ) rethrows -> [Element] +} + +extension Sequence where Self.Element : Comparable { + public func countSortedDuplicates() -> [(value: Element, count: Int)] + + public func withoutSortedDuplicates() -> [Element] +} + +extension LazySequenceProtocol { + public func countSortedDuplicates( + by areInIncreasingOrder: @escaping (Element, Element) -> Bool + ) -> LazyCountDuplicatesSequence<Elements> + + public func withoutSortedDuplicates( + by areInIncreasingOrder: @escaping (Element, Element) -> Bool + ) -> some (Sequence & LazySequenceProtocol) +} + +extension LazySequenceProtocol where Self.Element : Comparable { + public func countSortedDuplicates() + -> LazyCountDuplicatesSequence<Elements> + + public func withoutSortedDuplicates() + -> some (Sequence & LazySequenceProtocol) +} + +public struct LazyCountDuplicatesSequence<Base: Sequence> + : LazySequenceProtocol +{ /*...*/ } + +public struct CountDuplicatesIterator<Base: IteratorProtocol> + : IteratorProtocol +{ /*...*/ } +``` + +### Complexity + +Calling the lazy methods, those defined on `LazySequenceProtocol`, is O(_1_). +Calling the eager methods, those returning an array, is O(_n_), where _n_ is the length of the sequence. diff --git a/Sources/Algorithms/Documentation.docc/Selecting.md b/Sources/Algorithms/Documentation.docc/Selecting.md index e5bed603..aca0c242 100644 --- a/Sources/Algorithms/Documentation.docc/Selecting.md +++ b/Sources/Algorithms/Documentation.docc/Selecting.md @@ -18,8 +18,24 @@ or iterate of elements with their indices. 
- ``Swift/Collection/indexed()`` +### Counting each Element in a Sorted Sequence + +- ``Swift/Sequence/countSortedDuplicates(by:)`` +- ``Swift/Sequence/countSortedDuplicates()`` +- ``Swift/LazySequenceProtocol/countSortedDuplicates(by:)`` +- ``Swift/LazySequenceProtocol/countSortedDuplicates()`` + +### Removing Duplicates from a Sorted Sequence + +- ``Swift/Sequence/withoutSortedDuplicates(by:)`` +- ``Swift/Sequence/withoutSortedDuplicates()`` +- ``Swift/LazySequenceProtocol/withoutSortedDuplicates(by:)`` +- ``Swift/LazySequenceProtocol/withoutSortedDuplicates()`` + ### Supporting Types - ``IndexedCollection`` - ``StridingSequence`` - ``StridingCollection`` +- ``LazyCountDuplicatesSequence`` +- ``CountDuplicatesIterator`` diff --git a/Sources/Algorithms/SortedDuplicates.swift b/Sources/Algorithms/SortedDuplicates.swift new file mode 100644 index 00000000..5bfbb35c --- /dev/null +++ b/Sources/Algorithms/SortedDuplicates.swift @@ -0,0 +1,287 @@ +//===----------------------------------------------------------------------===// +// +// This source file is part of the Swift Algorithms open source project +// +// Copyright (c) 2025 Apple Inc. and the Swift project authors +// Licensed under Apache License v2.0 with Runtime Library Exception +// +// See https://swift.org/LICENSE.txt for license information +// +//===----------------------------------------------------------------------===// + +extension Sequence { + /// Assuming this sequence is already sorted along the given predicate, + /// return a collection of the given type, + /// storing the first occurrence of each unique element value in + /// this sequence paired with its total number of occurrences. + /// + /// - Precondition: This sequence must be finite, + /// and be sorted according to the given predicate. + /// + /// - Parameters: + /// - type: A reference to the returned collection's type. + /// - areInIncreasingOrder: The sorting predicate. 
+ /// - Returns: A collection of pairs, + /// one for each element equivalence class present in this sequence, + /// in order of appearance. + /// The first member is the value of the earliest element for + /// an equivalence class. + /// The second member is the number of occurrences of that + /// equivalence class. + /// + /// - Complexity: O(`n`), where *n* is the length of this sequence. + @usableFromInline + func countSortedDuplicates<T>( + storingIn type: T.Type, + by areInIncreasingOrder: (Element, Element) throws -> Bool + ) rethrows -> T + where T: RangeReplaceableCollection, T.Element == (value: Element, count: Int) + { + try withoutActuallyEscaping(areInIncreasingOrder) { + let sequence = LazyCountDuplicatesSequence(self, by: $0) + var iterator = sequence.makeIterator() + var result = T() + result.reserveCapacity(sequence.underestimatedCount) + while let element = try iterator.throwingNext() { + result.append(element) + } + return result + } + } + + /// Assuming this sequence is already sorted along the given predicate, + /// return an array of each unique element paired with its number of + /// occurrences. + /// + /// - Precondition: This sequence must be finite, + /// and be sorted according to the given predicate. + /// + /// - Parameters: + /// - areInIncreasingOrder: The sorting predicate. + /// - Returns: An array of pairs, + /// one for each element equivalence class present in this sequence, + /// in order of appearance. + /// The first member is the value of the earliest element for + /// an equivalence class. + /// The second member is the number of occurrences of that + /// equivalence class. + /// + /// - Complexity: O(`n`), where *n* is the length of this sequence. 
+ @inlinable + public func countSortedDuplicates( + by areInIncreasingOrder: (Element, Element) throws -> Bool + ) rethrows -> [(value: Element, count: Int)] { + try countSortedDuplicates(storingIn: Array.self, by: areInIncreasingOrder) + } + + /// Assuming this sequence is already sorted along the given predicate, + /// return an array of each unique element, by equivalence class. + /// + /// - Precondition: This sequence must be finite, + /// and be sorted according to the given predicate. + /// + /// - Parameters: + /// - areInIncreasingOrder: The sorting predicate. + /// + /// - Returns: An array with the earliest element in this sequence for + /// each equivalence class. + /// + /// - Complexity: O(`n`), where *n* is the length of this sequence. + @inlinable + public func withoutSortedDuplicates( + by areInIncreasingOrder: (Element, Element) throws -> Bool + ) rethrows -> [Element] { + try countSortedDuplicates(by: areInIncreasingOrder).map(\.value) + } +} + +extension Sequence where Element: Comparable { + /// Assuming this sequence is already sorted, + /// return an array of each unique value paired with its number of + /// occurrences. + /// + /// - Precondition: This sequence must be finite and sorted. + /// + /// - Returns: An array of pairs, + /// one for each unique element value in this sequence, + /// in order of appearance. + /// The first member is the earliest element for a value. + /// The second member is the count of that value's occurrences. + /// + /// - Complexity: O(`n`), where *n* is the length of this sequence. + @inlinable + public func countSortedDuplicates() -> [(value: Element, count: Int)] { + countSortedDuplicates(by: <) + } + + /// Assuming this sequence is already sorted, + /// return an array of the first elements of each unique value. + /// + /// - Precondition: This sequence must be finite and sorted. + /// + /// - Parameters: + /// - areInIncreasingOrder: The sorting predicate. 
+ /// + /// - Returns: An array with the earliest element in this sequence for + /// each value. + /// + /// - Complexity: O(`n`), where *n* is the length of this sequence. + @inlinable + public func withoutSortedDuplicates() -> [Element] { + withoutSortedDuplicates(by: <) + } +} + +extension LazySequenceProtocol { + /// Assuming this sequence is already sorted along the given predicate, + /// return a sequence that will lazily generate each unique + /// element paired with its number of occurrences. + /// + /// - Precondition: This sequence is sorted according to the given predicate, + /// and cannot end with an infinite run of a single equivalence class. + /// + /// - Parameters: + /// - areInIncreasingOrder: The sorting predicate. + /// + /// - Returns: A sequence that lazily generates the first element of + /// each equivalence class present in this sequence paired with + /// the number of occurrences for that class. + @inlinable + public func countSortedDuplicates( + by areInIncreasingOrder: @escaping (Element, Element) -> Bool + ) -> LazyCountDuplicatesSequence<Elements> { + .init(elements, by: areInIncreasingOrder) + } + + /// Assuming this sequence is already sorted along the given predicate, + /// return a sequence that will lazily vend each unique element. + /// + /// - Precondition: This sequence is sorted according to the given predicate, + /// and cannot end with an infinite run of a single equivalence class. + /// + /// - Parameters: + /// - areInIncreasingOrder: The sorting predicate. + /// + /// - Returns: A sequence that lazily generates the first element of + /// each equivalence class present in this sequence. 
+ @inlinable + public func withoutSortedDuplicates( + by areInIncreasingOrder: @escaping (Element, Element) -> Bool + ) -> some (Sequence & LazySequenceProtocol) { + countSortedDuplicates(by: areInIncreasingOrder).lazy.map(\.value) + } +} + +extension LazySequenceProtocol where Element: Comparable { + /// Assuming this sequence is already sorted, + /// return a sequence that will lazily generate each unique value + /// paired with its number of occurrences. + /// + /// - Precondition: This sequence is sorted, + /// and cannot end with an infinite run of a single value. + /// + /// - Returns: A sequence that lazily generates the first element of + /// each value paired with the count of that value's occurrences. + @inlinable + public func countSortedDuplicates() -> LazyCountDuplicatesSequence<Elements> { + countSortedDuplicates(by: <) + } + + /// Assuming this sequence is already sorted, + /// return a sequence that will lazily vend each unique value. + /// + /// - Precondition: This sequence is sorted, + /// and cannot end with an infinite run of a single value. + /// + /// - Returns: A sequence that lazily generates the first element of + /// each value. + @inlinable + public func withoutSortedDuplicates() -> some ( + Sequence & LazySequenceProtocol + ) { + withoutSortedDuplicates(by: <) + } +} + +// MARK: - Sequence + +/// Lazily vends the count of each run of duplicate values from +/// a sorted source. +public struct LazyCountDuplicatesSequence<Base: Sequence> { + /// The predicate by which `base` is sorted. + let areInIncreasingOrder: (Base.Element, Base.Element) throws -> Bool + /// The source of elements, which must be sorted by `areInIncreasingOrder`. + var base: Base + + /// Creates a sequence based on the given sequence, + /// which must be sorted by the given predicate, + /// that'll vend each unique element value and that value's appearance count. 
+ @usableFromInline + init( + _ base: Base, + by areInIncreasingOrder: @escaping (Base.Element, Base.Element) throws -> + Bool + ) { + self.base = base + self.areInIncreasingOrder = areInIncreasingOrder + } +} + +extension LazyCountDuplicatesSequence: LazySequenceProtocol { + public var underestimatedCount: Int { + base.underestimatedCount.signum() + } + + public func makeIterator() -> CountDuplicatesIterator<Base.Iterator> { + .init(base.makeIterator(), by: areInIncreasingOrder) + } +} + +// MARK: - Iterator + +/// Vends the count of each run of duplicate values from a sorted source. +public struct CountDuplicatesIterator<Base: IteratorProtocol> { + /// The predicate by which `base` is sorted. + let areInIncreasingOrder: (Base.Element, Base.Element) throws -> Bool + /// The source of elements, which must be sorted by `areInIncreasingOrder`. + var base: Base + /// The last element read, for comparisons. + var mostRecent: Base.Element? + + /// Creates an iterator based on the given iterator, + /// whose virtual sequence must be sorted by the given predicate, + /// which counts the length of each run of duplicate values. + init( + _ base: Base, + by areInIncreasingOrder: @escaping (Base.Element, Base.Element) throws -> + Bool + ) { + self.base = base + self.areInIncreasingOrder = areInIncreasingOrder + } +} + +extension CountDuplicatesIterator: IteratorProtocol { + public mutating func next() -> (value: Base.Element, count: Int)? { + try! throwingNext() + } + + /// Extracts the next element that isn't equivalent to + /// the last unique one extracted. + mutating func throwingNext() throws -> Element? { + mostRecent = mostRecent ?? 
base.next() + guard let last = mostRecent else { return nil } + + var count = 1 + while let current = base.next() { + if try areInIncreasingOrder(last, current) { + mostRecent = current + return (last, count) + } else { + count += 1 + } + } + mostRecent = nil + return (last, count) + } +} diff --git a/Tests/SwiftAlgorithmsTests/SortedDuplicatesTests.swift b/Tests/SwiftAlgorithmsTests/SortedDuplicatesTests.swift new file mode 100644 index 00000000..bf7b8543 --- /dev/null +++ b/Tests/SwiftAlgorithmsTests/SortedDuplicatesTests.swift @@ -0,0 +1,82 @@ +//===----------------------------------------------------------------------===// +// +// This source file is part of the Swift Algorithms open source project +// +// Copyright (c) 2025 Apple Inc. and the Swift project authors +// Licensed under Apache License v2.0 with Runtime Library Exception +// +// See https://swift.org/LICENSE.txt for license information +// +//===----------------------------------------------------------------------===// + +import XCTest + +@testable import Algorithms + +final class SortedDuplicatesTests: XCTestCase { + /// Test counting over an empty sequence. + func testEmpty() { + let emptyString = "" + let emptyStringCounts = emptyString.countSortedDuplicates() + expectEqualCollections(emptyStringCounts.map(\.value), []) + expectEqualCollections(emptyStringCounts.map(\.count), []) + expectEqualCollections(emptyString.withoutSortedDuplicates(), []) + + let lazyEmptyStringCounts = emptyString.lazy.countSortedDuplicates() + expectEqualSequences(lazyEmptyStringCounts.map(\.value), []) + expectEqualSequences(lazyEmptyStringCounts.map(\.count), []) + expectEqualSequences(emptyString.lazy.withoutSortedDuplicates(), []) + } + + /// Test counting over a single-element sequence. 
+ func testSingle() { + let aString = "a" + let aStringCounts = aString.countSortedDuplicates() + let aStringValues = aString.withoutSortedDuplicates() + expectEqualCollections(aStringCounts.map(\.value), ["a"]) + expectEqualCollections(aStringCounts.map(\.count), [1]) + expectEqualCollections(aStringValues, ["a"]) + + let lazyAStringCounts = aString.lazy.countSortedDuplicates() + expectEqualSequences(lazyAStringCounts.map(\.value), ["a"]) + expectEqualSequences(lazyAStringCounts.map(\.count), [1]) + expectEqualSequences(aString.lazy.withoutSortedDuplicates(), ["a"]) + } + + /// Test counting over a repeated element. + func testRepeat() { + let count = 20 + let letters = repeatElement("b" as Character, count: count) + let lettersCounts = letters.countSortedDuplicates() + let lazyLettersCounts = letters.lazy.countSortedDuplicates() + expectEqualCollections(lettersCounts.map(\.value), ["b"]) + expectEqualCollections(lettersCounts.map(\.count), [count]) + expectEqualCollections(letters.withoutSortedDuplicates(), ["b"]) + expectEqualSequences(lazyLettersCounts.map(\.value), ["b"]) + expectEqualSequences(lazyLettersCounts.map(\.count), [count]) + expectEqualSequences(letters.lazy.withoutSortedDuplicates(), ["b"]) + } + + /// Test multiple elements. 
+ func testMultiple() { + let sample = "Xacccddffffxzz" + let sampleCounts = sample.countSortedDuplicates() + let expected: [(value: Character, count: Int)] = [ + ("X", 1), + ("a", 1), + ("c", 3), + ("d", 2), + ("f", 4), + ("x", 1), + ("z", 2), + ] + expectEqualCollections(sampleCounts.map(\.value), expected.map(\.0)) + expectEqualCollections(sampleCounts.map(\.count), expected.map(\.1)) + expectEqualCollections(sample.withoutSortedDuplicates(), "Xacdfxz") + + let lazySampleCounts = sample.lazy.countSortedDuplicates() + expectEqualSequences(lazySampleCounts.map(\.value), expected.map(\.0)) + expectEqualSequences(lazySampleCounts.map(\.count), expected.map(\.1)) + expectEqualSequences(sample.lazy.withoutSortedDuplicates(), "Xacdfxz") + } +} From 7677893c7baf706a1f151777eecf2634635aa185 Mon Sep 17 00:00:00 2001 From: Daryle Walker Date: Tue, 6 May 2025 11:47:39 -0600 Subject: [PATCH 2/6] Remove copy/paste error --- Sources/Algorithms/SortedDuplicates.swift | 3 --- 1 file changed, 3 deletions(-) diff --git a/Sources/Algorithms/SortedDuplicates.swift b/Sources/Algorithms/SortedDuplicates.swift index 5bfbb35c..11ff6738 100644 --- a/Sources/Algorithms/SortedDuplicates.swift +++ b/Sources/Algorithms/SortedDuplicates.swift @@ -119,9 +119,6 @@ extension Sequence where Element: Comparable { /// /// - Precondition: This sequence must be finite and sorted. /// - /// - Parameters: - /// - areInIncreasingOrder: The sorting predicate. - /// /// - Returns: An array with the earliest element in this sequence for /// each value. /// From b4370c94ea1b716a2c4a8ff0da85296a9c8fa06d Mon Sep 17 00:00:00 2001 From: Daryle Walker Date: Tue, 6 May 2025 11:50:02 -0600 Subject: [PATCH 3/6] Clean up documentation; explain use of try! 
--- Sources/Algorithms/SortedDuplicates.swift | 19 ++++++++----------- 1 file changed, 8 insertions(+), 11 deletions(-) diff --git a/Sources/Algorithms/SortedDuplicates.swift b/Sources/Algorithms/SortedDuplicates.swift index 11ff6738..66dc3341 100644 --- a/Sources/Algorithms/SortedDuplicates.swift +++ b/Sources/Algorithms/SortedDuplicates.swift @@ -18,9 +18,8 @@ extension Sequence { /// - Precondition: This sequence must be finite, /// and be sorted according to the given predicate. /// - /// - Parameters: - /// - type: A reference to the returned collection's type. - /// - areInIncreasingOrder: The sorting predicate. + /// - Parameter type: A reference to the returned collection's type. + /// - Parameter areInIncreasingOrder: The sorting predicate. /// - Returns: A collection of pairs, /// one for each element equivalence class present in this sequence, /// in order of appearance. @@ -56,8 +55,7 @@ extension Sequence { /// - Precondition: This sequence must be finite, /// and be sorted according to the given predicate. /// - /// - Parameters: - /// - areInIncreasingOrder: The sorting predicate. + /// - Parameter areInIncreasingOrder: The sorting predicate. /// - Returns: An array of pairs, /// one for each element equivalence class present in this sequence, /// in order of appearance. @@ -80,8 +78,7 @@ extension Sequence { /// - Precondition: This sequence must be finite, /// and be sorted according to the given predicate. /// - /// - Parameters: - /// - areInIncreasingOrder: The sorting predicate. + /// - Parameter areInIncreasingOrder: The sorting predicate. /// /// - Returns: An array with the earliest element in this sequence for /// each equivalence class. @@ -137,8 +134,7 @@ extension LazySequenceProtocol { /// - Precondition: This sequence is sorted according to the given predicate, /// and cannot end with an infinite run of a single equivalence class. /// - /// - Parameters: - /// - areInIncreasingOrder: The sorting predicate. 
+ /// - Parameter areInIncreasingOrder: The sorting predicate. /// /// - Returns: A sequence that lazily generates the first element of /// each equivalence class present in this sequence paired with @@ -156,8 +152,7 @@ extension LazySequenceProtocol { /// - Precondition: This sequence is sorted according to the given predicate, /// and cannot end with an infinite run of a single equivalence class. /// - /// - Parameters: - /// - areInIncreasingOrder: The sorting predicate. + /// - Parameter areInIncreasingOrder: The sorting predicate. /// /// - Returns: A sequence that lazily generates the first element of /// each equivalence class present in this sequence. @@ -260,6 +255,8 @@ public struct CountDuplicatesIterator<Base: IteratorProtocol> { extension CountDuplicatesIterator: IteratorProtocol { public mutating func next() -> (value: Base.Element, count: Int)? { + // NOTE: This method is called only when the predicate isn't `throw`-ing, + // so the forced `try` is OK. try! throwingNext() } From 8f1c906fbd0126cc277a7158e5bd97e48ae9e552 Mon Sep 17 00:00:00 2001 From: Daryle Walker Date: Tue, 6 May 2025 12:12:09 -0600 Subject: [PATCH 4/6] Rename duplicate-stripping functions --- Sources/Algorithms/SortedDuplicates.swift | 12 ++++++------ .../SortedDuplicatesTests.swift | 16 ++++++++-------- 2 files changed, 14 insertions(+), 14 deletions(-) diff --git a/Sources/Algorithms/SortedDuplicates.swift b/Sources/Algorithms/SortedDuplicates.swift index 66dc3341..663006fd 100644 --- a/Sources/Algorithms/SortedDuplicates.swift +++ b/Sources/Algorithms/SortedDuplicates.swift @@ -85,7 +85,7 @@ extension Sequence { /// /// - Complexity: O(`n`), where *n* is the length of this sequence. 
@inlinable - public func withoutSortedDuplicates( + public func deduplicateSorted( by areInIncreasingOrder: (Element, Element) throws -> Bool ) rethrows -> [Element] { try countSortedDuplicates(by: areInIncreasingOrder).map(\.value) @@ -121,8 +121,8 @@ extension Sequence where Element: Comparable { /// /// - Complexity: O(`n`), where *n* is the length of this sequence. @inlinable - public func withoutSortedDuplicates() -> [Element] { - withoutSortedDuplicates(by: <) + public func deduplicateSorted() -> [Element] { + deduplicateSorted(by: <) } } @@ -157,7 +157,7 @@ extension LazySequenceProtocol { /// - Returns: A sequence that lazily generates the first element of /// each equivalence class present in this sequence. @inlinable - public func withoutSortedDuplicates( + public func deduplicateSorted( by areInIncreasingOrder: @escaping (Element, Element) -> Bool ) -> some (Sequence & LazySequenceProtocol) { countSortedDuplicates(by: areInIncreasingOrder).lazy.map(\.value) @@ -188,10 +188,10 @@ extension LazySequenceProtocol where Element: Comparable { /// - Returns: A sequence that lazily generates the first element of /// each value. 
@inlinable - public func withoutSortedDuplicates() -> some ( + public func deduplicateSorted() -> some ( Sequence & LazySequenceProtocol ) { - withoutSortedDuplicates(by: <) + deduplicateSorted(by: <) } } diff --git a/Tests/SwiftAlgorithmsTests/SortedDuplicatesTests.swift b/Tests/SwiftAlgorithmsTests/SortedDuplicatesTests.swift index bf7b8543..4df2d9bc 100644 --- a/Tests/SwiftAlgorithmsTests/SortedDuplicatesTests.swift +++ b/Tests/SwiftAlgorithmsTests/SortedDuplicatesTests.swift @@ -20,19 +20,19 @@ final class SortedDuplicatesTests: XCTestCase { let emptyStringCounts = emptyString.countSortedDuplicates() expectEqualCollections(emptyStringCounts.map(\.value), []) expectEqualCollections(emptyStringCounts.map(\.count), []) - expectEqualCollections(emptyString.withoutSortedDuplicates(), []) + expectEqualCollections(emptyString.deduplicateSorted(), []) let lazyEmptyStringCounts = emptyString.lazy.countSortedDuplicates() expectEqualSequences(lazyEmptyStringCounts.map(\.value), []) expectEqualSequences(lazyEmptyStringCounts.map(\.count), []) - expectEqualSequences(emptyString.lazy.withoutSortedDuplicates(), []) + expectEqualSequences(emptyString.lazy.deduplicateSorted(), []) } /// Test counting over a single-element sequence. 
func testSingle() { let aString = "a" let aStringCounts = aString.countSortedDuplicates() - let aStringValues = aString.withoutSortedDuplicates() + let aStringValues = aString.deduplicateSorted() expectEqualCollections(aStringCounts.map(\.value), ["a"]) expectEqualCollections(aStringCounts.map(\.count), [1]) expectEqualCollections(aStringValues, ["a"]) @@ -40,7 +40,7 @@ final class SortedDuplicatesTests: XCTestCase { let lazyAStringCounts = aString.lazy.countSortedDuplicates() expectEqualSequences(lazyAStringCounts.map(\.value), ["a"]) expectEqualSequences(lazyAStringCounts.map(\.count), [1]) - expectEqualSequences(aString.lazy.withoutSortedDuplicates(), ["a"]) + expectEqualSequences(aString.lazy.deduplicateSorted(), ["a"]) } /// Test counting over a repeated element. @@ -51,10 +51,10 @@ final class SortedDuplicatesTests: XCTestCase { let lazyLettersCounts = letters.lazy.countSortedDuplicates() expectEqualCollections(lettersCounts.map(\.value), ["b"]) expectEqualCollections(lettersCounts.map(\.count), [count]) - expectEqualCollections(letters.withoutSortedDuplicates(), ["b"]) + expectEqualCollections(letters.deduplicateSorted(), ["b"]) expectEqualSequences(lazyLettersCounts.map(\.value), ["b"]) expectEqualSequences(lazyLettersCounts.map(\.count), [count]) - expectEqualSequences(letters.lazy.withoutSortedDuplicates(), ["b"]) + expectEqualSequences(letters.lazy.deduplicateSorted(), ["b"]) } /// Test multiple elements. 
@@ -72,11 +72,11 @@ final class SortedDuplicatesTests: XCTestCase {
     ]
     expectEqualCollections(sampleCounts.map(\.value), expected.map(\.0))
     expectEqualCollections(sampleCounts.map(\.count), expected.map(\.1))
-    expectEqualCollections(sample.withoutSortedDuplicates(), "Xacdfxz")
+    expectEqualCollections(sample.deduplicateSorted(), "Xacdfxz")
 
     let lazySampleCounts = sample.lazy.countSortedDuplicates()
     expectEqualSequences(lazySampleCounts.map(\.value), expected.map(\.0))
     expectEqualSequences(lazySampleCounts.map(\.count), expected.map(\.1))
-    expectEqualSequences(sample.lazy.withoutSortedDuplicates(), "Xacdfxz")
+    expectEqualSequences(sample.lazy.deduplicateSorted(), "Xacdfxz")
   }
 }

From 2c089e3f141adcaac46f6696378856489de1fc02 Mon Sep 17 00:00:00 2001
From: Daryle Walker
Date: Tue, 6 May 2025 13:26:06 -0600
Subject: [PATCH 5/6] Change functions' categories; add test for example

---
 .../Algorithms/Documentation.docc/Filtering.md  | 15 +++++++++++++++
 Sources/Algorithms/Documentation.docc/Keying.md | 12 ++++++++++++
 .../Algorithms/Documentation.docc/Selecting.md  | 16 ----------------
 .../SortedDuplicatesTests.swift                 |  9 +++++++++
 4 files changed, 36 insertions(+), 16 deletions(-)

diff --git a/Sources/Algorithms/Documentation.docc/Filtering.md b/Sources/Algorithms/Documentation.docc/Filtering.md
index 85073ab8..9fbb4b34 100644
--- a/Sources/Algorithms/Documentation.docc/Filtering.md
+++ b/Sources/Algorithms/Documentation.docc/Filtering.md
@@ -21,6 +21,14 @@ let withNoNils = array.compacted()
 // Array(withNoNils) == [10, 30, 2, 3, 5]
 ```
 
+The `withoutSortedDuplicates()` methods remove consecutive elements of the same equivalence class from an already sorted sequence, turning a possibly non-decreasing sequence to a strictly-increasing one. The sorting predicate can be supplied.
+
+```swift
+let numbers = [0, 1, 2, 2, 2, 3, 5, 6, 6, 9, 10, 10]
+let deduplicated = numbers.withoutSortedDuplicates()
+// Array(deduplicated) == [0, 1, 2, 3, 5, 6, 9, 10]
+```
+
 ## Topics
 
 ### Uniquing Elements
@@ -34,6 +42,13 @@ let withNoNils = array.compacted()
 - ``Swift/Collection/compacted()``
 - ``Swift/Sequence/compacted()``
 
+### Removing Duplicates from a Sorted Sequence
+
+- ``Swift/Sequence/withoutSortedDuplicates(by:)``
+- ``Swift/Sequence/withoutSortedDuplicates()``
+- ``Swift/LazySequenceProtocol/withoutSortedDuplicates(by:)``
+- ``Swift/LazySequenceProtocol/withoutSortedDuplicates()``
+
 ### Supporting Types
 
 - ``UniquedSequence``
diff --git a/Sources/Algorithms/Documentation.docc/Keying.md b/Sources/Algorithms/Documentation.docc/Keying.md
index aa296161..8625f19f 100644
--- a/Sources/Algorithms/Documentation.docc/Keying.md
+++ b/Sources/Algorithms/Documentation.docc/Keying.md
@@ -12,3 +12,15 @@ Convert a sequence to a dictionary, providing keys to individual elements or to
 ### Grouping Elements by Key
 
 - ``Swift/Sequence/grouped(by:)``
+
+### Counting each Element in a Sorted Sequence
+
+- ``Swift/Sequence/countSortedDuplicates(by:)``
+- ``Swift/Sequence/countSortedDuplicates()``
+- ``Swift/LazySequenceProtocol/countSortedDuplicates(by:)``
+- ``Swift/LazySequenceProtocol/countSortedDuplicates()``
+
+### Supporting Types
+
+- ``LazyCountDuplicatesSequence``
+- ``CountDuplicatesIterator``
diff --git a/Sources/Algorithms/Documentation.docc/Selecting.md b/Sources/Algorithms/Documentation.docc/Selecting.md
index aca0c242..e5bed603 100644
--- a/Sources/Algorithms/Documentation.docc/Selecting.md
+++ b/Sources/Algorithms/Documentation.docc/Selecting.md
@@ -18,24 +18,8 @@ or iterate of elements with their indices.
 
 - ``Swift/Collection/indexed()``
 
-### Counting each Element in a Sorted Sequence
-
-- ``Swift/Sequence/countSortedDuplicates(by:)``
-- ``Swift/Sequence/countSortedDuplicates()``
-- ``Swift/LazySequenceProtocol/countSortedDuplicates(by:)``
-- ``Swift/LazySequenceProtocol/countSortedDuplicates()``
-
-### Removing Duplicates from a Sorted Sequence
-
-- ``Swift/Sequence/withoutSortedDuplicates(by:)``
-- ``Swift/Sequence/withoutSortedDuplicates()``
-- ``Swift/LazySequenceProtocol/withoutSortedDuplicates(by:)``
-- ``Swift/LazySequenceProtocol/withoutSortedDuplicates()``
-
 ### Supporting Types
 
 - ``IndexedCollection``
 - ``StridingSequence``
 - ``StridingCollection``
-- ``LazyCountDuplicatesSequence``
-- ``CountDuplicatesIterator``
diff --git a/Tests/SwiftAlgorithmsTests/SortedDuplicatesTests.swift b/Tests/SwiftAlgorithmsTests/SortedDuplicatesTests.swift
index 4df2d9bc..976f3eb4 100644
--- a/Tests/SwiftAlgorithmsTests/SortedDuplicatesTests.swift
+++ b/Tests/SwiftAlgorithmsTests/SortedDuplicatesTests.swift
@@ -79,4 +79,13 @@ final class SortedDuplicatesTests: XCTestCase {
     expectEqualSequences(lazySampleCounts.map(\.count), expected.map(\.1))
     expectEqualSequences(sample.lazy.deduplicateSorted(), "Xacdfxz")
   }
+
+  /// Test the example code from the Overview.
+  func testOverviewExample() {
+    let numbers = [0, 1, 2, 2, 2, 3, 5, 6, 6, 9, 10, 10]
+    let deduplicated = numbers.withoutSortedDuplicates()
+    // Array(deduplicated) == [0, 1, 2, 3, 5, 6, 9, 10]
+
+    expectEqualSequences(deduplicated, [0, 1, 2, 3, 5, 6, 9, 10])
+  }
 }

From 5d2b7085b70271253dd7987fb468d6f55adc02ad Mon Sep 17 00:00:00 2001
From: Daryle Walker
Date: Tue, 6 May 2025 13:28:23 -0600
Subject: [PATCH 6/6] Continue renaming function

---
 CHANGELOG.md                                         |  2 +-
 Guides/README.md                                     |  2 +-
 Guides/SortedDuplicates.md                           |  8 ++++----
 Sources/Algorithms/Documentation.docc/Filtering.md   | 12 ++++++------
 .../SwiftAlgorithmsTests/SortedDuplicatesTests.swift |  2 +-
 5 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index c39be4df..c1053d8e 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -13,7 +13,7 @@ This project follows semantic versioning.
 - Bidirectional collections have a new `ends(with:)` method that matches the
   behavior of the standard library's `starts(with:)` method. ([#224])
 - Sequences that are already sorted can use the `countSortedDuplicates` and
-  `withoutSortedDuplicates` methods, with eager and lazy versions.
+  `deduplicateSorted` methods, with eager and lazy versions.
   The former returns each unique value paired with the count of
   that value's occurances.
   The latter returns each unique value,
diff --git a/Guides/README.md b/Guides/README.md
index 8a245609..effbfbb1 100644
--- a/Guides/README.md
+++ b/Guides/README.md
@@ -25,6 +25,7 @@ These guides describe the design and intention behind the APIs included in the `
 #### Subsetting operations
 
 - [`compacted()`](https://github.com/apple/swift-algorithms/blob/main/Guides/Compacted.md): Drops the `nil`s from a sequence or collection, unwrapping the remaining elements.
+- [`deduplicateSorted()`, `deduplicateSorted(by:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/SortedDuplicates.md): Given an already-sorted sequence and the sorting predicate, reduce all runs of a unique value to a single element each. Has eager and lazy variants.
 - [`partitioned(by:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Partition.md): Returns the elements in a sequence or collection that do and do not match a given predicate.
 - [`randomSample(count:)`, `randomSample(count:using:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/RandomSampling.md): Randomly selects a specific number of elements from a collection.
 - [`randomStableSample(count:)`, `randomStableSample(count:using:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/RandomSampling.md): Randomly selects a specific number of elements from a collection, preserving their original relative order.
@@ -32,7 +33,6 @@ These guides describe the design and intention behind the APIs included in the `
 - [`suffix(while:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Suffix.md): Returns the suffix of a collection where all element pass a given predicate.
 - [`trimmingPrefix(while:)`, `trimmingSuffix(while)`, `trimming(while:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Trim.md): Returns a slice by trimming elements from a collection's start, end, or both. The mutating `trim...` methods trim a collection in place.
 - [`uniqued()`, `uniqued(on:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Unique.md): The unique elements of a collection, preserving their order.
-- [`withoutSortedDuplicates()`, `withoutSortedDuplicates(by:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/SortedDuplicates.md): Given an already-sorted sequence and the sorting predicate, reduce all runs of a unique value to a single element each. Has eager and lazy variants.
 - [`minAndMax()`, `minAndMax(by:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/MinMax.md): Returns the smallest and largest elements of a sequence.
 
 #### Partial sorting
diff --git a/Guides/SortedDuplicates.md b/Guides/SortedDuplicates.md
index 1c6a3e7d..31d0142d 100644
--- a/Guides/SortedDuplicates.md
+++ b/Guides/SortedDuplicates.md
@@ -21,7 +21,7 @@ extension Sequence {
     by areInIncreasingOrder: (Element, Element) throws -> Bool
   ) rethrows -> [(value: Element, count: Int)]
 
-  public func withoutSortedDuplicates(
+  public func deduplicateSorted(
     by areInIncreasingOrder: (Element, Element) throws -> Bool
   ) rethrows -> [Element]
 }
@@ -29,7 +29,7 @@ extension Sequence {
 extension Sequence where Self.Element : Comparable {
   public func countSortedDuplicates() -> [(value: Element, count: Int)]
 
-  public func withoutSortedDuplicates() -> [Element]
+  public func deduplicateSorted() -> [Element]
 }
 
 extension LazySequenceProtocol {
@@ -37,7 +37,7 @@ extension LazySequenceProtocol {
     by areInIncreasingOrder: @escaping (Element, Element) -> Bool
   ) -> LazyCountDuplicatesSequence
 
-  public func withoutSortedDuplicates(
+  public func deduplicateSorted(
     by areInIncreasingOrder: @escaping (Element, Element) -> Bool
   ) -> some (Sequence & LazySequenceProtocol)
 }
@@ -46,7 +46,7 @@ extension LazySequenceProtocol where Self.Element : Comparable {
   public func countSortedDuplicates()
     -> LazyCountDuplicatesSequence
 
-  public func withoutSortedDuplicates()
+  public func deduplicateSorted()
     -> some (Sequence & LazySequenceProtocol)
 }
diff --git a/Sources/Algorithms/Documentation.docc/Filtering.md b/Sources/Algorithms/Documentation.docc/Filtering.md
index 9fbb4b34..7eb23a5a 100644
--- a/Sources/Algorithms/Documentation.docc/Filtering.md
+++ b/Sources/Algorithms/Documentation.docc/Filtering.md
@@ -21,11 +21,11 @@ let withNoNils = array.compacted()
 // Array(withNoNils) == [10, 30, 2, 3, 5]
 ```
 
-The `withoutSortedDuplicates()` methods remove consecutive elements of the same equivalence class from an already sorted sequence, turning a possibly non-decreasing sequence to a strictly-increasing one. The sorting predicate can be supplied.
+The `deduplicateSorted()` methods remove consecutive elements of the same equivalence class from an already sorted sequence, turning a possibly non-decreasing sequence to a strictly-increasing one. The sorting predicate can be supplied.
 
 ```swift
 let numbers = [0, 1, 2, 2, 2, 3, 5, 6, 6, 9, 10, 10]
-let deduplicated = numbers.withoutSortedDuplicates()
+let deduplicated = numbers.deduplicateSorted()
 // Array(deduplicated) == [0, 1, 2, 3, 5, 6, 9, 10]
 ```
 
@@ -44,10 +44,10 @@ let deduplicated = numbers.withoutSortedDuplicates()
 
 ### Removing Duplicates from a Sorted Sequence
 
-- ``Swift/Sequence/withoutSortedDuplicates(by:)``
-- ``Swift/Sequence/withoutSortedDuplicates()``
-- ``Swift/LazySequenceProtocol/withoutSortedDuplicates(by:)``
-- ``Swift/LazySequenceProtocol/withoutSortedDuplicates()``
+- ``Swift/Sequence/deduplicateSorted(by:)``
+- ``Swift/Sequence/deduplicateSorted()``
+- ``Swift/LazySequenceProtocol/deduplicateSorted(by:)``
+- ``Swift/LazySequenceProtocol/deduplicateSorted()``
 
 ### Supporting Types
 
diff --git a/Tests/SwiftAlgorithmsTests/SortedDuplicatesTests.swift b/Tests/SwiftAlgorithmsTests/SortedDuplicatesTests.swift
index 976f3eb4..824b7300 100644
--- a/Tests/SwiftAlgorithmsTests/SortedDuplicatesTests.swift
+++ b/Tests/SwiftAlgorithmsTests/SortedDuplicatesTests.swift
@@ -83,7 +83,7 @@ final class SortedDuplicatesTests: XCTestCase {
   /// Test the example code from the Overview.
   func testOverviewExample() {
     let numbers = [0, 1, 2, 2, 2, 3, 5, 6, 6, 9, 10, 10]
-    let deduplicated = numbers.withoutSortedDuplicates()
+    let deduplicated = numbers.deduplicateSorted()
     // Array(deduplicated) == [0, 1, 2, 3, 5, 6, 9, 10]
 
     expectEqualSequences(deduplicated, [0, 1, 2, 3, 5, 6, 9, 10])
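
Note for readers of this series: the behavior that `testOverviewExample` checks can be approximated without the library. The block below is a minimal standalone sketch, not the code in `Sources/Algorithms/SortedDuplicates.swift`. The helper name `countRuns(_:by:)` is hypothetical. It assumes that two adjacent elements of an already-sorted input are equivalent when the earlier one is not ordered before the later one, and that the first element of each run is the one kept; the patch itself may differ on both points.

```swift
// Hypothetical stand-in for the eager `countSortedDuplicates(by:)` in this
// series; an illustration only, not the library implementation.
func countRuns<S: Sequence>(
  _ sorted: S,
  by areInIncreasingOrder: (S.Element, S.Element) throws -> Bool
) rethrows -> [(value: S.Element, count: Int)] {
  var result: [(value: S.Element, count: Int)] = []
  for element in sorted {
    // Assumption: the input is sorted, so an element that is not ordered
    // after the current run's representative is equivalent to it.
    if let last = result.last, try !areInIncreasingOrder(last.value, element) {
      result[result.count - 1].count += 1
    } else {
      // A strictly greater element starts a new run.
      result.append((value: element, count: 1))
    }
  }
  return result
}

let numbers = [0, 1, 2, 2, 2, 3, 5, 6, 6, 9, 10, 10]
let runs = countRuns(numbers, by: <)

// Dropping the counts leaves one element per run of equivalent values.
let deduplicated = runs.map(\.value)
// deduplicated == [0, 1, 2, 3, 5, 6, 9, 10]
// runs.map(\.count) == [1, 1, 3, 1, 1, 2, 1, 2]
```

In this sketch deduplication is just the counting pass with the counts discarded, which is one way to see why the series documents the two methods together. Passing an unsorted input gives run-length counts of adjacent equivalent elements rather than global counts.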