Commit 3864606

Add "sortedPrefix(_:by)" to Collection (#9)
1 parent e1c421c commit 3864606

6 files changed

+452
-0
lines changed


Guides/SortedPrefix.md

+48
@@ -0,0 +1,48 @@
# Sorted Prefix

[[Source](https://github.com/apple/swift-algorithms/blob/main/Sources/Algorithms/SortedPrefix.swift) |
[Tests](https://github.com/apple/swift-algorithms/blob/main/Tests/SwiftAlgorithmsTests/PartialSortTests.swift)]

Returns the first k elements of this collection when it's sorted.

If you need to sort a collection but only need access to a prefix of its elements, using this method can give you a performance boost over sorting the entire collection. The order of equal elements is guaranteed to be preserved.

```swift
let numbers = [7,1,6,2,8,3,9]
let smallestThree = numbers.sortedPrefix(3, by: <)
// [1, 2, 3]
```
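
For reference, the result is meant to match what you would get by sorting the whole collection and taking a prefix. The following is a minimal standard-library-only sketch of that baseline; the helper name `referenceSortedPrefix` is hypothetical and is not part of the package.

```swift
/// Hypothetical baseline helper (not part of swift-algorithms):
/// sorts everything, then keeps at most `count` elements.
func referenceSortedPrefix<C: Collection>(
  _ elements: C, _ count: Int
) -> [C.Element] where C.Element: Comparable {
  // `prefix(_:)` clamps to the collection's length, so an oversized
  // `count` simply returns every element in sorted order.
  Array(elements.sorted().prefix(count))
}

let numbers = [7, 1, 6, 2, 8, 3, 9]
let baselineThree = referenceSortedPrefix(numbers, 3)
print(baselineThree) // [1, 2, 3]
```

This baseline always pays for a full sort, which is the cost `sortedPrefix` tries to avoid when k is small.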

## Detailed Design

This adds the `Collection` method shown below:

```swift
extension Collection {
    public func sortedPrefix(_ count: Int, by areInIncreasingOrder: (Element, Element) throws -> Bool) rethrows -> [Element]
}
```

A version of this method for `Comparable` types is also provided:

```swift
extension Collection where Element: Comparable {
    public func sortedPrefix(_ count: Int) -> [Element]
}
```

### Complexity

The algorithm used is based on [Soroush Khanlou's research on this matter](https://khanlou.com/2018/12/analyzing-complexity/). The total complexity is `O(k log k + nk)`, which results in a runtime close to `O(n)` when k is small. When k is large (10% of the collection or more), the method falls back to sorting the entire collection and taking its prefix. Realistically, this means the worst case is `O(n log n)`.
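
To make the two terms of `O(k log k + nk)` concrete, here is a standalone sketch of the bounded-buffer strategy. It is not the package's implementation: the name `smallestSketch` is hypothetical, it takes an `Array` for simplicity, it omits the full-sort fallback, and it uses a hand-written binary search so that it depends only on the standard library. The skip condition is chosen so that an equal element never displaces an earlier one.

```swift
/// Hypothetical sketch of the bounded-buffer strategy (assumed name).
func smallestSketch<T>(
  _ k: Int, of elements: [T], by areInIncreasingOrder: (T, T) -> Bool
) -> [T] {
  guard k > 0 else { return [] }
  let k = min(k, elements.count)

  // Sort the first k elements: the O(k log k) term.
  var result = elements.prefix(k).sorted(by: areInIncreasingOrder)

  for e in elements.dropFirst(k) {
    // Skip anything that is not strictly smaller than the largest kept element.
    if let last = result.last, !areInIncreasingOrder(e, last) { continue }

    // Binary search for the first kept element that is greater than `e`.
    var low = 0
    var high = result.count
    while low < high {
      let mid = (low + high) / 2
      if areInIncreasingOrder(e, result[mid]) { high = mid } else { low = mid + 1 }
    }

    // Drop the largest kept element and insert `e`; shifting elements
    // costs up to O(k) per insertion, which gives the O(nk) term.
    result.removeLast()
    result.insert(e, at: low)
  }
  return result
}

let sketchResult = smallestSketch(3, of: [7, 1, 6, 2, 8, 3, 9], by: <)
print(sketchResult) // [1, 2, 3]
```

When most elements are skipped by the first check, each one costs a single comparison, which is why the runtime stays close to linear for small k.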

Here are some benchmarks we made that demonstrate how this implementation (SmallestM) behaves when k increases (before implementing the fallback):

![Benchmark](Resources/SortedPrefix/FewElements.png)
![Benchmark 2](Resources/SortedPrefix/ManyElements.png)

### Comparison with other languages

**C++:** The `<algorithm>` library defines a `partial_sort` function that rearranges the entire range in place, typically with a partial heap sort, so that its first k elements are the smallest ones in sorted order.

**Python:** The `heapq` module implements a heap-based priority queue that can be used to achieve the same result, for example through `heapq.nsmallest`.
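
For comparison, the heap-based approach that these libraries rely on can be sketched in Swift as follows. This is only an illustration under stated assumptions: the name `heapSmallest` is hypothetical, the function is not part of the package, and unlike `sortedPrefix` it makes no promise about the order of equal elements.

```swift
/// Hypothetical sketch: keep the k smallest elements in a max-heap.
func heapSmallest<T: Comparable>(_ k: Int, of elements: [T]) -> [T] {
  guard k > 0 else { return [] }
  var heap: [T] = []  // max-heap: heap[0] is the largest kept element

  // Move the element at `index` up until its parent is not smaller.
  func siftUp(_ index: Int) {
    var child = index
    while child > 0 {
      let parent = (child - 1) / 2
      guard heap[parent] < heap[child] else { return }
      heap.swapAt(parent, child)
      child = parent
    }
  }

  // Move the element at `index` down until both children are not larger.
  func siftDown(_ index: Int) {
    var parent = index
    while true {
      let left = 2 * parent + 1
      let right = left + 1
      var largest = parent
      if left < heap.count, heap[largest] < heap[left] { largest = left }
      if right < heap.count, heap[largest] < heap[right] { largest = right }
      if largest == parent { return }
      heap.swapAt(parent, largest)
      parent = largest
    }
  }

  for e in elements {
    if heap.count < k {
      // Fill the heap with the first k elements.
      heap.append(e)
      siftUp(heap.count - 1)
    } else if e < heap[0] {
      // Replace the current largest kept element with a smaller one.
      heap[0] = e
      siftDown(0)
    }
  }
  // The heap holds the k smallest elements in heap order; sort them.
  return heap.sorted()
}

let heapResult = heapSmallest(3, of: [7, 1, 6, 2, 8, 3, 9])
print(heapResult) // [1, 2, 3]
```

Each replacement costs `O(log k)` instead of the `O(k)` shift of a sorted buffer, at the price of a final sort and no stability.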

README.md

+4
@@ -28,6 +28,10 @@ Read more about the package, and the intent behind it, in the [announcement on s
- [`randomStableSample(count:)`, `randomStableSample(count:using:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/RandomSampling.md): Randomly selects a specific number of elements from a collection, preserving their original relative order.
- [`uniqued()`, `uniqued(on:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Unique.md): The unique elements of a collection, preserving their order.

+#### Partial sorting
+
+- [`sortedPrefix(_:by:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/SortedPrefix.md): Returns the first k elements of a collection when it's sorted.
+
#### Other useful operations

- [`chunked(by:)`, `chunked(on:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Chunked.md): Eager and lazy operations that break a collection into chunks based on either a binary predicate or when the result of a projection changes.

Sources/Algorithms/SortedPrefix.swift

+99
@@ -0,0 +1,99 @@
//===----------------------------------------------------------------------===//
//
// This source file is part of the Swift Algorithms open source project
//
// Copyright (c) 2020 Apple Inc. and the Swift project authors
// Licensed under Apache License v2.0 with Runtime Library Exception
//
// See https://swift.org/LICENSE.txt for license information
//
//===----------------------------------------------------------------------===//

extension Collection {
  /// Returns the first k elements of this collection when it's sorted using
  /// the given predicate as the comparison between elements.
  ///
  /// This example partially sorts an array of integers to retrieve its three
  /// smallest values:
  ///
  ///     let numbers = [7,1,6,2,8,3,9]
  ///     let smallestThree = numbers.sortedPrefix(3, by: <)
  ///     // [1, 2, 3]
  ///
  /// If you need to sort a collection but only need access to a prefix of its
  /// elements, using this method can give you a performance boost over sorting
  /// the entire collection. The order of equal elements is guaranteed to be
  /// preserved.
  ///
  /// - Parameter count: The number of elements (k) to return.
  /// - Parameter areInIncreasingOrder: A predicate that returns true if its
  ///     first argument should be ordered before its second argument;
  ///     otherwise, false.
  ///
  /// - Complexity: O(k log k + nk)
  public func sortedPrefix(
    _ count: Int,
    by areInIncreasingOrder: (Element, Element) throws -> Bool
  ) rethrows -> [Self.Element] {
    assert(count >= 0, """
      Cannot prefix with a negative amount of elements!
      """
    )

    // Do nothing if we're prefixing nothing.
    guard count > 0 else {
      return []
    }

    // Make sure we are within bounds.
    let prefixCount = Swift.min(count, self.count)

    // If we're attempting to prefix 10% or more of the collection, it's
    // faster to sort everything.
    guard prefixCount < (self.count / 10) else {
      return Array(try sorted(by: areInIncreasingOrder).prefix(prefixCount))
    }

    var result = try self.prefix(prefixCount).sorted(by: areInIncreasingOrder)
    for e in self.dropFirst(prefixCount) {
      // Skip `e` unless it is strictly smaller than the largest kept element.
      // An element equal to `last` must not displace it; otherwise the order
      // of equal elements would not be preserved.
      if let last = result.last, try !areInIncreasingOrder(e, last) {
        continue
      }
      let insertionIndex =
        try result.partitioningIndex { try areInIncreasingOrder(e, $0) }
      let isLastElement = insertionIndex == result.endIndex
      result.removeLast()
      if isLastElement {
        result.append(e)
      } else {
        result.insert(e, at: insertionIndex)
      }
    }

    return result
  }
}

extension Collection where Element: Comparable {
  /// Returns the first k elements of this collection when it's sorted in
  /// ascending order.
  ///
  /// This example partially sorts an array of integers to retrieve its three
  /// smallest values:
  ///
  ///     let numbers = [7,1,6,2,8,3,9]
  ///     let smallestThree = numbers.sortedPrefix(3)
  ///     // [1, 2, 3]
  ///
  /// If you need to sort a collection but only need access to a prefix of its
  /// elements, using this method can give you a performance boost over sorting
  /// the entire collection. The order of equal elements is guaranteed to be
  /// preserved.
  ///
  /// - Parameter count: The number of elements (k) to return.
  ///
  /// - Complexity: O(k log k + nk)
  public func sortedPrefix(_ count: Int) -> [Element] {
    return sortedPrefix(count, by: <)
  }
}
