Skip to content

Commit d359943

Browse files
authored
add parallel-letter-frequency (#61)
1 parent c6c7d9f commit d359943

File tree

15 files changed

+618
-0
lines changed

15 files changed

+618
-0
lines changed

config.json

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -393,6 +393,14 @@
393393
"prerequisites": [],
394394
"difficulty": 6
395395
},
396+
{
397+
"slug": "parallel-letter-frequency",
398+
"name": "Parallel Letter Frequency",
399+
"uuid": "67c85ebf-e5da-4f05-ae86-5a988c2ae499",
400+
"practices": [],
401+
"prerequisites": [],
402+
"difficulty": 6
403+
},
396404
{
397405
"slug": "change",
398406
"name": "Change",
Lines changed: 38 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,38 @@
1+
# Instructions append
2+
3+
## Asynchronous tasks
4+
5+
[Tasks][tasks] are the primary abstraction for writing asynchronous code in Lean.
6+
They are lightweight and may execute in parallel on another thread, or concurrently on the same thread.
7+
8+
Tasks can be either pure or impure.
9+
Pure tasks are created with `Task.spawn`, which accepts a pure computation.
10+
Impure tasks are created from `IO` computations using, for example, `IO.asTask`, which lifts an `IO α` into a `Task α` within the `IO` monad.
11+
12+
In this exercise, the function `calculateFrequencies` is monadic, returning a `IO (Std.TreeMap Char Nat)`.
13+
This allows the use of impure tasks via `IO.asTask`, and also makes it possible to spawn pure tasks for intermediate computations before returning the final value.
14+
15+
## Preventing data races
16+
17+
As a functional language, most values in Lean are persistent, i.e., immutable.
18+
Operations that appear to modify a value actually produce a new value with the requested changes.
19+
This means that memory can often be shared safely between tasks without introducing data races.
20+
21+
Note, however, that not all data structures are equally well suited to persistent use.
22+
For example, updating a single element of an `Array` usually requires copying the entire array.
23+
The same is true for `Std.HashSet` and `Std.HashMap`.
24+
25+
To make such data structures efficient, Lean employs reference counting.
26+
As long as a value has a unique reference, updates may be performed destructively, avoiding unnecessary copying.
27+
28+
Other data structures, such as `List`, `Std.TreeSet` and `Std.TreeMap`, are designed to share structure internally.
29+
They reuse unchanged nodes when updates are performed, so that multiple values may share references to those nodes.
30+
As a result, modifying one part of the structure does not usually require copying the entire value.
31+
This makes them particularly well suited for sharing between tasks.
32+
33+
## Measuring execution speed
34+
35+
The execution time of each test is measured in nanoseconds using `IO.monoNanosNow` and shown alongside the results.
36+
You can experiment with different approaches and check their impact on runtime performance.
37+
38+
[tasks]: https://lean-lang.org/doc/reference/latest/IO/Tasks-and-Threads/#Task
Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,7 @@
1+
# Instructions
2+
3+
Count the frequency of letters in texts using parallel computation.
4+
5+
Parallelism is about doing things in parallel that can also be done sequentially.
6+
A common example is counting the frequency of letters.
7+
Employ parallelism to calculate the total frequency of each letter in a list of texts.
Lines changed: 25 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,25 @@
1+
import Std
2+
3+
namespace ParallelLetterFrequency
4+
5+
def countText (text : String) : Std.TreeMap Char Nat :=
6+
text.foldl (fun acc ch =>
7+
if ch.isAlpha
8+
then acc.alter ch.toLower (fun
9+
| none => some 1
10+
| some v => some (v + 1)
11+
)
12+
else acc
13+
) {}
14+
15+
def calculateFrequencies (texts : List String) : IO (Std.TreeMap Char Nat) :=
16+
let tasks := texts.map (fun text => Task.spawn (fun () => countText text))
17+
return Task.get (Task.mapList (·.foldl (·.foldl (fun acc k v1 =>
18+
acc.alter k (fun
19+
| none => some v1
20+
| some v2 => some (v1 + v2)
21+
)
22+
) · )
23+
{} ) tasks)
24+
25+
end ParallelLetterFrequency
Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,17 @@
1+
{
2+
"authors": [
3+
"oxe-i"
4+
],
5+
"files": {
6+
"solution": [
7+
"ParallelLetterFrequency.lean"
8+
],
9+
"test": [
10+
"ParallelLetterFrequencyTest.lean"
11+
],
12+
"example": [
13+
".meta/Example.lean"
14+
]
15+
},
16+
"blurb": "Count the frequency of letters in texts using parallel computation."
17+
}
Lines changed: 50 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,50 @@
1+
# This is an auto-generated file.
2+
#
3+
# Regenerating this file via `configlet sync` will:
4+
# - Recreate every `description` key/value pair
5+
# - Recreate every `reimplements` key/value pair, where they exist in problem-specifications
6+
# - Remove any `include = true` key/value pair (an omitted `include` key implies inclusion)
7+
# - Preserve any other key/value pair
8+
#
9+
# As user-added comments (using the # character) will be removed when this file
10+
# is regenerated, comments can be added via a `comment` key.
11+
12+
[c054d642-c1fa-4234-8007-9339f2337886]
13+
description = "no texts"
14+
15+
[818031be-49dc-4675-b2f9-c4047f638a2a]
16+
description = "one text with one letter"
17+
18+
[c0b81d1b-940d-4cea-9f49-8445c69c17ae]
19+
description = "one text with multiple letters"
20+
21+
[708ff1e0-f14a-43fd-adb5-e76750dcf108]
22+
description = "two texts with one letter"
23+
24+
[1b5c28bb-4619-4c9d-8db9-a4bb9c3bdca0]
25+
description = "two texts with multiple letters"
26+
27+
[6366e2b8-b84c-4334-a047-03a00a656d63]
28+
description = "ignore letter casing"
29+
30+
[92ebcbb0-9181-4421-a784-f6f5aa79f75b]
31+
description = "ignore whitespace"
32+
33+
[bc5f4203-00ce-4acc-a5fa-f7b865376fd9]
34+
description = "ignore punctuation"
35+
36+
[68032b8b-346b-4389-a380-e397618f6831]
37+
description = "ignore numbers"
38+
39+
[aa9f97ac-3961-4af1-88e7-6efed1bfddfd]
40+
description = "Unicode letters"
41+
include = false
42+
43+
[7b1da046-701b-41fc-813e-dcfb5ee51813]
44+
description = "combination of lower- and uppercase letters, punctuation and white space"
45+
46+
[4727f020-df62-4dcf-99b2-a6e58319cb4f]
47+
description = "large texts"
48+
49+
[adf8e57b-8e54-4483-b6b8-8b32c115884c]
50+
description = "many small texts"
Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,8 @@
1+
import Std
2+
3+
namespace ParallelLetterFrequency
4+
5+
def calculateFrequencies (texts : List String) : IO (Std.TreeMap Char Nat) :=
6+
sorry
7+
8+
end ParallelLetterFrequency

exercises/practice/parallel-letter-frequency/ParallelLetterFrequencyTest.lean

Lines changed: 97 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,15 @@
1+
name = "parallel-letter-frequency"
2+
version = "0.1.0"
3+
defaultTargets = ["ParallelLetterFrequencyTest"]
4+
testDriver = "ParallelLetterFrequencyTest"
5+
6+
[[lean_lib]]
7+
name = "LeanTest"
8+
srcDir = "vendor/LeanTest"
9+
10+
[[lean_lib]]
11+
name = "ParallelLetterFrequency"
12+
13+
[[lean_exe]]
14+
name = "ParallelLetterFrequencyTest"
15+
root = "ParallelLetterFrequencyTest"
Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
leanprover/lean4:v4.25.2

0 commit comments

Comments
 (0)