The manual claims that `a` is split into `nthreads()` chunks, but this
is not true in general. As written, you could get an error if `length(a)
< nthreads()` (the computed chunk size comes out as 0), or more than
`nthreads()` chunks if `nthreads()` is smaller than `length(a)` but does
not divide it evenly. With `cld`, on the other hand, you always get at
most `nthreads()` chunks.
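For concreteness, a quick REPL sketch of the difference (here `n` stands in for `nthreads()`, and the chunk size is assumed to be handed to `Iterators.partition` as in the example under discussion):

```julia
julia> a = 1:10; n = 3;    # n plays the role of nthreads(); 3 does not divide 10

julia> length(collect(Iterators.partition(a, length(a) ÷ n)))      # floor division: chunk size 3 → 4 chunks
4

julia> length(collect(Iterators.partition(a, cld(length(a), n))))  # cld: chunk size 4 → 3 chunks ≤ n
3

julia> length(1:2) ÷ 4     # if length(a) < n, floor division gives chunk size 0, which partition rejects
0
```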
Note that the result is not `500000500000` as it should be, and will most likely change on each evaluation.
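For readers of this excerpt, a minimal sketch of the kind of racy accumulation being described (the name `sum_multi_bad` is an assumption; the point is the shared accumulator `s`):

```julia
julia> function sum_multi_bad(a)       # every task updates the same `s`: a data race
           s = 0
           Threads.@threads for i in a
               s += i                  # non-atomic read-modify-write on shared state
           end
           s
       end;

julia> sum_multi_bad(1:1_000_000)      # output omitted: typically far from 500000500000, and differs between runs
```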
To fix this, buffers that are specific to the task may be used to segment the sum into chunks that are race-free.
- Here `sum_single` is reused, with its own internal buffer `s`. The input vector `a` is split into `nthreads()`
+ Here `sum_single` is reused, with its own internal buffer `s`. The input vector `a` is split into at most `nthreads()`
chunks for parallel work. We then use `Threads.@spawn` to create tasks that individually sum each chunk. Finally, we sum the results from each task using `sum_single` again:
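Putting the pieces together, here is a sketch of the fixed version being described (the name `sum_multi_good` is assumed; `sum_single` is the sequential helper defined earlier in the manual, and the chunk size uses `cld` as proposed above):

```julia
julia> function sum_multi_good(a)
           # at most nthreads() chunks; each task sums its chunk with its own local buffer
           chunks = Iterators.partition(a, cld(length(a), Threads.nthreads()))
           tasks = map(chunks) do chunk
               Threads.@spawn sum_single(chunk)
           end
           chunk_sums = fetch.(tasks)       # wait for each task and collect its partial sum
           return sum_single(chunk_sums)    # reduce the partial sums, again race-free
       end;

julia> sum_multi_good(1:1_000_000)
500000500000
```

Because each spawned task accumulates into its own `s` inside `sum_single`, no two tasks ever write to the same variable, so the result is deterministic.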