[![image](https://img.shields.io/github/actions/workflow/status/pytask-dev/pytask-parallel/main.yml?branch=main)](https://github.com/pytask-dev/pytask-parallel/actions?query=branch%3Amain)
[![image](https://codecov.io/gh/pytask-dev/pytask-parallel/branch/main/graph/badge.svg)](https://codecov.io/gh/pytask-dev/pytask-parallel)
[![pre-commit.ci status](https://results.pre-commit.ci/badge/github/pytask-dev/pytask-parallel/main.svg)](https://results.pre-commit.ci/latest/github/pytask-dev/pytask-parallel/main)
- [![image](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
+ [![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)

______________________________________________________________________

- Parallelize the execution of tasks with `pytask-parallel` which is a plugin for
+ Parallelize the execution of tasks with `pytask-parallel`, a plugin for
[pytask](https://github.com/pytask-dev/pytask).

## Installation
@@ -28,11 +28,14 @@ $ pip install pytask-parallel
$ conda install -c conda-forge pytask-parallel
```

- By default, the plugin uses `concurrent.futures.ProcessPoolExecutor`.
+ By default, the plugin uses loky's reusable executor.

- It is also possible to select the executor from loky or `ThreadPoolExecutor` from the
- [concurrent.futures](https://docs.python.org/3/library/concurrent.futures.html) module
- as backends to execute tasks asynchronously.
+ The following backends are available:
+
+ - loky's [`get_reusable_executor`](https://loky.readthedocs.io/en/stable/API.html#loky.get_reusable_executor)
+ - `ProcessPoolExecutor` or `ThreadPoolExecutor` from
+   [concurrent.futures](https://docs.python.org/3/library/concurrent.futures.html)
+ - dask's [`ClientExecutor`](https://distributed.dask.org/en/stable/api.html#distributed.Client.get_executor) allows in combination with [coiled](https://docs.coiled.io/user_guide/index.html) to spawn clusters and workers on AWS, GCP, and other providers with minimal configuration.

## Usage

@@ -65,98 +68,22 @@ You can also set the options in a `pyproject.toml`.

[tool.pytask.ini_options]
n_workers = 1
- parallel_backend = "processes"  # or loky or threads
- ```
-
- ## Custom Executor
-
- > [!NOTE]
- >
- > The interface for custom executors is rudimentary right now and there is not a lot of
- > support by public functions. Please, give some feedback if you are trying or managed
- > to use a custom backend.
- >
- > Also, please contribute your custom executors if you consider them useful to others.
-
- pytask-parallel allows you to use your parallel backend as long as it follows the
- interface defined by
- [`concurrent.futures.Executor`](https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.Executor).
-
- In some cases, adding a new backend can be as easy as registering a builder function
- that receives some arguments (currently only `n_workers`) and returns the instantiated
- executor.
-
- ```python
- from concurrent.futures import Executor
- from my_project.executor import CustomExecutor
-
- from pytask_parallel import ParallelBackend, registry
-
-
- def build_custom_executor(n_workers: int) -> Executor:
-     return CustomExecutor(max_workers=n_workers)
-
-
- registry.register_parallel_backend(ParallelBackend.CUSTOM, build_custom_executor)
- ```
-
- Now, build the project requesting your custom backend.
-
- ```console
- pytask --parallel-backend custom
+ parallel_backend = "loky"  # or processes or threads
```

- Realistically, it is not the only necessary adjustment for a nice user experience. There
- are two other important things. pytask-parallel does not implement them by default since
- it seems more tightly coupled to your backend.
-
- 1. A wrapper for the executed function that captures warnings, catches exceptions and
-    saves products of the task (within the child process!).
-
-    As an example, see
-    [`def _execute_task()`](https://github.com/pytask-dev/pytask-parallel/blob/c441dbb75fa6ab3ab17d8ad5061840c802dc1c41/src/pytask_parallel/processes.py#L91-L155)
-    that does all that for the processes and loky backend.
-
- 1. To apply the wrapper, you need to write a custom hook implementation for
-    `def pytask_execute_task()`. See
-    [`def pytask_execute_task()`](https://github.com/pytask-dev/pytask-parallel/blob/c441dbb75fa6ab3ab17d8ad5061840c802dc1c41/src/pytask_parallel/processes.py#L41-L65)
-    for an example. Use the
-    [`hook_module`](https://pytask-dev.readthedocs.io/en/stable/how_to_guides/extending_pytask.html#using-hook-module-and-hook-module)
-    configuration value to register your implementation.
-
- Another example of an implementation can be found as a
- [test](https://github.com/pytask-dev/pytask-parallel/blob/c441dbb75fa6ab3ab17d8ad5061840c802dc1c41/tests/test_backends.py#L35-L78).
-
- ## Some implementation details
-
- ### Parallelization and Debugging
+ ## Parallelization and Debugging

It is not possible to combine parallelization with debugging. That is why `--pdb` or
`--trace` deactivate parallelization.

If you parallelize the execution of your tasks using two or more workers, do not use
`breakpoint()` or `import pdb; pdb.set_trace()` since both will cause exceptions.

- ### Threads and warnings
+ ## Documentation

- Capturing warnings is not thread-safe. Therefore, warnings cannot be captured reliably
- when tasks are parallelized with `--parallel-backend threads`.
+ You find the documentation at <https://pytask-parallel.readthedocs.io/en/stable>.

## Changes

- Consult the [ release notes] ( CHANGES.md ) to find out about what is new.
148
-
149
- ## Development
150
-
151
- - ` pytask-parallel ` does not call the ` pytask_execute_task_protocol ` hook
152
- specification/entry-point because ` pytask_execute_task_setup ` and
153
- ` pytask_execute_task ` need to be separated from ` pytask_execute_task_teardown ` . Thus,
154
- plugins that change this hook specification may not interact well with the
155
- parallelization.
156
-
157
- - Two PRs for CPython try to re-enable setting custom reducers which should have been
158
- working but does not. Here are the references.
159
-
160
- - https://bugs.python.org/issue28053
161
- - https://github.com/python/cpython/pull/9959
162
- - https://github.com/python/cpython/pull/15058
88
+ Consult the [ release notes] ( https://pytask-parallel.readthedocs.io/en/stable/changes.html ) to
89
+ find out about what is new.
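
The "Custom Executor" section removed in this diff describes a builder function that receives `n_workers` and returns an object following the `concurrent.futures.Executor` interface. A minimal sketch of that pattern, using only the standard library, might look like the following. The `BUILDERS` dictionary and the `build_executor` and `square` helpers are hypothetical names chosen for illustration; they are not part of pytask-parallel's API, and the plugin's own registry may work differently.

```python
from concurrent.futures import Executor, ProcessPoolExecutor, ThreadPoolExecutor
from typing import Callable, Dict

# Hypothetical registry: backend name -> builder that takes n_workers.
# This only illustrates the builder pattern; it is not pytask-parallel's registry.
BUILDERS: Dict[str, Callable[[int], Executor]] = {
    "processes": lambda n_workers: ProcessPoolExecutor(max_workers=n_workers),
    "threads": lambda n_workers: ThreadPoolExecutor(max_workers=n_workers),
}


def build_executor(backend: str, n_workers: int) -> Executor:
    """Instantiate the executor registered under ``backend``."""
    try:
        builder = BUILDERS[backend]
    except KeyError:
        raise ValueError(f"Unknown backend: {backend!r}") from None
    return builder(n_workers)


def square(x: int) -> int:
    """Toy workload standing in for a task."""
    return x * x


if __name__ == "__main__":
    # Any object with the Executor interface can be used as a context manager.
    with build_executor("threads", n_workers=2) as executor:
        results = list(executor.map(square, range(4)))
    print(results)
```

Because every builder returns an `Executor`, the calling code does not need to know which backend was selected; swapping `"threads"` for `"processes"` changes only the lookup key.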