Commit 504e5de (1 parent: 8007071)

update docs + readme

File tree: 5 files changed (+22, -25 lines)


README.md

Lines changed: 10 additions & 8 deletions
@@ -51,11 +51,12 @@ Let's simulate a pipeline that performs a series of transformations on some data
 ```python
 import asyncio
 import time
+import typing

 from pyper import task


-def step1(limit):
+def step1(limit: int):
     """Generate some data."""
     for i in range(limit):
         yield i
@@ -75,7 +76,7 @@ def step3(data: int):
     return 2 * data - 1


-async def print_sum(data):
+async def print_sum(data: typing.AsyncGenerator[int]):
     """Print the sum of values from a data stream."""
     total = 0
     async for output in data:
@@ -117,7 +118,7 @@ Having defined the logical operations we want to perform on our data as function
 ```python
 # Analogous to:
 # pipeline = task(step1) | task(step2) | task(step3)
-async def pipeline(limit):
+async def pipeline(limit: int):
     for data in step1(limit):
         data = await step2(data)
         data = step3(data)
@@ -126,7 +127,7 @@ async def pipeline(limit):

 # Analogous to:
 # run = pipeline > print_sum
-async def run(limit):
+async def run(limit: int):
     await print_sum(pipeline(limit))

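For readers skimming the diff: the annotated README example from the hunks above can be run end to end without Pyper. A minimal sketch follows, where the body of `step2` is an assumption (it lies outside this diff), so the exact transformation is hypothetical:

```python
import asyncio
import typing


def step1(limit: int):
    """Generate some data."""
    for i in range(limit):
        yield i


async def step2(data: int) -> int:
    """Hypothetical async transform; the real body sits outside this diff."""
    await asyncio.sleep(0)
    return data + 1


def step3(data: int) -> int:
    """As shown in the diff: 2 * data - 1."""
    return 2 * data - 1


async def print_sum(data: typing.AsyncGenerator[int, None]):
    """Print the sum of values from a data stream."""
    total = 0
    async for output in data:
        total += output
    print("Total ==", total)


# Analogous to the composed pipeline in the diff:
async def main():
    async def stream():
        for x in step1(5):
            yield step3(await step2(x))
    await print_sum(stream())


asyncio.run(main())
```

Note the sketch uses `typing.AsyncGenerator[int, None]` (yield type plus send type); the single-parameter form shown in the diff only subscripts cleanly on newer Python versions.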
@@ -152,7 +153,7 @@ Concurrent programming in Python is notoriously difficult to get right. In a con
 The basic approach to doing this is by using queues-- a simplified and very unabstracted implementation could be:

 ```python
-async def pipeline(limit):
+async def pipeline(limit: int):
     q1 = asyncio.Queue()
     q2 = asyncio.Queue()
     q3 = asyncio.Queue()
@@ -210,7 +211,7 @@ async def pipeline(limit):
        yield data


-async def run(limit):
+async def run(limit: int):
    await print_sum(pipeline(limit))

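The diff elides the middle of the queue-based `pipeline` (lines 156-210 of the README). The pattern it describes, a producer feeding a queue, a consumer draining it, and a sentinel to mark the end of the stream, can be sketched standalone; the fused transformation here is an assumption made for brevity:

```python
import asyncio


async def queue_pipeline(limit: int):
    """Queue-based stream, in the spirit of the README's hand-rolled version."""
    done = object()  # sentinel marking end of stream
    q = asyncio.Queue()

    async def producer():
        for i in range(limit):
            # step2 then step3, fused into one expression for brevity
            q.put_nowait(2 * (i + 1) - 1)
        q.put_nowait(done)

    worker = asyncio.create_task(producer())
    while True:
        data = await q.get()
        if data is done:
            break
        yield data
    await worker


async def main():
    total = 0
    async for value in queue_pipeline(5):
        total += value
    print("Total ==", total)


asyncio.run(main())
```

The sentinel is the key design choice: without it the consumer would block forever on `q.get()` once the producer finishes.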
@@ -233,11 +234,12 @@ No-- not every program is asynchronous, so Pyper pipelines are by default synchr

 ```python
 import time
+import typing

 from pyper import task


-def step1(limit):
+def step1(limit: int):
     for i in range(limit):
         yield i

@@ -252,7 +254,7 @@ def step3(data: int):
     return 2 * data - 1


-def print_sum(data):
+def print_sum(data: typing.Generator[int]):
     total = 0
     for output in data:
         total += output
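The synchronous variant annotated above is also runnable without Pyper. A minimal sketch, again with a hypothetical `step2` body (not shown in this diff), and using the fully parameterized `typing.Generator` form:

```python
import typing


def step1(limit: int):
    for i in range(limit):
        yield i


def step2(data: int) -> int:
    # hypothetical stand-in: the real (slow) body sits outside this diff
    return data + 1


def step3(data: int) -> int:
    return 2 * data - 1


def print_sum(data: typing.Generator[int, None, None]) -> None:
    total = 0
    for output in data:
        total += output
    print("Total ==", total)


print_sum(step3(step2(x)) for x in step1(5))  # prints: Total == 25
```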

docs/src/docs/UserGuide/CombiningPipelines.md

Lines changed: 5 additions & 10 deletions
@@ -12,7 +12,6 @@ permalink: /docs/UserGuide/CombiningPipelines
 1. TOC
 {:toc}

-
 ## Piping and the `|` Operator

 The `|` operator (inspired by UNIX syntax) is used to pipe one pipeline into another. This is syntactic sugar for the `Pipeline.pipe` method.
@@ -46,7 +45,6 @@ new_new_pipeline = p0 | new_pipeline | p4
 new_new_new_pipeline = new_pipeline | new_new_pipeline
 ```

-
 ## Consumer Functions and the `>` Operator

 It is often useful to define reusable functions that process the results of a pipeline, which we'll call a 'consumer'. For example:
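The hunk below references a `JsonFileWriter` consumer. Its real definition is outside this diff, but a plausible standalone sketch of such a consumer (an assumption, not Pyper's code) is just a callable that drains an iterable of results:

```python
import json
import typing


class JsonFileWriter:
    """Hypothetical consumer: drains a stream of results into a JSON file."""

    def __init__(self, filepath: str):
        self.filepath = filepath

    def __call__(self, data: typing.Iterable) -> None:
        # materialize the stream and write it out in one go
        with open(self.filepath, "w", encoding="utf-8") as f:
            json.dump(list(data), f, indent=4)
```

Any callable with this shape, taking the pipeline's output stream as its argument, can sit on the right-hand side of `>`.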
@@ -89,11 +87,8 @@ run = step1.pipe(step2).consume(JsonFileWriter("data.json"))
 run(limit=10)
 ```

-
-The operator `>` is obviously not to be taken to mean 'greater than' when used in these contexts.
-
 {: .info}
-Pyper comes with fantastic IDE intellisense support which understands these operators, and will always show you what the resulting type of a variable is (including the input and output type specs for pipelines)
+Pyper comes with fantastic IDE intellisense support which understands these operators, and will always show you which variables are `Pipeline` or `AsyncPipeline` objects; this also preserves type hints from your own functions, showing you the parameter and return type specs for each pipeline or consumer

 ## Asynchronous Code

@@ -111,10 +106,10 @@ assert isinstance(task(func), AsyncPipeline)

 When combining pipelines, the following rule applies:

-* `Pipeline` > `Pipeline` = `Pipeline`
-* `Pipeline` > `AsyncPipeline` = `AsyncPipeline`
-* `AsyncPipeline` > `Pipeline` = `AsyncPipeline`
-* `AsyncPipeline` > `AsyncPipeline` = `AsyncPipeline`
+* `Pipeline` + `Pipeline` = `Pipeline`
+* `Pipeline` + `AsyncPipeline` = `AsyncPipeline`
+* `AsyncPipeline` + `Pipeline` = `AsyncPipeline`
+* `AsyncPipeline` + `AsyncPipeline` = `AsyncPipeline`

 In other words:

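The combination rule in this hunk behaves like ordinary type promotion: anything async in the chain makes the whole result async. A toy sketch (these are stand-in classes, not Pyper's real implementation) reproduces the table via `__or__`:

```python
class Pipeline:
    """Toy stand-in for Pyper's Pipeline (not the real implementation)."""

    def __or__(self, other: "Pipeline") -> "Pipeline":
        # anything async on either side promotes the combined result
        if isinstance(self, AsyncPipeline) or isinstance(other, AsyncPipeline):
            return AsyncPipeline()
        return Pipeline()


class AsyncPipeline(Pipeline):
    """Toy stand-in for Pyper's AsyncPipeline."""


print(type(Pipeline() | AsyncPipeline()).__name__)  # prints: AsyncPipeline
```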
docs/src/docs/UserGuide/Considerations.md

Lines changed: 6 additions & 5 deletions
@@ -74,12 +74,13 @@ The advantage of using `daemon` threads is that they do not prevent the main pro
 Therefore, there is a simple consideration that determines whether to set `daemon=True` on a particular task:

 {: .info}
-Tasks can be created with `daemon=True` when they do NOT reach out to external resources.
+Tasks can be created with `daemon=True` when they do NOT reach out to external resources

-This includes:
-* Pure functions, which simply take an input and generate an output
-* Functions that depend on or modify some external Python state, like an `Object` or a `Class`
+This includes all **pure functions** (functions which simply take an input and generate an output, without mutating external state).

 Functions that should _not_ use `daemon` threads include:
 * Writing to a database
-* Reading from a file
+* Processing a file
+* Making a network request
+
+Recall that only synchronous tasks can be created with `daemon=True`.
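The distinction this hunk draws maps directly onto plain `threading` (this is standard-library behaviour, not Pyper's API): daemon threads are killed abruptly when the interpreter exits, which is harmless for pure computation but can cut off an in-flight write or request. A small illustration:

```python
import threading
import time


def pure_work():
    # pure computation: safe to interrupt, so a daemon thread is fine
    _ = sum(i * i for i in range(1000))


def external_work():
    # stands in for I/O (database write, file, network request): should NOT
    # be daemonized, since abrupt exit could cut it off mid-operation
    time.sleep(0.01)


safe = threading.Thread(target=pure_work, daemon=True)
unsafe_to_daemonize = threading.Thread(target=external_work, daemon=False)
safe.start()
unsafe_to_daemonize.start()
safe.join()
unsafe_to_daemonize.join()
print(safe.daemon, unsafe_to_daemonize.daemon)  # prints: True False
```

A non-daemon thread keeps the process alive until it finishes, which is exactly the guarantee external-resource work needs.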

docs/src/docs/UserGuide/CreatingPipelines.md

Lines changed: 1 addition & 1 deletion
@@ -45,7 +45,7 @@ In addition to functions, anything `callable` in Python can be wrapped in `task`
 from pyper import task

 class Doubler:
-    def __call__(self, x):
+    def __call__(self, x: int):
         return 2 * x

 pipeline1 = task(Doubler())
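Independent of Pyper, the reason this works is Python's callable protocol: an instance with `__call__` can be invoked like a plain function, which is all `task` needs. The callable itself, in isolation:

```python
class Doubler:
    def __call__(self, x: int) -> int:
        return 2 * x


double = Doubler()
print(callable(double), double(3))  # prints: True 6
```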

docs/src/docs/UserGuide/TaskParameters.md

Lines changed: 0 additions & 1 deletion
@@ -12,7 +12,6 @@ permalink: /docs/UserGuide/TaskParameters
 1. TOC
 {:toc}

-
 > For convenience, we will use the following terminology on this page:
 > * **Producer**: The _first_ task within a pipeline
 > * **Producer-consumer**: Any task after the first task within a pipeline

0 commit comments

Comments
 (0)