Releases: pytask-dev/pytask
v0.4.4
What's Changed
- Fix typing issues with the DataCatalog. by @tobiasraabe in #510
- [automated] Update plugin list by @github-actions in #511
- Improve the documentation. by @tobiasraabe in #509
Full Changelog: v0.4.3...v0.4.4
v0.4.3
What's Changed
This release contains a lot of smaller improvements and bug fixes. Here is a short list.
- #484 raises an error message when a `PathNode` is used with a directory instead of a file.
- #496 makes pytask even lazier. When a preceding task is executed and produces the same outputs, the following task will no longer be executed.
- Objects in task modules that overwrite `__getattr__` should not cause any problems anymore (#507 was fixed in #508). The same applies to importing `Task` in task modules.
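The laziness described for #496 can be pictured with a small sketch. This is not pytask's implementation: `file_hash` and `needs_rerun` are hypothetical helpers, and the sketch assumes that a task's dependencies are compared by content hash against the hashes recorded when the task last ran.

```python
import hashlib
import tempfile
from pathlib import Path


def file_hash(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's content."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def needs_rerun(dependencies: list[Path], recorded: dict[str, str]) -> bool:
    """Return True if any dependency differs from its recorded hash."""
    return any(recorded.get(str(p)) != file_hash(p) for p in dependencies)


with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "data.txt"
    data.write_text("1,2,3")
    # Hashes recorded after the following task last ran (hypothetical store).
    recorded = {str(data): file_hash(data)}

    # The preceding task runs again but writes identical content.
    data.write_text("1,2,3")
    rerun_same = needs_rerun([data], recorded)

    # The preceding task runs again and writes different content.
    data.write_text("4,5,6")
    rerun_changed = needs_rerun([data], recorded)

print(rerun_same, rerun_changed)
```

In this sketch the following task is skipped in the first case and rerun only in the second, which is the behavior the release note describes.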
Complete list of changes
- Simplify the teardown of a task. by @tobiasraabe in #483
- Correctly unconfigure pytask. by @tobiasraabe in #485
- Raise informative error when path nodes point to directories. by @tobiasraabe in #484
- Add default names to `PPathNode`s. by @tobiasraabe in #486
- Modernize `TopologicalSorter`. by @tobiasraabe in #458
- Raise error for invalid value in return annotation. by @tobiasraabe in #488
- Refactor and better test products. by @tobiasraabe in #489
- Refactor and better test parsing of dependencies. by @tobiasraabe in #490
- Addition to #489. by @tobiasraabe in #491
- Make pytask even lazier. by @tobiasraabe in #496
- Bump sigstore/gh-action-sigstore-python from 1.2.3 to 2.1.0 by @dependabot in #495
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #494
- Remove unnecessary code from the collection of tasks. by @tobiasraabe in #497
- Fix errors when using `Task` and `TaskWithoutPath` in task modules. by @tobiasraabe in #498
- Allow tasks to depend on other tasks. by @tobiasraabe in #493
- Move test dependencies to `pyproject.toml` by @tobiasraabe in #500
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #499
- Remove `MetaNode`. by @tobiasraabe in #501
- [automated] Update plugin list by @github-actions in #505
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #504
- Catch objects pretending to be `PTask`. by @tobiasraabe in #508
Full Changelog: v0.4.2...v0.4.3
v0.4.2
Highlights
This release contains a new feature and some improvements for users.
- 🚀 The new feature is the `pytask.DataCatalog` that allows users to manage dependencies and products in projects more easily. Read the tutorial to get started. 🚀
- File changes are now detected by hashes instead of modification timestamps. It should prevent accidental executions when working with cloud storage providers like Dropbox or OneDrive and in many other situations. To save runtime, pytask uses a cache for the hashes when the modification timestamp has not changed.
- Nodes now have signatures that separate how nodes are named and displayed from how nodes are identified internally. If you have written a custom node, please update it according to the how-to guide.
- All of pytask's internal files are now stored in a `.pytask` folder in your project. The file `.pytask.sqlite3` is moved to this location as well. Add `.pytask` to your `.gitignore` to prevent accidentally committing the folder.
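The idea of hashing files while caching the hash for an unchanged modification timestamp can be sketched as follows. This is only an illustration under stated assumptions, not pytask's code: the in-memory `_cache`, the counter `hash_computations`, and the function `cached_file_hash` are hypothetical names.

```python
import hashlib
import os
import tempfile
from pathlib import Path

# Hypothetical in-memory cache: path -> (modification time, digest).
_cache: dict[str, tuple[float, str]] = {}
hash_computations = 0


def cached_file_hash(path: Path) -> str:
    """Hash a file, reusing the cached digest if the mtime is unchanged."""
    global hash_computations
    mtime = path.stat().st_mtime
    cached = _cache.get(str(path))
    if cached is not None and cached[0] == mtime:
        return cached[1]
    hash_computations += 1
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    _cache[str(path)] = (mtime, digest)
    return digest


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "data.csv"
    path.write_text("a,b\n1,2\n")

    first = cached_file_hash(path)
    second = cached_file_hash(path)  # Served from the cache.

    # Simulate a sync client touching the file without changing its content.
    stat = path.stat()
    os.utime(path, (stat.st_atime, stat.st_mtime + 10))
    third = cached_file_hash(path)  # Rehashed, but the digest is identical.
```

A changed timestamp alone forces a new hash computation, but because the digest stays the same, a comparison based on it would not report the file as changed.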
What's Changed
- Simplify building the plugin manager. by @tobiasraabe in #449
- Rename `graph.py` to `dag_command.py` and improve `collect_command.py`. by @tobiasraabe in #451
- Remove more `.svg`s and replace them with animations. by @tobiasraabe in #454
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #452
- [automated] Update plugin list by @github-actions in #453
- Add more explanation when `PNode.load()` fails during execution. by @tobiasraabe in #455
- Refer to source code on Github in API docs. by @tobiasraabe in #456
- Refactor code for `format_node_name`. by @tobiasraabe in #457
- Add hook to sort `__all__`. by @tobiasraabe in #459
- Simplify removing internal tracebacks from exceptions with cause. by @tobiasraabe in #460
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #461
- Fix import error for pluggy<1.3. by @tobiasraabe in #462
- Raise error when function is defined outside the loop body. by @tobiasraabe in #463
- Improve pins. by @tobiasraabe in #464
- Test that internal tracebacks are removed by reports. by @tobiasraabe in #465
- Add `is_product` to `PNode.load()`. by @tobiasraabe in #472
- Add a data catalog. by @tobiasraabe in #419
- Hash files instead of relying on modification timestamps. by @tobiasraabe in #469
- Move `.pytask.sqlite3` to `.pytask/`. by @tobiasraabe in #470
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #471
- Update PyPI action. by @tobiasraabe in #477
- Add node signatures. by @tobiasraabe in #473
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #476
- Add snapshot tests. by @tobiasraabe in #475
- Switch from black to ruff-format. by @tobiasraabe in #478
- Rework reports and tracebacks. by @tobiasraabe in #474
- Give skips higher precedence than ancestor failed as outcome. by @tobiasraabe in #479
- Remove checks for missing root nodes. by @tobiasraabe in #480
- Improve coverage. by @tobiasraabe in #481
- Fix handling of names and signatures of `PythonNode`s. by @tobiasraabe in #482
Full Changelog: v0.4.1...v0.4.2
v0.4.1
What's Changed
Of course, it's a mandatory bug fix release after a bigger release.
Using the product annotation, Annotated[..., Product] did not work with multiple products.
- Fix setting the name of `PythonNode`. by @tobiasraabe in #443
- Move content of `setup.cfg` to `pyproject.toml`. by @tobiasraabe in #444
- [automated] Update plugin list by @github-actions in #445
- Fix when multiple product annotations are used. by @tobiasraabe in #448
- Fix `PythonNode` when used as return. by @tobiasraabe in #446
- Simplify the `tree_map` code for generating the DAG. by @tobiasraabe in #447
Full Changelog: v0.4.0...v0.4.1
v0.4.0
News
pytask became three years old in July, which is a suitable event to rethink pytask's design and blow dust off of some of its oldest components.
Here are the highlights of v0.4.0 🚀 ⭐
Highlights
New interfaces for products.
Every argument can be declared as a product with the new `Product` annotation. The path can be passed as a default value.
```python
from pathlib import Path

from pytask import Product
from typing_extensions import Annotated


def task_hello_earth(path: Annotated[Path, Product] = Path("hello_earth.txt")):
    path.write_text("Hello, earth!")
```
More explanation can be found at https://tinyurl.com/yrezszr4.
It is also possible to use the return of the task function as a product, which allows wrapping any third-party function as a task function. Read more about it here: https://tinyurl.com/pytask-return.
```python
from pathlib import Path

from pytask import Product
from typing_extensions import Annotated


def task_hello_earth() -> Annotated[str, Path("hello_earth.txt")]:
    return "Hello, earth!"
```
Every task argument is a dependency
In older pytask versions, only paths were treated as task dependencies. That meant when you passed other arguments to the task, and they changed, it did not trigger a rerun of the task.
Now, every argument to a task can be a dependency, and you can hash them if they should trigger a rerun. It is explained in https://tinyurl.com/pytask-hash.
```python
from pathlib import Path
from typing import Annotated

from pytask import Product
from pytask import PythonNode


def task_example(
    text: Annotated[str, PythonNode(value="Hello, World", hash=True)],
    path: Annotated[Path, Product] = Path("file.txt"),
) -> None:
    path.write_text(text)
```
A new functional interface
The functional interface for pytask has been reworked and accepts a list of task functions. You can use it within your terminal or a Jupyter notebook. Read this guide to learn more about it: https://tinyurl.com/pytask-functional.
```python
from pathlib import Path
from typing import Annotated

from pytask import build


def create_text() -> Annotated[str, Path("hello_earth.txt")]:
    return "Hello, earth!"


session = build(tasks=[create_text])
```
Custom Nodes through Protocols
In the newest version, nodes (dependencies and products) and tasks follow protocols. It allows for customizations like PickleNodes that store any Python object as a pickle file and inject the object into the task when used as a dependency. It is explained in more detail in this guide: https://tinyurl.com/pytask-custom-nodes.
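To illustrate what "following a protocol" means here, the sketch below uses only the standard library. `NodeLike` and `SimplePickleNode` are hypothetical, simplified stand-ins and not pytask's actual protocols or classes, whose required attributes and methods are described in the linked guide. The point is that a class qualifies by having the right attributes and methods, without inheriting from a base class.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class NodeLike(Protocol):
    """Hypothetical, simplified stand-in for a node protocol."""

    name: str

    def load(self) -> Any: ...

    def save(self, value: Any) -> None: ...


class SimplePickleNode:
    """Stores any Python object as a pickle file.

    It does not inherit from NodeLike; matching the protocol's
    attributes and methods is enough.
    """

    def __init__(self, name: str, path: Path) -> None:
        self.name = name
        self.path = path

    def load(self) -> Any:
        return pickle.loads(self.path.read_bytes())

    def save(self, value: Any) -> None:
        self.path.write_bytes(pickle.dumps(value))


with tempfile.TemporaryDirectory() as tmp:
    node = SimplePickleNode("numbers", Path(tmp) / "numbers.pkl")
    node.save({"values": [1, 2, 3]})
    loaded = node.load()
    is_node = isinstance(node, NodeLike)
```

A framework that works against such a protocol can call `load()` to inject the stored object into a task and `save()` to persist a task's result, regardless of how the node stores its data.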
Other notable changes
- Python 3.12 is supported, and support for Python 3.7 is dropped.
- `@pytask.mark.depends_on` and `@pytask.mark.produces` are deprecated. There are better options to define dependencies and products explained in https://tinyurl.com/yrezszr4.
- `@pytask.mark.task` is also deprecated and replaced by `from pytask import task` and `@task`.
What's Changed
- Remove Python 3.7 support and add a new action for mamba. by @tobiasraabe in #323
- Replace pony with sqlalchemy>=1.4.36. by @tobiasraabe in #387
- Remove `@pytask.mark.parametrize`. by @tobiasraabe in #391
- Parse dependencies from all args if `depends_on` is not used. by @tobiasraabe in #384
- Add products with `typing.Annotation`. by @tobiasraabe in #394
- Refactor pybaum to `_pytask.tree_util`. by @tobiasraabe in #395
- Replace pybaum with optree and add paths to PythonNode names. by @tobiasraabe in #396
- Add support for `NamedTuple` and attrs classes in `@pytask.mark.task(kwargs=...)`. by @tobiasraabe in #397
- Deprecate decorators for `depends_on` and `produces`. by @tobiasraabe in #398
- Use protocols instead of ABCs. by @tobiasraabe in #402
- Allow tasks to return products. by @tobiasraabe in #404
- Tracking changes in v0.4.0. by @tobiasraabe in #400
- Bump peter-evans/create-pull-request from 5.0.1 to 5.0.2 by @dependabot in #390
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #388
- Allow to use prefix trees as nodes to parse function returns. by @tobiasraabe in #406
- Remove `.value` from `Node` protocol. by @tobiasraabe in #408
- Make `.from_annot` an optional feature of nodes. by @tobiasraabe in #409
- Allow to pass functions to `PythonNode(hash=...)`. by @tobiasraabe in #410
- Add protocols for tasks. by @tobiasraabe in #412
- Remove scripts to generate `.svg`s. by @tobiasraabe in #413
- Allow more ruff rules. by @tobiasraabe in #414
- A new functional interface. by @tobiasraabe in #411
- Deprecate `@pytask.mark.task` in favor of `@pytask.task`. by @tobiasraabe in #417
- Simplify and fix code in `dag.py`. by @tobiasraabe in #418
- Convert `DeprecationWarning` to `FutureWarning` for deprecated decorators. by @tobiasraabe in #420
- Remove deprecation warning for `produces`. by @tobiasraabe in #421
- Document new interface. by @tobiasraabe in #392
- Fix `import_path`. by @tobiasraabe in #424
- Publish `pytask.tree_util`. by @tobiasraabe in #426
- Fix type annotations of `task.depends_on` and `task.produces`. by @tobiasraabe in #427
- Document functional interface. by @tobiasraabe in #423
- Update example in `README.md`. by @tobiasraabe in #428
- Add better error message when `node.state()` throws error during DAG validation. by @tobiasraabe in #429
- Update parts of the documentation. by @tobiasraabe in #430
- Enable colors in WSL. by @tobiasraabe in #431
- Fix type checking for `pytask.mark.x`. by @tobiasraabe in #432
- Fix ids of `PythonNode`s. by @tobiasraabe in #433
- Add support for Python 3.12. by @tobiasraabe in #434
- Fix detection of task functions. by @tobiasraabe in #437
- Clarify some types. by @tobiasraabe in #438
- Refine typing. by @tobiasraabe in #440
Full Changelog: v0.3.2...v0.4.0
v0.4.0rc4
The last pre-release.
v0.4.0rc3
A couple of new fixes. Most notably a fix for the ids of PythonNodes that should prevent rebuilds.
v0.4.0rc2
Another release candidate that fixes the installation via conda and adds full support for pytask-parallel.
v0.4.0rc1
This is the first release candidate for the v0.4.* release series.
The final release still requires some changes. For example, the documentation needs to be extended. But, the essential parts are already there, and it is time to collect some final feedback! Let me know what you think and what needs to be improved. You can comment in the discussion for this release #422.
To install the pre-release, use
$ pip install pytask --pre
$ conda install -c "conda-forge/label/pytask_rc" pytask
Now, let's take a look at the changes.
What's Changed
New
- Dependencies and products of tasks have new interfaces that are explained in this tutorial.
- You can now also declare products by returning them from task functions. Follow this guide.
- If you have inputs to task functions that should be hashed to detect any changes, follow this guide.
- Before, only `pathlib.Path`s received special treatment as dependencies or products to task functions. Now, it is possible to define your own nodes that simplify, for example, loading pickle files as this guide explains. But many more extensions are possible, like defining data in an S3 bucket as a dependency or product.
- The functional interface has been reworked and now accepts tasks directly, allowing you to execute pytask on the command line or in Jupyter notebooks. The documentation must still be written, but here is your starting point.
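The hashing of task inputs mentioned in the list above can be pictured with a short sketch. It is not pytask's implementation: `value_hash` is a hypothetical helper, and pickling the value before hashing is an assumption made only for illustration.

```python
import hashlib
import pickle


def value_hash(value: object) -> str:
    """Return a digest of the pickled value (hypothetical helper)."""
    return hashlib.sha256(pickle.dumps(value)).hexdigest()


# Digest recorded when the task last ran.
recorded = value_hash("Hello, World")

# The same input yields the same digest, so nothing needs to rerun.
unchanged = value_hash("Hello, World") == recorded

# A different input yields a different digest and signals a change.
changed = value_hash("Hello, Earth") != recorded
```

Pickling is not a stable representation for every kind of object, so a scheme like this is only safe for values whose serialized form is deterministic. Other values need a custom hash function, which matches the change in #410 that allows passing functions to `PythonNode(hash=...)`.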
Removals
- Python 3.7 is no longer supported.
- `@pytask.mark.parametrize` is removed. Follow this tutorial instead.
Deprecations
- `@pytask.mark.depends_on` and `@pytask.mark.produces` are deprecated and will be removed in v0.5.0.
- `@pytask.mark.task` is deprecated. Use `@pytask.task` instead.
- Paths defined as strings are deprecated and should be replaced with proper `pathlib.Path` objects.
Full list of changes
- Remove Python 3.7 support and add a new action for mamba. by @tobiasraabe in #323
- Replace pony with sqlalchemy>=1.4.36. by @tobiasraabe in #387
- Remove `@pytask.mark.parametrize`. by @tobiasraabe in #391
- Parse dependencies from all args if `depends_on` is not used. by @tobiasraabe in #384
- Add products with `typing.Annotation`. by @tobiasraabe in #394
- Refactor pybaum to `_pytask.tree_util`. by @tobiasraabe in #395
- Replace pybaum with optree and add paths to PythonNode names. by @tobiasraabe in #396
- Add support for `NamedTuple` and attrs classes in `@pytask.mark.task(kwargs=...)`. by @tobiasraabe in #397
- Deprecate decorators for `depends_on` and `produces`. by @tobiasraabe in #398
- Use protocols instead of ABCs. by @tobiasraabe in #402
- Allow tasks to return products. by @tobiasraabe in #404
- Tracking changes in v0.4.0. by @tobiasraabe in #400
- Bump peter-evans/create-pull-request from 5.0.1 to 5.0.2 by @dependabot in #390
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #388
- Allow to use prefix trees as nodes to parse function returns. by @tobiasraabe in #406
- Remove `.value` from `Node` protocol. by @tobiasraabe in #408
- Make `.from_annot` an optional feature of nodes. by @tobiasraabe in #409
- Allow to pass functions to `PythonNode(hash=...)`. by @tobiasraabe in #410
- Add protocols for tasks. by @tobiasraabe in #412
- Remove scripts to generate `.svg`s. by @tobiasraabe in #413
- Allow more ruff rules. by @tobiasraabe in #414
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #407
- A new functional interface. by @tobiasraabe in #411
- Deprecate `@pytask.mark.task` in favor of `@pytask.task`. by @tobiasraabe in #417
- Simplify and fix code in `dag.py`. by @tobiasraabe in #418
- Convert `DeprecationWarning` to `FutureWarning` for deprecated decorators. by @tobiasraabe in #420
- Remove deprecation warning for `produces`. by @tobiasraabe in #421
- Document new interface. by @tobiasraabe in #392
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #415
Full Changelog: v0.3.2...v0.4.0rc1
v0.3.2
Highlights
This release contains the following highlights:
- Previously, if you accidentally hit the save button on an unchanged task file, the task would be rerun by pytask, although nothing had changed. Now, pytask does not rerun the task because it also compares the hashes of task files, not only the modification timestamps.
- If you want to enforce rerunning tasks, there is now a `--force` flag. Take the function name/id of the task and run `pytask -k <task id> --force`, and the task and the tasks it requires will be executed. Alternatively, delete a product of the task you want to rerun.
- The import mechanism for task modules has been reworked, and errors resolved. Thanks to @NickCrews!
Additionally, the @pytask.mark.parametrize decorator is deprecated and will be removed in pytask v0.4. If you use the decorator, you will have two options:
- (Recommended) Upgrade your code to the new approach for repeating tasks described in this tutorial.
- Or, pin pytask to `pytask<0.4` and silence the deprecation warning by setting `silence_parametrize_deprecation = true` in your `pyproject.toml` under `[tool.pytask.ini_options]`.
What's Changed
- Update version numbers in animations. by @tobiasraabe in #345
- Add dependabot for GitHub actions. by @tobiasraabe in #348
- Publish `db`. by @tobiasraabe in #352
- Refactor nodes. by @tobiasraabe in #355
- Add `-f/--force` to force executing tasks. by @tobiasraabe in #354
- Add hashing to avoid re-executing tasks when modification times changed. by @tobiasraabe in #357
- Update `update_plugin_list.py`. by @tobiasraabe in #364
- Rework panel with sphinx-design. by @tobiasraabe in #365
- Add light and dark logos for the documentation. by @tobiasraabe in #366
- Fix the panel on the index page of the documentation. by @tobiasraabe in #367
- Fix error introduced in #364. by @tobiasraabe in #369
- Revert change turning Node.state() into a hook. by @tobiasraabe in #370
- Rename Node back to MetaNode. by @tobiasraabe in #371
- Clearer documentation for `pytask dag -o`. by @tobiasraabe in #376
- Conditionally skip tests on MacOS. by @tobiasraabe in #378
- Deprecate `@pytask.mark.parametrize`. by @tobiasraabe in #381
- Fix the import mechanism for task modules. by @NickCrews in #373
- Update changes. by @tobiasraabe in #383
New Contributors
- @dependabot made their first contribution in #349
- @NickCrews made their first contribution in #373
Full Changelog: v0.3.1...v0.3.2