feat: mylearn with airflow, mlflow and poetry on python 3.11 (#1)
* chore: initialize mylearn with airflow, mlflow and poetry
* build(airflow): extend conf via pyproject.toml + misc
* build: upgrade required python version to 3.11
* ci: setup ci github action + disable tests as mlflow not on python 3.11
* docs: update readme badges
1 parent fedf417, commit 62944b0
Showing 30 changed files with 7,045 additions and 54 deletions.
@@ -0,0 +1,36 @@
name: ci

on: push

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - name: Check out repository
        uses: actions/checkout@v3
      - name: Set up python
        id: setup-python
        uses: actions/setup-python@v4
        with:
          python-version: 3.11
      - name: Install Poetry
        uses: snok/install-poetry@v1
        with:
          virtualenvs-create: true
          virtualenvs-in-project: true
          installer-parallel: true
      - name: Load cached venv
        id: cached-poetry-dependencies
        uses: actions/cache@v3
        with:
          path: .venv
          key: venv-${{ runner.os }}-${{ steps.setup-python.outputs.python-version }}-${{ hashFiles('**/poetry.lock') }}
      - name: Install dependencies
        if: steps.cached-poetry-dependencies.outputs.cache-hit != 'true'
        run: poetry install --no-interaction --no-root
      - name: Install project
        run: poetry install --no-interaction
      - name: Run tests
        run: |
          source .venv/bin/activate
          poe checks
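
The cache key in the `Load cached venv` step combines the runner OS, the Python version, and a hash of `poetry.lock`, so the cached `.venv` is reused only while all three stay the same. A minimal Python sketch of that idea follows. The `venv_cache_key` helper is hypothetical, and it hashes a single lock file with SHA-256 for illustration rather than reproducing how GitHub's `hashFiles` combines multiple files.

```python
import hashlib


def venv_cache_key(runner_os: str, python_version: str, lock_contents: bytes) -> str:
    """Build a key that changes whenever the OS, Python version, or lock file changes.

    Hypothetical helper for illustration only; GitHub Actions computes the real key.
    """
    lock_digest = hashlib.sha256(lock_contents).hexdigest()
    return f"venv-{runner_os}-{python_version}-{lock_digest}"


# Same inputs give the same key, so the cached .venv would be restored.
key_a = venv_cache_key("Linux", "3.11.0", b"[[package]]\nname = 'mlflow'\n")
key_b = venv_cache_key("Linux", "3.11.0", b"[[package]]\nname = 'mlflow'\n")

# Any change to the lock file gives a new key, so dependencies would be reinstalled.
key_c = venv_cache_key("Linux", "3.11.0", b"[[package]]\nname = 'airflow'\n")

print(key_a == key_b)  # True
print(key_a == key_c)  # False
```

On a cache hit the workflow skips `poetry install --no-interaction --no-root` and only installs the project itself.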
@@ -1,5 +1,22 @@
.env
.python-version
.idea/
.venv/
build/
data/
dist/
mylearn.egg-info/
**/__pycache__/**
**/.mypy_cache/**
**/.pytest_cache/**

airflow/dags/
metadata/
mlruns/
notebooks/.ipynb_checkpoints/

airflow/airflow.cfg
airflow/airflow.db
airflow/airflow-webserver.pid
airflow/webserver_config.py
airflow/logs
@@ -0,0 +1,33 @@
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.3.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: name-tests-test
      - id: requirements-txt-fixer
  - repo: https://github.com/asottile/add-trailing-comma
    rev: v2.2.3
    hooks:
      - id: add-trailing-comma
        args: [--py36-plus]
  - repo: https://github.com/asottile/pyupgrade
    rev: v2.37.3
    hooks:
      - id: pyupgrade
        args: [--py37-plus]
  - repo: https://github.com/psf/black
    rev: 22.6.0
    hooks:
      - id: black
        args: [-l 120, --check]
  - repo: https://github.com/PyCQA/flake8
    rev: 5.0.4
    hooks:
      - id: flake8
        args: [--config=config/flake8.ini]
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v0.971
    hooks:
      - id: mypy
        additional_dependencies: [types-all]
@@ -1,8 +1,97 @@
<h2 align="center">mylearn: my Machine Learning toolkit</h2>
<h2 align="center">mylearn: my Machine Learning framework</h2>

<p align="center">
<a href="https://circleci.com/gh/MichaelKarpe/mylearn"><img alt="Build Status" src="https://circleci.com/gh/MichaelKarpe/mylearn.svg?style=shield"></a>
<a href="https://github.com/psf/black/blob/master/LICENSE"><img alt="License: MIT" src="https://black.readthedocs.io/en/stable/_static/license.svg"></a>
<a href="https://github.com/psf/black"><img alt="Code style: black" src="https://img.shields.io/badge/code%20style-black-000000.svg"></a>
<a href="https://pypi.org/project/mylearn"><img src="https://img.shields.io/pypi/v/mylearn.svg"></a>
<a href="https://pypi.org/project/mylearn"><img src="https://img.shields.io/pypi/pyversions/mylearn.svg"></a>
<a href="https://github.com/MichaelKarpe/mylearn/blob/main/LICENSE"><img src="https://img.shields.io/pypi/l/mylearn.svg"></a>
<a href="https://github.com/MichaelKarpe/mylearn/actions"><img src="https://github.com/MichaelKarpe/mylearn/workflows/ci/badge.svg"></a>
<a href="https://github.com/psf/black"><img src="https://img.shields.io/badge/code%20style-black-000000.svg"></a>
</p>

___

[mylearn](https://github.com/MichaelKarpe/mylearn) is a Machine Learning framework based on
[Airflow](https://github.com/apache/airflow) and [MLflow](https://github.com/mlflow/mlflow) for designing machine
learning systems with production in mind.

**Work in progress... Stay tuned!**

# Index

1. [Prerequisites](#prerequisites)
2. [Installation & Setup](#installation--setup)
3. [Usage](#usage)

# Prerequisites

## pyenv

To be completed with how to install and set up pyenv

## poetry

To be completed with how to install and set up poetry

# Installation & Setup

mylearn leverages [poetry](https://github.com/python-poetry/poetry) and [poethepoet](https://github.com/nat-n/poethepoet)
to make its installation and setup surprisingly simple.

## Installation

It is recommended, although not required, to install requirements within a virtualenv located at the project root.
```commandline
poetry config virtualenvs.in-project true
```

Run the installation with
```commandline
poetry install
```

## Airflow Setup

Airflow setup is initialized via a `poe` command
```commandline
poe airflow-init
```

Airflow Scheduler & Webserver can be run with
```commandline
poe airflow-scheduler
poe airflow-webserver
```

Airflow UI can be opened at [localhost:8080](http://localhost:8080), and you can log in with `admin` as both username and password.

If you want to clean your Airflow setup before rerunning `poe airflow-init`, you need to kill Airflow Scheduler &
Webserver and run
```commandline
poe airflow-clean
```

## MLflow Setup

MLflow UI can be opened at [localhost:5000](http://localhost:5000) after execution of the following command:
```commandline
poe mlflow-ui
```
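
Before opening either UI, it can help to confirm that something is actually listening on the expected port. The sketch below uses only the Python standard library. `is_port_open` is a hypothetical helper, and the host `127.0.0.1` is an assumption about a local setup; ports 8080 and 5000 come from the setup sections above.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Ports taken from the setup instructions: Airflow UI on 8080, MLflow UI on 5000.
    for name, port in [("Airflow UI", 8080), ("MLflow UI", 5000)]:
        status = "reachable" if is_port_open("127.0.0.1", port) else "not reachable"
        print(f"{name} on port {port}: {status}")
```

An open port only shows that a process accepts connections there; it does not prove that the process is Airflow or MLflow.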

# Usage

## MLflow Pipelines Regression Template

The *mlflow-template* pipeline, based on the
[MLflow Pipelines Regression Template](https://github.com/mlflow/mlp-regression-template), can be run independently with
```commandline
poe mlflow-run
```

or via an Airflow Directed Acyclic Graph (DAG) by triggering the *mlflow-template* DAG via Airflow UI or with
```commandline
TO BE COMPLETED
```
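
The command above is still to be completed in the README. As one possible approach, and assuming the standard Airflow 2 CLI is on the `PATH`, a DAG run can be requested with `airflow dags trigger <dag_id>`. The Python sketch below only builds that command. `build_trigger_command` and `trigger_dag` are hypothetical helpers, and the DAG id `mlflow` is taken from the DAG definition in this commit rather than from the README, which refers to the DAG as *mlflow-template*.

```python
import shlex
import subprocess


def build_trigger_command(dag_id: str) -> list[str]:
    """Return the argument list for `airflow dags trigger <dag_id>`."""
    return ["airflow", "dags", "trigger", dag_id]


def trigger_dag(dag_id: str) -> int:
    """Run the trigger command and return its exit code (requires Airflow to be installed)."""
    return subprocess.run(build_trigger_command(dag_id), check=False).returncode


# Show the command without running it; "mlflow" is the DAG id assumed here.
print(shlex.join(build_trigger_command("mlflow")))  # airflow dags trigger mlflow
```

Building the argument list separately from running it keeps the command easy to inspect before anything is executed.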

## Other examples

**Work in progress... Stay tuned!**
Empty file.
@@ -0,0 +1,99 @@
from datetime import datetime, timedelta

# The DAG object; we'll need this to instantiate a DAG
from airflow import DAG

# Operators; we need this to operate!
from airflow.operators.bash import BashOperator

with DAG(
    "mlflow",
    # These args will get passed on to each operator
    # You can override them on a per-task basis during operator initialization
    default_args={
        "depends_on_past": False,
        "email": ["[email protected]"],
        "email_on_failure": False,
        "email_on_retry": False,
        "retries": 0,
        "retry_delay": timedelta(seconds=5),
        # 'queue': 'bash_queue',
        # 'pool': 'backfill',
        # 'priority_weight': 10,
        # 'end_date': datetime(2016, 1, 1),
        # 'wait_for_downstream': False,
        # 'sla': timedelta(hours=2),
        # 'execution_timeout': timedelta(seconds=300),
        # 'on_failure_callback': some_function,
        # 'on_success_callback': some_other_function,
        # 'on_retry_callback': another_function,
        # 'sla_miss_callback': yet_another_function,
        # 'trigger_rule': 'all_success'
    },
    description="MLflow DAG",
    schedule_interval=timedelta(days=1),
    start_date=datetime(2022, 8, 28),
    catchup=False,
    tags=["mlflow"],
) as dag:

    # t1 to t6 are tasks created by instantiating operators, one per pipeline step
    t1 = BashOperator(
        task_id="ingest",
        bash_command="""
        cd ${AIRFLOW_HOME};
        cd ..;
        mlflow pipelines run --step ingest;
        """,
    )

    t2 = BashOperator(
        task_id="split",
        bash_command="""
        cd ${AIRFLOW_HOME};
        cd ..;
        mlflow pipelines run --step split;
        """,
    )

    t3 = BashOperator(
        task_id="transform",
        bash_command="""
        cd ${AIRFLOW_HOME};
        cd ..;
        mlflow pipelines run --step transform;
        """,
    )

    t4 = BashOperator(
        task_id="train",
        bash_command="""
        cd ${AIRFLOW_HOME};
        cd ..;
        mlflow pipelines run --step train;
        """,
    )

    t5 = BashOperator(
        task_id="evaluate",
        bash_command="""
        cd ${AIRFLOW_HOME};
        cd ..;
        mlflow pipelines run --step evaluate;
        """,
    )

    t6 = BashOperator(
        task_id="register",
        bash_command="""
        cd ${AIRFLOW_HOME};
        cd ..;
        mlflow pipelines run --step register;
        """,
    )

    t1 >> t2
    t2 >> t3
    t3 >> t4
    t4 >> t5
    t5 >> t6
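
The six tasks above differ only in the pipeline step they run, and they form a straight chain. A minimal sketch of that structure follows, written in plain Python so it runs without Airflow installed. `STEPS`, `bash_command_for`, and `linear_edges` are hypothetical names, not part of this commit.

```python
STEPS = ["ingest", "split", "transform", "train", "evaluate", "register"]


def bash_command_for(step: str) -> str:
    """Build the shell command for one task: go to the parent of AIRFLOW_HOME, then run one step."""
    return f"cd ${{AIRFLOW_HOME}}; cd ..; mlflow pipelines run --step {step};"


def linear_edges(steps: list[str]) -> list[tuple[str, str]]:
    """Pair each step with its successor, mirroring t1 >> t2, t2 >> t3, and so on."""
    return list(zip(steps, steps[1:]))


# Print the dependency chain, one edge per line.
for upstream, downstream in linear_edges(STEPS):
    print(f"{upstream} >> {downstream}")

# Print the command a single task would run.
print(bash_command_for("train"))
```

Inside the `with DAG(...)` block, the same loop could create one `BashOperator` per step with `task_id=step` and `bash_command=bash_command_for(step)`, then link consecutive tasks with `>>`.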
@@ -0,0 +1,2 @@
[flake8]
max-line-length = 120
Binary file not shown.
Empty file.