Point72/airflow-pydantic

 
 

airflow-pydantic

Pydantic models for Apache Airflow

Overview

Pydantic models of Apache Airflow data structures.

Primary Use Case: This library is designed to enable declarative DAG definitions using airflow-config or other YAML/JSON-based configuration frameworks. By representing Airflow constructs as Pydantic models, DAGs can be defined in configuration files rather than Python code, enabling better separation of concerns, easier testing, and configuration-driven workflows.

Core

Operators

Sensors

Other

Usage

Declarative DAGs with airflow-config (Recommended)

The primary use of airflow-pydantic is to build declarative, configuration-driven DAGs using airflow-config or similar YAML/JSON-based frameworks:

# config/my_dag.yaml
default_args:
  _target_: airflow_pydantic.TaskArgs
  owner: data-team
  retries: 3

default_dag_args:
  _target_: airflow_pydantic.DagArgs
  schedule: "@daily"
  start_date: "2024-01-01"
  catchup: false

This approach allows you to:

  • Define DAGs in YAML/JSON instead of Python
  • Separate configuration from code
  • Easily manage environment-specific settings
  • Version control your DAG configurations
  • Generate and validate DAGs programmatically
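The `_target_` keys in the YAML above follow the convention popularized by Hydra-style instantiation: the value names a class by its import path, and the sibling keys become its keyword arguments. The sketch below illustrates that idea only; it is not airflow-config's implementation. The `resolve_targets` helper is hypothetical, and the standard library's `datetime.timedelta` stands in for a class such as `airflow_pydantic.TaskArgs` so the example runs without Airflow installed. The `retry_delay` entry is an illustrative addition, not part of the configuration shown above.

```python
import importlib
from datetime import timedelta
from typing import Any


def resolve_targets(node: Any) -> Any:
    """Recursively replace any mapping that has a `_target_` key with an
    instance of the class that key names (hypothetical helper)."""
    if isinstance(node, list):
        return [resolve_targets(item) for item in node]
    if not isinstance(node, dict):
        return node  # scalars pass through unchanged

    # Resolve nested values first so inner targets are built before outer ones.
    kwargs = {key: resolve_targets(value) for key, value in node.items() if key != "_target_"}
    if "_target_" not in node:
        return kwargs

    # Split "package.module.ClassName" into its module path and attribute name.
    module_path, _, attr_name = node["_target_"].rpartition(".")
    target = getattr(importlib.import_module(module_path), attr_name)
    return target(**kwargs)


# A plain dict, as a YAML loader would produce it.
config = {
    "default_args": {
        "owner": "data-team",
        "retries": 3,
        # Stand-in target: a stdlib class instead of an airflow_pydantic model.
        "retry_delay": {"_target_": "datetime.timedelta", "minutes": 5},
    }
}

resolved = resolve_targets(config)
print(resolved["default_args"]["retry_delay"])
```

Because the configuration is plain data until it is resolved, it can be linted, diffed, and overridden per environment before any Airflow object is created.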

Programmatic Usage

All operators and sensors support two methods:

  • instantiate(): Create a concrete Airflow instance at runtime
  • render(): Generate Python code as a string for the Airflow construct
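To make the difference between the two methods concrete, here is a toy sketch of the pattern using only the standard library. `EchoOperator` and `EchoTaskModel` are hypothetical stand-ins for an Airflow operator and its airflow-pydantic model; the library's real classes and internals may differ. The point it tries to show is that running the rendered source should produce the same object that `instantiate()` returns directly.

```python
from dataclasses import asdict, dataclass


@dataclass
class EchoOperator:
    """Hypothetical stand-in for a concrete Airflow operator."""
    task_id: str
    bash_command: str


@dataclass
class EchoTaskModel:
    """Hypothetical stand-in for a model such as BashTask."""
    task_id: str
    bash_command: str

    def instantiate(self) -> EchoOperator:
        # Build the concrete object directly, at runtime.
        return EchoOperator(**asdict(self))

    def render(self) -> str:
        # Emit Python source that would build the same object when executed.
        args = ", ".join(f"{name}={value!r}" for name, value in asdict(self).items())
        return f"{self.task_id} = EchoOperator({args})"


model = EchoTaskModel(task_id="hello", bash_command="echo 'Hello World'")

# Path 1: a live object.
live = model.instantiate()

# Path 2: source code, executed later (here immediately, for comparison).
namespace = {"EchoOperator": EchoOperator}
exec(model.render(), namespace)

print(namespace["hello"] == live)
```

In this sketch the two paths are interchangeable, so the choice comes down to whether you want objects built at DAG parse time or a static file that can be reviewed and committed.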

Code Generation with render()

The render() method turns a Pydantic model into Python source code, returned as a string, which you can then write out as an ordinary DAG file:

from airflow_pydantic import Dag, BashTask
from datetime import datetime

dag = Dag(
    dag_id="generated-dag",
    schedule="@daily",
    start_date=datetime(2024, 1, 1),
    tasks={
        "hello": BashTask(
            task_id="hello",
            bash_command="echo 'Hello World'",
        ),
    },
)

# Generate Python code
python_code = dag.render()

# Save to a DAG file (the dags/ directory must already exist)
with open("dags/generated_dag.py", "w") as f:
    f.write(python_code)

Generated File:

from datetime import datetime

from airflow.models import DAG
from airflow.providers.standard.operators.bash import BashOperator

with DAG(schedule="@daily", start_date=datetime.fromisoformat("2024-01-01T00:00:00"), dag_id="generated-dag") as dag:
    hello = BashOperator(bash_command="echo 'Hello World'", task_id="hello", dag=dag)

This is useful for:

  • Generating DAG files from configuration during CI/CD
  • Creating DAG templates programmatically
  • Migrating from configuration-driven to static DAG files
  • Debugging and inspecting generated DAG code
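For the CI/CD case, one reasonable safeguard is to check that rendered source at least parses before it is written to the DAGs folder. The sketch below uses only the standard library; `write_dag_file` is a hypothetical helper, and the `rendered` string is copied from the generated file shown above in place of a real `dag.render()` call so that the example runs without Airflow installed. `ast.parse` checks syntax only: it does not import Airflow or verify that the DAG loads.

```python
import ast
import tempfile
from pathlib import Path


def write_dag_file(source: str, path: Path) -> None:
    """Write generated source to `path`, refusing code that does not parse
    (hypothetical helper)."""
    # Raises SyntaxError on malformed code; nothing is imported or executed.
    ast.parse(source, filename=str(path))
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(source)


# Stand-in for the string returned by dag.render().
rendered = """from datetime import datetime

from airflow.models import DAG
from airflow.providers.standard.operators.bash import BashOperator

with DAG(schedule="@daily", start_date=datetime.fromisoformat("2024-01-01T00:00:00"), dag_id="generated-dag") as dag:
    hello = BashOperator(bash_command="echo 'Hello World'", task_id="hello", dag=dag)
"""

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "dags" / "generated_dag.py"
    write_dag_file(rendered, target)
    written = target.read_text()

    # Malformed source is rejected before any file is created.
    bad_target = Path(tmp) / "dags" / "broken_dag.py"
    try:
        write_dag_file("with DAG(", bad_target)
        rejected = False
    except SyntaxError:
        rejected = True
    bad_file_created = bad_target.exists()
```

A stricter pipeline could go further and load the written file in an environment with Airflow installed, but a parse check is cheap and catches truncated or malformed output early.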

Note

This library was generated using copier from the Base Python Project Template repository.
