The Pythonic lakehouse framework. One Python project to define, run, validate, and inspect lakehouse pipelines.
Phlo is the framework and plugin runtime that ties together familiar lakehouse tools — Dagster, dlt, Sling, dbt, Pandera, Iceberg, Delta, Nessie, Trino, MinIO, and more — behind a single CLI and a coherent product surface called Observatory.
Most lakehouse projects start in Python and quickly spill into YAML, Compose files, orchestration config, catalog setup, quality checks, glue scripts, and duplicated settings. Phlo keeps those pieces in one project.
Use the phlo CLI to create a project, start the local stack, materialize assets, run quality checks, follow logs, and inspect what happened. Add provider packages when you need them: Dagster for orchestration, dlt or Sling for ingestion, dbt for transforms, Iceberg or Delta for tables, Trino for query, and Observatory for a UI to inspect assets, tables, lineage, quality, services, and logs.
A Phlo asset is ordinary Python with lakehouse metadata attached:
from pathlib import Path

import dlt
import pandas as pd
import phlo

from workflows.schemas.csv import EventsSchema


@phlo.ingestion(
    table_name="events",
    unique_key="event_id",
    validation_schema=EventsSchema,
    group="csv",
    freshness_hours=(1, 24),
)
def csv_events(partition_date: str) -> object:
    events = pd.read_csv(Path("data/events.csv"))
    # Combine the source id with the partition date to form the unique key.
    events["event_id"] = events["id"].astype(str) + "-" + partition_date
    rows = events.to_dict(orient="records")
    # Hand the rows to dlt as a named resource.
    return dlt.resource(rows, name="events")

This single function registers a partitioned ingestion asset, validates rows with Pandera, materializes through the configured orchestrator, lands the table in your configured storage and catalog, and becomes visible in Observatory and the catalog CLI. No separate orchestration, schema, Compose, or catalog wiring is needed.
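To see what the key derivation in the example produces, here is a minimal sketch of the same id-plus-partition-date composition in plain Python, without pandas. The helper name `with_event_id` and the sample rows are hypothetical and not part of Phlo; the sketch shows only how the same source `id` yields a different `event_id` in each partition.

```python
def with_event_id(rows: list[dict], partition_date: str) -> list[dict]:
    """Return copies of rows with an event_id built from id and partition date."""
    return [
        {**row, "event_id": f"{row['id']}-{partition_date}"}
        for row in rows
    ]


# Hypothetical sample rows standing in for data/events.csv.
sample = [{"id": 1, "kind": "click"}, {"id": 2, "kind": "view"}]

day_one = with_event_id(sample, "2025-01-15")
day_two = with_event_id(sample, "2025-01-16")

# The same source ids produce distinct keys in different partitions.
ids_one = {row["event_id"] for row in day_one}
ids_two = {row["event_id"] for row in day_two}
print(sorted(ids_one), sorted(ids_two))
```

Because the partition date is part of the key, loading the same file on two different days gives two disjoint sets of keys, while reloading the same partition reproduces the same keys.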
Prerequisites
- Python 3.11 or later
- uv
- Docker with Compose v2, or Podman with a Compose provider
# Create an isolated environment for the quickstart
mkdir phlo-quickstart && cd phlo-quickstart
uv venv
source .venv/bin/activate
# Install Phlo with the default local stack providers
uv pip install "phlo[defaults]"
# Create a project from the CSV batch starter
phlo init my-lakehouse --template csv-batch
cd my-lakehouse
uv pip install -e .
# Generate and start the local lakehouse stack
phlo services init
phlo services start
# Check that services are healthy
phlo services status
phlo doctor --verbose
# Materialize a completed daily partition
phlo materialize dlt_events --partition 2025-01-15
# Verify the table landed in the catalog
phlo catalog tables
# Stop the local stack when finished
phlo services stop

- Project layout for phlo.yaml, workflows, schemas, transforms, tests, local runtime state, and project plugins.
- Starters for CSV ingestion, REST API ingestion, dbt medallion projects, Sling replication, and Observatory demos.
- Python decorators for registering ingestion, quality, and transformation assets without hand-writing provider boilerplate.
- Local service commands for generating, starting, checking, logging, and stopping the stack.
- Provider packages for Dagster, MinIO, Nessie, Trino, Iceberg, dbt, PostgreSQL, Observatory, and the rest of a working lakehouse.
- Plugin hooks for custom commands, services, assets, resources, catalogs, and Observatory extensions.
Phlo's core stays small. Installed provider packages contribute capabilities through Python entry points; the CLI discovers them in the current project and wires the runtime accordingly.
| Area | Intent | Provider examples |
|---|---|---|
| Pipeline authoring | Define ingestion assets, schemas, checks, and transforms | phlo-dlt, phlo-sling, phlo-pandera, phlo-dbt |
| Runtime services | Start the local lakehouse stack without hand-written Compose files | phlo-dagster, phlo-postgres, phlo-minio, phlo-nessie, phlo-trino |
| Table & catalog layer | Store, version, and query lakehouse tables | phlo-iceberg, phlo-delta, phlo-clickhouse, phlo-openmetadata |
| Product surfaces | Inspect and control assets, tables, lineage, quality, services, and logs | phlo-api, phlo-observatory, phlo-mcp |
| Serving & BI | Expose lakehouse data to apps and analysts | phlo-hasura, phlo-postgrest, phlo-pgweb, phlo-superset |
| Observability | Export telemetry, logs, metrics, and alerts | phlo-otel, phlo-prometheus, phlo-loki, phlo-grafana, phlo-alerting |
| Development | Test and validate projects and provider integrations | phlo-testing |
- Installation Guide
- Quickstart Guide
- Core Concepts
- Choosing Components
- Workflow Development
- Plugin Development
- Operations Guide
- CLI Reference
Phlo is alpha. The local development workflow is usable and exercised in CI, but APIs, provider contracts, and the on-disk project layout may change before 1.0. Pin exact versions in production.
uv pip install -e .
make check

Useful local service commands:
phlo services init
phlo services start
phlo services status
phlo services logs -f
phlo services stop
phlo doctor --verbose

Issues and pull requests are welcome. Run make check locally before opening a PR, and please open an issue first for larger changes so the design can be discussed up front.