69 changes: 69 additions & 0 deletions .github/GITHUB_PAGES_SETUP.md
@@ -0,0 +1,69 @@
# Setting Up GitHub Pages for dbt Documentation

This guide explains how to enable GitHub Pages for the dbt documentation site.

## Automatic Setup

The GitHub Actions workflow (`.github/workflows/deploy-dbt-docs.yml`) is configured to automatically deploy documentation to GitHub Pages when changes are pushed to the `main` branch.

## Repository Configuration

To enable GitHub Pages deployment, a repository administrator needs to configure the following:

### Step 1: Enable GitHub Pages

1. Go to your repository on GitHub
2. Navigate to **Settings** > **Pages**
3. Under **Build and deployment**, set **Source** to "GitHub Actions".
   This lets the workflow deploy directly, without a dedicated publishing branch.

### Step 2: Run the Workflow

After this PR is merged into the `main` branch, the workflow:

1. Triggers automatically
2. Generates the dbt documentation
3. Deploys it to GitHub Pages

### Step 3: Access the Documentation

Once deployed, the documentation will be available at:
- **URL**: `https://summers-team.github.io/school-m2-bi-project/`
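
The address above follows the default pattern for project sites, `https://<owner>.github.io/<repo>/`. As a minimal sketch (the helper name `pages_url` is hypothetical, and a custom domain would change the result), the URL can be derived like this:

```shell
#!/bin/sh
# Sketch: build the default GitHub Pages URL for a project site.
# pages_url is an illustrative helper, not part of this repository.
pages_url() {
  owner="$1"
  repo="$2"
  # Project sites are served from https://<owner>.github.io/<repo>/
  printf 'https://%s.github.io/%s/\n' "$owner" "$repo"
}

pages_url summers-team school-m2-bi-project
```

Once the first deployment has finished, `curl -fsSI "$(pages_url summers-team school-m2-bi-project)"` should return a successful status line.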

## Workflow Details

The workflow performs the following steps:

1. **Build Job**:
   - Checks out the repository
   - Sets up Python 3.12
   - Installs dependencies using `uv`
   - Writes a placeholder `profiles.yml` and parses the project with `dbt parse`
   - Generates dbt documentation using `dbt docs generate --empty-catalog --no-compile`
   - Copies the generated files from `dbt/target/` into `gh-pages/`
   - Uploads `gh-pages/` as the Pages artifact

2. **Deploy Job**:
   - Deploys the artifact to GitHub Pages
   - Targets the `github-pages` environment, which GitHub creates on first use if it does not exist

## Manual Workflow Trigger

You can also manually trigger the workflow:

1. Go to **Actions** > **Deploy dbt Docs to GitHub Pages**
2. Click **Run workflow**
3. Select the `main` branch
4. Click **Run workflow**
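
The same trigger is available from a terminal. The sketch below assumes the GitHub CLI (`gh`) is installed and authenticated for this repository; `trigger_docs_deploy` and `DRY_RUN` are illustrative names, not part of the project.

```shell
#!/bin/sh
# Sketch: start the docs workflow with the GitHub CLI.
# Assumes `gh` is installed and authenticated; the function name and the
# DRY_RUN switch are hypothetical conveniences for this example.
trigger_docs_deploy() {
  ref="${1:-main}"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of running it.
    echo "gh workflow run deploy-dbt-docs.yml --ref $ref"
  else
    gh workflow run deploy-dbt-docs.yml --ref "$ref"
  fi
}

# Show what would be executed without contacting GitHub.
DRY_RUN=1 trigger_docs_deploy main
```

Dropping `DRY_RUN=1` dispatches the workflow, which works because the workflow declares `workflow_dispatch`.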

## Troubleshooting

If the deployment fails:

1. **Check Permissions**: Ensure GitHub Pages is enabled with **Source** set to "GitHub Actions", and that the workflow keeps its `pages: write` and `id-token: write` permissions
2. **Review Workflow Logs**: Go to the Actions tab and review the logs for any errors
3. **Verify dbt Configuration**: Ensure the dbt project structure is correct and models can be parsed

## Local Preview

To preview the documentation locally before deployment, see the instructions in [`dbt/README.md`](../dbt/README.md).
119 changes: 119 additions & 0 deletions .github/workflows/deploy-dbt-docs.yml
@@ -0,0 +1,119 @@
---
name: Deploy dbt Docs to GitHub Pages

# This workflow generates dbt documentation and deploys it to GitHub Pages
#
# Setup Required:
# 1. Go to repository Settings > Pages
# 2. Under "Build and deployment", set Source to "GitHub Actions"
# 3. For more details, see .github/GITHUB_PAGES_SETUP.md
#
# The workflow runs automatically on:
# - Push to main branch
# - Manual trigger via workflow_dispatch

on:
  push:
    branches:
      - main
  workflow_dispatch:

# Sets permissions of the GITHUB_TOKEN to allow deployment to GitHub Pages
permissions:
  contents: read
  pages: write
  id-token: write

# Allow only one concurrent deployment, skipping runs queued between
# the run in-progress and latest queued.
# However, do NOT cancel in-progress runs as we want to allow these
# production deployments to complete.
concurrency:
  group: "pages"
  cancel-in-progress: false

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'

      - name: Install uv
        uses: astral-sh/setup-uv@v5
        with:
          enable-cache: true

      - name: Install dependencies
        run: |
          uv sync

      - name: Generate dbt docs
        run: |
          cd dbt
          # Create a minimal profiles.yml for docs generation
          # (no connection needed)
          cat > profiles.yml << EOF
          projet_m2_bi:
            outputs:
              docs:
                type: bigquery
                method: oauth
                project: dummy-project
                dataset: dummy_dataset
                threads: 1
                location: US
            target: docs
          EOF
          # Parse the project first
          uv run --project .. dbt parse --profiles-dir .
          # Generate documentation without catalog
          # (no database connection required)
          uv run --project .. \
            dbt docs generate --profiles-dir . \
            --empty-catalog --no-compile
        continue-on-error: false

      - name: Prepare docs for GitHub Pages
        run: |
          mkdir -p gh-pages
          cp -r dbt/target/* gh-pages/
          # dbt docs generate already writes index.html into target/.
          # Do not overwrite it: a page that redirects to ./index.html
          # would replace the docs UI and refresh onto itself forever.
          # Fail early if the docs entry point is missing.
          test -f gh-pages/index.html

      - name: Upload artifact
        uses: actions/upload-pages-artifact@v3
        with:
          path: 'gh-pages'

  deploy:
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    runs-on: ubuntu-latest
    needs: build
    steps:
      - name: Deploy to GitHub Pages
        id: deployment
        uses: actions/deploy-pages@v4
5 changes: 5 additions & 0 deletions .gitignore
@@ -142,6 +142,11 @@ docs/_build/
.pybuilder/
target/

# dbt
dbt/target/
dbt/dbt_packages/
dbt/logs/

# Jupyter Notebook
.ipynb_checkpoints

181 changes: 181 additions & 0 deletions IMPLEMENTATION_SUMMARY.md
@@ -0,0 +1,181 @@
# dbt Documentation Implementation Summary

This document summarizes the implementation of automated dbt documentation with GitHub Pages and CI/CD.

## What Was Implemented

### 1. GitHub Actions Workflow
**File:** `.github/workflows/deploy-dbt-docs.yml`

The workflow automatically:
- Triggers on every push to the `main` branch
- Can be manually triggered via GitHub UI
- Installs Python 3.12 and uv package manager
- Syncs project dependencies
- Parses the dbt project
- Generates documentation without requiring a database connection
- Deploys the documentation to GitHub Pages

**Key Features:**
- No database connection required for CI/CD (uses `--empty-catalog` flag)
- Proper permissions for GitHub Pages deployment
- Concurrent deployment protection
- Clear comments explaining setup requirements

### 2. Local Development Support

#### mise Tasks
**File:** `mise.toml`

Added two convenient tasks:
```bash
# Generate documentation
mise run dbt:docs:generate

# Generate and serve documentation
mise run dbt:docs:serve
```

These tasks:
- Automatically create a minimal `profiles.yml` if needed
- Generate documentation without database connection
- Serve documentation at http://localhost:8080
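
The "create a minimal `profiles.yml` if needed" step could look roughly like the sketch below. `mise.toml` is not shown here, so this is an assumption about the behaviour, not the actual task body; `ensure_profiles` is a hypothetical helper, and the profile values mirror the placeholder BigQuery profile used in the CI workflow.

```shell
#!/bin/sh
# Sketch: write a placeholder profiles.yml only when none exists.
# ensure_profiles is an illustrative helper; the real mise task may differ.
ensure_profiles() {
  dir="${1:-.}"
  if [ -f "$dir/profiles.yml" ]; then
    # Keep an existing profile untouched (it may hold real settings).
    return 0
  fi
  cat > "$dir/profiles.yml" << 'EOF'
projet_m2_bi:
  outputs:
    docs:
      type: bigquery
      method: oauth
      project: dummy-project
      dataset: dummy_dataset
      threads: 1
      location: US
  target: docs
EOF
}
```

Calling `ensure_profiles dbt` from the repository root before `dbt parse` leaves a real profile alone and only fills the gap on a fresh checkout.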

#### Documentation Updates
**File:** `dbt/README.md`

Added comprehensive instructions for:
- Accessing the live documentation site
- Generating docs locally (both with mise and manually)
- Serving docs locally
- Understanding what the documentation includes

### 3. Setup Guide
**File:** `.github/GITHUB_PAGES_SETUP.md`

Complete guide covering:
- How to enable GitHub Pages in repository settings
- How to verify deployment
- How to manually trigger the workflow
- Troubleshooting tips
- Links to documentation

### 4. Main README Update
**File:** `README.md`

Added:
- Prominent link to the documentation site
- Description of documentation features
- Reference to dbt README for local preview

### 5. .gitignore Updates
**File:** `.gitignore`

Added explicit entries for:
- `dbt/target/` - Generated documentation
- `dbt/dbt_packages/` - Downloaded packages
- `dbt/logs/` - Log files

## How to Use

### For Repository Administrators

1. **Enable GitHub Pages:**
   - Go to repository Settings > Pages
   - Under "Build and deployment", set Source to "GitHub Actions"
   - See `.github/GITHUB_PAGES_SETUP.md` for detailed instructions

2. **Deploy Documentation:**
   - Merge this PR to the `main` branch
   - The workflow will automatically run and deploy
   - Documentation will be available at: https://summers-team.github.io/school-m2-bi-project/

### For Contributors

1. **View Online Documentation:**
   - Visit: https://summers-team.github.io/school-m2-bi-project/

2. **Preview Documentation Locally:**
   ```bash
   # Using mise (recommended)
   mise run dbt:docs:serve

   # Or manually (needs a profiles.yml in dbt/ or ~/.dbt/)
   cd dbt
   uv run --project .. dbt parse
   uv run --project .. dbt docs generate --empty-catalog --no-compile
   uv run --project .. dbt docs serve
   ```

3. **Open Browser:**
   - Navigate to http://localhost:8080

## Technical Details

### Why `--empty-catalog`?

The `--empty-catalog` flag allows dbt to generate documentation without connecting to the database. This is important for:
- CI/CD environments without database credentials
- Contributors who want to preview docs structure without database access
- Faster documentation generation

**Note:** When docs are generated with a database connection (without `--empty-catalog`), dbt populates `catalog.json` with metadata read from the database, such as actual column types and, depending on the adapter, table statistics like row counts.

### Documentation Contents

The generated documentation includes:
- **Data lineage graphs:** Visual representation of model dependencies
- **Model descriptions:** Markdown documentation for each model
- **Column-level documentation:** Descriptions and metadata for each column
- **Test coverage:** Information about data quality tests
- **Source definitions:** Documentation for raw data sources
- **Macro definitions:** Reusable SQL snippets and their documentation

### Files Generated

The documentation generation creates:
- `index.html` - Main documentation interface
- `manifest.json` - dbt project metadata
- `catalog.json` - Database catalog information (empty when using `--empty-catalog`)
- `semantic_manifest.json` - Semantic layer information
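
Before publishing, it can help to confirm that the files the docs site loads are present. The sketch below assumes the site needs `index.html`, `manifest.json`, and `catalog.json` side by side in one directory; `check_docs_dir` is a hypothetical helper, not part of the workflow.

```shell
#!/bin/sh
# Sketch: verify that a directory holds the files the dbt docs site loads.
# check_docs_dir is an illustrative helper name.
check_docs_dir() {
  dir="$1"
  missing=0
  for f in index.html manifest.json catalog.json; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2
      missing=1
    fi
  done
  # Exit status 0 when everything is present, 1 otherwise.
  return "$missing"
}
```

Running `check_docs_dir dbt/target` after `dbt docs generate` (or `check_docs_dir gh-pages` in CI) fails fast when an artifact is missing.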

## Validation

### Local Testing Completed
✅ Documentation generation works without database connection
✅ Generated HTML is valid and contains dbt docs
✅ mise tasks work correctly
✅ YAML workflow file is properly formatted
✅ No security vulnerabilities detected

### Next Steps
1. Enable GitHub Pages in repository settings (Source: "GitHub Actions")
2. Merge PR to main branch
3. Verify workflow runs successfully
4. Access documentation at the GitHub Pages URL

## Troubleshooting

If the workflow fails:

1. **Check Workflow Logs:**
   - Go to Actions tab in GitHub
   - Click on the failed workflow run
   - Review the logs for errors

2. **Common Issues:**
   - GitHub Pages not enabled: Enable in Settings > Pages
   - Permissions error: Ensure workflow has `pages: write` permission
   - Parsing error: Check dbt project structure and model definitions

3. **Local Testing:**
   - Always test locally first using `mise run dbt:docs:generate`
   - Check that models can be parsed successfully

## Support

For more information:
- **dbt Documentation:** https://docs.getdbt.com/docs/collaborate/documentation
- **GitHub Pages:** https://docs.github.com/en/pages
- **GitHub Actions:** https://docs.github.com/en/actions
- **Setup Guide:** `.github/GITHUB_PAGES_SETUP.md`