
An end-to-end MLOps project for dynamic AI ad generation using FastAPI, Celery, Gemini 2.5 Pro, and Scikit-learn, fully containerized with Docker and deployed on Kubernetes with a CI/CD pipeline via GitHub Actions.


Aether: AI-Powered Dynamic Ad Generation & Optimization System


Aether is an end-to-end MLOps application that solves a core business problem: the automated creation of personalized ad copy. The system uses a classic machine learning model (K-Means Clustering) on real sales data to discover customer segments and then leverages a powerful large language model (Google's Gemini 2.5 Pro) to generate tailored ad copy for each of those segments on demand.

The entire application is built on a scalable, asynchronous, and containerized architecture using FastAPI, Celery, Docker, and PostgreSQL. It is further enhanced with a professional CI/CD pipeline for automated testing and builds, and is fully deployable to a production-like Kubernetes environment.


Key Features

  • Data-Driven Customer Segmentation: Uses K-Means clustering on the Olist e-commerce dataset to identify distinct customer profiles based on purchasing behavior (e.g., "High-Value Champions", "New Customers").
  • On-Demand AI Ad Generation: Employs Google's Gemini 2.5 Pro API with advanced prompt engineering to create unique, high-quality ad copy for each customer segment.
  • Asynchronous & Scalable Backend: Built with a modern Python stack (FastAPI, Celery, Redis) to handle multiple long-running AI tasks efficiently without blocking the user.
  • Containerized for Portability: Fully containerized with Docker, allowing the entire multi-service application (api, worker, db, redis) to run consistently in any environment.
  • CI/CD Automation: Includes a GitHub Actions workflow that automatically runs the test suite, builds the production Docker image, and pushes it to a container registry on every commit to the main branch.
  • Production-Ready Deployment: Comes with complete Kubernetes manifests for deploying the application to a production-grade orchestration platform.
  • Optimization-Ready: Features a /feedback endpoint to track ad performance (clicks/impressions), enabling a closed-loop system for future optimization and model re-training.
  • Fully Tested: Includes a comprehensive suite of automated integration tests using pytest to ensure code quality and reliability.
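The closed feedback loop mentioned above can be made concrete with a small sketch: given raw click/impression events from the /feedback endpoint, compute a click-through rate per creative. The event shape and field names here are illustrative assumptions, not the repository's actual schema.

```python
from collections import defaultdict

def aggregate_ctr(events):
    """Compute click-through rate per creative from raw feedback events.

    Each event is assumed to look like:
        {"creative_id": 1, "event": "click"}  or  {"creative_id": 1, "event": "impression"}
    """
    clicks = defaultdict(int)
    impressions = defaultdict(int)
    for e in events:
        if e["event"] == "click":
            clicks[e["creative_id"]] += 1
        elif e["event"] == "impression":
            impressions[e["creative_id"]] += 1
    # CTR is only defined for creatives that have at least one impression.
    return {
        cid: clicks[cid] / impressions[cid]
        for cid in impressions
        if impressions[cid] > 0
    }
```

A table of per-creative CTRs like this is the natural input for future optimization or model re-training.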

System Architecture

The system is designed as a set of communicating microservices, separating the API, background processing, and data storage into distinct, scalable containers.

System Architecture Diagram

+------------------------------------------------------------------------------------------------------------------------+
|                                       THE AETHER PRODUCTION SYSTEM (Run Time)                                          |
|                                                                                                                        |
|           [Olist Kaggle Data] -> [K-Means Model] -> [Customer Segments]                                                |
|           (Runs on CPU)                              |                                                                 |
|                                                      v                                                                 |
| [User] -> [FastAPI] -> [Redis] -> [Celery Worker] --(Uses Segments to build prompt)-->[Calls Gemini 2.5 Pro API]       |
|    |                                 |       ^                                                                         |
|    |                                 |       | (Stores Results)                                                        |
|    |                                 v       +------------------+                                                      |
|    +-----> [POST /feedback] -> [Postgres DB] <------------------+                                                      |
|          (Enables Optimization)                                                                                        |
+------------------------------------------------------------------------------------------------------------------------+

How It Works

The application workflow is fully asynchronous to handle the time-intensive AI generation process:

  1. API Request: A user sends a POST request to the /campaigns endpoint with product information. The FastAPI server immediately creates a PENDING campaign in the PostgreSQL database.
  2. Task Queuing: The API places a job on a Redis queue with the campaign ID and instantly returns a 202 Accepted response to the user.
  3. Background Processing: A Celery worker, running in a separate container, picks up the job from the Redis queue.
  4. Customer Segmentation: The worker loads a pre-trained K-Means model (built from the Olist dataset) to get a list of customer segments.
  5. AI Generation: For each segment, the worker constructs a detailed prompt and calls the Gemini 2.5 Pro API to generate tailored ad copy.
  6. Persistence: The worker saves each piece of generated ad copy to the creatives table in the PostgreSQL database and updates the campaign's status to COMPLETED.
  7. Result Retrieval: The user can then send a GET request to the /campaigns/{id} endpoint to retrieve the final, AI-generated results.
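The worker's side of this flow (steps 4-6) can be sketched as plain Python. The prompt template, segment names, and the `generate_ad_copy` callable below are illustrative stand-ins for the real K-Means output and the Gemini 2.5 Pro call, and persistence to PostgreSQL is reduced to the returned dict:

```python
def build_prompt(product_info: str, segment: str) -> str:
    """Construct a segment-specific prompt for the LLM (illustrative template)."""
    return (
        f"You are a marketing copywriter. Write a short, punchy ad for: {product_info}. "
        f"Target audience: customers in the '{segment}' segment."
    )

def run_campaign(product_info: str, segments, generate_ad_copy) -> dict:
    """Simulate steps 4-6: one creative per segment, then mark COMPLETED.

    `generate_ad_copy` stands in for the Gemini API call; in the real worker
    each creative is saved to the `creatives` table before the campaign's
    status flips to COMPLETED.
    """
    creatives = [
        {"segment": s, "ad_copy": generate_ad_copy(build_prompt(product_info, s))}
        for s in segments
    ]
    return {"status": "COMPLETED", "creatives": creatives}
```

For example, `run_campaign("organic coffee beans", ["High-Value Champions"], my_llm_call)` yields one creative for that segment with status COMPLETED.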

Technology Stack

  • Backend: FastAPI, Celery
  • Database: PostgreSQL
  • Cache & Message Broker: Redis
  • Machine Learning: Scikit-learn, Pandas, Joblib
  • Generative AI: Google Gemini 2.5 Pro API
  • Infrastructure: Docker, Docker Compose, Kubernetes
  • Automation (CI/CD): GitHub Actions
  • Testing: Pytest
  • Dependency Management: Poetry

Project Structure

The project is organized with a clean separation of concerns for the application source (src), tests, and infrastructure configurations.

aether-marketing-system/
├── .github/
│   └── workflows/
│       └── ci-pipeline.yml         # CI/CD automation workflow
├── data/                           # (Not in Git) Contains raw and processed data
│   ├── raw/
│   │   ├── olist_customers_dataset.csv
│   │   └── ... (all other Olist CSVs)
│   └── processed/
│       ├── customer_segmentation_model.joblib
│       └── customer_segmentation_scaler.joblib
├── kubernetes/
│   ├── api-deployment.yaml
│   ├── api-service.yaml
│   ├── configmap.yaml
│   ├── db-deployment.yaml
│   ├── redis-deployment.yaml
│   ├── secrets.yaml
│   └── worker-deployment.yaml
├── src/
│   ├── api/
│   │   ├── endpoints/
│   │   │   ├── campaigns.py
│   │   │   └── feedback.py
│   │   ├── main.py
│   │   └── schemas.py
│   ├── core/
│   │   ├── config.py
│   │   └── db.py
│   ├── models/
│   │   └── campaign_models.py
│   └── worker/
│       ├── celery_app.py
│       ├── customer_segmentation.py
│       └── tasks.py
├── tests/
│   └── test_api.py                 # Automated API tests
├── .env                            # (Not in Git) Your local secret keys
├── .env.example                    # Template for environment variables
├── .gitignore
├── docker-compose.yml              # Local development orchestration
├── Dockerfile                      # Builds the application container
├── kaggle.json                     # (Not in Git) Your Kaggle API credentials
├── poetry.lock
├── pyproject.toml                  # Python dependency management
├── pytest.ini
└── README.md                       # This file

Prerequisites

To run this project, you will need the following accounts and tools:

  • Docker Desktop: Installed and running on your local machine.
  • Kubernetes: Enabled within your Docker Desktop settings.
  • Google API Key: A valid API key with access to the Gemini 2.5 Pro model, obtainable from Google AI Studio.
  • Kaggle API Key: Your Kaggle username and API key, obtainable from your Kaggle Account Settings.
  • Docker Hub Account: A free account at hub.docker.com for the CI/CD pipeline to push images to.

Setup & Running Instructions

This project is designed to run in a containerized environment. The following instructions provide a clear, step-by-step flow from initial setup to running the application both locally with Docker Compose and in a production-like environment with Kubernetes.

Part 1: One-Time Prerequisites & Configuration

This initial setup only needs to be performed once to prepare your machine and accounts.

1. Install Required Software

Ensure the following tools are installed and running on your machine:

  • Docker Desktop: For building and running the containers.
  • Git: For cloning the repository.

2. Enable Kubernetes

Open Docker Desktop, navigate to Settings > Kubernetes, and check the "Enable Kubernetes" box. Click "Apply & Restart" and wait for the indicator in the bottom-left to turn green.

3. Gather API Keys, Credentials & Dataset

You will need to create accounts and generate credentials from three external services:

  • Google API Key: For accessing the Gemini 2.5 Pro model. Obtain this from Google AI Studio.
  • Kaggle API Key & Dataset: The project uses the public Brazilian E-Commerce Public Dataset by Olist. To download it automatically, you need an API key. Go to your Kaggle Account Settings, click "Create New Token" to download a kaggle.json file containing your username and key.
  • Docker Hub Credentials: A username and a personal access token are required for the CI/CD pipeline. Create a free account at hub.docker.com and generate an access token under Account Settings > Security.

4. Clone the Repository

git clone https://github.com/Aditya-ADII/Aether-Marketing-System.git
cd Aether-Marketing-System

5. Configure Local Secret Files (Not Committed to Git)

The application requires two secret files in the project root. These are listed in .gitignore to protect your credentials and must be created manually.

  • Create the .env file: Copy the template: cp .env.example .env. Open the new .env file with a text editor and add your Google API Key and other required values.

  • Create the kaggle.json file: Create a new file named kaggle.json and add your Kaggle credentials from the file you downloaded in step 3.

    {
      "username": "your-kaggle-username",
      "key": "your-kaggle-api-key"
    }
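Before building the image, you can sanity-check that the file parses and carries both required fields. This helper is a convenience sketch, not part of the repository:

```python
import json
from pathlib import Path

def check_kaggle_credentials(path: str = "kaggle.json") -> bool:
    """Return True if kaggle.json exists and has non-empty username and key fields."""
    p = Path(path)
    if not p.exists():
        return False
    creds = json.loads(p.read_text())
    return bool(creds.get("username")) and bool(creds.get("key"))
```

Run it from the project root; a False result means the Kaggle dataset download during the Docker build will fail.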

6. Configure GitHub Secrets for CI/CD

For the automated CI/CD pipeline to work, you must add your credentials to your GitHub repository's secrets. Go to your repository's Settings > Secrets and variables > Actions and create the following four repository secrets:

  • DOCKERHUB_USERNAME: Your Docker Hub username.
  • DOCKERHUB_TOKEN: The access token you generated on Docker Hub.
  • KAGGLE_USERNAME: Your Kaggle username.
  • KAGGLE_KEY: Your Kaggle API key.

Part 2: Running the Project

This project has three distinct operational stages: local development and testing, CI/CD automation, and deployment to Kubernetes. Following these steps will allow you to run, test, and deploy the entire application.

1. Local Development & Testing

This workflow is for running the application and its test suite on your local machine.

Step 1: Install Dependencies This project uses Poetry for dependency management. Install all required packages, including development dependencies like pytest, by running:

poetry install

Step 2: Run Automated Tests Before running the full application, you can verify the core logic by running the automated test suite. The tests use a temporary, in-memory database and will not affect your Docker environment.

pytest

A successful run will show 4 passed, confirming the application's logic is sound.

Step 3: Build and Run with Docker Compose This single command builds the Docker image (which includes downloading the Kaggle dataset and training the segmentation model) and starts all four services (api, worker, db, redis):

docker-compose up --build

The application is now running.

  • The API is available at http://localhost:8000.
  • Interactive documentation is at http://localhost:8000/docs.

Step 4: Manually Test the Live Application You can now interact with the running system.

  • To see the logs from all services in real-time, run:
    docker-compose logs -f
  • To create a campaign, open a new terminal and send a POST request (the commands below use PowerShell's Invoke-WebRequest; on macOS/Linux, curl with the same URL, method, and JSON body works equally well):
    Invoke-WebRequest -Uri http://localhost:8000/campaigns/ -Method POST -ContentType "application/json" -Body '{"product_info": "A new line of premium, organic coffee beans."}'
  • To get the results after waiting ~1 minute, send a GET request:
    Invoke-WebRequest http://localhost:8000/campaigns/1
    A successful run returns a JSON object with "status": "COMPLETED" and a list of AI-generated creatives.
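The same requests can also be issued from Python's standard library, which is handy for scripting. This is an illustrative client, not part of the repository; it assumes the docker-compose stack is up on localhost:8000 and that the response bodies are JSON:

```python
import json
import urllib.request

API_BASE = "http://localhost:8000"  # docker-compose default from this README

def campaign_payload(product_info: str) -> bytes:
    """Encode the JSON body expected by POST /campaigns/."""
    return json.dumps({"product_info": product_info}).encode("utf-8")

def create_campaign(product_info: str) -> dict:
    """POST /campaigns/ and return the parsed JSON response."""
    req = urllib.request.Request(
        f"{API_BASE}/campaigns/",
        data=campaign_payload(product_info),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def get_campaign(campaign_id: int) -> dict:
    """GET /campaigns/{id} and return the parsed JSON response."""
    with urllib.request.urlopen(f"{API_BASE}/campaigns/{campaign_id}") as resp:
        return json.load(resp)
```

For example: `create_campaign("A new line of premium, organic coffee beans.")`, then after about a minute, `get_campaign(1)`.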

2. CI/CD Automation (GitHub Actions)

This workflow automates the testing and build process whenever you push code to your repository.

Step 1: The Trigger Commit and push your code to the main branch on GitHub.

git push origin main

Step 2: The Automated Workflow This push automatically triggers the GitHub Actions pipeline defined in .github/workflows/ci-pipeline.yml. The pipeline will execute the following jobs in the cloud:

  1. Install all Python dependencies using Poetry.
  2. Run the entire pytest suite to validate the code.
  3. Build the final, production-ready Docker image.
  4. Push the tagged image to your Docker Hub repository using the secrets you configured.
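A GitHub Actions workflow with that shape might look roughly like the sketch below; job names, action versions, and the image tag are illustrative, not copied from ci-pipeline.yml:

```yaml
name: CI Pipeline
on:
  push:
    branches: [main]
jobs:
  test-and-build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install poetry && poetry install
      - run: poetry run pytest
      - uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      - run: |
          docker build -t ${{ secrets.DOCKERHUB_USERNAME }}/aether-marketing-system:latest .
          docker push ${{ secrets.DOCKERHUB_USERNAME }}/aether-marketing-system:latest
```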

Step 3: Verification You can watch the pipeline run in real-time by going to the "Actions" tab on your GitHub repository. A green checkmark indicates a successful run, meaning your code has been tested and a deployable artifact has been created.


3. Production-like Deployment (Kubernetes)

This workflow demonstrates how to deploy the final application to a production-grade environment.

Step 1: Prerequisite Ensure your CI/CD pipeline has run successfully and pushed the latest image to your Docker Hub repository.

Step 2: Update Kubernetes Manifests In the kubernetes/api-deployment.yaml and kubernetes/worker-deployment.yaml files, ensure the image: field points to your correct Docker Hub repository (e.g., aditya12121/aether-marketing-system:latest).

Step 3: Deploy the Application From the project root, apply all Kubernetes configurations to your running local cluster:

kubectl apply -f kubernetes/

Step 4: Verify the Deployment Check that all of the application's pods are running successfully.

kubectl get pods

Wait until all pods show Running under the STATUS column with 0 restarts.

Step 5: Access and Test the Service To access the API running inside Kubernetes, open a new terminal and forward the port:

kubectl port-forward svc/aether-api-service 8080:80

The API will now be accessible at http://localhost:8080.


Project Validation & Demo Results

The project has been fully validated at every stage, from local testing to a live Kubernetes deployment.

1. End-to-End Functional Test

A POST request was sent to create a campaign, and a subsequent GET request confirmed the successful generation of AI-powered ad copy with a COMPLETED status. The final results are persisted correctly in the PostgreSQL database.

(Screenshots: API POST request, API GET result, database query result)

2. Automated Test Suite

The project includes a full suite of automated tests using pytest to ensure code quality and reliability. The successful test run below confirms that all API endpoints and logic are working as expected.

(Screenshot: pytest results)

3. Automated CI/CD Pipeline

The GitHub Actions pipeline successfully automated the testing, building, and publishing of the application's Docker image.

(Screenshot: CI/CD pipeline success)

4. Kubernetes Deployment

The application was successfully deployed to a local Kubernetes cluster, with all pods (api, worker, db, redis) in a healthy, Running state.

(Screenshot: Kubernetes pods running)

5. Data Files

The raw and processed data files used by the project.

(Screenshot: data files)

6. Redis Keys

The Redis keys created by the project's task queue.

(Screenshot: Redis keys)

7. Docker Container Build (Terminal)

The Docker image build output in the terminal.

(Screenshots: Docker build done, Docker build running)

8. Docker Containers in Docker Desktop

The project's containers running in Docker Desktop.

(Screenshots: containers in Docker Desktop, Docker Desktop YML file)
