Skip to content

akvo/vector-knowledge-base-mcp-server

Repository files navigation

Vector Knowledge Base MCP Server

Repo Size Languages Issues Last Commit

A high-performance FastAPI/FastMCP-based Model Context Protocol (MCP) server that provides vector-based knowledge management with document storage, similarity search, and intelligent retrieval capabilities.


πŸ“– Table of Contents


πŸš€ Features

  • FastAPI Backend: High-performance async API server
  • Vector Database: ChromaDB integration for semantic search
  • Document Storage: MinIO object storage for file management
  • PostgreSQL Database: Structured data storage and metadata
  • MCP Protocol: Model Context Protocol server implementation using FastMCP
  • Admin Interface: PgAdmin for database management (development only)
  • Development Ready: Hot-reload and development tools included

πŸ—οΈ Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   FastAPI App   │────│   ChromaDB      │────│   PostgreSQL    β”‚
β”‚   (Port 8100)   β”‚    β”‚   (Port 8101)   β”‚    β”‚   (Port 5432)   β”‚
│─────────────────│    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚   FastMCP App   β”‚
β”‚ (Port 8100/mcp) β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         β”‚
         β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚     MinIO       β”‚    β”‚    PgAdmin      β”‚
β”‚ (Ports 9100/01) β”‚    β”‚   (Port 5550)   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

All port described in above schema are for development

πŸ› οΈ Tech Stack

  • Backend: FastAPI with Python 3.11+
  • Vector Database: ChromaDB for embeddings and similarity search
  • Database: PostgreSQL 12 with Alpine Linux
  • Object Storage: MinIO for file storage
  • Containerization: Docker & Docker Compose
  • MCP Protocol: FastMCP for model context protocol implementation

πŸ“‹ Prerequisites

  • Docker and Docker Compose
  • Python 3.11+ (for local development)
  • Git

πŸš€ Quick Start

Environment Variables

Before running the application, create a .env file based on .env.example:

cp .env.example .env

Fill in the variables according to your environment:

APP_ENV=dev
APP_PORT=8100

# Nginx
NGINX_PORT=8080

DATABASE_URL=postgresql://akvo:password@db:5432/kb_mcp

# MinIO settings
MINIO_ENDPOINT=minio:9000
MINIO_ACCESS_KEY=minioadmin
MINIO_SECRET_KEY=minioadmin
MINIO_BUCKET_NAME=documents
# Should be same as NGINX_PORT
MINIO_SERVER_URL=http://localhost:8080/minio

# Chroma DB settings
CHROMA_DB_HOST=chromadb
CHROMA_DB_PORT=8000
VECTOR_STORE_BATCH_SIZE=100

# OpenAI settings
OPENAI_API_KEY=your-openai-api-key-here
OPENAI_API_BASE=https://api.openai.com/v1
OPENAI_MODEL=gpt-4
OPENAI_EMBEDDINGS_MODEL=text-embedding-ada-002

# Admin Auth
ADMIN_API_KEY=your-admin-api-key-here

Notes

  • APP_ENV accepts two values: prod or dev.
  • This variable controls the startup command in entrypoint.sh, determining whether the application runs in reload mode (dev) or in production mode (prod).
  • VECTOR_STORE_BATCH_SIZE controls how many documents are processed in a single batch when adding to the vector store. There is a trade off between performance and hitting limits on the number of chunks that can be stored at once default is 100 but you can tune this setting here.
  • ADMIN_API_KEY currently used for authentication to access the CRUD API keys endpoint. With this, the script can create an API key that will be used as the authentication token to access the CRUD Knowledge Base.
  • MINIO_ENDPOINT is the internal Docker network address used by the FastAPI application to communicate with MinIO.
  • MINIO_SERVER_URL is the external URL that browsers use to access documents through the Nginx proxy. It should match your NGINX_PORT.

Development Setup

  1. Clone the repository

    git clone [email protected]:akvo/vector-knowledge-base-mcp-server.git
    cd vector-knowledge-base-mcp-server
  2. Set up environment variables

    cp .env.example .env
    # Edit .env with your configurations
  3. Start the development environment

    ./dev.sh up -d
  4. Verify services are running

    docker compose ps
  5. Running pytest

    • Running FastAPI endpoint test
    ./dev.sh exec main ./test.sh api
    • Running e2e test
    ./dev.sh exec main ./test.sh e2e
    • Running FastMCP test
    ./dev.sh exec main ./test.sh mcp
    • Running All test
    ./dev.sh exec main ./test.sh all

Production Setup

  1. Build and start production services
    docker compose -f docker-compose.yml up -d

Service Ports (dev)

Service Development Production Description
FastAPI 8100 8000 Main application API
ChromaDB 8101 8001 Vector database
PostgreSQL 5432 5432 Primary database
MinIO API 9100 9000 Object storage API
MinIO Console 9101 9001 MinIO web interface
PgAdmin 5550 - Database admin (dev only)
Nginx 8080 80 Reverse proxy

πŸ”‘ Authentication and API Keys

This project uses API keys for authentication to access the Knowledge Base and Admin APIs. There are two types of keys:

  1. Admin API Key (ADMIN_API_KEY) – Used for administrative actions, such as creating or revoking other API keys.
  2. API Key – Generated via the Admin API to access the Knowledge Base endpoints.

Using the Admin API Key

  • Your ADMIN_API_KEY is defined in your .env file.
  • To perform administrative tasks, include it in the Authorization header:
Authorization: Admin-API-Key <your_admin_api_key>

Example: Create a new API key via Admin endpoint

curl -X POST http://localhost:8100/api/v1/api-key \
  -H "Authorization: Admin-Key sk_xxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{"name": "app-name", "is_active": true}'

Using the API Key

  • The generated API key is used to access protected Knowledge Base endpoints.
  • Include it in the Authorization header:
Authorization: API-Key <your_api_key>

Example: Query the Knowledge Base

curl -X GET http://localhost:8100/api/v1/knowledge-base \
  -H "Authorization: API-Key sk_xxxxxxx"

πŸ‘‰ For the detail of API-Key usage, please read: SECURITY.md

Summary Table

Key Type Header Name Purpose
Admin API Key Authorization: Admin-Key <key> Manage API keys and administrative tasks
User/API Key Authorization: API-Key <key> Access Knowledge Base and perform CRUD ops

πŸ“¦ MinIO Document Storage and Public Access

This application uses MinIO for object storage and provides public access to uploaded documents through an Nginx reverse proxy. This allows documents to be directly viewed or downloaded in web browsers without requiring AWS signature-based authentication.

How It Works

The document access flow is designed to work seamlessly with Docker networking:

Browser Request
    ↓
http://localhost:8080/minio/documents/kb_1/file.pdf
    ↓
Nginx (port 8080) - Reverse Proxy
    ↓
MinIO Container (minio:9000) - Internal Docker Network
    ↓
Document Served

Key Components:

  1. Internal Communication (MINIO_ENDPOINT):

    • Used by FastAPI application for upload, delete, and management operations
    • Format: minio:9000
    • Only accessible within Docker network
  2. External Access (MINIO_SERVER_URL):

    • Used by browsers to access documents
    • Format: http://localhost:8080/minio
    • Routed through Nginx reverse proxy
  3. Public Bucket Policy:

    • The MinIO bucket is configured with a public read policy
    • Allows direct document access without AWS signatures
    • Policy version 2012-10-17 is the AWS S3 standard (static, never changes)

Configuration

Environment Variables:

# Internal endpoint - used by FastAPI for operations
MINIO_ENDPOINT=minio:9000

# External endpoint - used by browsers to access files
MINIO_SERVER_URL=http://localhost:8080/minio

# MinIO credentials
MINIO_ACCESS_KEY=minioadmin
MINIO_SECRET_KEY=minioadmin
MINIO_BUCKET_NAME=documents

Nginx Configuration:

The Nginx proxy is configured to forward /minio/* requests to the MinIO service:

location /minio/ {
    proxy_pass http://minio/;
    proxy_set_header Host $host;
    # ... additional proxy settings
}

Document URLs

When a document is uploaded and processed, the API returns URLs in this format:

{
  "document_id": 1,
  "file_name": "example.pdf",
  "file_path": "http://localhost:8080/minio/documents/kb_1/example.pdf",
  "file_type": "application/pdf",
  "is_viewable_in_browser": true
}

These URLs can be:

  • Opened directly in a browser
  • Embedded in <iframe> elements
  • Used in PDF viewers
  • Downloaded via direct links

Security Considerations

Current Setup:

  • Documents in the knowledge base are publicly readable
  • No authentication required for document access
  • Suitable for internal networks or non-sensitive data

For Production with Sensitive Data:

If you need to restrict document access, consider:

  1. API-based Access: Remove public bucket policy and stream documents through authenticated API endpoints
  2. Nginx Authentication: Add authentication at the Nginx level
  3. Network Isolation: Keep MinIO and Nginx on a private network
  4. VPN/Firewall: Restrict access to authorized networks only

To disable public access, remove or modify the bucket policy in minio_service.py:

# Comment out or remove this in init_minio()
# set_bucket_public_read_policy(bucket_name)

Then implement authentication at the API or Nginx level based on your security requirements.

πŸ“– API Documentation

Once the application is running (uvicorn app.main:app --reload or via Docker), the API documentation is automatically available through FastAPI docs:

From these interfaces, you can:

  • Try out endpoints directly
  • View request and response schemas
  • Test the API interactively

πŸ“ Project Structure

vector-knowledge-base-mcp-server/
β”œβ”€β”€ main/                          # FastAPI application
β”‚   β”œβ”€β”€ app/
β”‚   β”‚   β”œβ”€β”€ api/                   # API routes (endpoint FastAPI)
β”‚   β”‚   β”œβ”€β”€ core/                  # Core configuration (settings, logging, security)
β”‚   β”‚   β”œβ”€β”€ mcp/                   # MCP related files (FastMCP server, tools)
β”‚   β”‚   β”œβ”€β”€ models/                # Pydantic models / ORM models
β”‚   β”‚   β”œβ”€β”€ schemas/               # API schemas (Pydantic / base)
β”‚   β”‚   β”œβ”€β”€ services/              # Business logic / service layer
β”‚   β”‚   β”œβ”€β”€ utils/                 # Helpers / utilities
β”‚   β”œβ”€β”€ tests/                     # Unit / integration tests
β”‚   β”œβ”€β”€ Dockerfile
β”‚   └── requirements.txt
β”œβ”€β”€ nginx/                         # Nginx reverse proxy
β”‚   β”œβ”€β”€ conf.d/
β”‚   β”‚   └── default.conf          # Nginx configuration
β”‚   └── Dockerfile
β”œβ”€β”€ db/
β”‚   β”œβ”€β”€ docker-entrypoint-initdb.d/ # Init SQL scripts
β”‚   └── script/                    # Migration / seed
β”œβ”€β”€ pgadmin4/
β”‚   └── servers.json               # GUI config
β”œβ”€β”€ docker-compose.yml             # Compose prod
β”œβ”€β”€ docker-compose.override.yml    # Override dev
β”œβ”€β”€ .env.example                   # Env vars
└── README.md

🚨 Troubleshooting

Health Checks

# Check all services
curl http://localhost:8100/health

# Check individual components
curl http://localhost:8101/api/v2/heartbeat  # ChromaDB
curl http://localhost:9100/minio/health/live # MinIO
curl http://localhost:8080/health            # Nginx

Common Issues:

  1. Cannot access documents (404 error)

    • Verify Nginx is running: docker compose ps nginx
    • Check Nginx logs: docker compose logs nginx
    • Ensure MINIO_SERVER_URL matches your NGINX_PORT
  2. MinIO connection refused

    • Verify MinIO is running: docker compose ps minio
    • Check MinIO logs: docker compose logs minio
    • Ensure bucket policy is set correctly (check application logs during startup)
  3. Documents not loading in browser

    • Check if URL format is correct: http://localhost:8080/minio/documents/...
    • Verify bucket policy: Access MinIO console at http://localhost:9101 and check bucket permissions
    • Review Nginx proxy configuration in nginx/conf.d/default.conf

🀝 Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Development Guidelines

  • Follow PEP 8 style guide
  • Add tests for new features
  • Update documentation for API changes
  • Use conventional commits for commit messages

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ†˜ Support

  • Issues: Open an issue on GitHub
  • Documentation: Check the /docs endpoint when server is running
  • Community: Join our discussions in GitHub Discussions

Built with ❀️ using FastAPI and FastMCP

About

No description, website, or topics provided.

Resources

License

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •  

Languages