Security Enhancement: Implement Encrypted Key Management and API Gateway Authentication #19

@ebowwa

Security Enhancement: Encrypted Key Management & API Gateway Authentication

Problem Statement

The current AI Proxy Core implementation has several critical security vulnerabilities:

  1. API keys stored in plain text in memory and environment variables
  2. No encryption for sensitive data at rest or in transit (beyond TLS)
  3. No authentication layer - anyone with access to the API can trigger expensive LLM calls
  4. Keys exposed in logs and error messages
  5. No key rotation support without service restart
  6. Direct provider key exposure to clients

Proposed Solution

Implement a comprehensive security layer that includes:

1. Encrypted Key Management System

  • In-memory encryption using the cryptography library's Fernet symmetric encryption
  • A key derivation function (PBKDF2) to derive the master key
  • Secure string class that decrypts only when accessed
  • Automatic key masking in logs and error messages
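
A rough sketch of the key-derivation and masking pieces listed above, using only the standard library. The names `derive_master_key` and `mask_key`, the iteration count, and the salt handling are assumptions for illustration, not decisions made in this issue. The derived value is 32 bytes, URL-safe base64 encoded, which is the key format that `cryptography.fernet.Fernet` accepts.

```python
import base64
import hashlib
import os


def derive_master_key(passphrase: str, salt: bytes, iterations: int = 600_000) -> bytes:
    """Derive a 32-byte key with PBKDF2-HMAC-SHA256 and return it URL-safe base64 encoded."""
    raw = hashlib.pbkdf2_hmac("sha256", passphrase.encode("utf-8"), salt, iterations, dklen=32)
    return base64.urlsafe_b64encode(raw)


def mask_key(secret: str, visible: int = 4) -> str:
    """Return a log-safe form of a secret that keeps only the last few characters."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]


# Hypothetical usage: the salt is not secret and would be stored next to the ciphertext.
salt = os.urandom(16)
master_key = derive_master_key("example-passphrase", salt)
```

The same passphrase and salt always yield the same key, so only the salt needs to be persisted alongside encrypted values.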

2. Multiple Storage Backend Support

  • Environment variables (encrypted with enc: prefix)
  • HashiCorp Vault integration for enterprise deployments
  • AWS Secrets Manager support
  • Azure Key Vault support
  • OS Keyring for local development (using keyring library)
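
One way the backends could share an interface, sketched with the environment-variable backend only. `KeyBackend`, `EnvBackend`, and the injected `decrypt` callable are hypothetical names. The only detail taken from this issue is the `enc:` prefix that marks an encrypted value.

```python
import os
from typing import Callable, Mapping, Optional, Protocol

ENC_PREFIX = "enc:"


class KeyBackend(Protocol):
    """Interface that each storage backend (env, Vault, keyring, ...) would satisfy."""

    def get(self, name: str) -> Optional[str]: ...


class EnvBackend:
    """Reads keys from environment variables and decrypts values marked with enc:."""

    def __init__(self, decrypt: Callable[[str], str], env: Optional[Mapping[str, str]] = None):
        self._decrypt = decrypt
        self._env = os.environ if env is None else env

    def get(self, name: str) -> Optional[str]:
        value = self._env.get(name)
        if value is None:
            return None
        if value.startswith(ENC_PREFIX):
            return self._decrypt(value[len(ENC_PREFIX):])
        return value  # plain value: legacy fallback for unmigrated variables


# Placeholder "cipher" (string reversal) for demonstration only, not real encryption.
backend = EnvBackend(
    decrypt=lambda s: s[::-1],
    env={"OPENAI_API_KEY": "enc:terces", "GEMINI_API_KEY": "plain"},
)
```

Passing the decryption function in keeps the backend independent of the encryption module, so Vault or keyring backends would not need it at all.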

3. API Gateway Authentication

  • Bearer token authentication with HMAC-SHA256 signing
  • Configurable rate limiting per token
  • Token expiration and revocation support
  • Scope-based access control (read, write, admin)
  • Client ID tracking for audit trails
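
A minimal sketch of HMAC-SHA256 signed bearer tokens with expiry and scopes, using only the standard library. The token layout (`<payload>.<signature>`), the function names, and the claim names are assumptions. Revocation is left out because it needs a server-side list that this issue does not specify.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import List, Optional


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii")


def create_token(secret: bytes, client_id: str, scopes: List[str], ttl_seconds: int = 3600) -> str:
    """Return '<payload>.<signature>', where the signature is HMAC-SHA256 over the payload."""
    payload = json.dumps(
        {"client_id": client_id, "scopes": scopes, "exp": int(time.time()) + ttl_seconds},
        separators=(",", ":"),
    ).encode("utf-8")
    body = _b64(payload)
    sig = hmac.new(secret, body.encode("ascii"), hashlib.sha256).digest()
    return f"{body}.{_b64(sig)}"


def verify_token(secret: bytes, token: str, required_scope: str) -> Optional[dict]:
    """Return the claims if signature, expiry, and scope all check out, else None."""
    try:
        body, sig = token.split(".")
        expected = hmac.new(secret, body.encode("ascii"), hashlib.sha256).digest()
        # compare_digest avoids leaking how many signature bytes matched
        if not hmac.compare_digest(base64.urlsafe_b64decode(sig), expected):
            return None
        claims = json.loads(base64.urlsafe_b64decode(body))
    except ValueError:  # malformed token, bad base64, or bad JSON
        return None
    if claims["exp"] < time.time() or required_scope not in claims["scopes"]:
        return None
    return claims
```

The signature is checked before the payload is parsed, so a tampered payload is rejected without its contents being trusted.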

4. Runtime Security Features

  • Hot key rotation without service restart
  • Automatic key migration from plain environment variables
  • Request validation and sanitization
  • Secure error handling that masks sensitive data
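
Hot rotation could be as simple as having providers read the key at call time from a store guarded by a lock. This is a sketch under that assumption. `RotatingKeyStore` and its version counter are hypothetical and not part of the current codebase.

```python
import threading
from typing import Dict, Optional


class RotatingKeyStore:
    """Holds provider keys in memory and lets them be swapped while the service runs."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._keys: Dict[str, str] = {}
        self._versions: Dict[str, int] = {}

    def rotate(self, provider: str, new_key: str) -> int:
        """Replace the key for a provider and return the new version number."""
        with self._lock:
            self._keys[provider] = new_key
            self._versions[provider] = self._versions.get(provider, 0) + 1
            return self._versions[provider]

    def get(self, provider: str) -> Optional[str]:
        """Read the current key; callers should call this per request, not cache it."""
        with self._lock:
            return self._keys.get(provider)
```

The version number is there so audit logs can record which key generation served a request without logging the key itself.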

Implementation Plan

Phase 1: Core Security Module (Non-Breaking)

Create new ai_proxy_core/security/ module with:

  • encryption.py - Key encryption utilities
  • key_manager.py - Secure key storage and retrieval
  • auth.py - API gateway authentication

Phase 2: Provider Integration (Backward Compatible)

Update existing providers to optionally use secure key management:

# Backward compatible initialization
import os

class GoogleCompletions(BaseCompletions):
    def __init__(self, api_key=None, key_manager=None):
        # Set both attributes on every path so later lookups never raise AttributeError
        self.key_manager = key_manager
        self.api_key = None
        if key_manager is None:
            # Fall back to current behavior
            self.api_key = api_key or os.environ.get("GEMINI_API_KEY")

Phase 3: API Layer Enhancement

Add new secure endpoints alongside existing ones:

# Existing endpoint remains unchanged:
#   @router.post("/api/chat/completions")

# New secure endpoint with auth (auth is the gateway dependency from security/auth.py)
@router.post("/api/secure/chat/completions")
async def secure_completion(token=Depends(auth)):
    ...
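
What the auth dependency might do internally, sketched without any web framework so the logic is easy to test. `authorize`, the injected `verify` callable, and the status-code mapping (401 for a missing or invalid token, 403 for a missing scope) are assumptions.

```python
from typing import Callable, Optional, Tuple


def authorize(
    authorization: Optional[str],
    verify: Callable[[str], Optional[dict]],
    required_scope: str,
) -> Tuple[int, Optional[dict]]:
    """Map an Authorization header to an HTTP status code and the token claims."""
    if not authorization:
        return 401, None
    scheme, _, token = authorization.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return 401, None
    claims = verify(token)  # returns None for an unknown, expired, or tampered token
    if claims is None:
        return 401, None
    if required_scope not in claims.get("scopes", []):
        return 403, None
    return 200, claims
```

A framework dependency would wrap this and raise the framework's HTTP error for any status other than 200.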

Phase 4: Migration Tools

Provide utilities to help users migrate:

  • Key migration script from env vars to secure storage
  • Token generation CLI tool
  • Documentation and migration guide
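
The key migration script could start from something like the following, which rewrites plain-text key variables into the `enc:` form. The `_API_KEY` suffix filter, the function name, and the injected `encrypt` callable are assumptions.

```python
from typing import Callable, Dict, Mapping

ENC_PREFIX = "enc:"


def migrate_env_keys(
    env: Mapping[str, str],
    encrypt: Callable[[str], str],
    suffix: str = "_API_KEY",
) -> Dict[str, str]:
    """Return enc:-prefixed replacements for plain-text key variables."""
    migrated: Dict[str, str] = {}
    for name, value in env.items():
        if not name.endswith(suffix) or not value:
            continue  # not a key variable, or empty
        if value.startswith(ENC_PREFIX):
            continue  # already migrated
        migrated[name] = ENC_PREFIX + encrypt(value)
    return migrated
```

Returning the replacements instead of mutating the environment lets the script print a dry-run diff before anything is written.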

Affected Files

Core Files to Modify:

  1. ai_proxy_core/providers/google.py - Add optional secure key support
  2. ai_proxy_core/providers/openai.py - Add optional secure key support
  3. ai_proxy_core/providers/ollama.py - Update for consistency
  4. ai_proxy_core/completion_client.py - Add key manager integration
  5. api/completions.py - Add secure endpoints

New Files to Create:

  1. ai_proxy_core/security/__init__.py
  2. ai_proxy_core/security/encryption.py
  3. ai_proxy_core/security/key_manager.py
  4. ai_proxy_core/security/auth.py
  5. examples/secure_setup.py - Setup guide

Configuration Files:

  1. setup.py - Add security dependencies as optional extras
  2. requirements-security.txt - Security-specific dependencies

Dependencies

Required (add to extras_require):

"security": [
    "cryptography>=41.0.0",  # Encryption
    "keyring>=24.0.0",       # OS keyring
]

Optional (for storage backends):

"vault": ["hvac>=1.0.0"],           # HashiCorp Vault
"aws": ["boto3>=1.28.0"],           # AWS Secrets Manager  
"azure": ["azure-keyvault-secrets>=4.7.0"],  # Azure Key Vault
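
Put together, the extras above might be declared as one mapping in setup.py. The `all` extra is a hypothetical convenience and is not requested anywhere in this issue.

```python
# Extras taken from the lists above; pass this mapping to setup(extras_require=...).
EXTRAS_REQUIRE = {
    "security": ["cryptography>=41.0.0", "keyring>=24.0.0"],
    "vault": ["hvac>=1.0.0"],
    "aws": ["boto3>=1.28.0"],
    "azure": ["azure-keyvault-secrets>=4.7.0"],
}

# Hypothetical "all" extra: the union of every optional dependency.
EXTRAS_REQUIRE["all"] = sorted({dep for deps in EXTRAS_REQUIRE.values() for dep in deps})
```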

Backward Compatibility

Critical: All changes must be backward compatible:

  1. Existing code continues to work without any changes
  2. Environment variables still supported as fallback
  3. Current API endpoints unchanged - new secure endpoints added separately
  4. Opt-in security - users choose when to enable enhanced security
  5. Gradual migration path - can run mixed mode during transition

Testing Requirements

  1. Unit tests for all security components
  2. Integration tests for each storage backend
  3. Backward compatibility tests - ensure existing code works
  4. Security tests - attempt key extraction, invalid tokens, etc.
  5. Performance tests - measure encryption overhead
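
An example of what a key-extraction test could look like. `MaskedSecret` is a stand-in for the proposed secure string class, which does not exist yet, so its API here is an assumption.

```python
import unittest


class MaskedSecret:
    """Stand-in for the proposed secure string class (hypothetical API)."""

    def __init__(self, value: str) -> None:
        self._value = value

    def reveal(self) -> str:
        return self._value

    def __repr__(self) -> str:
        return "MaskedSecret(****)"

    __str__ = __repr__


class KeyLeakTests(unittest.TestCase):
    def test_rendering_does_not_leak(self):
        secret = MaskedSecret("sk-test-123")
        for rendered in (repr(secret), str(secret), f"{secret}", "%s" % secret):
            self.assertNotIn("sk-test-123", rendered)

    def test_exception_message_does_not_leak(self):
        secret = MaskedSecret("sk-test-123")
        try:
            raise RuntimeError(f"provider call failed with key {secret}")
        except RuntimeError as exc:
            self.assertNotIn("sk-test-123", str(exc))
```

Only an explicit `reveal()` call returns the key, which makes accidental exposure through logging or string formatting easy to test for.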

Security Considerations

  1. Master key management - Document best practices for master key storage
  2. Token distribution - Secure methods for sharing tokens with clients
  3. Audit logging - Track all key access and API usage
  4. Compliance - Ensure solution meets common compliance requirements
  5. Key rotation schedule - Recommend rotation intervals
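
For audit logging, one option is a dedicated logger that records who touched which key, identified by name only. The logger name, the field names, and `audit_event` are assumptions.

```python
import json
import logging
import time

# Hypothetical logger name; handlers and destinations are left to the deployment.
audit_logger = logging.getLogger("ai_proxy_core.audit")


def audit_event(client_id: str, action: str, key_name: str) -> str:
    """Emit one JSON audit record. The key's name is logged, never its value."""
    record = json.dumps(
        {"ts": int(time.time()), "client_id": client_id, "action": action, "key": key_name},
        sort_keys=True,
    )
    audit_logger.info(record)
    return record
```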

Migration Guide Example

# Step 1: Install security extras (shell). The Vault backend used below also needs the vault extra.
#   pip install "ai-proxy-core[security,vault]"

import asyncio
import os

# Step 2: Initialize secure key manager
from ai_proxy_core.security import SecureKeyManager, KeyProvider

key_manager = SecureKeyManager(
    provider=KeyProvider.VAULT,
    vault_url="https://vault.company.com",
    vault_token=os.environ.get("VAULT_TOKEN")
)

# Step 3: Migrate existing keys (set_api_key is awaited, so run it inside an event loop)
async def migrate_keys():
    await key_manager.set_api_key("openai", os.environ.get("OPENAI_API_KEY"))
    await key_manager.set_api_key("gemini", os.environ.get("GEMINI_API_KEY"))

asyncio.run(migrate_keys())

# Step 4: Update initialization (optional)
from ai_proxy_core import CompletionClient
client = CompletionClient(key_manager=key_manager)

# Step 5: Enable API authentication
from ai_proxy_core.security import APIGatewayAuth
auth = APIGatewayAuth()
token = auth.create_token(client_id="my-app", scopes=["read", "write"])

Success Metrics

  1. Zero breaking changes to existing API
  2. Less than 5 ms of latency added by the encryption layer
  3. 100% test coverage for security components
  4. Complete documentation with examples
  5. Successful migration by early adopters
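
The latency target could be checked with a small timing harness like this one. It is a sketch: `median_overhead_ms` is a hypothetical helper, and the operation to measure (for example one decrypt call) is supplied by the caller.

```python
import statistics
import time
from typing import Callable


def median_overhead_ms(operation: Callable[[], object], runs: int = 200) -> float:
    """Run the operation repeatedly and return the median wall-clock time in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)
```

The median is used instead of the mean so a single slow run, such as a garbage-collection pause, does not skew the result.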

Questions to Resolve

  1. Should we make security mandatory in next major version (2.0)?
  2. Which storage backends to prioritize?
  3. Default rate limits for API tokens?
  4. Should we add webhook support for key rotation events?
  5. Integration with existing auth providers (OAuth, JWT)?

Priority: High
Estimated Effort: 2-3 weeks
Breaking Changes: None (backward compatible)
Target Version: 0.4.0
