Mezon Call Translation

Real-time Speech-to-Text (STT) system with multi-client support, horizontal scaling, and LiveKit integration

🚀 Quick Start

Option 1: Basic Setup

# 1. Setup environment
./scripts/setup.sh                    # Linux/macOS
.\scripts\setup.ps1                    # Windows

# 2. Download models
./scripts/download-vosk-model.sh       # STT model (Linux/macOS)
.\scripts\download-vosk-model.ps1       # STT model (Windows)

./scripts/download-kokoro-model.sh     # TTS model (Linux/macOS)
.\scripts\download-kokoro-model.ps1     # TTS model (Windows)

# 3. Configure environment
cp env.example .env                    # Edit with your LiveKit credentials

# 4. Run the system
./scripts/run-dev.sh                   # Development mode
./scripts/run-prod.sh                  # Production mode

Option 2: Horizontal Scaling (Recommended for Production)

# Start with 5 server instances behind load balancer
./scripts/scale-deploy.sh start 5      # Linux/macOS
.\scripts\scale-deploy.ps1 start 5      # Windows

# Scale to 10 instances
./scripts/scale-deploy.sh scale 10

# Check status
./scripts/scale-deploy.sh status

System will be available at: http://localhost:8000

📋 Table of Contents

System Overview
Architecture
Features
Quick Start
Documentation
Deployment Options
API Reference
Monitoring
Contributing
Support

🎯 System Overview

Mezon Call Translation is a production-ready, scalable Speech-to-Text system designed for real-time communication platforms. It provides:

Real-time STT: Convert speech to text with low latency using Vosk engine
Multi-client Support: Handle multiple simultaneous audio streams
Horizontal Scaling: Scale across multiple server instances with load balancing
LiveKit Integration: Seamless integration with LiveKit rooms and agents
High Availability: Circuit breaker pattern, health monitoring, and graceful degradation

Key Technologies

Vosk: Offline speech recognition engine
FastAPI: Modern web framework with WebSocket support
LiveKit: Real-time communication platform
Docker: Containerization and orchestration
Nginx: Load balancer and proxy

🏗️ Architecture

High-Level Architecture

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Multiple      │    │  Nginx Load     │    │  Server Pool    │
│   Clients       │◄──►│  Balancer       │◄──►│  (Scalable)     │
│                 │    │  (Port 8000)    │    │                 │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                │                       │
                                ▼                       ▼
                       ┌─────────────────┐    ┌─────────────────┐
                       │  LiveKit Agent  │    │  Vosk STT       │
                       │  (Port 8080)    │    │  Workers        │
                       └─────────────────┘    └─────────────────┘

Core Components

Server - FastAPI with multi-worker STT processing
Agent - LiveKit integration with VAD processing
Load Balancer - Nginx for traffic distribution and WebSocket proxy
Session Management - Multi-client session coordination
Health Monitoring - Comprehensive health checks and metrics

✨ Features

Core Capabilities

✅ Real-time Speech-to-Text with Vosk engine
✅ Multi-client Session Management with language support
✅ WebSocket-based Communication for low latency
✅ Adaptive Processing based on system load
✅ Circuit Breaker Pattern for fault tolerance
✅ Voice Activity Detection (VAD) for efficiency

Scalability & Operations

🚀 Horizontal Scaling with automatic load balancing
📊 Comprehensive Monitoring and health checks
🔄 Auto-recovery and graceful degradation
📈 Performance Metrics and analysis tools
🐳 Docker Containerization for easy deployment
🔧 Configuration Management via environment variables

Integration Features

🎤 LiveKit Integration for room management
🌐 REST API for system management
📡 WebSocket API for real-time communication
🔐 JWT Authentication support
📝 Multi-language Support for transcripts

📚 Documentation

Setup & Installation

Setup Guide - Complete installation and configuration guide
Environment Configuration - LiveKit credentials and system settings
Model Management - Vosk model download and configuration

Architecture Documentation

Server Architecture - Detailed server design and components
Agent Architecture - LiveKit agent implementation
System Design Patterns - Circuit breaker, session management, worker pools

Operations & Monitoring

Metrics Guide - Performance monitoring and log analysis
Health Check Endpoints - System status and monitoring
Troubleshooting - Common issues and solutions

Development

API Documentation - REST and WebSocket endpoints
Configuration Reference - All environment variables and settings
Development Setup - Hot reload and debugging

🚀 Deployment Options

1. Development Mode

# Hot reload enabled, debug logging
./scripts/run-dev.sh

✅ Hot reload for code changes
✅ Enhanced debugging
✅ Local volume mounting

2. Production Mode

# Optimized for production
./scripts/run-prod.sh

✅ Performance optimizations
✅ Resource limits
✅ Production logging

3. Horizontal Scaling (Recommended)

# Multiple server instances with load balancing
./scripts/scale-deploy.sh start 5

✅ Multiple server instances
✅ Nginx load balancer
✅ Auto-scaling capabilities
✅ High availability

4. Manual Docker Compose

# Custom scaling
docker-compose up -d --scale server=3

🔌 API Reference

WebSocket API

ws://localhost:8000/ws/vosk/

Parameters:

client_id: Unique client identifier
session_id: Session identifier
transcript: Enable transcript delivery
translation: Enable translation delivery
language: Client language (en, vi, etc.)

Input: Binary audio data
Output: JSON transcript/translation results
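
As a rough illustration of how a client might assemble the connection URL from the parameters above, here is a minimal Python sketch. It assumes the parameters are passed in the query string and that booleans are spelled `true`/`false`; the README does not state either detail, so verify against the server code. `build_ws_url` is a hypothetical helper, not part of the project.

```python
from urllib.parse import urlencode


def build_ws_url(base_url, client_id, session_id, language="en",
                 transcript=True, translation=False):
    """Assemble a connection URL for the /ws/vosk/ endpoint.

    Assumption: parameters travel in the query string and booleans are
    encoded as "true"/"false". Check the server for the real encoding.
    """
    params = {
        "client_id": client_id,
        "session_id": session_id,
        "transcript": str(transcript).lower(),
        "translation": str(translation).lower(),
        "language": language,
    }
    return f"{base_url.rstrip('/')}/ws/vosk/?{urlencode(params)}"


url = build_ws_url("ws://localhost:8000", "client-1", "room-42", language="vi")
print(url)
```

A WebSocket client would then connect to this URL, send binary audio frames, and read JSON messages back.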

REST API

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/health` | GET | Detailed health status |
| `/health/simple` | GET | Simple health check |
| `/agent/join` | POST | LiveKit agent dispatch |
| `/ws/stats` | GET | WebSocket statistics |

Health Check Example

# Simple health check
curl http://localhost:8000/health/simple

# Detailed health information
curl http://localhost:8000/health
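
The same check can be scripted. The sketch below uses only the Python standard library and assumes that the endpoints answer with HTTP 200 when the service is up and 503 when it is not; the verdict names (`ok`, `unavailable`, and so on) are invented for this example.

```python
import urllib.error
import urllib.request


def classify_status(http_status):
    """Map an HTTP status code to a coarse verdict (assumed: 200 up, 503 down)."""
    if http_status == 200:
        return "ok"
    if http_status == 503:
        return "unavailable"
    return "unexpected"


def probe(url, timeout=2.0):
    """Return a verdict for one endpoint; network errors count as unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify_status(resp.status)
    except urllib.error.HTTPError as err:  # non-2xx responses raise HTTPError
        return classify_status(err.code)
    except (urllib.error.URLError, OSError):
        return "unreachable"
```

Calling `probe("http://localhost:8000/health/simple")` from a cron job or deployment script gives a single string to branch on.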

📊 Monitoring

Built-in Monitoring

Health Endpoints: Real-time system status
Metrics Collection: Performance and usage statistics
Log Analysis: Comprehensive logging with structured format
Worker Statistics: STT worker performance tracking

Key Metrics

Audio Processing Latency: Real-time performance tracking
Worker Load Distribution: Load balancing effectiveness
Session Management: Client connection statistics
Error Rates: System reliability monitoring

Monitoring Tools

# Check system status
./scripts/scale-deploy.sh status

# View real-time logs
docker-compose logs -f server

# Monitor resource usage
docker stats

For detailed metrics analysis, see the Metrics Guide.

🔧 Configuration

Environment Variables

# Core Configuration
LIVEKIT_URL=wss://your-livekit-server.com
LIVEKIT_API_KEY=your-api-key
LIVEKIT_API_SECRET=your-api-secret
VOSK_MODEL_PATH=/app/models/vosk-model

# Performance Tuning
SERVER_HOST=0.0.0.0
SERVER_PORT=8000
LOG_LEVEL=INFO
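
For illustration, the variables above could be read into a typed settings object as follows. This is a sketch, not the project's loader: the `Settings` class, the choice of which variables are required, and the defaults (copied from the sample values above) are all assumptions.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    livekit_url: str
    livekit_api_key: str
    livekit_api_secret: str
    vosk_model_path: str
    server_host: str
    server_port: int
    log_level: str


def load_settings(env=os.environ):
    """Read the variables listed above from a mapping (os.environ by default)."""
    required = ("LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise RuntimeError(f"Missing required variables: {', '.join(missing)}")
    return Settings(
        livekit_url=env["LIVEKIT_URL"],
        livekit_api_key=env["LIVEKIT_API_KEY"],
        livekit_api_secret=env["LIVEKIT_API_SECRET"],
        # Defaults below mirror the sample values and are assumptions.
        vosk_model_path=env.get("VOSK_MODEL_PATH", "/app/models/vosk-model"),
        server_host=env.get("SERVER_HOST", "0.0.0.0"),
        server_port=int(env.get("SERVER_PORT", "8000")),
        log_level=env.get("LOG_LEVEL", "INFO"),
    )
```

Failing fast on missing credentials makes a misconfigured `.env` visible at startup rather than on the first agent connection.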

Advanced Configuration

The system supports extensive configuration for:

Audio processing parameters
Worker pool management
Circuit breaker settings
Health check intervals
Scaling parameters

See the Setup Guide for the complete list of configuration options.

🤝 Contributing

Development Setup

Fork the repository
Follow the Setup Guide
Use development mode: ./scripts/run-dev.sh
Make changes with hot reload enabled
Test thoroughly with multiple clients
Submit pull request with documentation updates

Architecture Guidelines

Follow the existing service pattern
Maintain thread-safe operations
Add appropriate error handling
Include health check integration
Update documentation

🐛 Troubleshooting

Common Issues

Server won't start:

# Check Vosk STT model exists
ls -la models/vosk-model/

# Download if missing
./scripts/download-vosk-model.sh

Agent TTS not working:

# Check Kokoro TTS model exists
ls -la models/kokoro_models/

# Download if missing
./scripts/download-kokoro-model.sh

# Or download with specific voices
./scripts/download-kokoro-model.sh -v "af_heart,am_adam"

Agent connection failed:

# Verify server is running
curl http://localhost:8000/health/simple

# Check environment variables
cat .env

Poor performance:

# Check resource usage
docker stats

# Scale up servers
./scripts/scale-deploy.sh scale 10

For comprehensive troubleshooting, see the Setup Guide.

📞 Support

Getting Help

Check Documentation: Review relevant guides in /docs
System Requirements: Ensure Docker & Docker Compose are installed
Health Checks: Verify system status via health endpoints
Log Analysis: Examine service logs for specific errors
Resource Check: Ensure adequate CPU, memory, and disk space

Debug Information

# System status
docker-compose ps
docker stats

# Service health
curl http://localhost:8000/health

# Recent logs
docker-compose logs --tail=100 server

Resources

Setup Guide - Installation and configuration
Server Architecture - System design
Operations Guide - Monitoring and troubleshooting

📄 License

This project is part of the Mezon platform ecosystem. See the project documentation for licensing information.

🏷️ Tags

speech-to-text real-time vosk fastapi livekit docker microservices websocket audio-processing scalable

    …/degraded/unhealthy)
    │   ├── Uptime information
    │   ├── Component details
    │   └── HTTP status codes (200/503)
    └── /health/simple: Simple boolean check


## Outstanding Technical Features

### 1. **Scalability**
- Multi-worker architecture for STT processing
- Async/await pattern for I/O operations
- Queue-based load balancing
- Adaptive processing based on system load
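
To make the multi-worker and queue-based ideas above concrete, here is a minimal `asyncio` sketch in which several workers pull audio chunks from one shared queue. It is not the project's implementation: `stt_worker` and `run_pool` are hypothetical names, and the recognition step is replaced by a placeholder that records the chunk size.

```python
import asyncio


async def stt_worker(name, jobs, results):
    """Pull audio chunks off the shared queue until a None sentinel arrives."""
    while True:
        chunk = await jobs.get()
        if chunk is None:
            jobs.task_done()
            break
        # Placeholder for recognition. A real worker would call the STT engine
        # here, ideally via asyncio.to_thread() because decoding is CPU-bound.
        await results.put((name, len(chunk)))
        jobs.task_done()


async def run_pool(chunks, num_workers=3):
    jobs, results = asyncio.Queue(), asyncio.Queue()
    workers = [asyncio.create_task(stt_worker(f"w{i}", jobs, results))
               for i in range(num_workers)]
    for chunk in chunks:
        await jobs.put(chunk)
    for _ in workers:
        await jobs.put(None)  # one sentinel per worker
    await asyncio.gather(*workers)
    return [results.get_nowait() for _ in range(results.qsize())]


processed = asyncio.run(run_pool([b"\x00" * 320] * 6))
print(len(processed), "chunks processed")
```

Because every worker reads from the same queue, whichever worker is idle takes the next chunk, which is the simplest form of queue-based load balancing.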

### 2. **Reliability**
- Circuit breaker pattern for error handling
- Graceful degradation (VAD fallback)
- Resource cleanup and memory management
- Health monitoring and metrics
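
The circuit breaker pattern named above can be sketched in a few lines. The class below is a generic illustration, not the project's code; the threshold, the timeout, and every name in it are assumptions.

```python
import time


class CircuitBreaker:
    """Minimal breaker: closed -> open -> half-open trial -> closed."""

    def __init__(self, failure_threshold=3, reset_timeout=5.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow(self):
        """Return True if a call may proceed."""
        if self.opened_at is None:
            return True
        # Once the timeout has elapsed, let a trial call through (half-open).
        return self.clock() - self.opened_at >= self.reset_timeout

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()  # open, or re-open after a failed trial


def call_with_breaker(breaker, func, fallback):
    """Run func() unless the breaker is open; use fallback() on refusal or error."""
    if not breaker.allow():
        return fallback()
    try:
        result = func()
    except Exception:
        breaker.record_failure()
        return fallback()
    breaker.record_success()
    return result
```

The `fallback` callable is where graceful degradation would plug in, for example returning an empty transcript instead of blocking on a failing worker.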

### 3. **Performance Optimization**
- VAD pre-filtering to reduce STT workload
- Chunk accumulation strategy
- Overlapping audio processing
- GPU acceleration support (VAD)
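
One plausible reading of "chunk accumulation" with "overlapping audio processing" is a buffer that emits fixed-size windows and keeps the tail of each window as the head of the next. The sketch below shows that idea only; the window and overlap sizes, and the `ChunkAccumulator` name, are assumptions rather than values taken from the project.

```python
class ChunkAccumulator:
    """Buffer incoming PCM bytes and emit fixed-size windows that overlap."""

    def __init__(self, window_bytes, overlap_bytes):
        if not 0 <= overlap_bytes < window_bytes:
            raise ValueError("overlap must be smaller than the window")
        self.window_bytes = window_bytes
        self.step = window_bytes - overlap_bytes
        self.buffer = bytearray()

    def push(self, chunk):
        """Add a chunk and return every full window that is now available."""
        self.buffer.extend(chunk)
        windows = []
        while len(self.buffer) >= self.window_bytes:
            windows.append(bytes(self.buffer[:self.window_bytes]))
            del self.buffer[:self.step]  # keep the tail as overlap
        return windows


acc = ChunkAccumulator(window_bytes=4, overlap_bytes=2)
print(acc.push(b"ab"))    # not enough data yet
print(acc.push(b"cdef"))  # two windows sharing two bytes
```

Overlap means a word that straddles a window boundary still appears whole in at least one window, at the cost of processing some audio twice.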

### 4. **Real-time Capabilities**
- WebSocket-based communication
- Non-blocking audio submission
- Async result dispatching
- Low-latency processing pipeline
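
Non-blocking submission usually means the producer never waits on a full queue. The sketch below uses a bounded `asyncio.Queue` and drops the oldest chunk when it is full; that drop policy and the `AudioInbox` name are assumptions made for illustration, and the project may handle backpressure differently.

```python
import asyncio


class AudioInbox:
    """Bounded queue whose submit() never waits.

    Assumed policy for this sketch: when full, discard the oldest chunk.
    """

    def __init__(self, maxsize=4):
        self.queue = asyncio.Queue(maxsize=maxsize)
        self.dropped = 0

    def submit(self, chunk):
        try:
            self.queue.put_nowait(chunk)
        except asyncio.QueueFull:
            self.queue.get_nowait()  # discard the oldest chunk
            self.queue.put_nowait(chunk)
            self.dropped += 1


async def demo():
    inbox = AudioInbox(maxsize=2)
    for i in range(5):
        inbox.submit(bytes([i]))
    kept = [inbox.queue.get_nowait() for _ in range(inbox.queue.qsize())]
    return inbox.dropped, kept


dropped, kept = asyncio.run(demo())
print(dropped, kept)
```

For live audio, stale chunks are often worth less than fresh ones, which is why this sketch drops from the front rather than rejecting the new chunk.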

### 5. **Horizontal Scaling with Load Balancer**
- Nginx load balancer for multiple server instances
- Docker Compose scaling capabilities
- Health check integration with load balancer
- Session-independent client routing
- Zero-downtime scaling operations

### 6. **Multi-tenant Support**
- Session-based client isolation
- Per-client language settings
- Flexible subscription model (transcript/translation)
- Resource sharing with isolation
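
The subscription model above can be pictured as a filter applied when a result is dispatched. The following sketch assumes each client record carries the same fields as the WebSocket parameters; `Client` and `recipients` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Client:
    client_id: str
    session_id: str
    language: str = "en"
    transcript: bool = True
    translation: bool = False


def recipients(clients, session_id, kind):
    """Return IDs of clients in one session that subscribed to `kind`."""
    if kind not in ("transcript", "translation"):
        raise ValueError(f"unknown result kind: {kind}")
    return [c.client_id for c in clients
            if c.session_id == session_id and getattr(c, kind)]


clients = [
    Client("alice", "room-1", language="en", translation=True),
    Client("bob", "room-1", language="vi"),
    Client("carol", "room-2", language="en", translation=True),
]
print(recipients(clients, "room-1", "translation"))
```

Filtering on `session_id` first is what keeps one session's results from reaching clients in another.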

## Overall Operational Flow

1. **Client connection** → WebSocket with parameters
2. **Audio streaming** → Continuous audio chunks
3. **VAD filtering** → Silence elimination
4. **STT processing** → Multi-worker Vosk recognition
5. **Result dispatching** → Async delivery to subscribed clients
6. **Session management** → Multi-client coordination
7. **Resource cleanup** → Automatic maintenance

The system is designed to handle real-time speech-to-text for multiple clients simultaneously with low latency and high reliability.
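
Step 3 of the flow (VAD filtering) can be approximated with a simple energy gate. The sketch below is a deliberately simplified stand-in and not the VAD the agent uses: it assumes 16-bit signed PCM in native byte order, and the threshold of 500 is an arbitrary illustrative value.

```python
import array
import math


def rms(pcm16):
    """Root-mean-square amplitude of 16-bit signed PCM (native byte order)."""
    samples = array.array("h")
    samples.frombytes(pcm16)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_speech(pcm16, threshold=500.0):
    """Treat a frame as speech when its energy reaches the threshold."""
    return rms(pcm16) >= threshold


silence = array.array("h", [0] * 160).tobytes()
loud = array.array("h", [2000, -2000] * 80).tobytes()
print(is_speech(silence), is_speech(loud))
```

Frames that fail the gate would simply not be forwarded to the STT workers, which is where the workload reduction comes from.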

## Horizontal Scaling with Load Balancer

### Scaling Architecture
The system supports horizontal scaling with Nginx load balancer:

    Client → Nginx Load Balancer → Multiple Server Instances
                                              ↓
                                   STT Processing Workers
                                              ↓
                                    Shared Result Queue
                                              ↓
                                       Agent Services


### How It Works
1. **Nginx Load Balancer**:

- Distribute traffic to multiple server instances
- Automatic health check for backend servers
- WebSocket proxy with timeout configuration
- Round-robin load balancing (other configurations are possible)

2. **Server Scaling**:

- Multiple FastAPI server instances running in parallel
- Each instance has its own set of STT workers
- Independent session management on each instance
- Agent connects via Nginx instead of direct connection
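
Nginx performs the balancing itself, so no application code is needed for it. Purely to illustrate the strategy described above (round-robin that skips backends failing their health check), here is a small Python model; `RoundRobinPool` and the backend names are invented for the example.

```python
import itertools


class RoundRobinPool:
    """Cycle through backends in order, skipping any marked unhealthy."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._cycle = itertools.cycle(self.backends)

    def mark(self, backend, healthy):
        """Record the outcome of a health check for one backend."""
        if healthy:
            self.healthy.add(backend)
        else:
            self.healthy.discard(backend)

    def next_backend(self):
        # Try each backend at most once per call before giving up.
        for _ in range(len(self.backends)):
            candidate = next(self._cycle)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends")


pool = RoundRobinPool(["server-1", "server-2", "server-3"])
print([pool.next_backend() for _ in range(4)])
pool.mark("server-2", healthy=False)
print([pool.next_backend() for _ in range(3)])
```

Since each server instance manages its own sessions, any healthy instance can accept a new connection, which is what makes plain round-robin workable here.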

### Scaling Deployment

#### 1. Quick Start with Script
**Linux/macOS:**
```bash
# Start with 5 server instances
./scripts/scale-deploy.sh start 5

# Scale to 10 instances
./scripts/scale-deploy.sh scale 10

# Check status
./scripts/scale-deploy.sh status
```

**Windows:**
```powershell
# Start with 5 server instances
.\scripts\scale-deploy.ps1 start 5

# Scale to 10 instances
.\scripts\scale-deploy.ps1 scale 10

# Check status
.\scripts\scale-deploy.ps1 status
```

#### 2. Manual Docker Compose
```bash
# Build images
docker-compose build

# Start with 3 server instances
docker-compose up -d --scale server=3

# Scale to 5 instances
docker-compose up -d --scale server=5

# Check status
docker-compose ps
```

### Health Check and Monitoring

- **Load Balancer Health**: http://localhost:8000/health/simple
- **Nginx Status**: Automatic health checks to backend servers
- **Container Status**: `docker-compose ps`
- **Logs**: `docker-compose logs -f [service_name]`

### Nginx Configuration

The `nginx.conf` file is optimized for:

- WebSocket proxy support
- Health check integration
- Timeout configuration for long-running connections
- Load balancing strategy (adjustable)

### Performance Benefits

- **Increased Throughput**: Multiple servers handle concurrent requests
- **High Availability**: Server failure does not affect the entire system
- **Zero Downtime Scaling**: Add/remove instances without interrupting service
- **Resource Optimization**: Distribute load evenly across multiple instances
