The system supports two deployment patterns with automatic protocol detection for seamless integration with different environments.
Single process containing all services on one port (default: 6280). This mode combines:
- MCP server accessible via `/mcp` and `/sse` endpoints
- Web interface for job management
- Embedded worker for document processing
- API (tRPC over HTTP) for programmatic access
This mode is best suited for:

- Development environments
- Single-container deployments
- Simple production setups
- Local documentation indexing
Services can be selectively enabled via `AppServerConfig`:
- `enableMcpServer`: MCP protocol endpoint
- `enableWebInterface`: Web UI and management API
- `enableWorker`: Embedded job processing
- `enableApiServer`: HTTP API for pipeline and data operations (served at `/api`)
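The toggles above can be pictured as a plain config object. A minimal sketch, assuming only the four field names listed here (the real `AppServerConfig` type in the codebase may carry additional options):

```typescript
// Hypothetical sketch of the service toggles described above.
interface AppServerConfig {
  enableMcpServer: boolean;    // MCP protocol endpoint
  enableWebInterface: boolean; // Web UI and management API
  enableWorker: boolean;       // Embedded job processing
  enableApiServer: boolean;    // HTTP API served at /api
}

// Unified mode: all services on in one process.
const unifiedConfig: AppServerConfig = {
  enableMcpServer: true,
  enableWebInterface: true,
  enableWorker: true,
  enableApiServer: true,
};

// A worker-only process would flip the interface toggles off.
const workerOnlyConfig: AppServerConfig = {
  enableMcpServer: false,
  enableWebInterface: false,
  enableWorker: true,
  enableApiServer: true, // workers expose the API the coordinator calls
};
```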
Separate coordinator and worker processes for scaling. The coordinator handles interfaces while workers process jobs.
- Coordinator: Runs MCP server, web interface, and API
- Workers: Execute document processing jobs
- Communication: Coordinator uses the API (tRPC over HTTP) to talk to workers
This mode is best suited for:

- High-volume processing
- Container orchestration (Kubernetes, Docker Swarm)
- Horizontal scaling requirements
- Resource isolation
Workers may expose a simple `/health` endpoint or rely on a container-level healthcheck for monitoring. Coordinators communicate with workers via Pipeline RPC.
The system automatically selects communication protocol based on execution environment:
```typescript
if (!process.stdin.isTTY && !process.stdout.isTTY) {
  return "stdio"; // AI tools, CI/CD
} else {
  return "http"; // Interactive terminals
}
```
Stdio protocol:

- Direct MCP communication via stdin/stdout
- Used by VS Code, Claude Desktop, and other AI tools
- No HTTP server required
- Minimal resource usage

HTTP protocol:

- Server-Sent Events transport for MCP
- Full web interface available
- API accessible at `/api`
- Suitable for browser access
Protocol can be explicitly set via --protocol stdio|http flag, bypassing auto-detection.
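Putting the flag override and the TTY check together, the resolution might look like the following sketch (function names are illustrative, not the project's actual API; only the TTY condition comes from the snippet above):

```typescript
type Protocol = "stdio" | "http";

// Illustrative: auto-detect based on whether we are attached to a terminal.
function detectProtocol(stdinIsTTY: boolean, stdoutIsTTY: boolean): Protocol {
  if (!stdinIsTTY && !stdoutIsTTY) {
    return "stdio"; // AI tools, CI/CD
  }
  return "http"; // Interactive terminals
}

// An explicit --protocol flag bypasses auto-detection entirely.
function resolveProtocol(
  flag: Protocol | undefined,
  stdinIsTTY: boolean,
  stdoutIsTTY: boolean,
): Protocol {
  return flag ?? detectProtocol(stdinIsTTY, stdoutIsTTY);
}
```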
Deployment mode, ports, and embedding settings are resolved through the shared configuration loader (defaults → docs-mcp.config.yaml or DOCS_MCP_CONFIG → legacy envs → generic env DOCS_MCP_<KEY> → CLI flags for the current run). Override with YAML or env keys such as DOCS_MCP_PROTOCOL, DOCS_MCP_PORT, and DOCS_MCP_EMBEDDING_MODEL; use CLI flags like --protocol, --port, --server-url, or --resume when you need per-invocation changes.
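The precedence chain can be sketched as successive object merges, lowest priority first. A hedged illustration (the merge helper and sample values are hypothetical; only the layer order and key names like `DOCS_MCP_PROTOCOL` come from the text):

```typescript
type ConfigLayer = Record<string, string | number | undefined>;

// Merge layers left to right; later layers win, undefined values don't override.
function resolveConfig(...layers: ConfigLayer[]): ConfigLayer {
  const result: ConfigLayer = {};
  for (const layer of layers) {
    for (const [key, value] of Object.entries(layer)) {
      if (value !== undefined) result[key] = value;
    }
  }
  return result;
}

// Precedence from the text: defaults → docs-mcp.config.yaml → legacy envs →
// generic DOCS_MCP_<KEY> envs → CLI flags for the current run.
const config = resolveConfig(
  { protocol: "http", port: 6280 }, // built-in defaults
  { port: 7000 },                   // docs-mcp.config.yaml
  {},                               // legacy env vars
  { protocol: "stdio" },            // DOCS_MCP_PROTOCOL
  { port: 8080 },                   // --port CLI flag
);
```

With these sample layers, the CLI flag wins for `port` and the generic env var wins for `protocol`.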
Job recovery behavior depends on deployment mode:

Unified mode:

- Embedded worker recovers pending jobs from the database
- Enabled by default for persistent job processing
- Prevents job loss during server restarts

Distributed mode:

- Workers handle their own job recovery
- Coordinators do not recover jobs, to avoid conflicts
- Each worker maintains independent job state

CLI commands:

- No job recovery, to prevent conflicts
- Immediate execution model
- Safe for concurrent CLI usage
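The per-mode rules above reduce to a single predicate: only processes that own job execution recover pending jobs. A minimal sketch (names are illustrative, not the project's actual API):

```typescript
type DeploymentMode = "unified" | "coordinator" | "worker" | "cli";

// Recovery runs only where a worker owns job execution: the embedded
// worker in unified mode, and standalone workers in distributed mode.
// Coordinators and one-shot CLI runs skip recovery to avoid conflicts.
function shouldRecoverJobs(mode: DeploymentMode): boolean {
  return mode === "unified" || mode === "worker";
}
```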
```dockerfile
FROM ghcr.io/arabold/docs-mcp-server:latest
EXPOSE 6280
CMD ["--protocol", "http", "--port", "6280"]
```

```yaml
services:
  coordinator:
    image: ghcr.io/arabold/docs-mcp-server:latest
    ports: ["6280:6280"]
    command: ["mcp", "--server-url", "http://worker:8080/api"]
  worker:
    image: ghcr.io/arabold/docs-mcp-server:latest
    ports: ["8080:8080"]
    command: ["worker", "--port", "8080"]
```

Use a load balancer (or DNS) in front of multiple worker instances. The coordinator is configured with a single `--server-url` that points to the balancer.
Expose a lightweight /health endpoint or container healthcheck for load balancers and monitoring.
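A health endpoint can be as simple as a static 200 response. A minimal Node sketch (not the project's actual implementation; the routing is factored into a pure function so it is easy to test):

```typescript
import { createServer } from "node:http";

// Pure routing logic: load balancers only need a fast 200 vs 404.
function healthResponse(url: string | undefined): { status: number; body: string } {
  if (url === "/health") {
    return { status: 200, body: JSON.stringify({ status: "ok" }) };
  }
  return { status: 404, body: "" };
}

// Wire it into a plain Node HTTP server.
const server = createServer((req, res) => {
  const { status, body } = healthResponse(req.url);
  res.writeHead(status, { "Content-Type": "application/json" });
  res.end(body);
});
// server.listen(8080); // a worker would listen on its configured port
```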
- Horizontal: Add more worker containers
- Vertical: Increase worker resource allocation
- Hybrid: Combine both strategies based on workload