This guide covers production deployment, monitoring, and maintenance of the Distributed Message Broker.
- Deployment Options
- Docker Deployment
- Kubernetes Deployment
- Systemd Deployment
- Configuration
- Monitoring
- Backup & Recovery
- Troubleshooting
| Method | Use Case | Complexity |
|---|---|---|
| Docker Compose | Development, testing, small deployments | Low |
| Kubernetes | Production, cloud-native environments | Medium |
| Systemd | Bare metal, VMs | Medium |

Docker deployment:

```shell
# Start the cluster
docker-compose up -d

# Check status
docker-compose ps

# View logs
docker-compose logs -f broker-1

# Stop the cluster
docker-compose down
```

With monitoring:

```shell
# Start with Prometheus and Grafana
docker-compose --profile monitoring up -d

# Access Grafana at http://localhost:3000 (admin/admin)
```

To scale beyond 3 nodes, modify `docker-compose.yml`. Each new node needs a unique ID and should join the cluster.

Kubernetes deployment prerequisites:

- Kubernetes 1.21+
- kubectl configured
- StorageClass with ReadWriteOnce support
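
The version prerequisite above can be checked before applying any manifests. The following is a minimal sketch, not part of the project: `k8s_version_ok` is a hypothetical helper name, and it only compares the major and minor components of a version string against 1.21.

```shell
# Succeeds when a Kubernetes version string such as "v1.27.3" or "1.21"
# is at least 1.21. The body runs in a subshell so its variables stay local.
k8s_version_ok() (
    v=${1#v}                      # drop an optional leading "v"
    major=${v%%.*}                # text before the first dot
    rest=${v#*.}                  # text after the first dot
    minor=${rest%%[!0-9]*}        # leading digits of the minor component
    [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 21 ]; }
)
```

Assuming `jq` is installed, the server version could be fed in with `k8s_version_ok "$(kubectl version -o json | jq -r .serverVersion.gitVersion)"`.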

Deploy the manifests:

```shell
# Apply all manifests
kubectl apply -f deploy/kubernetes/broker.yaml

# Check pod status
kubectl get pods -n broker

# Wait for all pods to be ready
kubectl wait --for=condition=ready pod -l app.kubernetes.io/name=distributed-broker -n broker --timeout=300s
```

Verify the deployment:

```shell
# Check logs of first pod
kubectl logs broker-0 -n broker

# Port forward to access the API
kubectl port-forward svc/broker 8080:8080 -n broker
```

Scale the cluster:

```shell
# Scale to 5 replicas
kubectl scale statefulset broker --replicas=5 -n broker
```

Systemd deployment:

```shell
# 1. Build or download the binary
#    (build as a regular user, then install with root privileges,
#    since /usr/local/bin is usually not writable without sudo)
go build -o broker .
sudo install -m 0755 broker /usr/local/bin/broker

# 2. Create user and directories
sudo useradd -r -s /bin/false broker
sudo mkdir -p /var/lib/broker/data /etc/broker
sudo chown -R broker:broker /var/lib/broker

# 3. Copy configuration
sudo cp deploy/systemd/broker.env.example /etc/broker/broker.env
sudo vim /etc/broker/broker.env  # Edit as needed

# 4. Install service
sudo cp deploy/systemd/broker.service /etc/systemd/system/
sudo systemctl daemon-reload

# 5. Start and enable
sudo systemctl enable broker
sudo systemctl start broker
```

On each node, edit `/etc/broker/broker.env`:

Node 1 (Bootstrap):

```
BROKER_NODE_ID=broker-1
BROKER_BOOTSTRAP=true
BROKER_ADVERTISE_ADDR=192.168.1.10
```

Node 2:

```
BROKER_NODE_ID=broker-2
BROKER_BOOTSTRAP=false
BROKER_JOIN_ADDR=192.168.1.10:8080
BROKER_ADVERTISE_ADDR=192.168.1.11
```

Node 3:

```
BROKER_NODE_ID=broker-3
BROKER_BOOTSTRAP=false
BROKER_JOIN_ADDR=192.168.1.10:8080
BROKER_ADVERTISE_ADDR=192.168.1.12
```

Copy and modify `config.yaml.example`:

```yaml
node:
  id: "broker-1"
  data_dir: "/var/lib/broker/data"

network:
  grpc_port: 8080
  raft_port: 9080
  data_port: 9180
  metrics_port: 9090

cluster:
  bootstrap: true
  controller: true
```

All settings can be overridden via environment variables:

| Variable | Description | Default |
|---|---|---|
| `BROKER_NODE_ID` | Unique node identifier | Required |
| `BROKER_GRPC_PORT` | Client API port | 8080 |
| `BROKER_RAFT_PORT` | Controller communication | 9080 |
| `BROKER_DATA_PORT` | Inter-broker replication | 9180 |
| `BROKER_METRICS_PORT` | Prometheus metrics | 9090 |
| `BROKER_DATA_DIR` | Data directory | ./data |
| `BROKER_BOOTSTRAP` | Bootstrap new cluster | false |
| `BROKER_JOIN_ADDR` | Node to join | "" |
| `BROKER_ADVERTISE_ADDR` | External address | localhost |
| `BROKER_LOG_LEVEL` | Logging level | info |
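
To see which values a node would end up with, the defaults from the table can be applied in a pre-flight script. This is a sketch under stated assumptions, not the broker's own resolution logic: `broker_env_defaults` is a hypothetical helper that fills in any unset variable with its documented default and reports a missing `BROKER_NODE_ID`, the only variable marked as required.

```shell
# Fill in the documented default for every variable that is unset or empty.
# Values that are already set (for example from broker.env) are left alone.
broker_env_defaults() {
    : "${BROKER_GRPC_PORT:=8080}"
    : "${BROKER_RAFT_PORT:=9080}"
    : "${BROKER_DATA_PORT:=9180}"
    : "${BROKER_METRICS_PORT:=9090}"
    : "${BROKER_DATA_DIR:=./data}"
    : "${BROKER_BOOTSTRAP:=false}"
    : "${BROKER_JOIN_ADDR:=}"
    : "${BROKER_ADVERTISE_ADDR:=localhost}"
    : "${BROKER_LOG_LEVEL:=info}"

    # BROKER_NODE_ID has no default, so a missing value is an error.
    if [ -z "${BROKER_NODE_ID:-}" ]; then
        echo "BROKER_NODE_ID is required" >&2
        return 1
    fi
}
```

One way to use it is to source `/etc/broker/broker.env`, call `broker_env_defaults`, and then print the variables of interest before starting the service.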
The broker exposes metrics at http://<host>:9090/metrics:

| Metric | Type | Description |
|---|---|---|
| `broker_produce_total` | Counter | Total messages produced |
| `broker_consume_total` | Counter | Total messages consumed |
| `broker_produce_latency_seconds` | Histogram | Produce latency |
| `broker_consume_latency_seconds` | Histogram | Consume latency |
| `broker_topic_count` | Gauge | Number of topics |
| `broker_partition_count` | Gauge | Number of partitions |

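
For a quick check without Prometheus, a single metric can be read straight from the endpoint's text output. The sketch below assumes the endpoint serves the standard Prometheus text exposition format; `metric_value` is a hypothetical helper, and whether the broker attaches labels to its metrics is not stated here, so the helper sums all samples that share the metric name.

```shell
# Read Prometheus text-format metrics on stdin and print the sum of all
# samples whose metric name equals $1. Exits non-zero if none are found.
metric_value() {
    awk -v name="$1" '
        /^#/ { next }                      # skip HELP and TYPE comment lines
        {
            line = $0
            sub(/[{].*[}]/, "", line)      # drop the {label="..."} part, if any
            split(line, f, " ")            # f[1] = metric name, f[2] = value
            if (f[1] == name) { sum += f[2]; found = 1 }
        }
        END { if (found) print sum; else exit 1 }
    '
}
```

For example, `curl -s http://localhost:9090/metrics | metric_value broker_topic_count` would print the current topic count. Histogram metrics are exposed under suffixed names, so their sample count would be requested as `broker_produce_latency_seconds_count`.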
Import `deploy/grafana/dashboards/broker-dashboard.json` into Grafana, or use the provided docker-compose setup with the `monitoring` profile.

Health check:

```shell
# HTTP health endpoint
curl http://localhost:9090/health

# Expected response: {"status": "ok"}
```

Data directory layout:

```
/var/lib/broker/data/
├── raft/        # Raft consensus logs and snapshots
├── topics/      # Topic data and segments
└── partitions/  # Partition data
```

Create a backup:

```shell
# 1. Stop the broker (optional but recommended)
sudo systemctl stop broker

# 2. Create backup
sudo tar -czvf broker-backup-$(date +%Y%m%d).tar.gz /var/lib/broker/data

# 3. Restart broker
sudo systemctl start broker
```

Restore from a backup:

```shell
# 1. Stop the broker
sudo systemctl stop broker

# 2. Clear existing data
sudo rm -rf /var/lib/broker/data/*

# 3. Restore from backup
sudo tar -xzvf broker-backup-YYYYMMDD.tar.gz -C /

# 4. Start broker
sudo systemctl start broker
```

Cluster won't form:

```shell
# Check connectivity between nodes
nc -zv <peer-ip> 9080

# Check Raft logs
journalctl -u broker | grep -i raft
```

Leader election stuck:

```shell
# Ensure quorum (majority of nodes running)
# For 3-node cluster, need at least 2 nodes

# Check leader status
curl http://localhost:9090/metrics | grep leader
```

High memory usage:

```shell
# Check segment count
ls -la /var/lib/broker/data/topics/*/

# Consider adjusting retention policy
```

Slow produce/consume:

```shell
# Check disk I/O
iostat -x 1

# Check network latency between nodes
ping <peer-ip>
```

Set `BROKER_LOG_LEVEL=debug` for detailed logging:

```shell
# Systemd
sudo sed -i 's/BROKER_LOG_LEVEL=info/BROKER_LOG_LEVEL=debug/' /etc/broker/broker.env
sudo systemctl restart broker
```

Getting help:

- Check logs: `journalctl -u broker -f`
- Review metrics at `/metrics`
- Open an issue with logs and metrics attached
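
The quorum rule noted under "Leader election stuck" (a 3-node cluster needs at least 2 running nodes) generalizes to a strict majority of the cluster. As a small sketch with hypothetical helper names, the arithmetic can be written as:

```shell
# Smallest number of running nodes that forms a majority of the cluster.
quorum_size() {   # quorum_size <cluster_size>
    echo $(( $1 / 2 + 1 ))
}

# Succeeds when the number of running nodes reaches quorum.
has_quorum() {    # has_quorum <cluster_size> <running_nodes>
    [ "$2" -ge "$(quorum_size "$1")" ]
}
```

For instance, `quorum_size 5` prints `3`, so a 5-node cluster tolerates two failed nodes, while `has_quorum 5 2` fails because two running nodes are not a majority.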