A comprehensive Node.js-based monitoring system for the XDC Network. This application provides real-time monitoring of blockchain infrastructure with a focus on:
- RPC endpoint monitoring (availability and performance)
- Multi-RPC endpoint health checks
- Port monitoring
- Alert notifications
- InfluxDB metrics storage and Grafana visualization

**RPC URL Monitoring**

- Mainnet and Testnet endpoint monitoring
- Downtime detection
- Latency measurement
- Curl API for external testing

**Multi-RPC Monitoring**

- Monitor multiple RPC endpoints simultaneously
- Compare response times and availability across nodes
- Load balancing checks
- Block height comparison between nodes (a manual version of this check is sketched after this feature list)

**RPC Port Monitoring**

- HTTP/HTTPS port checks
- WebSocket port checks
- Automated connectivity testing

**Block Propagation Monitoring**

- Block time tracking
- Slow block detection (configurable threshold)
- Cross-node block height discrepancy detection

**Alert System**

- Customizable dashboard alerts
- Telegram notifications (via secure NestJS backend API)
- Webhook notifications (for other chat services)
- Detailed error reporting

**Metrics Collection**

- InfluxDB time-series database
- Real-time metrics for RPC performance
- Block and transaction statistics
- Grafana dashboards
- Multi-RPC comparative metrics
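
The cross-node block height comparison mentioned above can be reproduced by hand against any two endpoints if you want to see what the monitor is looking for. The loop below is not part of the application; it simply asks each endpoint for `eth_blockNumber` (XDC is EVM-compatible) and prints the heights in decimal so a discrepancy is easy to spot. It assumes `curl` and `jq` are available.

```bash
# Manually compare block heights reported by two public XDC RPC endpoints
for url in https://rpc.xinfin.network https://erpc.xinfin.network; do
  hex=$(curl -s -X POST "$url" \
    -H 'Content-Type: application/json' \
    -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' | jq -r '.result')
  # eth_blockNumber returns a hex string such as 0x4c4b40; print it as a decimal height
  echo "$url -> $((hex))"
done
```
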
To run the monitor you will need:

- Node.js 16.x or higher
- npm or yarn package manager
- Access to XDC Network RPC endpoints
- Docker and Docker Compose (for full-stack deployment)
- Clone the repository:

  ```bash
  git clone https://github.com/yourusername/xdc-monitor.git
  cd xdc-monitor
  ```

- Install dependencies:

  ```bash
  npm install
  ```

- Configure the application by creating a `.env` file (see the Configuration section below)

- Build the application:

  ```bash
  npm run build
  ```
The project uses environment variables for configuration. Create a `.env` file in the project root with the following variables:

```env
# Primary RPC endpoint
RPC_URL=https://rpc.xinfin.network # Mainnet
# RPC_URL=http://157.173.195.189:8555 # Testnet
CHAIN_ID=50
SCAN_INTERVAL=15

# WebSocket URL (if available)
WS_URL=wss://ws.xinfin.network

# Monitoring configuration
ENABLE_RPC_MONITORING=true
ENABLE_PORT_MONITORING=true
ENABLE_BLOCK_MONITORING=true
BLOCK_TIME_THRESHOLD=3.0

# Alert configuration
ENABLE_DASHBOARD_ALERTS=true
ENABLE_CHAT_NOTIFICATIONS=true
NOTIFICATION_WEBHOOK_URL=

# Telegram notification configuration
TELEGRAM_BOT_TOKEN="your-telegram-bot-token-here"
TELEGRAM_CHAT_ID="your-telegram-chat-id-here"

# Metrics configuration
METRICS_PORT=9090

# Logging configuration
LOG_LEVEL=debug

# Multi-RPC monitoring
ENABLE_MULTI_RPC=true

# InfluxDB Configuration
INFLUXDB_URL=http://localhost:8086
INFLUXDB_TOKEN=your-influxdb-token
INFLUXDB_ORG=xdc
INFLUXDB_BUCKET=xdc_metrics
INFLUXDB_ADMIN_USER=admin
INFLUXDB_ADMIN_PASSWORD=secure-password

# Grafana Admin Credentials
GRAFANA_ADMIN_USER=admin
GRAFANA_ADMIN_PASSWORD=secure-password
```
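
Once the stack is up, you can confirm that InfluxDB is reachable and that the token and org you configured are accepted. This is only a manual sanity check (it assumes the InfluxDB 2.x HTTP API on the URL above and that `jq` is installed; the monitor itself does not require it):

```bash
# /ping should return HTTP 204 when InfluxDB is healthy
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8086/ping

# List the buckets visible to your token; you should see xdc_metrics
curl -s -H "Authorization: Token your-influxdb-token" \
  "http://localhost:8086/api/v2/buckets?org=xdc" | jq -r '.buckets[].name'
```
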
To run the application locally:

```bash
npm run start:prod
```

To run the full stack with Docker Compose:

```bash
docker-compose up -d
```
This will start all services:
- XDC Monitor (API and monitoring)
- InfluxDB (metrics storage)
- Grafana (visualization)
You can also start individual services:

```bash
# Run only InfluxDB
docker-compose up -d influxdb

# Run only Grafana
docker-compose up -d grafana
```
The monitor's API exposes the following endpoints:

- Block Status: `/api/monitoring/block-status` - Current block monitoring information
- Block Comparison: `/api/monitoring/block-comparison` - Comparison of block heights across RPCs
- RPC Status: `/api/monitoring/rpc-status` - Status of all RPC endpoints
- WebSocket Status: `/api/monitoring/websocket-status` - Status of WebSocket connections
- Overall Status: `/api/monitoring/status` - Combined status of all monitoring systems
- Metrics: `/metrics` - Prometheus-compatible metrics endpoint
- Notifications Test: `/api/notifications/test` - Test the notification system
- Telegram Webhook: `/api/notifications/telegram` - Endpoint for Grafana to send alerts
Testing endpoints:

- Trigger Manual Alert: `/api/testing/trigger-manual-alert?type=error&title=Title&message=Message` - Directly trigger an alert
- Simulate Slow Block Time: `/api/testing/simulate-slow-blocktime?seconds=4` - Simulate a slow block time
- Simulate RPC Down: `/api/testing/simulate-rpc-down?endpoint=URL` - Simulate an RPC endpoint being down
- Simulate RPC Latency: `/api/testing/simulate-rpc-latency?endpoint=URL&latency=500` - Simulate high RPC latency
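
These are plain HTTP routes on the monitor's API (served at http://localhost:3000 by default; see the service access list further down), so they can be checked quickly with `curl`. A couple of examples, assuming the status routes accept GET:

```bash
# Combined status of all monitoring systems
curl -s http://localhost:3000/api/monitoring/status

# Per-endpoint RPC status and cross-node block height comparison
curl -s http://localhost:3000/api/monitoring/rpc-status
curl -s http://localhost:3000/api/monitoring/block-comparison
```
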
The application stores the following metrics in InfluxDB:

- `block_height` - Current block height, tagged with `chainId` and `endpoint`
- `transaction_count` - Transaction counts by status, tagged with `status` and `chainId`
- `rpc_latency` - Response time of RPC endpoints in ms, tagged with `endpoint` and `chainId`
- `rpc_status` - Status of RPC endpoints (1 = up, 0 = down), tagged with `endpoint` and `chainId`
- `websocket_status` - Status of WebSocket endpoints (1 = up, 0 = down), tagged with `endpoint` and `chainId`
- `explorer_status` - Status of explorer endpoints (1 = up, 0 = down), tagged with `endpoint` and `chainId`
- `faucet_status` - Status of faucet endpoints (1 = up, 0 = down), tagged with `endpoint` and `chainId`
- `block_time` - Time between blocks in seconds, tagged with `chainId`
- `alert_count` - Count of alerts by type and component, tagged with `type`, `component`, and `chainId`
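
If you want to inspect raw data outside Grafana, these measurements can be queried straight from InfluxDB with Flux. A minimal example using the InfluxDB 2.x `influx` CLI, assuming it is installed and the org, bucket, and token match your `.env`:

```bash
# Average RPC latency per endpoint over the last hour
# (export INFLUXDB_TOKEN first, or paste the literal token)
influx query --host http://localhost:8086 --org xdc --token "$INFLUXDB_TOKEN" '
from(bucket: "xdc_metrics")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "rpc_latency")
  |> group(columns: ["endpoint"])
  |> mean()
'
```
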
The system maintains custom alert tracking metrics:
- `alert_count` - Incremented whenever an alert is processed
- Tags: `type` (error/warning/info) and `component` (blockchain/rpc/etc.)
- Provides historical data on alert frequency
The dashboards display alerts in various panels:
- "Active Alerts" panel shows currently firing alerts
- Status panels show the current state of various services
- Block time and latency panels include threshold indicators for alerting conditions
Project structure:

- `src/blockchain/` - Blockchain interaction layer
- `src/config/` - Configuration management
- `src/models/` - Data structures and interfaces
- `src/monitoring/` - Monitoring services
- `src/metrics/` - Prometheus metrics collection
For Grafana integration, a data source should be configured pointing to the Prometheus server.
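
Data sources are normally provisioned automatically from the version-controlled configuration (see below). If you ever need to add one by hand, Grafana's HTTP API can be used instead of the UI. This is only an illustrative sketch: it assumes Grafana is listening on http://localhost:3100, that the admin credentials from `.env` are exported in your shell, and that the Prometheus container is reachable from Grafana at `http://prometheus:9090` (adjust to your compose service name).

```bash
curl -s -u "$GRAFANA_ADMIN_USER:$GRAFANA_ADMIN_PASSWORD" \
  -H 'Content-Type: application/json' \
  -X POST http://localhost:3100/api/datasources \
  -d '{
        "name": "Prometheus",
        "type": "prometheus",
        "url": "http://prometheus:9090",
        "access": "proxy",
        "isDefault": true
      }'
```
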
This project uses a special approach to manage Grafana configurations:
- The actual Grafana data is stored in `grafana_data/` (ignored by Git)
- Version-controlled configurations are stored in `grafana_config/`
- Two helper commands synchronize between these directories:

```bash
# Export your current Grafana configurations to the version-controlled directory
./run.sh grafana-export

# Import the version-controlled configurations to your local Grafana
./run.sh grafana-import
```
This approach allows:
- Servers to maintain their own customized dashboards without Git conflicts
- Selectively updating dashboards only when desired
- Committing only intentional configuration changes to Git
To save dashboard changes to Git:

- Make changes to your dashboards in the Grafana UI
- Export the changes: `./run.sh grafana-export`
- Commit the changes: `git add grafana_config/ && git commit -m "Update dashboards"`

To pick up dashboard changes from Git:

- Pull the latest code: `git pull`
- Import the changes (optional): `./run.sh grafana-import`
- Restart Grafana (if running): `./run.sh restart grafana`
The Grafana dashboards are automatically provisioned when you start the containers. The dashboards and datasources are configured in the `grafana_config/` directory.
- Log in to Grafana at http://localhost:3100 (default credentials from your `.env` file)
- You should see the XDC Network dashboards already available
- Explore the "XDC Network Unified Dashboard" and "XDC Apothem Testnet Monitoring" dashboards
Grafana alerts are configured to use the NestJS backend API for sending notifications:
- The alerts are defined in `grafana_data/provisioning/alerting/rules.yaml`
- Notifications are sent via webhook to the NestJS backend API endpoint
- The NestJS backend handles sending notifications to Telegram
- This approach keeps Telegram credentials securely in the NestJS backend only
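
To exercise this path end to end without waiting for a real alert, you can POST a small payload to the webhook endpoint yourself. The exact fields the NestJS controller expects are defined in the backend, so treat the body below as an assumption loosely modeled on Grafana's webhook format rather than the definitive contract:

```bash
# Hypothetical payload; adjust the fields to whatever the notifications controller actually reads
curl -s -X POST http://localhost:3000/api/notifications/telegram \
  -H 'Content-Type: application/json' \
  -d '{
        "title": "[FIRING] Manual webhook test",
        "message": "Testing the Grafana -> NestJS -> Telegram path",
        "status": "firing",
        "alerts": [
          { "status": "firing", "labels": { "severity": "warning", "component": "rpc" } }
        ]
      }'
```
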
The system comes with several pre-configured alert rules:
- **Slow Block Time Alert** - Triggers when block time exceeds the threshold (default: 1s for testing)
  - Source: Block monitoring service
  - Severity: Warning
  - Component: blockchain
- **RPC Endpoint Down Alert** - Triggers when an RPC endpoint is detected as down
  - Source: RPC monitoring service
  - Severity: Critical
  - Component: rpc
- **High RPC Latency Alert** - Triggers when RPC response time exceeds the threshold (default: 300ms for testing)
  - Source: RPC monitoring service
  - Severity: Warning
  - Component: rpc
- **Block Height Discrepancy Alert** - Triggers when different RPC endpoints report varying block heights
  - Source: Block monitoring service
  - Severity: Warning
  - Component: blockchain
The project includes several ways to test the notification system:
```bash
# Test sending a notification via the API
./test-telegram-notification.sh

# Run comprehensive system tests
./test-notification-system.sh
```
Send a GET request to test the notification system:
```bash
curl -X GET 'http://localhost:3000/api/notifications/test?title=Test&message=This%20is%20a%20test&severity=info'
```
Parameters:
- `title`: The title of the test notification
- `message`: The content of the notification
- `severity`: One of `info`, `warning`, or `critical`/`error`
The system includes a Testing Controller with endpoints to simulate various alert conditions:
- Test Slow Block Time Alert:

  ```bash
  curl "http://localhost:3000/api/testing/simulate-slow-blocktime?seconds=4"
  ```

- Test RPC Endpoint Down Alert:

  ```bash
  curl -X POST "http://localhost:3000/api/testing/simulate-rpc-down?endpoint=https://rpc.xinfin.network"
  ```

- Test High RPC Latency:

  ```bash
  curl -X POST "http://localhost:3000/api/testing/simulate-rpc-latency?endpoint=https://erpc.xinfin.network&latency=600"
  ```

- Trigger a Manual Alert:

  ```bash
  curl "http://localhost:3000/api/testing/trigger-manual-alert?type=error&title=Critical%20Test&message=Urgent%20test%20message"
  ```
- Use annotations to trigger the manual-test-alert rule:
  - Open your Grafana dashboard
  - Add an annotation with the name `manual_test_alert` and value `1`
  - This should trigger the alert rule, which will send a notification
- Check the Grafana Alerting UI:
  - Navigate to Alerting > Alert rules to see the status of all alerts
For convenience, this project includes a helper script to manage various deployment scenarios:
```bash
# Make the script executable (first time only)
chmod +x run.sh

# Show available commands
./run.sh help

# Start the complete stack (app + Prometheus + Grafana)
./run.sh up

# View logs
./run.sh logs

# Clear Prometheus data
./run.sh clear-prometheus

# Rebuild containers (after code changes)
./run.sh rebuild

# Clean up all containers, volumes and networks (fixes Docker issues)
./run.sh clean
```
You can also use Docker Compose commands directly:
- Build and start all services:

  ```bash
  docker-compose up -d
  ```

- Stop all services:

  ```bash
  docker-compose down
  ```

- View logs:

  ```bash
  docker-compose logs -f
  ```

- Rebuild containers (after code changes):

  ```bash
  docker-compose build
  docker-compose up -d
  ```
Prometheus and Grafana data are stored in local directories for persistence and easy access:
- Grafana Data: `./grafana_data/`
The running services are available at:

- XDC Monitor API: http://localhost:3000
- InfluxDB Interface: http://localhost:8086 (credentials from the `.env` file)
- Grafana: http://localhost:3100 (credentials from the `.env` file)
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
This project follows secure practices for managing sensitive credentials:
- Environment Variables: All sensitive credentials (like Telegram bot tokens) are stored in the `.env` file, which is excluded from Git.
- Centralized Credential Storage: Telegram credentials are only stored in the NestJS backend's environment, not in Grafana configuration.
- Webhook-based Notification: Grafana uses a webhook to send alerts to the NestJS API, which then securely uses the credentials to send notifications.
The project is set up so you can:
- Share your Grafana dashboards via Git
- Keep your sensitive credentials private in your `.env` file
Safe to commit and share via Git:

- `grafana_data/provisioning/dashboards/*.yaml` - Dashboard configurations
- `grafana_data/provisioning/datasources/*.yaml` - Data source configurations
- `grafana_data/provisioning/plugins/*.yaml` - Plugin configurations
- `grafana_data/provisioning/alerting/*.yaml` - Alert configuration (webhook URLs only, no credentials)

Kept private (never committed):

- `.env` file with sensitive credentials
To set everything up:

- Copy `.env.example` to `.env` and fill in your credentials:

  ```bash
  cp .env.example .env
  ```

- Start the services:

  ```bash
  docker-compose up -d
  ```

- Access the Grafana dashboard at http://localhost:3100 (default credentials: admin/Admin@123456@789)
If you need to update your Telegram bot token or chat ID:
- Update the values in your `.env` file
- Restart the XDC Monitor container:

  ```bash
  docker-compose restart xdc-monitor
  ```
To verify that alerts are properly firing and reaching your Telegram bot:
- Use the Testing Controller:

  ```bash
  curl "http://localhost:3000/api/testing/trigger-manual-alert?type=error&title=Test&message=Test"
  ```

- Check the Grafana Alerting UI:
  - Navigate to Grafana > Alerting
  - Look for your firing alerts in the list

- Check the NestJS logs:

  ```bash
  docker-compose logs -f xdc-monitor | grep "notification"
  ```

- Check your Telegram:
  - You should receive messages from your Telegram bot
For a Docker-only quick start you need:

- Docker and Docker Compose
- Git

Then:

- Clone the repository:

  ```bash
  git clone https://github.com/yourusername/XDCMonitor.git
  cd XDCMonitor
  ```

- Start the services:

  ```bash
  docker-compose up -d
  ```

- Access the Grafana dashboard at http://localhost:3100 (default credentials: admin/Admin@123456@789)