A machine learning solution for proactive network management
This project builds a system that analyzes network traffic patterns and predicts whether congestion is likely to occur, giving network administrators time to act before users are affected.
The Problem: Network congestion leads to slow load times, dropped connections, and frustrated users.
The Solution: By predicting congestion before it happens, ISPs and network teams can implement traffic shaping, load balancing, or capacity adjustments proactively.
# Clone and navigate
git clone https://github.com/AR10129/Network-Traffic-Congestion-Prediction.git
cd Network-Traffic-Congestion-Prediction
# Set up environment (Windows)
python -m venv .venv
.venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Train the model
python scripts/run_pipeline.py
# Launch the web app
python app.py

Then visit http://127.0.0.1:5000
├── app.py Flask web application
├── requirements.txt Python dependencies
├── config/ YAML configuration files
├── data/ Generated traffic datasets
├── logs/ Application logs
├── models/ Trained model artifacts (.pkl)
├── notebooks/ Jupyter notebooks for EDA
├── screenshots/ App demo images
├── scripts/ Pipeline execution scripts
├── src/ Core source code modules
│ ├── data/ Data generation utilities
│ ├── features/ Feature engineering logic
│ ├── models/ Training & prediction code
│ ├── utils/ Logging & helper functions
│ └── visualization/ Plotting functions
├── static/ CSS stylesheets
├── templates/ HTML templates for Flask
└── visualization/ Generated plots and charts
| Homepage | Prediction Form | Result |
|---|---|---|
| ![Homepage]() | ![Prediction Form]() | ![Result]() |
Generate Data --> Engineer Features --> Train Model --> Deploy via API
| Stage | Description |
|---|---|
| Data Generation | Synthetic network traffic with realistic peak/off-peak patterns |
| Feature Engineering | IP conversion, packet rate calculation, temporal features, one-hot encoding |
| Model Training | Random Forest with TimeSeriesSplit CV + RandomizedSearchCV hyperparameter tuning |
| Deployment | Flask web app for real-time predictions |
| Feature | Description |
|---|---|
| `packet_size` | Size of network packets |
| `bytes_sent` | Total bytes transmitted |
| `packet_rate` | Rolling average of packet transmission |
| `hour` / `day_of_week` | Temporal patterns |
| `is_weekend` | Weekend indicator (different traffic patterns) |
| `protocol_*` | One-hot encoded protocol type (TCP/UDP/HTTP) |
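As a rough illustration of how the features listed above could be derived with pandas, here is a minimal sketch. The function name `engineer_features`, the raw column names (`timestamp`, `source_ip`, `destination_ip`, `protocol`), and the definition of `packet_rate` as a rolling mean of `bytes_sent / packet_size` are assumptions made for this example; the project's actual code in `src/features/` may differ.

```python
import ipaddress

import pandas as pd


def engineer_features(df: pd.DataFrame, window: int = 5) -> pd.DataFrame:
    """Derive model features from raw traffic records (illustrative sketch)."""
    out = df.copy()
    out["timestamp"] = pd.to_datetime(out["timestamp"])
    # Keep records in time order so the rolling window only looks backwards.
    out = out.sort_values("timestamp").reset_index(drop=True)

    # IP conversion: dotted-quad strings become integers.
    for col in ("source_ip", "destination_ip"):
        out[col] = out[col].map(lambda ip: int(ipaddress.ip_address(ip)))

    # Temporal features.
    out["hour"] = out["timestamp"].dt.hour
    out["day_of_week"] = out["timestamp"].dt.dayofweek  # Monday=0 ... Sunday=6
    out["is_weekend"] = (out["day_of_week"] >= 5).astype(int)

    # Assumption: approximate packets per record as bytes_sent / packet_size,
    # then smooth with a rolling mean over the last `window` records.
    packets = out["bytes_sent"] / out["packet_size"]
    out["packet_rate"] = packets.rolling(window, min_periods=1).mean()

    # One-hot encode the protocol into protocol_* columns.
    out = pd.get_dummies(out, columns=["protocol"], prefix="protocol", dtype=int)
    return out.drop(columns=["timestamp"])


raw = pd.DataFrame(
    {
        "timestamp": ["2024-01-06 10:00:00", "2024-01-08 18:30:00"],
        "source_ip": ["10.0.0.1", "10.0.0.2"],
        "destination_ip": ["192.168.1.10", "192.168.1.11"],
        "packet_size": [500, 500],
        "bytes_sent": [1000, 3000],
        "protocol": ["TCP", "UDP"],
    }
)
features = engineer_features(raw, window=2)
```

Sorting by timestamp before computing the rolling mean matters: otherwise the window could mix in records from the future.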
The model is a Random Forest classifier, evaluated as follows:
- Cross-Validation: TimeSeriesSplit (each validation fold comes later in time than its training data, which prevents leakage from future samples)
- Hyperparameter Search: RandomizedSearchCV for efficient tuning
- Metrics Tracked: Precision, Recall, F1-Score, AUC-ROC
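The evaluation approach above can be sketched with scikit-learn as follows. This is a minimal example under stated assumptions, not the project's training script: the data is a random stand-in assumed to be in time order, the search space is reduced to keep the run short, and `scoring="f1"` and `n_iter=3` are illustrative choices.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, precision_score, recall_score, roc_auc_score
from sklearn.model_selection import RandomizedSearchCV, TimeSeriesSplit

# Synthetic stand-in data; rows are assumed to be ordered by time.
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 4))
y = (X[:, 0] + 0.5 * rng.normal(size=300) > 0).astype(int)

# Hold out the most recent 20% as a test set, without shuffling.
split = int(len(X) * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Randomized search over a small grid, validated on time-ordered folds.
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_distributions={"n_estimators": [50, 100], "max_depth": [None, 10]},
    n_iter=3,
    cv=TimeSeriesSplit(n_splits=5),
    scoring="f1",
    random_state=42,
)
search.fit(X_train, y_train)

# Score the refit best model on the held-out tail.
y_pred = search.predict(X_test)
y_prob = search.predict_proba(X_test)[:, 1]
metrics = {
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
    "auc_roc": roc_auc_score(y_test, y_prob),
}
print(search.best_params_, metrics)
```

Unlike shuffled k-fold, `TimeSeriesSplit` always trains on earlier rows and validates on later ones, which mirrors how the model would be used on live traffic.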
After training, visualizations are auto-generated:
- Confusion Matrix
- Feature Importance Chart
- Traffic Volume Over Time
- Protocol Distribution
The Flask app provides a simple interface to test predictions:
Input Fields:
- Packet Size
- Bytes Sent
- Source/Destination IP
- Protocol (TCP/UDP/HTTP)
- Timestamp & Hour
Output: Congested or Normal prediction
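One way the submitted form fields could be turned into a feature row before being passed to the trained model is sketched below. The function `form_to_features` and the form field names are hypothetical, and the hour is derived from the timestamp here rather than read from a separate field. The rolling `packet_rate` is left out because a single request carries no history; how the app handles that is not shown in this README.

```python
import ipaddress
from datetime import datetime

PROTOCOLS = ("TCP", "UDP", "HTTP")


def form_to_features(form: dict) -> dict:
    """Convert raw form values into numeric features (hypothetical helper)."""
    protocol = form["protocol"].upper()
    if protocol not in PROTOCOLS:
        raise ValueError(f"Unsupported protocol: {form['protocol']}")

    ts = datetime.fromisoformat(form["timestamp"])
    features = {
        "packet_size": float(form["packet_size"]),
        "bytes_sent": float(form["bytes_sent"]),
        # ip_address() raises ValueError on malformed input.
        "source_ip": int(ipaddress.ip_address(form["source_ip"])),
        "destination_ip": int(ipaddress.ip_address(form["destination_ip"])),
        "hour": ts.hour,
        "day_of_week": ts.weekday(),  # Monday=0 ... Sunday=6
        "is_weekend": int(ts.weekday() >= 5),
    }
    # One-hot encode the protocol so the columns match training.
    for name in PROTOCOLS:
        features[f"protocol_{name}"] = int(protocol == name)
    return features


row = form_to_features(
    {
        "packet_size": "512",
        "bytes_sent": "20480",
        "source_ip": "192.168.1.10",
        "destination_ip": "10.0.0.1",
        "protocol": "udp",
        "timestamp": "2024-01-06T14:30",
    }
)
```

Validating the protocol and IP addresses up front lets the app return a clear error instead of failing inside the model call.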
All parameters are centralized in config/config.yaml:
# Model hyperparameters
model:
  param_grid:
    n_estimators: [50, 100, 200, 300]
    max_depth: [null, 10, 20, 30]
  cv_splits: 5
# Data settings
data:
  test_size: 0.2
  random_state: 42

| Category | Technologies |
|---|---|
| ML/Data | pandas, numpy, scikit-learn |
| Visualization | matplotlib, seaborn |
| Web Framework | Flask |
| Config | PyYAML |
| Utilities | ipaddress, pickle |
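Since PyYAML handles configuration, here is a minimal sketch of how the settings from `config/config.yaml` could be read. The YAML is inlined as a string so the example is self-contained, and the nesting (in particular `cv_splits` sitting under `model`) is an assumption about the file's layout.

```python
import yaml

# Inline copy of the settings; in the project these would come from
# config/config.yaml (nesting assumed for this example).
CONFIG_TEXT = """
model:
  param_grid:
    n_estimators: [50, 100, 200, 300]
    max_depth: [null, 10, 20, 30]
  cv_splits: 5
data:
  test_size: 0.2
  random_state: 42
"""

config = yaml.safe_load(CONFIG_TEXT)

# YAML null becomes Python None, which scikit-learn reads as "no depth limit".
param_grid = config["model"]["param_grid"]
cv_splits = config["model"]["cv_splits"]
test_size = config["data"]["test_size"]

print(param_grid["max_depth"], cv_splits, test_size)
```

`yaml.safe_load` is used rather than `yaml.load` because it only builds plain Python types and will not construct arbitrary objects from the file.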
- Integrate real-time network monitoring (SNMP/NetFlow)
- Experiment with Gradient Boosting / XGBoost
- Add model drift detection
- Docker containerization
- REST API with authentication
- Dashboard with live metrics


