Computational Performance of Semi- and Unsupervised Concept Drift Detection: A Survey and Multiobjective Benchmark using Bayesian Optimization
A comprehensive benchmark suite for evaluating drift detection algorithms on data streams using multi-objective Bayesian optimization.
This benchmark builds upon the initial work on unsupervised concept drift detection:
- Original Repository: DFKI-NI/unsupervised-concept-drift-detection
This benchmark suite provides a systematic framework for evaluating and comparing drift detection algorithms across multiple dimensions:
- Multi-objective optimization using Bayesian optimization (via OmniOpt)
- 20 drift detectors including both supervised and unsupervised methods
- Real-world and synthetic datasets for comprehensive evaluation
- Multiple performance metrics: Accuracy, Runtime, Requested Labels, Mean Time Ratio (MTR)
- Automated hyperparameter tuning for fair comparison
- Scalable execution on both HPC clusters (SLURM) and local machines
- Currently implemented detectors: CSDDM, BNDM, D3, IBDD, OCDD, SPLL, UDetect, EDFS, NNDVI, UCDD, STUDD, DDAL, DAWIDD, IKS, HDDDM, PCACD, CDBD, SlidShaps, WindowKDE, CDLEEDS
- Multi-Objective Optimization: Simultaneously optimize for accuracy, runtime, label efficiency, MTR, or other metrics
- Feature Testing Mode: Single-variate detector evaluation
benchmarkdd/
├── datasets/ # Dataset definitions and loaders
│ ├── files.tar.gz # CSV data files to unpack
│ └── *.py # Dataset classes (electricity.py, etc.)
│
├── detectors/ # Drift detector implementations
│ ├── base.py # Base detector class
│ └── *.py # Individual detectors (csddm.py, etc.)
│
├── metrics/ # Performance metrics
│ ├── drift.py # Drift detection metrics
│ ├── computational_metrics.py # Runtime and memory metrics
│ └── lift_per_drift.py # Lift per drift calculation
│
├── evaluation_notebooks/ # Analysis and visualization notebooks
│ ├── evaluation_unified.ipynb # Experiment status and data loading
│ ├── evaluation_visualization.ipynb # Performance visualization
│ ├── prediction_analysis.ipynb # Prediction pattern analysis
│ ├── evaluation_radar_graphs.ipynb # Multi-dimensional comparison
│ └── eval_config.py # Shared configuration
│
├── model/ # Pre-trained classifier models
│ └── HoeffdingTreeClassifier/ # Hoeffding Tree models
│
├── results/ # Experiment results
│ ├── baselines/ # Baseline results
│ └── omniopt_results/ # OmniOpt optimization results
│
├── runs/ # Active experiment runs (created by OmniOpt)
│ └── [detector]_[dataset]_[classifier]_[metrics]/
│
├── test/ # Unit and integration tests
│
├── main.py # Main experiment runner
├── train_classifiers.py # Classifier training
├── compute_baselines.py # Baseline computation
├── config.py # Global configuration
├── requirements.txt # Python dependencies
│
├── run_stream_detector_optimization.sh # Main benchmark script
├── run_stream_detector_optimization.sbatch # SLURM batch script
├── run_train_classifiers.sh # Classifier training script
├── run_baselines.sh # Baseline computation script
└── benchmark_config.sh # Benchmark configuration
- Python 3.8+
- Required packages (see requirements.txt)
- SLURM-based scheduler for HPC execution
- Clone the repository
git clone <repository-url>
cd benchmarkdd
- Install dependencies
# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install required packages
pip install -r requirements.txt
- Prepare datasets
# Datasets can be downloaded from [USP DS Repository](https://sites.google.com/view/uspdsrepository) and extracted into `datasets/files`
# Or download additional datasets if needed
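If you start from the bundled archive instead, a minimal extraction sketch (assuming the CSV files should end up under `datasets/files`; adjust the target path if the archive already contains that folder):

# Unpack the bundled CSV data files
mkdir -p datasets/files
tar -xzf datasets/files.tar.gz -C datasets/files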
- Train classifiers (optional, pre-trained models included)
# On local machine
./run_train_classifiers.sh
# On HPC cluster
sbatch run_train_classifiers.sbatch
You can run a specific detector with a specific configuration directly using main.py, without hyperparameter optimization:
Standard Mode (ACCURACY-RUNTIME-REQLABELS):
python main.py <Accuracy> <Runtime> <ReqLabels> <Dataset> <TrainSamples> <Classifier> <Detector> [params...]
MTR Mode (RUNTIME-MTR):
python main.py <Runtime> <MTR> <Dataset> <TrainSamples> <Classifier> <Detector> [params...]
Arguments:
- Accuracy/Runtime/ReqLabels/MTR - Boolean flags (True/False/1/0) selecting which metrics to optimize
- Dataset - Dataset name (e.g., 'Electricity', 'NOAAWeather')
- TrainSamples - Number of training samples (typically 2000)
- Classifier - Classifier name (e.g., 'HoeffdingTreeClassifier')
- Detector - Drift detector name (e.g., 'CSDDM', 'BNDM')
- params - Detector-specific parameters as key-value pairs
Examples:
# Run CSDDM on Electricity dataset, optimize for accuracy and runtime
python main.py True True False Electricity 1600 HoeffdingTreeClassifier CSDDM recent_samples_size 1000 n_samples 500 confidence 0.05 feature_proportion 0.5 n_clusters 5
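In MTR mode the positional flags change to <Runtime> <MTR>; an illustrative invocation on a synthetic stream, reusing the CSDDM parameters from above (the parameter values are placeholders, not tuned settings):

# Run CSDDM on the synthetic SineClustersPre stream, optimize for runtime and MTR
python main.py True True SineClustersPre 2000 HoeffdingTreeClassifier CSDDM recent_samples_size 1000 n_samples 500 confidence 0.05 feature_proportion 0.5 n_clusters 5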
For automated hyperparameter optimization across multiple configurations:
Edit run_stream_detector_optimization.sh
to configure your experiments:
# Select evaluation mode
MTR_MODE=false # Set to true for MTR evaluation
FEATURE_TEST_MODE=false # Set to true for feature testing
# Configure detector-dataset combinations
streamdds["Electricity"]="CSDDM BNDM D3"
streamdds["NOAAWeather"]="CSDDM BNDM"
# Add more combinations as needed
The script automatically detects local execution and adjusts accordingly:
# Run benchmark
./run_stream_detector_optimization.sh
Example: ACCURACY-RUNTIME-REQLABELS
# Edit run_stream_detector_optimization.sh
MTR_MODE=false
FEATURE_TEST_MODE=false
# Configure detectors and datasets
streamdds["Electricity"]="CSDDM BNDM D3 IBDD"
streamdds["NOAAWeather"]="CSDDM BNDM"
# Run
./run_stream_detector_optimization.sh
Example: MTR (Mean Time Ratio)
# Edit run_stream_detector_optimization.sh
MTR_MODE=true
FEATURE_TEST_MODE=false
# Configure synthetic datasets
streamdds["SineClustersPre"]="CSDDM BNDM D3"
streamdds["WaveformPre"]="CSDDM BNDM"
# Run
./run_stream_detector_optimization.sh
Example: Feature Testing Mode (single variate detectors)
# Edit run_stream_detector_optimization.sh
MTR_MODE=false
FEATURE_TEST_MODE=true
# Configure single-variate detectors
streamdds["Electricity"]="CDBD IKS WindowKDE"
# Run
./run_stream_detector_optimization.sh
The script automatically detects SLURM and uses appropriate commands:
# Submit batch job
sbatch run_stream_detector_optimization.sbatch
# Or run interactively in a job allocation
./run_stream_detector_optimization.sh
Adjust OmniOpt parameters:
# In run_stream_detector_optimization.sh, modify OmniOpt parameters:
--max_eval=50000 # Maximum evaluations
--num_parallel_jobs=30 # Parallel jobs
--worker_timeout=240 # Timeout per evaluation (minutes)
--mem_gb=64 # Memory per job (GB)
--time=10080 # Total time limit (minutes)
Please check the OmniOpt documentation for further details: https://imageseg.scads.de/omniax/tutorials
Navigate to evaluation_notebooks/ for comprehensive analysis:
cd evaluation_notebooks
jupyter notebook
Available notebooks:
- evaluation_unified.ipynb - Experiment status and completion
  - Check which experiments have completed
  - Identify missing or incomplete runs
  - View completion statistics
- evaluation_visualization.ipynb - Performance visualization
  - Pareto front analysis
  - Performance heatmaps
  - Best configuration identification
  - Export results to CSV
- prediction_analysis.ipynb - Prediction pattern analysis
  - Time-series visualization of predictions
  - Ensemble analysis
  - Minimal detector set identification
The benchmark evaluates detectors across multiple dimensions.

Standard mode:
- Accuracy: Classification accuracy with drift detection
- Runtime: Total execution time (seconds)
- Requested Labels: Number of labels requested (for semi-supervised methods)
- Memory: Peak and mean memory usage (MB)

MTR mode:
- Mean Time Ratio (MTR): Ratio of time to recover from drift
- Runtime: Execution time
- Accuracy: Overall classification accuracy
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
- Original unsupervised drift detection work: DFKI-NI/unsupervised-concept-drift-detection
- OmniOpt framework for Bayesian optimization
- River library for online machine learning
- All contributors and researchers in the drift detection community
For questions, issues, or collaboration:
- Open an issue on GitHub
- Contact: [[email protected]]