This is a repository for pre-processing and analyzing OUH incident dataset, aiming for an integrated solution for our simulation program. The simulator program is found here.
- Install required python version 3.11
- Install required packages
pip install -r source/requirements.txt
(We recommend using virtual environment, follow guide under Virtual Environment Setup below and skip this step) - Change directory
cd source
- Run program
python main.py
to start the dataset pipeline
- Get the package
pip install virtualenv
- Create a new empty instance of python 3.11 environment
py -3.11 -m venv ./.venv
- Activate the environment
source .venv/Scripts/activate
- Install the packages required by this project
pip install -r source/requirements.txt
- Get the package
pip install virtualenv
- Create a new empty instance of python 3.11 environment
python -m venv ./.venv
- Activate the environment
source .venv/bin/activate
- Install the packages required by this project
pip install -r source/requirements.txt
All the code is in the source/
directory, with the analysis notebooks contained in the source/analysis/
directory. Analysis of the experiment results is done in source/analysis/simulator/
.
The data used in this project can't be made public and will therefore not be contained in this public repository.
The data is stored in the data/
directory and contains 4 directories for the dataset processing pipeline; raw/
which contains the raw versions of incidents.csv
and depots.csv
, clean/
which containes the cleaned versions of the files (removing CSV errors), processed/
which contains the the processed versions of the files (convert to our dataframe template), and lastly enhanced/
which contains the final form of the OUH dataset which the simulator will use. The OD cost matrix and traffic data is included.
data/
enhanced/
oslo/
depots.csv
incidents.csv
processed/
oslo/
depots.csv
incidents.csv
clean/
oslo/
depots.csv
incidents.csv
raw/
oslo/
depots.csv
incidents.csv
oslo/
od_matrix.txt
traffic.csv