UrbanEV is an open benchmark dataset for electric vehicle (EV) charging demand in Shenzhen, China.

IntelligentSystemsLab/UrbanEV


UrbanEV

UrbanEV is an open dataset of EV charging space availability and electricity use in Shenzhen, China. This project is dedicated to the public domain using the CC0 1.0 Universal License. For more information, see Creative Commons - CC0.

Citations

If this project is helpful to your research, please cite our papers:

Li, H., Qu, H., Tan, X., et al. (2025). UrbanEV: An Open Benchmark Dataset for Urban Electric Vehicle Charging Demand Prediction. Scientific Data. Paper in Springer Nature

Qu, H., Kuang, H., Li, J., & You, L. (2023). A physics-informed and attention-based graph learning approach for regional electric vehicle charging demand prediction. IEEE Transactions on Intelligent Transportation Systems. Paper in IEEE Xplore, Paper in arXiv

Kuang, H., Zhang, X., Qu, H., You, L., Zhu, R., & Li, J. (2024). Unravelling the effect of electricity price on electric vehicle charging behavior: A case study in Shenzhen, China. Sustainable Cities and Society. DOI

Qu, H., Li, H., You, L., Zhu, R., Yan, J., Santi, P., Ratti, C., & Yuen, C. (2024). ChatEV: Predicting electric vehicle charging demand as natural language processing. Transportation Research Part D: Transport and Environment. Paper in TRD, Code in GitHub

@article{li2025urbanev,
  author={Li, Han and Qu, Haohao and Tan, Xiaojun and You, Linlin and Zhu, Rui and Fan, Wenqi},
  title={UrbanEV: An Open Benchmark Dataset for Urban Electric Vehicle Charging Demand Prediction},
  journal={Scientific Data},
  volume={12},
  pages={523},
  year={2025},
  issn={2052-4463},
  doi={10.1038/s41597-025-04874-4},
}

@Article{qu2024a,
  author={Qu, Haohao and Kuang, Haoxuan and Wang, Qiuxuan and Li, Jun and You, Linlin},
  journal={IEEE Transactions on Intelligent Transportation Systems}, 
  title={A Physics-Informed and Attention-Based Graph Learning Approach for Regional Electric Vehicle Charging Demand Prediction}, 
  year={2024},
  pages={1-14},
  doi={10.1109/TITS.2024.3401850}}

@article{kuang2024unravelling,
  title={Unravelling the effect of electricity price on electric vehicle charging behavior: A case study in Shenzhen, China},
  author={Kuang, Haoxuan and Zhang, Xinyu and Qu, Haohao and You, Linlin and Zhu, Rui and Li, Jun},
  journal={Sustainable Cities and Society},
  pages={105836},
  year={2024},
  publisher={Elsevier}
}

@article{qu2024chatev,
 title = {ChatEV: Predicting electric vehicle charging demand as natural language processing},
 journal = {Transportation Research Part D: Transport and Environment},
 volume = {136},
 pages = {104470},
 year = {2024},
 issn = {1361-9209},
 author = {Haohao Qu and Han Li and Linlin You and Rui Zhu and Jinyue Yan and Paolo Santi and Carlo Ratti and Chau Yuen},
}

Contact

If you have any questions regarding this dataset, feel free to reach out.

Author: Han Li [email protected], Haohao Qu [email protected]

Updates

  • January 19, 2025: Uploaded code and data for distribution prediction based on UrbanEV.
  • March 17, 2025: Published the dataset on Data in Dryad.
  • March 28, 2025: The paper "UrbanEV: An Open Benchmark Dataset for Urban Electric Vehicle Charging Demand Prediction" was published in Scientific Data. Paper in Springer Nature

Data Description

The UrbanEV dataset was developed to meet the urgent need for understanding and forecasting electric vehicle (EV) charging demand in urban environments. As global EV adoption accelerates, efficient charging infrastructure management is crucial for ensuring grid stability and enhancing user experience. The data was collected from public EV charging stations in Shenzhen, China, a leading city in vehicle electrification, and covers a six-month period (September 1, 2022, to February 28, 2023), capturing seasonal variations in charging patterns. To ensure data quality, the raw records were preprocessed by extracting key information (availability status, rated power, and fees), removing anomalies, and imputing missing values via forward and backward filling. Outliers identified by the IQR method were replaced with adjacent valid values. The data was then aggregated both temporally (hourly) and spatially (by traffic zones), with variance tests and zero-value filtering applied to exclude low-activity regions. The final dataset includes:

  • Charging data: occupancy, duration, and volume
  • Environmental context: weather conditions
  • Spatial features: adjacency matrices, distances
  • Static attributes: Points of Interest, area size, and road length
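The imputation, outlier replacement, and hourly aggregation described above can be illustrated with a minimal pandas sketch. This is not the authors' preprocessing code: the function names `clean_series` and `to_hourly`, the IQR multiplier of 1.5, and the use of the mean for hourly aggregation are all assumptions made for illustration.

```python
import pandas as pd


def clean_series(raw: pd.Series, k: float = 1.5) -> pd.Series:
    """Fill gaps, then replace IQR outliers with adjacent valid values."""
    # Missing-value imputation: forward fill, then backward fill for leading gaps.
    filled = raw.ffill().bfill()
    # IQR fences; the multiplier k=1.5 is a common default and an assumption here.
    q1, q3 = filled.quantile(0.25), filled.quantile(0.75)
    iqr = q3 - q1
    is_outlier = (filled < q1 - k * iqr) | (filled > q3 + k * iqr)
    # Blank out the outliers and reuse the nearest valid neighbours.
    return filled.mask(is_outlier).ffill().bfill()


def to_hourly(series: pd.Series) -> pd.Series:
    """Aggregate a time-indexed series to 1-hour bins (mean is an assumption)."""
    return series.resample(pd.Timedelta(hours=1)).mean()


if __name__ == "__main__":
    # Synthetic 5-minute readings with two gaps and one spike.
    index = pd.date_range("2022-09-01", periods=24, freq=pd.Timedelta(minutes=5))
    values = [10.0] * 24
    values[0], values[5], values[10] = float("nan"), float("nan"), 1000.0
    print(to_hourly(clean_series(pd.Series(values, index=index))))
```

Filling gaps before computing the quartiles keeps the IQR fences from being skewed by missing values; whether the original pipeline used the same order is not stated in this README.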

Data Access

All datasets related to UrbanEV are publicly available on Dryad. They are also hosted on Google Drive for easier access, where the most up-to-date versions are kept. This includes:

  • Preprocessed zone-level data at both hourly and 5-minute resolution (1,362 charging stations with 17,532 charging piles)
  • Raw station-level data at 5-minute resolution (before preprocessing) (1,682 charging stations with 24,798 charging piles)

The data directory of this GitHub repository contains the preprocessed zone-level dataset used in the Scientific Data paper (Paper in Springer Nature).

Figure 1. Spatial distribution of 1,682 public charging stations and 24,798 charging piles in the UrbanEV dataset.

Files

code: Code for distribution time-series prediction using traditional models and deep learning models based on the UrbanEV dataset, including several modularized functions.

  • baselines.py: Includes three traditional forecasting methods (Last Observation, Auto-regressive (AR), and ARIMA) and deep learning models (Fully Connected Neural Network (FCNN), Long Short-Term Memory (LSTM), Graph Convolutional Network (GCN), GCN-LSTM, and Attention-Based Spatial-Temporal Graph Convolutional Network (ASTGCN)).
  • exp.bat|exp.sh: Scripts for distribution time-series prediction.
  • init_env.bat|init_env.sh: Scripts to create a virtual environment for running time-series predictions based on the UrbanEV dataset.
  • main.py: Main script file.
  • parse.py: Provides a command-line interface to configure training parameters for spatiotemporal EV charging demand prediction models.
  • preprocess.py: Converts data in the ./data/dataset folder into a format suitable for training and predicting with Transformer-based time-series models.
  • train.py: Model training script.
  • utils.py: Utility functions related to the UrbanEV dataset predictions, e.g., time-series cross-validation and dataset preparation.
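As a rough illustration of the simplest baseline listed above, Last Observation, the sketch below builds sliding windows over a (time, zones) array and forecasts each zone by repeating its most recent value. It is not taken from baselines.py: the helper names, the window layout, and the reading of `pre_len` as "predict the value `pre_len` steps ahead" are assumptions.

```python
import numpy as np


def sliding_windows(series: np.ndarray, seq_len: int, pre_len: int):
    """Split a (time, zones) array into (history, target) pairs."""
    n = series.shape[0] - seq_len - pre_len + 1
    if n <= 0:
        raise ValueError("series is too short for the requested window sizes")
    histories = np.stack([series[i : i + seq_len] for i in range(n)])
    # Assumption: the target is the single step pre_len steps after the history.
    targets = np.stack([series[i + seq_len + pre_len - 1] for i in range(n)])
    return histories, targets


def last_observation_forecast(histories: np.ndarray) -> np.ndarray:
    """Predict every zone's future value as its last observed value."""
    return histories[:, -1, :]


if __name__ == "__main__":
    # Synthetic occupancy for 2 zones over 10 time steps.
    data = np.arange(20, dtype=float).reshape(10, 2)
    hist, target = sliding_windows(data, seq_len=4, pre_len=3)
    mae = np.abs(last_observation_forecast(hist) - target).mean()
    print(f"windows: {hist.shape[0]}, MAE: {mae:.2f}")
```

A baseline like this needs no training, which makes it a useful floor against which the learned models can be compared.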

data: 1-hour resolution zone-level data of the UrbanEV dataset, which has been cleaned through outlier detection, zero-value checks, etc., and includes data from 275 zones, 1,362 charging stations, and 17,532 charging piles.

  • adj.csv: Adjacency matrix.
  • duration.csv: Hourly EV charging duration (Unit: hour).
  • e_price.csv: Electricity price (Unit: Yuan/kWh).
  • inf.csv: Key information about the 275 zones, including coordinates, charging capacities, area (Unit: m^2), and perimeter (Unit: m).
  • occupancy.csv: Hourly EV charging occupancy rate (Unit: %).
  • s_price.csv: Service price (Unit: Yuan/kWh).
  • volume.csv: Hourly EV charging volume (Unit: kWh), derived from the rated power of the charging piles.
  • volume-11kW.csv: An alternative vehicle-side estimate of charging volume that mitigates potential overestimation in volume.csv. Specifically, for direct current charging stations, the volume is calculated using the standard power of the most commonly used electric vehicle, Tesla Model Y (11 kW), instead of the rated power of the charging pile.
  • weather_airport.csv: Weather data from the meteorological station at Bao'an Airport (Shenzhen). These are the raw collected data; min-max normalization is recommended before use.
  • weather_central.csv: Weather data from Futian Meteorological Station in the city center of Shenzhen.
  • weather_header.txt: Descriptions of the table headers in weather_airport.csv and weather_central.csv.
  • distance.csv: Distance matrix between the 275 zones.
  • poi.csv: Points of Interest categorized into three types: food and beverage services, business and residential, and lifestyle services. The coordinates used are based on the WGS84 coordinate system.
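The min-max normalization recommended above for the raw weather files can be sketched as follows. This is a minimal example, not code from the repository: the function name `min_max_normalize` and the column names in the demo are hypothetical, and the real headers are described in weather_header.txt.

```python
import pandas as pd


def min_max_normalize(frame: pd.DataFrame) -> pd.DataFrame:
    """Scale each numeric column to [0, 1]; constant columns map to 0."""
    numeric = frame.select_dtypes(include="number")
    lo, hi = numeric.min(), numeric.max()
    # Replace a zero range with 1 so constant columns do not divide by zero.
    span = (hi - lo).replace(0, 1)
    scaled = frame.copy()
    scaled[numeric.columns] = (numeric - lo) / span
    return scaled


if __name__ == "__main__":
    # Hypothetical columns standing in for the real weather headers.
    weather = pd.DataFrame(
        {"temperature": [10.0, 20.0, 30.0], "pressure": [5.0, 5.0, 5.0]}
    )
    print(min_max_normalize(weather))
```

When the normalized features feed a forecasting model, computing the minimum and maximum on the training split only avoids leaking information from the test period.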

code-transformer: Code for distribution time-series prediction using Transformer-based models on the UrbanEV dataset. Below are explanations for some core files and directories related to UrbanEV prediction:

  • dataset/st-evcdp: Contains data files used for predictions, which can be generated through ../code/preprocess.py.
  • exp.bat|exp.sh: Scripts for distribution time-series prediction using Transformer-based models.

Environment Requirements

This section outlines the setup for the virtual environment required for time-series prediction based on UrbanEV, using Python 3.8 and PyTorch 2.4.1. Assuming your working directory is the project root directory, here are the relevant commands:

Windows

cd code
init_env.bat

Linux

cd code
./init_env.sh

Due to the discontinuation of PyG Temporal, you may encounter a ModuleNotFoundError: No module named 'torch_geometric.utils.to_dense_adj' when running ASTGCN experiments. To resolve this, change from torch_geometric.utils.to_dense_adj import to_dense_adj to from torch_geometric.utils import to_dense_adj. See pyg-#9023 (reply in thread) for more details.
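If you prefer to apply the import change with a script rather than by hand, the sketch below rewrites the outdated line in a given file. The helper `patch_import` is hypothetical, and the file to patch is an assumption: use the path shown in the traceback of the ModuleNotFoundError.

```python
from pathlib import Path

OLD_IMPORT = "from torch_geometric.utils.to_dense_adj import to_dense_adj"
NEW_IMPORT = "from torch_geometric.utils import to_dense_adj"


def patch_import(source_file: Path) -> bool:
    """Rewrite the outdated import in place; return True if the file changed."""
    text = source_file.read_text(encoding="utf-8")
    if OLD_IMPORT not in text:
        return False
    source_file.write_text(text.replace(OLD_IMPORT, NEW_IMPORT), encoding="utf-8")
    return True


if __name__ == "__main__":
    import sys

    # Usage: python patch_import.py <path reported in the traceback>
    if len(sys.argv) == 2:
        changed = patch_import(Path(sys.argv[1]))
        print("patched" if changed else "nothing to change")
```

The function is idempotent: running it a second time on an already patched file leaves the file untouched.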

Run Distribution Prediction on the UrbanEV dataset

The commands below assume your working directory is the project root directory.

Simple Example

Traditional and deep learning-based model

cd code
conda activate UrbanEV
python main.py --model=fcnn --pre_len=3 --fold=1 --pred_type=region --add_feat=None --feat occ --epoch 1

Transformer-based model

cd code-transformer
conda activate UrbanEV
python run.py --task_name long_term_forecast --is_training 1 --root_path ./dataset/UrbanEV/ --data_path occ-e.csv --model_id st-evcdp_12_3 --model TimeXer --data custom --features M --seq_len 12 --label_len 12 --e_layers 1 --factor 1 --enc_in 608 --dec_in 608 --c_out 304 --d_model 512 --d_ff 512 --des Exp --batch_size 32 --learning_rate 0.001 --itr 1 --train_epochs 1 --pred_len 3 --pred_type region --add_feat None --fold 3 --feat occ

Traditional and Deep Learning Model

Windows

cd code
conda activate UrbanEV
exp.bat

Linux

cd code
conda activate UrbanEV
./exp.sh

Transformer-based Model

Windows

cd code
conda activate UrbanEV
python preprocess.py
cd ..
cd code-transformer
exp.bat

Linux

cd code
conda activate UrbanEV
python preprocess.py
cd ..
cd code-transformer
./exp.sh

Acknowledgement

The project is based on the following time series forecasting repositories, from which you can explore more about the models and methods by clicking on the respective links:

More updates will be posted in the near future! Thank you for your interest.