UrbanEV is an open dataset of EV charging space availability and electricity use in Shenzhen, China. This project is dedicated to the public domain using the CC0 1.0 Universal License. For more information, see Creative Commons - CC0.
If this project is helpful to your research, please cite our papers:
Li, H., Qu, H., Tan, X. et al. (2025). UrbanEV: An Open Benchmark Dataset for Urban Electric Vehicle Charging Demand Prediction. Scientific Data. Paper in Springer Nature
Qu, H., Kuang, H., Li, J., & You, L. (2023). A physics-informed and attention-based graph learning approach for regional electric vehicle charging demand prediction. IEEE Transactions on Intelligent Transportation Systems. Paper in IEEE Xplore, Paper in arXiv
Kuang, H., Zhang, X., Qu, H., You, L., Zhu, R., & Li, J. (2024). Unravelling the effect of electricity price on electric vehicle charging behavior: A case study in Shenzhen, China. Sustainable Cities and Society. DOI
Qu, H., Li, H., You, L., Zhu, R., Yan, J., Santi, P., Ratti, C., & Yuen, C. (2024). ChatEV: Predicting electric vehicle charging demand as natural language processing. Transportation Research Part D: Transport and Environment. Paper in TRD, Code in GitHub
@article{li2025urbanev,
author={Li, Han and Qu, Haohao and Tan, Xiaojun and You, Linlin and Zhu, Rui and Fan, Wenqi},
title={UrbanEV: An Open Benchmark Dataset for Urban Electric Vehicle Charging Demand Prediction},
journal={Scientific Data},
volume={12},
pages={523},
year={2025},
issn={2052-4463},
doi={10.1038/s41597-025-04874-4},
}
@Article{qu2024a,
author={Qu, Haohao and Kuang, Haoxuan and Wang, Qiuxuan and Li, Jun and You, Linlin},
journal={IEEE Transactions on Intelligent Transportation Systems},
title={A Physics-Informed and Attention-Based Graph Learning Approach for Regional Electric Vehicle Charging Demand Prediction},
year={2024},
pages={1-14},
doi={10.1109/TITS.2024.3401850}}
@article{kuang2024unravelling,
title={Unravelling the effect of electricity price on electric vehicle charging behavior: A case study in Shenzhen, China},
author={Kuang, Haoxuan and Zhang, Xinyu and Qu, Haohao and You, Linlin and Zhu, Rui and Li, Jun},
journal={Sustainable Cities and Society},
pages={105836},
year={2024},
publisher={Elsevier}
}
@article{qu2024chatev,
title = {ChatEV: Predicting electric vehicle charging demand as natural language processing},
journal = {Transportation Research Part D: Transport and Environment},
volume = {136},
pages = {104470},
year = {2024},
issn = {1361-9209},
author = {Haohao Qu and Han Li and Linlin You and Rui Zhu and Jinyue Yan and Paolo Santi and Carlo Ratti and Chau Yuen},
}
If you have any questions regarding this dataset, feel free to reach out.
Author: Han Li [email protected], Haohao Qu [email protected]
- January 19, 2025: Uploaded code and data for distribution prediction based on UrbanEV.
- March 17, 2025: Published the dataset on Data in Dryad.
- March 28, 2025: The paper "UrbanEV: An Open Benchmark Dataset for Urban Electric Vehicle Charging Demand Prediction" was published in Scientific Data. Paper in Springer Nature
The UrbanEV dataset was developed to support understanding and forecasting of electric vehicle (EV) charging demand in urban environments. As global EV adoption accelerates, efficient charging infrastructure management is crucial for ensuring grid stability and enhancing user experience. The data were collected from public EV charging stations in Shenzhen, China, a leading city in vehicle electrification, and cover a six-month period (September 1, 2022, to February 28, 2023), capturing seasonal variations in charging patterns. To ensure data quality, the raw records were preprocessed by extracting key information (availability status, rated power, and fees), removing anomalies, and imputing missing values via forward and backward filling. Outliers identified by the IQR method were replaced with adjacent valid values. The data were then aggregated both temporally (hourly) and spatially (by traffic zones), with variance tests and zero-value filtering applied to exclude low-activity regions. The final dataset includes:
- Charging data: occupancy, duration, and volume
- Environmental context: weather conditions
- Spatial features: adjacency matrices, distances
- Static attributes: Points of Interest, area size, and road length
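The imputation and outlier steps described above can be illustrated with a minimal sketch. It is not the authors' preprocessing code: the function names, the 1.5 multiplier on the IQR, and the choice to refill outliers with the same forward/backward rule are all assumptions made for illustration.

```python
from statistics import quantiles


def fill_missing(values):
    """Forward-fill, then backward-fill, the None entries of a sequence."""
    filled = list(values)
    last = None
    # Forward pass: carry the most recent valid value forward.
    for i, v in enumerate(filled):
        if v is None:
            filled[i] = last
        else:
            last = v
    nxt = None
    # Backward pass: fill any leading gaps from the next valid value.
    for i in range(len(filled) - 1, -1, -1):
        if filled[i] is None:
            filled[i] = nxt
        else:
            nxt = filled[i]
    return filled


def replace_iqr_outliers(values, k=1.5):
    """Replace points outside [Q1 - k*IQR, Q3 + k*IQR] with an adjacent valid value.

    The multiplier k=1.5 is a common convention, assumed here.
    """
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    # Mark outliers as missing, then reuse the fill rule to borrow a neighbour.
    masked = [v if low <= v <= high else None for v in values]
    return fill_missing(masked)


raw = [None, 0.2, 0.3, None, 0.25, 9.0, 0.3, 0.28, None]
cleaned = replace_iqr_outliers(fill_missing(raw))
print(cleaned)
```

Reusing the fill rule for outliers keeps both steps consistent: every repaired point takes the value of its nearest earlier valid neighbour, falling back to the next valid one at the start of the series.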
All datasets related to UrbanEV are publicly available on Dryad. They are also hosted on Google Drive for easier access, and the Google Drive copy carries the most up-to-date versions. The datasets include:
- Preprocessed zone-level data at both hourly and 5-minute resolution (1,362 charging stations with 17,532 charging piles)
- Raw station-level data at 5-minute resolution (before preprocessing) (1,682 charging stations with 24,798 charging piles)
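The two resolutions listed above are related by temporal aggregation. The following sketch rolls 5-minute records up into hourly values under stated assumptions: the record layout (timestamp, value) is hypothetical, and the reducer is a guess (a mean suits rate-like quantities such as occupancy, while a sum may be more appropriate for duration or volume).

```python
from collections import defaultdict
from datetime import datetime
from statistics import fmean


def aggregate_hourly(records, reducer=fmean):
    """Group (timestamp, value) pairs by hour and reduce each group to one value."""
    buckets = defaultdict(list)
    for ts, value in records:
        # Truncate the timestamp to the start of its hour.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    return {hour: reducer(vals) for hour, vals in sorted(buckets.items())}


records = [
    (datetime(2022, 9, 1, 0, 0), 10),
    (datetime(2022, 9, 1, 0, 5), 20),
    (datetime(2022, 9, 1, 1, 0), 30),
]
print(aggregate_hourly(records))               # hourly mean
print(aggregate_hourly(records, reducer=sum))  # hourly total
```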
The data directory of this GitHub repository contains the preprocessed zone-level dataset used in the Scientific Data paper (Paper in Springer Nature).
Figure 1. Spatial distribution of 1,682 public charging stations and 24,798 charging piles in the UrbanEV dataset.
code: Code for distribution time-series prediction on the UrbanEV dataset using traditional and deep learning models, organized into several modular functions.
- baselines.py: Includes three traditional forecasting methods (Last Observation, Auto-regressive (AR), and ARIMA) and six deep learning models, including Fully Connected Neural Network (FCNN), Long Short-Term Memory (LSTM), Graph Convolutional Network (GCN), GCN-LSTM, and Attention-Based Spatial-Temporal Graph Convolutional Network (ASTGCN).
- exp.bat | exp.sh: Scripts for distribution time-series prediction.
- init_env.bat | init_env.sh: Scripts to create a virtual environment for running time-series predictions based on the UrbanEV dataset.
- main.py: Main script file.
- parse.py: Provides a command-line interface to configure training parameters for spatiotemporal EV charging demand prediction models.
- preprocess.py: Converts data in the ./data/dataset folder into a format suitable for training and predicting with Transformer-based time-series models.
- train.py: Model training script.
- utils.py: Utility functions related to the UrbanEV dataset predictions, e.g., time-series cross-validation and dataset preparation.
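To illustrate what time-series cross-validation means in this setting, here is a minimal sketch of forward-chaining splits, in which each test block is preceded in time by its training data. It is an assumption-laden illustration and not the implementation in utils.py; the function name and parameters are hypothetical.

```python
def time_series_folds(n_samples, n_folds, test_size):
    """Yield (train_indices, test_indices) with training data always before test data."""
    for k in range(n_folds):
        # The last fold ends at the final sample; earlier folds end one block sooner each.
        test_end = n_samples - (n_folds - 1 - k) * test_size
        test_start = test_end - test_size
        if test_start <= 0:
            raise ValueError("not enough samples for the requested folds")
        yield range(0, test_start), range(test_start, test_end)


for train_idx, test_idx in time_series_folds(n_samples=10, n_folds=3, test_size=2):
    print(list(train_idx), list(test_idx))
```

Unlike shuffled k-fold splitting, this scheme never trains on observations that come after the test block, which avoids leaking future information into the model.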
data: 1-hour resolution zone-level data of the UrbanEV dataset, cleaned through outlier detection, zero-value checks, etc. It covers 275 zones, 1,362 charging stations, and 17,532 charging piles.
- adj.csv: Adjacency matrix.
- duration.csv: Hourly EV charging duration (Unit: hour).
- e_price.csv: Electricity price (Unit: Yuan/kWh).
- inf.csv: Key information about the 275 zones, including coordinates, charging capacities, area (Unit: m^2), and perimeter (Unit: m).
- occupancy.csv: Hourly EV charging occupancy rate (Unit: %).
- s_price.csv: Service price (Unit: Yuan/kWh).
- volume.csv: Hourly EV charging volume (Unit: kWh), derived from the rated power of charging piles.
- volume-11kW.csv: An alternative vehicle-side estimate of charging volume that mitigates potential overestimation in volume.csv. Specifically, for direct current charging stations, the volume is calculated using the standard power of the most commonly used electric vehicle, Tesla Model Y (11 kW), instead of the rated power of the charging pile.
- weather_airport.csv: Weather data from the meteorological station at Bao'an Airport (Shenzhen). These are raw data, and Max-Min normalization is recommended before use.
- weather_central.csv: Weather data from Futian Meteorological Station in the city center of Shenzhen.
- weather_header.txt: Descriptions of the table headers in weather_airport.csv and weather_central.csv.
- distance.csv: Distance matrix between the 275 zones.
- poi.csv: Points of Interest categorized into three types: food and beverage services, business and residential, and lifestyle services. Coordinates are based on the WGS84 coordinate system.
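The Max-Min normalization recommended for the raw weather data can be sketched as follows. The helper names are hypothetical, and fitting the range on a training split only is a general precaution assumed here, not something the dataset documentation prescribes.

```python
def fit_min_max(train_values):
    """Return the (min, max) range observed in the training values."""
    return min(train_values), max(train_values)


def apply_min_max(values, lo, hi):
    """Scale values so that lo maps to 0 and hi maps to 1."""
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


train_temperature = [10, 20, 30]  # illustrative numbers, not taken from the dataset
lo, hi = fit_min_max(train_temperature)
print(apply_min_max(train_temperature, lo, hi))
print(apply_min_max([20, 40], lo, hi))  # unseen values may fall outside [0, 1]
```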
code-transformer: Code for distribution time-series prediction using Transformer-based models on the UrbanEV dataset. Below are explanations for some core files and directories related to UrbanEV prediction:
- dataset/st-evcdp: Contains data files used for predictions, which can be generated through ../code/preprocess.py.
- exp.bat | exp.sh: Scripts for distribution time-series prediction using Transformer-based models.
This section outlines the setup for the virtual environment required for time-series prediction based on UrbanEV, using Python 3.8 and PyTorch 2.4.1. Assuming your working directory is the project root directory, here are the relevant commands:
Windows
cd code
init_env.bat
Linux
cd code
./init_env.sh
Due to the discontinuation of PyG Temporal, you may encounter the error ModuleNotFoundError: No module named 'torch_geometric.utils.to_dense_adj' when running ASTGCN experiments. To resolve this, change the import statement from torch_geometric.utils.to_dense_adj import to_dense_adj to from torch_geometric.utils import to_dense_adj in the file named in the traceback. See pyg-#9023 (reply in thread) for more details.
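If you prefer to script this fix rather than edit by hand, the sketch below performs the same textual substitution. The helper names are hypothetical, and the file to patch is whichever one your traceback reports; no particular path is assumed.

```python
from pathlib import Path

OLD_IMPORT = "from torch_geometric.utils.to_dense_adj import to_dense_adj"
NEW_IMPORT = "from torch_geometric.utils import to_dense_adj"


def patch_source(text):
    """Return the source text with the outdated import replaced."""
    return text.replace(OLD_IMPORT, NEW_IMPORT)


def patch_file(path):
    """Rewrite the file in place; return True if anything changed."""
    path = Path(path)
    original = path.read_text(encoding="utf-8")
    patched = patch_source(original)
    if patched != original:
        path.write_text(patched, encoding="utf-8")
    return patched != original


print(patch_source(OLD_IMPORT + "\n"))
```

Call patch_file with the path shown in the ModuleNotFoundError traceback. Running it a second time is harmless because the old import no longer appears in the file.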
The following commands assume your working directory is the project root directory.
Traditional and deep learning-based model
cd code
conda activate UrbanEV
python main.py --model=fcnn --pre_len=3 --fold=1 --pred_type=region --add_feat=None --feat occ --epoch 1
Transformer-based model
cd code-transformer
conda activate UrbanEV
python run.py --task_name long_term_forecast --is_training 1 --root_path ./dataset/UrbanEV/ --data_path occ-e.csv --model_id st-evcdp_12_3 --model TimeXer --data custom --features M --seq_len 12 --label_len 12 --e_layers 1 --factor 1 --enc_in 608 --dec_in 608 --c_out 304 --d_model 512 --d_ff 512 --des Exp --batch_size 32 --learning_rate 0.001 --itr 1 --train_epochs 1 --pred_len 3 --pred_type region --add_feat None --fold 3 --feat occ
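The options --seq_len 12 and --pred_len 3 in the command above suggest a sliding-window setup: 12 past steps as input and 3 future steps as target. The sketch below shows how such windows could be cut from a series. It assumes that reading of the two options and is not taken from the repository's data loader; the function name is hypothetical.

```python
def sliding_windows(series, seq_len, pred_len):
    """Split a series into (history, target) pairs of lengths seq_len and pred_len."""
    samples = []
    for start in range(len(series) - seq_len - pred_len + 1):
        history = series[start:start + seq_len]
        target = series[start + seq_len:start + seq_len + pred_len]
        samples.append((history, target))
    return samples


hourly_occupancy = list(range(20))  # placeholder values standing in for one zone
windows = sliding_windows(hourly_occupancy, seq_len=12, pred_len=3)
print(len(windows))
print(windows[0])
```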
Windows
cd code
conda activate UrbanEV
exp.bat
Linux
cd code
conda activate UrbanEV
./exp.sh
Windows
cd code
conda activate UrbanEV
python preprocess.py
cd ..
cd code-transformer
exp.bat
Linux
cd code
conda activate UrbanEV
python preprocess.py
cd ..
cd code-transformer
./exp.sh
The project is based on the following time series forecasting repositories, from which you can explore more about the models and methods by clicking on the respective links:
- Time Series Library (TSLib): https://github.com/thuml/Time-Series-Library
- PyG Temporal: https://github.com/benedekrozemberczki/pytorch_geometric_temporal
More updates will be posted in the near future. Thank you for your interest!