Repository of openly available wind turbine SCADA datasets with high-level descriptions, reusable data loaders for convenient CSV import, and a platform for documenting insights related to data quality and malfunctions.
For questions and feedback, please reach out to: [email protected]
ID | Dataset | .ipynb | Loc | Met-mast | Trb # | Var # | Logs ✓/✗ | Labels ✓/✗ | ΔT | ∑T | Ref | Remarks/License |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | EDP Open Data | here | ESP (on) | ✓ | 5 | ~80 | ✓ | ✓1 | 10m | 2y | - | T09 removed from dataset |
2 | Winji Gearbox Challenge | ✗ | ? | ? | 5 | ~20 | ✓ | ✓2 | 10m | 3y | - | register & consent from WinJi |
3 | Kelmarsh Farm | here | UK (on) | ✗ | 6 | ~99 | ✓ | ✗ | 10m3 | 5y | - | farm info |
4 | Penmanshiel Farm | ✗ | UK (on) | ✗ | 14 | >150 | ✓ | ✗ | 10m3 | 5y | - | farm info |
5 | Ørsted Anholt Offshore | ✗ | DEN (off) | (✓)4 | 111 | ? | ? | ? | 10m | 2y | - | application/NDA; farm info |
6 | Ørsted Westermost Rough | ✗ | UK (off) | (✓)4 | 35 | ? | ? | ? | 10m | 2y | - | application/NDA; farm info |
7a | "CAREtoCompare" Windfarm B | ✗ | GER (off) | ? | 9 | 64 | ? | ✓ | 10m | 2y | - | normalized for anonymization |
7b | "CAREtoCompare" Windfarm C | ✗ | GER (off) | ? | 22 | 238 | ? | ✓ | 10m | 2y | - | normalized for anonymization |
8 | Fuhrländer Farm | ✗ | ? (on) | ✗ | 5 | 312 | ✓ | ✗ | 5m | 3y | [2] | Eclipse Public License v2.0 |
9a | DSforWind Windfarm 1a | ✗ | ? (on) | ✓6 | 4 | 7 | ✗ | ✗ | 10m | 1y | - | - |
9b | DSforWind Windfarm 1b | ✗ | ? (off) | ✓6 | 2 | 7 | ✗ | ✗ | 10m | 1y | - | - |
9c | DSforWind Windfarm 2a | ✗ | ? (on) | ✓6 | 2 | 7 | ✗ | ✗ | 10m | 1y | - | - |
9d | DSforWind Windfarm 2b | ✗ | ? (off) | ✓6 | 2 | 7 | ✗ | ✗ | 10m | 1y | - | - |
10 | PCWG Data Sets | ✗ | ? (on) | ✓ | 3 | 1 | ✗ | ✗ | 10m | 1y | - | - |
11 | Norrekaer Windfarm | ✗ | DEN (on) | ✓ | 41 | 3 | ✗ | ✗ | 10m | 1.5y | [3] | farm info |
11 | Delabole Windfarm | ✗ | UK (on) | ✓ | 10 | 1 | ✗ | ✗ | 10m | 1y | [4] | farm info |
12 | Dundalk IoT | ✗ | IRE (on) | ✗ | 1 | 20 | ✗ | ✓7 | 10m | 14y | - | urban terrain |
13 | Kaggle Wind Turbine | ✗ | TUR (on) | ✗ | 1 | 4 | ✗ | ✗ | 10m | 1y | - | - |
14 | Small São Paulo | ✗ | BRZ (on) | ✗ | 1 | ~40 | ✗ | ✗ | 1m | 5y | - | small, urban turbine |
15 | Björkö Wind Turbine | ✗ | SWE (on) | ✗ | 1 | 68 | ✗ | ✗ | 1s | 1y | - | small; turbine info |
16 | IET-OST Turbine | ✗ | SUI (on) | ✗ | 1 | 15 | ✗ | ✗ | 1s | 1.5y | - | small; turbine info |
17 | Pedra do Sal Wind Farm | ✗ | BRZ (on) | ✓ | 20 | ~40 | ✗ | ✗ | 10m | 1y | - | farm info |
18 | Beberibe Wind Farm | ✗ | BRZ (on) | ✓ | 32 | ~40 | ✗ | ✗ | 10m | 1y | - | farm info |
19 | SMARTEOLE Wind Farm | ✗ | FRA (on) | ✓ | 7 | ~40 | ✓ | ✗ | 1m | 4m | [5] | wake steering; farm info |
98 | Engie La Haute Borne | ✗ | FRA (on) | ✗ | 4 | ~80 | ✗ | ✗ | 10m | 8y | - | offline; farm info |
99 | Levenmouth Turbine | ✗ | UK (near) | ✓ | 1 | >500 | ✓ | ✗ | 10m/1s | 3y | - | not for free (~2000 £) |
✗ = no / ✓ = yes
The Jupyter notebooks in the 'notebooks' folder contain a data loader for SCADA signals, logs, and operator annotations, as well as community annotations (see the following sections). The .ipynb column of Table 1 indicates whether a notebook has already been added for the respective dataset. The notebooks also produce an overview of each dataset, such as the one shown in the following image:
For each turbine, there is also an 'Overview Cockpit' with a power curve plot, a wind rose, and the data availability over time. An example is shown here:
Lastly, operator annotations are listed if they are part of the dataset. See, for example, T01 of the EDP dataset:
To run the notebooks yourself, please add the respective .csv files to the data folder.
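The data-availability panel of the overview cockpit can be approximated with a few lines of plain Python. The following is only a minimal sketch, not the notebooks' actual implementation: the function name `monthly_availability` is hypothetical, and it assumes a regular sampling interval (10 minutes here, as for most datasets in the table) and that a sample counts as available whenever its timestamp is present.

```python
from collections import Counter
from datetime import datetime, timedelta


def monthly_availability(timestamps, start, stop, step=timedelta(minutes=10)):
    """Return {(year, month): fraction of expected samples that are present}.

    `timestamps` holds the datetimes of the available samples, `start`
    (inclusive) and `stop` (exclusive) bound the period of interest, and
    `step` is the assumed sampling interval.
    """
    present = set(timestamps)
    expected = Counter()
    found = Counter()
    t = start
    while t < stop:
        key = (t.year, t.month)
        expected[key] += 1
        if t in present:
            found[key] += 1
        t += step
    return {key: found[key] / expected[key] for key in expected}


# Synthetic example: one day of 10-minute data with the first six hours missing.
day_start = datetime(2020, 1, 1)
day_stop = datetime(2020, 1, 2)
samples = [day_start + timedelta(minutes=10 * i) for i in range(36, 144)]
availability = monthly_availability(samples, day_start, day_stop)
```

For real data, the timestamps would come from the time column of the respective .csv file, and rows with missing signal values may need to be excluded first.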
We want to enable researchers to build upon the findings of others who have previously worked with a dataset. For every dataset, we have set up a community-annotation folder containing simple CSV files that collect observations related to data quality or malfunctions. They contain the following columns:
- annot_id: unique annotation identifier (running ascending number)
- turbine_id: which turbine of the respective dataset is affected?
- signal: which signal exhibits the respective observation?
- time_start / time_stop: during which time is the observation present?
- related_log_message (optional): is there a SCADA log message that coincides with the observation?
- remarks: describe your observation in a few words.
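As a rough illustration of how such an annotation file could be read and checked, here is a minimal sketch using only the Python standard library. It is not the loader from the notebooks: the function name `read_annotations`, the comma delimiter, the timestamp format, and the split of time_start / time_stop into two separate columns are assumptions, and the example row is made up.

```python
import csv
import io
from datetime import datetime

# Column names as listed above; "related_log_message" is optional.
REQUIRED_COLUMNS = ["annot_id", "turbine_id", "signal",
                    "time_start", "time_stop", "remarks"]

# Assumed timestamp format; adjust it to match the actual annotation files.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def read_annotations(handle):
    """Parse community annotations from an open CSV file handle."""
    reader = csv.DictReader(handle)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    annotations = []
    for row in reader:
        start = datetime.strptime(row["time_start"], TIME_FORMAT)
        stop = datetime.strptime(row["time_stop"], TIME_FORMAT)
        if stop < start:
            raise ValueError(
                f"annotation {row['annot_id']}: time_stop precedes time_start")
        annotations.append({
            "annot_id": int(row["annot_id"]),
            "turbine_id": row["turbine_id"],
            "signal": row["signal"],
            "time_start": start,
            "time_stop": stop,
            # An empty or absent optional field is stored as None.
            "related_log_message": row.get("related_log_message") or None,
            "remarks": row["remarks"],
        })
    return annotations


# Made-up example row, for illustration only.
EXAMPLE_CSV = """annot_id,turbine_id,signal,time_start,time_stop,related_log_message,remarks
1,T01,example_signal,2017-01-01 00:00:00,2017-01-02 00:00:00,,sensor value frozen
"""

example_annotations = read_annotations(io.StringIO(EXAMPLE_CSV))
```

Validating the columns and the time interval up front keeps malformed contributions from silently distorting downstream plots.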
The respective notebooks automatically load and display the annotated observations. See, for example, T01 of the EDP dataset:
We welcome contributions to expand the collection of open datasets in this repository as well as community annotations for the respective datasets.
Many of the datasets listed above are described and analysed in [1].