Multiscale-attention Masked Autoencoder for Missing Data Imputation of Wind Turbines

Yuwei Fan, Chenlong Feng, Rui Wu, Chao Liu, Dongxiang Jiang
DOI: https://doi.org/10.1016/j.knosys.2024.112114
IF: 8.139
2024-01-01
Knowledge-Based Systems
Abstract: High-quality data is essential for the effective operation and maintenance of wind farms. However, missing data is a persistent issue in supervisory control and data acquisition (SCADA) systems and seriously degrades data quality. Current missing data imputation methods have two limitations: a gap between the training task and the imputation task, and inadequate extraction of the correlations within SCADA data. To address both, this work proposes a data-driven framework named multiscale-attention masked autoencoder (MAMAE) for missing data imputation of wind turbines. MAMAE employs masked autoencoding as a self-supervised training method, which bridges the gap between the training and imputation tasks. Because correlations are important when imputing SCADA data, a multiscale attention architecture built upon the transformer is employed. It comprises four transformer stages, each applying attention at a distinct scale, so that feature, turbine, and temporal correlations are extracted efficiently. To limit the computational cost that grows with sequence length at the different scales, localized attention is computed in shifted windows, reducing the computational complexity from quadratic to linear in the sequence length. Furthermore, a turbine correlation-based feature combination method is proposed to work with the multiscale attention and introduce turbine correlations into the imputation process. Experiments were conducted on a SCADA dataset collected from a real-world wind farm. The results show that the proposed method achieves higher accuracy than existing methods in most cases (especially with band missing and feature missing), and the ablation experiments verify that each proposed modification improves accuracy or efficiency.
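The two mechanisms named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: all function names are hypothetical, uniform random masking stands in for whatever masking scheme the paper uses, and scoring the loss only on hidden cells is an assumption borrowed from common masked-autoencoder practice. The last two functions simply count query-key pairs to show why attention restricted to fixed-size windows scales linearly with sequence length.

```python
import random


def random_mask(n_rows, n_cols, ratio, seed=0):
    """Return a 0/1 grid; 1 marks a cell hidden from the model.

    Assumption: cells are hidden uniformly at random. The paper's own
    masking scheme is not described in the abstract.
    """
    rng = random.Random(seed)
    cells = [(r, c) for r in range(n_rows) for c in range(n_cols)]
    hidden = set(rng.sample(cells, int(ratio * len(cells))))
    return [[1 if (r, c) in hidden else 0 for c in range(n_cols)]
            for r in range(n_rows)]


def masked_mse(target, prediction, mask):
    """Mean squared error over hidden cells only (assumed objective)."""
    errors = [(t - p) ** 2
              for t_row, p_row, m_row in zip(target, prediction, mask)
              for t, p, m in zip(t_row, p_row, m_row)
              if m]
    return sum(errors) / len(errors) if errors else 0.0


def global_attention_pairs(n):
    """Query-key pairs when every token attends to every token."""
    return n * n


def windowed_attention_pairs(n, window):
    """Query-key pairs when attention stays inside non-overlapping windows.

    Assumes n is a multiple of window: n / window windows, each costing
    window * window pairs, so n * window pairs in total.
    """
    return (n // window) * window * window


if __name__ == "__main__":
    mask = random_mask(4, 5, ratio=0.5)
    print(sum(map(sum, mask)), "of 20 cells hidden")
    for n in (256, 1024, 4096):
        print(n, global_attention_pairs(n), windowed_attention_pairs(n, 8))
```

With the window size held fixed, quadrupling the sequence length quadruples the windowed pair count but multiplies the global pair count by sixteen. Shifting the windows between layers, as the abstract describes, lets information cross window boundaries without changing this count.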