Country-wide German hourly wind power dataset mined to provide insight to predictions and forecasts with optimized data-matching machine learning

David A. Wood
DOI: https://doi.org/10.1016/j.ref.2020.06.005
2020-09-01
Renewable Energy Focus
Abstract: German wind power (MW) generated on an hourly basis is evaluated in terms of weather, environmental and market-related inputs using 8784 data records for the year 2016. An optimized data-matching algorithm achieves high accuracy both in predicting MW from a diverse set of inputs across the year and in forecasting MW in the short term using only historical inputs. The algorithm demonstrates its value by enabling detailed, auditable data mining of every data record's prediction and forecast, and by providing forensic knowledge of the dataset. The method initially identifies the most closely matching data records using squared differences between each of the inputs included in its feature selection. Subsequently, it applies standard and memetic optimizers to minimize the squared errors of its MW forecasts by tuning a small subset of only ∼150 data records (∼1.7% of all data records). It then selects the best solution based on evaluations with a set of data records held independently from the tuning process. The accuracy achieved when the best solution is applied to all data records is RMSE = 791.4 MW and R² = 0.988. The dataset has a few significant MW prediction outliers; detailed auditable analysis identifies and explains their causes. The method also provides high forecasting accuracy and data insight for short-term time series data. It achieves one-hour-ahead (t + 1) forecasting accuracy of RMSE = 879.5 MW and R² = 0.9802, and three-hour-ahead (t + 3) forecasting accuracy of RMSE = 1880.1 MW and R² = 0.9095, using just 1000 h of historical data records.
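The data-matching step described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation. The function names, the inverse-distance averaging of matched records, the fixed input weights and the toy data are all assumptions made for illustration. In the paper the tuning is done by standard and memetic optimizers, which are not reproduced here. The R² helper uses the coefficient-of-determination definition, which may differ from the definition used in the paper.

```python
import math


def match_predict(query, records, targets, weights, k=3):
    """Predict a target for `query` from its k closest matching records.

    Closeness is the weighted sum of squared differences across inputs.
    The prediction is an inverse-distance weighted mean of the matched
    targets (an assumed combination rule, chosen for illustration).
    """
    scored = []
    for rec, y in zip(records, targets):
        dist = sum(w * (q - r) ** 2 for w, q, r in zip(weights, query, rec))
        scored.append((dist, y))
    scored.sort(key=lambda pair: pair[0])
    top = scored[:k]
    eps = 1e-12  # avoids division by zero on an exact match
    inv = [1.0 / (dist + eps) for dist, _ in top]
    return sum(i * y for i, (_, y) in zip(inv, top)) / sum(inv)


def rmse(actual, predicted):
    """Root mean squared error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Toy, made-up records: (normalized wind speed, normalized temperature).
records = [(0.1, 0.5), (0.2, 0.4), (0.8, 0.6), (0.9, 0.5)]
targets = [1000.0, 2000.0, 8000.0, 9000.0]  # hypothetical MW values
weights = (1.0, 0.1)  # hypothetical input weights; an optimizer would tune these

prediction = match_predict((0.85, 0.55), records, targets, weights, k=2)
print(prediction)
```

Because every prediction is built from a short, explicit list of matched records, each result can be traced back to the records that produced it. This corresponds to the auditability the abstract emphasizes.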