A Data Fusion and Data Cleaning System for Smart Grids Big Data.

Zhining Lv, Wei Deng, Zhihan Zhang, Ningxuan Guo, Gangfeng Yan
DOI: https://doi.org/10.1109/ispa-bdcloud-sustaincom-socialcom48970.2019.00119
2019-01-01
Abstract: The smart grid is considered one of the important technical areas for big data applications. Drawing on the massive data generated by smart grids, this paper proposes a system that provides support services for data mining, including data fusion, data cleaning, and other data preparation and pre-processing services. For typical application scenarios in the power industry, the system performs targeted automatic selection and fusion on pre-processed power big data. For data cleaning, it proposes a general and effective cleaning solution based on commonly used data mining requirements. To improve the efficiency and accuracy of data cleaning, an algorithm for missing value verification based on machine learning is proposed. Test results show that the proposed system achieves an accuracy of 100%. After a comprehensive comparison of several typical machine learning algorithms, Support Vector Machine (SVM) is judged the most suitable algorithm for data cleaning verification.
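As a rough illustration of the two services the abstract names, the following minimal sketch fuses readings from two sources and then flags the values that remain missing. It is not the authors' system: the record layout keyed by (meter ID, timestamp), the source names `source_a` and `source_b`, and the functions `fuse` and `missing_keys` are all hypothetical. The machine-learning verification step is not sketched, since the abstract gives no details of it.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical record layout: (meter_id, timestamp) -> reading, None if absent.
Key = Tuple[str, str]
Records = Dict[Key, Optional[float]]


def fuse(primary: Records, secondary: Records) -> Records:
    """Merge two sources by key, falling back to the secondary source
    when the primary has no value for that key."""
    fused: Records = {}
    for key in sorted(primary.keys() | secondary.keys()):
        value = primary.get(key)
        if value is None:
            value = secondary.get(key)
        fused[key] = value
    return fused


def missing_keys(records: Records) -> List[Key]:
    """Return the keys whose reading is still missing after fusion."""
    return [key for key, value in records.items() if value is None]


# Illustrative data only; these values are made up for the sketch.
source_a: Records = {
    ("m1", "2019-01-01T00:00"): 3.2,
    ("m1", "2019-01-01T00:15"): None,
}
source_b: Records = {
    ("m1", "2019-01-01T00:15"): 3.4,
    ("m1", "2019-01-01T00:30"): None,
}

fused = fuse(source_a, source_b)
gaps = missing_keys(fused)
```

Here the gap at 00:15 in `source_a` is filled from `source_b`, while the 00:30 reading is absent from both and stays flagged as missing. A verification step like the one the abstract describes would presumably operate on such flagged entries.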