Evaluation of data imputation approaches for multi-stream building systems data
Ojas Pradhan, Jin Wen, David Hälleberg, Zhelun Chen, Noresh Varman, Jiajing Huang, Teresa Wu, K. Selçuk Candan, Zheng O’Neill
DOI: https://doi.org/10.1080/23744731.2024.2351311
2024-05-25
Science and Technology for the Built Environment
Abstract: Advances in building digitization, smart sensing, and metering technologies have allowed large amounts of time-series data to be collected for monitoring, analyzing, and controlling building systems. However, due to sensor or communication failures, the collected data are often incomplete and of poor quality. Data imputation approaches that replace missing values, specifically those based on either statistical or predictive models, have been widely adopted for multivariate datasets in other domains. It is hence of interest to find an effective way to impute time-series data collected from a building system. In this paper, we evaluate multiple data imputation approaches using data collected from a medium-sized building in Stockholm, Sweden, and a small commercial building from the ASHRAE RP-1312 research project. Sensors with widely varying characteristics from the case study buildings were selected to evaluate the imputation methods. The imputation accuracy and the impact of each chosen imputation method on information entropy, short-term building forecasting model performance, and fault detection strategy were evaluated. Results demonstrate that incorporating time-lagged cross-correlations within a k-nearest neighbor (kNN) model provides the most accurate imputations without affecting the quality of subsequent data analysis.
engineering, mechanical; thermodynamics; construction & building technology
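
To make the idea named in the abstract more concrete, the following is a minimal sketch of how time-lagged cross-correlations might be combined with kNN imputation. It is not the authors' implementation: the function names (`shift`, `best_lag`, `lagged_knn_impute`), the lag search range, the choice of k, and the synthetic sine data are all assumptions made for illustration. Each auxiliary stream is shifted by the lag at which it correlates most strongly with the target stream, and missing target values are then filled with the mean of the k nearest observed timestamps in that lag-aligned feature space.

```python
import numpy as np

def shift(x, lag):
    """Return y with y[t] = x[t - lag], padding the ends with NaN."""
    out = np.full(len(x), np.nan)
    if lag > 0:
        out[lag:] = x[:-lag]
    elif lag < 0:
        out[:lag] = x[-lag:]
    else:
        out[:] = x
    return out

def best_lag(target, other, max_lag):
    """Lag in [-max_lag, max_lag] maximizing |correlation| with the target."""
    best, best_corr = 0, -1.0
    for lag in range(-max_lag, max_lag + 1):
        shifted = shift(other, lag)
        ok = ~np.isnan(target) & ~np.isnan(shifted)
        if ok.sum() < 3 or np.std(target[ok]) == 0 or np.std(shifted[ok]) == 0:
            continue  # too little overlap, or a constant segment
        corr = abs(np.corrcoef(target[ok], shifted[ok])[0, 1])
        if corr > best_corr:
            best, best_corr = lag, corr
    return best

def lagged_knn_impute(target, others, k=3, max_lag=5):
    """Fill NaNs in `target` using kNN over lag-aligned auxiliary streams."""
    target = np.asarray(target, dtype=float)
    others = [np.asarray(o, dtype=float) for o in others]
    lags = [best_lag(target, o, max_lag) for o in others]
    feats = np.column_stack([shift(o, lag) for o, lag in zip(others, lags)])
    # Standardize each feature so no single stream dominates the distance.
    sd = np.nanstd(feats, axis=0)
    sd[sd == 0] = 1.0
    feats = (feats - np.nanmean(feats, axis=0)) / sd
    feat_ok = ~np.isnan(feats).any(axis=1)
    donor_idx = np.flatnonzero(feat_ok & ~np.isnan(target))
    filled = target.copy()
    if len(donor_idx) == 0:
        return filled, lags  # nothing to borrow from
    for t in np.flatnonzero(feat_ok & np.isnan(target)):
        dist = np.linalg.norm(feats[donor_idx] - feats[t], axis=1)
        nearest = donor_idx[np.argsort(dist)[:k]]
        filled[t] = target[nearest].mean()
    return filled, lags

# Synthetic demo (hypothetical data): the target follows the auxiliary
# stream with a delay of 3 samples, and 10 target samples are missing.
base = np.sin(0.2 * np.arange(203))
other = base[3:]
truth = 2.0 * base[:200] + 1.0
target = truth.copy()
target[50:60] = np.nan
filled, lags = lagged_knn_impute(target, [other], k=3, max_lag=5)
```

In this toy setup the lag search recovers the 3-sample delay before neighbors are chosen. A plain kNN on unshifted streams would compare timestamps whose readings are out of phase, which is the effect the lag alignment is meant to avoid.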