Progressive Cleaning and Mining of Uncertain Smart Water Meter Data

Milad Khaki
DOI: https://doi.org/10.1201/9781003269793-26
2022-04-05
Abstract: Several municipalities have recently installed wireless ‘smart’ water meters that enable functionalities such as demand response, leak alerts, identification of characteristic demand patterns, and detailed consumption analysis. Achieving these benefits requires error-free meter data, which is not necessarily available in practice because some ‘dirtiness’ or ‘uncertainty’ in the data is mostly unavoidable. This chapter investigates solutions for mining uncertain data for reliable results and evaluates the impact of dirty data on data analysis results. The evaluation results can be used for informed decision-making on water planning strategies. A systematic study of the errors in large-scale smart water meter deployments is performed to help understand the nature of these errors. Identifying customers contributing to a consumption peak is used as the primary filter in this study. Its outputs are combined with domain expert knowledge to evaluate their accuracy, validity, and potential errors. Each error is analyzed, and its source is investigated. This procedure is applied progressively to ensure that all detectable errors are discovered and characterized in the data model. The proposed approach is evaluated using smart water meter consumption data obtained from Abbotsford, British Columbia, Canada. The results illustrate the sensitivity of the selected filter to the errors. This chapter highlights the detrimental effects of data errors in reducing the benefits of big data. The impact of uncertain data on identifying customers contributing to a peak load is examined to evaluate the data quality. The proposed progressive approach helps to determine errors and their origins and to find solutions to remove them. The chapter provides a generalized model of smart water meter infrastructure.
Information about the case study, the City of Abbotsford, is provided, and the structure and schema of the employed dataset are discussed. The progressive data cleaning approach is presented, together with the main data quality issues encountered in this study and the solutions adopted or developed for them. As the final part of the case study, the results of using the cleaned dataset are presented, and the sensitivity of these results to errors is examined.
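
To make the idea of a peak-contributor filter and its sensitivity to dirty data concrete, the following is a minimal sketch and not the chapter's actual implementation. The record layout `(meter_id, hour, litres)`, the function names `peak_contributors` and `drop_implausible`, the plausibility threshold, and the sample readings are all assumptions made for illustration only.

```python
from collections import defaultdict


def peak_contributors(readings, top_n=3):
    """Find the hour with the highest total consumption and rank the
    meters that contributed most during that hour.

    `readings` is a list of (meter_id, hour, litres) tuples; this layout
    is an assumption, not the schema of the Abbotsford dataset.
    """
    total_by_hour = defaultdict(float)
    for _meter_id, hour, litres in readings:
        total_by_hour[hour] += litres
    peak_hour = max(total_by_hour, key=total_by_hour.get)

    usage_at_peak = defaultdict(float)
    for meter_id, hour, litres in readings:
        if hour == peak_hour:
            usage_at_peak[meter_id] += litres
    ranked = sorted(usage_at_peak.items(), key=lambda kv: kv[1], reverse=True)
    return peak_hour, ranked[:top_n]


def drop_implausible(readings, max_litres_per_hour):
    """One hypothetical cleaning step: discard negative readings and
    readings above an assumed physical upper bound."""
    return [r for r in readings if 0 <= r[2] <= max_litres_per_hour]


# Made-up sample data; the last record simulates a faulty spike.
readings = [
    ("A", 6, 120.0), ("B", 6, 90.0), ("C", 6, 60.0),
    ("A", 7, 150.0), ("B", 7, 110.0), ("C", 7, 70.0),
    ("A", 8, 100.0), ("B", 8, 80.0), ("C", 8, 99999.0),
]

dirty_peak = peak_contributors(readings)
clean_peak = peak_contributors(
    drop_implausible(readings, max_litres_per_hour=5000.0)
)
```

In this toy data, the single faulty reading moves the detected peak from hour 7 to hour 8 and places meter C at the top of the ranking, which is the kind of sensitivity to errors the abstract describes. In a progressive workflow, each such suspicious output would be reviewed with domain knowledge, the underlying error characterized, and the filter rerun on the cleaned data.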