Automated Data Cleaning For Data Centers: A Case Study

Syed Naeem Haider, Qianchuan Zhao, Bushra Kainat Meran
DOI: https://doi.org/10.23919/CCC50068.2020.9189357
2020-01-01
Abstract: Preprocessing raw data is a critical stage in machine learning; its fundamental objective is to prepare a clean, error-free data set for data analytics algorithms. Transforming raw data into clean data is a basic requirement in industrial and commercial sectors, but it poses many challenges that have to be addressed individually and manually. Because no unified framework incorporates all the steps required to transform raw data into clean data, manual transformation is ineffective and very time-consuming. We discuss a case study of cleaning data in a data center, comparing forecast-based and mean-value replacement for the missing-value filling problem, and propose an automated data preprocessing framework for data cleaning. The proposed framework successfully cleans data sets automatically instead of dealing with multiple problems separately and manually.
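To make the two missing-value strategies named in the abstract concrete, here is a minimal Python sketch. It is not the authors' framework: the function names, the `window` parameter, and the trailing-average forecast are assumptions chosen for illustration, and the abstract does not say which forecasting method the paper actually uses.

```python
from statistics import mean
from typing import List, Optional

# A reading is a float, or None when the value is missing.
Series = List[Optional[float]]


def fill_with_mean(series: Series) -> List[float]:
    """Replace every missing value with the mean of all observed values."""
    observed = [x for x in series if x is not None]
    if not observed:
        raise ValueError("series has no observed values")
    m = mean(observed)
    return [m if x is None else x for x in series]


def fill_with_forecast(series: Series, window: int = 3) -> List[float]:
    """Replace every missing value with a trailing-average forecast.

    Hypothetical stand-in for a real forecasting model: each gap is filled
    with the mean of up to `window` preceding values (observed or already
    filled). A gap at the very start falls back to the first observed value.
    """
    observed = [x for x in series if x is not None]
    if not observed:
        raise ValueError("series has no observed values")
    filled: List[float] = []
    for x in series:
        if x is None:
            history = filled[-window:]
            x = mean(history) if history else observed[0]
        filled.append(x)
    return filled


if __name__ == "__main__":
    # Made-up sensor readings with two gaps, for illustration only.
    readings: Series = [20.0, 22.0, None, 26.0, None]
    print(fill_with_mean(readings))
    print(fill_with_forecast(readings))
```

Mean replacement fills every gap with the same global value, while the forecast variant follows the local trend of the series. That difference is the kind of trade-off such a comparison would examine.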