Pre-Processing: A Data Preparation Step

Swarup Roy, Pooja Sharma, Keshab Nath, Dhruba K. Bhattacharyya, Jugal K. Kalita
DOI: https://doi.org/10.1016/b978-0-12-809633-8.20457-3
2019-01-01
Abstract: Large real-life datasets are inherently noisy and are not directly suitable for analysis. The effectiveness of any data analytics or machine learning method is highly sensitive to the quality of its input data, which often contain errors, missing values, and inconsistencies. Biological databases are no exception. Hence, for effective biological data analysis, the data must first pass through a phase called data preprocessing. In this article, we discuss the phases and steps of data preprocessing required for any analysis of large databases, including biological repositories. We briefly highlight the techniques proposed for the different types of preprocessing tasks.