Data preprocessing techniques: emergence and selection towards machine learning models - a practical review using HPA dataset
K Mallikharjuna Rao, Ghanta Saikrishna, Kundrapu Supriya
DOI: https://doi.org/10.1007/s11042-023-15087-5
IF: 2.577
2023-03-19
Multimedia Tools and Applications
Abstract: The House Price Index (HPI) is an effective indicator for tracking frequent changes in housing prices. Current house prices are affected by factors such as house configuration, building class, and air conditioning quality, and various methodologies are used to process this data. A growing number of research papers apply classical machine learning approaches to estimate house sale prices accurately. However, they rarely consider the data preprocessing techniques that make the data suitable for building more accurate house price forecasting models. This research covers a wide variety of data preprocessing techniques. It examines missingness of data, missing data handling, categorical feature encoding, discretization, outliers, and feature scaling in detail, with the aim of building efficient predictive models. It presents comprehensive arguments on the advantages and disadvantages of prevailing data preprocessing techniques under various distribution scenarios of the variables in the house price data. The conclusions are intended to support the evolution of modern data-driven research in machine learning.
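As a rough illustration of four of the preprocessing steps named in the abstract (missing data handling, outlier treatment, feature scaling, and categorical encoding), the following minimal Python sketch uses only the standard library. It is not taken from the paper: the function names, the toy `areas` and `building_class` values, and the specific choices (median imputation, clipping to Tukey fences, min-max scaling, one-hot encoding) are assumptions made for illustration.

```python
from statistics import median, quantiles

def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def clip_outliers_iqr(values, k=1.5):
    """Clip values to the Tukey fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]

def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def one_hot(categories):
    """Encode each category as a 0/1 vector over the sorted distinct levels."""
    levels = sorted(set(categories))
    return [[1 if c == level else 0 for level in levels] for c in categories]

# Hypothetical toy data, not drawn from the paper's dataset.
areas = [50, 60, None, 70, 80, 1000]            # one missing value, one extreme value
building_class = ["A", "B", "A", "C", "B", "A"]

filled = impute_median(areas)                   # fill the gap with the median
clipped = clip_outliers_iqr(filled)             # pull the extreme value to the upper fence
scaled = min_max_scale(clipped)                 # map to [0, 1]
encoded = one_hot(building_class)               # columns ordered A, B, C
```

The order used here (impute, then clip, then scale) is one common arrangement; which technique suits a given variable depends on its distribution, which is the trade-off the abstract says the paper discusses.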
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering