Exploratory Data Analysis using Autoviz for Machine Learning Classification Problem

Praveen Gujjar J, Prasanna Kumar H R, Afsar Jahan, G. M S, Raghavendra M Devadas, Arnav Kotiyal
DOI: https://doi.org/10.1109/INNOCOMP63224.2024.00087
2024-05-25
Abstract: As machine learning continues to advance, effective exploratory data analysis (EDA) remains essential, especially for classification problems. This paper presents an automated approach to EDA using AutoViz, a Python library that automates the EDA process for machine learning tasks and generates a variety of visualizations that aid in understanding the structure and characteristics of a dataset. The study aims to demonstrate the efficacy of AutoViz in generating meaningful insights, facilitating feature understanding, and supporting data preprocessing for classification tasks. The paper also discusses how AutoViz assists in feature selection by highlighting the relevance and importance of each feature with respect to the target variable. This feature-centric analysis guides the selection of relevant input features, thereby improving model accuracy and interpretability. The results indicate that AutoViz significantly reduces the time and effort required for exploratory data analysis while maintaining the quality and depth of the insights obtained. The differences between AutoViz and PyCaret are also discussed: AutoViz automates the process of exploring and analyzing data, while PyCaret is designed for the comparison, construction, and tuning of machine learning models.
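To make the idea of ranking features by their relevance to the target variable concrete, here is a minimal sketch in plain Python. It is not AutoViz's implementation and does not call the AutoViz API. It assumes, for illustration only, that relevance is measured by the absolute Pearson correlation between each numeric feature and a binary class label. The function names (`pearson`, `rank_features`) and the toy data are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def rank_features(columns, target):
    """Return (feature, |r|) pairs, most to least correlated with the target."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data (hypothetical): two numeric features and a binary class label.
columns = {
    "feature_a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "feature_b": [3.0, 1.0, 2.0, 3.0, 1.0, 2.0],
}
target = [0, 0, 0, 1, 1, 1]

ranking = rank_features(columns, target)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy data, `feature_a` rises with the class label and ranks first, while `feature_b` is uncorrelated with it. A real EDA pass would also need to handle categorical features and non-linear relationships, which a linear correlation score does not capture.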
Computer Science