Assessment of feature selection for student academic performance through machine learning classification

R. Suguna, M. Shyamala Devi, Rupali Amit Bagate, Aparna Shashikant Joshi
DOI: https://doi.org/10.1080/09720510.2019.1609729
2019-05-19
Journal of Statistics and Management Systems
Abstract: Regression analysis is used to find trends in data. It helps identify the relationship between the dependent and independent variables in a dataset and indicates how strongly each independent variable influences the prediction of the desired outcome. Multiple linear regression builds a model with more than one predictor by identifying the statistical relationship between the predictors and the outcome. This paper evaluates the performance of multiple linear regression models and suggests a way to improve the model through feature selection. The performance of the model with and without backward elimination is analyzed on the Student Academic Performance dataset from the Kaggle repository. The optimized model is then tested with several classifiers (logistic regression, KNN, kernel SVM, Naïve Bayes, decision tree, and random forest), and its efficiency is assessed using precision, recall, F-score, and accuracy.
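To make the idea of backward elimination concrete, here is a minimal, dependency-free sketch. It is not the authors' implementation. Backward elimination is often driven by a p-value threshold, but this sketch assumes adjusted R² as the stopping criterion so that it needs no statistics library. The data is synthetic, not the Kaggle dataset, and the function names (`solve`, `adjusted_r2`, `backward_elimination`) are hypothetical.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def adjusted_r2(X, y, cols):
    """Fit least squares with an intercept on the chosen columns; return adjusted R^2."""
    rows = [[1.0] + [row[c] for c in cols] for row in X]
    p = len(cols) + 1  # number of coefficients, including the intercept
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(XtX, Xty)  # normal equations: (X'X) beta = X'y
    pred = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    n = len(y)
    mean_y = sum(y) / n
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - (ss_res / (n - p)) / (ss_tot / (n - 1))

def backward_elimination(X, y):
    """Start from all predictors; drop one at a time while adjusted R^2 improves."""
    cols = list(range(len(X[0])))
    best = adjusted_r2(X, y, cols)
    while len(cols) > 1:  # always keep at least one predictor
        # Score every model that omits exactly one of the remaining predictors.
        score, drop = max(
            (adjusted_r2(X, y, [c for c in cols if c != d]), d) for d in cols
        )
        if score <= best:
            break  # no single removal improves the model
        best = score
        cols.remove(drop)
    return cols, best

# Synthetic data (an assumption for illustration): only columns 0 and 1 drive y.
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(4)] for _ in range(200)]
y = [3 * r[0] - 2 * r[1] + random.gauss(0, 0.5) for r in X]

selected, score = backward_elimination(X, y)
print("kept columns:", selected)
print("adjusted R^2:", round(score, 4))
```

Under this criterion a predictor is removed only if dropping it raises adjusted R², so strongly informative columns are retained while uninformative ones may or may not be eliminated, depending on the sample. The retained columns would then be passed to the classifiers listed in the abstract.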