Performance Evaluation of Classification Models for Household Income, Consumption and Expenditure Data Set

Mersha Nigus, Dorsewamy
DOI: https://doi.org/10.48550/arXiv.2106.11055
IF: 5.414
2021-06-18
Machine Learning
Abstract: Food security is more prominent on the policy agenda today than it has been in the past, owing to recent food shortages at both the regional and global levels and to renewed promises from major donor countries to combat chronic hunger. One area where machine learning can be applied is the classification of household food insecurity. In this study, we establish a robust methodology for classifying a household as food secure or food insecure using machine learning algorithms. We use ten classification algorithms: Gradient Boosting (GB), Random Forest (RF), Extra Tree (ET), Bagging, K-Nearest Neighbor (KNN), Decision Tree (DT), Support Vector Machine (SVM), Logistic Regression (LR), Ada Boost (AB) and Naive Bayes (NB). We develop a data set for household food security status from HICE survey data, have it validated by domain experts, and then perform the classification tasks on it. All classifiers achieve good results across the performance metrics. Random Forest and Gradient Boosting perform best, each with a testing accuracy of 0.9997, while Bagging, Decision Tree, Ada Boost, Extra Tree, K-Nearest Neighbor, Logistic Regression, SVM and Naive Bayes score 0.9996, 0.9996, 0.9994, 0.95675, 0.9415, 0.8915, 0.7853 and 0.7595, respectively.
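To make the notion of testing accuracy concrete, the following is a minimal sketch rather than the authors' pipeline. It assumes a synthetic two-feature data set in place of the HICE survey data, a hand-written K-Nearest Neighbor classifier standing in for just one of the ten algorithms, and a simple 150/50 train/test split. The function names (`make_toy_households`, `knn_predict`, `accuracy`) and the label encoding (1 = food secure, 0 = food insecure) are hypothetical.

```python
import random


def make_toy_households(n, seed=0):
    """Generate synthetic (features, label) rows; NOT the HICE data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)  # assumed encoding: 1 = food secure, 0 = food insecure
        centre = 3.0 if label else 0.0
        # Two made-up numeric features drawn around a class-specific centre.
        features = (rng.gauss(centre, 1.0), rng.gauss(centre, 1.0))
        rows.append((features, label))
    return rows


def knn_predict(train, query, k=3):
    """Majority vote among the k training rows nearest to `query`."""
    nearest = sorted(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], query)),
    )[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


data = make_toy_households(200)
train, test = data[:150], data[150:]  # held-out rows are used only for scoring

y_true = [label for _, label in test]
y_pred = [knn_predict(train, features) for features, _ in test]
test_accuracy = accuracy(y_true, y_pred)
print(f"KNN test accuracy: {test_accuracy:.4f}")
```

Scoring only on rows the classifier never saw during fitting is what distinguishes testing accuracy from training accuracy; the same `accuracy` function could be applied to the predictions of any of the other classifiers.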