Performance Analysis of Machine Learning Based On Optimized Feature Selection for Type II Diabetes Mellitus

Salliah Shafi Bhat, Gufran Ahmad Ansari, Mohd Dilshad Ansari
DOI: https://doi.org/10.1007/s11042-024-19000-6
IF: 2.577
2024-03-28
Multimedia Tools and Applications
Abstract: Type 2 Diabetes Mellitus (T2DM) is a common chronic illness caused by variations in the secretion of insulin. Early diagnosis allows T2DM to be treated early, reducing the risk of early death and helping to control the course of the illness. In this paper, Machine Learning Algorithms (MLA) for T2DM diagnosis are proposed. The major objective of this research is to assess the use of MLA and related methods to forecast illnesses at an initial stage. We examine the most recently proposed strategies, along with their limitations and potential improvements for future work. The data was collected from Kaggle (University of California). We highlight how important it is to identify the features that improve the outcomes reported by recent methods. Six MLA, namely Logistic Regression (LR), Support Vector Machine (SVM), Naive Bayes (NB), Decision Tree (DT), Random Forest (RF) and K-Nearest Neighbour (KNN), were applied in Jupyter Notebook. Feature subsets are selected using three feature selection methods: information gain-based selection, correlation-based selection and sequential feature selection. A variety of MLA are applied to these feature subsets, and the optimal subset is chosen based on their performance. Finally, DT is suggested as the best-performing of the six models. The suggested method provides improved performance, with an average accuracy of 96.10%.
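The information gain-based selection named in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' implementation, so everything here is an assumption for illustration: the helper names (`entropy`, `information_gain`, `rank_features`), the binarized feature names, and the toy data are hypothetical and are not drawn from the paper's dataset. The sketch scores each discrete feature by how much it reduces the entropy of the class label, then keeps the top-ranked features.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Label entropy minus the weighted entropy left after splitting on a discrete feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == value]
        remainder += (len(subset) / total) * entropy(subset)
    return entropy(labels) - remainder


def rank_features(columns, labels):
    """Return (feature names sorted by descending information gain, gain per feature)."""
    gains = {name: information_gain(values, labels) for name, values in columns.items()}
    return sorted(gains, key=gains.get, reverse=True), gains


# Hypothetical toy data: illustrative only, NOT taken from the paper's dataset.
toy_columns = {
    "glucose_high": [1, 1, 1, 0, 0, 0, 1, 0],
    "bmi_high":     [1, 0, 1, 0, 1, 0, 1, 0],
    "age_over_50":  [1, 0, 1, 0, 0, 1, 0, 1],
}
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]  # 1 = diabetic, 0 = non-diabetic (made up)

ranking, gains = rank_features(toy_columns, toy_labels)
top_k = ranking[:2]  # keep the two highest-scoring features
print(ranking, top_k)
```

In this toy data `glucose_high` matches the label exactly, so it ranks first, while `age_over_50` carries no information about the label and ranks last. A real pipeline would more likely discretize continuous measurements or use a library estimator of mutual information, and would then compare classifiers on each selected subset as the abstract describes.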
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering