SMOTE-SMO-based expert system for type II diabetes detection using PIMA dataset

Huma Naz, Sachin Ahuja
DOI: https://doi.org/10.1007/s13410-021-00969-x
2021-08-17
International Journal of Diabetes in Developing Countries
Abstract:
Background: Medical data, which is critical to human existence, can be used to identify people prone to a specific complication or disease by applying appropriate data mining (DM) techniques. DM is specifically applied to extract details for the diagnosis, prediction, prevention, and treatment of various diseases. According to the International Diabetes Federation (IDF) 2019 atlas report, diabetes caused 4.2 million deaths worldwide; hence, it is critical to diagnose diabetes at an early stage.
Material and method: Although many techniques are available for diagnosing diabetes, they do not find hidden patterns with the accuracy needed for correct decision-making. This paper therefore presents an integrated approach that combines the synthetic minority oversampling technique (SMOTE) and the sequential minimal optimization (SMO) algorithm for predicting diabetes. In the proposed two-phase classification model, the first step is pre-processing of the data using the SMOTE algorithm, and the second step is classification with the SMO classifier. The output of the pre-processing step is given to SMO to increase the performance of the classifier.
Result: The proposed classification model achieved an accuracy of 99.07% on the PIMA Indian diabetes dataset (PIDD). The PIDD was taken from the UCI repository for this work; however, the dataset is owned by the National Institute of Diabetes and Digestive and Kidney Diseases. The dataset contains records of 768 female patients, each with eight numeric attributes and one decision class attribute.
Conclusion: The output of the study confirms that the proposed integrated DM approach could be used as an expert system for diagnosing diabetes in patients at an early stage. The features extracted in this study will be used to develop a prognostic tool, in the form of a mobile application, for early diabetes detection.
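The first phase of the two-phase model described above, SMOTE pre-processing, can be illustrated with a minimal sketch. This is not the authors' implementation: it is a simplified, standard-library-only variant that picks base samples at random rather than iterating over every minority sample, and the function name `smote`, its parameters, and the toy two-feature data are all assumptions made for illustration. The second phase (training a support vector classifier with SMO on the balanced data) is not shown and would normally be delegated to an existing library.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbours.

    Simplified illustration of the SMOTE idea (hypothetical helper).
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples

    def sq_dist(a, b):
        # Squared Euclidean distance between two feature vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # For each minority sample, store the indices of its k nearest neighbours.
    neighbours = []
    for i, point in enumerate(minority):
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: sq_dist(point, minority[j]))
        neighbours.append(others[:k])

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))   # base sample
        j = rng.choice(neighbours[i])      # one of its nearest neighbours
        gap = rng.random()                 # position along the connecting segment
        base, neigh = minority[i], minority[j]
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neigh)])
    return synthetic


# Toy data (made up for illustration, not taken from the PIDD).
majority = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.1, 0.0], [0.0, 0.3]]
minority = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1]]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
```

Because each synthetic point lies on the segment between two existing minority samples, the oversampled class stays inside the region already occupied by the minority class instead of duplicating records exactly. The balanced data would then be passed to the classifier in the second phase.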
endocrinology & metabolism