The utility of a machine learning model in identifying people at high risk of type 2 diabetes mellitus

Abdullah Alkattan, Abdullah Al-Zeer, Fahad Alsaawi, Alanoud Alyahya, Raghad Alnasser, Raoom Alsarhan, Mona Almusawi, Deemah Alabdulaali, Nagla Mahmoud, Rami Al-Jafar, Faisal Aldayel, Mustafa Hassanein, Alhan Haji, Abdulrahman Alsheikh, Amal Alfaifi, Elfadil Elkagam, Ahmed Alfridi, Amjad Alfaleh, Khaled Alabdulkareem, Nashwa Radwan, Edward W Gregg
DOI: https://doi.org/10.1080/17446651.2024.2400706
Abstract: Background: According to previous reports, a very high percentage of individuals in Saudi Arabia have undiagnosed type 2 diabetes mellitus (T2DM). Although several screening and awareness campaigns have been conducted, these efforts lacked full accessibility and consumed extensive human and material resources. Developing machine learning (ML) models could therefore enhance the population-based screening process. This study aims to compare the outcomes of a newly developed ML model with those of the validated American Diabetes Association (ADA) risk assessment in predicting people at high risk of T2DM. Research design and methods: Patients' age, gender, and risk factors, obtained from the National Health Information Center's dataset, were used to build and train the ML model. To evaluate the developed ML model, an external validation study was conducted in three primary health care centers. A random sample (N = 3400) was selected from the non-diabetic individuals. Results: The Receiver Operating Characteristic (ROC) curve, which plots sensitivity against 100 - specificity, yielded an area under the curve (AROC) of 0.803 (95% CI: 0.779-0.826). Conclusions: The current study presents a new ML model for population-level classification that can be an adequate tool for identifying people who are at high risk of T2DM or who already have T2DM but have not been diagnosed.
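For readers unfamiliar with the AROC metric reported in the Results, the following is a minimal sketch of how an area under the ROC curve can be computed from binary labels and risk scores. It is not the study's code: the function names `roc_points` and `auc` and the four-person toy dataset are hypothetical, chosen only for illustration, and rates are expressed as fractions (0 to 1) rather than percentages.

```python
def roc_points(labels, scores):
    """Return ROC points as (1 - specificity, sensitivity) pairs.

    labels: 1 = has the condition, 0 = does not (hypothetical toy input).
    scores: predicted risk; higher means more likely positive.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    # Sweep the decision threshold from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Handle tied scores together so each threshold yields one point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


if __name__ == "__main__":
    # Toy data only; these are not values from the study.
    toy_labels = [0, 0, 1, 1]
    toy_scores = [0.1, 0.4, 0.35, 0.8]
    print(auc(roc_points(toy_labels, toy_scores)))  # prints 0.75
```

An area of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so the reported AROC of 0.803 lies between the two.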