Patterns of high-risk drinking among medical students: A web-based survey with machine learning

Grasiela Marcon, Flávia de Ávila Pereira, Aline Zimerman, Bruno Castro da Silva, Lisia von Diemen, Ives Cavalcante Passos, Mariana Recamonde-Mendoza
DOI: https://doi.org/10.1016/j.compbiomed.2021.104747
Abstract: Background: Prior studies have found increased rates of alcohol consumption among physicians and medical students. The present study aims to build machine learning (ML) models to identify patterns of high-risk drinking (HRD), including alcohol use disorder, within this population. Methods: We analyzed data collected through a web-based survey of Brazilian medical students. Variables included sociodemographic data, personal information, university status, and mental health. Stratification for HRD was based on AUDIT-C scores. Three ML algorithms were used to build classifiers to predict HRD among medical students: elastic net regularization, random forest, and artificial neural networks. Model interpretation techniques were used to identify the predictors most influential in the models' decisions, which represent potential factors associated with HRD. Results: A total of 4840 medical students were included in the study. The prevalence of HRD was 53.03%. The three ML models distinguished individuals with HRD from those with low-risk drinking (LRD) with very similar performance. The average AUC scores in the cross-validation procedure were around 0.72, and this performance was replicated in the test set. The most important features for the ML models were the use of tobacco and cannabis, monthly family income, marital status, sexual orientation, and physical activities. Conclusions: This study proposes that ML models may serve as tools for initial screening of students regarding their susceptibility to at-risk drinking or alcohol use disorder. In addition, we identified several key factors associated with HRD that could be further investigated and explored for preventive and assistance measures.
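As an illustration of the AUDIT-C stratification step mentioned in the Methods, the following minimal Python sketch sums the three AUDIT-C items (each scored 0 to 4) and labels a respondent as HRD or LRD. The abstract does not state which cutoffs were used, so the thresholds here (4 or more for men, 3 or more for women) are an assumption based on commonly cited screening values, and the function and variable names are hypothetical rather than taken from the study.

```python
# Assumed cutoffs: commonly cited AUDIT-C screening thresholds.
# They are NOT taken from the paper; adjust to the study's actual criteria.
ASSUMED_CUTOFFS = {"male": 4, "female": 3}


def audit_c_score(frequency: int, typical_quantity: int, binge_frequency: int) -> int:
    """Sum the three AUDIT-C item scores (each 0-4) into a total of 0-12."""
    items = (frequency, typical_quantity, binge_frequency)
    for value in items:
        if not 0 <= value <= 4:
            raise ValueError(f"AUDIT-C item scores must be in 0..4, got {value}")
    return sum(items)


def is_high_risk(score: int, sex: str, cutoffs=ASSUMED_CUTOFFS) -> bool:
    """Return True when the total score meets the (assumed) cutoff for this sex."""
    return score >= cutoffs[sex]


# Example: item scores 1, 1, 1 give a total of 3.
total = audit_c_score(1, 1, 1)
label_female = "HRD" if is_high_risk(total, "female") else "LRD"
label_male = "HRD" if is_high_risk(total, "male") else "LRD"
```

Under these assumed thresholds, a total of 3 would be labelled HRD for a female respondent and LRD for a male respondent; the resulting binary label is the kind of target a classifier would then be trained to predict.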