Machine learning-based analysis of adolescent gambling factors

Wonju Seo,Namho Kim,Sang-Kyu Lee,Sung-Min Park
DOI: https://doi.org/10.1556/2006.2020.00063
2020-10-03
Abstract:Background and aims: Problem gambling among adolescents has recently attracted attention because of easy access to gambling in online environments and its serious effects on adolescent lives. We proposed a machine learning-based analysis method for predicting the degree of problem gambling. Methods: Of the 17,520 respondents in the 2018 National Survey on Youth Gambling Problems dataset (collected by the Korea Center on Gambling Problems), 5,045 students who had gambled in the past 3 months were included in this study. The Gambling Problem Severity Scale was used to provide the binary label information. After the random forest-based feature selection method, we trained four models: random forest (RF), support vector machine (SVM), extra trees (ETs), and ridge regression. Results: The online gambling behavior in the past 3 months, experience of winning money or goods, and gambling of personal relationship were three factors exhibiting the high feature importance. All four models demonstrated an area under the curve (AUC) of >0.7; ET showed the highest AUC (0.755), RF demonstrated the highest accuracy (71.8%), and SVM showed the highest F1 score (0.507) on a testing set. Discussion: The results indicate that machine learning models can convey meaningful information to support predictions regarding the degree of problem gambling. Conclusion: Machine learning models trained using important features showed moderate accuracy in a large-scale Korean adolescent dataset. These findings suggest that the method will help screen adolescents at risk of problem gambling. We believe that expandable machine learning-based approaches will become more powerful as more datasets are collected.
What problem does this paper attempt to address?