Machine Learning Methods to Identify Predictors of Psychological Distress

Yang Chen, Xiaomei Zhang, Lin Lu, Yinzhi Wang, Jiajia Liu, Lei Qin, Linglong Ye, Jianping Zhu, Ben-Chang Shia, Ming-Chih Chen
DOI: https://doi.org/10.3390/pr10051030
IF: 3.5
2022-05-23
Processes
Abstract: As the problems caused by psychological stress attract increasing attention, research on its influencing factors has become crucial. This study analyzed data from the Health Information National Trends Survey (HINTS, Cycle 3 and Cycle 4; N = 5484) and assessed the outcomes using descriptive statistics, Chi-squared tests, and t-tests. Four machine learning algorithms were applied for modeling: logistic regression (linear), random forests (RF) (ensemble), an artificial neural network (ANN) (nonlinear), and gradient boosting (GB) (ensemble). The samples were randomly split into a 50% training set and a 50% validation set. Twenty-six preselected variables from the databases were used as candidate predictors, and the four models together identified twenty predictors of psychological distress. At its core, the task is a binary classification problem: judging whether an individual has psychological distress based on many different factors. Accuracy, precision, recall, F1-score, and AUC were therefore used to evaluate model performance. The logistic regression model selected predictors by forward selection, backward selection, and stepwise regression; the other three machine learning methods identified predictors from variable importance values. Of the four machine learning models, the ANN showed the best predictive performance (AUC = 73.90%). Combining the four models identified a range of predictors of psychological distress, which could help improve the performance of existing mental health screening tools.
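The five evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `binary_metrics` and `auc`, the 0.5 decision threshold, and the eight toy labels and scores are all assumptions made purely for illustration, and no HINTS data is used. AUC is computed here from its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, with ties counted as half).

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for 0/1 labels (1 = distress)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # A correctly ordered pair counts 1, a tie counts 0.5, otherwise 0.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): true labels and model scores for eight people.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7, 0.8, 0.1]
# Assumed decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = binary_metrics(y_true, y_pred)
metrics["auc"] = auc(y_true, scores)
print(metrics)
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas AUC is computed from the raw scores and summarizes ranking quality across all thresholds, which is one reason it is commonly reported as a headline figure when comparing classifiers.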
engineering, chemical
What problem does this paper attempt to address?