Anxiety State Recognition Based on Speech Emotional Features and ECAPA-TDNN

Ying Liu, Xiaoqian Liu, Tingshao Zhu
DOI: https://doi.org/10.1117/12.3034286
2024-01-01
Abstract: Anxiety disorders not only harm patients' health and social interactions but also place a significant burden on society. This paper proposes an anxiety state recognition method based on speech emotional features and ECAPA-TDNN (emphasized channel attention, propagation, and aggregation based TDNN), aiming to explore the feasibility of large-scale screening of anxiety states using voice data. We recruited 152 participants and collected their speech data. Each participant was also asked to complete an anxiety scale (GAD-7). Subsequently, 6125 acoustic features were extracted from each participant's speech, and f-regression (a univariate linear regression test that scores each feature individually against the target) was used to screen the features for each group. Embedded features were then extracted through ECAPA-TDNN and concatenated with the screened acoustic features to form a new emotional representation. Various machine learning algorithms, including random forests, ExtraTrees regression, linear regression, and Multi-Layer Perceptron (MLP) regression, were used to develop anxiety recognition models. The results showed that the best Pearson Correlation Coefficient (PCC) between the predicted score and the GAD-7 score was 0.515. Notably, the reliability of the model exceeded 0.657, which further supports its credibility.
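The screening, fusion, and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the random `embedding` matrix is only a stand-in for ECAPA-TDNN embeddings, and the feature counts, train/test split, `k=10`, and all helper names (`f_scores`, `select_top_k`, `fit_linear`, `predict_linear`, `pearson`) are assumptions made for illustration. Only the linear regression variant is shown.

```python
import numpy as np

def f_scores(X, y):
    """Univariate F-statistic of each feature against y (the f-regression idea)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    r = (Xc * yc[:, None]).sum(axis=0) / np.where(denom == 0, 1.0, denom)
    r = np.clip(r, -0.999999, 0.999999)  # avoid division by zero below
    dof = X.shape[0] - 2
    return r ** 2 / (1.0 - r ** 2) * dof

def select_top_k(X, y, k):
    """Indices of the k features with the largest F-statistic."""
    return np.argsort(f_scores(X, y))[::-1][:k]

def fit_linear(X, y):
    """Ordinary least squares with an intercept column."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def predict_linear(w, X):
    return np.hstack([X, np.ones((X.shape[0], 1))]) @ w

def pearson(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])

# Synthetic stand-ins (assumed sizes, not the paper's data).
rng = np.random.default_rng(0)
n, n_acoustic, n_embed = 152, 500, 16
acoustic = rng.normal(size=(n, n_acoustic))   # hand-crafted acoustic features
embedding = rng.normal(size=(n, n_embed))     # placeholder for ECAPA-TDNN embeddings
scores = 2.0 * acoustic[:, 0] + embedding[:, 0] + rng.normal(scale=0.5, size=n)

train, test = np.arange(0, 120), np.arange(120, n)

# Screen acoustic features on the training split only, to avoid leakage.
keep = select_top_k(acoustic[train], scores[train], k=10)

# Concatenate screened acoustic features with the embeddings.
fused = np.hstack([acoustic[:, keep], embedding])

# Fit a regressor and report the PCC between predicted and true scores.
w = fit_linear(fused[train], scores[train])
pcc = pearson(predict_linear(w, fused[test]), scores[test])
print(f"PCC on held-out participants: {pcc:.3f}")
```

Screening on the training split alone is a design choice of this sketch; with thousands of candidate features and about 150 participants, selecting features on the full data would inflate the reported correlation.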