Speech-based Depression Detection Using Unsupervised Autoencoder

Guangyao Sun, Shenghui Zhao, Bochao Zou, Yubo An
DOI: https://doi.org/10.1109/icsip55141.2022.9886372
2022-01-01
Abstract: More than three hundred million people worldwide suffer from depression, making it one of the most serious health problems in the world. Detecting depression is therefore important for timely treatment. In this paper, we propose a speech-based depression detection method using an unsupervised autoencoder. Most previous methods encode frame-level speech features into sentence-level features with statistical functions, which discards the temporal information between frames. To address this, we propose an unsupervised network based on the transformer. The network produces an audio embedding vector for an audio segment from a depressed or non-depressed speaker, and this embedding vector is then used for depression detection. Experimental results show that the proposed method achieves superior performance on both the English database DAIC and our self-built Chinese database CMDC.