Depression Recognition Based on Speech Analysis
Wei Pan, Jingying Wang, Tianli Liu, Xiaoqian Liu, Mingming Liu, Bin Hu, Tingshao Zhu
DOI: https://doi.org/10.1360/n972017-01250
2018-01-01
Chinese Science Bulletin (Chinese Version)
Abstract: Depression is one of the most common mental diseases. Patients with depression often experience depressed moods such as sadness, guilt, low self-esteem, loss of interest, hypofunction and so on. They suffer from serious emotional problems and unexplained distress, which cause enormous losses to individuals, families and society. According to the World Health Organization, approximately 322 million people worldwide were suffering from depression in 2017, including about 54 million patients in China. Depression can be cured efficiently. However, due to the complexity of its pathogenesis, clinical diagnosis faces many difficulties. Firstly, mental diseases, especially depression, do not receive enough attention and are even misinterpreted by other people. Secondly, patients with depression are less willing to ask for help. Thirdly, it is hard to screen and diagnose potential depression patients precisely, and the medical resources available for depression diagnosis are limited. It is therefore necessary to find a more convenient, objective and efficient way to assist the fast identification of depression. As a relatively objective and easily accessible variable, speech has potential value. The speech of patients is easy to acquire, and it has been shown that the speech of depressed patients has special characteristics such as slow speech rate, lack of cadence and so on. The purpose of this paper is to explore the relationship between speech and depression by establishing classification models that predict depression from voice features. In this research, a 3 (emotion: positive, neutral, negative) × 3 (task type: question answering, text reading, picture description) experimental design was employed, and the voice data were collected from the speech of individuals recorded during the different tasks.
A total of 103 participants were included in this study: 45 patients with depression (age: 23.8–44.6, M = 34.2, SD = 10.4; 22 males, 23 females) and 58 healthy individuals (age: 20.1–41.7, M = 30.9, SD = 10.8; 27 males, 31 females). The patients were recruited from Beijing Anding Hospital and Huilongguan Hospital, while the healthy participants were recruited by advertisement. All participants were diagnosed by specialists using the DSM-IV and the MINI interview. None had substance abuse, substance dependence, personality disorders or other mental diseases, and none had serious physical illness or suicidal behavior. All participants had an education level above elementary school. A total of 988 voice features were extracted from the speech data using the openSMILE software. Logistic regression, a machine learning method, was used to train the prediction models. Results showed that the prediction precision can reach 82.9%. Based on machine learning methods, this paper employed voice features to establish prediction models of depression. The results show that the speech of patients with depression has a certain predictive value, which paves the way for more thorough identification of depression.
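
The classification step described above (voice features in, depressed/healthy label out, scored by precision) can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic Gaussian values standing in for openSMILE features, only 5 features are used instead of 988, and the function names, learning rate, regularization strength and 70/30 split are all assumptions chosen for illustration. Only the group sizes (45 and 58) come from the abstract.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic(n_pos=45, n_neg=58, n_features=5, seed=0):
    # ASSUMPTION: synthetic stand-in for voice features. Label 1 = depressed,
    # label 0 = healthy; class means are shifted to +1 / -1 per feature.
    rng = random.Random(seed)
    data = []
    for label, count, mean in ((1, n_pos, 1.0), (0, n_neg, -1.0)):
        for _ in range(count):
            data.append(([rng.gauss(mean, 1.0) for _ in range(n_features)], label))
    rng.shuffle(data)
    return [x for x, _ in data], [y for _, y in data]


def train_logistic(X, y, lr=0.1, epochs=200, l2=0.01):
    # Batch gradient descent on the L2-regularized logistic loss.
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        for j in range(d):
            w[j] -= lr * (grad_w[j] / n + l2 * w[j])
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b):
    # Threshold the predicted probability at 0.5.
    return [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
            for xi in X]


def precision(y_true, y_pred):
    # Precision = true positives / all predicted positives.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hold out the last 30% of the shuffled samples for evaluation (assumed split).
X, y = make_synthetic()
split = int(0.7 * len(X))
w, b = train_logistic(X[:split], y[:split])
held_out_precision = precision(y[split:], predict(X[split:], w, b))
print("held-out precision on synthetic data:", round(held_out_precision, 3))
```

Because the synthetic classes are well separated by construction, the printed precision says nothing about real speech data; the sketch only shows how a precision figure such as the reported 82.9% is computed from a trained logistic regression model.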