Prediction of Depression Severity Based on Transformer Encoder and CNN Model

Jiahao Lu, Bin Liu, Zheng Lian, Cong Cai, Jianhua Tao, Ziping Zhao
DOI: https://doi.org/10.1109/ISCSLP57327.2022.10038064
2022-01-01
Abstract: Depression is one of the most common mental health disorders and can lead to suicide in extreme cases. It is crucial to develop an automatic depression detection (ADD) model that is objective, effective, and robust. Much work has gone into automatic depression detection systems based on speech data, because speech is easy to access and raises fewer privacy concerns. Although most current methods use RNNs, these models struggle to capture long-term dependencies in depression-related audio because of exploding and vanishing gradients. In this paper, a novel model combining a Transformer encoder and a CNN is proposed to address this problem. Experiments on the DAIC-WOZ dataset show that the performance of the algorithm improves considerably, with a mean absolute error (MAE) of 4.48 and a root mean square error (RMSE) of 5.37.
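The two reported metrics can be illustrated with a minimal sketch. The helper names `mae` and `rmse` and the severity scores below are hypothetical, chosen only to show the arithmetic; they are not taken from the paper or from DAIC-WOZ.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: average of |true - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared difference."""
    mean_sq = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mean_sq)


# Made-up severity scores for four hypothetical subjects (illustration only).
true_scores = [2, 10, 7, 15]
pred_scores = [4, 7, 7, 10]

print(mae(true_scores, pred_scores))   # absolute errors 2, 3, 0, 5 -> 2.5
print(rmse(true_scores, pred_scores))  # sqrt((4 + 9 + 0 + 25) / 4) = sqrt(9.5)
```

RMSE is never smaller than MAE on the same predictions, and it grows faster when a few predictions are far off, which is why the two are usually reported together.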