Bipolar Disorder Classification Based on Multimodal Recordings

Siming Cao, Huanqing Yan, Penghao Rao, Kaijie Zhao, Xiaocui Yu, Jun He, Lejun Yu, Yongkang Xiao
DOI: https://doi.org/10.1145/3497623.3497653
2021-01-01
Abstract: Automatic bipolar disorder (BD) classification is a challenging task. In this paper, we focus on BD classification from the acoustic, visual, and textual modalities. We highlight three aspects of our method: 1) In addition to the baseline features, we explore and fuse hand-crafted and deep-learned features from all available modalities. Note that the textual modality was derived from the acoustic modality using a voice translation tool. 2) Because each video carries only a single video-level label while its individual frames are unlabeled, we train an unsupervised Convolutional Auto-Encoder (CAE) and use it for feature extraction. 3) Because the dataset is too small to train a Convolutional Neural Network (CNN), we pre-train the CNN on other emotion datasets. The experimental results show that our model outperforms the baseline system, achieving a final unweighted average recall (UAR) of 93.12%.
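For readers unfamiliar with the reported metric, UAR is the mean of the per-class recalls, so each class contributes equally regardless of how many samples it has. The following is a minimal sketch of that computation, not the authors' evaluation code; the function name, the class labels, and the toy predictions are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (UAR, also called macro recall)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of true samples per class
    correct = defaultdict(int)  # number of correctly predicted samples per class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1

    # Recall is computed per class, then averaged without class-size weighting.
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy labels (not data from the paper).
y_true = ["mania"] * 2 + ["hypomania"] * 3 + ["remission"] * 5
y_pred = (
    ["mania", "hypomania"]                        # 1 of 2 correct
    + ["hypomania", "hypomania", "remission"]     # 2 of 3 correct
    + ["remission"] * 5                           # 5 of 5 correct
)

uar = unweighted_average_recall(y_true, y_pred)
print(round(uar, 4))  # 0.7222
```

In this toy case plain accuracy is 0.8 while UAR is about 0.72, which shows why UAR is the stricter measure when classes are imbalanced: errors on the small classes are not hidden by the large one.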