Detecting Alzheimer's Disease Based on Acoustic Features Extracted from Pre-trained Models

Kangdi Mei, Zhiqiang Guo, Zhaoci Liu, Lijuan Liu, Xin Li, Zhenhua Ling
DOI: https://doi.org/10.1007/978-3-031-20503-3_22
2022-01-01
Abstract: In this paper, we study the performance of Alzheimer's disease (AD) detection using two different high-level acoustic features extracted from pre-trained models: bottleneck features obtained by supervised learning and wav2vec 2.0 representations obtained by self-supervised learning. We combine these two features through early fusion at the frame level and late fusion at the score level. Moreover, silence-related information extracted by voice activity detection (VAD) is integrated to further improve the detection results. Experiments on the INTERSPEECH 2020 ADReSS Challenge dataset show that the bottleneck features and the wav2vec 2.0 representations perform better at detecting the AD class and the non-AD class, respectively, while late fusion achieves higher accuracy than either feature alone, which suggests that the two features carry complementary information. Integrating the silence-related information improves the fusion system further. Our highest AD detection accuracy is 79.2%, which is state-of-the-art performance for detecting AD using only audio data on this dataset.
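To make the two fusion strategies named in the abstract concrete, here is a minimal sketch in Python. It is not the authors' implementation: the function names, the feature dimensions, the equal fusion weight, and the 0.5 decision threshold are all assumptions chosen for illustration, and the sketch assumes both feature streams have already been brought to the same frame rate.

```python
import numpy as np


def early_fusion(bottleneck_frames, wav2vec_frames):
    """Frame-level fusion: concatenate two feature matrices along the feature axis.

    Both inputs have shape (num_frames, feature_dim). Assumption: the two
    streams share a frame rate, so we only trim to the shorter length.
    """
    n = min(len(bottleneck_frames), len(wav2vec_frames))
    return np.concatenate([bottleneck_frames[:n], wav2vec_frames[:n]], axis=1)


def late_fusion(score_a, score_b, weight=0.5):
    """Score-level fusion: weighted average of two per-recording AD probabilities.

    The weight is a hypothetical parameter; the abstract does not state how
    the scores are combined.
    """
    return weight * score_a + (1.0 - weight) * score_b


def predict_label(score, threshold=0.5):
    """Map a fused AD probability to a class label (threshold is an assumption)."""
    return "AD" if score >= threshold else "non-AD"


if __name__ == "__main__":
    # Hypothetical feature sizes, chosen only to show the shapes involved.
    bottleneck = np.zeros((10, 4))
    wav2vec = np.ones((12, 6))
    print(early_fusion(bottleneck, wav2vec).shape)

    # Hypothetical scores from two separately trained classifiers.
    fused = late_fusion(0.8, 0.4)
    print(predict_label(fused))
```

In early fusion a single classifier sees the concatenated frames, whereas in late fusion each feature gets its own classifier and only the output scores are merged, which is one way complementary errors between the two systems can cancel out.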