Multimodal fusion for Alzheimer’s disease recognition

Yangwei Ying, Tao Yang, Hong Zhou
DOI: https://doi.org/10.1007/s10489-022-04255-z
IF: 5.3
2022-12-01
Applied Intelligence
Abstract: Alzheimer’s disease (AD) is the most prevalent form of progressive degenerative dementia and has a great social and economic impact throughout the world. In the vast majority of cases, AD patients are diagnosed by biochemical analysis, lumbar puncture and advanced imaging examination, none of which can play a preventive role in the early stage of the disease. Speech signals contain abundant personal information, and AD patients in particular tend to exhibit speech disorders, which makes it possible to use speech information to distinguish AD patients from healthy persons. The work presented in this paper aims to develop a new approach for early detection of AD using noninvasive methods. We propose using multimodal features, combining acoustic and linguistic speech features, for speech-based recognition of Alzheimer’s disease. Three kinds of features are adopted for AD classification with an SVM classifier: IS10_paraling features, deep acoustic features extracted with a fine-tuned Wav2Vec2.0 model, and deep linguistic features extracted with a fine-tuned BERT model. Experiments on two publicly available datasets, NCMMSC2021 and ADReSSo, show that our model achieves state-of-the-art (SOTA) performance. Our best-performing model obtains an accuracy of 89.1% and 84.0% on the long and short audio of NCMMSC2021, respectively, and 83.7% on ADReSSo, which is promising for the early diagnosis and classification of AD patients.
computer science, artificial intelligence
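The abstract names three feature sets and an SVM classifier but does not say how the features are combined. The sketch below is one possible reading, assuming feature-level fusion by concatenation: each modality is standardized per column, the vectors are concatenated per utterance, and a small linear SVM is trained on the result. The function names, the toy numbers, the label convention (+1 for AD, -1 for healthy) and the bias-free subgradient solver are all illustrative assumptions, not the authors' implementation; the tiny vectors stand in for real IS10_paraling, Wav2Vec2.0 and BERT features.

```python
import math

def standardize(rows):
    """Z-score each column of one modality's feature matrix (list of lists)."""
    n, dims = len(rows), len(rows[0])
    out = [[0.0] * dims for _ in range(n)]
    for j in range(dims):
        col = [r[j] for r in rows]
        mean = sum(col) / n
        # Fall back to 1.0 for constant columns to avoid dividing by zero.
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n) or 1.0
        for i in range(n):
            out[i][j] = (rows[i][j] - mean) / std
    return out

def fuse(*modalities):
    """Assumed fusion scheme: standardize each modality, then concatenate per sample."""
    scaled = [standardize(m) for m in modalities]
    return [sum((m[i] for m in scaled), []) for i in range(len(scaled[0]))]

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=100):
    """Full-batch subgradient descent on the L2-regularized hinge loss.

    No bias term is fitted; the standardized inputs are already centered.
    """
    n = len(X)
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        grad = [lam * wj for wj in w]
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            if margin < 1:  # only margin violators contribute to the hinge term
                for j, xj in enumerate(xi):
                    grad[j] -= yi * xj / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

def predict(w, X):
    """Sign of the linear score: +1 or -1."""
    return [1 if sum(wj * xj for wj, xj in zip(w, xi)) >= 0 else -1 for xi in X]

# Toy stand-ins for the three feature sets (four utterances, made-up values).
is10_feats = [[2.0, 1.0], [1.8, 1.2], [0.5, 0.3], [0.4, 0.2]]
wav2vec_feats = [[0.9], [0.8], [0.1], [0.2]]
bert_feats = [[1.0, 0.0], [0.9, 0.1], [0.1, 0.9], [0.0, 1.0]]
labels = [1, 1, -1, -1]  # assumed convention: +1 = AD, -1 = healthy

fused = fuse(is10_feats, wav2vec_feats, bert_feats)
w = train_linear_svm(fused, labels)
predictions = predict(w, fused)
```

With real data, the standardization statistics would be estimated on the training split only and reused for held-out utterances, and a library SVM such as scikit-learn's `sklearn.svm.SVC` would normally replace the hand-written solver.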