Acoustic Feature Extraction with Interpretable Deep Neural Network for Neurodegenerative Related Disorder Classification

Yilin Pan, Bahman Mirheidari, Zehai Tu, Ronan O'Malley, Traci Walker, Annalena Venneri, Markus Reuber, Daniel Blackburn, Heidi Christensen
DOI: https://doi.org/10.21437/interspeech.2020-2684
2020-01-01
Abstract: Speech-based automatic approaches for detecting neurodegenerative disorders (ND) and mild cognitive impairment (MCI) have recently received more attention because they are non-invasive and potentially more sensitive than current pen-and-paper tests. The performance of such systems is highly dependent on the choice of features in the classification pipeline. For acoustic features in particular, arriving at a consensus on a best feature set has proven challenging. This paper explores using a deep neural network to extract features directly from the speech signal as a solution to this. The raw waveform carries more information than hand-crafted features, but the feature extraction process becomes more complex and less interpretable, which is often undesirable in medical domains. Using a SincNet as the first layer allows for some analysis of the learned features. We propose and evaluate the Sinc-CLA (with SincNet, Convolutional, Long Short-Term Memory and Attention layers) as a task-driven acoustic feature extractor for classifying MCI, ND and healthy controls (HC). Experiments are carried out on an in-house dataset. Compared with popular hand-crafted feature sets, the learned task-driven features achieve superior classification accuracy. The filters of the SincNet are inspected, and acoustic differences between HC, MCI and ND are found.
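To illustrate why a SincNet first layer lends itself to inspection, the following is a minimal NumPy sketch of a sinc-parametrized band-pass filter. It is not the authors' implementation: the function name `sinc_bandpass`, the parameter names, the kernel length of 251 taps, the 16 kHz sample rate and the Hamming window are all assumptions made for illustration. The point is that each filter is fully described by two cutoff frequencies, so a learned filter bank can be read directly as a set of frequency bands.

```python
import numpy as np

def sinc_bandpass(f_low, f_high, kernel_size=251, sample_rate=16000):
    """Band-pass FIR kernel defined only by its two cutoff frequencies in Hz.

    Hypothetical helper for illustration; all defaults are assumptions.
    """
    # Symmetric time axis in seconds, centred on zero.
    n = np.arange(kernel_size) - (kernel_size - 1) / 2
    t = n / sample_rate
    # np.sinc(x) is sin(pi*x)/(pi*x), so 2*f*np.sinc(2*f*t) is an ideal
    # low-pass filter with cutoff f. The difference of two low-pass
    # filters is a band-pass filter between f_low and f_high.
    band = 2 * f_high * np.sinc(2 * f_high * t) - 2 * f_low * np.sinc(2 * f_low * t)
    # Taper with a Hamming window to reduce ripple caused by truncation.
    kernel = band * np.hamming(kernel_size)
    # Dividing by the sample rate converts to a discrete-time filter
    # whose pass-band gain is approximately 1.
    return kernel / sample_rate

# Example: a filter covering 300 Hz to 3000 Hz.
kernel = sinc_bandpass(300.0, 3000.0)
```

In a trainable layer, `f_low` and `f_high` would be the only learned parameters per filter, which is what makes the resulting bands comparable across groups such as HC, MCI and ND.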