Reformulating NLP tasks to Capture Longitudinal Manifestation of Language Disorders in People with Dementia

Dimitris Gkoumas, Matthew Purver, Maria Liakata
2023-10-16
Abstract: Dementia is associated with language disorders which impede communication. Here, we automatically learn linguistic disorder patterns by making use of a moderately-sized pre-trained language model and forcing it to focus on reformulated natural language processing (NLP) tasks and associated linguistic patterns. Our experiments show that NLP tasks that encapsulate contextual information and enhance the gradient signal with linguistic patterns benefit performance. We then use the probability estimates from the best model to construct digital linguistic markers measuring the overall quality in communication and the intensity of a variety of language disorders. We investigate how the digital markers characterize dementia speech from a longitudinal perspective. We find that our proposed communication marker is able to robustly and reliably characterize the language of people with dementia, outperforming existing linguistic approaches; and shows external validity via significant correlation with clinical markers of behaviour. Finally, our proposed linguistic disorder markers provide useful insights into gradual language impairment associated with disease progression.
Computation and Language
What problem does this paper attempt to address?
The paper addresses the detection and analysis of language impairments associated with dementia. Its research objectives are:

1. **Automatic Learning of Language Impairment Patterns**: Learning the language impairment patterns of people with dementia automatically, using a moderately sized pre-trained language model (RoBERTa).
2. **Reformulating Natural Language Processing Tasks**: Reformulating natural language processing (NLP) tasks so that the model focuses on linguistic disorder patterns. Tasks that encapsulate contextual information and enhance the gradient signal with linguistic patterns improve performance.
3. **Constructing Digital Language Markers**: Using the probability estimates from the best model to construct digital language markers that measure the overall quality of communication and the intensity of a variety of language disorders.
4. **Longitudinal Analysis**: Analyzing how these digital markers characterize the speech of people with dementia from a longitudinal perspective, and how the markers change over time.
5. **Performance Comparison**: Comparing the proposed communication marker with existing linguistic approaches (such as those based on semantic similarity and word-level fluency) to test whether it characterizes dementia speech more robustly and reliably.
6. **Clinical Relevance Assessment**: Evaluating the correlation between the communication marker and clinical markers (such as the Mini-Mental State Examination, MMSE, and the Clinical Dementia Rating, CDR) to establish its external validity.

In summary, the core objective of the paper is to develop an NLP-based method to identify, understand, and track the language impairments of people with dementia over time, with the aim of supporting early diagnosis and disease management.
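
To make the idea of a probability-based marker and its clinical validation more concrete, here is a minimal Python sketch. It is not the authors' implementation: the aggregation rule (a plain mean of per-utterance probabilities), the function names `session_marker`, `ranks`, and `spearman`, and all numbers are assumptions chosen purely for illustration. The summary above does not specify how the markers are computed or which correlation statistic is used; Spearman rank correlation is used here only as one common choice.

```python
from statistics import mean


def session_marker(utterance_probs):
    """Hypothetical marker: mean model probability over one session's utterances."""
    return mean(utterance_probs)


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Illustrative placeholder values only (not data from the paper):
# per-utterance probabilities for four sessions, and one clinical score per session.
sessions = [
    [0.91, 0.88, 0.95],
    [0.80, 0.76, 0.84],
    [0.62, 0.70, 0.58],
    [0.41, 0.52, 0.47],
]
clinical_scores = [28, 24, 19, 14]

markers = [session_marker(s) for s in sessions]
rho = spearman(markers, clinical_scores)
print(f"markers={markers}, spearman rho={rho:.3f}")
```

In a real pipeline the probabilities would come from the fine-tuned language model, and the correlation would be accompanied by a significance test; both are omitted here to keep the sketch self-contained.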