Automated free speech analysis reveals distinct markers of Alzheimer's and frontotemporal dementia

Pamela Lopes da Cunha,Fabián Ruiz,Franco Ferrante,Lucas Federico Sterpin,Agustín Ibáñez,Andrea Slachevsky,Diana Matallana,Ángela Martínez,Eugenia Hesse,Adolfo M García
DOI: https://doi.org/10.1371/journal.pone.0304272
IF: 3.7
2024-06-06
PLoS ONE
Abstract:Dementia can disrupt how people experience and describe events as well as their own role in them. Alzheimer's disease (AD) compromises the processing of entities expressed by nouns, while behavioral variant frontotemporal dementia (bvFTD) entails a depersonalized perspective with increased third-person references. Yet, no study has examined whether these patterns can be captured in connected speech via natural language processing tools. To tackle such gaps, we asked 96 participants (32 AD patients, 32 bvFTD patients, 32 healthy controls) to narrate a typical day of their lives and calculated the proportion of nouns, verbs, and first- or third-person markers (via part-of-speech and morphological tagging). We also extracted objective properties (frequency, phonological neighborhood, length, semantic variability) from each content word. In our main study (with 21 AD patients, 21 bvFTD patients, and 21 healthy controls), we used inferential statistics and machine learning for group-level and subject-level discrimination. The above linguistic features were correlated with patients' scores in tests of general cognitive status and executive functions. We found that, compared with HCs, (i) AD (but not bvFTD) patients produced significantly fewer nouns, (ii) bvFTD (but not AD) patients used significantly more third-person markers, and (iii) both patient groups produced more frequent words. Machine learning analyses showed that these features identified individuals with AD and bvFTD (AUC = 0.71). A generalizability test, with a model trained on the entire main study sample and tested on hold-out samples (11 AD patients, 11 bvFTD patients, 11 healthy controls), showed even better performance, with AUCs of 0.76 and 0.83 for AD and bvFTD, respectively. No linguistic feature was significantly correlated with cognitive test scores in either patient group. These results suggest that specific cognitive traits of each disorder can be captured automatically in connected speech, favoring interpretability for enhanced syndrome characterization, diagnosis, and monitoring.
What problem does this paper attempt to address?