Abstract:

Background: In recent years, interest has grown in using verbal productions as digital biomarkers, namely objective, quantifiable behavioural data that can be collected and measured by means of digital devices, allowing for low-cost pathology detection, classification and monitoring. Numerous research papers have been published on the automatic detection of subtle verbal alterations, starting from written texts, raw speech recordings and transcripts, and such linguistic analysis has been singled out as a cost-effective method for diagnosing dementia and other medical conditions common among elderly patients (e.g., cognitive dysfunctions associated with metabolic disorders, dysarthria).

Aims: To provide a critical appraisal and synthesis of evidence concerning the application of natural language processing (NLP) techniques for clinical purposes in the geriatric population. In particular, we discuss the state of the art in studying language in healthy and pathological ageing, focusing on the latest research efforts to build non-intrusive language-based tools for the early identification of cognitive frailty due to dementia. We also discuss some challenges and open problems raised by this approach.

Methods & procedures: We performed a scoping review to examine emerging evidence in this novel domain. Potentially relevant studies published up to November 2021 were identified from the MEDLINE, Cochrane and Web of Science databases. We also browsed the proceedings of leading international conferences (e.g., ACL, COLING, Interspeech, LREC) from 2017 to 2021, and checked the reference lists of relevant studies and reviews.

Main contribution: The paper provides an introductory, but complete, overview of the application of NLP techniques for studying language disruption due to dementia.
We also suggest that this technique can be fruitfully applied to other medical conditions (e.g., cognitive dysfunctions associated with dysarthria, cerebrovascular disease and mood disorders).

Conclusions & implications: Although several critical points still need to be addressed by the scientific community, a growing body of empirical evidence shows that NLP techniques are a promising tool for studying language changes in pathological ageing, with a high potential to lead to a significant shift in clinical practice.

What this paper adds:

What is already known on this subject: Speech and language abilities change due to non-pathological neurocognitive ageing and neurodegenerative processes. These subtle verbal modifications can be measured through NLP techniques and used as biomarkers for screening/diagnostic purposes in the geriatric population (i.e., digital linguistic biomarkers, DLBs).

What this paper adds to existing knowledge: The review shows that DLBs can represent a promising clinical tool, with a high potential to spark a major shift in dementia assessment in the elderly. Some challenges and open problems are also discussed.

What are the potential or actual clinical implications of this work? This methodological review represents a starting point for clinicians approaching the DLB research field for studying language in healthy and pathological ageing. It summarizes the state of the art and future research directions of this novel approach.

How ready is speech-to-text for psychological language research? Evaluating the validity of AI-generated English transcripts for analyzing free-spoken responses in younger and older adults

Evaluating Web-Based Automatic Transcription for Alzheimer Speech Data: Transcript Comparison and Machine Learning Analysis

Automated remote speech-based testing of individuals with cognitive decline: Bayesian agreement of transcription accuracy

Does accuracy matter? Methodological considerations when using automated speech-to-text for social science research

Lexical Speech Features of Spontaneous Speech in Older Persons With and Without Cognitive Impairment: Reliability Analysis

The perception of artificial-intelligence (AI) based synthesized speech in younger and older adults

An AI Generated Test of Pragmatic Competence and Connected Speech

Using HIPAA (Health Insurance Portability and Accountability Act)-Compliant Transcription Services for Virtual Psychiatric Interviews: Pilot Comparison Study

Automated assessment of speech production and prediction of MCI in older adults

Who Said What? An Automated Approach to Analyzing Speech in Preschool Classrooms

Fair or Fare? Understanding Automated Transcription Error Bias in Social Media and Videoconferencing Platforms

Speech Analysis by Natural Language Processing Techniques: A Possible Tool for Very Early Detection of Cognitive Decline?

A framework for language technologies in behavioral research and clinical applications: Ethical challenges, implications, and solutions.

Natural language processing techniques for studying language in pathological ageing: A scoping review

Can linguists distinguish between ChatGPT/AI and human writing?: A study of research ethics and academic publishing

AI Psychometrics: Assessing the Psychological Profiles of Large Language Models Through Psychometric Inventories

Differentiating between human-written and AI-generated texts using linguistic features automatically extracted from an online computational tool

Artificial Intelligence, speech and language processing approaches to monitoring Alzheimer's Disease: a systematic review

Open Brain AI. Automatic Language Assessment

The Optimization of a Natural Language Processing Approach for the Automatic Detection of Alzheimer’s Disease Using GPT Embeddings

Understanding older people's voice interactions with smart voice assistants: a new modified rule-based natural language processing model with human input