Detection of Voice Fatigue With Artificial Intelligence

Abhinav Siripurapu, Robert T. Sataloff
DOI: https://doi.org/10.1016/j.jvoice.2024.08.002
IF: 2.3
2024-08-27
Journal of Voice
Abstract:
Summary: Voice fatigue (VF) has many symptoms and can occur after extended or brief voice use, depending on the presence or absence of voice pathology and other factors. However, fatigue is difficult to detect and quantify with current approaches. This study explores the use of artificial intelligence (AI) for the automatic detection and analysis of VF, presenting a novel approach to detecting and monitoring the condition.
Objective: This study aims to create an AI-based system for detecting VF. The AI model's performance is evaluated against traditional assessments conducted by speech-language pathologists (SLPs).
Methods: Voice samples were collected from individuals experiencing varying levels of VF. To validate these samples, we calculated the fundamental frequency (f_o), increases in which have been shown to correlate with VF, at the beginning and end of the recordings. The samples were processed using a machine learning model trained to recognize patterns associated with VF. To build the model, we extracted embeddings from an ECAPA-TDNN model, which has been shown to capture changes in a speaker's voice characteristics over time, and used a Convolutional Neural Network for classification. To validate the model, its accuracy in detecting VF was compared with assessments from SLPs.
Results: We achieved an accuracy of 93% on our dataset of English academic lectures and podcasts. As further validation, we asked three experienced SLPs to classify audio segments from our dataset and compared their responses with our model's classifications; the model achieved an accuracy of 86% relative to their ratings.
Conclusion: The application of AI to the detection of VF demonstrates a generalizable approach to the analysis of VF. Future research will incorporate patient data to further validate the models that we created.
otorhinolaryngology, audiology & speech-language pathology
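
The sample-validation step in the Methods (comparing f_o at the beginning and end of a recording) can be illustrated with a minimal sketch. The abstract does not say how f_o was estimated, so the autocorrelation estimator below is an assumption, as are the function names, the 75-400 Hz search range, and the synthetic sine tones that stand in for real voice recordings.

```python
import math

def estimate_fo(samples, sample_rate, fo_min=75.0, fo_max=400.0):
    """Estimate fundamental frequency (Hz) from the autocorrelation peak.

    Assumption: a plain autocorrelation search, not the paper's method.
    """
    lag_min = int(sample_rate / fo_max)
    lag_max = min(int(sample_rate / fo_min), len(samples) - 1)
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Correlate the signal with a copy of itself shifted by `lag` samples.
        score = sum(a * b for a, b in zip(samples, samples[lag:]))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

def fo_start_and_end(recording, sample_rate, window_s):
    """Return (f_o at start, f_o at end) using windows of `window_s` seconds."""
    n = int(window_s * sample_rate)
    return (estimate_fo(recording[:n], sample_rate),
            estimate_fo(recording[-n:], sample_rate))

def synth_tone(freq_hz, sample_rate, duration_s):
    """Hypothetical stand-in for a voiced segment: a pure sine tone."""
    count = int(sample_rate * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(count)]

SAMPLE_RATE = 8000  # Hz, chosen only to keep the demo fast
# Synthetic "recording" whose pitch is higher at the end than at the start.
recording = synth_tone(200, SAMPLE_RATE, 0.25) + synth_tone(250, SAMPLE_RATE, 0.25)
fo_start, fo_end = fo_start_and_end(recording, SAMPLE_RATE, 0.25)
print(f"f_o start: {fo_start:.1f} Hz, f_o end: {fo_end:.1f} Hz, "
      f"rise: {fo_end - fo_start:.1f} Hz")
```

On real speech, a dedicated pitch tracker with voicing detection would likely be needed in place of this estimator, since autocorrelation on unvoiced or noisy frames gives unreliable peaks.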