Integrated Vocal Deviation Index (IVDI): A Machine Learning Model to Classifier of the General Grade of Vocal Deviation

Luiz Medeiros Araujo Lima-Filho, Leonardo Wanderley Lopes, Telmo de Menezes e Silva Filho
DOI: https://doi.org/10.1016/j.jvoice.2024.11.002
IF: 2.3
2024-11-28
Journal of Voice
Abstract: Objective: To develop a multiparametric index based on machine learning (ML) to predict and classify the overall degree of vocal deviation (GG). Method: The sample consisted of 300 dysphonic and non-dysphonic participants of both sexes. The two speech tasks were a sustained vowel [a] and connected speech (counting from 1 to 10). Five speech-language pathologists performed the auditory-perceptual judgment (APJ) of the GG and of the degrees of roughness (GR), breathiness (GB), instability (GI), and strain (GS). We extracted 47 acoustic measurements from these tasks. The APJ results and the acoustic measurements were used to develop the multiparametric index. We used mean absolute error, root mean square error, and the coefficient of determination (R²) to select the best ML model for predicting GG, and feature importance to select the best set of variables for the index. After the GG was classified as non-dysphonic, mild, moderate, or severe, the final model was validated using accuracy, sensitivity, specificity, predictive values, likelihood ratios, F1-score, and weighted kappa. Results: The gradient boosting model showed the best performance among the ML models. Eight features were selected for the model: four acoustic measures (jitterLoc, smoothed cepstral peak prominence, mean harmonic-to-noise ratio (HNRmean), and correlation) and four APJ measures (GR, GB, GS, and GI). The final model correctly classified 93.75% of participants and obtained a weighted kappa of 0.9374, demonstrating the model's excellent performance. Conclusion: The Integrated Vocal Deviation Index includes four acoustic measures and four auditory-perceptual measures and showed excellent performance in classifying voices according to GG.
otorhinolaryngology, audiology & speech-language pathology
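The abstract names mean absolute error, root mean square error, and R² for model selection, and weighted kappa for validating the four-class GG labels. A minimal sketch of these metrics in plain Python follows. It is an illustration only, not the authors' code: the function names are hypothetical, the class coding (0 = non-dysphonic through 3 = severe) is an assumption, and the abstract does not say whether linear or quadratic kappa weights were used, so the weighting exponent is left as a parameter.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted GG scores."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted GG scores."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def weighted_kappa(labels_a, labels_b, n_classes, power=1):
    """Weighted Cohen's kappa for ordinal labels coded 0..n_classes-1.

    power=1 gives linear weights, power=2 quadratic weights.
    Which one the paper used is not stated in the abstract (assumption).
    """
    n = len(labels_a)
    # Confusion matrix of observed label pairs.
    observed = [[0] * n_classes for _ in range(n_classes)]
    for a, b in zip(labels_a, labels_b):
        observed[a][b] += 1
    row = [sum(observed[i]) for i in range(n_classes)]
    col = [sum(observed[i][j] for i in range(n_classes)) for j in range(n_classes)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            # Disagreement weight grows with the distance between classes.
            w = abs(i - j) ** power / (n_classes - 1) ** power
            weighted_observed += w * observed[i][j]
            weighted_expected += w * row[i] * col[j] / n
    return 1 - weighted_observed / weighted_expected


if __name__ == "__main__":
    # Made-up toy labels, NOT data from the study.
    reference = [0, 1, 2, 3, 1, 2]
    predicted = [0, 1, 2, 2, 1, 2]
    print("weighted kappa:", weighted_kappa(reference, predicted, n_classes=4))
```

Weighted kappa suits an ordinal scale such as GG because a mild-versus-severe confusion is penalized more heavily than a mild-versus-moderate one, which plain accuracy does not capture.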