Software Implementation for the Processing of High-Speed Digital Images of the Glottis and the Analysis of Vocal Fold Vibration

Tao Jiang, Guang Li, Shouhua Luo, Ning Gu, Yan Yao
DOI: https://doi.org/10.1117/12.925219
2012-01-01
Abstract: High-speed digital imaging (HSDI) of the glottis provides a direct means of capturing actual vocal fold vibrations. Subsequent image-based analyses can be used for objective and quantitative assessment of voice kinematics in healthy and diseased states. HSDI generates massive visual data arrays, yet the development of effective software for handling such image data has lagged behind. To obtain a robust and clinically relevant analysis, we have implemented a software system that includes the processing of AVI image sequences from HSDI recordings, as well as the spatiotemporal analysis of the glottal area waveform (GAW) and vocal fold displacements extracted from these image sequences. The software contains the following three modules: 1) Import and View Module: reads AVI video data, edits/compiles and saves selected image data, and makes image montages using DirectShow technology; 2) Process Module: performs frame-by-frame image segmentation to delineate the glottis, and extracts the GAW and bilateral vocal fold displacements; 3) Analysis Module: provides Nyquist plot displays based on Hilbert transform analysis of the GAW, together with instantaneous frequency and amplitude distributions. Having tested this software on numerous clinical data samples, we demonstrate its validity in delivering accurate and useful vibratory characteristics of the vocal folds that may correlate with voice condition.
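The Hilbert transform step of the Analysis Module can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `analytic_signal` and `instantaneous_attributes`, the synthetic GAW, and the frame rate and fundamental frequency values are all assumptions made for illustration. The sketch builds the analytic signal of a GAW with an FFT, then derives instantaneous amplitude and frequency from its magnitude and unwrapped phase.

```python
import numpy as np


def analytic_signal(x):
    """Return the analytic signal of a real 1-D sequence, computed via the FFT."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    # Keep DC (and the Nyquist bin for even n), double the positive
    # frequencies, and zero the negative frequencies.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)


def instantaneous_attributes(gaw, frame_rate):
    """Hypothetical helper: instantaneous amplitude and frequency of a GAW.

    gaw        -- glottal area per frame (arbitrary units, e.g. pixels)
    frame_rate -- recording rate in frames per second
    """
    gaw = np.asarray(gaw, dtype=float)
    centered = gaw - np.mean(gaw)            # remove the mean glottal area
    z = analytic_signal(centered)
    amplitude = np.abs(z)                    # instantaneous amplitude (envelope)
    phase = np.unwrap(np.angle(z))           # continuous instantaneous phase
    frequency = np.diff(phase) * frame_rate / (2.0 * np.pi)  # in Hz
    return z, amplitude, frequency


# Synthetic GAW (assumed values): 200 Hz vibration recorded at 4000 frames/s.
frame_rate = 4000.0
f0 = 200.0
t = np.arange(400) / frame_rate
gaw = 50.0 + 30.0 * np.cos(2.0 * np.pi * f0 * t)

z, amplitude, frequency = instantaneous_attributes(gaw, frame_rate)
print(amplitude.mean(), frequency.mean())
```

Plotting `z.real` against `z.imag` traces the analytic signal in the complex plane, which is presumably the kind of display the abstract calls a Nyquist plot. For a perfectly periodic GAW such as the synthetic one here, that trace is a circle; irregular vibration would scatter it.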