Toward Intelligent Head Impulse Test: A Goggle‐Free Approach Using a Monocular Infrared Camera

Yang Ouyang, Wenwei Luo, Yinwei Zhan, Caizhen Wei, Xian Liang, Hongming Huang, Yong Cui
DOI: https://doi.org/10.1002/lary.31848
IF: 2.97
2024-10-20
The Laryngoscope
Abstract: This study proposes an intelligent head impulse test (iHIT) framework that assesses vestibular function with a monocular camera instead of specialized head‐mounted goggles. A two‐stage multi‐modal deep learning video classification network is trained on a dataset of HIT video clips to identify the semicircular canal being tested and to determine vestibulo‐ocular reflex abnormalities. Experiments show that iHIT achieves high accuracy, eliminates the need for equipment calibration, enables complete automation, and is low in cost and easy to use compared with existing video‐based HIT methods.
Objectives: The video head impulse test (vHIT), which evaluates the vestibulo‐ocular reflex (VOR), is regarded as the gold standard for assessing vestibular function. However, vHIT requires the patient to wear specialized head‐mounted goggles that must be calibrated before each use. To address this, we proposed an intelligent head impulse test (iHIT) setting that replaces the goggles with a monocular infrared camera, together with a deep learning video classification approach for determining vestibular function.
Methods: Within the iHIT framework, a monocular infrared camera was placed in front of the patient to capture test videos, from which a dataset of HIT video clips, DiHIT, was built. We then proposed a two‐stage multi‐modal video classification network, trained on DiHIT, that takes as input the eye motion and head motion data extracted from facial keypoints in the HIT clips and outputs the semicircular canal (SCC) being tested (SCC identification) and whether the VOR is abnormal (SCC qualitation).
Results: In experiments on DiHIT, the network achieved 100% accuracy in SCC identification. It further attained predictive accuracies of 84.1% in horizontal and 79.0% in vertical SCC qualitation.
Conclusions: Compared with existing video‐based HIT, iHIT eliminates goggles, requires no equipment calibration, and achieves complete automation. Its low cost and ease of operation are expected to bring further benefits to users. Code and a use‐case pipeline are available at: https://github.com/dec1st2023/iHIT.
Level of Evidence: 3. Laryngoscope, 2024
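The two‐stage flow described under Methods (first identify the canal being tested, then judge VOR abnormality with a model specific to that result) can be illustrated with a minimal sketch. This is not the authors' network: both stages, which in the paper are deep learning classifiers trained on DiHIT, are replaced here by simple rule‐based stand‐ins so that only the data flow is shown. The `Clip` layout, the function names, the toy compensation ratio, and the 0.7 thresholds are all assumptions made for illustration and carry no clinical meaning.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float]  # (x, y) pixel coordinates of one keypoint


@dataclass
class Clip:
    """Keypoint tracks for one impulse; this layout is hypothetical."""
    head: List[Point]   # a head reference keypoint, one entry per frame
    pupil: List[Point]  # pupil centre, one entry per frame


def deltas(track: List[Point]) -> List[Point]:
    """Frame-to-frame displacement of a keypoint track."""
    return [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(track, track[1:])]


def identify_plane(clip: Clip) -> str:
    """Stage 1 stand-in: choose the axis with more total head motion."""
    d = deltas(clip.head)
    horizontal = sum(abs(dx) for dx, _ in d)
    vertical = sum(abs(dy) for _, dy in d)
    return "horizontal" if horizontal >= vertical else "vertical"


def make_qualifier(axis: int, threshold: float) -> Callable[[Clip], str]:
    """Stage 2 stand-in: compare eye-in-head motion with head motion."""
    def qualify(clip: Clip) -> str:
        head = sum(d[axis] for d in deltas(clip.head))
        pupil = sum(d[axis] for d in deltas(clip.pupil))
        if head == 0:
            return "undetermined"
        # Eye motion relative to the head, as a fraction of head motion.
        # In this toy model a fully compensating eye gives a ratio of 1.
        ratio = -(pupil - head) / head
        return "normal" if ratio >= threshold else "abnormal"
    return qualify


# One qualifier per plane; the thresholds are arbitrary illustration values.
QUALIFIERS: Dict[str, Callable[[Clip], str]] = {
    "horizontal": make_qualifier(axis=0, threshold=0.7),
    "vertical": make_qualifier(axis=1, threshold=0.7),
}


def classify(clip: Clip) -> Tuple[str, str]:
    """Run stage 1, then route the clip to the matching stage 2 model."""
    plane = identify_plane(clip)
    return plane, QUALIFIERS[plane](clip)
```

Calling `classify` on a `Clip` returns a pair of labels, one per stage. Keeping a separate stage 2 model per plane mirrors the way the Results report horizontal and vertical SCC qualitation accuracies separately.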
medicine, research & experimental; otorhinolaryngology