A machine learning screening model for identifying the risk of high-frequency hearing impairment in a general population

Yi Wang, Xinmeng Yao, Dahui Wang, Chengyin Ye, Liangwen Xu
DOI: https://doi.org/10.1186/s12889-024-18636-1
IF: 4.5
2024-04-27
BMC Public Health
Abstract: Hearing impairment (HI) has become a major public health issue in China. Currently, due to the limitations of primary health care, the gold standard for HI diagnosis (pure-tone hearing test) is not suitable for large-scale use in community settings. Therefore, the purpose of this study was to develop a cost-effective HI screening model for the general population using machine learning (ML) methods and data gathered from community-based scenarios, aiming to help improve the hearing-related health outcomes of community residents.
public, environmental & occupational health
What problem does this paper attempt to address?
This paper aims to solve the problem of screening for high-frequency hearing impairment (HFHI) among Chinese community residents. Specifically, the researchers attempt to develop a cost-effective screening model based on machine learning (ML) to help identify individuals at risk of HFHI in the general population. Currently, due to the limitations of primary medical resources, traditional hearing test methods (such as pure-tone audiometry) are not suitable for large-scale application in the community environment. Therefore, the goal of this study is to use data available in the community environment (such as questionnaire surveys and routine blood test data), combined with machine-learning algorithms, to construct an efficient HFHI screening tool, thereby improving the hearing health of community residents.

### Background of the main problem

1. **Public health problem of hearing impairment**: Hearing impairment (HI) has become an important public health problem in China and even globally. Especially among middle-aged and elderly people, HI not only affects the quality of life but also brings a huge social and economic burden.
2. **Limitations of existing screening methods**: At present, the diagnostic standard for HI is pure-tone audiometry, but this method requires expensive equipment and professional personnel, and it is difficult to conduct large-scale screening at the community level.
3. **Importance of early intervention**: HI usually starts with a decline in high-frequency hearing and gradually develops into dysfunction in low-frequency or speech frequencies. Therefore, early screening and intervention are crucial for delaying the progress of the disease.
### Specific objectives of the study

- **Develop an efficient screening model**: Use machine-learning algorithms to build a model that accurately identifies the risk of HFHI from data commonly available in community settings (such as questionnaire surveys and blood tests).
- **Improve the feasibility and operability of screening**: Ensure that the model is convenient to use in community settings and produces an easy-to-understand risk score, helping community doctors and residents understand and manage hearing health problems.
- **Identify the characteristics of high-risk groups**: Analyze populations in different risk strata to find high-risk factors related to HFHI and provide a basis for personalized prevention and intervention.

### Method overview

The researchers used multi-stage stratified cluster sampling to conduct a cross-sectional survey in 7 community health centers in Zhejiang province. Data were collected from a total of 3,371 community residents, including questionnaire surveys, hearing tests, and blood test results. Seven common machine-learning algorithms (such as Naive Bayes, K-Nearest Neighbors, Support Vector Machines, Random Forests, and XGBoost) were then used to build and compare multiple HFHI screening models. Finally, the LASSO regression model, which performed best, was selected as the final screening tool, and a nomogram was built from it to make the model easier to use in practice.

### Results and conclusions

The LASSO regression model reached an AUC of 0.868 on the validation set (95% confidence interval: 0.847-0.889), with high accuracy, specificity, and F-score. In addition, the model identified 34 indicators related to the risk of HFHI, covering demographic characteristics, disease history, behavioral factors, environmental exposure, hearing cognitive factors, and multiple blood test indicators. These findings support personalized HI prevention and intervention for community residents.

In conclusion, this study developed an efficient and practical HFHI screening model by combining machine-learning techniques with commonly available community data, which helps to identify and intervene in hearing impairment earlier, thereby improving the hearing health of community residents.
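
The modeling step described in the method overview (an L1-penalized, LASSO-type model scored by AUC on a validation set) can be illustrated with a minimal sketch. This is not the authors' code. It assumes that "LASSO regression" for a binary HFHI outcome means L1-penalized logistic regression, fits it by plain proximal gradient descent, and uses synthetic data with six made-up features in place of the study's questionnaire and blood-test variables. The function names and the values of `lam`, `lr`, and `n_iter` are illustrative assumptions only.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def soft_threshold(value, threshold):
    # Proximal operator of the L1 penalty: shrinks a coefficient toward zero
    # and sets it exactly to zero when its magnitude is below the threshold.
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_l1_logistic(X, y, lam=0.05, lr=0.5, n_iter=300):
    """Proximal gradient descent for L1-penalized logistic regression.

    lam, lr and n_iter are illustrative defaults, not values from the paper.
    The intercept is left unpenalized.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as half a win."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((sp > sn) + 0.5 * (sp == sn) for sp in pos for sn in neg)
    return wins / (len(pos) * len(neg))


def make_synthetic(n, rng):
    # Hypothetical data: 6 standardized features, only the first two
    # actually drive the outcome. Not related to the study's real variables.
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(6)]
        prob = sigmoid(1.5 * x[0] - 1.0 * x[1])
        X.append(x)
        y.append(1 if rng.random() < prob else 0)
    return X, y


rng = random.Random(0)
X_train, y_train = make_synthetic(300, rng)
X_val, y_val = make_synthetic(100, rng)

weights, intercept = fit_l1_logistic(X_train, y_train)
val_scores = [
    sigmoid(intercept + sum(wj * xj for wj, xj in zip(weights, xi)))
    for xi in X_val
]
val_auc = auc(val_scores, y_val)

print("coefficients:", [round(wj, 3) for wj in weights])
print("validation AUC:", round(val_auc, 3))
```

The L1 penalty tends to shrink the coefficients of uninformative features to exactly zero, which is one reason a LASSO-type model lends itself to a short list of indicators and a nomogram-style score. In practice a library implementation with cross-validated penalty strength would normally replace this hand-written loop.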