Building gender-specific sexually transmitted infection risk prediction models using CatBoost algorithm and NHANES data

Mengjie Hu,Han Peng,Xuan Zhang,Lefeng Wang,Jingjing Ren
DOI: https://doi.org/10.1186/s12911-024-02426-1
IF: 3.298
2024-01-26
BMC Medical Informatics and Decision Making
Abstract:Sexually transmitted infections (STIs) are a significant global public health challenge due to their high incidence rate and potential for severe consequences when early intervention is neglected. Research shows an upward trend in absolute cases and DALY numbers of STIs, with syphilis, chlamydia, trichomoniasis, and genital herpes exhibiting an increasing trend in age-standardized rate (ASR) from 2010 to 2019. Machine learning (ML) presents significant advantages in disease prediction, with several studies exploring its potential for STI prediction. The objective of this study is to build males-based and females-based STI risk prediction models based on the CatBoost algorithm using data from the National Health and Nutrition Examination Survey (NHANES) for training and validation, with sub-group analysis performed on each STI. The female sub-group also includes human papilloma virus (HPV) infection.
medical informatics
What problem does this paper attempt to address?