A Supervised Machine Learning Approach with Feature Selection for Sex-Specific Biomarker Prediction

Luke Meyer, Danielle Mulder, Joshua Wallace
DOI: https://doi.org/10.1101/2024.06.06.597741
2024-06-07
Abstract: Biomarkers play a crucial role in various aspects of healthcare, offering valuable insights into disease diagnosis, prognosis, and treatment selection. Recently, machine learning (ML) techniques have emerged as effective tools for uncovering novel biomarkers and improving predictive modelling capabilities. However, bias within ML algorithms, particularly regarding sex-based disparities, remains a concern. In this study, a supervised ML model was developed to predict 9 common biomarkers widely used in clinical settings: triglycerides, body mass index (BMI), waist circumference, systolic blood pressure, blood glucose, uric acid, urinary albumin-to-creatinine ratio, high-density lipoproteins and albuminuria. During validation, the ML models predicted values within 5% and 10% error of the actual values. Out of the 121 female individuals tested, the following percentages of predicted values fell within the 10% range: 93% for albuminuria, 86% for waist circumference, 76% for BMI, and the lowest, 64%, for both systolic blood pressure and blood glucose. For the 119 male individuals tested, the percentages were as follows: 92% for albuminuria, 96% for waist circumference, 91% for BMI, 74% for blood glucose, and 68% for systolic blood pressure. For triglycerides, uric acid, urinary albumin-to-creatinine ratio and high-density lipoproteins, fewer than 50% of predicted values fell within this range in both the male and female subgroups. Overall, the male subgroup had higher prediction scores than the female subgroup.
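The "percentage of predicted values within 10% error" reported in the abstract can be illustrated with a short sketch. The abstract does not state how the error is computed, so this assumes relative error, |predicted - actual| / |actual|. The function name `fraction_within_tolerance` and the sample values are hypothetical and are not taken from the study.

```python
def fraction_within_tolerance(actual, predicted, tolerance=0.10):
    """Return the share of predictions whose relative error is <= tolerance.

    Assumption: error is measured relative to the actual value. The
    abstract does not specify the error definition used by the authors.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")

    hits = 0
    for a, p in zip(actual, predicted):
        if a == 0:
            # Relative error is undefined for a zero reference value;
            # count the prediction only if it is exactly zero as well.
            hits += int(p == 0)
        elif abs(p - a) / abs(a) <= tolerance:
            hits += 1
    return hits / len(actual)


# Hypothetical waist circumference values in cm, invented for illustration.
actual = [80.0, 95.0, 102.0, 88.0]
predicted = [83.0, 88.0, 120.0, 88.5]

share_10 = fraction_within_tolerance(actual, predicted, tolerance=0.10)
share_5 = fraction_within_tolerance(actual, predicted, tolerance=0.05)
print(f"within 10%: {share_10:.0%}, within 5%: {share_5:.0%}")
```

Computing this share separately for the female and male subgroups, per biomarker, would give figures comparable in form to those quoted in the abstract.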
Systems Biology