Machine learning-based ensemble approach in prediction of lung cancer predisposition using XRCC1 gene polymorphism

Abhishek Choudhary, Adarsh Anand, Amrita Singh, Pratima Roy, Navneet Singh, Vinay Kumar, Siddharth Sharma, Manoj Baranwal
DOI: https://doi.org/10.1080/07391102.2023.2242492
Abstract: Machine learning approaches have shown promising results in predicting cancer. In the current study, polymorphism data for five single nucleotide polymorphisms (SNPs) of the DNA repair gene XRCC1 (XRCC1 399, XRCC1 194, XRCC1 206, XRCC1 632, XRCC1 280) from the north Indian population, together with four smoking status variables, are used as input to the proposed ensemble model to predict an individual's susceptibility to lung cancer. The prediction accuracy of the proposed ensemble model for cancer predisposition was found to be 85%. Model performance was also evaluated using sensitivity, specificity, precision and the Gini index, with values in the range of 0.83-0.87. The proposed model outperformed each of the individual models (LM, SVM, RF, KNN and a baseline neural net) on all evaluation parameters. Collectively, the current results suggest the potential of the proposed ensemble model for predicting the risk of cancer based on XRCC1 SNP data. Communicated by Ramaswamy H. Sarma.
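
To make the evaluation terms in the abstract concrete, here is a minimal sketch of combining binary predictions from several base models and scoring the result. The abstract does not say how the ensemble combines its members, so the majority vote below is an assumption, as are the function names and the toy labels (which are made up for illustration and are not study data). The Gini index is left out because the abstract does not state which definition was used.

```python
def majority_vote(predictions_per_model):
    """Combine 0/1 predictions (one list per model) by simple majority.

    Assumption: the paper's actual combination rule is not given in the abstract.
    """
    combined = []
    for votes in zip(*predictions_per_model):
        # Predict 1 when more than half of the models vote 1.
        combined.append(int(sum(votes) * 2 > len(votes)))
    return combined


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluate(y_true, y_pred):
    """Compute accuracy, sensitivity, specificity and precision for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": safe_ratio(tp + tn, len(pairs)),
        "sensitivity": safe_ratio(tp, tp + fn),  # true positive rate
        "specificity": safe_ratio(tn, tn + fp),  # true negative rate
        "precision": safe_ratio(tp, tp + fp),
    }


# Toy example: 1 = case, 0 = control; five hypothetical base models.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
base_predictions = [
    [1, 1, 0, 0, 0, 1, 1, 0],
    [1, 0, 1, 0, 1, 0, 1, 0],
    [1, 1, 1, 0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0, 1, 1, 0],
    [1, 1, 0, 0, 0, 0, 1, 1],
]
ensemble_prediction = majority_vote(base_predictions)
metrics = evaluate(y_true, ensemble_prediction)
```

With an odd number of base models, a binary majority vote cannot tie, which is one reason the sketch uses five.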