Universal penalized regression (Elastic-net) model with differentially methylated promoters for oral cancer prediction

Shantanab Das, Saikat Karuri, Joyeeta Chakraborty, Baidehi Basu, Aditi Chandra, S Aravindan, Anirvan Chakraborty, Debashis Paul, Jay Gopal Ray, Matt Lechner, Stephan Beck, E Andrew Teschendorff, Raghunath Chatterjee
DOI: https://doi.org/10.1186/s40001-024-02047-4
2024-09-11
Abstract: Background: DNA methylation has shown notable potential as a diagnostic marker in many cancers. Many studies have proposed DNA methylation biomarkers for detecting oral squamous cell carcinoma (OSCC), but most are limited to specific cohorts or geographical locations. The generalizability of DNA methylation as a diagnostic marker for oral cancer across different geographical locations has yet to be investigated.
Methods: We used genome-wide methylation data from 384 oral cavity cancer and normal tissue samples from the TCGA HNSCC cohort and from eastern India. The differentially methylated CpGs common to these two cohorts were used to develop an Elastic-net model for the diagnosis of OSCC. The model was validated using 812 HNSCC and normal samples from different anatomical sites of the oral cavity from seven countries. Droplet digital PCR of methyl-sensitive restriction enzyme digested DNA (ddMSRE) was used to quantify methylation and to validate the model with 22 OSCC and 22 contralateral normal samples. Additionally, pyrosequencing was used to validate the model using 46 OSCC, 25 adjacent normal, and 21 contralateral normal tissue samples.
Results: With ddMSRE, our model showed 91% sensitivity, 100% specificity, and 95% accuracy in distinguishing OSCC from contralateral normal tissues. Validation with pyrosequencing showed 96% sensitivity, 91% specificity, and 93% accuracy for distinguishing OSCC from contralateral normal samples, whereas for adjacent normal samples we found similar sensitivity but only 20% specificity, suggesting the presence of an early disease methylation signature in the adjacent normal samples. Methylation array data of HNSCC and normal tissues from different geographical locations and different anatomical sites showed comparable sensitivity, specificity, and accuracy in detecting oral cavity cancer. Similar results were also observed for different stages of oral cavity cancer.
Conclusions: Our model identified crucial genomic regions affected by DNA methylation in OSCC and showed similar accuracy in detecting oral cancer across different geographical locations. The higher specificity of the model in distinguishing oral cancer from contralateral normal samples than from adjacent normal samples suggests that the model may be applicable to early detection.
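The sensitivity, specificity, and accuracy figures quoted in the abstract follow from a standard two-by-two confusion matrix. The sketch below shows that arithmetic for the ddMSRE validation set of 22 OSCC and 22 contralateral normal samples. The per-cell counts (20 true positives, 2 false negatives, 22 true negatives, 0 false positives) are not given in the abstract. They are an assumption chosen because they are consistent with the reported percentages after rounding, and the names `ConfusionCounts` and `diagnostic_metrics` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # tumour samples classified as tumour
    fn: int  # tumour samples classified as normal
    tn: int  # normal samples classified as normal
    fp: int  # normal samples classified as tumour


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return sensitivity, specificity, and accuracy as fractions in [0, 1]."""
    total = c.tp + c.fn + c.tn + c.fp
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # true positive rate
        "specificity": c.tn / (c.tn + c.fp),  # true negative rate
        "accuracy": (c.tp + c.tn) / total,    # overall fraction correct
    }


# ASSUMED counts for the ddMSRE set (22 OSCC, 22 contralateral normal).
# Only the sample sizes and percentages come from the abstract.
ddmsre = ConfusionCounts(tp=20, fn=2, tn=22, fp=0)
metrics = diagnostic_metrics(ddmsre)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under these assumed counts the three values round to 91%, 100%, and 95%, matching the ddMSRE results reported above. The function does not guard against an empty class (for example, a validation set with no normal samples), which would need separate handling.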