A Novel Machine Learning-Based Predictive Model of Clinically Significant Prostate Cancer and Online Risk Calculator

Flavio Vasconcelos Ordones, Paulo Roberto Kawano, Lodewikus Vermeulen, Ali Hooshyari, David Scholtz, Peter John Gilling, Darren Foreman, Basil Kaufmann, Cedric Poyet, Michael Gorin, Abner Macola Pacheco Barbosa, Naila Camila da Rocha, Luis Gustavo Modelli de Andrade
DOI: https://doi.org/10.1016/j.urology.2024.11.001
IF: 1.555
2024-11-17
Urology journal
Abstract:
Objectives: To create a machine learning predictive model that combines PI-RADS score, PSA density, and clinical variables to predict clinically significant prostate cancer (csPCa).
Methods: We evaluated a cohort of patients who underwent prostate biopsy for suspected prostate cancer (PCa) in New Zealand, Australia, and Switzerland. We collected data on age, body mass index (BMI), PSA level, prostate volume, PSA density (PSAD), PI-RADS score, previous biopsy, and the corresponding histology results. The dataset was randomly split into derivation (training) and validation (test) sets. An independent dataset was obtained from the Harvard Dataverse for external validation. A cohort of 1272 patients was analyzed. We fitted Lasso, XGBoost, and LightGBM models to the training set and assessed their accuracy.
Results: The models achieved ROC AUC values ranging from 0.830 to 0.851. LightGBM was considered the superior model, with a ROC AUC of 0.851 [95% CI: 0.804–0.897] in the test set and 0.818 [95% CI: 0.798–0.831] in the external dataset. The most important variable was PI-RADS, followed by PSA density, history of previous biopsy, age, and BMI.
Conclusions: We developed a predictive model for detecting csPCa that achieved a high ROC AUC in both internal and external validation. This suggests that integrating the clinical parameters outperforms any individual predictor. The model also showed good calibration metrics, indicating a better-balanced model than existing ones.
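Two quantities in the abstract can be illustrated with a minimal sketch: PSA density (PSA level divided by prostate volume) and a ROC AUC reported with a 95% confidence interval. The abstract does not say how its intervals were computed, so the percentile bootstrap below is an assumption, and the function names and example values are hypothetical and not taken from the study.

```python
import random


def psa_density(psa_ng_ml, prostate_volume_ml):
    """PSA density: serum PSA (ng/mL) divided by prostate volume (mL)."""
    if prostate_volume_ml <= 0:
        raise ValueError("prostate volume must be positive")
    return psa_ng_ml / prostate_volume_ml


def roc_auc(labels, scores):
    """ROC AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method,
    not necessarily the one used in the paper)."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    # Synthetic values for illustration only.
    print(psa_density(6.0, 40.0))
    y_true = [0, 0, 1, 1, 0, 1, 0, 1]
    y_score = [0.10, 0.40, 0.35, 0.80, 0.20, 0.90, 0.55, 0.60]
    print(roc_auc(y_true, y_score))
    print(bootstrap_auc_ci(y_true, y_score, n_boot=500))
```

In practice the scores would be the predicted probabilities from a fitted model (Lasso, XGBoost, or LightGBM in the study) evaluated on the held-out test set or the external dataset.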
urology & nephrology