SYNPRED: prediction of drug combination effects in cancer using different synergy metrics and ensemble learning
António J Preto,Pedro Matos-Filipe,Joana Mourão,Irina S Moreira
DOI: https://doi.org/10.1093/gigascience/giac087
IF: 7.658
2022-09-26
GigaScience
Abstract:Background: In cancer research, high-throughput screening technologies produce large amounts of multiomics data from different populations and cell types. However, analysis of such data encounters difficulties due to disease heterogeneity, further exacerbated by human biological complexity and genomic variability. The specific profile of cancer as a disease (or, more realistically, a set of diseases) urges the development of approaches that maximize the effect while minimizing the dosage of drugs. Now is the time to redefine the approach to drug discovery, bringing an artificial intelligence (AI)-powered informational view that integrates the relevant scientific fields and explores new territories. Results: Here, we show SYNPRED, an interdisciplinary approach that leverages specifically designed ensembles of AI algorithms, as well as links omics and biophysical traits to predict anticancer drug synergy. It uses 5 reference models (Bliss, Highest Single Agent, Loewe, Zero Interaction Potency, and Combination Sensitivity Score), which, coupled with AI algorithms, allowed us to attain the ones with the best predictive performance and pinpoint the most appropriate reference model for synergy prediction, often overlooked in similar studies. By using an independent test set, SYNPRED exhibits state-of-the-art performance metrics either in the classification (accuracy, 0.85; precision, 0.91; recall, 0.90; area under the receiver operating characteristic, 0.80; and F1-score, 0.91) or in the regression models, mainly when using the Combination Sensitivity Score synergy reference model (root mean square error, 11.07; mean squared error, 122.61; Pearson, 0.86; mean absolute error, 7.43; Spearman, 0.87). Moreover, data interpretability was achieved by deploying the most current and robust feature importance approaches. A simple web-based application was constructed, allowing easy access by nonexpert researchers. Conclusions: The performance of SYNPRED rivals that of the existing methods that tackle the same problem, yielding unbiased results trained with one of the most comprehensive datasets available (NCI ALMANAC). The leveraging of different reference models allowed deeper insights into which of them can be more appropriately used for synergy prediction. The Combination Sensitivity Score clearly stood out with improved performance among the full scope of surveyed approaches and synergy reference models. Furthermore, SYNPRED takes a particular focus on data interpretability, which has been in the spotlight lately when using the most advanced AI techniques.