A Model of Selective Advantage for the Efficient Inference of Cancer Clonal Evolution

Daniele Ramazzotti
DOI: https://doi.org/10.48550/arXiv.1602.07614
IF: 5.414
2016-02-15
Machine Learning
Abstract:Recently, there has been a resurgence of interest in rigorous algorithms for the inference of cancer progression from genomic data. The motivations are manifold: (i) growing NGS and single cell data from cancer patients, (ii) need for novel Data Science and Machine Learning algorithms to infer models of cancer progression, and (iii) a desire to understand the temporal and heterogeneous structure of tumor to tame its progression by efficacious therapeutic intervention. This thesis presents a multi-disciplinary effort to model tumor progression involving successive accumulation of genetic alterations, each resulting populations manifesting themselves in a cancer phenotype. The framework presented in this work along with algorithms derived from it, represents a novel approach for inferring cancer progression, whose accuracy and convergence rates surpass the existing techniques. The approach derives its power from several fields including algorithms in machine learning, theory of causality and cancer biology. Furthermore, a modular pipeline to extract ensemble-level progression models from sequenced cancer genomes is proposed. The pipeline combines state-of-the-art techniques for sample stratification, driver selection, identification of fitness-equivalent exclusive alterations and progression model inference. Furthermore, the results are validated by synthetic data with realistic generative models, and empirically interpreted in the context of real cancer datasets; in the later case, biologically significant conclusions are also highlighted. Specifically, it demonstrates the pipeline's ability to reproduce much of the knowledge on colorectal cancer, as well as to suggest novel hypotheses. Lastly, it also proves that the proposed framework can be applied to reconstruct the evolutionary history of cancer clones in single patients, as illustrated by an example from clear cell renal carcinomas.
What problem does this paper attempt to address?