Using somatic mutation data to test tumors for clonal relatedness

Irina Ostrovnaya, Venkatraman E. Seshan, Colin B. Begg
DOI: https://doi.org/10.1214/15-AOAS836
2015-11-17
Abstract: A major challenge for cancer pathologists is to determine whether a new tumor in a patient with cancer is a metastasis or an independent occurrence of the disease. In recent years numerous studies have evaluated pairs of tumor specimens to examine the similarity of the somatic characteristics of the tumors and to test for clonal relatedness. As the landscape of mutation testing has evolved, a number of statistical methods for determining clonality have developed, notably for comparing losses of heterozygosity at candidate markers, and for comparing copy number profiles. Increasingly tumors are being evaluated for point mutations in panels of candidate genes using gene sequencing technologies. Comparison of the mutational profiles of pairs of tumors presents unusual methodological challenges: mutations at some loci are much more common than others; knowledge of the marginal mutation probabilities is scanty for most loci at which mutations might occur; the sample space of potential mutational profiles is vast. We examine this problem and propose a test for clonal relatedness of a pair of tumors from a single patient. Using simulations, its properties are shown to be promising. The method is illustrated using several examples from the literature.
Applications, Quantitative Methods
What problem does this paper attempt to address?
This paper addresses how to use somatic mutation data to test whether tumors are clonally related. Specifically, it develops a statistical method that compares the mutational profiles of two tumors from the same patient to determine whether both derive from the same clonal cell population, that is, whether one tumor is a metastasis of the other or the two arose independently.

### Background and Challenges

1. **Clinical Requirement**: Distinguishing a metastasis of the primary tumor from a new independent tumor is crucial for planning treatment. Traditional diagnosis relies mainly on the histological characteristics of the tumor cells, but this approach has limitations.
2. **Evidence at the Molecular Level**: With the development of gene sequencing technology, more and more studies use mutation data to evaluate the clonal relatedness of tumors. In particular, comparing the point mutation profiles of two tumors makes it possible to determine more accurately whether they share a clonal origin.
3. **Methodological Challenges**:
   - **Differences in Mutation Frequencies**: Mutation frequencies vary greatly across sites. Mutations at some sites are very common, while mutations at others are extremely rare.
   - **Uncertainty in Mutation Probabilities**: For most sites where mutations may occur, little is known about the marginal mutation probabilities.
   - **Huge Sample Space**: The space of potential mutational profiles is vast and difficult to define precisely.

### Research Objectives

1. **Propose a New Statistical Testing Method**: The authors propose a statistical test based on somatic mutation data to evaluate whether two tumors are clonally related.
2. **Verify the Effectiveness of the Method**: Simulation experiments assess the performance of the method under different conditions, including the influence of errors in the estimated mutation frequencies and of correlation between mutation sites.
3. **Example of Practical Application**: The method is demonstrated by analyzing mutation data from actual cases.

### Method Overview

1. **Assumptions and Symbols**:
   - \(n\): The total number of potential mutation sites.
   - \(m\): The number of mutation sites actually observed in the two tumors.
   - \(p_i\): The probability of mutation at the \(i\)-th site.
   - \(\xi\): The probability that a mutation occurred at the clonal stage. \(\xi = 0\) corresponds to independent tumors, and \(\xi > 0\) measures the strength of the clonal signal.
2. **Test Statistics**:
   - Unconditional test statistic \(S_u\):
     \[ S_u=\sum_{i \in A} \log \left(\frac{\hat{\xi}}{1 - \hat{\xi} p_i^{-1}+1}\right)-\sum_{i \in E} \log \left(\frac{\hat{\xi}}{1 - \hat{\xi}(1 - p_i)^{-1}+1}\right)+\sum_{i \in D} \log \left(\frac{\hat{\xi}}{1 - \hat{\xi}(1 - p_i)^{-1}+1 - \hat{\xi}}\right) \]
   - Conditional test statistic \(S_c\):
     \[ S_c=\sum_{i \in A} \log \left(\frac{\hat{\xi}}{1 - \hat{\xi} p_i^{-1}+1}\right)-\sum_{i \in E} \log \left(\frac{\hat{\xi}}{1 - \hat{\xi}(2 - p_i)^{-1}+1}\right) \]
3. **Generation of Reference Distributions**:
   - Reference distributions for the unconditional and conditional test statistics are generated by random simulation and used to calculate p-values and critical values.

### Results and Discussion

1. **There are...**
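
The reference-distribution step described in the Method Overview can be illustrated with a minimal Python sketch. It is not the authors' implementation. The statistic `match_score` is a simplified stand-in (the sum of \(-\log p_i\) over sites mutated in both tumors), not \(S_u\) or \(S_c\), and the probabilities `p` and the two tumor profiles are hypothetical values chosen only for illustration. The sketch assumes sites mutate independently with known \(p_i\) and estimates a Monte Carlo p-value under independence (\(\xi = 0\)).

```python
import math
import random

def match_score(t1, t2, p):
    """Stand-in statistic (an assumption, not S_u or S_c):
    sum of -log(p_i) over sites mutated in both tumors,
    so shared rare mutations count more than shared common ones."""
    return sum(-math.log(p[i]) for i in range(len(p)) if t1[i] and t2[i])

def simulate_tumor(p, rng):
    """Draw one mutational profile with independent Bernoulli(p_i) sites."""
    return [rng.random() < pi for pi in p]

def reference_pvalue(t1, t2, p, n_sim=2000, seed=0):
    """Monte Carlo p-value under the independence hypothesis (xi = 0)."""
    rng = random.Random(seed)
    observed = match_score(t1, t2, p)
    exceed = 0
    for _ in range(n_sim):
        # Simulate a pair of independent tumors and score the pair.
        s = match_score(simulate_tumor(p, rng), simulate_tumor(p, rng), p)
        if s >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_sim + 1)

# Hypothetical marginal mutation probabilities for n = 6 sites.
p = [0.30, 0.10, 0.05, 0.02, 0.01, 0.01]
# Hypothetical observed profiles: sites 1 and 4 are mutated in both tumors.
tumor_a = [True, False, True, True, False, False]
tumor_b = [True, False, False, True, False, False]

pval = reference_pvalue(tumor_a, tumor_b, p)
print(f"score={match_score(tumor_a, tumor_b, p):.3f}, p-value={pval:.4f}")
```

A small p-value here means that independent tumors rarely produce a match score as large as the observed one. In practice the \(p_i\) must themselves be estimated, which is one of the sources of error the simulation studies summarized above are said to examine.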