The Importance of Genomic Predictors for Clinical Outcome of Hematological Malignancies
Cunte Chen,Chengwu Zeng,Yangqiu Li
DOI: https://doi.org/10.1097/bs9.0000000000000075
2021-01-01
Blood Science
Abstract: Hematological malignancy is 1 of the top 10 malignant diseases with regard to cancer patient morbidity and mortality.1 Although hematopoietic stem cell transplantation, chemotherapy, and targeted therapy have made great progress in recent years, patients with hematological malignancies still have adverse clinical outcomes, particularly elderly patients.2,3 Therefore, it is necessary to identify an optimal prediction model for evaluating clinical outcome, which is important for devising therapeutic strategies for hematological diseases. Currently, sequencing technology can provide in-depth insights for the diagnosis, classification, prognostic evaluation, and therapeutic decision-making of patients with hematological malignancies.4 Recently, Twa et al reported that B-cell lymphoma 6 (BCL6) and/or programmed death ligand (PDL) 1/2 rearrangements can be used as genomic predictors for central nervous system relapse in primary testicular diffuse large B-cell lymphoma (DLBCL).5 Here, we discuss the importance and prospects of transcriptome data and genomic sequencing technology for evaluating and discovering genomic predictors of hematological malignancies.

Transcriptome and genomic sequencing data are enormous, and bioinformatics is needed to decipher them. Currently, statistical methods used to evaluate large gene expression and mutation datasets can be mainly divided into 2 categories: supervised and unsupervised learning (Fig. 1).
Supervised learning is used to identify genes related to known categories such as cancer type or clinical outcome, and unsupervised learning is used to explore the similarity of gene expression patterns.6 A large number of supervised learning methods have been used to explore hematological malignancies, including weighted voting, k-nearest neighbors, support vector machines, artificial neural networks, decision trees, random forest, and nearest shrunken centroid algorithms.7–10 For unsupervised learning, K-means clustering, principal component analysis, nonnegative matrix factorization, and weighted gene co-expression network analysis (WGCNA) have been widely used to investigate hematological malignancies.11–14 However, in biological exploration, no formal classifier is needed to observe the correlation between 2 genes or to investigate the effect of a single gene on patient prognosis.6

Figure 1: Schematic diagram of the commentary. The transcriptome data, genomic sequencing data, and follow-up information are obtained to perform unsupervised and supervised learning, and Kaplan–Meier (KM) curve analysis is then conducted. Next, publicly available databases or clinical samples are used to validate the results of bioinformatics analyses. Finally, based on the above results, survival analysis and risk stratification of patients with hematological malignancies are performed.

Transcriptome and genomic data have provided a reliable reference for finding prognosis-related genes for hematological malignancies. Of course, findings need to be further validated with external datasets (Fig. 1). Twa et al found that BCL6 and/or PDL 1/2 rearrangements can serve as genomic biomarkers for the clinical outcome of testicular DLBCL patients, which is particularly important given the limitations of clinical risk models for testicular DLBCL. However, Twa et al lacked an independent dataset for validation.5 Recently, we also performed research in this area.
We first obtained transcriptome data from the acute myeloid leukemia (AML) patients in the Cancer Genome Atlas (TCGA) database to perform unsupervised learning by WGCNA and identified 6 prognosis-related genes, LOC646762, CCND3, CBR1, C10orf54, CD97, and BLOC1S1, which could be used for the risk stratification of AML patients. Then, bone marrow (BM) samples from AML patients were obtained from our clinical center for validation by quantitative real-time polymerase chain reaction.15 Moreover, from bioinformatics analysis, we found that high expression of CD56 is associated with a favorable prognosis for intermediate-risk AML patients, and 2 other publicly available datasets were used to validate the prognostic importance of CD56.16 We also investigated the prognostic value of immune checkpoints (ICs) and BRD4 in AML patients from the publicly available TCGA database using correlation analysis and found the optimal combination of ICs/BRD4 that could predict the overall survival (OS) of AML patients. We then used BM samples to measure expression and validate the prognostic findings. This finding provides deep insights for designing combinational IC inhibitors or immuno-targeted therapy for AML patients.17,18 As is well known, bioinformatics analysis of exome sequencing data is also a promising direction for exploring the prognostic value of gene mutations in hematological malignancies, and we have also performed some exploration in this area. The tumor mutation burden (TMB) calculated with the 69-gene panel in our clinical center significantly correlates with the OS of DLBCL patients, and this could be confirmed by mutation data in the TCGA database.
Therefore, TMB may be a potential indicator for risk stratification for DLBCL patients in China.19

Although bioinformatics analysis of transcriptome and genomic data can provide us with evidence for predicting the prognosis of patients with hematological malignancies, additional validation data are needed to improve the accuracy of prediction and the feasibility of clinical application (Fig. 1). However, validation methods differ in how their results are interpreted, in the quality of evaluation, and in the credibility of the results. For example, publicly available databases often do not provide detailed clinical information on patients, whereas the patients in our clinical center have detailed information that may be used to conduct prognostic analysis. In addition, experimental validation of clinical samples can be more intuitive and reliable than validation from a database. However, samples from a single clinical center also have limitations, including a small sample size, and they may also carry regional or temporal biases. Therefore, the reliability of the results requires more validation and exploration in the future. It is worth noting that validation results from multicenter clinical trials, particularly randomized controlled studies from multiple countries, have promising clinical application value and can guide clinicians to manage and treat patients accurately.

It is known that clinical samples play a pivotal role in validating the prognostic importance of genomic predictors in hematological malignancies. Nevertheless, the histopathological type of the disease, the optimal sample type, and the availability of a simpler alternative sample type also play an essential role in validating results. Because myeloma, myelodysplastic syndrome, leukemia, and myeloproliferative neoplasms originate from the hematopoietic stem or progenitor cells in the BM, BM samples are the optimal choice for validating these diseases.
Similarly, because lymphoma originates in lymph nodes and lymphatic tissue, it is best to have in situ tissue for validation. However, in situ lymph nodes and lymphatic tissues are more difficult to obtain clinically, and peripheral blood samples are more readily available; thus, blood samples may be used as a substitute for in situ tissue for validation. Notably, researchers must conduct comparative studies to confirm that peripheral blood is an adequate substitute.

In conclusion, because transcriptome and genome sequencing generate a large amount of data, bioinformatics is needed to decipher their biological or prognostic significance and to provide a reliable reference for experimental validation. Notably, experimental validation of clinical samples can relatively accurately confirm the importance of genomic predictors for the prognosis of patients with hematological malignancies.5 These strategies may provide in-depth insights into treatment options and help manage patients through risk stratification.
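
The survival analysis and risk stratification step outlined in Fig. 1 can be illustrated with a minimal sketch. It is not the pipeline used in any of the cited studies: the function names, the median expression cutoff, and the toy cohort below are all assumptions introduced for illustration. The sketch splits patients into high and low expression groups for a single gene and computes a KM survival estimate for each group.

```python
from statistics import median


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as [(event_time, survival_probability)].

    times  -- follow-up duration per patient (any consistent unit)
    events -- 1 if death was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= leaving
    return curve


def stratify_by_median(expression):
    """Label each patient 'high' or 'low' relative to the cohort median.

    The median cutoff is an assumption; other cutoffs may be preferable.
    """
    cutoff = median(expression)
    return ["high" if value > cutoff else "low" for value in expression]


# Hypothetical toy cohort (not real patient data): months of follow-up,
# event flags, and expression values of one gene of interest.
toy_months = [5, 8, 12, 20, 24, 30]
toy_events = [1, 1, 0, 1, 0, 1]
toy_expression = [9.1, 8.4, 7.9, 3.2, 2.8, 2.5]

groups = stratify_by_median(toy_expression)
curves = {}
for label in ("high", "low"):
    idx = [i for i, g in enumerate(groups) if g == label]
    curves[label] = kaplan_meier(
        [toy_months[i] for i in idx],
        [toy_events[i] for i in idx],
    )
```

In practice, the two curves would typically be compared with a log-rank test and the cutoff confirmed in an independent validation cohort; both steps are omitted from this sketch.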