Clustering of Transcriptomic Data for the Identification of Cancer Subtypes

Xiaochun Chen, Honggang Wang, Donghui Yan
DOI: https://doi.org/10.48550/arXiv.1811.09926
2018-11-25
Applications
Abstract: Cancer is a group of related yet highly heterogeneous diseases. Correct identification of cancer subtypes is critical for clinical decisions. Advances in sequencing technologies have made it possible to study cancer based on abundant genomic and transcriptomic (-omics) data. Such a data-driven approach is expected to address the limitations of traditional methods for identifying cancer subtypes. We evaluate the suitability of clustering, a data mining tool for studying heterogeneous data when the subject matter is not sufficiently understood, for the identification of cancer subtypes. A number of popular clustering algorithms and their consensus are explored, and we find that cancer subtypes identified by consensus clustering agree well with clinical studies.