Abstract A067: A single sample classifier of Bailey's molecular subtype of PDAC

Taisuke Baba, Masaki Sunagawa, Junpei Yamaguchi, Toshio Kokuryo, Tomoki Ebata
DOI: https://doi.org/10.1158/1538-7445.panca2023-a067
IF: 11.2
2024-01-18
Cancer Research
Abstract: Transcriptomic molecular subtyping of pancreatic ductal adenocarcinoma (PDAC) is expected to be an important part of future personalized medicine. However, most classifiers rely on relative expression patterns across the dataset. Because batch correction methods are still under development, researchers cannot know whether the subtyping of their own dataset is accurate, which has made many of them hesitant to perform studies involving molecular subtyping. PurIST, developed by Rashid et al., is a robust classifier of the Moffitt tumor subtypes. It calculates a score from the expression pattern within a sample and predicts a specific subtype. Unlike traditional cluster-based subtyping, the predicted subtype of a sample does not change with the cohort it is analyzed in. Because such methods are independent of cohort composition and can make a prediction for even a single sample, they are called single sample classifiers (SSCs). Most SSCs absorb batch differences by making the information fuzzy, which sacrifices predictive ability. In this study, we developed a single sample classifier called inverse pair boosting (IPB). Our approach does not discard information; instead, it magnifies the differences between subtypes so that the differences between batches become relatively negligible. In Moffitt's tumor classification, classical tumors have high expression of classical genes and low expression of basal-like genes, whereas basal-like tumors show the opposite pattern. This kind of relationship, in which one side goes up while the other goes down, is widely observed in gene expression patterns. We used it to enhance the differences between subtypes and overcome batch differences. We developed a machine learning model with the IPB method based on Bailey's original RNA-seq data (PACA-AU RNA-seq) and tested its robustness on PACA-AU microarray data.
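The idea of scoring a tumor from expression relationships inside a single sample can be illustrated with a minimal sketch. This is not the authors' IPB model and not PurIST: the gene names, the pair list, and the majority-vote rule are hypothetical placeholders, chosen only to show why a within-sample comparison of oppositely regulated genes does not depend on the rest of the cohort.

```python
import math

# Hypothetical gene pairs: (gene expected to be higher in classical tumors,
# gene expected to be higher in basal-like tumors). Placeholder names only;
# these are NOT the pairs used by PurIST or by IPB.
GENE_PAIRS = [("GENE_C1", "GENE_B1"), ("GENE_C2", "GENE_B2"), ("GENE_C3", "GENE_B3")]


def pair_features(sample):
    """Return the log2 ratio of each pair, computed within one sample only."""
    return [
        math.log2(sample[c] + 1.0) - math.log2(sample[b] + 1.0)
        for c, b in GENE_PAIRS
    ]


def classify(sample):
    """Majority vote over pairs: a positive log ratio votes for 'classical'."""
    votes = sum(1 for f in pair_features(sample) if f > 0)
    return "classical" if votes * 2 > len(GENE_PAIRS) else "basal-like"


# Toy expression values (arbitrary units), invented for illustration.
tumor_a = {"GENE_C1": 50, "GENE_B1": 5, "GENE_C2": 30, "GENE_B2": 8,
           "GENE_C3": 4, "GENE_B3": 9}
tumor_b = {"GENE_C1": 3, "GENE_B1": 40, "GENE_C2": 6, "GENE_B2": 25,
           "GENE_C3": 7, "GENE_B3": 2}

# Crude stand-in for a platform or batch effect: one global scale factor.
tumor_a_rescaled = {gene: value * 10 for gene, value in tumor_a.items()}

print(classify(tumor_a), classify(tumor_b), classify(tumor_a_rescaled))
```

Because each feature compares two genes from the same sample, a global rescaling of that sample leaves the sign of every feature, and therefore the call, unchanged. Real batch effects are gene-specific and far less benign than a single scale factor, which is the gap that methods such as the one described here aim to close.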
Although there is a large gap between RNA-seq data and microarray data due to their different modalities, the accuracy of Bailey's subtype prediction on microarray data was 0.86, which was much higher than that of other SSCs (accuracy = 0.60-0.67). We developed a single sample classifier of Bailey's molecular subtype of PDAC using a novel approach called IPB. Our AI model can make a significant contribution to the development of research on molecular subtypes of PDAC.
Citation Format: Taisuke Baba, Masaki Sunagawa, Junpei Yamaguchi, Toshio Kokuryo, Tomoki Ebata. A single sample classifier of Bailey's molecular subtype of PDAC [abstract]. In: Proceedings of the AACR Special Conference in Cancer Research: Pancreatic Cancer; 2023 Sep 27-30; Boston, Massachusetts. Philadelphia (PA): AACR; Cancer Res 2024;84(2 Suppl) nr A067.
oncology