Generalizability of Self-Supervised Training Models for Digital Pathology: A Multicountry Comparison in Colorectal Cancer

Zhuchen Shao, Liuxi Dai, Jitendra Jonnagaddala, Yang Chen, Yifeng Wang, Zijie Fang, Yongbing Zhang
DOI: https://doi.org/10.1200/cci.22.00178
2023-01-01
JCO Clinical Cancer Informatics
Abstract:
PURPOSE: In this multicountry study, we aim to explore the effectiveness of self-supervised learning (SSL) in colorectal cancer (CRC)-related predictive tasks using a large amount of unlabeled digital pathology imaging data.
METHODS: We adopted SimSiam to conduct self-supervised pretraining on two large whole-slide image CRC data sets from the United States and Australia. The SSL-pretrained encoder was then used in several predictive tasks, including supervised predictive tasks (tissue classification and microsatellite instability versus microsatellite stability classification) and weakly supervised predictive tasks (polyp type classification and adenoma grading, and 5-year survival prediction). Performance on the tasks was compared between models using SSL pretraining and those using ImageNet pretraining, and performance for one-country pretraining was compared with two-country pretraining.
RESULTS: We demonstrate that SSL pretraining outperforms ImageNet pretraining in predictive tasks: by 3.01% in F1 score on average over supervised predictive tasks and by 1.53% in AUC on average over weakly supervised predictive tasks. Furthermore, two-country SSL pretraining has shown more stable performance than single-country pretraining: two-country pretraining outperforms at least one of the single-country pretrainings by 1.93% in F1 score on average over supervised predictive tasks and by 1.36% in AUC on average over weakly supervised predictive tasks.
CONCLUSION: We find that using unlabeled image data for SSL pretraining in CRC-related tasks is more effective than using ImageNet pretraining. Furthermore, SSL pretraining using data from multiple countries achieves more stable performance and better generalization than single-country pretraining.
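For readers unfamiliar with SimSiam, the pretraining method named in METHODS, the following is a minimal sketch of its training objective as generally described for that method: a symmetric negative cosine similarity between the predictor output of one augmented view and the encoder output of the other view. It is an illustration only and not the authors' implementation. The function names `negative_cosine` and `simsiam_loss` are hypothetical, and plain Python lists stand in for the embedding vectors. An actual training run would presumably use an automatic-differentiation framework, with the stop-gradient applied to the encoder outputs `z1` and `z2`.

```python
import math


def negative_cosine(p, z):
    """Return -cos(p, z) for two equal-length vectors.

    In SimSiam, z is treated as a constant (stop-gradient) during
    backpropagation. Plain Python has no gradients, so that step is
    only noted here, not implemented.
    """
    dot = sum(a * b for a, b in zip(p, z))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_z = math.sqrt(sum(b * b for b in z))
    return -dot / (norm_p * norm_z)


def simsiam_loss(p1, z1, p2, z2):
    """Symmetric loss over two augmented views of the same image patch.

    p1, p2: predictor outputs for view 1 and view 2 (hypothetical inputs)
    z1, z2: encoder outputs for view 1 and view 2 (hypothetical inputs)
    """
    return 0.5 * negative_cosine(p1, z2) + 0.5 * negative_cosine(p2, z1)


if __name__ == "__main__":
    # Toy embeddings: the two views agree closely, so the loss is near -1.
    view1 = [0.9, 0.1, 0.0]
    view2 = [1.0, 0.0, 0.1]
    print(simsiam_loss(view1, view1, view2, view2))
```

The loss lies between -1 (the two views map to perfectly aligned embeddings) and 1 (the embeddings point in opposite directions), so minimizing it pulls the representations of two augmentations of the same patch together without requiring any labels.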