Overcoming the limitations of patch-based learning to detect cancer in whole slide images

Ozan Ciga, Tony Xu, Sharon Nofech-Mozes, Shawna Noy, Fang-I Lu, Anne L. Martel
DOI: https://doi.org/10.48550/arXiv.2012.00617
2020-12-02
Abstract: Whole slide images (WSIs) pose unique challenges when training deep learning models. They are very large, which makes it necessary to break each image down into smaller patches for analysis; image features have to be extracted at multiple scales in order to capture both detail and context; and extreme class imbalances may exist. Significant progress has been made in the analysis of these images, largely thanks to the availability of public annotated datasets. We postulate, however, that even if a method scores well on a challenge task, this success may not translate to good performance in a more clinically relevant workflow. Many datasets consist of image patches which may suffer from data curation bias; other datasets are only labelled at the whole slide level, and the lack of annotations across an image may mask erroneous local predictions so long as the final decision is correct. In this paper, we outline the differences between patch or slide-level classification versus methods that need to localize or segment cancer accurately across the whole slide, and we experimentally verify that best practices differ in both cases. We apply a binary cancer detection network on post neoadjuvant therapy breast cancer WSIs to find the tumor bed outlining the extent of cancer, a task which requires sensitivity and precision across the whole slide. We extensively study multiple design choices and their effects on the outcome, including architectures and augmentations. Furthermore, we propose a negative data sampling strategy, which drastically reduces the false positive rate (7% on slide level) and improves each metric pertinent to our problem, with a 15% reduction in the error of tumor extent.
Image and Video Processing, Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
The paper addresses the challenges of detecting cancer in whole slide images (WSIs). It focuses on the following aspects:

1. **Large-scale image processing**: WSIs are very large and must be broken down into smaller patches for analysis. Image features have to be extracted at multiple scales to capture both detail and context.
2. **Class imbalance**: Extreme class imbalance may exist in WSIs when training deep learning models; that is, some classes have far more samples than others.
3. **Data bias**: Many publicly available datasets consist of image patches annotated by experts, and these datasets may suffer from data curation bias. For example, training and validation datasets are usually collected by the same experts or under the same guidelines, resulting in a higher proportion of positive-class samples (such as cancer tissue).
4. **Applicability to clinical workflows**: Even if a method scores well on a challenge task, this success may not translate into good performance in a clinically relevant workflow. Many datasets are labelled only at the whole-slide level, and the lack of detailed annotations may mask erroneous local predictions as long as the final decision is correct.

To overcome these problems, the paper proposes a negative data sampling strategy that improves model performance by reducing the false-positive rate. It also studies the effect of design choices (such as architectures and augmentations) and verifies the findings experimentally on post-neoadjuvant therapy (NAT) breast cancer WSIs.
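The patch decomposition and multi-scale context mentioned in point 1 can be illustrated with a short sketch. This is not the authors' pipeline: the function names `tile_coordinates` and `context_window`, the patch size, and the stride are all hypothetical choices made for illustration, and the code only computes coordinates rather than reading pixels from a slide.

```python
from typing import Iterator, Tuple


def tile_coordinates(width: int, height: int,
                     patch_size: int, stride: int) -> Iterator[Tuple[int, int]]:
    """Yield top-left (x, y) corners of full patches in a sliding-window grid.

    Patches that would extend past the image border are skipped (an
    assumption of this sketch; padding is another common choice).
    """
    if patch_size <= 0 or stride <= 0:
        raise ValueError("patch_size and stride must be positive")
    for y in range(0, height - patch_size + 1, stride):
        for x in range(0, width - patch_size + 1, stride):
            yield x, y


def context_window(x: int, y: int,
                   patch_size: int, scale: int) -> Tuple[int, int, int]:
    """Return (left, top, side) of a square `scale` times larger than the
    patch and sharing its centre, i.e. a lower-magnification context view.

    Coordinates may be negative near the border; a real reader would need
    to pad or clamp them.
    """
    side = patch_size * scale
    cx, cy = x + patch_size // 2, y + patch_size // 2
    return cx - side // 2, cy - side // 2, side


if __name__ == "__main__":
    # Hypothetical 1024 x 512 region, non-overlapping 256-pixel patches.
    for x, y in tile_coordinates(1024, 512, 256, 256):
        print((x, y), context_window(x, y, 256, 2))
```

A stride smaller than the patch size gives overlapping windows, which increases the number of patches per slide and therefore the cost of whole-slide inference.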
The main contributions of the paper are:

- **Negative data sampling strategy**: A negative sampling method based on feature clustering is proposed, which significantly reduces the false-positive rate (from 7% to 2%) and improves the accuracy of tumor-extent estimation.
- **Multi-task comparative study**: Best practices for patch-level classification, slide-level classification, and slide-level segmentation are compared and found to differ significantly.
- **Model complexity and augmentation**: The effect of model complexity and image augmentation on task performance is studied; the EfficientNet-B3 model is found to offer the best bias-variance trade-off in sliding-window tasks.

In summary, the paper aims to improve the accuracy and robustness of cancer detection in WSIs through a better negative sampling strategy and optimized model design.
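The idea of sampling negatives by feature clustering can be sketched as follows. The summary does not describe the paper's exact procedure, so everything here is an assumption: negative-patch feature vectors are grouped with plain k-means, and an equal number of patches is then drawn from each cluster so that rare kinds of non-cancer tissue are not swamped by common ones. The names `kmeans_assign` and `sample_negatives` are hypothetical.

```python
import random
from typing import Dict, List, Sequence

Vector = Sequence[float]


def _sq_dist(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_assign(features: List[Vector], k: int, iters: int = 20) -> List[int]:
    """Cluster feature vectors with plain k-means; return one cluster id per vector."""
    if not 1 <= k <= len(features):
        raise ValueError("k must be between 1 and len(features)")
    # Farthest-point initialisation: deterministic and spreads the centres out.
    centres = [list(features[0])]
    while len(centres) < k:
        far = max(features, key=lambda f: min(_sq_dist(f, c) for c in centres))
        centres.append(list(far))
    assign = [0] * len(features)
    for _ in range(iters):
        # Assignment step: nearest centre for every feature vector.
        assign = [min(range(k), key=lambda j: _sq_dist(f, centres[j]))
                  for f in features]
        # Update step: move each centre to the mean of its members.
        for j in range(k):
            members = [f for f, a in zip(features, assign) if a == j]
            if members:  # keep the old centre if a cluster went empty
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def sample_negatives(assign: List[int], per_cluster: int, seed: int = 0) -> List[int]:
    """Draw up to `per_cluster` patch indices from every cluster."""
    rng = random.Random(seed)
    by_cluster: Dict[int, List[int]] = {}
    for idx, cluster in enumerate(assign):
        by_cluster.setdefault(cluster, []).append(idx)
    chosen: List[int] = []
    for cluster in sorted(by_cluster):
        members = by_cluster[cluster]
        chosen.extend(rng.sample(members, min(per_cluster, len(members))))
    return chosen


if __name__ == "__main__":
    # Toy 2-D "features" for six negative patches forming two obvious groups.
    feats = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
             (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    clusters = kmeans_assign(feats, k=2)
    print(clusters, sample_negatives(clusters, per_cluster=2))
```

In practice the feature vectors would come from a trained network and a library clustering routine would replace the hand-written k-means; the pure-Python version is used here only to keep the sketch self-contained.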