Whole slide semantic segmentation: large scale active learning for digital pathology

Jonathan Folmsbee, Margaret Brandwein-Weber, Scott Doyle
DOI: https://doi.org/10.1117/12.2581229
2021-02-15
Abstract: Deep learning for digital pathology is a challenging problem. Small patient datasets limit the generalizability of trained deep learning models, while the large size of whole slide images (WSIs) is a bottleneck for training. Additionally, annotations are difficult to obtain at scale because of the image size and the volume of samples needed for accurate and generalizable training. We have investigated the use of Active Learning (AL) to alleviate this burden; AL is a training approach in which a small subset of samples is used to create a bootstrap classifier, which in turn selects new samples for annotation so as to maximize the performance gain from each additional training sample. In our previous work, we found AL to be more efficient than the more common Random Learning (RL) approach in terms of segmentation performance per training sample. In the current work, we extend our investigation of AL by using our region-of-interest (ROI) trained classifier to perform WSI-level segmentation of multiple classes. We compare the results of AL-based and RL-based training, and generate inference results for a dataset of 75 WSIs spanning 61 patients. After four rounds of training, AL yielded a validation loss 0.566 lower than RL, as well as Dice coefficients an average of 0.022 higher for classes present in the images of the holdout testing set. This work demonstrates that AL generalizes from patch-based segmentation to WSI-based segmentation, and provides a path forward for the rapid development of complex digital pathology datasets in deep learning.
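The contrast between AL and RL sample selection described in the abstract can be illustrated with a minimal sketch. The abstract does not state which selection criterion the bootstrap classifier uses, so the mean per-pixel entropy used here is an assumption (a common uncertainty-sampling choice), and the function names `mean_entropy`, `select_active`, `select_random`, and `dice` are hypothetical. The Dice helper returns `None` when a class is absent from both prediction and ground truth, mirroring the abstract's restriction to classes present in the images.

```python
import math
import random


def mean_entropy(pixel_probs):
    """Mean per-pixel entropy of one sample.

    pixel_probs: list of per-pixel class-probability lists.
    Assumption: entropy is used as the uncertainty score; the abstract
    does not specify the selection criterion.
    """
    total = 0.0
    for probs in pixel_probs:
        total += -sum(p * math.log(p) for p in probs if p > 0.0)
    return total / len(pixel_probs)


def select_active(unlabeled_probs, k):
    """AL step: pick the k unlabeled samples the classifier is least sure about."""
    scores = [mean_entropy(pp) for pp in unlabeled_probs]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def select_random(n_unlabeled, k, seed=0):
    """RL baseline: pick k unlabeled samples uniformly at random."""
    return random.Random(seed).sample(range(n_unlabeled), k)


def dice(pred, truth, cls):
    """Dice coefficient for one class over flattened label lists.

    Returns None when the class appears in neither pred nor truth,
    so absent classes can be excluded from an average.
    """
    p = [x == cls for x in pred]
    t = [x == cls for x in truth]
    denom = sum(p) + sum(t)
    if denom == 0:
        return None
    inter = sum(1 for a, b in zip(p, t) if a and b)
    return 2.0 * inter / denom
```

In each training round of such a scheme, the indices returned by `select_active` (or `select_random` for the baseline) would be sent for annotation and added to the training set before the classifier is retrained.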