MTU: A multi-tasking U-net with hybrid convolutional learning and attention modules for cancer classification and gland Segmentation in Colon Histopathological Images
Manju Dabass, Sharda Vashisth, Rekha Vig
DOI: https://doi.org/10.1016/j.compbiomed.2022.106095
Abstract: This paper demonstrates a clinically comparable, multi-tasking deep U-Net-based model. It is intended to provide clinical gland morphometric information and cancer grade classification as referential opinions for pathologists, in order to reduce human error. The model offers enhanced feature learning that aids the extraction of strong multi-scale features, effective recovery of the semantic gap during feature concatenation, and mitigation of the resolution-degradation and vanishing-gradient problems, all at moderate computational cost. It integrates three novel structural components into the traditional U-Net architecture: Hybrid Convolutional Learning Units in the encoder and decoder, Attention Learning Units in the skip connections, and a Multi-Scalar Dilated Transitional Unit as the transitional layer. These units combine multi-level convolutional learning through conventional, atrous, residual, depth-wise, and point-wise convolutions with target-specific attention learning and an enlarged effective receptive field. In addition, pre-processing techniques such as patch sampling, color and morphological augmentation, and stain normalization are employed to improve generalizability. To build network invariance to digital variability, exhaustive experiments are conducted on three public datasets, namely Colorectal Adenocarcinoma Gland (CRAG), the Gland Segmentation (GlaS) challenge, and Lung Colon-25000 (LC-25K), and the model's robustness is then verified on an in-house private dataset, Hospital Colon (HosC).
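As a rough illustration of how the atrous, depth-wise, and point-wise convolutions mentioned above can pair an enlarged receptive field with moderate computation, the following sketch computes the effective kernel size of a dilated convolution and compares weight counts for a standard convolution against a depth-wise separable one. The function names, kernel size, dilation rates, and channel counts are illustrative assumptions and are not taken from the paper.

```python
# Illustrative arithmetic only; all names and sizes here are assumptions,
# not the configuration of the proposed model.


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered along one axis by a dilated (atrous) kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def standard_conv_weights(kernel_size: int, c_in: int, c_out: int) -> int:
    """Weight count of a standard 2-D convolution (biases ignored)."""
    return kernel_size * kernel_size * c_in * c_out


def separable_conv_weights(kernel_size: int, c_in: int, c_out: int) -> int:
    """Weight count of a depth-wise convolution followed by a 1x1
    point-wise convolution (biases ignored)."""
    depthwise = kernel_size * kernel_size * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


if __name__ == "__main__":
    # A 3x3 kernel keeps 9 weights per channel pair at every dilation rate,
    # but the span it covers grows with the rate.
    for rate in (1, 2, 4):
        print("dilation", rate, "covers", effective_kernel_size(3, rate))

    # Hypothetical layer mapping 64 channels to 128 channels.
    print("standard :", standard_conv_weights(3, 64, 128))
    print("separable:", separable_conv_weights(3, 64, 128))
```

Under these assumed sizes, the separable layer needs roughly an eighth of the weights of the standard one, while dilation widens the covered span without adding any weights.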
For cancer classification, the proposed model achieved Accuracy (CRAG 95%, GlaS 97.5%, LC-25K 99.97%, HosC 99.45%), Precision (CRAG 0.9678, GlaS 0.9768, LC-25K 1, HosC 1), F1-score (CRAG 0.968, GlaS 0.977, LC-25K 0.9997, HosC 0.9965), and Recall (CRAG 0.9677, GlaS 0.9767, LC-25K 0.9994, HosC 0.9931). For gland detection and segmentation, it achieved competitive results of F1-score (CRAG 0.924, GlaS Test A 0.949, GlaS Test B 0.918, LC-25K 0.916, HosC 0.959), Object-Dice Index (CRAG 0.959, GlaS Test A 0.956, GlaS Test B 0.909, LC-25K 0.929, HosC 0.922), and Object-Hausdorff Distance (CRAG 90.47, GlaS Test A 23.17, GlaS Test B 71.53, LC-25K 96.28, HosC 85.45). In addition, activation mappings for testing the interpretability of the classification decision-making process are reported using Local Interpretable Model-Agnostic Explanations, Occlusion Sensitivity, and Gradient-Weighted Class Activation Mappings. This provides further evidence that the model learns, on its own and without requiring annotations, patterns comparable to those considered relevant by pathologists. The activation mapping visualizations were evaluated by proficient pathologists, who assigned them class-path validation scores of 9.31 (CRAG), 9.25 (GlaS), 9.05 (LC-25K), and 9.85 (HosC). Furthermore, seg-path validation scores given by multiple pathologists for the final segmented outcomes (GlaS Test A 9.40, GlaS Test B 9.25, CRAG 9.27, LC-25K 9.01, HosC 9.19) are included to substantiate clinical relevance and suitability for use at the clinical level. The proposed model will aid pathologists in formulating an accurate diagnosis by providing a referential opinion during the morphology assessment of histopathology images. It will reduce unintentional human error in cancer diagnosis and consequently enhance the patient survival rate.
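For readers less familiar with the reported metrics, the sketch below shows the standard definitions of precision, recall, and F1-score from confusion counts, together with a plain pixel-level Dice index on binary masks. It is a minimal illustration under stated assumptions: the function names and toy inputs are hypothetical, and the pixel-level Dice shown here is simpler than the object-level Dice index reported above, which is computed per gland object.

```python
# Minimal metric definitions; names and toy inputs are illustrative
# assumptions, not the paper's evaluation code.
from typing import Sequence, Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall, and F1-score from true-positive, false-positive,
    and false-negative counts. Undefined ratios are returned as 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def dice_index(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Pixel-level Dice index for two flattened binary masks.
    Two empty masks are treated as a perfect match (1.0)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    return 2 * intersection / total if total else 1.0


if __name__ == "__main__":
    # Toy confusion counts for a single class.
    print(precision_recall_f1(tp=8, fp=2, fn=2))

    # Toy flattened masks: 1 marks a gland pixel, 0 marks background.
    predicted = [1, 1, 1, 0]
    reference = [0, 1, 1, 1]
    print(dice_index(predicted, reference))
```

F1-score is the harmonic mean of precision and recall, which is why the reported F1 values sit between the corresponding precision and recall figures.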