A convolution neural network with multi-level convolutional and attention learning for classification of cancer grades and tissue structures in colon histopathological images

Manju Dabass, Sharda Vashisth, Rekha Vig
DOI: https://doi.org/10.1016/j.compbiomed.2022.105680
Abstract: This paper proposes a clinically comparable Convolutional Neural Network (CNN) framework for automated classification of cancer grades and tissue structures in hematoxylin and eosin-stained colon histopathological images. It comprises Enhanced Convolutional Learning Modules (ECLMs), a multi-level Attention Learning Module (ALM), and Transitional Modules (TMs). The ECLMs use a dual mechanism to extract multi-level discriminative spatial features and to model cross-channel correlations with fewer computations while effectively avoiding vanishing-gradient issues. The ALM performs focus refinement in two ways: channel-wise elemental attention learning accentuates the discriminative channels of the feature maps that belong to important pathological regions, and scale-wise attention learning recalibrates the feature maps at diverse scales. The TMs concatenate the outputs of these two modules, infuse deep multi-scale features, and eliminate resolution degradation issues. Varied pre-processing techniques are further employed to improve the generalizability of the proposed network. For performance evaluation, four diverse publicly available datasets (Gland Segmentation challenge (GlaS), Lung Colon (LC)-25000, Kather_Colorectal_Cancer_Texture_Images (Kather-5k), and NCT_HE_CRC_100K (NCT-100k)) and a private dataset, Hospital Colon (HosC), are used, which further aids in building network invariance against the digital variability that exists in real clinical data. In addition, multiple pathologists were involved at every stage of the research, and their verification and approval were obtained for the outcome of each step.
For cancer grade classification, the proposed model achieves competitive results on GlaS (Accuracy 97.5%, Precision 97.67%, F1-Score 97.67%, Recall 97.67%), LC-25000 (Accuracy 100%, Precision 100%, F1-Score 100%, Recall 100%), and HosC (Accuracy 99.45%, Precision 100%, F1-Score 99.65%, Recall 99.31%). For tissue structure classification, it achieves the following results on Kather-5k (Accuracy 98.83%, Precision 98.86%, F1-Score 98.85%, Recall 98.85%) and NCT-100k (Accuracy 97.7%, Precision 97.69%, F1-Score 97.71%, Recall 97.73%). Furthermore, the reported activation mappings from Gradient-Weighted Class Activation Mapping (Grad-CAM), Occlusion Sensitivity, and Local Interpretable Model-Agnostic Explanations (LIME) indicate that the proposed model can learn, on its own, patterns similar to those considered pertinent by pathologists, without requiring any annotations. In addition, these visualization results were inspected by multiple expert pathologists, who assigned the following validation scores: GlaS (9.251), LC-25000 (9.045), Kather-5k (9.248), NCT-100k (9.262), and HosC (9.853). The model is intended to provide a secondary referential diagnosis for pathologists, easing their workload and aiding them in devising an accurate diagnosis and treatment plan.
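As a quick consistency check on the reported metrics, the F1-Score can be recomputed as the harmonic mean of precision and recall. The minimal sketch below does this for the HosC figures quoted in the abstract. The function name `f1_score` is illustrative and not taken from the paper, and the abstract does not state how per-class values were averaged, so the agreement shown here should be read only as a sanity check under that assumption.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # convention: F1 is 0 when both inputs are 0
    return 2 * precision * recall / (precision + recall)


# HosC values as reported in the abstract: Precision 100%, Recall 99.31%.
hosc_f1 = f1_score(1.0, 0.9931)
print(f"{hosc_f1 * 100:.2f}%")  # prints 99.65%, matching the reported F1-Score
```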