Abstract: Improving the interpretability of neural networks has become a prominent research focus. Prototype-based neural networks are a promising route to interpretability: they base decisions on the similarity between image parts and category prototypes. However, these networks use the same similarity activations for both inference and explanation, which creates a trade-off between accuracy and interpretability. To achieve both high accuracy and robust interpretability in classification, this paper introduces a prototype-based neural network called the “Decoupling Prototypical Network” (DProtoNet). The architecture comprises encoder, inference, and interpretation modules. In the encoder module, we introduce decoupling feature masks to generate feature vectors and prototypes, which improves the generalization of the model. The inference module uses these feature vectors and prototypes to make predictions through similarity comparisons, preserving an interpretable inference structure. The interpretation module replaces the conventional upsampling of similarity activations with a “multiple dynamic masks decoder”, which perturbs images with mask vectors of varying sizes and learns saliency maps through consistent activation, giving a more precise way to interpret prototype-based networks. DProtoNet thus separates the inference and explanation components of prototype-based networks. By removing the constraint of shared similarity activations between the inference and explanation phases, our approach improves accuracy and interpretability at the same time.
Experiments on public natural-image datasets, including CUB-200-2011 and Stanford Cars, and on medical datasets such as RSNA and iChallenge-PM, show substantial improvements over previous state-of-the-art approaches. Ablation studies provide additional evidence for the effectiveness of the proposed components.
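
As an illustration of the similarity-based inference described above, the following is a minimal sketch, not the DProtoNet implementation. The abstract does not specify the similarity measure or how per-prototype scores are combined, so cosine similarity and a per-class maximum are assumptions here, and the function names, class labels, and toy vectors are all hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def classify_by_prototypes(feature, prototypes):
    """Score each class by its most similar prototype and pick the best class.

    feature:    one feature vector (list of floats), e.g. from an encoder.
    prototypes: dict mapping class label -> list of prototype vectors.
    Assumption: the class score is the maximum cosine similarity over
    that class's prototypes; the paper may aggregate differently.
    """
    scores = {
        label: max(cosine_similarity(feature, p) for p in protos)
        for label, protos in prototypes.items()
    }
    predicted = max(scores, key=scores.get)
    return predicted, scores


# Hypothetical toy data: two classes with two prototypes each.
prototypes = {
    "class_a": [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]],
    "class_b": [[0.0, 1.0, 0.0], [0.0, 0.7, 0.7]],
}
feature = [0.9, 0.1, 0.0]

predicted, scores = classify_by_prototypes(feature, prototypes)
print(predicted, scores)
```

Because each class score traces back to one specific prototype, the prediction can be explained by pointing at that prototype, which is the sense in which this style of inference is interpretable.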
