Abstract: Objectives: Accurate extraction of regions of interest (ROIs) with variable shapes and scales is one of the primary challenges in medical image segmentation. Current U-based networks mostly aggregate multi-stage encoding outputs as an improved multi-scale skip connection. Although this design has been shown to provide scale diversity and contextual integrity, several limitations remain: (i) the encoding outputs are simply resampled to the same size, which destroys fine-grained information, so the advantages of using multiple scales are not fully exploited; (ii) redundant information proportional to the feature dimension size is introduced and causes multi-stage interference; and (iii) the precision of information delivery relies on the up-sampling and down-sampling layers, but guidance on maintaining consistency in feature locations and trends between them is lacking. Methods: To address these limitations, this paper proposes a U-based CNN named HAD-Net, which assembles a new hyper-scale shifted aggregating module (HSAM) paradigm and progressive reusing attention (PRA) for skip connections, and employs a novel pair of dual-branch parameter-free sampling layers, i.e., max-diagonal pooling (MDP) and max-diagonal un-pooling (MDUP). Specifically, the aggregating scheme additionally combines five subregions with certain offsets in the shallower stage, since the lower scale-down ratios of the subregions enrich scales and fine-grained context. The attention scheme contains a partial-to-global channel attention (PGCA) and a multi-scale reusing spatial attention (MRSA); it builds reusing connections internally and adjusts the focus toward more useful dimensions. Finally, MDP and MDUP are used in pairs to improve texture delivery and feature consistency, enhancing information retention and avoiding positional confusion.
Results: Compared with state-of-the-art networks, HAD-Net achieves comparable or better performance, with Dice scores of 90.13%, 81.51%, and 75.43% for each class on BraTS20, 89.59% Dice and 98.56% AUC on Kvasir-SEG, and 82.17% Dice and 98.05% AUC on DRIVE. Conclusions: The HSAM+PRA+MDP+MDUP scheme has been shown to yield a remarkable improvement and leaves room for further research.
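
For readers unfamiliar with the Dice score reported above, the following is a minimal sketch of the standard Dice coefficient between two binary masks. It is not the authors' evaluation code: the function name `dice_coefficient`, the list-of-rows mask representation, and the convention that two empty masks score 1.0 are all assumptions made for illustration, and the abstract does not state how scores are averaged across cases or classes.

```python
def dice_coefficient(pred, target):
    """Standard Dice = 2|P ∩ T| / (|P| + |T|) for two binary masks.

    Masks are given as lists of rows holding 0/1 (or falsy/truthy) values.
    This is an illustrative helper, not code from the HAD-Net paper.
    """
    # Flatten each 2D mask into a flat list of 0/1 integers.
    p = [int(bool(v)) for row in pred for v in row]
    t = [int(bool(v)) for row in target for v in row]
    if len(p) != len(t):
        raise ValueError("masks must contain the same number of pixels")

    # Count pixels that are foreground in both masks.
    intersection = sum(a * b for a, b in zip(p, t))
    total = sum(p) + sum(t)

    # Assumed convention: two empty masks are treated as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    prediction = [[1, 1], [0, 0]]
    ground_truth = [[1, 0], [0, 0]]
    # One overlapping pixel, three foreground pixels in total: 2*1/3.
    print(f"Dice: {dice_coefficient(prediction, ground_truth):.4f}")
```

For multi-class data such as BraTS20, a score of this kind would typically be computed once per class on that class's binary mask, which matches the per-class reporting in the results above.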

A decoder-free feature aggregation network for medical image segmentation

Feature Agglomeration Networks for Single Stage Face Detection

A feature aggregation and feature fusion network for retinal vessel segmentation

DGFAU-Net: Global feature attention upsampling network for medical image segmentation

MFA-Net: Multiple Feature Association Network for Medical Image Segmentation

FI‐Net: Rethinking Feature Interactions for Medical Image Segmentation

MpMsCFMA-Net: Multi-path Multi-scale Context Feature Mixup and Aggregation Network for medical image segmentation

EMFANet: a lightweight network with efficient multi-scale feature aggregation for real-time semantic segmentation

MAFUNet: Multi-Attention Fusion Network for Medical Image Segmentation

J-Net: Asymmetric Encoder-Decoder for Medical Semantic Segmentation

FIF-UNet: An Efficient UNet Using Feature Interaction and Fusion for Medical Image Segmentation

FANet: A Feedback Attention Network for Improved Biomedical Image Segmentation

Encoder- and Decoder-Based Networks Using Multiscale Feature Fusion and Nonlocal Block for Remote Sensing Image Semantic Segmentation

Multi-scale Feature Pyramid Fusion Network for Medical Image Segmentation

AFFSegNet: Adaptive Feature Fusion Segmentation Network for Microtumors and Multi-Organ Segmentation

UXNet: Searching Multi-level Feature Aggregation for 3D Medical Image Segmentation

HAD-Net: an Attention U-based Network with Hyper-Scale Shifted Aggregating and Max-Diagonal Sampling for Medical Image Segmentation

MEA-Net: multilayer edge attention network for medical image segmentation

FAN-Unet: Enhancing Unet with vision Fourier Analysis Block for Biomedical Image Segmentation

DCACNet: Dual context aggregation and attention-guided cross deconvolution network for medical image segmentation