Abstract: Deep convolutional neural networks (DCNNs) have been widely used in semantic segmentation. However, the filters of most regular convolutions in DCNNs are spatially invariant to local transformations, which reduces localization accuracy and hinders further improvement in semantic segmentation. Dynamic convolution with pixel-level filters can enhance localization accuracy through its region-awareness, but it is sensitive to objects with large scale variations. To address both low localization accuracy and large scale variations, we propose a filter-varying atrous convolution (FAC) that efficiently enlarges the per-pixel receptive fields pertaining to various objects. FAC consists mainly of a conditional-filter-generating network (CFGN) and a dynamic local filtering operation (DLFO). In the CFGN, a class probability map is used to generate the corresponding filters, making the FAC genuinely dynamic. In the DLFO, the position-by-position sliding convolution is replaced with a one-time dot product operation, which greatly improves efficiency. In addition, a dense scale module (DSM) is constructed to generate denser scales and larger receptive fields for exploring long-range contextual information. Finally, a dense-scale dynamic network (DsDNet) assigns FAC to different spatial locations at dense scales, which simultaneously enhances localization accuracy and reduces the effect of large scale variations of objects. To accelerate network convergence and improve segmentation accuracy, our network employs two pixel-wise cross-entropy loss functions: one between the backbone and the DSM, and the other at the network's end. Extensive experiments on the Cityscapes, PASCAL VOC 2012, and ADE20K datasets verify that DsDNet outperforms non-dynamic and multi-scale convolutional neural networks.
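The abstract's description of per-pixel atrous filtering can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the 3x3 kernel, the dilation rate, the sharing of one filter across channels, and the toy softmax filter generator standing in for the CFGN are all assumptions made for illustration. The sketch gathers the dilated neighbors of every pixel once, then applies a different filter at each pixel with a single product-and-sum instead of sliding a kernel.

```python
import numpy as np


def dilated_patches(x, k=3, dilation=2):
    """Gather the k*k dilated neighbors of every pixel.

    x: feature map of shape (C, H, W). Returns shape (C, k*k, H, W).
    """
    c, h, w = x.shape
    pad = dilation * (k // 2)
    # Zero padding keeps the output at the input's spatial size.
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    taps = []
    for i in range(k):
        for j in range(k):
            r, s = i * dilation, j * dilation
            taps.append(xp[:, r:r + h, s:s + w])
    return np.stack(taps, axis=1)


def toy_filter_generator(prob_map, weight):
    """Hypothetical stand-in for a conditional filter generator.

    prob_map: class probability map of shape (K, H, W).
    weight:   projection of shape (k*k, K), assumed learned elsewhere.
    Returns per-pixel filters of shape (k*k, H, W), softmax-normalized
    over the k*k taps (the normalization is an assumption).
    """
    logits = np.einsum("tk,khw->thw", weight, prob_map)
    logits = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=0, keepdims=True)


def dynamic_local_filtering(x, filters, k=3, dilation=2):
    """Apply a different k*k filter at every pixel in one product-and-sum.

    filters: shape (k*k, H, W), shared across channels (an assumption).
    """
    patches = dilated_patches(x, k, dilation)
    return np.einsum("cthw,thw->chw", patches, filters)
```

A usage example under the same assumptions: with `x` of shape `(8, 16, 16)`, a class probability map of shape `(5, 16, 16)`, and a projection `weight` of shape `(9, 5)`, calling `dynamic_local_filtering(x, toy_filter_generator(prob_map, weight))` returns an array with the same shape as `x`. Calling the function with several dilation rates and combining the results would be one way to approximate a set of dense scales, although the abstract does not specify how the DSM combines them.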

Decoupled Dynamic Filter Networks

Learning Scalable Dynamic Filter in Convolutional Networks

Building Efficient CNNs Using Depthwise Convolutional Eigen-Filters (DeCEF)

Dense Deep Joint Image Filter For Upsampling And Denoising

Dynamic Filtering with Large Sampling Field for ConvNets

CasDyF-Net: Image Dehazing via Cascaded Dynamic Filters

C²DFNet: Criss-Cross Dynamic Filter Network for RGB-D Salient Object Detection

Dilated Residual Encode-Decode Networks for Image Denoising

Dynamic Sampling Convolutional Neural Networks

A Convolutional Neural Network-Based Low Complexity Filter

An Efficient Low-Complexity Convolutional Neural Network Filter

Balanced Decoupled Spatial Convolution for CNNs

Decoupled Dynamic Group Equivariant Filter for Saliency Prediction on Omnidirectional Image

Decoupled Convolutions for CNNs

Network Decoupling: From Regular to Depthwise Separable Convolutions

Training Compact CNNs for Image Classification Using Dynamic-coded Filter Fusion

Dynamic Convolutional Capsule Network for In-loop Filtering in HEVC Video Codec

Dense-scale Dynamic Network with Filter-Varying Atrous Convolution for Semantic Segmentation

DualConv: Dual Convolutional Kernels for Lightweight Deep Neural Networks