Disassembling Convolutional Segmentation Network

Kaiwen Hu, Jing Gao, Fangyuan Mao, Xinhui Song, Lechao Cheng, Zunlei Feng, Mingli Song
DOI: https://doi.org/10.1007/s11263-023-01776-z
IF: 13.369
2023-01-01
International Journal of Computer Vision
Abstract: In recent years, the convolutional segmentation network has achieved remarkable performance in the computer vision area. However, training a practicable segmentation network is time- and resource-consuming. In this paper, focusing on the semantic image segmentation task, we attempt to disassemble a convolutional segmentation network into category-aware convolution kernels and achieve customizable tasks without additional training by utilizing those kernels. The core of disassembling convolutional segmentation networks is how to identify the relevant convolution kernels for a specific category. According to the encoder-decoder network architecture, the disassembling framework, named Disassembler, is devised to be composed of the forward channel-wise activation attribution and backward gradient attribution. In the forward channel-wise activation attribution process, for each image, the activation values of each feature map in the high-confidence mask area are summed into category-aware probability vectors. In the backward gradient attribution process, the positive gradients w.r.t. each feature map in the high-confidence mask area are summed into a relative coefficient vector for each category. With the cooperation of two vectors, the Disassembler can effectively disassemble category-aware convolution kernels. Extensive experiments demonstrate that the proposed Disassembler can accomplish the category-customizable task without additional training. The disassembled category-aware sub-network achieves comparable performance without any finetuning and will outperform existing state-of-the-art methods with one epoch of finetuning.
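The two attribution steps described in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: the function names, the 0.9 confidence threshold, the toy 2x2 feature maps, and in particular the rule for combining the two vectors (an elementwise product followed by top-k selection) are all assumptions made here for illustration, since the abstract only says the two vectors "cooperate".

```python
def high_confidence_mask(prob_map, threshold=0.9):
    """Mark pixels whose predicted probability for one category is high.

    The threshold value is an assumption; the abstract does not give one.
    """
    return [[p >= threshold for p in row] for row in prob_map]


def masked_channel_sums(feature_maps, mask, positive_only=False):
    """Sum each channel's values over the masked pixels.

    feature_maps: list of C channels, each an H x W nested list of floats.
    mask: H x W nested list of bools.
    positive_only: if True, negative values count as zero (used for gradients).
    """
    sums = []
    for channel in feature_maps:
        total = 0.0
        for row_values, row_mask in zip(channel, mask):
            for value, inside in zip(row_values, row_mask):
                if inside:
                    total += max(value, 0.0) if positive_only else value
        sums.append(total)
    return sums


def select_kernels(activation_vec, gradient_vec, top_k):
    """Pick the top_k channel indices by combined score.

    Assumption: the two vectors are combined by elementwise product.
    """
    scores = [a * g for a, g in zip(activation_vec, gradient_vec)]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:top_k])


# Toy data: 3 channels of 2 x 2 activations and gradients for one image.
activations = [
    [[1.0, 2.0], [0.0, 0.0]],
    [[0.1, 0.1], [5.0, 5.0]],
    [[3.0, 1.0], [0.0, 2.0]],
]
gradients = [
    [[0.5, -0.2], [1.0, 1.0]],
    [[-1.0, 0.3], [2.0, 2.0]],
    [[0.4, 0.6], [0.0, -3.0]],
]
prob_map = [[0.95, 0.92], [0.3, 0.5]]  # predicted probability of one category

mask = high_confidence_mask(prob_map)
activation_vec = masked_channel_sums(activations, mask)                    # forward step
gradient_vec = masked_channel_sums(gradients, mask, positive_only=True)    # backward step
selected = select_kernels(activation_vec, gradient_vec, top_k=2)
print(selected)
```

In a real network the activations and gradients would come from the framework's hooks or autograd rather than hand-written lists, and the vectors would be accumulated over many images per category before any kernels are selected.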