Mamba Capsule Routing Towards Part-Whole Relational Camouflaged Object Detection

Dingwen Zhang, Liangbo Cheng, Yi Liu, Xinggang Wang, Junwei Han
2024-10-05
Abstract: The part-whole relational property endowed by Capsule Networks (CapsNets) is known to benefit camouflaged object detection because it promotes segmentation integrity. However, the previous Expectation Maximization (EM) capsule routing algorithm, with its heavy computation and large parameter count, obstructs this trend. The primary cause lies in pixel-level capsule routing. In this paper, we instead propose a novel mamba capsule routing at the type level. Specifically, we first extract the implicit latent state in mamba as capsule vectors, which abstract type-level capsules from their pixel-level versions. These type-level mamba capsules are fed into the EM routing algorithm to obtain the high-layer mamba capsules, which greatly reduces the computation and parameters that pixel-level capsule routing requires for part-whole relationship exploration. On top of that, to retrieve pixel-level capsule features for the subsequent camouflaged prediction, we recover them from the low-layer pixel-level capsules under the guidance of the correlations from adjacent-layer type-level mamba capsules. Extensive experiments on three widely used COD benchmark datasets demonstrate that our method significantly outperforms state-of-the-art methods. Code is available at https://github.com/Liangbo-Cheng/mamba_capsule.
Computer Vision and Pattern Recognition
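As a rough illustration of the routing step the abstract refers to, the following Python sketch runs a simplified EM routing over a small set of type-level capsule vectors. It is not the authors' implementation: the function name `em_routing`, the tensor shapes, and the random inputs are assumptions made for illustration, and the high-layer activation is replaced by a normalized mixing weight instead of the cost-based logistic activation with learned parameters used in the original EM routing formulation.

```python
import numpy as np


def em_routing(votes, a_in, n_iters=3, eps=1e-8):
    """Simplified EM routing (illustrative sketch, hypothetical helper).

    votes: (n_in, n_out, d) vote of each low-layer capsule for each high-layer capsule
    a_in:  (n_in,) activation of each low-layer capsule, in [0, 1]
    Returns high-layer poses (n_out, d), activations (n_out,),
    and assignment probabilities (n_in, n_out).
    """
    n_in, n_out, _ = votes.shape
    # Start with every low-layer capsule spread evenly over the high-layer capsules.
    r = np.full((n_in, n_out), 1.0 / n_out)
    for _ in range(n_iters):
        # M-step: fit one diagonal Gaussian per high-layer capsule.
        w = r * a_in[:, None]                                  # (n_in, n_out)
        w_sum = w.sum(axis=0) + eps                            # (n_out,)
        mu = (w[:, :, None] * votes).sum(axis=0) / w_sum[:, None]
        diff_sq = (votes - mu[None]) ** 2
        var = (w[:, :, None] * diff_sq).sum(axis=0) / w_sum[:, None] + eps
        # Assumption: activation is the share of routed weight, not the
        # cost-based logistic activation of the original formulation.
        a_out = w_sum / w_sum.sum()

        # E-step: reassign each vote according to its Gaussian log-likelihood.
        log_p = -0.5 * (diff_sq / var[None] + np.log(2 * np.pi * var[None])).sum(axis=2)
        logits = np.log(a_out + eps)[None, :] + log_p
        logits -= logits.max(axis=1, keepdims=True)            # numerical stability
        r = np.exp(logits)
        r /= r.sum(axis=1, keepdims=True)
    return mu, a_out, r


# Hypothetical sizes: 8 low-layer capsule types, 4 high-layer types, 16-dim vectors.
rng = np.random.default_rng(0)
votes = rng.normal(size=(8, 4, 16))
a_in = rng.uniform(size=8)
mu, a_out, r = em_routing(votes, a_in)
print(mu.shape, a_out.shape, r.shape)
```

Because the number of input capsules here is the number of capsule types instead of the number of pixels, the assignment matrix `r` stays small regardless of image resolution.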
What problem does this paper attempt to address?
The paper addresses part-whole relationship modeling in **Camouflaged Object Detection (COD)**. The authors point out that, although Capsule Networks (CapsNets) handle part-whole relationships well, the traditional Expectation Maximization (EM) capsule routing algorithm is hard to apply widely to camouflaged object detection because of its high computational complexity and large number of parameters.

### Specific manifestations of the problem:

1. **High computational complexity**: The traditional EM routing algorithm performs capsule routing at the pixel level, which generates a large number of capsule assignments and makes the operation computationally intensive and time-consuming.
2. **Large number of parameters**: Pixel-level capsule routing requires a large number of parameters, which makes the model harder to train and creates a bottleneck for inference speed.
3. **Slow inference speed**: For the reasons above, pixel-level capsule routing is slow at inference time in practical applications and struggles to meet real-time requirements.

### Solution:

To solve these problems, the authors propose the **Mamba Capsule Routing Network (MCRNet)**. The network uses Vision Mamba (VMamba) to convert pixel-level capsules into type-level capsules, which significantly reduces the computational complexity and the number of parameters. The specific contributions are:

1. **Designed MCRNet**: The network significantly reduces the complexity of capsule routing and is the first attempt to apply Mamba to CapsNets and COD tasks.
2. **Proposed MCG module**: It generates type-level Mamba capsules from pixel-level capsules, which enables lightweight capsule routing.
3. **Designed CSDR module**: It recovers spatial details from high-layer type-level Mamba capsules to achieve dense prediction of camouflaged objects.

### Summary:

The main goal of the paper is to improve the efficiency and accuracy of camouflaged object detection while reducing computational cost, by introducing a lightweight capsule routing method. In the reported experiments, MCRNet significantly outperforms 25 existing state-of-the-art methods on three widely used COD datasets.
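The complexity argument can be made concrete with a rough count of routing assignments. The Python sketch below is only a back-of-the-envelope estimate under stated assumptions: the function names and the feature-map and capsule-type sizes are hypothetical, and the paper's actual layer configuration may differ.

```python
def pixel_level_assignments(height, width, types_in, types_out, kernel=1):
    """Assignments when every spatial position routes its own capsules.

    Each output position routes kernel*kernel*types_in low-layer capsules
    to types_out high-layer capsules.
    """
    return height * width * kernel * kernel * types_in * types_out


def type_level_assignments(types_in, types_out):
    """Assignments when each capsule type is summarized by a single vector."""
    return types_in * types_out


# Hypothetical sizes, chosen only for illustration.
h, w, t_in, t_out = 24, 24, 8, 8
pixel_count = pixel_level_assignments(h, w, t_in, t_out)
type_count = type_level_assignments(t_in, t_out)
ratio = pixel_count // type_count
print(f"pixel-level: {pixel_count}, type-level: {type_count}, ratio: {ratio}")
```

Under these assumptions the pixel-level count grows with the spatial resolution of the feature map, while the type-level count depends only on the number of capsule types.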