Depth alignment interaction network for camouflaged object detection

Hongbo Bi,Yuyu Tong,Jiayuan Zhang,Cong Zhang,Jinghui Tong,Wei Jin
DOI: https://doi.org/10.1007/s00530-023-01250-3
IF: 3.9
2024-01-30
Multimedia Systems
Abstract: Many animals use camouflage, a natural defense mechanism, to actively change characteristics such as color and texture. This makes them difficult to detect in their natural environment and makes camouflaged object detection extremely challenging. Biological research shows that animal eyes perceive the world in three dimensions, and the resulting depth information provides useful localization cues for finding camouflaged objects. However, almost all current studies on camouflaged object detection do not combine depth maps with RGB images. Combining depth maps with conventional unimodal RGB images is therefore of considerable research significance for improving detection accuracy. In this paper, we propose a depth alignment interaction network for camouflaged object detection, in which the depth maps are generated by existing monocular depth estimation networks. Because the quality of the generated depth maps varies, we propose a depth alignment index method to evaluate it. The method dynamically sets the proportion that each depth map contributes to the fusion process according to how well it aligns with the corresponding RGB image. Then, to fully extract the fused artifact features, we design an expanded pyramid interaction module, which first expands the receptive field of the features in each layer. Features at higher levels then interact with features at lower levels through step-by-step connections to further refine the predicted camouflaged area. Extensive experiments on four camouflaged object detection datasets demonstrate the effectiveness of our solution.
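The idea of weighting a depth map by its alignment with the RGB image can be illustrated with a small sketch. The abstract does not define the depth alignment index, so everything below is an assumption made for illustration only: the score is taken to be the Pearson correlation between the edge maps of a grayscale image and a depth map, clipped to [0, 1], and the names `gradient_magnitude`, `alignment_weight`, and `fuse` are hypothetical rather than taken from the paper.

```python
from math import sqrt


def gradient_magnitude(img):
    """Forward-difference gradient magnitude of a 2D list of floats."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Differences are zero at the right and bottom borders.
            dx = img[y][min(x + 1, w - 1)] - img[y][x]
            dy = img[min(y + 1, h - 1)][x] - img[y][x]
            row.append(sqrt(dx * dx + dy * dy))
        out.append(row)
    return out


def alignment_weight(gray, depth):
    """Hypothetical alignment score in [0, 1] (not the paper's index).

    Assumption: a depth map is well aligned when its edges coincide
    with the edges of the grayscale image.
    """
    a = [v for row in gradient_magnitude(gray) for v in row]
    b = [v for row in gradient_magnitude(depth) for v in row]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A flat map carries no edge information, so give it no weight.
        return 0.0
    return min(1.0, max(0.0, cov / sqrt(var_a * var_b)))


def fuse(rgb_feat, depth_feat, weight):
    """Add the depth features to the RGB features, scaled by the weight."""
    return [
        [r + weight * d for r, d in zip(rgb_row, depth_row)]
        for rgb_row, depth_row in zip(rgb_feat, depth_feat)
    ]


# Toy 4x4 inputs: a vertical edge, a depth map with the same edge,
# and a flat depth map with no structure.
gray = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
depth_good = [[0.2, 0.2, 0.8, 0.8] for _ in range(4)]
depth_flat = [[0.5, 0.5, 0.5, 0.5] for _ in range(4)]

w_good = alignment_weight(gray, depth_good)
w_flat = alignment_weight(gray, depth_flat)
fused = fuse(gray, depth_flat, w_flat)
```

In this toy setting the well-aligned depth map receives a weight close to 1 and the flat one a weight of 0, so a low-quality depth map leaves the RGB features unchanged. A real network would presumably compute such a weight on learned features rather than raw pixels.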
computer science, information systems, theory & methods