Multi-View Graph Embedding Learning for Image Co-Segmentation and Co-Localization

Aiping Huang, Lijian Li, Le Zhang, Yuzhen Niu, Tiesong Zhao, Chia-Wen Lin
DOI: https://doi.org/10.1109/tcsvt.2023.3339181
IF: 5.859
2023-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: Image co-segmentation and co-localization exploit inter-image information to identify and extract foreground objects in batch mode. However, they remain challenging when confronted with large object variations or complex backgrounds. This paper proposes a multi-view graph embedding (MV-Gem) learning scheme that integrates the diversity, robustness, and discernibility of object features to address these challenges. To encourage diversity, deep co-information containing both low-layer general representations and high-layer semantic information is generated to form a multi-view feature pool for comprehensive co-object description. To enhance robustness, a multi-view adaptive weighted learning scheme is formulated to fuse the deep co-information for feature complementation. To ensure discernibility, graph embedding and a sparse constraint are embedded into the fusion formulation for feature selection. The former aims to inherit important structures from multiple views, and the latter further selects important features to suppress irrelevant backgrounds. With these techniques, MV-Gem gradually recovers all co-objects through optimization iterations. Extensive experimental results on real-world datasets demonstrate that MV-Gem is capable of locating and delineating co-objects in an image group.
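The abstract does not give the MV-Gem objective, so the following is only a generic sketch of what adaptive weighted fusion of multiple views with graph embedding can look like, not the authors' formulation. It assumes each view is a feature matrix over the same set of items, builds a k-nearest-neighbour graph per view, and alternates between a spectral embedding of the weighted Laplacian sum and a common auto-weighting heuristic (weight inversely proportional to the square root of a view's embedding cost). The function names, the Gaussian affinity, the parameters `k`, `dim`, and `iters`, and the omission of the sparse constraint are all assumptions made for illustration.

```python
import numpy as np


def knn_affinity(X, k=5):
    """Symmetric Gaussian affinity that keeps each item's k nearest neighbours."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    sigma2 = np.median(d2[d2 > 0])                        # data-driven bandwidth (assumption)
    W = np.exp(-d2 / sigma2)
    np.fill_diagonal(W, 0.0)                              # no self-loops
    idx = np.argsort(-W, axis=1)[:, :k]                   # k strongest neighbours per row
    mask = np.zeros(W.shape, dtype=bool)
    mask[np.arange(W.shape[0])[:, None], idx] = True
    return np.where(mask | mask.T, W, 0.0)                # symmetrise the neighbourhood


def laplacian(W):
    """Unnormalised graph Laplacian L = D - W."""
    return np.diag(W.sum(axis=1)) - W


def fuse_views(views, dim=2, iters=10, k=5):
    """Alternate between a shared spectral embedding and per-view weights."""
    Ls = [laplacian(knn_affinity(X, k)) for X in views]
    alpha = np.full(len(Ls), 1.0 / len(Ls))               # start from uniform weights
    for _ in range(iters):
        L = sum(a * Lv for a, Lv in zip(alpha, Ls))       # weighted sum of view Laplacians
        _, vecs = np.linalg.eigh(L)                       # eigenvalues in ascending order
        F = vecs[:, :dim]                                 # shared embedding
        cost = np.array([np.trace(F.T @ Lv @ F) for Lv in Ls])
        cost = np.maximum(cost, 0.0)                      # guard against round-off
        alpha = 1.0 / (2.0 * np.sqrt(cost + 1e-12))       # lower cost gives a larger weight
        alpha /= alpha.sum()                              # normalise (illustrative choice)
    return F, alpha
```

Calling `fuse_views([X_low, X_high])` with two hypothetical feature matrices that share the same number of rows returns the shared embedding and one weight per view; views whose graph structure agrees with the shared embedding receive larger weights.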