MFFNet: Multimodal Feature Fusion Network for RGB-D Transparent Object Detection

Li Zhu, Tuanjie Li, Yuming Ning, Yan Zhang
DOI: https://doi.org/10.1177/17298806241283373
IF: 1.714
2024-01-01
International Journal of Advanced Robotic Systems
Abstract: Transparent objects are ubiquitous in everyday life, but detecting them remains challenging. They reflect little light and mostly transmit the appearance of their surroundings, which makes them hard to distinguish from the background. Existing methods usually take only RGB (red, green, blue) images as input and ignore the role of depth maps in transparent object detection. In this article, we aim to improve transparent object detection by fusing RGB and depth information. Specifically, we propose a multimodal fusion network that fuses the RGB and depth modalities in a complementary way. Extensive experiments and ablation studies on an RGB-D (RGB-Depth) transparent object dataset demonstrate the excellent performance of our method.