Depth-aware Space-Time Memory Network for Video Object Segmentation

Haozhe Xie, Yunmu Huang, Anni Xu, Jinpeng Lan, Wenxiu Sun, SenseTime
2020-01-01
Abstract: In this paper, we propose the Depth-aware Space-Time Memory (D-STM) Network for semi-supervised Video Object Segmentation (VOS). The Space-Time Memory (STM) Network learns the feature embedding of the foreground objects and achieves promising results in VOS. However, STM focuses on the appearance of objects without explicitly considering their spatial location, which leads to poor segmentation results when objects have similar appearances. To solve this problem, we estimate depth maps from the video sequence to alleviate the ambiguity between objects with similar appearances. In addition, an Atrous Spatial Pyramid Pooling (ASPP) module is incorporated to enlarge the semantic receptive field at different scales. Together with a multi-scale ensemble, the proposed D-STM achieves a J&F score of 76.9% in the 2020 DAVIS challenge on semi-supervised VOS.