Supervised Saliency Maps for First-Person Videos Based on Sparse Coding

Yujie Li, Atsunori Kanemura, Hideki Asoh, Taiki Miyanishi, Motoaki Kawanabe
DOI: https://doi.org/10.23919/apsipa.2018.8659499
2018-01-01
Abstract: Identifying attentive regions in first-person vision (FPV) plays an important role in finding meaningful objects in daily life. Saliency detection is a major technique for locating such regions. However, although FPV captured from the user's perspective is always associated with the user's actions, existing saliency detection methods are bottom-up and cannot incorporate information about those actions. Since people look at the targets of their actions, saliency detection algorithms for FPV should take into account which objects are more likely to be manipulated by the user. In this paper, we propose a supervised saliency detection method that uses human gaze information, recorded while the user performs actions, as the supervisory signal. The proposed method is based on sparse coding (dictionary learning) with a supervised saliency dictionary. Experiments on a real-world gaze dataset show that the proposed approach outperforms a state-of-the-art saliency detection algorithm based on sparse coding.