Semi-supervised multi-instance multi-label learning for video annotation task.

Xin-Shun Xu, Yuan Jiang, Xiangyang Xue, Zhi-Hua Zhou
DOI: https://doi.org/10.1145/2393347.2396300
2012-01-01
Abstract: Traditional approaches to automatic video annotation usually represent a video clip with a single flat feature vector, neglecting the fact that video data contain natural structure. It is also noteworthy that a video clip is often relevant to multiple concepts. Indeed, the video annotation task is inherently a Multi-Instance Multi-Label learning (MIML) problem. Considering that manually annotating videos is labor-intensive and time-consuming, this paper proposes a semi-supervised MIML approach, SSMIML, which is able to exploit abundant unannotated videos to help improve annotation performance. The approach takes label correlations into account and enforces that similar instances share similar multi-labels. Evaluation on TRECVID 2005 shows that the proposed approach outperforms several state-of-the-art methods.
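To make the MIML framing in the abstract concrete, the following is a minimal illustrative sketch, not the SSMIML formulation itself, which the abstract does not specify. It assumes a clip is a bag of instance feature vectors, that each instance carries one score per concept, that similarity is a Gaussian kernel, and that a bag is positive for a concept when at least one instance is (the standard multi-instance assumption). The function names, the kernel, and the threshold are all hypothetical choices made for illustration.

```python
import math


def gaussian_similarity(x, y, sigma=1.0):
    """Similarity in (0, 1] between two instance feature vectors (assumed kernel)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2 * sigma ** 2))


def smoothness_penalty(instances, scores, sigma=1.0):
    """Penalize similar instances whose label-score vectors differ.

    instances: list of feature vectors, labeled and unlabeled alike.
    scores:    list of per-concept score vectors, one per instance.
    Returns the sum over instance pairs of similarity * squared score difference.
    """
    total = 0.0
    for i in range(len(instances)):
        for j in range(i + 1, len(instances)):
            weight = gaussian_similarity(instances[i], instances[j], sigma)
            difference = sum((a - b) ** 2 for a, b in zip(scores[i], scores[j]))
            total += weight * difference
    return total


def bag_labels(instance_scores, threshold=0.5):
    """Bag-level multi-label: a concept is present if any instance exceeds the threshold."""
    n_concepts = len(instance_scores[0])
    return [any(s[k] > threshold for s in instance_scores) for k in range(n_concepts)]


# Hypothetical clip with two keyframe instances and two concepts.
clip_instances = [[0.0, 0.0], [0.0, 0.0]]
clip_scores = [[0.9, 0.1], [0.2, 0.3]]
penalty = smoothness_penalty(clip_instances, clip_scores)
labels = bag_labels(clip_scores)
```

Because the penalty needs only instance features and predicted scores, it can be evaluated on unannotated clips as well, which is one common way semi-supervised methods draw on unlabeled data.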