Tag Refinement of Micro-Videos by Learning from Multiple Data Sources
Lei Huang, Bin Luo
DOI: https://doi.org/10.1007/s11042-017-4781-z
IF: 2.577
2017-01-01
Multimedia Tools and Applications
Abstract: Micro-video is an increasingly prevalent form of social media that attracts much attention for its convenient acquisition and expressive ability. However, because the user-generated hashtags of micro-videos are seriously unbalanced in distribution and low in quality, managing micro-videos is challenging. In this paper, we propose a novel tag refinement approach for micro-videos that learns from multiple public data sources with manually labelled tags. This overcomes the difficulty of directly refining imprecise hashtags and addresses the lack of manually labelled micro-video datasets for training. We define a set of target tags by referring to widely used datasets for object, activity and scene detection. In tag refinement, we first transfer tags from the images in NUS-WIDE to the micro-video keyframes by similarity measurement. Meanwhile, we complete the tags by detecting the objects, activities and scenes in micro-videos based on appearance and motion features, with the assistance of the datasets ImageNet, PASCAL VOC, HMDB51, UCF50 and SUN. We also denoise the hashtags by constructing mapping relationships between hashtags and target tags based on statistics from NUS-WIDE. The results of tag transfer, completion and denoising are finally combined linearly to generate the tag refinement results for micro-videos. To validate the performance, we construct a dataset of 600 micro-videos from Vine and manually label the micro-videos with target tags. The experimental results show that our approach performs well in tag refinement of micro-videos by learning from multiple data sources.
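The transfer and linear-combination steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the similarity measure, the number of neighbours, or the combination weights, so cosine similarity, `k`, the `weights` tuple, the function names `transfer_tags` and `fuse`, and the toy feature vectors and scores below are all assumptions made for illustration.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def transfer_tags(keyframe, labelled_images, k=2):
    """Similarity-weighted tag vote from the k most similar labelled images.

    labelled_images is a list of (feature_vector, tags) pairs, standing in
    for a labelled image collection such as NUS-WIDE (assumed layout).
    """
    scored = [(cosine(keyframe, feat), tags) for feat, tags in labelled_images]
    nearest = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    votes = {}
    for sim, tags in nearest:
        for tag in tags:
            votes[tag] = votes.get(tag, 0.0) + sim
    top = max(votes.values(), default=0.0)
    # Scale so the strongest tag has score 1.0 (an assumed normalisation).
    return {t: s / top for t, s in votes.items()} if top > 0 else {}


def fuse(transfer, complement, denoise, weights=(0.4, 0.4, 0.2)):
    """Linearly combine three per-tag score dictionaries.

    The weights are hypothetical; the abstract only states that the
    three results are combined linearly.
    """
    w_t, w_c, w_d = weights
    tags = set(transfer) | set(complement) | set(denoise)
    return {
        t: w_t * transfer.get(t, 0.0)
        + w_c * complement.get(t, 0.0)
        + w_d * denoise.get(t, 0.0)
        for t in tags
    }


# Toy data: 2-D features and made-up detector / hashtag-mapping scores.
keyframe = [1.0, 0.0]
images = [
    ([1.0, 0.0], ["dog"]),
    ([0.0, 1.0], ["beach"]),
    ([1.0, 1.0], ["dog", "grass"]),
]
transferred = transfer_tags(keyframe, images, k=2)
detected = {"dog": 0.9, "running": 0.6}   # stand-in for detector output
mapped = {"dog": 1.0}                     # stand-in for denoised hashtags
refined = fuse(transferred, detected, mapped)
ranking = sorted(refined, key=refined.get, reverse=True)
```

In this toy run, a tag supported by all three sources ("dog") ranks above tags supported by only one, which is the intuition behind combining the sources rather than relying on the hashtags alone.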