Abstract:Micro-video is an increasingly prevalent social media form, which attracts much attention for its convenient acquisition and expressive ability. However, for the user-generated hashtags of micro-videos have seriously unbalanced distribution and low quality, the management of micro-videos becomes challenging. In this paper, we propose a novel tag refinement approach for micro-videos by learning from multiple public data sources with manually labelled tags, which can overcome the difficulty of directly refining the imprecise hashtags and address the problem of lacking manually labelled micro-video datasets for training. We define a set of target tags by referring to the widely used datasets for object, activity and scene detection. In tag refinement, we firstly transfer the tags from the images in NUS-WIDE to the micro-video keyframes by similarity measurement. Meanwhile, we complete the tags by detecting the objects, activities and scenes in micro-videos based on appearance features and motion features with the assistance of the datasets, namely, ImageNet , PASCAL VOC , HMDB51 , UCF50 and SUN . We also denoise the hashtags by constructing the mapping relationships among hashtags and target tags based on the statistics on NUS-WIDE . The results of tag transfer, complement and denoising are finally linearly combined to generate the tag refinement results of micro-videos. To validate the performance, we construct a dataset with 600 micro-videos from Vine, and manually labelled the micro-videos with target tags. The experimental results show that our approach can obtain good performance in tag refinement of micro-videos by learning from multiple data sources.

Image Tag Refinement with View-Dependent Concept Representations

Image Tag Refinement Towards Low-Rank, Content-Tag Prior and Error Sparsity.

Contextual Modeling on Auxiliary Points for Robust Image Reranking

Clickthrough Refinement for Improved Graph Ranking

A Collaborative Learning Framework to Tag Refinement for Points of Interest

Towards Training-Free Refinement for Semantic Indexing of Visual Media.

Bridging the Semantic Gap Between Image Contents and Tags

Tag Clustering and Refinement on Semantic Unity Graph

Tag Refinement of Micro-Videos by Learning from Multiple Data Sources

Training-free indexing refinement for visual media via multi-semantics.

Deep Enhanced Weakly-Supervised Hashing with Iterative Tag Refinement

Improve Web Image Retrieval by Refining Image Annotations

Image Captioning with Attribute Refinement.

Discovering Visual Concept Structure with Sparse and Incomplete Tags

An Algorithm for the Automatic Annotation Refinement on Large-Scale Web Images

Extracting Effective Image Attributes with Refined Universal Detection

Image Tag Completion and Refinement by Subspace Clustering and Matrix Completion

Kill Two Birds with One Stone: Weakly-Supervised Neural Network for Image Annotation and Tag Refinement

Subspace Clustering Based Tag Sharing for Inductive Tag Matrix Refinement with Complex Errors

Semantically Smoothed Refinement for Everyday Concept Indexing.

Low-rank image tag completion with dual reconstruction structure preserved.