Multimodal visual dictionary learning via heterogeneous latent semantic sparse coding

Chenxiao Li, Guiguang Ding, Jile Zhou, Yuchen Guo, Qiang Liu
DOI: https://doi.org/10.1117/12.2073276
2014-01-01
Abstract: Visual dictionary learning, a crucial task in image representation, has gained increasing attention. Sparse coding in particular is widely used owing to its intrinsic advantages. In this paper, we propose a novel heterogeneous latent semantic sparse coding model. The central idea is to bridge heterogeneous modalities by capturing their common sparse latent semantic structure, so that the learned visual dictionary can describe both the visual and the textual properties of the training data. Experiments on image categorization and retrieval tasks demonstrate that our model outperforms several recent methods, such as K-means and sparse coding.