Abstract: To go beyond the query-by-example paradigm in image retrieval, large image collections need semantic indexing that supports intuitive text-based image search. Different models have been proposed to learn the dependencies between the visual content of an image set and the associated text captions, thereby allowing the automatic creation of semantic indices for unannotated images. The task, however, remains unsolved. In this paper, we present three alternatives for learning a Probabilistic Latent Semantic Analysis (PLSA) model for annotated images, and evaluate their respective performance for automatic image indexing. Under the PLSA assumptions, an image is modeled as a mixture of latent aspects that generates both image features and text captions, and we investigate three ways to learn the mixture of aspects. We also propose a more discriminative image representation than the traditional Blob histogram, concatenating quantized local color information and quantized local texture descriptors. The first learning procedure is a standard EM algorithm, which implicitly assumes that the visual and the textual modalities can be treated equivalently. The other two are based on asymmetric PLSA learning, which allows the definition of the latent space to be constrained to either the visual or the textual modality. We demonstrate that the textual modality is more appropriate for learning a semantically meaningful latent space, which translates into improved annotation performance. We compare our learning algorithms with recent methods on a standard dataset, and a detailed performance evaluation shows the validity of our framework.
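
As an illustration of the asymmetric learning idea described in the abstract, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes that "constraining the latent space on the textual modality" can be approximated by first fitting PLSA with EM on caption counts, then holding the per-image aspect mixture P(z|d) fixed while estimating P(v|z) for the quantized visual features. The function names (`plsa_em`, `word_probs`), the toy count matrices, and the choice of two aspects are all hypothetical.

```python
import random


def normalize(row):
    """Scale a list of non-negative numbers to sum to 1 (uniform if all zero)."""
    total = sum(row)
    if total == 0:
        return [1.0 / len(row)] * len(row)
    return [x / total for x in row]


def plsa_em(counts, n_aspects, n_iter=50, seed=0, p_z_d=None):
    """Fit PLSA by EM on a document-by-term count matrix (list of lists).

    If p_z_d is given, it is held fixed and only P(term|z) is estimated,
    which is the asymmetric variant assumed in this sketch.
    Returns (p_z_d, p_w_z).
    """
    rng = random.Random(seed)
    n_docs, n_terms = len(counts), len(counts[0])
    fixed = p_z_d is not None
    if not fixed:
        p_z_d = [normalize([rng.random() for _ in range(n_aspects)])
                 for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() for _ in range(n_terms)])
             for _ in range(n_aspects)]

    for _ in range(n_iter):
        new_w_z = [[0.0] * n_terms for _ in range(n_aspects)]
        new_z_d = [[0.0] * n_aspects for _ in range(n_docs)]
        for d in range(n_docs):
            for w in range(n_terms):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: posterior P(z|d,w), proportional to P(z|d) * P(w|z)
                post = normalize([p_z_d[d][z] * p_w_z[z][w]
                                  for z in range(n_aspects)])
                # Accumulate expected counts for the M-step
                for z in range(n_aspects):
                    new_w_z[z][w] += n * post[z]
                    new_z_d[d][z] += n * post[z]
        # M-step: renormalize expected counts into distributions
        p_w_z = [normalize(r) for r in new_w_z]
        if not fixed:
            p_z_d = [normalize(r) for r in new_z_d]
    return p_z_d, p_w_z


def word_probs(p_z_d_row, p_w_z):
    """P(w|d) = sum over z of P(w|z) * P(z|d) for one image."""
    n_terms = len(p_w_z[0])
    return [sum(p_z_d_row[z] * p_w_z[z][w] for z in range(len(p_w_z)))
            for w in range(n_terms)]


# Toy data (hypothetical): 4 images, 2 caption words, 3 visual codewords.
text_counts = [[2, 0], [1, 0], [0, 3], [0, 1]]
visual_counts = [[3, 1, 0], [2, 0, 0], [0, 1, 4], [0, 0, 2]]

# Step 1: learn the aspect mixture from the textual modality only.
p_z_d, p_t_z = plsa_em(text_counts, n_aspects=2)
# Step 2: keep P(z|d) fixed and estimate P(v|z) for the visual modality.
_, p_v_z = plsa_em(visual_counts, n_aspects=2, p_z_d=p_z_d)

caption_dist = word_probs(p_z_d[0], p_t_z)
```

Calling `plsa_em` on a single count matrix without `p_z_d` gives the standard symmetric EM fit for that matrix. Annotating an unseen image would additionally require estimating its P(z|d) from visual features with P(v|z) held fixed, which this sketch omits.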

Learning Latent Semantic Model with Visual Consistency for Image Analysis

Probabilistic Latent Semantic Analysis for Sketch-Based 3D Model Retrieval

Efficient Probabilistic Latent Semantic Analysis with Sparsity Control

Modeling semantic aspects for cross-media image indexing

Semantic image classification using statistical local spatial relations model

Image Categorization Via Robust PLSA

Multilabel image annotation based on double-layer PLSA model.

Dynamic Threshold Model Based Probabilistic Latent Semantic Analysis

LDA Model Combined Spatial Information for Visual Object Recognition Research

Integrating Image Segmentation And Annotation Using Supervised PLSA

Latent Topic Visual Language Model for Object Categorization.

Semi-supervised topic modeling for image annotation.

Partial Membership Latent Dirichlet Allocation

Regularized Semi-Supervised Latent Dirichlet Allocation for Visual Concept Learning

Multidimensional Latent Semantic Analysis Using Term Spatial Information

Supervised LDA for Image Annotation

Visual language modeling for image classification.

Semantic Correlation Mining between Images and Texts with Global Semantics and Local Mapping.

How Does Latent Semantic Analysis Work? A Visualisation Approach

Scene Classification Using Class-Supervised Local-Space-constraint Latent Dirichlet Allocation

Direct Semantic Analysis for Social Image Classification