Abstract: Different kinds of high-dimensional visual features can be extracted from a single image. Images can thus be treated as multiple view data, with each type of extracted high-dimensional visual feature taken as a particular understanding of the image. In this paper, we propose a framework of sparse unsupervised dimensionality reduction for multiple view data. The goal of our framework is to find a low-dimensional optimal consensus representation from multiple heterogeneous features by multiview learning. In this framework, we first learn low-dimensional patterns individually from each view, taking into account the specific statistical properties of that view. We then construct a low-dimensional optimal consensus representation from those learned patterns in order to leverage the complementary nature of the multiple views. We formulate the construction of the consensus representation as approximating the matrix of patterns by the product of a low-dimensional consensus base matrix and a loading matrix. To select the most discriminative features for the spectral embedding of multiple views, we impose an ℓ1-norm penalty on the columns of the loading matrix and orthogonality constraints on the base matrix. We develop a new alternating algorithm, spectral sparse multiview embedding, to obtain the solution efficiently. Each row of the loading matrix encodes structured information corresponding to multiple patterns. To gain flexibility in sharing information across subsets of the views, we impose a novel structured sparsity-inducing norm penalty on the rows of the loading matrix. This penalty lets the loading coefficients adaptively share information across subsets of the learned patterns. We call this method structured sparse multiview dimensionality reduction. Experiments on a toy benchmark image data set and two real-world Web image data sets demonstrate the effectiveness of the proposed algorithms.
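
The factorization described in the abstract can be illustrated with a minimal sketch. It is not the authors' spectral sparse multiview embedding algorithm: it is a generic alternating scheme for one plausible reading of the objective, minimizing ||P - BW||_F^2 + lam * ||W||_1 subject to B^T B = I, where P stacks the per-view patterns side by side. The function name `sparse_consensus`, the parameter `lam`, the plain elementwise ℓ1 penalty, and the random initialization are all assumptions made for illustration; the structured row-wise penalty is not covered.

```python
import numpy as np

def soft_threshold(x, t):
    # Elementwise soft-thresholding: the proximal operator of t * |x|.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def sparse_consensus(patterns, d, lam=0.1, n_iter=50, seed=0):
    """Hypothetical sketch: factor stacked per-view patterns as P ~ B @ W.

    patterns : list of (n_samples, k_v) arrays, one per view (assumed given)
    d        : dimension of the consensus representation (d <= n_samples)
    lam      : weight of the l1 penalty on the loading matrix W
    Returns B (n_samples, d) with orthonormal columns and W (d, sum of k_v).
    """
    P = np.hstack(patterns)                      # matrix of learned patterns
    n = P.shape[0]
    rng = np.random.default_rng(seed)
    B, _ = np.linalg.qr(rng.standard_normal((n, d)))  # orthonormal start
    for _ in range(n_iter):
        # W-step: with B^T B = I the problem decouples per entry, so the
        # minimizer is a soft-thresholded projection of P onto B.
        W = soft_threshold(B.T @ P, lam / 2.0)
        # B-step: orthogonal Procrustes problem, solved through an SVD.
        U, _, Vt = np.linalg.svd(P @ W.T, full_matrices=False)
        B = U @ Vt
    W = soft_threshold(B.T @ P, lam / 2.0)       # loadings for the final B
    return B, W
```

Under these assumptions, the rows of `B` serve as the low-dimensional consensus representation of the samples, and a larger `lam` drives more entries of `W` to exactly zero, so fewer learned patterns contribute to each consensus dimension.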

On multi-view feature learning

Multi-View Correlated Feature Learning by Uncovering Shared Component

Transformational Sparse Coding

Joint Learning of Latent Similarity and Local Embedding for Multi-View Clustering

Generalized Multi-view Embedding for Visual Recognition and Cross-modal Retrieval

Self-Supervised Multi-View Learning via Auto-Encoding 3D Transformations

Spatiotemporal Feature Learning for Event-Based Vision

A Survey on Multi-view Learning

A Multi-View Fusion Method Via Tensor Learning And Gradient Descent For Image Features

Multi-view Learning Overview: Recent Progress and New Challenges

Co-Learning Non-Negative Correlated and Uncorrelated Features for Multi-View Data

One-Pass Multi-View Learning

Subspace-based Multi-View Fusion for Instance-Level Image Retrieval

Sparse Unsupervised Dimensionality Reduction for Multiple View Data

Multi-view Latent Space Learning Based on Local Discriminant Embedding

Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks

Embedded Deep Bilinear Interactive Information and Selective Fusion for Multi-view Learning

Multi-View Matrix Decomposition: A New Scheme for Exploring Discriminative Information

Multi-view task-driven recognition in visual sensor networks

Pairwise Decomposition of Image Sequences for Active Multi-view Recognition

Tensorized Multi-view Subspace Representation Learning