Distributed Binary Subspace Learning on Large-Scale Cross Media Data

Xueyi Zhao, Chenyi Zhang, Zhongfei Zhang
DOI: https://doi.org/10.1007/s13735-015-0081-4
2015-01-01
International Journal of Multimedia Information Retrieval
Abstract: Large-scale data is ubiquitous in today's real-world applications, including learning on cross-media data. We propose a semi-supervised learning method, Multiple Binary Subspace Regression (MBSR), for cross-media data classification. To mine the features common to data with multiple modalities, we project the original cross-media data into a shared low-rank representation by mapping each modality to its corresponding subspace for dimension reduction. All subspaces are constrained to be binary, so the subsequent computation involves only additions and no multiplications. The dimension reduction to a binary subspace and the classification on that subspace are optimized simultaneously, yielding a semi-supervised model. To handle large-scale data, the method can be readily implemented on a MapReduce-based Hadoop system. Empirical studies demonstrate competitive convergence, efficiency, and scalability compared with state-of-the-art methods.
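The claim that a binary subspace needs only additions can be illustrated with a minimal sketch. It is not the MBSR algorithm itself: the function names `binary_project` and `dense_project`, the toy vector `x`, the basis `B`, and the assumption that "binary" means entries in {0, 1} are all hypothetical choices made for illustration.

```python
from typing import List


def binary_project(x: List[float], B: List[List[int]]) -> List[float]:
    """Project x (length d) onto k binary basis vectors (columns of B).

    Assumption: entries of B are 0 or 1, so each output coordinate is
    just the sum of the selected input features, with no multiplications.
    """
    d, k = len(x), len(B[0])
    z = [0.0] * k
    for i in range(d):
        for j in range(k):
            if B[i][j]:          # entry is 1: add this feature
                z[j] += x[i]     # addition only
    return z


def dense_project(x: List[float], B: List[List[int]]) -> List[float]:
    """Reference projection using ordinary multiply-accumulate."""
    return [sum(x[i] * B[i][j] for i in range(len(x)))
            for j in range(len(B[0]))]


# Toy data (hypothetical): a 4-dimensional sample and a 4x2 binary basis.
x = [0.5, -1.0, 2.0, 3.0]
B = [[1, 0],
     [0, 1],
     [1, 1],
     [0, 0]]

z = binary_project(x, B)
print(z)                         # [2.5, 1.0]
print(z == dense_project(x, B))  # True
```

The addition-only routine matches the multiply-accumulate reference on this example, which is the property the abstract attributes to binary subspaces.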