Abstract: Subspace learning is an effective and widely used technique for image feature extraction and classification. However, for large-scale image recognition problems in real-world applications, many subspace learning methods suffer from a heavy computational burden. To reduce the computational time and improve the recognition performance of subspace learning in this setting, we introduce the idea of parallel computing, which reduces the time complexity by splitting the original task into several subtasks. We develop a parallel subspace learning framework. In this framework, we first divide the sample set into several subsets using two random data division strategies: equal data division and unequal data division. These two strategies correspond to nodes with equal and unequal computational abilities in a parallel computing environment. Next, we calculate projection vectors from each subset in parallel. The graph embedding technique is employed to provide a general formulation for parallel feature extraction. After combining the extracted features from all nodes, we present a unified criterion to select the most distinctive features for classification. Under the developed framework, we propose a supervised and an unsupervised parallel subspace learning approach, called parallel linear discriminant analysis (PLDA) and parallel locality preserving projection (PLPP), respectively. PLDA selects the features with the largest Fisher scores by estimating the weighted and unweighted sample scatter, while PLPP selects the features with the smallest Laplacian scores by constructing a whole affinity matrix. Theoretically, we analyze the time complexities of the proposed approaches and provide fundamental support for applying the random division strategies. In the experiments, we establish two real parallel computing environments and employ four public image and video databases as the test data. Experimental results demonstrate that the proposed approaches outperform several related supervised and unsupervised subspace learning methods and significantly reduce the computational time.
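
The supervised pipeline outlined in the abstract (random division, per-subset projection vectors, combination, Fisher-score selection) can be illustrated with a small NumPy sketch. This is not the authors' PLDA implementation: the function names, the regularization term `reg`, the use of plain regularized LDA on each subset, and the scoring of each projected feature on the full sample set are all assumptions made for illustration. The per-subset loop runs serially here; in a parallel environment each iteration would be assigned to a node.

```python
import numpy as np

def random_division(n_samples, n_nodes, weights=None, seed=0):
    """Randomly split sample indices into n_nodes subsets (hypothetical helper).

    weights=None gives an equal division; otherwise subset sizes are
    proportional to the given per-node weights (unequal division).
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)
    if weights is None:
        return np.array_split(perm, n_nodes)
    w = np.asarray(weights, dtype=float)
    cuts = np.rint(np.cumsum(w / w.sum())[:-1] * n_samples).astype(int)
    return np.split(perm, cuts)

def lda_directions(X, y, n_dirs, reg=1e-3):
    """Regularized LDA on one subset; returns unit-norm projection vectors as columns."""
    d = X.shape[1]
    mean = X.mean(axis=0)
    Sw = reg * np.eye(d)          # within-class scatter (regularized, an assumption)
    Sb = np.zeros((d, d))         # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        diff = Xc - Xc.mean(axis=0)
        Sw += diff.T @ diff
        m = (Xc.mean(axis=0) - mean)[:, None]
        Sb += len(Xc) * (m @ m.T)
    # Solve Sb w = lambda Sw w through a Cholesky whitening of Sw.
    L = np.linalg.cholesky(Sw)
    A = np.linalg.solve(L, Sb)
    M = np.linalg.solve(L, A.T).T  # equals L^{-1} Sb L^{-T}
    M = (M + M.T) / 2              # remove round-off asymmetry before eigh
    _, V = np.linalg.eigh(M)       # eigenvalues in ascending order
    W = np.linalg.solve(L.T, V[:, ::-1][:, :n_dirs])
    return W / np.linalg.norm(W, axis=0)

def fisher_scores(Z, y):
    """Fisher score of each column of Z: between-class over within-class scatter."""
    mean = Z.mean(axis=0)
    between = np.zeros(Z.shape[1])
    within = np.zeros(Z.shape[1])
    for c in np.unique(y):
        Zc = Z[y == c]
        between += len(Zc) * (Zc.mean(axis=0) - mean) ** 2
        within += ((Zc - Zc.mean(axis=0)) ** 2).sum(axis=0)
    return between / (within + 1e-12)

def parallel_lda_sketch(X, y, n_nodes=3, n_dirs=2, n_keep=2, weights=None, seed=0):
    """Divide, extract per subset, combine, then keep the highest-scoring features."""
    subsets = random_division(len(X), n_nodes, weights, seed)
    # Each subset is independent of the others, so this loop could run in parallel.
    W = np.hstack([lda_directions(X[idx], y[idx], n_dirs) for idx in subsets])
    scores = fisher_scores(X @ W, y)
    keep = np.argsort(-scores)[:n_keep]
    return W[:, keep], scores[keep]
```

Calling `parallel_lda_sketch(X, y, weights=[1, 2, 3])` would mimic the unequal division strategy for three nodes of differing capacity, while omitting `weights` gives the equal division.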

A Parallel and Modular Pattern Classification Framework for Large-Scale Problems

Patent Classification Using Parallel Min-Max Modular Support Vector Machine

Parallel Learning of Large-Scale Multi-Label Classification Problems with Min-Max Modular Liblinear

Parallel Learning - A New Framework for Machine Learning

A Parallel SVM Training Algorithm on Large-Scale Classification Problems

Parallel Graph Pattern Matching in Massive Networks Based on MapReduce

A Scalable Pattern Recognition Method

Scalable string matching framework enhanced by pattern clustering

Large-Scale Patent Classification With Min-Max Modular Support Vector Machines

A Generic Parallel Pattern-Based System for Bioinformatics

An ensemble framework for patent classification

Supervised and Unsupervised Parallel Subspace Learning for Large-Scale Image Recognition

Adaptive Taxonomy Learning and Historical Patterns Modelling for Patent Classification

A novel distributed machine learning method for classification: parallel covering algorithm

Large Scale Online Kernel Classification

A Mahout Based Image Classification Framework for Very Large Dataset

A Modular Massively Parallel Learning Framework for Brain-Like Computers

Exploratory Parallel Hybrid Sampling Framework for Imbalanced Data Classification

Task Decomposition and Module Combination Based on Class Relations: a Modular Neural Network for Pattern Classification

Massively scalable prototype learning for heterogeneous parallel computing architecture