Abstract: Recent research increasingly emphasizes analyzing multiple features to improve face recognition (FR) performance. One popular scheme extends the sparse representation based classification framework with various sparse constraints. Although these methods study multiple features jointly through the constraints, they process each feature individually and thus overlook possible high-level relationships among different features. It is reasonable to assume that low-level features of facial images, such as edge information and smoothed/low-frequency images, can be fused into a more compact and more discriminative representation based on this latent high-level relationship. FR on the fused features is expected to outperform FR on the original features, since the fused features have more favorable properties. Motivated by this, we propose two strategies that first fuse multiple features and then exploit the dictionary learning (DL) framework for better FR performance. The first strategy is a simple and efficient two-step model, which learns a fusion matrix from training face images to fuse multiple features and then learns class-specific dictionaries from the fused features. The second is a more effective but more computationally expensive model that learns the fusion matrix and the class-specific dictionaries simultaneously within an iterative optimization procedure. In addition, the second model separates the shared common components from the class-specific dictionaries to enhance their discriminative power. 
The proposed strategies, which integrate multi-feature fusion and dictionary learning for FR, achieve the following goals: (1) exploiting multiple features of face images for better FR performance; (2) learning a fusion matrix that merges the features into a more compact and more discriminative representation; (3) learning class-specific dictionaries that account for common patterns for better classification performance. We perform a series of experiments on publicly available databases to evaluate our methods, and the experimental results demonstrate the effectiveness of the proposed models.
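
The two-step strategy described above (fuse features with a matrix, then learn one dictionary per class) can be illustrated with a rough sketch. The abstract does not say how the fusion matrix or the dictionaries are actually optimized, so everything below is an assumption made for illustration: the fusion matrix is replaced by the top singular directions of the concatenated features, each class dictionary is the leading singular vectors of that class's fused samples, and classification uses the smallest reconstruction residual. The function names (`learn_fusion_matrix`, `fuse`, `learn_class_dictionaries`, `classify`) are hypothetical and are not taken from the paper.

```python
import numpy as np


def learn_fusion_matrix(features, out_dim):
    """Stand-in for the learned fusion matrix (assumption, not the paper's method).

    features: list of arrays, each of shape (n_samples, d_k), one per feature type.
    Returns W of shape (sum of d_k, out_dim), built from the top right singular
    vectors of the concatenated, uncentered feature matrix.
    """
    X = np.hstack(features)
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return vt[:out_dim].T


def fuse(features, W):
    """Project the concatenated features onto the fused representation."""
    return np.hstack(features) @ W


def learn_class_dictionaries(Z, labels, n_atoms):
    """One dictionary per class (assumption: leading left singular vectors
    of that class's fused samples, used as orthonormal atoms)."""
    dicts = {}
    for c in np.unique(labels):
        Zc = Z[labels == c].T                      # (fused_dim, n_class_samples)
        U, _, _ = np.linalg.svd(Zc, full_matrices=False)
        dicts[c] = U[:, :n_atoms]
    return dicts


def classify(z, dicts):
    """Assign z to the class whose dictionary reconstructs it with the
    smallest least-squares residual."""
    best_class, best_residual = None, np.inf
    for c, D in dicts.items():
        coef, *_ = np.linalg.lstsq(D, z, rcond=None)
        residual = np.linalg.norm(z - D @ coef)
        if residual < best_residual:
            best_class, best_residual = c, residual
    return best_class
```

In this sketch, `features` might hold an edge-feature matrix and a low-frequency-feature matrix with one row per training image. The second strategy, which the abstract says updates the fusion matrix and the dictionaries together and separates shared components, is not covered here.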

Fusion of deep shallow features and models for speaker recognition

Integration of multi-feature fusion and dictionary learning for face recognition

Sentiment Analysis Using Deep Robust Complementary Fusion of Multi-Features and Multi-Modalities

Audio-Visual Speech Enhancement with Deep Multi-modality Fusion

Attentive Feature Fusion for Robust Speaker Verification

Acoustic Model Fusion for End-to-end Speech Recognition

Fusion of Embeddings Networks for Robust Combination of Text Dependent and Independent Speaker Recognition

A Fusion Approach to Spoken Language Identification Based on Combining Multiple Phone Recognizers and Speech Attribute Detectors

Text-independent Speaker Recognition Based on X-vector

A novel speech feature fusion algorithm for text-independent speaker recognition

Coarse-Grained Attention Fusion with Joint Training Framework for Complex Speech Enhancement and End-to-End Speech Recognition

Multi-feature Combination for Speaker Recognition

Fine-tune Pre-Trained Models with Multi-Level Feature Fusion for Speaker Verification

A Fishervoice Based Feature Fusion Method for Short Utterance Speaker Recognition

Deep fusion framework for speech command recognition using acoustic and linguistic features

A Simple Fusion of Deep and Shallow Learning for Acoustic Scene Classification

Depth-First Neural Architecture with Attentive Feature Fusion for Efficient Speaker Verification

Multi-resolution Time Frequency Feature and Complementary Combination for Short Utterance Speaker Recognition

Speaker Verification With Deep Features

Combining Information from Multi-Stream Features Using Deep Neural Network in Speech Recognition

Deep Speaker Embedding Learning with Multi-level Pooling for Text-independent Speaker Verification