Abstract: If fusion rules cannot adapt to changes in the environment and in individual users, multimodal systems may perform worse than unimodal systems when one or more modalities suffer data degradation. This paper develops a robust face and ear based multimodal biometric system using Sparse Representation (SR), which integrates face and ear at the feature level and adjusts the fusion rule according to the reliability difference between the modalities. We first propose a novel index, the Sparse Coding Error Ratio (SCER), to measure the reliability difference between the face and ear query samples. SCER is then used to develop an adaptive feature weighting scheme that dynamically reduces the negative effect of the less reliable modality. In the multimodal classification phase, two SR-based classification techniques are employed: Sparse Representation based Classification (SRC) and Robust Sparse Coding (RSC). Finally, we derive a category of SR-based multimodal recognition methods, including Multimodal SRC with feature Weighting (MSRCW) and Multimodal RSC with feature Weighting (MRSCW). Experimental results demonstrate that: (a) MSRCW and MRSCW perform significantly better than unimodal recognition using either face or ear alone, as well as the known multimodal methods; (b) adaptive feature weighting is effective: MSRCW and MRSCW are very robust to image degradation in one of the modalities, and even when the face (ear) query sample suffers from 100% random pixel corruption, they still achieve performance close to that of ear (face) unimodal recognition; (c) by combining the advantages of adaptive feature weighting and sparsity-constrained regression, MRSCW appears particularly well suited to the face and ear based multimodal recognition problem.
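
The weighting idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the SCER formula, so the weight below is a hypothetical stand-in that simply compares the sparse coding errors of the two modalities. The l1 solver is plain ISTA, chosen only for self-containment, and all function names (`ista`, `coding_error`, `src_classify`, `weighted_multimodal_src`) and the parameter `lam` are assumptions made for illustration.

```python
import numpy as np

def normalize_columns(D):
    """Scale every column (dictionary atom) to unit l2 norm."""
    return D / np.linalg.norm(D, axis=0, keepdims=True)

def ista(D, y, lam=0.01, n_iter=300):
    """Approximately solve min_x 0.5*||y - D x||^2 + lam*||x||_1 with ISTA."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        g = x - D.T @ (D @ x - y) / L      # gradient step on the quadratic term
        x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # soft threshold
    return x

def coding_error(D, y, lam=0.01):
    """Reconstruction error of y when sparsely coded over dictionary D."""
    return np.linalg.norm(y - D @ ista(D, y, lam))

def src_classify(D, labels, y, lam=0.01):
    """SRC-style decision: pick the class whose atoms give the smallest residual."""
    x = ista(D, y, lam)
    classes = np.unique(labels)
    residuals = [np.linalg.norm(y - D[:, labels == c] @ x[labels == c])
                 for c in classes]
    return classes[int(np.argmin(residuals))]

def weighted_multimodal_src(D_face, D_ear, labels, y_face, y_ear, lam=0.01):
    """Feature-level fusion with an error-ratio weight (hypothetical, not SCER)."""
    e_face = coding_error(D_face, y_face, lam)
    e_ear = coding_error(D_ear, y_ear, lam)
    # Assumption: the modality with the larger coding error is less reliable,
    # so it receives the smaller weight.
    w_face = e_ear / (e_face + e_ear + 1e-12)
    w_ear = 1.0 - w_face
    # Stack the weighted features of both modalities into one fused dictionary.
    D = np.vstack([w_face * D_face, w_ear * D_ear])
    y = np.concatenate([w_face * y_face, w_ear * y_ear])
    return src_classify(D, labels, y, lam), w_face, w_ear
```

Here `D_face` and `D_ear` hold one training sample per column in the same subject order, and `labels` is a NumPy array of subject identifiers. If the face query is heavily corrupted, its coding error grows, `w_face` shrinks, and the fused decision leans on the ear features, which mirrors the behaviour the abstract reports for a degraded modality.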

Sparse-Based Auditory Model for Robust Speaker Recognition

Learning Virtual HD Model for Bi-model Emotional Speaker Recognition

A Multi-Spike Approach For Robust Sound Recognition

Short Utterance Speaker Recognition Based on Speech High Frequency Information Compensation and Dynamic Feature Enhancement Methods

A Robust Face and Ear Based Multimodal Biometric System Using Sparse Representation

A novel hybrid feature method based on Caelen auditory model and gammatone filterbank for robust speaker recognition under noisy environment and speech coding distortion

Auditory model-based speech feature extraction and its application to speaker identification

Joint sparse representation based cepstral-domain dereverberation for distant-talking speech recognition

Bionic Cepstral coefficients (BCC): A new auditory feature extraction to noise-robust speaker identification

Self-attention Based Speaker Recognition Using Cluster-Range Loss

Speech Dereverberation Based on Sparse Matrix Decomposition

Mismatched Feature Detection with Finer Granularity for Emotional Speaker Recognition

Modified MFCCs for Robust Speaker Recognition

A Forward Masking Auditory Model And Its Application In Speaker Identification And Speech Recognition

ELM speaker identification for limited dataset using multitaper based MFCC and PNCC features with fusion score

Robust Speech Recognition by Selecting Mel-Filter Banks

MSRS: Training Multimodal Speech Recognition Models from Scratch with Sparse Mask Optimization

Robust Speech Recognition Based on Neighborhood Space

Hidden Markov Acoustic Modeling with Bootstrap and Restructuring for Low-Resourced Languages

Robust Front-End for Speech Recognition Based on Computational Auditory Scene Analysis and Speaker Model