Incorporating frequency masking filtering in a standard MFCC feature extraction algorithm

Weizhong Zhu, D. O'Shaughnessy
DOI: https://doi.org/10.1109/ICOSP.2004.1452739
Abstract: Frequency masking filtering is introduced into a standard mel-frequency cepstral coefficient (MFCC) feature extraction algorithm. It mimics a human masking mechanism to obtain more robust features when the input speech is distorted by various noises. The AURORA 2.0 database and the HTK speech recognition toolkit are used to evaluate the impact of the frequency masking filtering algorithm at various thresholds. With proper frequency masking coefficients, the algorithm achieves relative performance improvements of about 6.59%, 6.01%, and 1.20% over standard MFCC for test sets A, B, and C, respectively, under clean-condition training. It works well in all eight noise conditions and also proves effective when combined with other popular noise-robust techniques, such as cepstral mean normalization. The proposed algorithm is fairly simple and adds only a very small computational load.