An auditory inspired amplitude modulation filter bank for robust feature extraction in automatic speech recognition

Niko Moritz, Jörn Anemüller, Birger Kollmeier
DOI: https://doi.org/10.1109/TASLP.2015.2456420
2015-11-01
Abstract: The human ability to classify acoustic sounds is still unmatched compared to recent methods in machine learning. Psychoacoustic and physiological studies indicate that the auditory system of mammals decomposes audio signals into their acoustic and modulation frequency components prior to further analysis. Since it is known that most linguistic information is coded in amplitude fluctuations, mimicking temporal processing strategies of the auditory system in automatic speech recognition (ASR) promises to increase recognition accuracies. We present an amplitude modulation filter bank (AMFB) that is used as a feature extraction scheme in ASR systems. The time-frequency resolution of the employed FIR filters, i.e., bandwidth and modulation frequency settings, is adopted from a psychophysically inspired model of Dau