Integrating Spectrotemporal Context Into Features Based On Auditory Perception For Classification-Based Speech Separation

Xiang Li, Xihong Wu, Jing Chen
DOI: https://doi.org/10.1109/icassp.2019.8682503
2019-01-01
Abstract: Speech separation, which has been a challenging task for decades, especially at low signal-to-noise ratios (SNRs), can be cast as a classification problem. In such adverse acoustic environments, extracting robust features from noisy mixtures is crucial for successful classification. In past studies, features representing temporal dynamics, known as delta features, have been widely used. Combining basic features with their deltas yields better speech separation results than using basic features alone. In this study, the commonly used delta feature was modified according to characteristics of auditory perception, specifically the auditory processing of spectral change and spectral contrast. We therefore proposed a feature that integrates spectrotemporal context by replacing the commonly used delta feature with a spectral change feature and a spectral contrast feature. Experimental results showed that the proposed feature produced better speech segregation performance than the common delta feature.
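For readers unfamiliar with the baseline, the following is a minimal sketch of the conventional delta feature that the abstract refers to, not of the proposed spectral change or spectral contrast features, whose definitions the abstract does not give. It assumes the widely used regression formula d_t = sum_{n=1..N} n (c_{t+n} - c_{t-n}) / (2 sum_{n=1..N} n^2), with edge frames repeated at the boundaries. The function names `delta` and `append_deltas` and the window half-width `N` are illustrative choices, not taken from the paper.

```python
def delta(frames, N=2):
    """Regression-based delta along time.

    frames: list of T frames, each a list of feature values (e.g. one value
            per frequency channel). Assumed layout, not from the paper.
    N:      half-width of the regression window, in frames.
    """
    if N < 1:
        raise ValueError("N must be at least 1")
    T = len(frames)
    denom = 2 * sum(n * n for n in range(1, N + 1))
    out = []
    for t in range(T):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, N + 1):
                # Repeat the first/last frame when the window runs off the edge.
                ahead = frames[min(t + n, T - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def append_deltas(frames, N=2):
    """Concatenate each basic feature frame with its delta frame."""
    return [f + d for f, d in zip(frames, delta(frames, N))]
```

Calling `append_deltas(frames)` doubles the per-frame dimensionality, which corresponds to the "basic features combined with their deltas" input described above. The proposed feature would replace the `delta` part with auditory-motivated terms.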