Bispectral feature speech intelligibility assessment metric based on auditory model
Chen Xiaomei, Wang Xiaowei, Zhong Bo, Yang Jiayan, Shang Yingying
DOI: https://doi.org/10.1016/j.csl.2023.101492
IF: 3.252
2023-05-01
Computer Speech & Language
Abstract: A bispectral-feature-based predictive speech intelligibility metric (GMBSIM) built on a more refined functional auditory model of the ear is proposed. The auditory model combines a Gammatone filter bank with the Meddis inner hair cell model to simulate the function of the ear. The input speech signal is divided into 32 auditory subbands, each subband signal is passed through the inner hair cell model, and the bispectrum of each resulting time-domain subband signal is estimated frame by frame. Bispectral features are then extracted and selected to calculate speech intelligibility. Because it omits the spectrogram or neurogram image transformation, the proposed GMBSIM has relatively low computational complexity. Accounting for the ear's perception and processing of speech signals gives the metric an advantage over classical metrics. Last but not least, GMBSIM is verified favorably across a range of conditions spanning reverberation, additive noise, and distortion such as jitter, which indicates that it can be applied in most kinds of complex background-noise environments.
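The frame-wise bispectrum estimation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a generic direct (FFT-based) estimator, B(k1, k2) = mean over frames of X(k1) X(k2) X*(k1 + k2), and the function names, frame length, hop size, and Hann window are all assumptions chosen for illustration. The Gammatone filter bank and the Meddis inner hair cell stage are omitted; the input `x` simply stands in for one subband signal.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (hypothetical helper)."""
    if len(x) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def bispectrum(x, frame_len=128, hop=64):
    """Direct bispectrum estimate averaged over frames.

    frame_len and hop are illustrative defaults, not values from the paper.
    Returns a (frame_len, frame_len) complex array indexed by FFT bins.
    """
    frames = frame_signal(np.asarray(x, dtype=float), frame_len, hop)
    frames = frames - frames.mean(axis=1, keepdims=True)  # remove per-frame mean
    frames = frames * np.hanning(frame_len)               # taper each frame
    X = np.fft.fft(frames, axis=1)                        # (n_frames, frame_len)

    k = np.arange(frame_len)
    sum_idx = (k[:, None] + k[None, :]) % frame_len       # bin index of k1 + k2
    triple = X[:, :, None] * X[:, None, :] * np.conj(X[:, sum_idx])
    return triple.mean(axis=0)


# Usage: three phase-coupled tones at bins 8, 12 and 8 + 12 = 20.
n = np.arange(1024)
x = (np.cos(2 * np.pi * 8 * n / 128)
     + np.cos(2 * np.pi * 12 * n / 128)
     + np.cos(2 * np.pi * 20 * n / 128))
B = bispectrum(x)
```

For this test signal the magnitude of `B` is largest at the bin pair (8, 12) and its symmetric counterparts, which is the quadratic phase coupling a bispectrum is meant to expose. Any features derived from `B` for intelligibility prediction would be a separate step that the abstract does not specify.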
computer science, artificial intelligence