A non-invasive speech quality evaluation algorithm for hearing aids with multi-head self-attention and audiogram-based features

Ruiyu Liang, Yue Xie, Jiaming Cheng, Cong Pang, Björn Schuller
DOI: https://doi.org/10.1109/taslp.2024.3378107
2024-01-01
Abstract: The speech quality delivered by hearing aids plays a crucial role in user acceptance and satisfaction. In contrast to invasive speech quality evaluation methods, which require a pure signal as a reference, the algorithm proposed in this paper is non-invasive and uses multi-head self-attention and audiogram-based features. First, the audiogram of the hearing-impaired individual is extended along the frequency axis, enabling the speech quality evaluation model to learn that individual's frequency-band-specific gain requirements. Next, the spectrogram is extracted from the speech signal to be evaluated and combined with the transformed audiogram to form the input features. A network of multiple two-dimensional convolutional modules then extracts deep frame-level features. Temporal features are modeled using a bidirectional long short-term memory (BiLSTM) network, and a multi-head self-attention mechanism integrates contextual information, enabling the model to focus on key frames. Experimental results demonstrate that, compared with currently available advanced algorithms, the proposed network achieves a higher correlation with the Hearing Aid Speech Quality Index (HASQI) and is robust under various noise conditions.
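The abstract does not say how the audiogram is extended along the frequency axis or how it is combined with the spectrogram. The following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes thresholds measured at six standard audiometric frequencies, linear interpolation onto linearly spaced spectrogram bins, and a two-channel stacking of spectrogram and audiogram. The names `AUDIOGRAM_FREQS`, `extend_audiogram`, and `build_input` are hypothetical.

```python
from bisect import bisect_right

# Assumption: thresholds (dB HL) are measured at these audiometric frequencies (Hz).
AUDIOGRAM_FREQS = [250, 500, 1000, 2000, 4000, 8000]


def extend_audiogram(thresholds_db, n_bins, sample_rate):
    """Interpolate audiogram thresholds onto n_bins frequency bins (0 Hz to Nyquist).

    Bins below the first or above the last audiometric frequency take the
    nearest measured threshold. Linear interpolation is an assumption here.
    """
    if len(thresholds_db) != len(AUDIOGRAM_FREQS):
        raise ValueError("expected one threshold per audiometric frequency")
    if n_bins < 2:
        raise ValueError("n_bins must be at least 2")
    nyquist = sample_rate / 2
    extended = []
    for k in range(n_bins):
        f = k * nyquist / (n_bins - 1)  # centre frequency of bin k
        if f <= AUDIOGRAM_FREQS[0]:
            extended.append(float(thresholds_db[0]))
        elif f >= AUDIOGRAM_FREQS[-1]:
            extended.append(float(thresholds_db[-1]))
        else:
            i = bisect_right(AUDIOGRAM_FREQS, f) - 1
            f0, f1 = AUDIOGRAM_FREQS[i], AUDIOGRAM_FREQS[i + 1]
            t0, t1 = thresholds_db[i], thresholds_db[i + 1]
            extended.append(t0 + (t1 - t0) * (f - f0) / (f1 - f0))
    return extended


def build_input(spectrogram, thresholds_db, sample_rate):
    """Stack a spectrogram with the extended audiogram as a second channel.

    spectrogram: list of frames, each a list of n_bins magnitudes.
    Returns [spectrogram, audiogram_channel], i.e. shape [2][frames][bins].
    Two-channel stacking is an assumption; the abstract only says "combined".
    """
    n_bins = len(spectrogram[0])
    profile = extend_audiogram(thresholds_db, n_bins, sample_rate)
    audiogram_channel = [list(profile) for _ in spectrogram]
    return [spectrogram, audiogram_channel]
```

In this sketch the audiogram channel is identical for every frame, so a two-dimensional convolution over the stacked input sees the listener's threshold next to each time-frequency bin. The convolutional, BiLSTM, and multi-head self-attention stages described in the abstract would consume this tensor and are not shown.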
engineering, electrical & electronic,acoustics