The NPU System for DASR Task of CHiME-7 Challenge

Bingshen Mu, Pengcheng Guo, He Wang, Yangze Li, Yang Li, Pan Zhou, Wei Chen, Lei Xie
DOI: https://doi.org/10.21437/chime.2023-12
2023-01-01
Abstract: This study describes the NPU system for the Distant Automatic Speech Recognition (DASR) task of the CHiME-7 Challenge. Specifically, two attention-based channel selection modules are introduced to automatically select the most advantageous channel subset from multiple signal channels. Furthermore, we incorporate additional spatial features during the cross-channel attention, which guide the model to capture the desired signals while suppressing the interference sources. It is noteworthy that these enhancements pertain solely to the ASR model, with no modifications made to the speaker diarization (SD). Our approach achieves a macro diarization attributed word error rate (DA-WER) of 22.28% on the CHiME-7 dev sets with oracle diarization and 41.04% on the CHiME-7 dev sets with baseline SD results.
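The abstract does not say how the attention-based channel selection modules are constructed. As an illustration only, and not the authors' implementation, the general idea of scoring channels with attention and keeping a subset can be sketched as follows. The function names (`softmax`, `select_channels`), the query vector, the per-channel feature vectors, and the top-k rule are all assumptions made for this example.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def select_channels(channel_feats, query, k):
    """Hypothetical channel selection: score each channel by scaled
    dot-product attention against a query, then keep the top-k channels.

    channel_feats: one feature vector per channel (assumed summary features)
    query:         a reference vector of the same dimension (assumed)
    k:             number of channels to keep (assumed hyperparameter)
    """
    d = len(query)
    scores = [
        sum(q * f for q, f in zip(query, feat)) / math.sqrt(d)
        for feat in channel_feats
    ]
    weights = softmax(scores)
    # Rank channels by attention weight, highest first.
    ranked = sorted(range(len(weights)), key=lambda c: weights[c], reverse=True)
    return sorted(ranked[:k]), weights

# Toy input: 4 channels with 2-dimensional features (made-up values).
feats = [[0.1, 0.2], [0.9, 0.8], [0.5, 0.4], [0.0, -0.3]]
query = [1.0, 1.0]
selected, weights = select_channels(feats, query, k=2)
print(selected)
```

In a trained model the query and channel features would be learned representations rather than fixed vectors, and the selection could be soft (weighting) rather than a hard top-k cut; the sketch shows only the scoring-and-selecting pattern.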