Abstract: This paper describes our submission to the ICASSP 2022 Multi-channel Multi-party Meeting Transcription (M2MeT) Challenge. For Track 1, we propose several approaches that make a clustering-based speaker diarization system able to handle overlapped speech. Front-end dereverberation and direction-of-arrival (DOA) estimation are used to improve the accuracy of speaker diarization. Multi-channel combination and overlap detection are applied to reduce the missed speaker error. A modified DOVER-Lap is also proposed to fuse the results from different systems. We achieve a final DER of 5.79% on the Eval set and 7.23% on the Test set, which ranks 4th in the diarization challenge. For Track 2, we develop our system using the Conformer model in a joint CTC-attention architecture. Serialized output training (SOT) is adopted for multi-speaker overlapped speech recognition. We propose a neural front-end module to model multi-channel audio and train the model end-to-end. Various data augmentation methods are utilized to mitigate over-fitting in the multi-channel multi-speaker E2E system. Transformer language model fusion is developed to achieve better performance. The final CER is 19.2% on the Eval set and 20.8% on the Test set, which ranks 2nd in the ASR challenge.
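
As a rough illustration of the serialized output training (SOT) idea named in the abstract: SOT is commonly described as building a single target transcript for an overlapped segment by ordering the speakers' utterances by start time and joining them with a special speaker-change token. The sketch below shows only that label construction step. It is not the authors' implementation; the token string `<sc>`, the `(start_time, text)` tuple layout, and the function name `serialize_transcripts` are assumptions made for the example.

```python
from typing import List, Tuple

# Assumed spelling of the speaker-change separator; the real token is system-specific.
SC_TOKEN = "<sc>"


def serialize_transcripts(utterances: List[Tuple[float, str]]) -> str:
    """Build one SOT-style target string from (start_time, text) pairs.

    Utterances are ordered by start time (first in, first out) and joined
    with the speaker-change token.
    """
    ordered = sorted(utterances, key=lambda u: u[0])
    return f" {SC_TOKEN} ".join(text for _, text in ordered)


# Hypothetical overlapped segment with three speakers.
example = [(1.2, "sure let us start"), (0.0, "shall we begin"), (2.5, "agreed")]
print(serialize_transcripts(example))
```

With this kind of target, a single-output recognizer can be trained on overlapped speech without a separate output branch per speaker.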

GIST-AiTeR Speaker Diarization System for VoxCeleb Speaker Recognition Challenge (VoxSRC) 2023

GIST-AiTeR System for the Diarization Task of the 2022 VoxCeleb Speaker Recognition Challenge

The DKU-MSXF Diarization System for the VoxCeleb Speaker Recognition Challenge 2023

The DKU-DukeECE-Lenovo System for the Diarization Task of the 2021 VoxCeleb Speaker Recognition Challenge

North America Bixby Speaker Diarization System for the VoxCeleb Speaker Recognition Challenge 2021

Microsoft Speaker Diarization System for the VoxCeleb Speaker Recognition Challenge 2020

The SpeakIn System for VoxCeleb Speaker Recognition Challange 2021

The JHU submission to VoxSRC-21: Track 3

UNISOUND System for VoxCeleb Speaker Recognition Challenge 2023

The IDLAB VoxCeleb Speaker Recognition Challenge 2020 System Description

The DKU-MSXF Speaker Verification System for the VoxCeleb Speaker Recognition Challenge 2023

The Microsoft System for VoxCeleb Speaker Recognition Challenge 2022

XMUSPEECH System for VoxCeleb Speaker Recognition Challenge 2021

The ReturnZero System for VoxCeleb Speaker Recognition Challenge 2022

The xx205 System for the VoxCeleb Speaker Recognition Challenge 2020

The Volcspeech System for the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge

EML System Description for VoxCeleb Speaker Diarization Challenge 2020

SJTU-AISPEECH System for VoxCeleb Speaker Recognition Challenge 2022

TalTech-IRIT-LIS Speaker and Language Diarization Systems for DISPLACE 2024

The ID R&D VoxCeleb Speaker Recognition Challenge 2023 System Description

Royalflush Speaker Diarization System for ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge