Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond

Jiatong Shi, William Chen, Dan Berrebbi, Hsiu-Hsuan Wang, Wei-Ping Huang, En-Pei Hu, Ho-Lam Chuang, Xuankai Chang, Yuxun Tang, Shang-Wen Li, Abdelrahman Mohamed, Hung-yi Lee, Shinji Watanabe
2023-10-09
Abstract: The 2023 Multilingual Speech Universal Performance Benchmark (ML-SUPERB) Challenge expands upon the acclaimed SUPERB framework, emphasizing self-supervised models in multilingual speech recognition and language identification. The challenge comprises a Research Track focused on applying ML-SUPERB to specific multilingual subjects, a Challenge Track for model submissions, and a New Language Track where language resource researchers can contribute and evaluate their low-resource language data in the context of the latest progress in multilingual speech recognition. The challenge garnered 12 model submissions and 54 language corpora, resulting in a comprehensive benchmark encompassing 154 languages. The findings indicate that merely scaling models is not the definitive solution for multilingual speech tasks, and a variety of speech/voice types present significant challenges in multilingual speech processing.
Sound, Computation and Language, Audio and Speech Processing