Modernizing Open-Set Speech Language Identification

Mustafa Eyceoz, Justin Lee, Homayoon Beigi
DOI: https://doi.org/10.13140/RG.2.2.24797.28647
2022-05-21
Abstract: While most modern speech language identification methods are closed-set, we investigate whether they can be modified and adapted for the open-set problem. In the open-set setting, the system gains the ability to reject an audio input that fails to match any of the known languages. We tackle the open-set task by adapting two state-of-the-art approaches to closed-set language identification: the first uses a CRNN with attention and the second uses a TDNN. In addition to enhancing our input feature embeddings using MFCCs, log spectral features, and pitch, we attempt two approaches to out-of-set language detection: one using thresholds, and the other essentially performing a verification task. We compare the performance of the TDNN and the CRNN, as well as that of the two detection approaches.
Computation and Language, Artificial Intelligence, Machine Learning, Audio and Speech Processing
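
The threshold-based approach to out-of-set detection mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the scoring function or the threshold value, so everything here is an assumption made for illustration: the function names `softmax` and `identify_open_set`, the use of the maximum softmax posterior as the confidence score, and the default threshold of 0.7 are all hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw classifier scores into posterior probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def identify_open_set(logits, languages, threshold=0.7):
    """Return the most likely known language, or None to reject the input.

    Assumption (not from the paper): the input is rejected as out-of-set
    when the highest posterior falls below `threshold`.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return None  # no known language matches confidently enough
    return languages[best]


if __name__ == "__main__":
    known = ["en", "fr", "de"]
    print(identify_open_set([5.0, 0.1, 0.2], known))  # confident prediction
    print(identify_open_set([1.0, 1.1, 0.9], known))  # near-uniform scores
```

In this sketch a closed-set classifier (a CRNN or TDNN in the paper's setting) would supply the logits; the only open-set addition is the rejection branch. In practice the threshold would need to be tuned on held-out data containing out-of-set languages.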