Abstract: Language Identification (LID) is a crucial preliminary step in Automatic Speech Recognition (ASR): it identifies the spoken language from audio samples. Contemporary systems that process speech in multiple languages require users to explicitly designate one or more languages before use. LID becomes important in multilingual settings, where an ASR system that cannot determine the spoken language fails to recognize the speech. This study introduces a convolutional recurrent neural network (CRNN) based LID model that operates on Mel-frequency Cepstral Coefficient (MFCC) features of audio samples. We also replicate selected state-of-the-art approaches, specifically the Convolutional Neural Network (CNN) and the attention-based Convolutional Recurrent Neural Network (CRNN with attention), and compare them with our CRNN-based approach. In comprehensive evaluations on thirteen Indian languages, our model achieved over 98% classification accuracy. For linguistically similar languages, its performance ranges from 97% to 100%. The proposed model extends readily to additional languages and is robust to noise, achieving 91.2% accuracy in a noisy setting on a European Language (EU) dataset.

A reproduction of Apple's bi-directional LSTM models for language identification in short strings

Phonetic Temporal Neural Model for Language Identification

Phone-aware Neural Language Identification

LIMIT: Language Identification, Misidentification, and Translation using Hierarchical Models in 350+ Languages

Investigating model performance in language identification: beyond simple error statistics

A language model based approach towards large scale and lightweight language identification systems

LIDE: Language Identification from Text Documents

Streaming Language Identification using Combination of Acoustic Representations and ASR Hypotheses

Native Language Identification with Large Language Models

Byte-based Language Identification with Deep Convolutional Networks

A Deep Learning Approach for Similar Languages, Varieties and Dialects

Look, Listen and Learn - A Multimodal LSTM for Speaker Identification

Short Text Language Identification for Under Resourced Languages

Insights into End-to-End Learning Scheme for Language Identification

A Fast, Compact, Accurate Model for Language Identification of Codemixed Text

Attentive Temporal Pooling for Conformer-based Streaming Language Identification in Long-form Speech

Is Attention always needed? A Case Study on Language Identification from Speech

Towards spoken dialect identification of Irish

Automatic Language Identification for Romance Languages using Stop Words and Diacritics

Evaluating Input Representation for Language Identification in Hindi-English Code Mixed Text

Discriminating Between Similar Nordic Languages