A review on emotion recognition from dialect speech using feature optimization and classification techniques
Sunil Thimmaiah, Vinay N A, Ravikumar M G, Prasad S R
DOI: https://doi.org/10.1007/s11042-024-18297-7
IF: 2.577
2024-03-07
Multimedia Tools and Applications
Abstract: Emotion recognition from speech has gained prominence across various domains due to its wide-ranging applications. This paper presents a comprehensive review of advances in emotion recognition from dialect speech using feature optimization and classification techniques. Dialectal variations in speech introduce complexities that reduce the accuracy of emotion recognition models. To address this challenge, diverse feature extraction methods have been explored that capture both general and dialect-specific acoustic cues. Spectral, prosodic, and temporal features are adapted and optimized to better represent emotional content in dialect speech. Classification techniques play a pivotal role in distinguishing emotions in dialect speech. Traditional classifiers such as Support Vector Machines (SVMs), Gaussian Mixture Models (GMMs), and Hidden Markov Models (HMMs) have been employed. Recent studies highlight the efficacy of ensemble and deep learning approaches such as Random Forests, Gradient Boosting, Convolutional Neural Networks (CNNs), and Long Short-Term Memory networks (LSTMs). Feature selection and dimensionality reduction techniques optimize model performance: Principal Component Analysis (PCA), Recursive Feature Elimination (RFE), and genetic algorithms refine feature sets, improving classification accuracy and computational efficiency. Dialect-specific speech corpora address linguistic nuances and make models more relevant to distinct regions or communities. Challenges include the scarcity of labelled dialect emotion datasets, model generalization across multiple dialects, and ethical considerations. As the field evolves, striking a balance between performance and ethics remains imperative. This review underscores the promise of optimized feature extraction, innovative classification techniques, and tailored datasets in dialect-based emotion recognition.
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering
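The abstract names PCA and RFE as ways to refine a feature set before classification with an SVM. A minimal sketch of that kind of pipeline is shown below, assuming scikit-learn is available. Everything specific here is an assumption made for illustration and is not taken from the paper: the synthetic data stands in for per-utterance acoustic features, and the feature counts, class count, kernel choices, and the helper name `build_pipeline` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import RFE
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for acoustic features (NOT a dialect corpus):
# 300 "utterances", 40 features, 4 "emotion" classes. All values are assumed.
X, y = make_classification(
    n_samples=300, n_features=40, n_informative=12, n_redundant=8,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)


def build_pipeline(reducer):
    """Scale features, reduce them with `reducer`, then classify with an RBF SVM."""
    return Pipeline([
        ("scale", StandardScaler()),
        ("reduce", reducer),
        ("clf", SVC(kernel="rbf", C=1.0)),
    ])


# Two of the feature-optimization options named in the abstract.
reducers = {
    # PCA projects the 40 features onto 10 principal components.
    "pca": PCA(n_components=10),
    # RFE repeatedly drops the weakest feature of a linear SVM until 10 remain.
    "rfe": RFE(SVC(kernel="linear"), n_features_to_select=10),
}

scores = {}
for name, reducer in reducers.items():
    # Reduction is fitted inside each fold, so no test data leaks into it.
    fold_scores = cross_val_score(build_pipeline(reducer), X, y, cv=5)
    scores[name] = fold_scores.mean()
    print(f"{name}: mean 5-fold accuracy = {scores[name]:.3f}")
```

Placing the reducer inside the pipeline means it is refitted on the training portion of every fold, which avoids optimistic accuracy estimates. On real dialect speech, the synthetic matrix would be replaced by extracted spectral, prosodic, and temporal features, and the numbers printed here say nothing about performance on such data.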