Abstract: Speech emotion recognition (SER) is the task of identifying the emotional qualities of a voice regardless of its semantic content. Although people do this effortlessly as a natural part of spoken communication, doing it automatically with programmed systems remains a work in progress. Because it offers insight into human mental states, emotion identification from speech signals is a frequently investigated topic in the design of human–computer interface (HCI) models, where a user's emotion often must be determined as a form of feedback. This study attempts to distinguish seven emotions from speech signals: sadness, anger, disgust, pleasure, surprise, enjoyment, and a neutral mood. To identify the emotion, the proposed method uses a signal preprocessing step based on a randomness measure. The signals are first normalized to reduce noise. Because voice signals vary in length and are continuous in nature, emotion identification requires both local and global information. Local features describe dynamic behavior, while global features capture statistics such as the standard error, median, minimum, and maximum values. The SER system uses several feature types, including spectral features, voice quality features, and Teager energy operator-based features. Prosodic features are those based on human perception, such as rhythm and intonation; they depend on three factors: power, duration, and frequency. From the preprocessed signals, a feature vector is generated that evaluates the randomness measure for each of the emotions. Mutual information (MI) is then used to select a subset of features from the full set. The resulting feature vectors are classified using the BOAT method and association rule mining.
Experiments were carried out on the TESS dataset across several metrics, and the proposed method outperformed state-of-the-art methods.
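
To make a few of the steps named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes peak normalization (the abstract does not say which normalization is used), the standard discrete Teager energy operator, utterance-level statistics computed over that operator's output, and mutual information estimated from counts of already-discretized feature values. All function names are hypothetical.

```python
import math
import statistics
from collections import Counter


def peak_normalize(signal):
    """Scale samples so the largest absolute value is 1.0.

    Assumption: peak normalization stands in for the unspecified
    normalization step mentioned in the abstract.
    """
    peak = max(abs(x) for x in signal)
    return [x / peak for x in signal] if peak > 0 else list(signal)


def teager_energy(signal):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    return [signal[n] ** 2 - signal[n - 1] * signal[n + 1]
            for n in range(1, len(signal) - 1)]


def global_stats(values):
    """Global statistics: standard error, median, minimum, maximum."""
    std_err = statistics.stdev(values) / math.sqrt(len(values))
    return {
        "std_err": std_err,
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }


def mutual_information(xs, ys):
    """Mutual information in bits between two discrete sequences.

    Assumption: continuous features would be binned before this call.
    """
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def rank_features(feature_columns, labels):
    """Sort feature names by mutual information with the emotion labels."""
    scores = {name: mutual_information(col, labels)
              for name, col in feature_columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Usage on synthetic data: a pure tone stands in for a speech frame.
tone = [math.sin(0.3 * n) for n in range(200)]
teo = teager_energy(peak_normalize(tone))
stats = global_stats(teo)

# Toy discretized features: "a" matches the labels, "b" is independent.
labels = ["sad", "sad", "angry", "angry"]
columns = {"a": [0, 0, 1, 1], "b": [0, 1, 0, 1]}
ranking = rank_features(columns, labels)
```

For a pure sinusoid the Teager energy operator output is constant, which makes the synthetic tone a convenient sanity check. In a selection step of this kind, the features ranked highest by mutual information with the labels would be kept and the rest discarded.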
