An Efficient Voice-Based Emotion Recognition Using Hybrid Capsule Slime Mould Dense Deep Learning Framework

V. V. Satyanarayana Tallapragada, M. Naresh, G. V. Pradeep Kumar, V. Sireesha
DOI: https://doi.org/10.1142/s0218001424500174
IF: 1.261
2024-08-29
International Journal of Pattern Recognition and Artificial Intelligence
Abstract: International Journal of Pattern Recognition and Artificial Intelligence, Ahead of Print. Emotion recognition is the task of understanding another person's emotions and thoughts. Modern technology allows machines to recognize objects without human intervention. Existing emotion recognition systems struggle to produce accurate results from limited audio files. To address this problem, a bag-of-audio-terms-based hybrid deep learning model is introduced, described as a pioneering deep learning model. Input voice data is taken from a large dataset and pre-processed using data normalization and adaptive bilinear filtering. Afterward, acoustic features are extracted from the voice signals to capture information relevant to emotion recognition. These features can include linear prediction coefficients (LPC), the three-dimensional (3D) log-mel spectrum, mel-frequency cepstral coefficients (MFCCs), and prosodic features. Subsequently, feature selection is performed using an improved wild horse optimization (WHO) approach. Finally, a hybrid capsule slime mould dense deep learning framework (HCSDN) is used for voice-based emotion recognition. The IEMOCAP and EMODB datasets are used to evaluate system performance. On the IEMOCAP dataset, the proposed system achieves 96.78% accuracy, 96.45% specificity, 95.81% precision, 94.256% sensitivity, a 4.256% error rate, and a 0.75% false positive rate. Similarly, on the EMODB dataset, it achieves 96.85% accuracy, 95.74% specificity, 96.12% precision, 95.25% sensitivity, a 3.432% error rate, and a 0.62% false positive rate.
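The abstract reports accuracy, specificity, precision, sensitivity, error rate, and false positive rate but does not say how they were computed for the multi-class emotion labels. A minimal sketch, assuming the standard one-vs-rest definitions with macro averaging over classes, is given below. The function name `one_vs_rest_metrics` and the toy confusion matrix are illustrative assumptions, not taken from the paper.

```python
from typing import Dict, List


def one_vs_rest_metrics(confusion: List[List[int]]) -> Dict[str, float]:
    """Macro-averaged metrics from a square confusion matrix.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j. The averaging scheme here is an
    assumption; the abstract does not state which one the authors used.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    if n == 0 or total == 0:
        raise ValueError("confusion matrix must contain at least one sample")

    correct = sum(confusion[k][k] for k in range(n))
    sens, spec, prec, fpr = [], [], [], []

    for k in range(n):
        tp = confusion[k][k]                               # class k predicted as k
        fn = sum(confusion[k]) - tp                        # class k predicted as other
        fp = sum(confusion[i][k] for i in range(n)) - tp   # other predicted as k
        tn = total - tp - fn - fp                          # everything else
        sens.append(tp / (tp + fn) if tp + fn else 0.0)
        spec.append(tn / (tn + fp) if tn + fp else 0.0)
        prec.append(tp / (tp + fp) if tp + fp else 0.0)
        fpr.append(fp / (fp + tn) if fp + tn else 0.0)

    accuracy = correct / total
    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
        "sensitivity": sum(sens) / n,
        "specificity": sum(spec) / n,
        "precision": sum(prec) / n,
        "false_positive_rate": sum(fpr) / n,
    }


if __name__ == "__main__":
    # Toy two-class matrix (hypothetical counts, not data from the paper).
    toy = [[8, 2],
           [1, 9]]
    for name, value in one_vs_rest_metrics(toy).items():
        print(f"{name}: {value:.4f}")
```

Under these definitions the error rate is the complement of accuracy and the false positive rate is the complement of specificity for each class; other averaging schemes (for example, weighting by class size) would give different aggregate values.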
computer science, artificial intelligence