Feature extraction model for speech emotion detection with prodigious precedence assortment model using fuzzy-based convolution neural networks

Chandupatla Deepika, Swarna Kuchibhotla
DOI: https://doi.org/10.1007/s00500-023-08458-5
IF: 3.732
2023-06-08
Soft Computing
Abstract: Human speech emotion identification is a critical topic in research on Human–Computer Interfaces (HCIs) that aim to sympathize with people. When a computer communicates or interacts with individuals, speech emotion recognition can build stronger connections and help provide personalized service based on their emotions, helping to build people's confidence. Speech emotion recognition systems are valuable in a wide range of fields, including image processing, medical science, and machine learning. In response to human demands, the impact and potential use of automated speech emotion recognition have been rising in a wide range of applications, including human–machine communication, robot control, and driver status surveillance. Nevertheless, detecting outward expressions from images and recordings remains challenging because the important emotional features are difficult to isolate precisely. These features, such as static, dynamic, point-based geometric, or area-based appearance features, are widely discussed across a range of structures. The primary purpose of this research is to develop a reliable system for recognizing and identifying human speech emotions such as anger, sadness, happiness, surprise, fear, disgust, and neutral in real time. The input to this method is a speech recording, and the method distinguishes one element from another by applying a large set of weights to distinct parts of the speech while taking the signal levels into account. For accurate emotion identification, speech samples from a publicly available dataset are analyzed, and a speech emotion recognition model is implemented. This research proposes a CNN-based Speech Emotion Feature Extraction Model with Prodigious Precedence Assortment (CNN-SEFE-PPA), built on fuzzy-based convolutional neural networks, for precisely identifying speech emotions. When the proposed methodology is compared to the traditional model, the findings show that the proposed model is reliable.
computer science, artificial intelligence, interdisciplinary applications