Abstract: As technology has developed, research on speech emotion recognition and semantic analysis has grown in importance. This research is applied primarily in companion robots, technology products, and medical settings. This study proposes a communication system with speech emotion recognition. The system pre-processes speech with a sound data enhancement method and transforms the sound into a spectrogram using MFCCs (Mel-frequency cepstral coefficients). GoogLeNet, a CNN (convolutional neural network) architecture, is then applied to recognize five emotions (peace, happiness, sadness, anger, and fear), with a top recognition accuracy of 79.81%. For semantic analysis, the training texts are divided into two categories, positive and negative, and the chat conversations are conducted with a Seq2Seq framework based on an RNN (recurrent neural network). The system has two parts, a client and a server. The client is developed as an Android application, and the server runs on Ubuntu Linux combined with a web server. With this client-server framework, users record their voice in the app on their cell phone and upload the voice file to the server. The voice then undergoes speech emotion recognition by the CNN and semantic analysis by the RNN, so that the system functions as a chatbot that responds positively or negatively based on the detected emotion and shows the results in the app on the user's cell phone. The main contributions of this research are: 1) This study introduces Chinese word vectors to the robot dialogue system, effectively improving dialogue tolerance and semantic interpretation. 2) The traditional method of emotion identification must first tokenize the Chinese words, analyze the clauses and parts of speech, and capture the emotional keywords before they are interpreted by an expert system. In contrast, this study classifies the input directly with the convolutional neural network after the input sentence is converted into a spectrogram by MFCC. 3) In addition to implementing the companion robot, the system can collect the user's emotional index for analysis by a back-end care organization. Furthermore, compared with other commercial humanoid companion robots, this study is delivered as an app, which is easier to use and more economical.
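To make the MFCC step concrete, the following is a minimal sketch of how a speech signal could be turned into a frame-by-coefficient matrix of the kind a CNN can consume as an image. It is not the authors' implementation. The frame length, hop size, number of mel filters, number of coefficients, and all function names here are assumptions chosen for illustration, and a naive DFT is used in place of an FFT so that the example needs only the standard library.

```python
import math

def hz_to_mel(f):
    # Common mel-scale formula; roughly linear below 1 kHz, logarithmic above.
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of a Hamming-windowed frame via a naive DFT."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spec.append((re * re + im * im) / n)
    return spec

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    edges_hz = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges_hz[j - 1], edges_hz[j], edges_hz[j + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            f = k * sample_rate / n_fft  # center frequency of DFT bin k
            row.append(max(0.0, min((f - lo) / (mid - lo), (hi - f) / (hi - mid))))
        bank.append(row)
    return bank

def mfcc_frame(frame, sample_rate, n_filters=26, n_coeffs=13):
    """MFCCs of one frame: power spectrum -> mel energies -> log -> DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    log_e = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10) for row in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_e))
            for c in range(n_coeffs)]

def mfcc_matrix(signal, sample_rate, frame_len=256, hop=128):
    """Stack per-frame MFCCs into a 2-D matrix (frames x coefficients)."""
    return [mfcc_frame(signal[s:s + frame_len], sample_rate)
            for s in range(0, len(signal) - frame_len + 1, hop)]

if __name__ == "__main__":
    sr = 8000  # assumed sample rate for the demo
    tone = [math.sin(2 * math.pi * 440 * i / sr) for i in range(1024)]
    matrix = mfcc_matrix(tone, sr)
    print(len(matrix), "frames x", len(matrix[0]), "coefficients")
```

Each row of the resulting matrix describes one short time window, so the matrix as a whole can be rendered as an image and passed to an image classifier such as GoogLeNet. In practice an audio library with an FFT-based implementation would be a more typical choice than this hand-written version.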
