Real-time Emotion and Gender Classification using Ensemble CNN

Abhinav Lahariya, Varsha Singh, Uma Shanker Tiwary
DOI: https://doi.org/10.48550/arXiv.2111.07746
2021-11-15
Abstract: Analysing the expressions on a person's face plays a vital role in identifying that person's emotions and behavior. Recognizing these expressions automatically is a crucial component of natural human-machine interfaces. Research in this field therefore has a wide range of applications, including biometric authentication, surveillance systems, and mapping emotions to emoticons on social media platforms. Another application is conducting customer satisfaction surveys: large corporations make huge investments to gather feedback and run surveys but fail to get equitable responses. Emotion and gender recognition through facial gestures is a technology that aims to improve product and service performance by monitoring customer behavior toward specific products or service staff. In the past few years there have been many advances in feature extraction mechanisms, face detection, and expression classification techniques. This paper presents an implementation of an Ensemble CNN for building a real-time system that can detect the emotion and gender of a person. The experimental results show an accuracy of 68% for emotion classification into 7 classes (angry, fear, sad, happy, surprise, neutral, disgust) on the FER-2013 dataset and 95% for gender classification (male or female) on the IMDB dataset. Our work can predict emotion and gender on single-face images as well as multiple-face images. When input is given through a webcam, the complete pipeline of this real-time system takes less than 0.5 seconds to generate results.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses simultaneous emotion and gender classification in a real-time system. The authors built a system based on an ensemble convolutional neural network (Ensemble CNN) that analyzes facial expressions to recognize a person's emotional state (one of seven categories: anger, disgust, fear, happiness, sadness, surprise, neutral) and gender (male or female). The technology has a wide range of applications, including biometric authentication, surveillance systems, emotion-to-emoji conversion on social media, and customer satisfaction surveys. The paper argues that automatic recognition of facial expressions enables a natural human-machine interface, which in turn can improve the performance of products and services. The authors used two datasets: FER-2013 for emotion classification and IMDB for gender classification. The experimental results show an accuracy of 68% for emotion classification and 95% for gender classification. When the input comes from a webcam, the complete real-time pipeline takes less than 0.5 seconds to produce results. The ensemble averages the predictions of two different CNN models. The authors' aim is to reduce the variance of the final model, so that its predictions fluctuate less and are more robust and accurate than those of either model alone.
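The averaging step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`softmax`, `ensemble_average`, `predict_labels`), the label ordering in `EMOTIONS`, and the example logits are all assumptions made for illustration, and the two CNNs are stood in for by hand-written score arrays.

```python
import numpy as np

# Assumed label order for illustration; the paper's actual index mapping may differ.
EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]


def softmax(logits):
    """Convert raw scores of shape (batch, classes) into class probabilities."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max(axis=-1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def ensemble_average(prob_a, prob_b):
    """Average the class probabilities produced by two models, element-wise."""
    return (np.asarray(prob_a, dtype=float) + np.asarray(prob_b, dtype=float)) / 2.0


def predict_labels(prob_a, prob_b, labels=EMOTIONS):
    """Return the label with the highest averaged probability for each face."""
    averaged = ensemble_average(prob_a, prob_b)
    return [labels[i] for i in averaged.argmax(axis=-1)]


# Made-up scores for one face from two hypothetical CNNs that disagree.
probs_model_a = softmax([[2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]])  # leans "angry"
probs_model_b = softmax([[0.0, 0.0, 0.0, 3.0, 0.0, 0.0, 0.0]])  # leans "happy"

print(predict_labels(probs_model_a, probs_model_b))
```

Because each model's output sums to 1, the average is still a valid probability distribution. When the two models disagree, the more confident one dominates the averaged prediction, which is one way averaging can dampen the errors of a single model.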