ResEmoteNet: Bridging Accuracy and Loss Reduction in Facial Emotion Recognition

Arnab Kumar Roy, Hemant Kumar Kathania, Adhitiya Sharma, Abhishek Dey, Md. Sarfaraj Alam Ansari
2024-09-02
Abstract: The human face is a silent communicator, expressing emotions and thoughts through its facial expressions. With the advancements in computer vision in recent years, facial emotion recognition technology has made significant strides, enabling machines to decode the intricacies of facial cues. In this work, we propose ResEmoteNet, a novel deep learning architecture for facial emotion recognition designed with the combination of Convolutional, Squeeze-Excitation (SE) and Residual Networks. The inclusion of SE block selectively focuses on the important features of the human face, enhances the feature representation and suppresses the less relevant ones. This helps in reducing the loss and enhancing the overall model performance. We also integrate the SE block with three residual blocks that help in learning more complex representation of the data through deeper layers. We evaluated ResEmoteNet on three open-source databases: FER2013, RAF-DB, and AffectNet, achieving accuracies of 79.79%, 94.76%, and 72.39%, respectively. The proposed network outperforms state-of-the-art models across all three databases. The source code for ResEmoteNet is available at https://github.com/ArnabKumarRoy02/ResEmoteNet.
Computer Vision and Pattern Recognition, Image and Video Processing
What problem does this paper attempt to address?
The paper aims to improve model accuracy and reduce loss in the Facial Emotion Recognition (FER) task. Specifically, the authors propose a new deep-learning architecture named ResEmoteNet, which combines a Convolutional Neural Network (CNN), a Squeeze-Excitation Network (SENet) and a Residual Network (ResNet) to improve facial emotion recognition performance.

### Main contributions of the paper

1. **Proposing the ResEmoteNet architecture**:
   - **Convolutional Neural Network (CNN)**: Extracts features from the input image.
   - **Squeeze-Excitation Network (SENet)**: Uses global average pooling and activation mechanisms to strengthen the representation of important features and suppress irrelevant ones, thereby reducing the loss.
   - **Residual Network (ResNet)**: Uses skip connections to address vanishing and exploding gradients in deep networks, improving the training efficiency and performance of the model.
2. **Experimental verification**:
   - The model was evaluated on three public datasets: FER2013, RAF-DB and AffectNet.
   - The experimental results show that ResEmoteNet outperforms the existing state-of-the-art methods on these three datasets, achieving accuracies of 79.79%, 94.76% and 72.93% respectively.

### Specific problems and their solutions

- **Problem**: Traditional facial emotion recognition models suffer from low accuracy and high loss when dealing with complex emotional expressions.
- **Solutions**:
  - **SENet**: Through the squeeze-excitation mechanism, the model focuses more effectively on important facial features and ignores irrelevant parts, improving the quality of the feature representation.
  - **ResNet**: Residual blocks let the model learn deeper feature representations while avoiding vanishing and exploding gradients, which improves training efficiency.
  - **Integrated architecture**: Combining CNN, SENet and ResNet into a single multi-layer network achieves strong performance across different datasets.

### Experimental results

- **FER2013**: The classification accuracy reaches 79.79%, which is 2.97% higher than the existing methods.
- **RAF-DB**: The classification accuracy reaches 94.76%, which is 2.19% higher than the existing methods.
- **AffectNet**: The classification accuracy reaches 72.93%, which is 3.53% higher than the existing methods.

### Conclusion

ResEmoteNet significantly improves the accuracy and robustness of facial emotion recognition by combining multiple advanced deep-learning techniques, providing new directions and references for research in this field. Future work will further optimize ResEmoteNet and explore its potential in practical application scenarios.
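
The squeeze-excitation and skip-connection mechanisms summarized above can be illustrated with a small framework-free sketch. It is not the authors' implementation (their code is in the linked repository); it only shows the general idea on nested Python lists shaped `[channel][row][column]`. The function names, the hand-set weights `w1`, `b1`, `w2`, `b2`, and the toy input are all hypothetical and chosen for illustration; in a trained network the weights would be learned.

```python
import math


def squeeze(feature_map):
    """Global average pooling: reduce each channel to one scalar."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def dense(vector, weights, biases):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [
        sum(w * v for w, v in zip(row, vector)) + b
        for row, b in zip(weights, biases)
    ]


def excite(pooled, w1, b1, w2, b2):
    """Two-layer bottleneck: ReLU after the first layer, sigmoid after the second."""
    hidden = [max(0.0, h) for h in dense(pooled, w1, b1)]
    return [1.0 / (1.0 + math.exp(-z)) for z in dense(hidden, w2, b2)]


def se_block(feature_map, w1, b1, w2, b2):
    """Rescale every channel by its gate, a value in (0, 1)."""
    gates = excite(squeeze(feature_map), w1, b1, w2, b2)
    return [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]


def residual(feature_map, transform):
    """Skip connection: elementwise transform(x) + x."""
    transformed = transform(feature_map)
    return [
        [[t + x for t, x in zip(t_row, x_row)] for t_row, x_row in zip(t_ch, x_ch)]
        for t_ch, x_ch in zip(transformed, feature_map)
    ]


# Toy input (hypothetical): 2 channels of size 2x2.
x = [[[1.0, 3.0], [5.0, 7.0]], [[0.0, 0.0], [0.0, 0.0]]]
# Hand-set weights (hypothetical): squeeze 2 -> 1 units, then expand 1 -> 2.
w1, b1 = [[1.0, 0.0]], [0.0]
w2, b2 = [[0.0], [0.0]], [0.0, 0.0]

y = se_block(x, w1, b1, w2, b2)                            # channel-wise rescaling
z = residual(x, lambda f: se_block(f, w1, b1, w2, b2))     # rescaled features plus input
```

Because the second layer's weights are zero here, every gate is exactly 0.5, so `y` is the input scaled by one half and `z` is the input scaled by 1.5. The skip connection means the block's output always contains the unmodified input, which is why gradients have a direct path back through deep stacks of such blocks.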