Abstract: In recent years, electronic devices and wireless networks have become ubiquitous, generating massive amounts of online surveillance video data that can be used to recognize facial expressions in support of smart education cloud deployment. However, existing Facial Expression Recognition (FER) methods mainly focus on either global or local facial expressions separately and pay less attention to the cooperative relationships between them. To address this problem, this paper proposes a Multi-region Attention Transformation Framework (MATF) that blends global and local facial details for expression recognition. The proposed framework consists of a multi-region attention transformation network and a pseudo label generation module. The former fuses the coarse global feature with local detail information by dividing the original facial image into multi-region sub-blocks for multi-region expression association. Specifically, it comprises a facial local segmentation network, an attention transformation network, and a feature weight allocation mechanism for facial expression feature extraction. The pseudo label generation strategy enhances the performance of multi-region facial expression integration in a semi-supervised way. Further, a unified training strategy is used to optimize the proposed framework and ensure high FER accuracy. Experiments conducted on several public FER datasets (RAF-DB, FERPlus, CAER-S) indicate that the proposed method outperforms existing algorithms by 3%-8% in accuracy. Based on the multi-region divided attention network learned by the proposed framework, the expression recognition algorithm can achieve a time complexity as low as O(n) and can be deployed on mobile electronic devices such as the FaceReader series smart facial expression analysis server. In addition, the algorithm can also be deployed in the smart education cloud for E-learning.
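The abstract names three steps: splitting a face image into sub-blocks, fusing a global feature with weighted local features, and selecting pseudo labels for semi-supervised training. These can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the fixed grid split, the softmax weighting, and the 0.9 confidence threshold are all assumptions made for illustration; the paper's segmentation network, attention transformation network, and weight allocation mechanism are learned components that this toy code does not reproduce.

```python
import math


def split_into_regions(image, rows, cols):
    """Split a 2D image (list of lists) into rows x cols non-overlapping sub-blocks.

    Assumption: a fixed grid stands in for the learned facial local segmentation.
    """
    height, width = len(image), len(image[0])
    block_h, block_w = height // rows, width // cols
    regions = []
    for r in range(rows):
        for c in range(cols):
            block = [
                row[c * block_w:(c + 1) * block_w]
                for row in image[r * block_h:(r + 1) * block_h]
            ]
            regions.append(block)
    return regions


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_features(global_feat, local_feats, region_scores):
    """Add a softmax-weighted sum of local features to the global feature.

    Assumption: region_scores are hypothetical per-region relevance scores;
    in the paper the weights come from a learned allocation mechanism.
    """
    weights = softmax(region_scores)
    fused = list(global_feat)
    for weight, feat in zip(weights, local_feats):
        for i in range(len(fused)):
            fused[i] += weight * feat[i]
    return fused


def select_pseudo_labels(probabilities, threshold=0.9):
    """Return (sample_index, class_index) for confident unlabeled predictions.

    Assumption: simple confidence thresholding, a common semi-supervised
    heuristic, stands in for the paper's pseudo label generation module.
    """
    selected = []
    for idx, probs in enumerate(probabilities):
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] >= threshold:
            selected.append((idx, best))
    return selected
```

For example, a 4x4 image split on a 2x2 grid yields four 2x2 sub-blocks, and only unlabeled samples whose top class probability reaches the threshold receive a pseudo label.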

POSTER: A Pyramid Cross-Fusion Transformer Network for Facial Expression Recognition

POSTER++: A simpler and stronger facial expression recognition network

CGAN Based Facial Expression Recognition for Human-Robot Interaction

Facial Expression Recognition Based on Multi-Scale Convolutional Vision Transformer

Facial Expression Recognition Based on Fine-Tuned Channel–Spatial Attention Transformer

Facial Expression Recognition Using Hybrid Features of Pixel and Geometry

POST: Prototype‐oriented similarity transfer framework for cross‐domain facial expression recognition

Enhanced Hybrid Vision Transformer with Multi-Scale Feature Integration and Patch Dropping for Facial Expression Recognition

TriCAFFNet: A Tri-Cross-Attention Transformer with a Multi-Feature Fusion Network for Facial Expression Recognition

Efficient Facial Expression Recognition with Representation Reinforcement Network and Transfer Self-Training for Human–Machine Interaction

Facial Expression Recognition With Visual Transformers and Attentional Selective Fusion

Facial Expression Recognition Based on Zero-Addition Pretext Training and Feature Conjunction-Selection Network in Human–Robot Interaction

Automatic 4D Facial Expression Recognition via Collaborative Cross-domain Dynamic Image Network

Adaptive multilayer perceptual attention network for facial expression recognition

Facial expression recognition through multi-level features extraction and fusion

The Facial Expression Recognition Method Based on Image Fusion and CNN

FER-former: Multi-modal Transformer for Facial Expression Recognition

Two-pathway attention network for real-time facial expression recognition

Spatial-Temporal Graphs Plus Transformers for Geometry-Guided Facial Expression Recognition

Facial expressions recognition with multi-region divided attention networks for smart education cloud applications