Discussions of Different Deep Transfer Learning Models for Emotion Recognitions
Chih-Ta Yen, Kang-Hua Li
DOI: https://doi.org/10.1109/access.2022.3209813
IF: 3.9
2022-10-07
IEEE Access
Abstract: In recent years, facial emotion recognition (FER) has been a popular topic in affective computing. However, automatic FER still faces many challenges, including quality control of sample data, extraction of effective features, creation of models, and multi-feature fusion. These problems have not been thoroughly researched and therefore remain active topics in computer vision. As deep learning has matured, deep learning methods are increasingly being used in FER. However, because deep learning requires a large amount of data to train effectively, many studies have employed transfer learning to compensate for this drawback. Nevertheless, there has been no universal approach to transfer learning in FER. Accordingly, this study used five classic models in FER (i.e., ResNet-50, Xception, EfficientNet-B0, Inception, and DenseNet-121) to conduct a series of experiments on data preprocessing, training type, and the applicability of multi-stage pretraining. According to the results, class weighting was the optimal technique for data balancing. In addition, the freeze + fine-tuning training type produced higher accuracy regardless of the size of the dataset. Multi-stage training was also effective. Compared with the model accuracy reported in previous studies, the accuracy achieved in this study using the proposed transfer learning method was superior for both large and small datasets. Specifically, on AffectNet, the accuracy of the ResNet-50, Xception, EfficientNet-B0, Inception, and DenseNet-121 models increased by 8.37%, 10.45%, 10.45%, 8.55%, and 5.47%, respectively. On FER2013, the accuracy of these models increased by 5.72%, 2%, 10.45%, 5%, and 9%, respectively. These results demonstrate the validity and advantages of the experiments in this study.
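
The class-weight technique mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which weighting formula the authors used, so the common "balanced" heuristic (weight = n_samples / (n_classes × class_count)) is assumed here. The function name `balanced_class_weights` and the label list are hypothetical and are not data from the paper.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: weight}, where weight = n_samples / (n_classes * class_count).

    Assumption: this is the common "balanced" heuristic, not necessarily
    the exact formula used in the study.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    # Rare classes get large weights, frequent classes get small ones.
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical, deliberately imbalanced label list (illustration only).
labels = ["happy"] * 6 + ["sad"] * 3 + ["fear"] * 1
weights = balanced_class_weights(labels)

for cls, weight in sorted(weights.items(), key=lambda item: item[1]):
    print(f"{cls}: {weight:.3f}")
```

With these weights, each class contributes equally to the total weighted loss despite the imbalance in sample counts. In a Keras workflow, a dictionary of this shape (keyed by integer class index rather than by name) could be passed through the `class_weight` argument of `Model.fit`; whether the authors did so is not stated in the abstract.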