Deep Learning Models for Automated Classification of Dog Emotional States from Facial Expressions

Tali Boneh-Shitrit, Shir Amir, Annika Bremhorst, Daniel S. Mills, Stefanie Riemer, Dror Fried, Anna Zamansky
DOI: https://doi.org/10.48550/arXiv.2206.05619
2022-06-12
Abstract: Similarly to humans, facial expressions in animals are closely linked with emotional states. However, in contrast to the human domain, automated recognition of emotional states from facial expressions in animals is underexplored, mainly due to difficulties in data collection and establishment of ground truth concerning emotional states of non-verbal users. We apply recent deep learning techniques to classify (positive) anticipation and (negative) frustration of dogs on a dataset collected in a controlled experimental setting. We explore the suitability of different backbones (e.g. ResNet, ViT) under different supervisions to this task, and find that features of a self-supervised pretrained ViT (DINO-ViT) are superior to the other alternatives. To the best of our knowledge, this work is the first to address the task of automatic classification of canine emotions on data acquired in a controlled experiment.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the automatic recognition of dogs' emotional states. Specifically, it classifies two emotions, positive anticipation and (negative) frustration, from facial expressions. The problem matters for animal emotion research because, unlike humans, animals cannot report their emotional states verbally, so those states have to be inferred from non-verbal cues such as facial expressions. The paper points out that although standard tools such as the Facial Action Coding System (FACS) have been developed to measure human facial expressions objectively, automated recognition of emotional states from facial expressions is still rarely attempted for non-human species. The main reasons are the difficulty of collecting data and of establishing the "ground truth" of an animal's emotional state. In addition, animal faces vary more in color, shape and texture than human faces, which makes automatic recognition harder.
To address these problems, the paper applies recent deep learning techniques. Different pre-trained backbones (such as ResNet and ViT), trained under different forms of supervision, are applied to a dataset collected in a controlled experimental setting in order to compare how well they classify the dogs' emotional states. The study finds that the features of a self-supervised pre-trained ViT (DINO-ViT) outperform the alternatives that were tested. This opens up new possibilities for future applications in animal welfare and health.
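One common way to compare pre-trained backbones on a two-class task like this is to freeze each backbone, extract one feature vector per image, and fit a small linear classifier on top. The sketch below illustrates only that last step and is not the authors' pipeline: the text above does not specify their classifier, the label encoding (0 for frustration, 1 for positive anticipation) is an assumption, and `synthetic_features` is a hypothetical placeholder that generates random vectors in place of real backbone embeddings.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_linear_probe(features, labels, lr=0.1, epochs=200):
    """Fit logistic regression on fixed feature vectors (batch gradient descent)."""
    dim = len(features[0])
    n = len(features)
    weights = [0.0] * dim
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(features, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            err = sigmoid(z) - y  # gradient of the log loss w.r.t. z
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def accuracy(weights, bias, features, labels):
    """Fraction of samples whose predicted class matches the label."""
    correct = 0
    for x, y in zip(features, labels):
        z = sum(w * xi for w, xi in zip(weights, x)) + bias
        correct += int((1 if z > 0 else 0) == y)
    return correct / len(labels)


def synthetic_features(n_per_class, dim, separation, rng):
    """Hypothetical stand-in for backbone embeddings: two Gaussian clusters.

    Assumed label encoding: 0 = frustration, 1 = positive anticipation.
    These are random numbers, NOT data or results from the paper.
    """
    feats, labels = [], []
    for label in (0, 1):
        center = separation if label == 1 else -separation
        for _ in range(n_per_class):
            feats.append([rng.gauss(center, 1.0) for _ in range(dim)])
            labels.append(label)
    return feats, labels


if __name__ == "__main__":
    rng = random.Random(0)
    train_x, train_y = synthetic_features(50, 8, 1.0, rng)
    test_x, test_y = synthetic_features(50, 8, 1.0, rng)
    w, b = train_linear_probe(train_x, train_y)
    print("held-out accuracy:", accuracy(w, b, test_x, test_y))
```

In a real comparison, `synthetic_features` would be replaced by embeddings computed once per backbone (for example ResNet or DINO-ViT), and the same probe would be trained on each set so that differences in held-out accuracy reflect the features rather than the classifier.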