Social Relation Analysis from Videos Via Multi-entity Reasoning

Chenghao Yan, Zihe Liu, Fangtao Li, Chenyu Cao, Zheng Wang, Bin Wu
DOI: https://doi.org/10.1145/3460426.3463634
2021-01-01
Abstract: Videos contain rich semantic information. Analyzing social relations in video semantics can help machines interpret human behavior. However, most work on social relationship recognition is based on still images, while video-based social relationship analysis has received less attention. Here we propose a Multi-entity Relation Reasoning (MRR) framework that can be used for recognizing or predicting social relations in videos. To capture temporal features and contextual cues in videos, and to use richer information to represent each person in the video, we track each person's appearance timeline and design a multi-entity representation method to build a social relationship knowledge graph. We then use graph attention networks to gather information from each entity's neighborhood. In addition, because situation information is helpful for identifying relationships, we design a situation information extraction module to generate a situation embedding from the video clip. Finally, a decoder is adopted to predict relationships between character entities. We evaluate the model on the MovieGraphs dataset and verify the effectiveness of the proposed framework.
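
The neighborhood-gathering step mentioned in the abstract can be illustrated with a minimal single-head graph attention sketch. This is not the authors' implementation: the node names, feature vectors, and attention vector below are hypothetical, the learned projection matrix of a full graph attention layer is omitted for brevity, and the abstract does not specify how the MRR framework configures its attention layers.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU, the nonlinearity commonly applied to raw attention scores."""
    return x if x > 0 else slope * x


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gat_aggregate(features, neighbors, attn_vec):
    """One single-head graph attention step (no learned projection).

    features:  dict mapping node -> list[float] of equal length
    neighbors: dict mapping node -> list of neighbor nodes (include the node
               itself to keep a self-loop)
    attn_vec:  list[float] of length 2 * feature_dim (the attention vector)
    """
    updated = {}
    for node, nbrs in neighbors.items():
        h_i = features[node]
        # Score each neighbor from the concatenation [h_i || h_j].
        scores = []
        for nbr in nbrs:
            concat = h_i + features[nbr]
            raw = sum(a * x for a, x in zip(attn_vec, concat))
            scores.append(leaky_relu(raw))
        # Normalize the scores over the neighborhood.
        weights = softmax(scores)
        # New representation: attention-weighted sum of neighbor features.
        updated[node] = [
            sum(w * features[nbr][d] for w, nbr in zip(weights, nbrs))
            for d in range(len(h_i))
        ]
    return updated


# Hypothetical toy graph: two person entities and one scene entity.
features = {
    "person_a": [1.0, 0.0],
    "person_b": [0.0, 1.0],
    "scene": [1.0, 1.0],
}
neighbors = {
    "person_a": ["person_a", "person_b", "scene"],
    "person_b": ["person_b", "person_a", "scene"],
    "scene": ["scene", "person_a", "person_b"],
}
attn_vec = [0.5, -0.5, 1.0, 0.25]  # illustrative values, not learned

updated = gat_aggregate(features, neighbors, attn_vec)
```

Each updated vector is a convex combination of the neighbor features, so an entity's representation is pulled toward the neighbors that receive the highest attention weight. In a trained model the attention vector and a projection matrix would be learned rather than fixed by hand.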