HCM: Online Action Detection With Hard Video Clip Mining
Siyu Liu, Jian Cheng, Ziying Xia, Zhilong Xi, Qin Hou, Zhicheng Dong
DOI: https://doi.org/10.1109/tmm.2023.3313258
IF: 7.3
2023-01-01
IEEE Transactions on Multimedia
Abstract: Online action detection plays a vital role in video action understanding and can be widely used in video analysis applications. The task is to detect the action occurring at the current moment within a long, untrimmed video stream. However, action-background transitions are temporally ambiguous: action clips and background clips near a transition resemble each other, which makes it difficult to find a suitable boundary between them. To address this issue, we propose HCM, a hard video clip mining method for online action detection based on deep metric learning. HCM first selects the video clips that are hard to distinguish and uses them as the optimization targets. A hard clip mining loss then pushes their features toward the centers of the categories to which they belong and away from the centers of other categories. Furthermore, we introduce an intra-class feature compaction loss that constrains the divergence of action features, ensuring the stability of their distribution. We evaluated the proposed method on two challenging online action detection datasets, THUMOS14 and TVSeries. The results show that HCM is effective and efficient in both online action detection and action anticipation tasks.
computer science, information systems, telecommunications, software engineering
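
The abstract describes the method only at a high level, so the following is a minimal NumPy sketch of the general idea rather than the authors' formulation. Everything specific in it is an assumption made for illustration: the function names `class_centers` and `hard_clip_losses`, the use of batch means as category centers, the Euclidean distance, the hinge form with a `margin` parameter, the rule that a clip is "hard" when its hinge term is positive, and the convention that label `0` is background.

```python
import numpy as np


def class_centers(features, labels, num_classes):
    """Mean feature per class; classes absent from the batch are flagged."""
    centers = np.zeros((num_classes, features.shape[1]))
    present = np.zeros(num_classes, dtype=bool)
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            centers[c] = features[mask].mean(axis=0)
            present[c] = True
    return centers, present


def hard_clip_losses(features, labels, num_classes, margin=1.0, background=0):
    """Return (mining_loss, compaction_loss, hard_mask) for one batch of clips.

    Assumed, not taken from the paper: centers are batch means, distances
    are Euclidean, and a clip counts as hard when its hinge term is positive.
    """
    centers, present = class_centers(features, labels, num_classes)

    # (N, C) distances from every clip feature to every class center.
    dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
    idx = np.arange(len(labels))
    d_own = dists[idx, labels]

    # Distance to the nearest center of a different, present class.
    others = dists.copy()
    others[:, ~present] = np.inf
    others[idx, labels] = np.inf
    d_other = others.min(axis=1)

    # Hinge: pull toward the own center, push away from the nearest other one.
    hinge = np.maximum(0.0, d_own - d_other + margin)
    hard = hinge > 0
    mining_loss = float(hinge[hard].mean()) if hard.any() else 0.0

    # Compaction: mean squared distance of action clips to their own center.
    action = labels != background
    compaction_loss = float((d_own[action] ** 2).mean()) if action.any() else 0.0
    return mining_loss, compaction_loss, hard
```

In this sketch, only clips near the boundary between two categories contribute to the mining term, while the compaction term applies to every action clip regardless of difficulty. In a trainable model the same quantities would be computed on learned features inside an automatic differentiation framework, with centers maintained across batches rather than recomputed per batch.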