Video event detection algorithm based on multi-scale instance learning

Zhangqiong YANG, Zheng LI
DOI: https://doi.org/10.16280/j.videoe.2017.h7.032
2017-01-01
Abstract: Most existing video event detection approaches first extract features from video frames or shots, then quantize and pool the features to form a single vector representation for the entire video. Though simple and efficient, the final pooling step may discard temporally local information, which is important for indicating which part of a long video signals the presence of the event, and thereby weaken the accuracy of event detection. To this end, an instance-based video event detection approach is proposed. Each video is first represented as multiple "instances", which are defined as video segments of different temporal intervals. Then, targeting the two cases of the proportion of positive instances in each video, a multi-scale detection algorithm is proposed that treats the instance labels as latent variables and simultaneously infers the instance labels and the instance-level event detection model. Finally, extensive experiments on large-scale video event datasets demonstrate significant performance gains. In addition, the proposed method helps explain the detection results by localizing the temporal segments in a video that are responsible for the positive detection.
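The general idea of multi-scale instances with latent labels can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a generic alternating multiple-instance scheme (fit a classifier, then re-infer the instance labels of positive videos), substitutes a plain logistic regression for the detection model, and uses average pooling within each segment. All function names, the segment lengths, and the toy data are hypothetical.

```python
import numpy as np

def multiscale_segments(num_frames, scales=(4, 8)):
    """Return (start, end) frame intervals at several temporal scales (assumed lengths)."""
    segments = []
    for length in scales:
        for start in range(0, num_frames - length + 1, length):
            segments.append((start, start + length))
    return segments

def pool_instances(frame_features, segments):
    """Average-pool frame features inside each segment: one vector per instance."""
    return np.stack([frame_features[s:e].mean(axis=0) for s, e in segments])

def fit_linear(X, y, steps=200, lr=0.5):
    """Logistic regression by gradient descent (stand-in for the detection model)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def score(w, X):
    """Signed instance scores; positive means 'event present'."""
    return np.hstack([X, np.ones((len(X), 1))]) @ w

def train_mil(bags, bag_labels, rounds=5):
    """Alternate between fitting the model and re-inferring latent instance labels."""
    # Initialization: every instance inherits its video's label.
    labels = [np.full(len(b), float(y)) for b, y in zip(bags, bag_labels)]
    X = np.vstack(bags)
    for _ in range(rounds):
        w = fit_linear(X, np.concatenate(labels))
        for i, (bag, y) in enumerate(zip(bags, bag_labels)):
            if y == 1:  # instances of negative videos stay negative
                s = score(w, bag)
                new = (s > 0).astype(float)
                new[np.argmax(s)] = 1.0  # a positive video keeps at least one positive instance
                labels[i] = new
    return w, labels

# Toy data: in the positive video the event occupies frames 4..7 only.
segments = multiscale_segments(8)
pos_frames = np.array([[0.0]] * 4 + [[1.0]] * 4)
neg_frames = np.zeros((8, 1))
bags = [pool_instances(pos_frames, segments), pool_instances(neg_frames, segments)]
w, labels = train_mil(bags, [1, 0])
best = segments[int(np.argmax(score(w, bags[0])))]
print("highest-scoring segment:", best)
```

In this toy setting the highest-scoring instance of the positive video is the segment covering the frames where the event occurs, which mirrors the localization use described in the abstract.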