Informedia@TRECVID 2014 MED and MER
Shoou-I Yu, Lu Jiang, Zexi Mao, Xiaojun Chang, Xingzhong Du, Chuang Gan, Zhenzhong Lan, Zhongwen Xu, Xuanchong Li, Yang Cai, Anurag Kumar, Yajie Miao, Lara Martin, Nikolas Wolfe, Shicheng Xu, Huan Li, Ming Lin, Zhigang Ma, Yi Yang, Deyu Meng, Shiguang Shan, Pinar Duygulu Sahin, Susanne Burger, Florian Metze, Rita Singh, Bhiksha Raj, Teruko Mitamura, Richard Stern, Alexander Hauptmann
2014-01-01
Abstract: We report on the system we used in the TRECVID 2014 Multimedia Event Detection (MED) and Multimedia Event Recounting (MER) tasks. On the MED task, the CMU team achieved leading performance in the Semantic Query (SQ), 000Ex, 010Ex and 100Ex settings. Furthermore, our SQ and 000Ex runs are significantly better than the submissions from the other teams. We attribute the good performance to four main components: 1) large-scale semantic concept detectors trained on video shots for the SQ/000Ex systems, 2) better features, such as improved trajectories and deep learning features, for the 010Ex/100Ex systems, 3) a novel Multistage Hybrid Late Fusion method for the 010Ex/100Ex systems, and 4) the reranking methods we developed for Pseudo Relevance Feedback in the 000Ex/010Ex systems. On the MER task, our system uses a subset of the features and detection results from the MED system, from which the recounting is then generated. Recounting evidence is presented by selecting the most likely concepts detected in the salient shots of a video. Salient shots are found by searching for shots that produce a high response when scored by the video-level event detector.
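
The recounting step described above can be illustrated with a minimal sketch. It is not the system's actual implementation: it assumes a linear video-level event detector applied directly to per-shot feature vectors, and the function name `recount`, its parameters, and the toy data are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def recount(
    shot_features: Sequence[Sequence[float]],
    shot_concept_scores: Sequence[Dict[str, float]],
    event_weights: Sequence[float],
    event_bias: float = 0.0,
    num_shots: int = 2,
    num_concepts: int = 2,
) -> List[Tuple[int, float, List[str]]]:
    """Pick salient shots and the most likely concepts in each (hypothetical sketch).

    Assumption: the video-level event detector is linear, so it can be
    applied to each shot's feature vector to obtain a per-shot response.
    """
    # Score every shot with the (assumed linear) video-level detector.
    responses = [
        sum(w * x for w, x in zip(event_weights, feats)) + event_bias
        for feats in shot_features
    ]
    # Salient shots are the ones with the highest detector response.
    ranked = sorted(range(len(responses)), key=lambda i: responses[i], reverse=True)
    evidence = []
    for i in ranked[:num_shots]:
        # Within a salient shot, keep the concepts with the highest scores.
        top = sorted(shot_concept_scores[i].items(), key=lambda kv: kv[1], reverse=True)
        evidence.append((i, responses[i], [name for name, _ in top[:num_concepts]]))
    return evidence


# Toy data (invented for illustration): three shots, two feature dimensions.
features = [[0.1, 0.2], [0.9, 0.8], [0.5, 0.1]]
concepts = [
    {"dog": 0.3, "ball": 0.2, "car": 0.1},
    {"dog": 0.9, "ball": 0.7, "car": 0.1},
    {"dog": 0.2, "ball": 0.3, "car": 0.8},
]
evidence = recount(features, concepts, event_weights=[1.0, 1.0])
for shot_index, response, names in evidence:
    print(shot_index, round(response, 2), names)
```

In this toy run, the shot with the highest response is listed first, followed by its top-scoring concepts. A real system would likely use the detector and concept vocabulary of the MED pipeline rather than these placeholder values.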