Discriminative Multimodal Embedding for Event Classification

Fan Qi, Xiaoshan Yang, Tianzhu Zhang, Changsheng Xu
DOI: https://doi.org/10.1016/j.neucom.2017.11.078
IF: 6
2020-01-01
Neurocomputing
Abstract: Most existing multimodal event classification methods fuse traditional hand-crafted features using manually defined weights, which may not be suitable for event classification tasks involving large numbers of photos. Moreover, feature extraction and event classification are always performed separately, so these methods cannot capture the features most useful for describing the semantic concepts of complex events. To deal with these issues, we propose a novel discriminative multimodal embedding (DME) model for event classification in user-generated photos, which jointly learns the representation and the classifier in a unified framework. The proposed DME model effectively addresses the multimodal, intra-class variation, and inter-class confusion challenges by imposing contrastive constraints on the multimodal event data. Extensive experimental results on two collected datasets demonstrate the effectiveness of the proposed DME model for event classification.