Video content description method based on semantic information guidance

Tu Yunbin, Yan Chenggang, Feng Xinle, Li Bing, Lou Jiedong, Peng Dongliang, Zhang Yongdong, Wang Jianzhong
2017-01-01
Abstract: The invention discloses a video content description method guided by semantic information. The method comprises five steps: (1) preprocessing the video format; (2) establishing the semantic information used for guidance; (3) calculating the weight of each semantic feature vector [A, XMS]; (4) decoding the semantic feature vectors [A, XMS]; and (5) testing the video description model. A Faster R-CNN model quickly detects the key semantic information in each video frame, and this information is added to the original features extracted by a CNN, so that the feature vector fed into the LSTM network at each time step carries semantic information. As a result, the spatio-temporal correlation of the video content is preserved during decoding, and the accuracy of the generated language description is improved.
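Steps (2) and (3) can be illustrated with a minimal NumPy sketch. It is not the patented implementation: the abstract does not say how the semantic information is combined with the CNN features or how the weights are computed, so the concatenation, the soft-attention scoring, all function names, all dimensions, and the random stand-in data below are assumptions made purely for illustration.

```python
import numpy as np

def fuse_features(cnn_feats, semantic_feats):
    """Append each frame's semantic vector to its CNN feature vector.

    Concatenation is an assumed fusion scheme; the abstract only says the
    semantic information is "added into" the CNN features.
    """
    return np.concatenate([cnn_feats, semantic_feats], axis=1)

def attention_weights(fused, hidden, W_f, W_h, v):
    """Assumed soft attention: one weight per frame, summing to 1."""
    scores = np.tanh(fused @ W_f + hidden @ W_h) @ v  # shape: (num_frames,)
    scores = scores - scores.max()                    # numerical stability
    exp = np.exp(scores)
    return exp / exp.sum()

# Hypothetical sizes and random stand-ins for real extracted features.
rng = np.random.default_rng(0)
num_frames, cnn_dim, sem_dim, hid_dim, att_dim = 8, 16, 5, 12, 10

cnn_feats = rng.normal(size=(num_frames, cnn_dim))       # per-frame CNN features
semantic_feats = rng.random(size=(num_frames, sem_dim))  # e.g. detector class scores

fused = fuse_features(cnn_feats, semantic_feats)         # (num_frames, cnn_dim + sem_dim)

hidden = np.zeros(hid_dim)                               # decoder state at the first step
W_f = rng.normal(size=(cnn_dim + sem_dim, att_dim))
W_h = rng.normal(size=(hid_dim, att_dim))
v = rng.normal(size=att_dim)

weights = attention_weights(fused, hidden, W_f, W_h, v)
context = weights @ fused                                # weighted feature for one decoding step
```

In a full model, `context` would be fed to the LSTM decoder at each time step and `hidden` would be updated from the decoder's output; both are omitted here to keep the sketch short.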