Image Captioning with Novel Topics Guidance and Retrieval-based Topics Re-weighting
Majjed Al-Qatf, Xingfu Wang, Ammar Hawbani, Amr Abdusallam, Saeed Hammod Alsamhi
DOI: https://doi.org/10.1109/tmm.2022.3202690
IF: 7.3
2022-01-01
IEEE Transactions on Multimedia
Abstract: Topic modelling (TM) has significantly boosted the effectiveness of image captioning in the last few years. Although previous topic-guided image captioning models have shown important improvements, some challenges remain unsolved. First, the topic predictors and the sentence generators are independent of each other, which results in ineffective exploitation of semantic information. Second, either all the predicted topics or only the top-one topic is used throughout the whole captioning task without considering the linguistic context of the current time step, which causes the captioning network to focus on inaccurate image objects. To tackle these challenges, we propose a novel image captioning method consisting of four modules: an enhanced topic predictor (ETP), a retrieval-based topics re-weighting module (RTR), a subsequent topic predictor (STP), and a caption generation module. The prediction and generation modules are trained in an end-to-end manner to promote the efficient use of topics by predicting suitable topics at each time step. ETP predicts the topics from the image features and is enhanced with topic embedding (TE). RTR is applied only in the testing stage to re-weight the topics predicted by ETP. At each time step, STP automatically predicts concise topic subsets to alleviate the diversity of the image topics. Compared with existing topic-based models, our model can automatically generate more accurate and diverse captions, while improving the explainability of how the topics influence the word generated at each time step. Extensive experiments on the MS-COCO and Flickr30K benchmark datasets show that our method improves both overall image captioning performance and topic prediction, and outperforms many recent image captioning approaches in terms of the evaluation metrics.
computer science, information systems, telecommunications, software engineering