A Novel Automatic Image Caption Generation Using Bidirectional Long-Short Term Memory Framework

Ye Zhongfu,Khan Rashid,Naqvi Nuzhat,Islam M. Shujah
DOI: https://doi.org/10.1007/s11042-021-10632-6
IF: 2.577
2021-01-01
Multimedia Tools and Applications
Abstract: Image captioning, the process of generating a textual description of an image, has emerged as a hot research topic because of its practical importance in many domains. It is a challenging task because it draws on both Natural Language Processing and Computer Vision to generate the captions. Although the literature reports notable image captioning methodologies, they still fall short of substantial performance across diverse datasets. This paper proposes an image caption generation mechanism based on an optimized Bidirectional Long Short-Term Memory (B-LSTM) model. We propose a variant of Moth Flame Optimization (MFO), termed here Proposed Moth Flame Optimization (PMFO), which uses a correlation-based logarithmic spiral update. The performance of the proposed model is demonstrated on benchmark datasets such as Flickr8k, Flickr30k, VizWiz and COCO, using well-known metrics such as CIDEr, BLEU, SPICE and ROUGE. The performance analysis shows that the B-LSTM achieves better caption generation performance than state-of-the-art methods.
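For orientation, the logarithmic spiral update that the abstract refers to can be sketched in its standard MFO form, where a moth M moves around a flame F as D * exp(b*t) * cos(2*pi*t) + F, with D = |F - M| and t drawn from [-1, 1]. The abstract does not describe the correlation-based modification, so the sketch below is plain MFO and not the authors' PMFO; the function name `spiral_update`, the default `b=1.0`, and the `rng` parameter are illustrative assumptions.

```python
import math
import random


def spiral_update(moth, flame, b=1.0, rng=random):
    """Move one moth around one flame using the standard MFO log spiral.

    NOTE: this is the baseline MFO rule, not the paper's PMFO variant;
    the correlation-based change is not given in the abstract.
    """
    new_position = []
    for m, f in zip(moth, flame):
        distance = abs(f - m)          # D = |F - M| for this dimension
        t = rng.uniform(-1.0, 1.0)     # position along the spiral
        step = distance * math.exp(b * t) * math.cos(2.0 * math.pi * t)
        new_position.append(step + f)  # spiral is centred on the flame
    return new_position


if __name__ == "__main__":
    # Hypothetical 3-dimensional candidate solution and best-so-far flame.
    print(spiral_update([0.2, 0.8, 0.5], [0.4, 0.6, 0.5]))
```

In an optimizer of this kind, each moth would typically encode a candidate set of model parameters and each flame one of the best solutions found so far; how the paper maps these onto the B-LSTM is not stated in the abstract.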