Image Description Generator using Deep Learning

Deepak R Ksheerasagar
DOI: https://doi.org/10.22214/ijraset.2022.45988
2022-07-31
International Journal for Research in Applied Science and Engineering Technology
Abstract: Image caption generation is the task of recognising the context of an image and describing it in a natural language such as English; it draws on both computer vision and natural language processing techniques. This Python project implements it with two components: a Convolutional Neural Network (CNN) model and a Long Short-Term Memory (LSTM) model. In the combined CNN-LSTM architecture, the CNN extracts features that describe the image, and the LSTM, a type of Recurrent Neural Network (RNN), builds meaningful sentences from those features. Automatically describing an image's content has a variety of uses, including helping visually impaired people better understand the content of images and providing more precise and concise image information for social media.
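The abstract does not say how a caption is decoded once the CNN features are available. One common approach is greedy decoding, sketched below under explicit assumptions: the names greedy_caption, next_word_scores, and toy_scores, the "<start>" and "<end>" tokens, and the toy vocabulary are all hypothetical and do not come from the paper. The toy_scores function is only a stand-in for a trained CNN-LSTM that would score the next word given the image features and the words produced so far.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical sentence-boundary tokens (not taken from the paper).
START, END = "<start>", "<end>"


def greedy_caption(
    image_features: Sequence[float],
    next_word_scores: Callable[[Sequence[float], List[str]], Dict[str, float]],
    max_len: int = 20,
) -> str:
    """Build a caption word by word, always taking the top-scoring next word."""
    words = [START]
    for _ in range(max_len):
        # In a real system this call would run the LSTM on the CNN features
        # plus the partial caption; here it is any callable returning scores.
        scores = next_word_scores(image_features, words)
        best = max(scores, key=scores.get)
        if best == END:
            break
        words.append(best)
    return " ".join(words[1:])  # drop the start token


def toy_scores(features: Sequence[float], words: List[str]) -> Dict[str, float]:
    """Toy stand-in for a trained model: picks a fixed script from one feature."""
    vocab = ["a", "dog", "cat", "runs", "sits", END]
    script = ["a", "dog", "runs", END] if features[0] > 0.5 else ["a", "cat", "sits", END]
    target = script[min(len(words) - 1, len(script) - 1)]
    return {w: (1.0 if w == target else 0.0) for w in vocab}


if __name__ == "__main__":
    print(greedy_caption([0.9], toy_scores))
    print(greedy_caption([0.1], toy_scores))
```

Greedy decoding is the simplest choice; beam search is a frequent alternative, and the abstract does not indicate which, if either, the project uses.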