Optimizing Medical Image Report Generation with Varied Attention Mechanisms

Mudunuri Venkata Chaitanya Varma, Pochamreddy Mukesh Reddy, Vivek Kumar Verma
DOI: https://doi.org/10.1109/IC3I59117.2023.10398149
2023-09-14
Abstract: Image captioning plays a crucial role in interpreting visual information in the medical field. This paper presents an approach for producing descriptive and insightful reports for medical images. It combines natural language processing and computer vision techniques in an encoder-decoder architecture: a pretrained InceptionV3 model extracts image features, and a Gated Recurrent Unit (GRU)-based decoder with an attention mechanism generates the captions, keeping the generated text connected to the image content. The work focuses on a dataset of medical images and their associated reports, and applies data extraction and preprocessing methods to improve the accuracy and contextual understanding of the generated reports. The experiments and results section describes the implementation details and demonstrates the efficacy of the proposed model in generating meaningful reports for medical images, thereby facilitating their interpretation and analysis. This research contributes to the advancement of image captioning techniques in the medical domain and holds significance for clinical decision support systems and medical image analysis.
Medicine, Computer Science
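
The attention step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain dot-product attention (the abstract does not say which scoring function is used), and the names `attend`, `hidden`, and `features` are hypothetical. In the described architecture, `features` would stand in for region features from the InceptionV3 encoder and `hidden` for the GRU decoder state; here both are toy lists.

```python
import math


def softmax(scores):
    # Subtract the maximum score for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(hidden, features):
    """Dot-product attention (an assumed scoring function).

    Scores each image-region feature by its dot product with the decoder
    hidden state, normalizes the scores with softmax, and returns the
    weighted sum of region features (the context vector) and the weights.
    """
    scores = [sum(h * f for h, f in zip(hidden, region)) for region in features]
    weights = softmax(scores)
    dim = len(features[0])
    context = [
        sum(w * region[i] for w, region in zip(weights, features))
        for i in range(dim)
    ]
    return context, weights


# Toy data: three image regions, each with a 2-dimensional feature vector.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hidden = [2.0, 0.0]
context, weights = attend(hidden, features)
```

With these toy inputs, the first and third regions receive equal, larger weights because they align with the hidden state, and the context vector would be fed to the decoder when predicting the next word of the report.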