Abstract: The detailed physiological perspectives captured by medical imaging provide actionable insights that help doctors manage comprehensive patient care. However, the quality of such diagnostic imaging modalities is often degraded by mismanagement of the image capture process by poorly trained technicians and by older or poorly maintained imaging equipment. Further, a patient is often scanned at different orientations to capture the frontal, lateral and sagittal views of the affected areas. Due to the large volume of diagnostic scans performed at a modern hospital, adequate documentation of such additional perspectives is mostly overlooked, even though it is an essential element of quality diagnostic systems and predictive analytics systems. Another crucial challenge for effective medical image data management is that diagnostic scans are essentially stored as unstructured data, with no well-defined processing methodology to enable intelligent image data management for applications such as similar patient retrieval and automated disease prediction. One solution is to automatically generate descriptions of the observations/findings in diagnostic images by leveraging computer vision and natural language processing. In this work, we present multi-task neural models capable of addressing these critical challenges. We propose ESRGAN, an image enhancement technique for improving the quality and visualization of medical chest x-ray images, thereby substantially improving the potential for accurate diagnosis, automatic detection and region-of-interest segmentation. We also propose a CNN-based model called ViewNet for predicting the view orientation of the x-ray image, together with medical report generation using Xception net, thus facilitating a robust medical image management system for intelligent diagnosis applications.
Experimental results are reported using standard metrics such as BRISQUE, PIQE and BLEU scores, indicating that the proposed models achieved excellent performance. Further, the proposed deep learning approaches enable diagnosis in less time, and their hybrid architecture shows significant potential for supporting many intelligent diagnosis applications.
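For readers unfamiliar with BLEU, the report-generation metric named above, the following is a minimal sketch of a sentence-level BLEU score in Python. It assumes a single reference report, whitespace tokenization and no smoothing. The function names `ngrams` and `bleu` and the example sentences are illustrative assumptions, not the authors' evaluation code.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(reference, candidate, max_n=4):
    """Simplified sentence-level BLEU (assumption: one reference, no smoothing)."""
    ref, cand = reference.split(), candidate.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU = 0
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled by the brevity penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


if __name__ == "__main__":
    reference = "the heart size is normal"
    generated = "the heart is normal"
    print(bleu(reference, generated, max_n=1))
```

Published evaluations typically use corpus-level BLEU with smoothing, so scores from this sketch are not directly comparable to reported figures.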

Optimizing Medical Image Report Generation with Varied Attention Mechanisms

VMEKNet: Visual Memory and External Knowledge Based Network for Medical Report Generation.

Automatic Report Generation Method Based on Multiscale Feature Extraction and Word Attention Network.

Salad, house dressing, but hold the sulfites.

Automatic Medical Report Generation Based on Cross-View Attention and Visual-Semantic Long Short Term Memorys

Deep neural models for automated multi-task diagnostic scan management—quality enhancement, view classification and report generation

Multi-modal transformer architecture for medical image analysis and automated report generation

Retinal OCT image report generation based on visual and semantic topic attention model

Intensive Vision-guided Network for Radiology Report Generation

On the Automatic Generation of Medical Imaging Reports

Cross-modal Contrastive Attention Model for Medical Report Generation.

AIMNet: Adaptive Image-Tag Merging Network For Automatic Medical Report Generation

Study on Medical Image Report Generation Based on Improved Encoding-Decoding Method

A survey on automatic generation of medical imaging reports based on deep learning

Auxiliary signal-guided knowledge encoder-decoder for medical report generation

Designing a Robust Radiology Report Generation System

Understanding transfer learning for chest radiograph clinical report generation with modified transformer architectures

A medical report generation method integrating teacher–student model and encoder–decoder network

Vision-Language Model for Generating Textual Descriptions From Clinical Images: Model Development and Validation Study

Attention based automated radiology report generation using CNN and LSTM

Toward an enhanced automatic medical report generator based on large transformer models