A Labeled Ophthalmic Ultrasound Dataset with Medical Report Generation Based on Cross-modal Deep Learning

Jing Wang, Junyan Fan, Meng Zhou, Yanzhu Zhang, Mingyu Shi

2024-07-26

Abstract: Ultrasound imaging reveals eye morphology and aids in diagnosing and treating eye diseases. However, interpreting diagnostic reports requires specialized physicians. We present a labeled ophthalmic dataset for the precise analysis and automated exploration of medical images along with their associated reports. It collects three modalities of data (ultrasound images, blood flow information, and examination reports) from 2,417 patients at an ophthalmology hospital in Shenyang, China, during 2018; patient information is de-identified for privacy protection. To the best of our knowledge, it is the only ophthalmic dataset that contains all three modalities simultaneously. It comprises 4,858 images with corresponding free-text reports, which describe 15 typical imaging findings of intraocular diseases and their anatomical locations. Each image shows three kinds of blood flow indices at three specific arteries, i.e., nine parameter values that describe the spectral characteristics of the blood flow distribution. The reports were written by ophthalmologists during clinical care. The proposed dataset is applied to generate medical reports with a cross-modal deep learning model. The experimental results demonstrate that our dataset is suitable for training supervised models on cross-modal medical data.
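The three-modality layout described in the abstract (image, free-text report, and nine blood flow values per image) can be sketched as a simple record type. This is a minimal illustration, not the dataset's actual schema: the class name `UltrasoundRecord`, the field names, the file path, and the placeholder artery and index names are all assumptions, since the abstract does not name the arteries or indices.

```python
from dataclasses import dataclass

# Placeholder names: the abstract states there are three arteries and three
# indices per artery but does not name them, so these labels are hypothetical.
ARTERIES = ("artery_1", "artery_2", "artery_3")
INDICES = ("index_1", "index_2", "index_3")


@dataclass
class UltrasoundRecord:
    """One hypothetical dataset entry: image, report, and blood flow values."""
    image_path: str                          # path to the ultrasound image
    report: str                              # free-text examination report
    blood_flow: dict[str, dict[str, float]]  # artery name -> index name -> value

    def flow_vector(self) -> list[float]:
        """Flatten the 3 x 3 blood flow values into a fixed-order list of nine."""
        return [self.blood_flow[a][i] for a in ARTERIES for i in INDICES]


# Dummy values only; no real patient data or measurements are shown here.
example = UltrasoundRecord(
    image_path="images/000001.png",
    report="(free-text report)",
    blood_flow={a: {i: 0.0 for i in INDICES} for a in ARTERIES},
)
```

Flattening in a fixed artery-then-index order gives every image a nine-element vector with a consistent layout, which is one straightforward way to feed the blood flow modality to a model alongside image features.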

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

### Problems the Paper Aims to Solve

This paper addresses the automated generation of diagnostic reports for ophthalmic ultrasound images and constructs a multimodal dataset that includes ultrasound images, blood flow information, and the corresponding diagnostic reports.

#### Specific Problems

1. **Lack of Dataset**: Few datasets pair ophthalmic ultrasound images with diagnostic reports for automatic report generation, and Chinese-language datasets are especially scarce.
2. **Multimodal Information Fusion**: Existing medical report generation methods mostly focus on radiological images (such as chest X-rays), with relatively little research on ophthalmic ultrasound images.
3. **Clinical Application Needs**: Ophthalmologists face a significant workload and time pressure when analyzing ultrasound images and writing diagnostic reports, which creates a need for automated tools that assist in diagnosis.

#### Main Contributions

1. **Construction of a Large-Scale Dataset**: The paper constructs a large-scale dataset containing 4,858 ophthalmic ultrasound images and their corresponding Chinese diagnostic reports. All data come from real clinical practice, and the reports reflect the writing patterns of ophthalmologists.
2. **Introduction of Blood Flow Parameter Information**: Unlike existing datasets, this dataset also includes blood flow parameters extracted from ultrasound examinations. These parameters describe the spectral characteristics of blood flow distribution at specific arteries, aiding medical diagnosis and treatment decisions.
3. **Multimodal Report Generation Experiments**: Report generation experiments were conducted with the proposed multimodal memory network, and the generated reports were evaluated using natural language generation (NLG) metrics. The results indicate that the dataset is suitable for medical report generation tasks and helps advance AI-based ophthalmic diagnostic technology.

Through these contributions, the paper fills a research gap in the automated generation of diagnostic reports for ophthalmic ultrasound images and provides high-quality data support for future research in related fields.
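The summary above does not say which NLG metrics were used. As an illustration of how such metrics score a generated report against a reference, the following is a minimal sketch of single-reference BLEU-1 (clipped unigram precision with a brevity penalty), one widely used NLG metric. The function name `bleu1` and the example sentences are made up for illustration, and whitespace tokenization is an assumption; Chinese reports would need a character-level or segmenter-based tokenizer instead.

```python
import math
from collections import Counter


def bleu1(reference: list[str], candidate: list[str]) -> float:
    """Single-reference BLEU-1: clipped unigram precision times brevity penalty."""
    if not candidate:
        return 0.0
    ref_counts = Counter(reference)
    cand_counts = Counter(candidate)
    # Credit each candidate token at most as often as it occurs in the reference.
    overlap = sum(min(n, ref_counts[tok]) for tok, n in cand_counts.items())
    precision = overlap / len(candidate)
    # The brevity penalty lowers the score of candidates shorter than the reference.
    if len(candidate) >= len(reference):
        penalty = 1.0
    else:
        penalty = math.exp(1 - len(reference) / len(candidate))
    return penalty * precision


# Made-up sentences for illustration only; not taken from the dataset.
reference = "the vitreous shows dot-like echoes".split()
candidate = "the vitreous shows echoes".split()
score = bleu1(reference, candidate)
```

Here every candidate token appears in the reference, so the unigram precision is 1 and the score is reduced only by the brevity penalty for the missing word.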

A Deep Learning Analysis Framework for Ophthalmic Diseases and Physical Health from Binocular Fundus Image Pairs

DeepOpht: Medical Report Generation for Retinal Images via Deep Models and Visual Explanation

Ultrasound Report Generation with Cross-Modality Feature Alignment via Unsupervised Guidance

Benchmarking Supervised and Self-Supervised Learning Methods in A Large Ultrasound Muti-task Images Dataset

A label information fused medical image report generation framework

MultiEYE: Dataset and Benchmark for OCT-Enhanced Retinal Disease Recognition from Fundus Images

OLIVES Dataset: Ophthalmic Labels for Investigating Visual Eye Semantics

Comparative Analysis of Image Classification Methods for Automatic Diagnosis of Ophthalmic Images.

OphGLM: Training an Ophthalmology Large Language-and-Vision Assistant based on Instructions and Dialogue

LMOD: A Large Multimodal Ophthalmology Dataset and Benchmark for Large Vision-Language Models

Automatic Intracranial Abnormality Detection and Localization in Head CT Scans by Learning from Free-Text Reports.

Deep Learning in Medical Ultrasound Image Analysis: A Review

Effect of Oral Calcium Carbonate on Urinary Excretion of Ca, Na and Mg in Advanced Renal Disease

Integrating Medical Imaging and Clinical Reports Using Multimodal Deep Learning for Advanced Disease Analysis

Automatic Medical Report Generation Based on Cross-View Attention and Visual-Semantic Long Short Term Memorys

Automated segmentation of macular edema for the diagnosis of ocular disease using deep learning method

Deep learning to automate the labelling of head MRI datasets for computer vision applications

Semi-Supervised Natural Language Approach for Fine-Grained Classification of Medical Reports

Square variation of Brownian paths in Banach spaces

EyeCLIP: A visual-language foundation model for multi-modal ophthalmic image analysis