Abstract: Esophagogastroduodenoscopy (EGD) is a critical step in the diagnosis of upper gastrointestinal disorders. However, because of inexperience or high workload, EGD performance varies widely among endoscopists. This variation can result in exams that do not cover all anatomical locations of the stomach, creating a risk of missed diagnoses of gastric diseases. Numerous guidelines and expert consensus statements have been proposed to assess and optimize the quality of endoscopy. However, mature and robust methods that can be applied accurately to real-time video in real clinical settings are still lacking. In this paper, we define the problem of recognizing anatomical locations in videos as a multi-label recognition task, which is more consistent with how a model learns image-to-label mappings. We propose a deep learning model (GL-Net) that combines a graph convolutional network (GCN) with long short-term memory (LSTM) networks to both extract label features and model temporal dependencies for accurate real-time identification of anatomical locations in gastroscopy videos. Our evaluation dataset is based on complete videos of real clinical examinations. A total of 29,269 images from 49 videos were collected for model training and validation. Another 1,736 clinical videos were retrospectively analyzed to evaluate the application of the proposed model. Our method achieves 97.1% mean average precision (mAP), 95.5% mean per-class accuracy and 93.7% average overall accuracy on the multi-label classification task, and processes these videos in real time at 29.9 FPS. In addition, based on our approach, we designed a system that monitors routine EGD videos in detail and performs statistical analysis of endoscopists' operating habits, which can be a useful tool for improving the quality of clinical endoscopy.
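
The multi-label framing in the abstract can be illustrated with a minimal sketch. It is not the GL-Net implementation: the label names, logits, and 0.5 threshold are hypothetical, and the binary cross-entropy loss is a common choice for multi-label training that the abstract does not confirm. The sketch only shows how one frame can carry several anatomical labels at once, with an independent sigmoid score per label.

```python
import math

# Hypothetical label set; the abstract does not list the actual anatomical classes.
LABELS = ["esophagus", "cardia", "fundus", "body", "antrum", "duodenum"]


def multi_hot(present):
    """Encode a set of label names as a 0/1 vector over LABELS."""
    return [1 if name in present else 0 for name in LABELS]


def sigmoid(x):
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict(logits, threshold=0.5):
    """Score each label independently, so several labels can be active at once."""
    return [1 if sigmoid(z) >= threshold else 0 for z in logits]


def binary_cross_entropy(logits, target):
    """Mean per-label binary cross-entropy (assumed loss, not taken from the paper)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, t in zip(logits, target):
        p = sigmoid(z)
        total -= t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
    return total / len(target)


# One frame in which two locations are visible at the same time.
target = multi_hot({"body", "antrum"})
# Made-up model outputs for that frame, one logit per label.
logits = [-3.0, -2.5, -1.0, 2.0, 1.5, -4.0]
pred = predict(logits)
loss = binary_cross_entropy(logits, target)
```

In a single-label setup the same frame would have to be forced into exactly one class; here `target` and `pred` are vectors, so a transitional view that shows two regions is represented directly.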

Self- and Semi-supervised Learning for Gastroscopic Lesion Detection

Real-Time Multi-Label Upper Gastrointestinal Anatomy Recognition from Gastroscope Videos

Category-Level Regularized Unlabeled-to-Labeled Learning for Semi-supervised Prostate Segmentation with Multi-site Unlabeled Data

Semi-Supervised Segmentation Framework for Gastrointestinal Lesion Diagnosis in Endoscopic Images

PGL: Prior-Guided Local Self-supervised Learning for 3D Medical Image Segmentation

Self-supervised learning for gastritis detection with gastric X-ray images

Self-Adaptive Transfer Learning for Multicenter Glaucoma Classification in Fundus Retina Images

Lesion Detection of Electronic Gastroscope Images Based on Multiscale Texture Feature

SSL-CPCD: Self-supervised learning with composite pretext-class discrimination for improved generalisability in endoscopic image analysis

Self-Supervised Learning for Endoscopic Video Analysis

Improving the Classification Performance of Esophageal Disease on Small Dataset by Semi-supervised Efficient Contrastive Learning

Pseudo Label-Guided Data Fusion and Output Consistency for Semi-Supervised Medical Image Segmentation

Lesion Search with Self-supervised Learning

Improving image classification of gastrointestinal endoscopy using curriculum self-supervised learning

Unsupervised learning for labeling global glomerulosclerosis

Weakly-Supervised Cross-Domain Adaptation for Endoscopic Lesions Segmentation

Unsupervised Local Discrimination for Medical Images

Self-Supervised Correction Learning for Semi-Supervised Biomedical Image Segmentation

Self-Supervised Cross-Level Consistency Learning for Fundus Image Classification

Self-FI: Self-Supervised Learning for Disease Diagnosis in Fundus Images

GLGFormer: Global Local Guidance Network for Mucosal Lesion Segmentation in Gastrointestinal Endoscopy Images