Evaluation of Spectrum-Aided Visual Enhancer (SAVE) in Esophageal Cancer Detection Using YOLO Frameworks

Chu-Kuang Chou, Riya Karmakar, Yu-Ming Tsao, Lim Wei Jie, Arvind Mukundan, Chien-Wei Huang, Tsung-Hsien Chen, Chau-Yuan Ko, Hsiang-Chen Wang
DOI: https://doi.org/10.3390/diagnostics14111129
IF: 3.6
2024-05-30
Diagnostics
Abstract: Esophageal cancer (EC) is difficult to detect early, which contributes to its status as a primary cause of cancer-related fatalities. This study used You Only Look Once (YOLO) frameworks, specifically YOLOv5 and YOLOv8, to predict and detect early-stage EC using a dataset sourced from the Division of Gastroenterology and Hepatology, Ditmanson Medical Foundation, Chia-Yi Christian Hospital. The dataset comprised 2741 white-light images (WLI) and 2741 hyperspectral narrowband images (HSI-NBI), divided into 60% training, 20% validation, and 20% test sets. The HSI-NBI images were produced using a conversion method called the spectrum-aided vision enhancer (SAVE), an algorithm that transforms a WLI into an NBI without requiring a spectrometer or spectral head. The main goal was to identify dysplasia and squamous cell carcinoma (SCC). Model performance was evaluated using five metrics: precision, recall, F1-score, mAP, and the confusion matrix. The experimental results demonstrated that the HSI model learned SCC characteristics better than the model trained on the original RGB images. Within the YOLO framework, YOLOv5 outperformed YOLOv8, indicating that YOLOv5's design possessed superior feature-learning capabilities. The YOLOv5 model, when used in conjunction with HSI-NBI, demonstrated the best performance: it achieved a precision of 85.1% (CI95: 83.2–87.0%, p < 0.01) in diagnosing SCC and an F1-score of 52.5% (CI95: 50.1–54.9%, p < 0.01) in detecting dysplasia. These figures exceeded those of YOLOv8, which achieved a precision of 81.7% (CI95: 79.6–83.8%, p < 0.01) and an F1-score of 49.4% (CI95: 47.0–51.8%, p < 0.05). The YOLOv5 model with HSI outperformed the other models in multiple scenarios, and the difference was statistically significant, suggesting that combining YOLOv5 with HSI improved detection capabilities.
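For readers unfamiliar with the detection metrics named in the abstract, the following is a minimal sketch of how precision, recall, and F1-score are conventionally computed from true-positive, false-positive, and false-negative counts. The function names and the example counts are hypothetical and are not taken from the paper; they only illustrate the standard definitions.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of predicted detections that are correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of ground-truth lesions that were detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall."""
    p = precision(tp, fp)
    r = recall(tp, fn)
    return 2 * p * r / (p + r) if (p + r) else 0.0


if __name__ == "__main__":
    # Hypothetical counts for one class, for illustration only.
    tp, fp, fn = 80, 20, 120
    print(f"precision={precision(tp, fp):.3f}")
    print(f"recall={recall(tp, fn):.3f}")
    print(f"f1={f1_score(tp, fp, fn):.3f}")
```

Because F1 is a harmonic mean, a class can show high precision and still have a modest F1-score when its recall is low.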
medicine, general & internal
What problem does this paper attempt to address?
This paper addresses the problem of early detection of esophageal cancer. Specifically, the researchers used the YOLO framework (YOLOv5 and YOLOv8) to predict and detect early-stage esophageal cancer. They used a technique named Spectrum-Aided Visual Enhancer (SAVE) to convert white-light images (WLI) into hyperspectral narrow-band images (HSI-NBI), with the aim of improving the model's ability to learn the characteristics of early-stage esophageal cancer.

### Research Background

Esophageal cancer (EC) is one of the leading causes of cancer-related deaths worldwide, and early detection is crucial for improving patients' survival rates. However, the symptoms of early-stage esophageal cancer are not obvious, which makes diagnosis difficult. Traditional endoscopic examination is important, but its sensitivity in identifying early lesions (such as dysplasia) is limited. The researchers therefore explored hyperspectral imaging (HSI) and deep-learning techniques to improve the detection accuracy of early-stage esophageal cancer.

### Research Methods

1. **Data Set**:
   - The data set contains 2,741 white-light images (WLI) and 2,741 hyperspectral narrow-band images (HSI-NBI).
   - The images were divided into a 60% training set, a 20% validation set, and a 20% test set.
2. **Image Conversion**:
   - The SAVE technique converts WLI into HSI-NBI. The SAVE algorithm generates HSI-NBI images by calibrating the relationship between RGB images and spectrometer measurements and converting RGB images into the CIE 1931 XYZ color space.
3. **Model Architecture**:
   - Two models, YOLOv5 and YOLOv8, were used for the experiment.
   - The architecture of YOLOv5 includes a Model Backbone (CSP-Darknet53), a Model Neck (SPPF and PAN structures), and a Model Head (using anchor boxes to generate output vectors).
   - YOLOv8 adopts an anchor-free design, which improves efficiency and reduces the time spent on box prediction and non-maximum suppression.

### Experimental Results

- **Performance Evaluation**:
  - Precision, recall, F1-score, and mean average precision (mAP) were used to evaluate model performance.
  - YOLOv5 exhibits the best performance when combined with HSI-NBI images. Specifically:
    - YOLOv5 reaches a precision of 85.1% (95% CI: 83.2–87.0%, p < 0.01) when detecting squamous cell carcinoma (SCC) and an F1-score of 52.5% (95% CI: 50.1–54.9%, p < 0.01) when detecting dysplasia.
    - In contrast, YOLOv8 reaches a precision of 81.7% (95% CI: 79.6–83.8%, p < 0.01) and an F1-score of 49.4% (95% CI: 47.0–51.8%, p < 0.05).
  - YOLOv5 outperforms the other models in multiple scenarios, and the differences are statistically significant.

### Conclusion

This study demonstrates the potential of combining HSI-NBI with the YOLOv5 model for the early detection of esophageal cancer. The SAVE technique improves image contrast and the model's ability to extract features, thereby enhancing detection accuracy. The method offers a new tool for clinical applications and may help improve the diagnosis rate of early-stage esophageal cancer and patients' survival rates.
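The RGB-to-CIE 1931 XYZ step mentioned under Image Conversion can be illustrated with the standard sRGB (D65) transform. This is only a sketch of the generic color-space conversion, assuming 8-bit sRGB input; it is not the SAVE algorithm itself, whose spectrometer-based calibration is not described here. The function names are hypothetical.

```python
# Standard linear-sRGB (D65 white point) to CIE 1931 XYZ matrix.
# Assumption: input pixels are 8-bit sRGB; the paper's calibrated
# camera-specific transform may differ.
SRGB_TO_XYZ = (
    (0.4124564, 0.3575761, 0.1804375),
    (0.2126729, 0.7151522, 0.0721750),
    (0.0193339, 0.1191920, 0.9503041),
)


def srgb_to_linear(channel: int) -> float:
    """Undo the sRGB gamma encoding for one 8-bit channel value."""
    c = channel / 255.0
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def rgb_to_xyz(r: int, g: int, b: int) -> tuple:
    """Convert one 8-bit sRGB pixel to CIE 1931 XYZ (Y of white is about 1)."""
    linear = [srgb_to_linear(v) for v in (r, g, b)]
    return tuple(
        sum(weight * value for weight, value in zip(row, linear))
        for row in SRGB_TO_XYZ
    )


if __name__ == "__main__":
    print(rgb_to_xyz(255, 255, 255))  # reference white
    print(rgb_to_xyz(180, 60, 50))    # an arbitrary reddish pixel
```

Working in XYZ gives a device-independent representation, which is a natural intermediate when relating camera RGB values to measured spectra.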