Efficient Vocal Cord Lesion Recognition by Combing Yolov7 and Attention Module

Yanda Wu, Yuqing He, Dongyan Huang, Yang Liu, Jingxuan Zhu, Hengli Zhang
DOI: https://doi.org/10.1117/12.3007662
2023-01-01
Abstract: Currently, the diagnosis of vocal cord lesions in laryngoscopic images relies mainly on physicians' expertise and clinical experience, which increases physicians' workload and limits diagnostic efficiency. To address these problems, this study constructs a deep network named VCLR-Net, based on an improved YOLOv7, to detect and recognize vocal cord lesions. First, Convolutional Block Attention Modules (CBAM) are added to the head network to sharpen the model's focus on the color and spatial features of lesions. Next, the Alpha Intersection over Union (AlphaIOU) loss function is used to improve the robustness of the lesion recognition model. In experiments, the proposed VCLR-Net achieves an mAP of 0.762 and an F1 score of 0.748 on the image dataset. The network enables accurate lesion recognition across large numbers of laryngoscopic images.
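To make the loss term concrete, the following is a minimal sketch of the basic Alpha-IoU form, 1 - IoU^alpha, for a single pair of axis-aligned boxes. The abstract does not state which Alpha-IoU variant or which alpha value VCLR-Net uses, so the function names, the (x1, y1, x2, y2) box layout, and the default alpha = 3 are assumptions for illustration only, not the authors' implementation.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp to zero when the boxes do not overlap
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def alpha_iou_loss(pred, target, alpha=3.0):
    """Basic Alpha-IoU loss: 1 - IoU**alpha.

    alpha=3.0 is an assumed default; the abstract does not give the value.
    With alpha=1 this reduces to the ordinary IoU loss.
    """
    return 1.0 - box_iou(pred, target) ** alpha


if __name__ == "__main__":
    pred = (0.0, 0.0, 2.0, 2.0)    # hypothetical predicted lesion box
    target = (1.0, 0.0, 3.0, 2.0)  # hypothetical ground-truth box
    print(box_iou(pred, target))
    print(alpha_iou_loss(pred, target))
```

Raising the IoU to a power greater than 1 changes how the loss and its gradient are weighted across low- and high-overlap boxes relative to the plain IoU loss; whether and how this is combined with other penalty terms in VCLR-Net is not specified in the abstract.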