HRDecoder: High-Resolution Decoder Network for Fundus Image Lesion Segmentation

Ziyuan Ding, Yixiong Liang, Shichao Kan, Qing Liu
DOI: https://doi.org/10.1007/978-3-031-72114-4_32
2024-11-06
Abstract: High resolution is crucial for precise segmentation in fundus images, yet handling high-resolution inputs incurs considerable GPU memory costs, with diminishing performance gains as overhead increases. To address this issue while tackling the challenge of segmenting tiny objects, recent studies have explored local-global fusion methods. These methods preserve fine details using local regions and capture long-range context information from downscaled global images. However, the necessity of multiple forward passes inevitably incurs significant computational overhead, adversely affecting inference speed. In this paper, we propose HRDecoder, a simple High-Resolution Decoder network for fundus lesion segmentation. It integrates a high-resolution representation learning module to capture fine-grained local features and a high-resolution fusion module to fuse multi-scale predictions. Our method effectively improves the overall segmentation accuracy of fundus lesions while consuming reasonable memory and computational overhead, and maintaining satisfying inference speed. Experimental results on the IDRID and DDR datasets demonstrate the effectiveness of our method. Code is available at https://github.com/CVIU-CSU/HRDecoder.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses high-precision segmentation of small lesions in fundus images. Pixel-level classification of such lesions (hard exudates, hemorrhages, soft exudates, and microaneurysms) requires high resolution, but simply increasing the input resolution raises memory usage and computational overhead and slows inference, which limits both practical deployment and further accuracy gains. To address this, the authors propose HRDecoder (High-Resolution Decoder Network), which combines two modules:

1. **High-Resolution Representation Learning Module**: Extracts detailed local features from large-scale, low-resolution feature maps.
2. **High-Resolution Fusion Module**: Fuses multi-scale predictions to capture detailed information and local contextual cues.

With these modules, HRDecoder improves overall segmentation accuracy while keeping memory usage, computational overhead, and inference speed at reasonable levels. Experiments on the IDRID and DDR datasets demonstrate its effectiveness.
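To make the idea of fusing multi-scale predictions concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names `upsample_nearest` and `fuse_predictions`, the nearest-neighbor upsampling, and the fixed weighted average are all assumptions chosen for illustration. The actual fusion module in the linked repository may work quite differently.

```python
from typing import List

Grid = List[List[float]]  # one per-pixel probability map, stored row by row


def upsample_nearest(pred: Grid, factor: int) -> Grid:
    """Enlarge a map by repeating each value `factor` times along both axes."""
    out: Grid = []
    for row in pred:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def fuse_predictions(fine: Grid, coarse: Grid, factor: int,
                     fine_weight: float = 0.5) -> Grid:
    """Blend a fine-scale map with an upsampled coarse-scale map.

    Assumption: a fixed weighted average stands in for whatever fusion
    rule the paper actually uses.
    """
    coarse_up = upsample_nearest(coarse, factor)
    if len(coarse_up) != len(fine) or len(coarse_up[0]) != len(fine[0]):
        raise ValueError("upsampled coarse map must match the fine map shape")
    return [
        [fine_weight * f + (1.0 - fine_weight) * c for f, c in zip(f_row, c_row)]
        for f_row, c_row in zip(fine, coarse_up)
    ]


# Toy example: a 2x4 fine map and a 1x2 coarse map (scale factor 2).
fine_map = [[1.0, 1.0, 1.0, 1.0],
            [0.0, 0.0, 0.0, 0.0]]
coarse_map = [[0.0, 1.0]]
fused = fuse_predictions(fine_map, coarse_map, factor=2)
```

In this toy setup the coarse map supplies context over a wider area while the fine map keeps per-pixel detail; the `fine_weight` parameter (hypothetical) controls how much each scale contributes.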