HSALC: Hard Sample Aware Label Correction for Medical Image Classification

Yangtao Wang, Yicheng Ye, Yanzhao Xie, Maobin Tang, Lisheng Fan
DOI: https://doi.org/10.1007/s11042-024-20114-0
IF: 2.577
2024-01-01
Multimedia Tools and Applications
Abstract: Automatic medical image classification has long been a research hotspot, but existing methods suffer from label noise: they either discard samples with noisy labels or correct labels wrongly, which seriously limits classification performance. To address these problems, we propose a hard sample aware label correction method (termed HSALC) for medical image classification. HSALC consists mainly of a sample division module, a clean · hard · noisy (termed CHN) detection module, and a label noise correction module. First, in the sample division module, we design a sample division criterion based on training difficulty and training loss that divides all samples into three preliminary subsets: clean samples, hard samples, and noisy samples. Second, in the CHN detection module, we add noise to the clean samples and repeatedly apply the sample division criterion to all data, which yields highly reliable clean, hard, and noisy subsets. Finally, in the label noise correction module, to make full use of every available sample, we train a correction model that corrects as many of the wrong labels of the noisy samples as possible, producing a highly purified dataset. We conduct extensive experiments on five image datasets, comprising three medical image datasets and two natural image datasets. The experimental results demonstrate that HSALC greatly improves classification performance on noisily labeled datasets, especially at high noise ratios. The source code is publicly available on GitHub: https://github.com/YYC117/HSALC
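The abstract does not state the sample division criterion itself, so the following is only a minimal sketch of how a three-way split driven by training loss and training difficulty could look. The function name `divide_samples`, both thresholds, and the difficulty proxy (the fraction of epochs in which a sample's loss stays above the loss threshold) are assumptions made for illustration, not the method implemented in the HSALC repository.

```python
from statistics import mean


def divide_samples(loss_history, loss_threshold=1.0, difficulty_threshold=0.5):
    """Split sample indices into clean / hard / noisy subsets.

    loss_history[i] holds the per-epoch training losses of sample i.
    The thresholds and the difficulty proxy are illustrative assumptions,
    not the criterion defined in the paper.
    """
    subsets = {"clean": [], "hard": [], "noisy": []}
    for idx, losses in enumerate(loss_history):
        avg_loss = mean(losses)
        # Assumed difficulty proxy: share of epochs with loss above the threshold.
        difficulty = sum(loss > loss_threshold for loss in losses) / len(losses)
        high_loss = avg_loss > loss_threshold
        high_difficulty = difficulty > difficulty_threshold
        if not high_loss and not high_difficulty:
            subsets["clean"].append(idx)   # low loss and easy to fit
        elif high_loss and high_difficulty:
            subsets["noisy"].append(idx)   # high loss and never fitted
        else:
            subsets["hard"].append(idx)    # the two signals disagree
    return subsets


# Toy loss curves over four epochs (made-up numbers).
history = [
    [0.9, 0.4, 0.2, 0.1],  # fitted quickly
    [2.5, 1.8, 0.3, 0.2],  # fitted slowly
    [2.4, 2.3, 2.5, 2.2],  # never fitted
]
print(divide_samples(history))
```

In this sketch, samples whose two signals disagree fall into the hard subset, which keeps them apart from samples that are treated as mislabeled. The paper's actual criterion may combine the two signals differently.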