No regret sample selection with noisy labels

Heon Song, Nariaki Mitsuo, Seiichi Uchida, Daiki Suehiro
DOI: https://doi.org/10.1007/s10994-023-06478-8
IF: 5.414
2024-01-07
Machine Learning
Abstract: Deep neural networks (DNNs) suffer from noisy-labeled data because of the risk of overfitting. To avoid the risk, in this paper, we propose a novel DNN training method with sample selection based on adaptive k-set selection, which selects k (< n, where n is the number of training samples) samples with a small noise-risk from the whole n noisy training samples at each epoch. It has the strong advantage of guaranteeing the performance of the selection theoretically. Roughly speaking, a regret, which is defined by the difference between the actual selection and the best selection, of the proposed method is theoretically bounded, even though the best selection is unknown until the end of all epochs. The experimental results on multiple noisy-labeled datasets demonstrate that our sample selection strategy works effectively in the DNN training; in fact, the proposed method achieved the best or the second-best performance among state-of-the-art methods, while requiring a significantly lower computational cost.
computer science, artificial intelligence
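
The abstract's notion of regret (the loss incurred by the per-epoch selections minus the loss of the best fixed k-set in hindsight) can be illustrated with a small sketch. This is not the authors' algorithm: it swaps in a plain "pick the k samples with the smallest cumulative loss so far" rule as a stand-in, and it assumes the per-epoch, per-sample losses are given as a fixed array instead of coming from a DNN being trained. The names `select_k_smallest`, `run_selection`, and `losses` are hypothetical.

```python
import numpy as np


def select_k_smallest(cumulative_loss, k):
    """Return indices of the k samples with the smallest cumulative loss."""
    # A stable sort makes tie-breaking deterministic (lowest index first).
    return np.argsort(cumulative_loss, kind="stable")[:k]


def run_selection(losses, k):
    """Select k samples per epoch and measure regret.

    losses: array of shape (T, n); losses[t, i] is a hypothetical
            noise-risk proxy for sample i at epoch t (an assumption,
            in real training it would depend on the current model).
    Returns (selections, regret), where regret is the total loss of the
    chosen sets minus the loss of the best fixed k-set in hindsight.
    """
    T, n = losses.shape
    cumulative = np.zeros(n)
    incurred = 0.0
    selections = []
    for t in range(T):
        # Choose using only information from earlier epochs.
        chosen = select_k_smallest(cumulative, k)
        selections.append(chosen)
        incurred += losses[t, chosen].sum()
        cumulative += losses[t]
    # Best fixed k-set, known only after all epochs have finished.
    best_fixed = np.sort(cumulative)[:k].sum()
    return selections, incurred - best_fixed


if __name__ == "__main__":
    demo = np.array([[0.1, 0.9, 0.2],
                     [0.2, 0.8, 0.1]])
    picks, regret = run_selection(demo, k=2)
    print([p.tolist() for p in picks], regret)
```

In this toy run, sample 1 has a consistently high loss, so the rule drops it after the first epoch. The paper's contribution is a selection rule whose regret is provably bounded; the greedy rule above carries no such guarantee and is only meant to make the quantities concrete.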