Cross-Validation Is All You Need: A Statistical Approach To Label Noise Estimation

Jianan Chen, Vishwesh Ramanathan, Tony Xu, Anne L. Martel
2024-07-19
Abstract: Machine learning models experience deteriorated performance when trained in the presence of noisy labels. This is particularly problematic for medical tasks, such as survival prediction, which typically face high label noise complexity with few clear-cut solutions. Inspired by the large fluctuations across folds in the cross-validation performance of survival analyses, we design Monte-Carlo experiments to show that such fluctuation could be caused by label noise. We propose two novel and straightforward label noise detection algorithms that effectively identify noisy examples by pinpointing the samples that more frequently contribute to inferior cross-validation results. We first introduce Repeated Cross-Validation (ReCoV), a parameter-free label noise detection algorithm that is robust to model choice. We further develop fastReCoV, a less robust but more tractable and efficient variant of ReCoV suitable for deep learning applications. Through extensive experiments, we show that ReCoV and fastReCoV achieve state-of-the-art label noise detection performance in a wide range of modalities, models and tasks, including survival analysis, which has yet to be addressed in the literature. Our code and data are publicly available at https://github.com/GJiananChen/ReCoV.
Machine Learning, Computer Vision and Pattern Recognition
### What problem does this paper attempt to solve?

This paper addresses the performance degradation that machine-learning models suffer when trained on noisy labels, especially in medical tasks. Label noise means that some samples in the dataset carry inaccurate or incorrect labels; it is common in real-world datasets, particularly in the medical field.

Specifically, the paper focuses on detecting samples with noisy labels. Label noise is particularly complex in medical tasks such as survival prediction, which lack clear-cut solutions. Having observed large performance fluctuations across folds during cross-validation, the authors hypothesize that these fluctuations may be caused by label noise and design Monte-Carlo experiments to test this hypothesis.

### The methods proposed in the paper

To address this problem, the paper proposes two novel and straightforward label-noise-detection algorithms:

1. **Repeated Cross-Validation (ReCoV)**: A parameter-free label-noise-detection algorithm that is robust to model choice. It repeats cross-validation many times and identifies the samples that frequently contribute to poor validation results, flagging them as likely to have noisy labels.
2. **fastReCoV**: A more efficient but slightly less robust variant of ReCoV, suitable for deep-learning applications. It improves computational efficiency through techniques such as weighted sampling and exponential moving averages.

### Experimental verification

Through extensive experiments, the authors show that ReCoV and fastReCoV achieve state-of-the-art label-noise-detection performance across multiple modalities, models, and tasks, including survival analysis, which has not been fully explored in previous literature.
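
The repeated cross-validation idea described above can be illustrated with a minimal sketch. This is not the authors' implementation (their code is in the linked repository); it is a toy version under stated assumptions: a one-dimensional two-class dataset, a nearest-centroid classifier, and a simple rule that increments a counter for every sample in the lowest-accuracy fold of each run. The names `make_toy_data`, `fit_centroids`, `predict`, and `worst_fold_counts`, and the settings `n_runs=200` and `k=5`, are all hypothetical choices made for illustration.

```python
import random


def make_toy_data(n_per_class=20, n_flip_per_class=2, seed=0):
    """1-D two-class data; the first few labels of each class are flipped."""
    rng = random.Random(seed)
    xs, ys, is_noisy = [], [], []
    for true_label, centre in ((0, -1.0), (1, 1.0)):
        for j in range(n_per_class):
            xs.append(centre + rng.uniform(-0.3, 0.3))
            flipped = j < n_flip_per_class
            ys.append(1 - true_label if flipped else true_label)
            is_noisy.append(flipped)
    return xs, ys, is_noisy


def fit_centroids(xs, ys):
    """Return the mean feature value of each observed class (0 and 1)."""
    means = []
    for label in (0, 1):
        values = [x for x, y in zip(xs, ys) if y == label]
        means.append(sum(values) / len(values))
    return means


def predict(means, x):
    """Assign x to the class whose mean is nearer."""
    return 0 if abs(x - means[0]) <= abs(x - means[1]) else 1


def worst_fold_counts(xs, ys, n_runs=200, k=5, seed=0):
    """Count how often each sample falls in the lowest-accuracy fold.

    Assumed simplification: only the single worst fold of each run is counted.
    """
    rng = random.Random(seed)
    n = len(xs)
    counts = [0] * n
    for _ in range(n_runs):
        order = list(range(n))
        rng.shuffle(order)  # a fresh random split for every run
        folds = [order[i::k] for i in range(k)]
        accuracies = []
        for fold in folds:
            held_out = set(fold)
            train = [i for i in order if i not in held_out]
            means = fit_centroids([xs[i] for i in train], [ys[i] for i in train])
            hits = sum(predict(means, xs[i]) == ys[i] for i in fold)
            accuracies.append(hits / len(fold))
        worst = folds[accuracies.index(min(accuracies))]
        for i in worst:
            counts[i] += 1
    return counts


xs, ys, is_noisy = make_toy_data()
counts = worst_fold_counts(xs, ys)

noisy_counts = [c for c, flag in zip(counts, is_noisy) if flag]
clean_counts = [c for c, flag in zip(counts, is_noisy) if not flag]
mean_noisy = sum(noisy_counts) / len(noisy_counts)
mean_clean = sum(clean_counts) / len(clean_counts)
print(f"mean worst-fold count, flipped labels: {mean_noisy:.1f}")
print(f"mean worst-fold count, clean labels:   {mean_clean:.1f}")
```

In this toy setup the classes are well separated, so a fold's validation accuracy drops only when it contains flipped labels. The flipped samples therefore accumulate a higher average count than the clean ones, and ranking samples by count surfaces likely label noise. Real data, and the thresholding used in the paper, will be less clear-cut.
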
### Conclusion

The main contributions of the paper are:

- Showing that performance fluctuations across folds during cross-validation can reflect the presence of label noise.
- Proposing two effective label-noise-detection algorithms, ReCoV and fastReCoV, which achieve state-of-the-art performance on multiple tasks.
- Demonstrating the effectiveness and practicality of these methods on real-world label noise, especially in medical-image datasets.

In summary, the paper offers a new perspective and tool for handling label noise, which helps improve the performance and reliability of machine-learning models trained on noisy labels.