On the Noise Estimation Statistics

Wei Gao, Teng Zhang, Bin-Bin Yang, Zhi-Hua Zhou
DOI: https://doi.org/10.1016/j.artint.2021.103451
IF: 14.4
2021-01-01
Artificial Intelligence
Abstract: Learning with noisy labels has attracted much attention over the past few decades. A fundamental problem is how to estimate noise proportions from corrupted data. Previous studies on this issue resort to estimating class distributions, conditional distributions, or the kernel embedding of distributions. In this paper, we present another simple and effective approach for noise estimation. The basic idea is to utilize the first- and second-order statistics of observed data, and the positive semi-definiteness of covariance matrices. Then, an upper bound on noise estimation is provided without additional assumptions on the data distribution. Based on this idea and using the locality property of random noise, we develop the Noise Estimation Statistics with Clusters (NESC) method, which first clusters the corrupted data with the k-means algorithm, and then estimates the noise from the clusters based on the first- and second-order statistics. We present the existence, uniqueness and convergence analysis of our noise estimation, and empirical studies verify the effectiveness of the NESC method.
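The cluster-then-statistics idea can be illustrated with a small sketch. It is not the paper's NESC estimator, whose exact formulas are not given in the abstract. It assumes a simplified setting chosen here for illustration: binary labels in {-1, +1}, a symmetric flip rate eta < 1/2 that is independent of the features, and one-dimensional features. Under those assumptions, the noisy label mean and the feature-label covariance are the clean ones scaled by (1 - 2*eta), and positive semi-definiteness of the 2x2 covariance of (x, clean y) gives (1 - 2*eta)^2 >= m^2 + c^2 / var(x), where m and c are the noisy label mean and the noisy feature-label covariance. The function names, the tiny 1-D k-means, and taking the minimum over clusters are all hypothetical choices.

```python
import math


def mean(values):
    return sum(values) / len(values)


def noise_upper_bound(xs, noisy_labels):
    """Upper bound on a symmetric flip rate eta < 1/2 (illustrative assumption).

    Uses only first- and second-order statistics of (x, noisy y):
    (1 - 2*eta)^2 >= m^2 + c^2 / var(x), from the PSD covariance of (x, clean y).
    """
    m = mean(noisy_labels)                      # noisy label mean
    mx = mean(xs)
    var_x = mean([(x - mx) ** 2 for x in xs])   # feature variance
    cov = mean([x * y for x, y in zip(xs, noisy_labels)]) - mx * m
    s = m * m
    if var_x > 0:
        s += cov * cov / var_x
    s = min(1.0, s)                             # guard against rounding above 1
    return (1.0 - math.sqrt(s)) / 2.0


def kmeans_1d(xs, k, iters=20):
    """Tiny deterministic 1-D Lloyd iteration; returns a cluster index per point."""
    lo, hi = min(xs), max(xs)
    centers = [lo + (hi - lo) * (j + 0.5) / k for j in range(k)]
    assign = [0] * len(xs)
    for _ in range(iters):
        assign = [min(range(k), key=lambda j: abs(x - centers[j])) for x in xs]
        for j in range(k):
            members = [x for x, a in zip(xs, assign) if a == j]
            if members:                         # keep the old center if a cluster is empty
                centers[j] = mean(members)
    return assign


def clustered_bound(xs, noisy_labels, k):
    """Smallest per-cluster bound (a hypothetical way to combine clusters)."""
    assign = kmeans_1d(xs, k)
    bounds = []
    for j in range(k):
        cx = [x for x, a in zip(xs, assign) if a == j]
        cy = [y for y, a in zip(noisy_labels, assign) if a == j]
        if cx:
            bounds.append(noise_upper_bound(cx, cy))
    return min(bounds)


# Toy data: two well-separated groups, one label flipped in each (20% noise).
xs = [0, 1, 2, 3, 4, 10, 11, 12, 13, 14]
ys = [1, 1, -1, 1, 1, -1, -1, 1, -1, -1]
pooled = noise_upper_bound(xs, ys)
local = clustered_bound(xs, ys, k=2)
print(pooled, local)
```

On this toy data the per-cluster bound is tighter than the pooled one, which matches the intuition behind exploiting locality: inside a cluster the clean labels are nearly constant, so most of the observed label variation is attributable to noise.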