Consolidation of Structure of High Noise Data by a New Noise Index and Reinforcement Learning.

Tianyi Huang, Zhiling Cai, Ruijia Li, Shiping Wang, William Zhu
DOI: https://doi.org/10.1016/j.ins.2022.10.008
IF: 8.1
2022-01-01
Information Sciences
Abstract: Data denoising is an essential issue in machine learning and computer vision. However, most existing denoising methods can handle only low-noise data, because it is difficult for them to recover the true structures of high-noise data. An existing partial solution is to consolidate the structures of the data by moving each sample toward its nearby high-density region. This effectively treats all non-noisy samples as noisy ones, but non-noisy samples are important for constructing the structures of high-noise data, so they should not be moved. To address this problem, we propose a new denoising method called Denoising by a new Noise index and Reinforcement learning (DNR). First, it detects each noisy sample by its density and the distance between the sample and the center of its neighbors. A noisy sample usually has a low density, and most of its neighbors lie on the same side of it, so the center of its neighbors is far from it, especially in high-noise data. Second, for each detected noisy sample, DNR models its movement as a Markov decision process in order to store the experience gained during this movement. Finally, DNR learns a policy that iteratively moves each detected noisy sample toward its nearby high-density region by learning from this experience through reinforcement learning. The learned experience effectively helps the movement adapt to the high noise found in real-world cases. In DNR, the structures of high-noise data are well consolidated by this detection and movement of noisy samples. To justify DNR, we theoretically analyze its convergence. We then perform experiments showing that DNR denoises high-noise data better than existing denoising methods. The source code can be downloaded from https://github.com/TianyiHuang2022 .
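The detection idea in the abstract (low density plus a neighbor center that lies far from the sample) can be illustrated with a small sketch. This is not the noise index defined in the paper, whose exact formula is not given here. It is a simplified stand-in that assumes density can be approximated by the mean distance to the k nearest neighbors and that the two cues can be combined by a plain product. The function name `noise_scores` and the parameter `k` are hypothetical.

```python
import numpy as np

def noise_scores(X, k=5):
    """Illustrative noise score per sample (assumption, not the paper's index).

    A higher score means the sample looks more like noise: it sits in a
    sparse region AND its k nearest neighbors are centered far away from it.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Indices of the k nearest neighbors; column 0 is the sample itself.
    order = np.argsort(dist, axis=1)[:, 1:k + 1]
    knn_dist = np.take_along_axis(dist, order, axis=1)
    # Sparsity cue: a large mean neighbor distance indicates low density.
    sparsity = knn_dist.mean(axis=1)
    # Offset cue: distance from the sample to the center of its neighbors.
    centers = X[order].mean(axis=1)
    offset = np.linalg.norm(X - centers, axis=1)
    # Assumed combination: simple product of the two cues.
    return sparsity * offset

# Usage: one tight cluster plus a single far-away sample.
rng = np.random.default_rng(0)
cluster = rng.normal(loc=0.0, scale=0.1, size=(50, 2))
outlier = np.array([[5.0, 5.0]])
scores = noise_scores(np.vstack([cluster, outlier]), k=5)
print(int(np.argmax(scores)))  # index of the most noise-like sample
```

In this toy setup all neighbors of the isolated sample lie on one side of it, so both cues are large at once, which matches the intuition stated in the abstract. Thresholding the score to decide which samples to move, and the reinforcement-learning movement itself, are left out of this sketch.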