A boosting method to detect noisy data

Xiao-Dong Liu, Chun-yi Shi, Xue-Dao Gu
DOI: https://doi.org/10.1109/ICMLC.2005.1527276
2005-01-01
Abstract: Noisy data is inherent in data mining. If prior knowledge of the noise were available, it would be straightforward to remove or account for it and so improve model robustness. In most learning situations, however, underlying noise is suspected but difficult to detect. Ensemble classification techniques such as bagging, boosting, and arcing have received much attention in the recent literature. They have been shown to reduce classification error on unseen cases, and this paper demonstrates that they can also be used as noise detectors. The paper gives a brief overview of ensemble methods and proposes a boosting method, based on instance weights and attribute-weight information gain, that makes boosting useful for detecting noisy data. Experiments on a city endowment insurance database show the approach to be successful.
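The general idea of using boosting instance weights as a noise signal can be illustrated with a minimal sketch. This is plain AdaBoost with one-dimensional threshold stumps, not the method proposed in the paper: the attribute-weight, information-gain component is not specified in the abstract and is omitted here. The function names (`best_stump`, `boosting_noise_scores`), the toy data, and the choice to score each instance by its mean weight across rounds are all assumptions made for illustration.

```python
import math


def best_stump(xs, ys, w):
    """Return (error, threshold, polarity) of the lowest weighted-error stump.

    A stump predicts `polarity` when x > threshold and `-polarity` otherwise.
    """
    pts = sorted(set(xs))
    # Candidate thresholds: below the minimum, between neighbours, above the maximum.
    thresholds = ([pts[0] - 0.5]
                  + [(a + b) / 2 for a, b in zip(pts, pts[1:])]
                  + [pts[-1] + 0.5])
    best = None
    for t in thresholds:
        for pol in (1, -1):
            err = sum(wi for x, y, wi in zip(xs, ys, w)
                      if (pol if x > t else -pol) != y)
            if best is None or err < best[0]:
                best = (err, t, pol)
    return best


def boosting_noise_scores(xs, ys, rounds=5):
    """Run AdaBoost and return each instance's mean weight over the rounds.

    Instances that the ensemble keeps misclassifying accumulate weight,
    so a high score marks a candidate noisy instance (an assumption of
    this sketch, not a result taken from the paper).
    """
    n = len(xs)
    w = [1.0 / n] * n
    totals = [0.0] * n
    done = 0
    for _ in range(rounds):
        err, t, pol = best_stump(xs, ys, w)
        if err >= 0.5:  # weak learner no better than chance: stop
            break
        err = max(err, 1e-10)  # guard against log of infinity on a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)
        # Standard AdaBoost update: raise weights of misclassified instances.
        w = [wi * math.exp(-alpha * y * (pol if x > t else -pol))
             for x, y, wi in zip(xs, ys, w)]
        norm = sum(w)
        w = [wi / norm for wi in w]
        totals = [s + wi for s, wi in zip(totals, w)]
        done += 1
    return [s / max(done, 1) for s in totals]


# Toy data: label is -1 for x < 5 and +1 otherwise, with one flipped label at x = 2.
xs = list(range(10))
ys = [-1] * 5 + [1] * 5
ys[2] = 1  # injected label noise (hypothetical example)

scores = boosting_noise_scores(xs, ys)
suspect = max(range(len(xs)), key=scores.__getitem__)
print("most suspicious index:", suspect)
```

On this toy data the instance with the flipped label draws the largest mean weight. The mean across rounds is used rather than the final weight because AdaBoost weights oscillate from round to round, so a single snapshot can rank a clean instance above the noisy one.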