A survey of imbalanced pattern classification problems

YE Zhi-fei, WEN Yi-min
2009-01-01
Abstract: Imbalanced data sets have long been regarded as a significant source of difficulty when applying machine learning methods to real-world pattern classification problems. Although various approaches have been proposed during the past decade, many real-world imbalanced data sets still expose the limitations of these approaches, and as a result a lot of further research is currently under way. In this paper, we provide an up-to-date survey of research on imbalanced pattern classification problems. We first take a close look at the problems that imbalanced data sets bring, and then introduce the different kinds of solutions in detail, together with their representative approaches. Finally, using three real imbalanced data sets, we compare the performance of some typical methods, including re-sampling, cost-sensitive learning, training set partitioning, and classifier ensembles. In addition, topics such as evaluation indexes and future areas of research are also discussed.