Fuzzy-Based Information Decomposition for Incomplete and Imbalanced Data Learning.

Shigang Liu, Jun Zhang, Yang Xiang, Wanlei Zhou
DOI: https://doi.org/10.1109/tfuzz.2017.2754998
IF: 12.253
2017-01-01
IEEE Transactions on Fuzzy Systems
Abstract: Class imbalance and missing values are two critical problems in pattern classification. Researchers have proposed a number of techniques to address each problem individually. However, no single technique solves both problems, and simply combining existing approaches cannot accurately classify imbalanced data with missing values. This paper develops a fuzzy-based information decomposition (FID) method to address the two problems simultaneously. In the new FID method, the two different problems are treated as the same missing data estimation problem. In particular, FID rebalances the training data by creating synthetic samples for the minority class. The proposed scheme has two steps: weighting and recovery. In the weighting step, the weights produced by the fuzzy membership functions are used to quantify the contribution of the observed data to the estimation of the missing values. In the recovery step, missing values are estimated by taking into account the different contributions of the observed data. To evaluate the performance of the new FID method, a large number of classification experiments were carried out on 27 well-known datasets. The results show that the FID method significantly outperforms ten other state-of-the-art individual methods and eight combination methods when missing values and class imbalance are present at the same time.
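The two-step scheme described in the abstract (weighting, then recovery) can be illustrated with a minimal sketch for a single numeric feature. The abstract does not specify the membership function or how estimation points are chosen, so everything concrete here is an assumption made for illustration: the triangular membership function, the equal-width intervals over the observed range, and the hypothetical function names `fuzzy_estimates` and `fid_like_column`. It is not the authors' implementation.

```python
import numpy as np

def fuzzy_estimates(observed, t):
    """Produce t estimates from observed values of one feature.

    Assumption: the observed range is split into t equal-width intervals
    and a triangular membership function centred on each interval weights
    the observed values. The paper's actual functions may differ.
    """
    observed = np.asarray(observed, dtype=float)
    if t == 0:
        return np.array([])
    lo, hi = observed.min(), observed.max()
    if hi == lo:
        # All observed values are identical; nothing to decompose.
        return np.full(t, lo)
    h = (hi - lo) / t                         # interval width
    centers = lo + h * (np.arange(t) + 0.5)   # one centre per estimate

    # Weighting step: membership of each observed value in each interval.
    dist = np.abs(observed[None, :] - centers[:, None])
    weights = np.maximum(0.0, 1.0 - dist / h)
    totals = weights.sum(axis=1)

    # Recovery step: weighted mean of the observed values; fall back to
    # the plain mean when no observed value has non-zero membership.
    safe_totals = np.where(totals > 0, totals, 1.0)
    return np.where(totals > 0, (weights @ observed) / safe_totals,
                    observed.mean())

def fid_like_column(column, n_synthetic):
    """Fill NaNs and create n_synthetic extra values for one feature,
    treating both as the same estimation problem (as the abstract states)."""
    column = np.asarray(column, dtype=float)
    missing = np.isnan(column)
    n_missing = int(missing.sum())
    est = fuzzy_estimates(column[~missing], n_missing + n_synthetic)
    filled = column.copy()
    filled[missing] = est[:n_missing]
    return filled, est[n_missing:]

filled, synthetic = fid_like_column([1.0, np.nan, 3.0, 5.0], n_synthetic=1)
```

In this sketch, imputing a missing entry and generating a synthetic minority-class value both go through the same estimator, which mirrors the abstract's framing of the two problems as one missing data estimation problem. In practice such a routine would be applied per feature to the minority-class samples only.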