Parallel Learning of Large-Scale Multi-Label Classification Problems with Min-Max Modular Liblinear

Yangyang Chen, Bao-Liang Lu, Hai Zhao
DOI: https://doi.org/10.1109/ijcnn.2012.6252679
2012-01-01
Abstract: Research on pattern classification is trending toward large-scale, multi-label, and imbalanced problems. The amount of data to be classified is typically in the tens of millions and has been increasing rapidly in recent years. Traditional pattern classification approaches are inefficient, and even ineffective, in this situation. In our previous work, we proposed a min-max modular (M³) network for dealing with large-scale and imbalanced problems. The M³ network is a general modular learning framework with three main steps: decomposing a large-scale problem into several smaller independent sub-problems, learning these sub-problems in parallel, and combining the results of the sub-problems to generate a solution to the original problem. In this paper, we embed LIBLINEAR into the M³ network (M³-Liblinear) to deal with large-scale, multi-label, and imbalanced pattern classification problems. LIBLINEAR is a fast implementation of a linear classifier. M³-Liblinear uses LIBLINEAR as the base classifier to learn each of the sub-problems. We compare M³-Liblinear with Liblinear-cdblock on a large-scale Japanese patent classification problem. Experimental results demonstrate that M³-Liblinear is superior to Liblinear-cdblock in both training time and generalization performance.
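The three steps named in the abstract (decompose, learn in parallel, combine) can be illustrated with a minimal Python sketch for a single two-class problem, such as one label against the rest. Everything here is an assumption for illustration and not the authors' implementation: the base learner is a plain perceptron standing in for LIBLINEAR, the decomposition is a simple contiguous split, the combination is taken to be a max over positive subsets of the min over negative subsets (as the name "min-max" suggests), and the names `train_linear`, `chunks`, `m3_train`, and `m3_score` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def train_linear(task):
    """Stand-in base learner: a plain perceptron (an assumption, NOT LIBLINEAR)."""
    pos, neg = task
    data = [(x, 1) for x in pos] + [(x, -1) for x in neg]
    w = [0.0] * (len(data[0][0]) + 1)  # last entry is the bias
    for _ in range(100):  # fixed number of passes over the sub-problem
        for x, y in data:
            s = sum(wi * xi for wi, xi in zip(w, x)) + w[-1]
            if y * s <= 0:  # misclassified: move the hyperplane toward x
                for i, xi in enumerate(x):
                    w[i] += y * xi
                w[-1] += y
    return w


def score(w, x):
    """Signed decision value of one linear model."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w[-1]


def chunks(items, n):
    """Split items into n roughly equal contiguous parts."""
    k, r = divmod(len(items), n)
    out, start = [], 0
    for i in range(n):
        end = start + k + (1 if i < r else 0)
        out.append(items[start:end])
        start = end
    return out


def m3_train(pos, neg, n_pos, n_neg):
    """Step 1: decompose. Step 2: train every (pos part, neg part) pair."""
    tasks = list(product(chunks(pos, n_pos), chunks(neg, n_neg)))
    # Threads keep the sketch simple; separate processes or machines
    # would be needed for a real speed-up.
    with ThreadPoolExecutor() as pool:
        models = list(pool.map(train_linear, tasks))
    # Row i holds the models trained on positive part i.
    return [models[i * n_neg:(i + 1) * n_neg] for i in range(n_pos)]


def m3_score(grid, x):
    """Step 3: combine. Min over each row, then max over the rows."""
    return max(min(score(w, x) for w in row) for row in grid)


pos = [(2, 2), (3, 3), (2, 3), (3, 2)]
neg = [(-2, -2), (-3, -3), (-2, -3), (-3, -2)]
grid = m3_train(pos, neg, n_pos=2, n_neg=2)
print(m3_score(grid, (3, 3)) > 0, m3_score(grid, (-3, -3)) < 0)
```

Because each sub-problem sees only a slice of the positive and negative samples, the sub-problems are independent of each other and smaller than the original problem, which is what makes the parallel training step possible.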