Fast ABC-Boost for Multi-Class Classification

Ping Li
DOI: https://doi.org/10.48550/arXiv.1006.5051
IF: 5.414
2010-06-25
Machine Learning
Abstract:Abc-boost is a new line of boosting algorithms for multi-class classification, by utilizing the commonly used sum-to-zero constraint. To implement abc-boost, a base class must be identified at each boosting step. Prior studies used a very expensive procedure based on exhaustive search for determining the base class at each boosting step. Good testing performances of abc-boost (implemented as abc-mart and abc-logitboost) on a variety of datasets were reported. For large datasets, however, the exhaustive search strategy adopted in prior abc-boost algorithms can be too prohibitive. To overcome this serious limitation, this paper suggests a heuristic by introducing Gaps when computing the base class during training. That is, we update the choice of the base class only for every $G$ boosting steps (i.e., G=1 in prior studies). We test this idea on large datasets (Covertype and Poker) as well as datasets of moderate sizes. Our preliminary results are very encouraging. On the large datasets, even with G=100 (or larger), there is essentially no loss of test accuracy. On the moderate datasets, no obvious loss of test accuracy is observed when G<= 20~50. Therefore, aided by this heuristic, it is promising that abc-boost will be a practical tool for accurate multi-class classification.
What problem does this paper attempt to address?