Abstract: LogitBoost is a popular Boosting variant that can be applied to either binary or multi-class classification. From a statistical viewpoint, LogitBoost can be seen as additive tree regression that minimizes the Logistic loss. Even in this setting, devising a sound multi-class LogitBoost remains non-trivial compared with devising its binary counterpart. The difficulties stem from two important factors arising in the multi-class Logistic loss. The first is the invariance property implied by the Logistic loss, which makes the optimal classifier output non-unique: adding a constant to each component of the output vector does not change the loss value. The second is the density of the Hessian matrices that arise when computing tree node split gains and fitting node values. Oversimplifying this learning problem can degrade performance. For example, the original LogitBoost algorithm is outperformed by ABC-LogitBoost thanks to the latter's more careful treatment of these two factors. In this paper we propose new techniques to address the two main difficulties in the multi-class LogitBoost setting: (1) we adopt a vector tree model (i.e. each node value is a vector) in which a unique classifier output is guaranteed by adding a sum-to-zero constraint, and (2) we use an adaptive block coordinate descent that exploits the dense Hessian when computing tree split gains and node values. We observe higher classification accuracy and faster convergence rates on a range of public data sets when compared with both the original LogitBoost and the ABC-LogitBoost implementations. We also discuss another way to cope with LogitBoost's dense Hessian matrix: we derive a loss that is similar to the multi-class Logistic loss but guarantees a diagonal Hessian matrix. While this makes the optimization (by Newton descent) easier, we unfortunately observe degraded performance for this modification. We argue that working with the dense Hessian is likely unavoidable, which makes techniques like those proposed in this paper necessary for efficient implementations.
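The two difficulties named in the abstract can be checked numerically on a single example. The sketch below is only an illustration and not the paper's implementation: it assumes the standard softmax form of the multi-class Logistic loss, for which the Hessian with respect to the score vector is diag(p) - p pᵀ, and the function names and the score vector `F` are hypothetical.

```python
import numpy as np

def softmax(F):
    """Class probabilities from a score vector F (shifted for numerical stability)."""
    e = np.exp(F - F.max())
    return e / e.sum()

def logistic_loss(F, y):
    """Multi-class Logistic loss -log p_y for one example with true class y."""
    return -np.log(softmax(F)[y])

def hessian(F):
    """Hessian of the loss w.r.t. F under the softmax assumption: diag(p) - p p^T."""
    p = softmax(F)
    return np.diag(p) - np.outer(p, p)

# Hypothetical 3-class score vector and label, chosen only for illustration.
F = np.array([1.5, -0.3, 0.2])
y = 0

# Invariance: adding the same constant to every component leaves the loss unchanged.
shifted = F + 4.0
loss_unchanged = np.isclose(logistic_loss(F, y), logistic_loss(shifted, y))

# A sum-to-zero constraint picks one representative from each family of shifts.
centered = F - F.mean()

# Density: every off-diagonal entry -p_i * p_j is nonzero, so the Hessian is dense.
H = hessian(F)
off_diagonal = H[~np.eye(len(F), dtype=bool)]

print("loss unchanged under shift:", loss_unchanged)
print("sum of centered scores:", centered.sum())
print("Hessian:\n", H)
```

Under the same assumption, `H` maps the all-ones vector to zero, which is the invariance seen from the second-order side: the Hessian is singular along the constant-shift direction, so a Newton step is not uniquely defined unless a constraint such as sum-to-zero is imposed.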
