Optimal and Adaptive Algorithms for Online Boosting

Alina Beygelzimer, Satyen Kale, Haipeng Luo
DOI: https://doi.org/10.48550/arXiv.1502.02651
2015-02-10
Abstract: We study online boosting, the task of converting any weak online learner into a strong online learner. Based on a novel and natural definition of weak online learnability, we develop two online boosting algorithms. The first algorithm is an online version of boost-by-majority. By proving a matching lower bound, we show that this algorithm is essentially optimal in terms of the number of weak learners and the sample complexity needed to achieve a specified accuracy. This optimal algorithm is not adaptive, however. Using tools from online loss minimization, we derive an adaptive online boosting algorithm that is also parameter-free, but not optimal. Both algorithms work with base learners that can handle example importance weights directly, as well as by rejection sampling examples with probability defined by the booster. Results are complemented with an extensive experimental study.
Machine Learning
What problem does this paper attempt to address?
The core problem this paper addresses is how to convert any weak online learner into a strong online learner, that is, how to improve the accuracy of weak online learning algorithms through online boosting. The authors propose two online boosting algorithms:

1. **Online Boost-by-Majority (Online BBM)**: An online version of the classic boost-by-majority algorithm. By proving a matching lower bound, the authors show that this algorithm is essentially optimal in the number of weak learners and the sample complexity required to reach a specified accuracy. It is not adaptive, however.
2. **AdaBoost.OL**: An adaptive, parameter-free online boosting algorithm derived with tools from online loss minimization. Although it is not optimal, it usually performs better in practical applications.

### Background of the Main Problem

- **Differences between Online Learning and Batch Learning**: Online learning algorithms receive examples one at a time and update the predictor immediately after each one, while batch learning algorithms receive all examples at once. Online learning algorithms usually need no stochastic assumptions about the data, so they are better suited to situations where the data changes over time.
- **Previous Work**: Earlier work explored how to carry boosting methods from batch learning over to online learning, but those methods have limitations both in theory and in practice.

### Main Contributions of the Paper

1. **New Online Weak Learning Assumption**: The authors propose a weaker online weak learning assumption that can be read directly as the online counterpart of the weak learning assumption in standard batch boosting.
2. **Sampling Technique**: The proposed algorithms do not require base learners that handle importance-weighted examples. They can instead pass each example to a base learner by rejection sampling, with a probability chosen by the booster, similar to boosting by filtering in the batch setting.
3. **Optimality Proof**: For the Online BBM algorithm, the authors prove a matching lower bound, showing that it is essentially optimal in the number of weak learners and the sample complexity.
4. **Adaptive Algorithm**: The AdaBoost.OL algorithm is adaptive: it adjusts the weight of each weak learner according to that learner's performance, so it does not need the edge to be known in advance.

### Summary of Mathematical Formulas

- **Definition of Weak Online Learner**: Over $T$ rounds with true labels $y_t$, the learner's predictions $\hat{y}_t$ satisfy
  \[ \sum_{t = 1}^T 1\{\hat{y}_t\neq y_t\} \leq \left(\frac{1}{2}-\gamma\right)T + S \]
  where $\gamma$ is the edge and $S$ is the excess loss.
- **Mistake Bound of Online BBM Algorithm**: With $N$ weak learners, the number of mistakes over $T$ rounds is at most
  \[ \exp\left(-\frac{N\gamma^2}{2}\right)T+\tilde{O}\left(\sqrt{N}\left(S + \frac{1}{\gamma}\right)\right) \]
- **Mistake Bound of AdaBoost.OL Algorithm**: With $\gamma_i$ the edge of the $i$-th weak learner, the number of mistakes over $T$ rounds is at most
  \[ \frac{2T}{\sum_i\gamma_i^2}+\tilde{O}\left(\frac{N^2}{\sum_i\gamma_i^2}\right) \]

Dividing either mistake bound by $T$ gives the corresponding error rate. The paper complements these theoretical results with an extensive experimental study.
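
To make the formulas above concrete, the following is a minimal Python sketch that checks the weak online learning condition on a label sequence and evaluates the leading terms of the two mistake bounds. The function names are hypothetical, and the $\tilde{O}(\cdot)$ terms are omitted because their constants are not given here. The AdaBoost.OL leading term is read as $2T/\sum_i\gamma_i^2$, which is an assumption about the intended formula.

```python
import math


def satisfies_weak_learning(predictions, labels, gamma, excess_loss):
    """Check sum_t 1{yhat_t != y_t} <= (1/2 - gamma) * T + S on one sequence."""
    T = len(labels)
    mistakes = sum(1 for p, y in zip(predictions, labels) if p != y)
    return mistakes <= (0.5 - gamma) * T + excess_loss


def online_bbm_leading_term(T, N, gamma):
    """Leading term exp(-N * gamma^2 / 2) * T of the Online BBM mistake bound."""
    return math.exp(-N * gamma ** 2 / 2) * T


def adaboost_ol_leading_term(T, gammas):
    """Leading term 2T / sum_i gamma_i^2 (assumed reading of the bound)."""
    return 2 * T / sum(g * g for g in gammas)


def bbm_learners_needed(gamma, eps):
    """Smallest N such that exp(-N * gamma^2 / 2) <= eps."""
    return math.ceil(2 * math.log(1 / eps) / gamma ** 2)


if __name__ == "__main__":
    # One mistake in four rounds satisfies the condition for gamma = 0.2, S = 0.
    print(satisfies_weak_learning([1, 1, -1, -1], [1, -1, -1, -1], 0.2, 0))
    # Number of weak learners for a 1% leading-term error rate at edge 0.1.
    print(bbm_learners_needed(0.1, 0.01))
    # Sixteen learners with edge 0.5 each, over 1000 rounds.
    print(adaboost_ol_leading_term(1000, [0.5] * 16))
```

Under this reading, the Online BBM leading term shrinks exponentially in $N\gamma^2$, while the AdaBoost.OL term shrinks only inversely in $\sum_i\gamma_i^2$. That matches the summary's description of AdaBoost.OL as adaptive but not optimal.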
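
The rejection-sampling idea mentioned in the abstract (passing an example to a base learner with a probability defined by the booster) can be sketched as follows. This is an illustration only: `MajorityLearner` and `update_by_rejection_sampling` are hypothetical names, and the code does not show how either Online BBM or AdaBoost.OL computes the probability.

```python
import random


class MajorityLearner:
    """Hypothetical toy base learner: predicts the label seen most often so far."""

    def __init__(self):
        self.balance = 0  # (# of +1 labels seen) minus (# of -1 labels seen)

    def predict(self, x):
        return 1 if self.balance >= 0 else -1

    def update(self, x, y):
        self.balance += y


def update_by_rejection_sampling(learner, x, y, prob, rng):
    """Pass (x, y) to the learner with probability `prob`.

    Returns True if the example was passed on, False if it was rejected.
    The booster is assumed to supply `prob` in [0, 1].
    """
    if not 0.0 <= prob <= 1.0:
        raise ValueError("prob must lie in [0, 1]")
    if rng.random() < prob:
        learner.update(x, y)
        return True
    return False


if __name__ == "__main__":
    rng = random.Random(0)
    learner = MajorityLearner()
    passed = sum(
        update_by_rejection_sampling(learner, None, 1, 0.3, rng)
        for _ in range(10_000)
    )
    print(passed / 10_000)  # fraction of examples the learner actually saw
```

The base learner here never sees a weight: in expectation it is updated on a `prob` fraction of the examples, which is how an unweighted learner can stand in for one that handles importance weights directly.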