Gradient Boosting Neural Networks: GrowNet

Sarkhan Badirli, Xuanqing Liu, Zhengming Xing, Avradeep Bhowmik, Khoa Doan, Sathiya S. Keerthi
DOI: https://doi.org/10.48550/arXiv.2002.07971
2020-06-15
Abstract: A novel gradient boosting framework is proposed where shallow neural networks are employed as "weak learners". General loss functions are considered under this unified framework, with specific examples presented for classification, regression, and learning to rank. A fully corrective step is incorporated to remedy the pitfall of greedy function approximation in classic gradient boosting decision trees. The proposed model outperforms state-of-the-art boosting methods in all three tasks on multiple datasets. An ablation study is performed to shed light on the effect of each model component and of the model hyperparameters.
Machine Learning
What problem does this paper attempt to address?
The paper proposes a new gradient boosting framework that uses shallow neural networks as "weak learners", with the aim of sidestepping the difficulty of designing and training traditional deep neural networks (DNNs). The resulting model, GrowNet, combines the power of gradient boosting with the flexibility and versatility of neural networks, building a complex deep neural network incrementally, layer by layer.

### Main Problems and Solutions

1. **Complex Design and Training of Traditional Deep Neural Networks**
   - **Problem**: Customizing a deep neural network for a specific application area is difficult and requires a great deal of expertise and some luck. Without a general design paradigm, practitioners often rely on heuristics or ad hoc solutions.
   - **Solution**: Apply the idea of gradient boosting to build the network layer by layer, so that the model grows in complexity gradually while each individual step stays simple and controllable.
2. **Limitations of Traditional Gradient Boosting Decision Trees (GBDT)**
   - **Problem**: Although GBDT performs well in many tasks, decision trees are not suitable for all fields. Especially in tasks involving structured data, deep neural networks usually perform better.
   - **Solution**: Use shallow neural networks as weak learners instead of decision trees, combining the expressive power of neural networks with the incremental learning of gradient boosting.
3. **Limitations of Greedy Function Approximation**
   - **Problem**: Classical gradient boosting approximates the target function greedily: each weak learner is fitted once and then frozen, which may leave the model in a local optimum.
   - **Solution**: Introduce a global corrective step that updates the parameters of all previously added weak learners in each iteration, which helps the model escape local optima and improves overall performance.

### Specific Contributions

- **Propose a Novel Method**: Combine gradient boosting with deep neural networks to build complex deep neural networks layer by layer.
- **Develop an Optimization Algorithm**: The algorithm is faster and easier to train than traditional deep neural networks. It incorporates second-order statistics and a global corrective step to improve stability and to allow task-specific fine-tuning.
- **Demonstrate the Effectiveness of the Method**: In experimental evaluation on multiple real-world datasets, the method achieves results superior to existing state-of-the-art boosting methods in classification, regression, and ranking tasks.

### Summary

The main objective of the paper is to provide a more flexible and efficient way to build deep neural networks by combining gradient boosting with shallow neural networks, addressing the complexity and limitations of designing and training traditional deep neural networks.
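
The boosting loop with a corrective step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the class `ShallowNet`, the function `fit_boosted_nets`, the fixed shrinkage factor `shrink`, the squared-error loss, and all hyperparameter values are assumptions made for illustration, and the second-order statistics mentioned in the summary are left out. Each stage fits a one-hidden-layer network to the residual (the negative gradient of the squared error), then a corrective phase takes joint gradient steps on the parameters of all learners added so far.

```python
import numpy as np


class ShallowNet:
    """Hypothetical weak learner: one tanh hidden layer, scalar output."""

    def __init__(self, n_in, n_hidden, rng):
        self.W1 = rng.normal(0.0, 1.0 / np.sqrt(n_in), (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        # Zero output weights: a new learner starts as the zero function.
        self.W2 = np.zeros((n_hidden, 1))
        self.b2 = np.zeros(1)

    def forward(self, X):
        # Cache the input and hidden activations for the next call to step().
        self.X = X
        self.H = np.tanh(X @ self.W1 + self.b1)
        return (self.H @ self.W2 + self.b2).ravel()

    def step(self, grad_out, lr):
        """One gradient step given dLoss/dOutput per sample (call forward first)."""
        g = grad_out[:, None] / len(grad_out)        # mean over samples
        dH = (g @ self.W2.T) * (1.0 - self.H ** 2)   # backprop through tanh
        self.W2 -= lr * (self.H.T @ g)
        self.b2 -= lr * g.sum(axis=0)
        self.W1 -= lr * (self.X.T @ dH)
        self.b1 -= lr * dH.sum(axis=0)


def predict(nets, X, shrink):
    """Ensemble output: shrunken sum of all weak learners."""
    out = np.zeros(len(X))
    for net in nets:
        out += shrink * net.forward(X)
    return out


def fit_boosted_nets(X, y, n_stages=5, n_hidden=8, shrink=0.5,
                     stage_steps=300, corrective_steps=100, lr=0.05, seed=0):
    rng = np.random.default_rng(seed)
    nets, history = [], []
    for _ in range(n_stages):
        # Greedy stage: for 0.5 * (y - F)^2 the negative gradient is the residual.
        residual = y - predict(nets, X, shrink)
        net = ShallowNet(X.shape[1], n_hidden, rng)
        for _ in range(stage_steps):
            net.step(net.forward(X) - residual, lr)
        nets.append(net)
        # Corrective step: update every learner jointly on the ensemble loss.
        for _ in range(corrective_steps):
            grad_F = predict(nets, X, shrink) - y    # also refreshes each cache
            for n in nets:
                n.step(shrink * grad_F, lr)
        history.append(float(np.mean((predict(nets, X, shrink) - y) ** 2)))
    return nets, history


# Toy regression problem (synthetic data, for illustration only).
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, (200, 2))
y = np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2
nets, history = fit_boosted_nets(X, y)
print([round(h, 4) for h in history])  # training MSE after each stage
```

Setting `corrective_steps=0` reduces the sketch to plain greedy boosting, which gives a simple way to compare the two regimes on the same data.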