Collaborative Learning by Boosting in Distributed Environments

Shijun Wang, Changshui Zhang
DOI: https://doi.org/10.1142/s0218001410008147
2008-01-01
Abstract: In human society, people learn from one another, and knowledge accumulates from generation to generation. This observation offers a useful hint for distributed learning. In distributed applications, each site holds its own data. If each site can build a local model and then improve it using the models learned by its neighboring sites, at low communication cost, distributed applications stand to benefit. In this paper, we propose a new distributed learning method, the distributed network boosting (DNB) algorithm. During learning, neighboring sites exchange their learned hypotheses. Theoretical analysis shows that the DNB algorithm minimizes the cost function through collaborative functional gradient descent in hypothesis space. We also give upper bounds on the training error and the generalization error of the DNB algorithm. Comparisons with other algorithms on real data sets of different sizes show the effectiveness of the proposed algorithm for distributed applications. To show the influence of network topology on the performance of the DNB algorithm, we tested it on random graphs and scale-free networks. Bias-variance decomposition shows that the network topology plays an important role in controlling the diversity of the learned classifier ensemble.
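The idea of exchanging hypotheses between neighboring sites can be illustrated with a small sketch. This is not the DNB algorithm as defined in the paper, whose update rules are not given in the abstract. It is a minimal AdaBoost-style illustration under stated assumptions: each site fits a decision stump on its own weighted data, only the stumps (never the data) are sent to neighbors, and each site adds every received stump to its ensemble with a coefficient computed on local data. The names `train_stump`, `collaborative_boost`, the toy data, and the three-site fully connected topology are all hypothetical.

```python
import math

def stump_predict(stump, x):
    # A stump is (feature index, threshold, sign): predict sign if x[f] <= t, else -sign.
    f, t, s = stump
    return s if x[f] <= t else -s

def weighted_error(stump, X, y, w):
    # Total weight of the local examples the stump misclassifies.
    return sum(wi for xi, yi, wi in zip(X, y, w) if stump_predict(stump, xi) != yi)

def train_stump(X, y, w):
    # Exhaustive search for the stump with the lowest weighted error.
    best_err, best = None, None
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            for s in (1, -1):
                err = weighted_error((f, t, s), X, y, w)
                if best_err is None or err < best_err:
                    best_err, best = err, (f, t, s)
    return best

def collaborative_boost(sites, neighbors, rounds=3):
    # sites: list of (X, y) with labels in {-1, +1}; neighbors: site index -> neighbor indices.
    weights = [[1.0 / len(y)] * len(y) for _, y in sites]
    ensembles = [[] for _ in sites]
    for _ in range(rounds):
        # Step 1: every site fits a weak hypothesis on its own weighted data.
        local = [train_stump(X, y, w) for (X, y), w in zip(sites, weights)]
        # Step 2: every site absorbs its own hypothesis and those of its neighbors.
        for i, (X, y) in enumerate(sites):
            for h in [local[i]] + [local[j] for j in neighbors[i]]:
                err = weighted_error(h, X, y, weights[i])
                err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
                alpha = 0.5 * math.log((1 - err) / err)  # AdaBoost coefficient (assumed)
                ensembles[i].append((alpha, h))
                new_w = [wi * math.exp(-alpha * yi * stump_predict(h, xi))
                         for xi, yi, wi in zip(X, y, weights[i])]
                z = sum(new_w)
                weights[i] = [wi / z for wi in new_w]
    return ensembles

def predict(ensemble, x):
    score = sum(alpha * stump_predict(h, x) for alpha, h in ensemble)
    return 1 if score >= 0 else -1

# Toy data: three sites, each with four one-dimensional points.
sites = [
    ([[0.1], [0.2], [0.8], [0.9]], [-1, -1, 1, 1]),
    ([[0.0], [0.3], [0.7], [1.0]], [-1, -1, 1, 1]),
    ([[0.15], [0.25], [0.75], [0.85]], [-1, -1, 1, 1]),
]
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
ensembles = collaborative_boost(sites, neighbors)
```

In this sketch the `neighbors` mapping stands in for the network topology: replacing it with a sparser or denser graph changes which hypotheses each site sees, and therefore how similar the per-site ensembles become.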