DGBCT: A Scalable Distributed Gradient Boosting Causal Tree at Alipay.

Jun Zhou, Caizhi Tang, Qing Cui, Yi Ding, Longfei Li, Fei Wu
DOI: https://doi.org/10.1145/3543873.3584645
2023-01-01
Abstract: Causal effect estimation has been increasingly emphasized in the past few years. To handle this problem, tree-based causal methods have been widely used due to their robustness and explainability. However, most existing methods are limited to running on a single machine, making it difficult to scale to the hundreds of millions of samples found in typical industrial scenarios. This paper proposes DGBCT, a Distributed Gradient Boosting Causal Tree, to tackle this problem, and its contribution is threefold. First, we extend the original GBCT method to a multi-treatment setting and take monotonic constraints into consideration, so that more typical industrial needs can be addressed with our framework. Second, we implement DGBCT on the ‘Controller-Coordinator-Worker’ framework, which provides a dual failover mechanism and ensures flexibility. Third, empirical results show that DGBCT significantly outperforms state-of-the-art causal trees and achieves a near-linear speedup as the number of workers grows. The system is currently deployed at Alipay to support daily business tasks that involve hundreds of millions of users.