Overload Control for Scaling WeChat Microservices

Hao Zhou, Ming Chen, Qian Lin, Yong Wang, Xiaobin She, Sifan Liu, Rui Gu, Beng Chin Ooi, Junfeng Yang
DOI: https://doi.org/10.1145/3267809.3267823
2018-12-24
Abstract: Effective overload control for large-scale online service systems is crucial for protecting the system backend from overload. Conventionally, overload control is designed ad hoc for each individual service. However, service-specific overload control can be detrimental to the overall system because of intricate service dependencies or flawed service implementations. Service developers usually have difficulty accurately estimating the dynamics of the actual workload during service development. Therefore, it is essential to decouple overload control from service logic. In this paper, we propose DAGOR, an overload control scheme designed for the account-oriented microservice architecture. DAGOR is service-agnostic and system-centric. It manages overload at the granularity of individual microservices: each microservice monitors its load status in real time and, when overload is detected, triggers load shedding collaboratively among its relevant services. DAGOR has been used in the WeChat backend for five years. Experimental results show that DAGOR sustains a high service success rate even when the system is overloaded, while ensuring fairness in overload control.
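The per-microservice monitoring and collaborative load shedding described in the abstract can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the abstract: the `Service` class, the use of average queuing time as the load signal, the 20 ms limit, the integer priorities (lower number means more important), and the one-step adjustment of the admission level. It is not DAGOR's actual implementation.

```python
from collections import deque


class Service:
    """Toy microservice with priority-based load shedding (illustrative only)."""

    def __init__(self, name, queue_time_limit_ms=20.0, window=5, max_priority=10):
        self.name = name
        self.queue_time_limit_ms = queue_time_limit_ms  # assumed overload threshold
        self.samples = deque(maxlen=window)             # recent queuing times (ms)
        self.max_priority = max_priority
        self.admission_level = max_priority             # admit priorities <= this level
        self.downstream_levels = {}                     # last level reported by each callee

    def record_queue_time(self, ms):
        """Monitor load: tighten admission when overloaded, relax otherwise."""
        self.samples.append(ms)
        avg = sum(self.samples) / len(self.samples)
        if avg > self.queue_time_limit_ms:
            self.admission_level = max(0, self.admission_level - 1)
        else:
            self.admission_level = min(self.max_priority, self.admission_level + 1)

    def admit(self, priority):
        """Local load shedding: accept only sufficiently important requests."""
        return priority <= self.admission_level

    def call(self, downstream, priority):
        """Call a downstream service, shedding early if its last known level would reject us."""
        known = self.downstream_levels.get(downstream.name, downstream.max_priority)
        if priority > known:
            return "shed-upstream"  # dropped before any work is sent downstream
        accepted = downstream.admit(priority)
        # The callee's current level is piggybacked on the response (assumed mechanism).
        self.downstream_levels[downstream.name] = downstream.admission_level
        return "ok" if accepted else "shed-downstream"


if __name__ == "__main__":
    a, b = Service("a"), Service("b")
    print(a.call(b, 9))            # b is healthy
    for _ in range(3):
        b.record_queue_time(50.0)  # b becomes overloaded and tightens admission
    print(a.call(b, 9))            # b rejects; a learns b's new level
    print(a.call(b, 9))            # a now sheds the request itself
```

In this sketch the caller learns the callee's admission level from responses, so requests that would be rejected downstream are dropped upstream instead of consuming network and queue capacity on the overloaded service.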
Distributed, Parallel, and Cluster Computing