ScaleCheck: A Single-Machine Approach for Discovering Scalability Bugs in Large Distributed Systems
Cesar A. Stuardo, Tanakorn Leesatapornwongsa, Riza O. Suminto, Kaushik Rajan, Huan Ke, Jyoti Leeka, Jeffrey F. Lukman, Changsheng Xie, Jayashree Mohan, Wei-Chiu Chuang, Xubin He, Piyus Kedia, Shan Lu, Haryadi S. Gunawi, Weiguo Liu, Wei Xue
2019-01-01
Abstract: We present SCALECHECK, an approach for discovering scalability bugs (a new class of bug in large storage systems) and for democratizing large-scale testing. SCALECHECK employs a program analysis technique for finding potential causes of scalability bugs, and a series of colocation techniques for testing implementation code at real scales on just a commodity PC. SCALECHECK has been integrated into several large-scale storage systems (Cassandra, HDFS, Riak, and Voldemort) and has successfully exposed known and unknown scalability bugs at scales of up to 512 nodes on a 16-core PC.