A Performance Debugging Framework for Unnecessary Lock Contentions with Record/Replay Techniques

Xiaofei Liao, Long Zheng, Bingsheng He, Song Wu, Hai Jin
DOI: https://doi.org/10.1109/tpds.2015.2472412
IF: 5.3
2015-01-01
IEEE Transactions on Parallel and Distributed Systems
Abstract: Locks have been widely used as an effective synchronization mechanism among processes and threads. However, we observe that a large number of false inter-thread dependencies (i.e., unnecessary lock contentions) exist during program execution on multicore processors, incurring significant performance overhead. This paper presents a performance debugging framework, PERFPLAY, to facilitate the identification of unnecessary lock contentions and to guide programmers in improving program performance by eliminating them. Since the performance debugging of unnecessary lock contentions is input-sensitive, we first identify the representative inputs for performance debugging. Next, PERFPLAY quantifies the performance impact of unnecessary lock contention code regions for each candidate input. Because performance impact and input coverage often conflict in real-world programs, we finally make a tradeoff between the two to recommend the optimal unnecessary lock contention code regions. Our results on five real-world programs and the PARSEC benchmarks demonstrate the significant performance overhead of unnecessary lock contentions, and the effectiveness of PERFPLAY in troubleshooting the target unnecessary lock contention code regions while considering both performance impact and input coverage.
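To make the notion of a false inter-thread dependency concrete, the following is a minimal sketch and not PERFPLAY's actual algorithm. It assumes a hypothetical trace record, CriticalSection, that lists the shared variables each lock-protected region reads and writes, and it treats a contention as unnecessary when two threads compete for the same lock without any read/write or write/write overlap. All names here are illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CriticalSection:
    """Hypothetical record of one lock-protected region in a recorded trace."""
    thread: str
    lock: str
    reads: frozenset = field(default_factory=frozenset)
    writes: frozenset = field(default_factory=frozenset)


def is_unnecessary_contention(a: CriticalSection, b: CriticalSection) -> bool:
    """Return True if a and b contend for the same lock with no data conflict.

    Assumption: a real conflict exists only when one region writes a
    variable that the other region reads or writes.
    """
    if a.thread == b.thread or a.lock != b.lock:
        return False  # not an inter-thread contention on a shared lock
    conflict = (a.writes & (b.reads | b.writes)) or (b.writes & a.reads)
    return not conflict


# Two readers of x serialized by the same lock: the dependency is false.
reader_1 = CriticalSection("T1", "m", reads=frozenset({"x"}))
reader_2 = CriticalSection("T2", "m", reads=frozenset({"x"}))
# A writer of x under the same lock: the dependency is real.
writer = CriticalSection("T2", "m", writes=frozenset({"x"}))

print(is_unnecessary_contention(reader_1, reader_2))  # True
print(is_unnecessary_contention(reader_1, writer))    # False
```

Under this assumption, regions flagged as unnecessary are candidates for finer-grained or reader/writer locking; how much each one matters still depends on the input, which is the tradeoff the abstract describes.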