Characterizing Fine-Grain Parallelism on Modern Multicore Platform

Xuhao Chen,Wei Chen,Jiawen Li,Zhong Zheng,Li Shen,Zhiying Wang
DOI: https://doi.org/10.1109/ICPADS.2011.41
2011-01-01
Abstract:Since chip multiprocessors have dominated the processor market, developing a parallel programming model with proper trade-off between productivity and efficiency become increasingly important. As a typical fine-grain parallelism model, Intel Threading Building Blocks (TBB) simplifies parallel programming by runtime schedule. Despite its simplicity, it costs non-trivial runtime overhead which may increase as the thread counts increase. In this work, we conduct an experiment on real commodity hardware to evaluate performance scalability of TBB using PARSEC benchmark suite. We first compare TBB with Pthreads to show that TBB applications can achieve comparable performance as Pthreads applications. To find the performance bottleneck of TBB applications, we measure the runtime overhead of TBB focused on 3 basic TBB runtime activities. The result provides valuable implications which can be used to develop scalable runtime libraries and architectural support for alleviating performance bottlenecks.
What problem does this paper attempt to address?