Multi-Threading Performance on Commodity Multi-core Processors

Jie Chen, W. Watson, W. Mao
2007-01-01
Abstract: Commodity servers based on multi-core processors have recently become building blocks for high-performance computing Linux clusters. Multi-core processors deliver better performance-to-cost ratios than their single-core predecessors through on-chip multi-threading. However, they present challenges in developing high-performance multi-threaded code. In this paper we study the performance of different software barrier algorithms on servers based on Intel Xeon and AMD Opteron multi-core processors. In particular, we explore how different memory subsystems, such as shared bus or ccNUMA, and their cache coherence protocols affect the performance of barrier algorithms. In addition, we compare multi-threading software overhead between OpenMP directives and a locally developed threading library that utilizes optimized barrier algorithms along with low-overhead locking primitives. We find that OpenMP implementations provide high-performance run-time libraries coupled with excellent compiler directives, with overhead only slightly higher than that of the carefully optimized library.
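For readers unfamiliar with the term, one widely known software barrier is the centralized sense-reversing barrier. The following is a minimal sketch of it in C++, offered only as an illustration; the abstract does not say which barrier algorithms the paper evaluates, and the names `SenseBarrier` and `count_violations` are hypothetical and not taken from the authors' library.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Illustrative centralized sense-reversing barrier (an assumption for
// exposition, not the paper's implementation).
class SenseBarrier {
public:
    explicit SenseBarrier(int n) : total_(n), remaining_(n), sense_(false) {}

    // Each thread owns a local_sense flag, initially false.
    void wait(bool& local_sense) {
        local_sense = !local_sense;  // flip the sense for this phase
        if (remaining_.fetch_sub(1, std::memory_order_acq_rel) == 1) {
            // Last arrival: reset the counter, then release the waiters.
            remaining_.store(total_, std::memory_order_relaxed);
            sense_.store(local_sense, std::memory_order_release);
        } else {
            // Spin on the shared flag until the last arrival flips it.
            while (sense_.load(std::memory_order_acquire) != local_sense) {
                std::this_thread::yield();
            }
        }
    }

private:
    const int total_;
    std::atomic<int> remaining_;
    std::atomic<bool> sense_;
};

// Hypothetical helper: runs `phases` barrier episodes on `threads` threads
// and counts how often a thread left the barrier before all had arrived.
int count_violations(int threads, int phases) {
    SenseBarrier barrier(threads);
    std::atomic<int> arrived{0};
    std::atomic<int> violations{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            bool local_sense = false;
            for (int p = 0; p < phases; ++p) {
                arrived.fetch_add(1);
                barrier.wait(local_sense);
                // After phase p, at least threads * (p + 1) arrivals exist.
                if (arrived.load() < threads * (p + 1)) {
                    violations.fetch_add(1);
                }
            }
        });
    }
    for (auto& th : pool) th.join();
    return violations.load();
}
```

In this sketch every waiting thread spins on the same shared flag, so each release invalidates a cache line held by all waiters. That kind of traffic is one way the memory subsystem and its cache coherence protocol can influence barrier cost.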