Efficient GPU Hardware Transactional Memory Through Early Conflict Resolution

Sui Chen, Lu Peng
DOI: https://doi.org/10.1109/HPCA.2016.7446071
2016-01-01
Abstract: In recent years, it has been proposed that Transactional Memory be added to Graphics Processing Units (GPUs). One proposed hardware design, Warp TM, can scale to thousands of concurrent transactions. As a programming method that can make accesses to an arbitrary number of memory locations atomic and greatly reduce the effort of programming parallel applications, transactional memory handles the complexity of inter-thread synchronization. However, when thousands of transactions run concurrently on a GPU, conflicts and resource contention arise, causing performance loss. In this paper, we identify and analyze the causes of conflicts and contention and propose two enhancements that try to resolve conflicts early: (1) Early-Abort global conflict resolution, which allows conflicts to be detected before they reach the Commit Units so that contention in the Commit Units is reduced, and (2) a Pause-and-Go execution scheme, which reduces the chance of conflict and the performance penalty of re-executing long transactions. These two enhancements are enabled by a single hardware modification. Our evaluation shows that the combination of the two enhancements greatly improves overall execution speed while reducing energy consumption.