Exploring Hardware Transaction Processing for Reliable Computing in Chip-Multiprocessors Against Soft Errors

Chuanlei Zheng,Parijat Shukla,Shuai Wang,Jie S. Hu
DOI: https://doi.org/10.1109/dft.2012.6378206
2012-01-01
Abstract:With shrinking transistor feature size, lowering nodal capacitance and supply voltage at new technology generations, microprocessors are becoming more vulnerable to single-event upsets and transients, a.k.a., soft errors. While chip-multiprocessor (CMP) architecture has been employed in mainstream microprocessors and the number of on-chip processor cores keeps increasing, the system-level reliability of chip-multiprocessors is degrading reversely proportional to the core number. In this work, we propose to exploit abundant on-chip processor cores for redundant hardware transaction processing, which provides native support for error detection and recovery in transactional chip-multiprocessors (TxCMPs) against soft errors. The proposed transactional processor cores execute everything as transactions and TxCMPs execute redundant transactions on different cores. To alleviate the performance overhead due to transaction commits, we further propose two architectural optimizations, namely early partial commit packet transmission and speculative transaction execution in reliable computing mode. Our experimental evaluation confirms the effectiveness of our optimized TxCMPs in achieving low cost reliable computing against soft errors.
What problem does this paper attempt to address?