3 System-level Dependability for Multicore and Real-time Systems
Liying Li, Tongquan Wei, Junlong Zhou, Mingsong Chen, Sharon Hu, Zhongsheng Chen, Ying Zhang, Zebo Peng, Jianhui Jiang, Muhammad Shafique, Joerg Henkel
2019-01-01
Abstract: 11:00 10.3.1 IDENTIFYING THE MOST RELIABLE COLLABORATIVE WORKLOAD DISTRIBUTION IN HETEROGENEOUS DEVICES
Speaker: Paolo Rech, UFRGS, BR
Authors: Gabriel Piscoya Dávila, Daniel Oliveira, Philippe Navaux and Paolo Rech, UFRGS, BR
Abstract: The constant need for higher performance and reduced power consumption has led vendors to design heterogeneous devices that embed a traditional CPU and an accelerator, such as a GPU or FPGA. When the CPU and the accelerator are used collaboratively, the device's computational performance reaches its peak. However, the larger amount of resources employed for computation has, potentially, the side effect of increasing the soft error rate. In this paper, we evaluate the reliability behaviour of AMD Kaveri Accelerated Processing Units executing a set of heterogeneous applications. We distribute the workload between the CPU and GPU and evaluate which configuration provides the lowest error rate or allows the computation of the highest amount of data before experiencing a failure. We show that, in most cases, the most reliable workload distribution is the one that delivers the highest performance. As experimentally proven, choosing the correct workload distribution can increase device reliability by up to 9x.
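The two selection criteria named in the abstract (lowest error rate versus most data computed before a failure) can point to different workload distributions. The following is a minimal sketch of that comparison, not the authors' method. The `Config` type, the function names, and every number in `configs` are hypothetical placeholders chosen for illustration, not measurements from the paper. The sketch also assumes failures arrive at a constant rate, so that the mean time between failures is the inverse of the error rate.

```python
from dataclasses import dataclass


@dataclass
class Config:
    """One CPU/GPU workload split (all fields are hypothetical)."""
    gpu_share: float          # fraction of the workload assigned to the GPU
    exec_time_s: float        # seconds needed to process one unit of data
    failures_per_hour: float  # observed error rate for this split


def work_before_failure(cfg: Config) -> float:
    """Expected units of data processed between two failures.

    Assumes a constant failure rate: MTBF = 1 / error rate.
    """
    mtbf_s = 3600.0 / cfg.failures_per_hour
    return mtbf_s / cfg.exec_time_s


def lowest_error_rate(cfgs: list[Config]) -> Config:
    """Pick the split with the fewest failures per hour."""
    return min(cfgs, key=lambda c: c.failures_per_hour)


def most_work_before_failure(cfgs: list[Config]) -> Config:
    """Pick the split that processes the most data per failure."""
    return max(cfgs, key=work_before_failure)


# Made-up illustrative values, NOT data from the paper.
configs = [
    Config(gpu_share=0.0, exec_time_s=10.0, failures_per_hour=2.0),
    Config(gpu_share=0.5, exec_time_s=4.0, failures_per_hour=3.0),
    Config(gpu_share=1.0, exec_time_s=6.0, failures_per_hour=2.5),
]

for c in configs:
    print(f"GPU share {c.gpu_share:.1f}: "
          f"{work_before_failure(c):.0f} units per failure")
print("Lowest error rate at GPU share:",
      lowest_error_rate(configs).gpu_share)
print("Most work before failure at GPU share:",
      most_work_before_failure(configs).gpu_share)
```

With these placeholder values, the CPU-only split has the lowest raw error rate, while the balanced split completes the most work per failure because it is also the fastest. This mirrors, under the stated assumptions only, the abstract's observation that the most reliable distribution is often the best-performing one.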