Abstract: This study investigates the impact of transient hardware faults and algorithmic inaccuracies on safety‐critical DNN misclassifications. Specifically, it offers a more comprehensive understanding of the factors that influence the likelihood of safety‐critical misclassifications across different DNN models. Our findings highlight that transient hardware faults pose a greater risk of causing safety‐critical misclassifications than intrinsic algorithmic inaccuracies.

Summary

Deep neural networks (DNNs) are now widely deployed in safety‐critical applications such as autonomous vehicles, healthcare, and space systems. Inherent algorithmic inaccuracies remain a prevalent cause of misclassifications, even in modern DNNs. At the same time, the ongoing effort to shrink the footprint of contemporary chip designs continually raises the likelihood of transient hardware faults in deployed DNN models. Consequently, researchers have asked to what extent these faults contribute to DNN misclassifications compared to algorithmic inaccuracies. This article examines the impact of DNN misclassifications caused by transient hardware faults and by intrinsic algorithmic inaccuracies in safety‐critical applications. First, we enhance a state‐of‐the‐art fault injector for TensorFlow applications, TensorFI, so that it supports scalable fault injection into modern non‐sequential DNN models. Second, we analyse the DNN‐inferred outcomes using our defined safety‐critical metrics. Finally, we conduct extensive fault injection experiments and a comprehensive analysis with two objectives: (1) to investigate the impact of different target class groupings on DNN failures and (2) to pinpoint the most vulnerable bit locations within tensors, as well as the DNN layers accountable for the majority of safety‐critical misclassifications.
Across the different class groupings, our findings reveal that failures induced by transient hardware faults can have a substantially greater impact on safety‐critical applications (with a probability up to 4× higher) than failures resulting from algorithmic inaccuracies. Additionally, our investigation demonstrates that higher‐order bit positions in tensors, as well as the initial and final layers of DNNs, should be prioritized for protection over other regions.
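
To make the point about bit positions concrete, the following is a minimal sketch of a single bit flip in a float32 tensor element, written with NumPy only. It does not use the TensorFI API and is not the injector described above; the function names `flip_bit` and `inject_fault`, and the choice of float32, are assumptions made purely for illustration.

```python
import numpy as np


def flip_bit(value, bit):
    """Return `value` (stored as float32) with one bit of its encoding flipped.

    Bit 0 is the least significant mantissa bit, bits 23-30 hold the
    exponent, and bit 31 is the sign bit (IEEE 754 single precision).
    """
    if not 0 <= bit < 32:
        raise ValueError("bit must be in [0, 31]")
    # Reinterpret the float32 storage as an unsigned 32-bit integer.
    as_int = np.array([value], dtype=np.float32).view(np.uint32)
    as_int ^= np.uint32(1 << bit)  # flip exactly one bit
    return float(as_int.view(np.float32)[0])


def inject_fault(tensor, flat_index, bit):
    """Copy `tensor` and flip one bit of the element at `flat_index`.

    Hypothetical helper: emulates a transient single-bit fault in one
    tensor element and leaves the input tensor untouched.
    """
    faulty = np.array(tensor, dtype=np.float32, order="C", copy=True)
    flat = faulty.reshape(-1)  # a view, because `faulty` is C-contiguous
    flat[flat_index] = flip_bit(flat[flat_index], bit)
    return faulty


if __name__ == "__main__":
    activation = 0.5
    # Compare a low mantissa bit, a high exponent bit, and the sign bit.
    for bit in (0, 30, 31):
        corrupted = flip_bit(activation, bit)
        print(f"bit {bit:2d}: {activation} -> {corrupted!r}")
```

In this sketch, flipping bit 0 of 0.5 changes the value by about 6e-8, whereas flipping bit 30 (the most significant exponent bit) turns it into 2^127. This is consistent with the observation that higher‐order bit positions are the ones that most need protection, although the actual vulnerability in a given model also depends on the layer and the data type in use.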
