Abstract: With the rapid evolution of embedded deep-learning computing systems, applications powered by deep learning are moving from the cloud to the edge. When neural networks (NNs) are deployed onto devices in complex environments, various types of faults are possible: soft errors caused by cosmic radiation and radioactive impurities, voltage instability, aging, temperature variations, and malicious attackers. Thus the safety risk of deploying NNs is now drawing much attention. In this paper, after analyzing the possible faults in various types of NN accelerators, we formalize and implement various fault models from the algorithmic perspective. We propose Fault-Tolerant Neural Architecture Search (FT-NAS) to automatically discover convolutional neural network (CNN) architectures that are reliable under the various faults in present-day devices. We then incorporate fault-tolerant training (FTT) into the search process to achieve better results, which we refer to as FTT-NAS. Experiments on CIFAR-10 show that the discovered architectures significantly outperform manually designed baseline architectures, with comparable or fewer floating-point operations (FLOPs) and parameters. Specifically, with the same fault settings, F-FTT-Net discovered under the feature fault model achieves an accuracy of 86.2% (vs. 68.1% achieved by MobileNet-V2), and W-FTT-Net discovered under the weight fault model achieves an accuracy of 69.6% (vs. 60.8% achieved by ResNet-20). By inspecting the discovered architectures, we find that the operation primitives, the weight quantization range, the capacity of the model, and the connection pattern influence the fault resilience of NN models.

What problem does this paper attempt to address?

This paper addresses the faults that neural networks can encounter when deployed on edge devices. With the rapid development of embedded deep-learning computing systems, deep-learning-based applications are migrating from the cloud to the edge. However, neural networks deployed in complex environments are exposed to several types of faults: soft errors caused by cosmic radiation and radioactive impurities, voltage instability, aging, temperature variations, and malicious attackers. These faults threaten the safety of deployed neural networks. The paper therefore analyzes the possible faults in different types of neural network accelerators, then formalizes and implements several fault models from an algorithmic perspective. The authors propose Fault-Tolerant Neural Architecture Search (FT-NAS) to automatically discover convolutional neural network (CNN) architectures that are reliable under the various faults in current devices, and they incorporate fault-tolerant training (FTT) into the search process, an approach they call FTT-NAS. Experiments on CIFAR-10 show that the discovered architectures significantly outperform manually designed baseline architectures, with comparable or fewer floating-point operations (FLOPs) and parameters. Specifically, under the same fault settings, F-FTT-Net, discovered under the feature fault model, achieves an accuracy of 86.2% (versus 68.1% for MobileNet-V2), and W-FTT-Net, discovered under the weight fault model, achieves an accuracy of 69.6% (versus 60.8% for ResNet-20). By inspecting the discovered architectures, the authors find that the operation primitives, the weight quantization range, the model capacity, and the connection pattern all influence the fault resilience of neural network models.
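To make the idea of an algorithmic fault model more concrete, the following is a minimal sketch of injecting random bit flips into quantized values. It is not the paper's implementation: the text above does not specify how faults are modeled, so the 8-bit unsigned representation, the independent per-bit flip probability, and the names `flip_bits`, `inject_faults`, `n_bits`, and `flip_prob` are all assumptions made for illustration.

```python
import random


def flip_bits(value, n_bits, flip_prob, rng):
    """Flip each of the low n_bits of an unsigned integer independently.

    Assumption: every bit fails independently with probability flip_prob.
    """
    for bit in range(n_bits):
        if rng.random() < flip_prob:
            value ^= 1 << bit  # toggle this bit
    return value


def inject_faults(quantized, n_bits=8, flip_prob=0.01, seed=0):
    """Return a faulty copy of a list of unsigned n_bits-wide integers.

    Hypothetical helper: `quantized` stands in for quantized weights or
    feature-map values; the input list is not modified.
    """
    rng = random.Random(seed)  # seeded for reproducible experiments
    return [flip_bits(v, n_bits, flip_prob, rng) for v in quantized]


if __name__ == "__main__":
    clean = [0, 17, 127, 200, 255]
    faulty = inject_faults(clean, flip_prob=0.05, seed=42)
    print("clean :", clean)
    print("faulty:", faulty)
```

In a setup like this, evaluating a network on the faulty values rather than the clean ones would give a rough estimate of accuracy under faults, and doing the same during training is one plausible way to read "fault-tolerant training".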

FTT-NAS: Discovering Fault-Tolerant Convolutional Neural Architecture
