Identifying and Exploiting Structures for Reliable Deep Learning

Amartya Sanyal
DOI: https://doi.org/10.48550/arXiv.2108.07083
2021-08-16
Abstract: Deep learning research has recently witnessed an impressively fast-paced progress in a wide range of tasks including computer vision, natural language processing, and reinforcement learning. The extraordinary performance of these systems often gives the impression that they can be used to revolutionise our lives for the better. However, as recent works point out, these systems suffer from several issues that make them unreliable for use in the real world, including vulnerability to adversarial attacks (Szegedy et al. [248]), tendency to memorise noise (Zhang et al. [292]), being over-confident on incorrect predictions (miscalibration) (Guo et al. [99]), and unsuitability for handling private data (Gilad-Bachrach et al. [88]). In this thesis, we look at each of these issues in detail, investigate their causes, and propose computationally cheap algorithms for mitigating them in practice. To do this, we identify structures in deep neural networks that can be exploited to mitigate the above causes of unreliability of deep learning algorithms.
Machine Learning
What problem does this paper attempt to address?
This paper aims to improve the reliability of deep learning models. It focuses on four problems:

1. **Vulnerability to adversarial attacks**: Small but carefully designed perturbations of the input (adversarial examples) can cause a deep learning model to predict incorrectly (Szegedy et al. [248]).
2. **Tendency to memorise noise**: Deep learning models tend to memorise noise in the training data, which harms their generalization in the real world (Zhang et al. [292]).
3. **Over-confident predictions**: Deep learning models are often over-confident on incorrect predictions, that is, the confidence they report does not match their actual accuracy (Guo et al. [99]).
4. **Unsuitability for handling private data**: Deep learning models face challenges with privacy-sensitive data, for example, how to perform efficient inference while protecting user privacy (Gilad-Bachrach et al. [88]).

To address these problems, the author proposes the following:

- **Improve generalization through stable rank normalization**: Minimizing the stable rank of each weight matrix in the neural network reduces the model's tendency to memorise noise without affecting its performance on clean data (Chapter 4).
- **The impact of label noise and representation learning on adversarial robustness**: The author proves that memorising label noise or learning incorrect representations makes adversarial robustness impossible to achieve, and presents corresponding theoretical and experimental results (Chapter 5).
- **Enhance adversarial robustness with low-rank representations**: Low-rank priors on the representation space increase the robustness of neural networks to adversarial perturbations without affecting accuracy (Chapter 6).
- **Improve calibration using the focal loss**: The focal loss, which weights the losses of different samples differently, is proposed to alleviate the miscalibration of deep learning models (Chapter 7).
- **Accelerate encrypted prediction**: A new framework, Encrypted Prediction as a Service (EPAAS), is defined, and fully homomorphic encryption (FHE) is combined with binary neural networks (BNN) to meet both the computational and the privacy requirements (Chapter 8).

Through these methods, the author aims to improve the reliability and security of deep learning models in real-world applications.
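Two of the quantities named above have compact standard definitions, sketched here in NumPy for illustration. The summary itself gives no formulas, so this assumes the commonly used ones: the stable rank of a matrix is its squared Frobenius norm divided by its squared spectral norm, and the focal loss scales the cross-entropy of each sample by `(1 - p)^gamma`, where `p` is the probability assigned to the true class. The function names and the default `gamma` are hypothetical, and this is not the thesis's implementation.

```python
import numpy as np


def stable_rank(weight):
    """Stable rank of a non-zero 2-D matrix.

    Assumed definition: ||W||_F^2 / ||W||_2^2, which lies between 1 and rank(W).
    """
    frobenius_sq = np.sum(weight ** 2)
    spectral = np.linalg.norm(weight, ord=2)  # largest singular value
    return float(frobenius_sq / spectral ** 2)


def focal_loss(probs, labels, gamma=2.0):
    """Mean focal loss for predicted class probabilities.

    probs:  array of shape (n_samples, n_classes), rows summing to 1.
    labels: integer array of shape (n_samples,) with the true classes.
    gamma:  focusing parameter; gamma = 0 recovers plain cross-entropy.
    """
    p_true = probs[np.arange(len(labels)), labels]
    p_true = np.clip(p_true, 1e-12, 1.0)  # avoid log(0)
    # Confidently correct samples (p_true near 1) are down-weighted.
    return float(np.mean(-((1.0 - p_true) ** gamma) * np.log(p_true)))
```

For instance, `stable_rank(np.eye(3))` evaluates to 3 and any rank-one matrix has stable rank 1, while `focal_loss(..., gamma=0.0)` coincides with the usual cross-entropy, which makes the effect of `gamma` easy to compare.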