Set-Valued Sensitivity Analysis of Deep Neural Networks

Xin Wang, Feiling Wang, Xuegang Ban
2024-12-15
Abstract: This paper proposes a sensitivity analysis framework based on set-valued mapping for deep neural networks (DNNs) to understand and compute how the solutions (model weights) of a DNN respond to perturbations in the training data. As a DNN may not have a unique solution (minimum), and the algorithm used to train it may reach different solutions under minor perturbations to the input data, we focus on the sensitivity of the solution set of the DNN instead of studying a single solution. In particular, we are interested in the expansion and contraction of the set in response to data perturbations. If the change in the solution set can be bounded by the extent of the data perturbation, the model is said to exhibit the Lipschitz-like property. This "set-to-set" analysis approach provides a deeper understanding of the robustness and reliability of DNNs during training. Our framework incorporates both isolated and non-isolated minima and, critically, does not require the assumption that the Hessian of the loss function is non-singular. By developing set-level metrics such as distance between sets, convergence of sets, derivatives of set-valued mappings, and stability across the solution set, we prove that the solution set of the Fully Connected Neural Network holds Lipschitz-like properties. For general neural networks (e.g., ResNet), we introduce a graphical-derivative-based method to estimate the new solution set following data perturbation without retraining.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses the sensitivity analysis of the solution set (i.e., the model weights) of deep neural networks (DNNs) under perturbations of the training data. Specifically, it focuses on:

1. **Sensitivity of multi-valued solutions**: Since a DNN may have multiple local minima, and these minima may form a manifold, traditional single-point sensitivity analysis is not sufficient to describe the behavior of the DNN. The paper proposes a framework based on set-valued mapping to study how the solution set responds to training data perturbation.
2. **Lipschitz-like property**: The paper explores whether the change of the solution set can be bounded by the size of the data perturbation, that is, whether the solution set has the Lipschitz-like property. If it does, the model is robust and reliable during training in the sense that small data perturbations produce only proportionally small changes in the solution set.
3. **No need for the non-singular Hessian assumption**: Traditional methods usually assume that the Hessian matrix of the loss function is non-singular, but this often does not hold for DNNs. The method proposed in the paper does not require this assumption, so it is better suited to DNNs in practice.
4. **Estimation of new solution sets**: When the training data is perturbed, the question is how to estimate the new solution set without retraining. The paper introduces a method based on the graphical derivative to estimate the new solution set, which applies to general neural networks (such as ResNet).

### Formula summary

- **Definition of solution set**: \[ S(x)=\left\{\hat{w} \mid \hat{w}=\arg \min_{w \in W} \frac{1}{n} \sum_{i = 1}^{n} L(x_i,y_i,w)\right\} \]
- **Lipschitz-like property**: \[ S(x')\cap U\subset S(x)+\kappa\|x - x'\|B,\quad \forall x',x\in V \] where \(B\) is the closed unit ball, \(\kappa\) is the Lipschitz modulus, and \(U\) and \(V\) are neighborhoods of the reference solution \(\bar{w}\) and the reference data \(\bar{x}\), respectively.
- **Graphical derivative**: \[ DS(\bar{x}|\bar{w})(\mu)=\left\{v \mid \nabla_w R(\bar{x},\bar{w})v+\nabla_{x_k} R(\bar{x},\bar{w})\mu = 0\right\} \]
- **Estimation of new solution sets**: \[ S(x_p)\approx\bar{w}+DS(\bar{x}|\bar{w})(\Delta x) \]

Together, these definitions and formulas give the paper a set-level way to understand and quantify how the solutions of a DNN change under training data perturbation.
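
The graphical-derivative estimate can be illustrated on a toy problem. The sketch below is not taken from the paper: it assumes a hypothetical two-weight model with prediction \((w_1+w_2)x\), a single scalar training point \((x, y)\), a squared loss, and \(R\) taken to be the gradient of the loss with respect to \(w\). The minima form the line \(w_1+w_2=y/x\), so they are non-isolated and the Hessian is singular. The function names `jacobians` and `graphical_derivative` are illustrative only.

```python
import numpy as np

def jacobians(x, y, w):
    """Jacobians of R(x, w) = grad_w of 0.5*((w1 + w2)*x - y)**2 (toy model, an assumption)."""
    s = w[0] + w[1]
    grad_w = x**2 * np.ones((2, 2))          # Hessian of the loss; rank 1, hence singular
    grad_x = (2.0 * s * x - y) * np.ones(2)  # derivative of R with respect to the data x
    return grad_w, grad_x

def graphical_derivative(grad_w, grad_x, mu, tol=1e-10):
    """Solve grad_w @ v + grad_x * mu = 0 for v; return (v0, null_basis)."""
    # Minimum-norm particular solution; works even though grad_w is singular.
    v0, *_ = np.linalg.lstsq(grad_w, -grad_x * mu, rcond=None)
    # Null space of grad_w: directions that stay inside the solution set.
    _, sing, vt = np.linalg.svd(grad_w)
    null_basis = vt[sing < tol * sing.max()].T
    return v0, null_basis

x_bar, y = 2.0, 4.0
w_bar = np.array([1.5, 0.5])   # one minimizer: w1 + w2 = y / x_bar
dx = 0.1                       # perturbation of the training input

grad_w, grad_x = jacobians(x_bar, y, w_bar)
v0, null_basis = graphical_derivative(grad_w, grad_x, dx)

# Estimated new solution set: w_bar + v0 + t * null_basis for any real t.
estimates = [w_bar + v0 + t * null_basis[:, 0] for t in (-1.0, 0.0, 1.0)]
true_sum = y / (x_bar + dx)    # exact new solution set is {w : w1 + w2 = true_sum}
errors = [abs(w.sum() - true_sum) for w in estimates]

print("estimated w1 + w2:", estimates[1].sum())
print("exact w1 + w2:    ", true_sum)
print("max error:        ", max(errors))
```

In this toy case the estimated set is again a line parallel to the original solution set, and its offset from the exact new set shrinks quadratically with `dx`, as expected of a first-order estimate. No matrix inverse is needed, which is why a least-squares solve plus a null-space basis is used here instead of inverting the Hessian.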