Forensicability of Deep Neural Network Inference Pipelines

Alexander Schlögl, Tobias Kupek, Rainer Böhme
DOI: https://doi.org/10.48550/arXiv.2102.00921
2021-02-18
Abstract: We propose methods to infer properties of the execution environment of machine learning pipelines by tracing characteristic numerical deviations in observable outputs. Results from a series of proof-of-concept experiments obtained on local and cloud-hosted machines give rise to possible forensic applications, such as the identification of the hardware platform used to produce deep neural network predictions. Finally, we introduce boundary samples that amplify the numerical deviations in order to distinguish machines by their predicted label only.
Machine Learning, Cryptography and Security, Multimedia
What problem does this paper attempt to address?
This paper attempts to solve the following problem: **inferring the characteristics of the execution environment of a deep neural network inference pipeline by analyzing its outputs, which enables forensic analysis of the machine-learning inference process**. Specifically, the authors propose a method that traces the characteristic numerical deviations arising during deep neural network (DNN) inference in order to infer the hardware platform and software configuration that produced a prediction. The method can be applied in the following scenarios:

1. **Improve transparency**: Verify the environment in which automated decisions are made.
2. **Service verification**: Check that the technology platform a machine-learning-as-a-service (MLaaS) provider actually uses is the one the user rented.
3. **Trace the generative model**: For example, trace the output of generative models such as DeepFakes back to its source, or at least narrow down the range of possible sources.

### Main research contents

1. **Identify the complete inference pipeline**:
   - Comparing inference outputs across architectures shows that even different processors can be uniquely identified from a small number of samples.
2. **Impact of the execution plan**:
   - Analyze how the preparation stage of the inference pipeline (such as the generation of the computation plan) affects the final output. Computation plans generated on different architectures produce different results even when executed on the same architecture.
3. **Impact of model and input attributes**:
   - Study how model complexity (such as the number and type of convolutional layers) and input data characteristics (such as size) affect the characteristic traces. More complex models leave more traces.
4. **Amplify forensic information**:
   - Use adversarial sample generation methods (such as the iterative fast gradient sign method, FGSM) to generate boundary samples, so that the traces propagate from real-valued outputs to class labels, thereby amplifying forensically relevant information.

### Experimental results

- **Identification of different architectures**: The experiments verify that different hardware and software configurations can be effectively distinguished.
- **Impact of the computation plan**: The computation plan generation stage has a significant impact on the results, especially for the optimized TFLite computation plan.
- **Impact of model complexity**: More complex models (such as multi-layer convolutional networks) are more likely to produce characteristic traces.
- **Effect of boundary samples**: Boundary samples cause different architectures to predict different class labels for the same input, which verifies the effectiveness of the method.

### Conclusions and prospects

This research shows that forensic analysis of the deep neural network inference pipeline is feasible. It also proposes directions for future research, including extending the approach to GPUs and embedded devices and exploring the forensic analysis of generative models.
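To make the idea of characteristic numerical deviations concrete, the following is a minimal sketch, not the authors' implementation. It assumes one plausible source of deviation: floating-point addition is not associative, so two kernels that accumulate the same dot product in a different order can return outputs that differ in the last bits. The names `dot_forward`, `dot_backward`, and `fingerprint` are hypothetical and chosen only for illustration.

```python
import hashlib
import struct


def dot_forward(weights, inputs):
    """Accumulate the dot product from the first element to the last."""
    acc = 0.0
    for w, x in zip(weights, inputs):
        acc += w * x
    return acc


def dot_backward(weights, inputs):
    """Accumulate the same products from the last element to the first."""
    acc = 0.0
    for w, x in zip(reversed(weights), reversed(inputs)):
        acc += w * x
    return acc


def fingerprint(outputs):
    """Hash the raw IEEE 754 bit patterns, so last-bit differences count."""
    raw = struct.pack(f"<{len(outputs)}d", *outputs)
    return hashlib.sha256(raw).hexdigest()[:16]


# Toy "layer" with three weights; both "machines" see identical inputs.
weights = [0.1, 0.2, 0.3]
inputs = [1.0, 1.0, 1.0]

out_a = [dot_forward(weights, inputs)]   # hypothetical machine A
out_b = [dot_backward(weights, inputs)]  # hypothetical machine B

print(out_a[0])  # 0.6000000000000001
print(out_b[0])  # 0.6
print(fingerprint(out_a) != fingerprint(out_b))  # True
```

The two results are mathematically equal but differ by about 1e-16, and hashing the raw bits turns that difference into a distinguishable fingerprint. In a real pipeline the deviations would come from the actual hardware and software stack rather than from a deliberately reversed loop.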
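The boundary-sample idea summarized above can be sketched in the same spirit. The code below is a simplified stand-in for the iterative sign-gradient procedure, not the paper's exact algorithm: it assumes a hypothetical linear two-class model, moves the input along the sign of the gradient of the logit margin, and halves the step whenever the margin changes sign. The tiny offset `eps` that models the deviation between two machines is also an assumption made for illustration.

```python
def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def margin(w0, w1, x):
    """Logit of class 0 minus logit of class 1 for a linear two-class model."""
    return sum((a - b) * xi for a, b, xi in zip(w0, w1, x))


def boundary_sample(w0, w1, x, step=0.1, tol=1e-9, max_iter=500):
    """Push x toward the decision boundary with sign-gradient steps."""
    x = list(x)
    # For a linear model the gradient of the margin w.r.t. x is w0 - w1.
    grad_sign = [sign(a - b) for a, b in zip(w0, w1)]
    prev = margin(w0, w1, x)
    for _ in range(max_iter):
        m = margin(w0, w1, x)
        if abs(m) < tol:
            break  # close enough to the boundary
        if sign(m) != sign(prev):
            step /= 2  # overshot the boundary, so refine the step
        prev = m
        # Step against the current side of the boundary.
        x = [xi - step * sign(m) * g for xi, g in zip(x, grad_sign)]
    return x


# Hypothetical weights and starting input (illustrative values only).
w0 = [0.8, -1.2, 0.4]
w1 = [-0.2, 0.8, -0.1]
x0 = [0.3, -0.1, 0.8]

x_b = boundary_sample(w0, w1, x0)
m = margin(w0, w1, x_b)

# Model two machines whose logits deviate by a tiny amount (assumption).
eps = 1e-9
label_machine_a = 0 if m + eps > 0 else 1
label_machine_b = 0 if m - eps > 0 else 1
print(label_machine_a, label_machine_b)  # 0 1
```

At the starting input the margin is about 0.9, so a deviation of 1e-9 could never change the label. At the boundary sample the margin is smaller than the deviation, so the two simulated machines disagree on the predicted class, which is how a real-valued trace becomes visible in the label alone.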