Similarity-Based Label Inference Attack Against Training and Inference of Split Learning

Junlin Liu, Xinchen Lyu, Qimei Cui, Xiaofeng Tao
DOI: https://doi.org/10.1109/tifs.2024.3356821
IF: 7.231
2024-02-06
IEEE Transactions on Information Forensics and Security
Abstract: Split learning is a promising paradigm for privacy-preserving distributed learning. The learning model is cut into multiple portions that the participants train collaboratively, exchanging only the intermediate results at the cut layer. It is crucial to understand the security performance of split learning, particularly for privacy-sensitive applications. This paper shows that the exchanged intermediate results, namely the smashed data (i.e., features extracted from the raw data) and the gradients exchanged during training and inference of split learning, can already reveal the private labels. We mathematically analyze the potential label leakage and propose cosine and Euclidean similarity measurements for gradients and smashed data. The two similarity measurements are shown to be unified in Euclidean space. Leveraging the similarity metric, we design three label inference attacks that efficiently recover the private labels during both the training and inference phases. Experimental results show that the proposed attacks achieve close to 100% label inference accuracy. Furthermore, the proposed attacks remain effective against various state-of-the-art defense mechanisms, including DP-SGD, label differential privacy, gradient compression, and Marvell.
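The abstract's remark that cosine and Euclidean similarity are unified in Euclidean space can be illustrated with a standard identity: for unit-length vectors a and b, ||a - b||^2 = 2 - 2 cos(a, b), so ranking by cosine similarity and ranking by Euclidean distance agree after normalization. The sketch below only checks that identity on made-up vectors. The vectors `g1`, `g2`, `g3` and all function names are hypothetical stand-ins for gradients or smashed data, and this is not the paper's attack.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def normalize(v):
    """Scale v to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def squared_euclidean(a, b):
    """Squared Euclidean distance between a and b."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Hypothetical vectors (assumption: illustrative values, not data from the paper).
g1 = [0.9, -0.2, 0.4]
g2 = [1.1, -0.1, 0.3]    # points in roughly the same direction as g1
g3 = [-0.8, 0.5, -0.6]   # points in roughly the opposite direction

for other in (g2, g3):
    cos = cosine_similarity(g1, other)
    dist_sq = squared_euclidean(normalize(g1), normalize(other))
    # Identity for unit vectors: ||a - b||^2 = 2 - 2 * cos(a, b)
    assert math.isclose(dist_sq, 2.0 - 2.0 * cos, abs_tol=1e-12)
```

Under this identity, a higher cosine similarity always corresponds to a smaller distance between the normalized vectors, so either measure yields the same nearest-neighbor ordering once the vectors are normalized.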
Categories: Computer Science, Theory & Methods; Engineering, Electrical & Electronic