Pre-trained Encoder Inference: Revealing Upstream Encoders In Downstream Machine Learning Services

Shaopeng Fu, Xuexue Sun, Ke Qing, Tianhang Zheng, Di Wang
2024-08-06
Abstract: Though pre-trained encoders can be easily accessed online to build downstream machine learning (ML) services quickly, various attacks have been designed to compromise the security and privacy of these encoders. While most attacks target encoders on the upstream side, it remains unknown how an encoder could be threatened when deployed in a downstream ML service. This paper unveils a new vulnerability: the Pre-trained Encoder Inference (PEI) attack, which poses privacy threats to encoders hidden behind downstream ML services. Given only API access to a targeted downstream service and a set of candidate encoders, the PEI attack can infer which of the candidate encoders is secretly used by the targeted service. We evaluate the attack performance of PEI against real-world encoders on three downstream tasks: image classification, text classification, and text-to-image generation. Experiments show that the PEI attack succeeds in revealing the hidden encoder in most cases and seldom makes mistakes even when the hidden encoder is not in the candidate set. We also conduct a case study on one of the most recent vision-language models, LLaVA, to illustrate that the PEI attack is useful in assisting other ML attacks such as adversarial attacks. The code is available at https://github.com/fshp971/encoder-inference.
Machine Learning, Cryptography and Security
What problem does this paper attempt to address?
This paper explores a novel privacy threat, the Pre-trained Encoder Inference (PEI) attack, which targets pre-trained encoders deployed inside downstream machine learning services. The goal of the attack is to infer which pre-trained encoder a target service covertly uses, given only API access to that service and to a set of candidate encoders. The paper first introduces background and related work, then describes the attack's working principle, its design framework, and its implementation for different data modalities (such as images and text).

### Main Contributions

1. **Revealing New Vulnerabilities**: Proposes a new attack, PEI, which can infer the pre-trained encoder hidden inside a downstream machine learning service.
2. **General Framework**: Proposes a general black-box attack framework that carries out PEI attacks in a task-agnostic manner.
3. **Experimental Validation**: Conducts experiments on three downstream tasks (image classification, text classification, and text-to-image generation) to validate the effectiveness of the PEI attack.
4. **Case Study**: Demonstrates, through a case study on the multimodal model LLaVA, how PEI attacks can assist adversarial attacks.
5. **Defense Discussion**: Provides a preliminary discussion of potential defenses against PEI attacks.

### Attack Framework

The PEI attack has two stages:

- **PEI Attack Sample Synthesis Stage**: Generates attack samples by minimizing the difference between the embedding of each attack sample and the embedding of its target sample under a given candidate encoder.
- **Hidden Encoder Inference Stage**: Queries the target service with the generated attack samples and measures how similarly it behaves on them and on the target samples, thereby inferring whether the hidden encoder belongs to the candidate set and, if so, which candidate it is.

### Features

- The attacker only needs API-level access to carry out the attack.
- PEI attacks can successfully reveal hidden encoders in most cases with a low false positive rate.
- The cost of implementing the attack is relatively low, approximately a few hundred dollars per candidate encoder.

In summary, this paper examines the privacy threats that pre-trained encoders may face when deployed in downstream services and proposes an effective attack to reveal these hidden encoders. It also demonstrates the attack's effectiveness through experiments and discusses potential defense strategies.
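The two stages described under Attack Framework can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper or its code: the "encoders" are small random tanh layers, the downstream service is a linear classifier that returns only labels, sample synthesis uses coordinate-wise finite differences (so it needs only queries to the candidate encoder), behavior similarity is plain label agreement, and the final decision is a simple margin rule with a hypothetical threshold.

```python
import numpy as np

rng = np.random.default_rng(0)
D_IN, D_EMB, N_CLASSES = 64, 8, 10  # toy sizes, chosen for this sketch only


def make_encoder(seed):
    """Toy stand-in for a pre-trained encoder exposed as a query-only API."""
    w = np.random.default_rng(seed).normal(size=(D_EMB, D_IN)) * 0.5 / np.sqrt(D_IN)
    return lambda x: np.tanh(np.atleast_2d(x) @ w.T)


def make_service(encoder, seed):
    """Toy downstream classifier: hidden encoder plus linear head, labels only."""
    head = np.random.default_rng(seed).normal(size=(N_CLASSES, D_EMB))
    return lambda x: np.argmax(encoder(x) @ head.T, axis=1)


def synthesize(encoder, target, steps=300, lr=0.5, h=1e-4):
    """Stage 1: find x whose embedding under `encoder` matches that of `target`."""
    e_t = encoder(target)[0]
    x = rng.normal(size=D_IN)  # start from noise, unrelated to the target
    eye = np.eye(D_IN)

    def loss(batch):
        # squared embedding distance to the target, one value per row
        return np.sum((encoder(batch) - e_t) ** 2, axis=1)

    for _ in range(steps):
        # central finite differences along every input coordinate (queries only)
        grad = (loss(x + h * eye) - loss(x - h * eye)) / (2 * h)
        x = x - lr * grad
    return x


def behavior_score(service, targets, attack_samples):
    """Stage 2a: fraction of attack samples the service labels like their targets."""
    return float(np.mean(service(attack_samples) == service(targets)))


def infer_encoder(service, targets, samples_per_candidate, margin=0.5):
    """Stage 2b: return the stand-out candidate index, or None if none stands out."""
    scores = np.array(
        [behavior_score(service, targets, s) for s in samples_per_candidate]
    )
    best = int(np.argmax(scores))
    others = np.delete(scores, best)
    # assumed decision rule: best score must beat the mean of the rest by `margin`
    return (best if scores[best] - others.mean() > margin else None), scores


candidates = [make_encoder(seed) for seed in (1, 2, 3, 4)]
targets = rng.normal(size=(30, D_IN))
# attack samples depend only on the candidates, so they are synthesized once
samples = [np.stack([synthesize(enc, t) for t in targets]) for enc in candidates]

service_in = make_service(candidates[2], seed=100)      # hidden encoder is candidate 2
service_out = make_service(make_encoder(99), seed=100)  # hidden encoder not in the set

guess_in, scores_in = infer_encoder(service_in, targets, samples)
guess_out, scores_out = infer_encoder(service_out, targets, samples)
print("hidden encoder in set:    ", guess_in, scores_in)
print("hidden encoder not in set:", guess_out, scores_out)
```

The sketch synthesizes attack samples once per candidate and reuses them against any number of target services, which is one reason the per-candidate cost can stay bounded. Because the input dimension is much larger than the embedding dimension, a sample that matches its target under one toy encoder generally does not match it under the others, and that gap is what the label-agreement score picks up.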