SVIP: Towards Verifiable Inference of Open-source Large Language Models

Yifan Sun, Yuhang Li, Yue Zhang, Yuchen Jin, Huan Zhang
2024-10-30
Abstract: Open-source Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language understanding and generation, leading to widespread adoption across various domains. However, their increasing model sizes render local deployment impractical for individual users, pushing many to rely on computing service providers for inference through a blackbox API. This reliance introduces a new risk: a computing provider may stealthily substitute the requested LLM with a smaller, less capable model without consent from users, thereby delivering inferior outputs while benefiting from cost savings. In this paper, we formalize the problem of verifiable inference for LLMs. Existing verifiable computing solutions based on cryptographic or game-theoretic techniques are either computationally uneconomical or rest on strong assumptions. We introduce SVIP, a secret-based verifiable LLM inference protocol that leverages intermediate outputs from LLM as unique model identifiers. By training a proxy task on these outputs and requiring the computing provider to return both the generated text and the processed intermediate outputs, users can reliably verify whether the computing provider is acting honestly. In addition, the integration of a secret mechanism further enhances the security of our protocol. We thoroughly analyze our protocol under multiple strong and adaptive adversarial scenarios. Our extensive experiments demonstrate that SVIP is accurate, generalizable, computationally efficient, and resistant to various attacks. Notably, SVIP achieves false negative rates below 5% and false positive rates below 3%, while requiring less than 0.01 seconds per query for verification.
Machine Learning, Artificial Intelligence, Computation and Language, Cryptography and Security
What problem does this paper attempt to address?
The paper addresses the following problem: when open-source large language models (LLMs) are used for inference through a third party, how can a user ensure that the computing service provider actually runs the model the user specified, rather than secretly replacing it with a smaller, less capable model to save costs? The core of the problem is verifying that the provider honestly uses the specified LLM for inference.

### Problem Background

As open-source LLMs have grown more capable, they have come to perform well in natural language understanding and generation and are widely used across domains. They have also grown larger, which makes local deployment impractical for individual users. Many users therefore rely on computing service providers that perform inference behind a black-box API. This dependence introduces a new risk: the provider may quietly replace the requested LLM with a much smaller, less capable model without the user's consent, delivering inferior output while saving costs.

### Problem Definition

To address this problem, the authors formally define the "verifiable inference" problem and propose a secret-based verifiable LLM inference protocol (SVIP). The goal is a mechanism that lets users reliably verify whether the computing service provider uses the specified LLM for inference. An ideal solution should meet the following criteria:

1. **Low false negative rate (FNR)**: The protocol should minimize the cases where the provider actually uses the specified model but is mislabeled as not using it.
2. **Low false positive rate (FPR)**: The protocol should minimize the cases where the provider actually uses an alternative model but is wrongly identified as using the specified model.
3. **Efficiency**: The verification protocol should be computationally efficient and add minimal overhead for both the provider and the user.
4. **Maintain completion quality**: The protocol should not affect the quality of the inference results returned by the provider.

### Solution Overview

The authors propose the SVIP protocol, which requires the computing service provider to return not only the generated text but also the intermediate outputs (hidden state representations) produced during LLM inference. A proxy task is trained specifically on the hidden representations generated by the specified model, and the user evaluates how well the returned intermediate outputs perform on that proxy task. Good performance indicates that the correct model was used; poor performance suggests that it may not have been. To strengthen the protocol, SVIP also introduces a secret mechanism that makes it difficult for a malicious provider to forge results or bypass verification. The authors further analyze the protocol's security under multiple strong and adaptive attack scenarios to support the effectiveness and security of SVIP.

### Main Contributions

- Systematically formalizes the verifiable inference problem for large language models, which the summary describes as the first such formalization, and constructs a verification protocol based on intermediate outputs.
- Verifies the effectiveness of SVIP experimentally on multiple open-source large language models, with an average false negative rate of 3.49% and a false positive rate below 3%.
- Discusses and analyzes various strong and adaptive attacks in detail, demonstrating the security and robustness of SVIP in practical applications.
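The false negative and false positive rates used as criteria above can be computed from a batch of labeled verification outcomes. The following is a minimal sketch, not the authors' evaluation code: the function name `error_rates` and the toy data are hypothetical, and it assumes each query is labeled with whether the provider really ran the specified model and whether the verifier accepted it.

```python
def error_rates(used_specified, accepted):
    """Return (FNR, FPR) for a batch of verification outcomes.

    used_specified[i]: True if the provider really ran the specified model.
    accepted[i]: True if the verifier accepted query i as honest.
    """
    honest = [a for u, a in zip(used_specified, accepted) if u]
    dishonest = [a for u, a in zip(used_specified, accepted) if not u]
    # FNR: share of honest runs that were wrongly rejected.
    fnr = sum(1 for a in honest if not a) / len(honest) if honest else 0.0
    # FPR: share of substituted-model runs that were wrongly accepted.
    fpr = sum(1 for a in dishonest if a) / len(dishonest) if dishonest else 0.0
    return fnr, fpr


# Toy data (made up for illustration): 4 honest and 4 dishonest queries.
used = [True, True, True, True, False, False, False, False]
accepted = [True, True, True, False, False, False, True, False]
print(error_rates(used, accepted))  # (0.25, 0.25)
```

Note the convention: a "negative" here is a rejection, so a false negative penalizes an honest provider, while a false positive lets a substituted model pass.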
In summary, the paper addresses potentially deceptive behavior by computing service providers during LLM inference: the SVIP protocol is designed to let users verify whether the provider honestly uses the specified LLM.
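The idea of checking returned hidden states against a proxy task can be illustrated with a toy example. This is only a sketch under strong simplifying assumptions, not the SVIP implementation: two-dimensional linear maps stand in for the hidden states of real LLMs, the proxy head is solved in closed form instead of being trained, the secret mechanism is omitted because its details are not described here, and every name, matrix, and threshold is hypothetical.

```python
# Toy illustration only: 2-d linear "models" stand in for LLM hidden states.
# All names, matrices, and the threshold are hypothetical.

def matvec(m, v):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

def matmul(a, b):
    """Multiply two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(m):
    """Closed-form inverse of an invertible 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

W_SPECIFIED = [[2.0, 1.0], [0.0, 1.0]]   # hidden-state map of the requested model
W_SUBSTITUTE = [[1.0, 0.0], [1.0, 3.0]]  # hidden-state map of a cheaper model
LABELER = [[1.0, -1.0], [0.5, 2.0]]      # user-side map from input to proxy label

# Stand-in for training: pick the head so that PROXY_HEAD @ W_SPECIFIED == LABELER,
# i.e. the proxy task is solved exactly on the specified model's hidden states.
PROXY_HEAD = matmul(LABELER, inverse(W_SPECIFIED))

def provider_hidden(weights, x):
    """What the provider returns alongside the text: the model's hidden state."""
    return matvec(weights, x)

def verify(x, returned_hidden, threshold=1e-6):
    """Accept if the proxy prediction on the returned hidden state matches the label."""
    predicted = matvec(PROXY_HEAD, returned_hidden)
    target = matvec(LABELER, x)
    distance = sum((p - t) ** 2 for p, t in zip(predicted, target)) ** 0.5
    return distance <= threshold

x = [1.0, 2.0]  # toy input features
print(verify(x, provider_hidden(W_SPECIFIED, x)))   # True
print(verify(x, provider_hidden(W_SUBSTITUTE, x)))  # False
```

In this toy the honest provider passes exactly, while the substitute model's hidden states yield a proxy prediction far from the label. In a realistic setting the proxy task would be learned and the acceptance threshold calibrated, which is where nonzero false negative and false positive rates arise.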