Abstract: Open-source Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language understanding and generation, leading to widespread adoption across various domains. However, their increasing model sizes render local deployment impractical for individual users, pushing many to rely on computing service providers for inference through a blackbox API. This reliance introduces a new risk: a computing provider may stealthily substitute the requested LLM with a smaller, less capable model without consent from users, thereby delivering inferior outputs while benefiting from cost savings. In this paper, we formalize the problem of verifiable inference for LLMs. Existing verifiable computing solutions based on cryptographic or game-theoretic techniques are either computationally uneconomical or rest on strong assumptions. We introduce SVIP, a secret-based verifiable LLM inference protocol that leverages intermediate outputs from LLM as unique model identifiers. By training a proxy task on these outputs and requiring the computing provider to return both the generated text and the processed intermediate outputs, users can reliably verify whether the computing provider is acting honestly. In addition, the integration of a secret mechanism further enhances the security of our protocol. We thoroughly analyze our protocol under multiple strong and adaptive adversarial scenarios. Our extensive experiments demonstrate that SVIP is accurate, generalizable, computationally efficient, and resistant to various attacks. Notably, SVIP achieves false negative rates below 5% and false positive rates below 3%, while requiring less than 0.01 seconds per query for verification.

What problem does this paper attempt to address?

The problem this paper addresses is: when open-source large language models (LLMs) are served by a third party, how can a user ensure that the computing service provider actually runs the model the user specified, rather than secretly replacing it with a smaller, less capable model to save costs? The core of the problem is verifying whether the provider honestly uses the specified LLM for inference.

### Problem Background

As open-source large language models continue to improve, they perform well in natural language understanding and generation and are widely used across many fields. However, these models keep growing in size, which makes local deployment impractical. Many users therefore rely on computing service providers to perform inference through black-box APIs. This dependence brings a new risk: the provider may quietly replace the LLM requested by the user with a much smaller, less capable model without the user's consent, delivering inferior output while saving costs.

### Problem Definition

To address this problem, the authors formally define the "verifiable inference" problem and propose a secret-based verifiable LLM inference protocol (SVIP). Their goal is a mechanism that lets users reliably verify whether the computing service provider uses the specified LLM for inference. The ideal solution should meet the following criteria:

1. **Low false negative rate (FNR)**: The protocol should minimize the cases where the provider actually uses the specified model but is mislabeled as not using it.
2. **Low false positive rate (FPR)**: The protocol should minimize the cases where the provider actually uses an alternative model but is wrongly identified as using the specified model.
3. **Efficiency**: The verification protocol should be computationally efficient and add minimal overhead for both the provider and the user.
4. **Maintain completion quality**: The protocol should not affect the quality of the inference results returned by the provider.

### Solution Overview

The authors propose the SVIP protocol, which requires the computing service provider to return not only the generated text but also the intermediate outputs (hidden state representations) produced during LLM inference. A proxy task is trained specifically on the hidden representations generated by the specified model, and the user evaluates how well the returned intermediate outputs perform on that proxy task. Good performance indicates that the specified model was used; poor performance suggests that the provider may have substituted another model. To further strengthen the protocol, SVIP introduces a secret mechanism that makes it difficult for a malicious provider to forge or bypass the verification process. The authors also analyze the protocol under multiple strong, adaptive attack scenarios to demonstrate the effectiveness and security of SVIP.

### Main Contributions

- Formalized the verifiable inference problem for large language models and constructed a verification protocol based on intermediate outputs.
- Verified the effectiveness of SVIP experimentally on multiple open-source large language models, with an average false negative rate of 3.49% and a false positive rate below 3%.
- Discussed and analyzed a range of strong, adaptive attacks in detail, demonstrating the security and robustness of SVIP in practical applications.

In summary, this paper addresses possible deceptive behavior by computing service providers during large language model inference, and introduces the SVIP protocol so that users can verify whether the provider honestly uses the specified LLM.
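The verification flow described in the solution overview can be illustrated with a toy simulation. This is only a sketch under strong assumptions and not the SVIP implementation: the two "models" are random linear maps, the proxy head is a linear least-squares fit, the secret is simply concatenated to the prompt features, and the names `W_specified`, `W_substitute`, `L_label`, `verify`, and the threshold of 0.1 are all hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
D_PROMPT, D_SECRET, D_HIDDEN, D_LABEL = 16, 4, 32, 8

# Toy stand-ins for two LLMs (assumption: random linear maps, not real models).
W_specified = rng.normal(size=(D_HIDDEN, D_PROMPT + D_SECRET))
W_substitute = rng.normal(size=(D_HIDDEN, D_PROMPT + D_SECRET))

# Hypothetical labeling function held by the user: label = L_label @ [prompt; secret].
L_label = rng.normal(size=(D_LABEL, D_PROMPT + D_SECRET))


def hidden_state(W, prompt, secret):
    """Hidden state a provider would return when running the model with weights W."""
    return W @ np.concatenate([prompt, secret])


def expected_label(prompt, secret):
    """Proxy-task target the user can compute without running any model."""
    return L_label @ np.concatenate([prompt, secret])


def train_proxy_head(n_samples=500):
    """Fit a linear proxy head on hidden states of the specified model only."""
    prompts = rng.normal(size=(n_samples, D_PROMPT))
    secrets = rng.normal(size=(n_samples, D_SECRET))
    H = np.stack([hidden_state(W_specified, p, s) for p, s in zip(prompts, secrets)])
    Y = np.stack([expected_label(p, s) for p, s in zip(prompts, secrets)])
    # H is rank-deficient (hidden dim > input dim), so cut off tiny singular values.
    head, *_ = np.linalg.lstsq(H, Y, rcond=1e-10)
    return head  # shape (D_HIDDEN, D_LABEL)


def verify(head, returned_hidden, prompt, secret, threshold=0.1):
    """Accept the response if the proxy-task prediction matches the expected label."""
    pred = returned_hidden @ head
    target = expected_label(prompt, secret)
    rel_err = np.linalg.norm(pred - target) / np.linalg.norm(target)
    return rel_err < threshold


def error_rates(head, n_queries=200):
    """FNR: honest provider rejected. FPR: substituting provider accepted."""
    rejected_honest = accepted_dishonest = 0
    for _ in range(n_queries):
        p, s = rng.normal(size=D_PROMPT), rng.normal(size=D_SECRET)
        if not verify(head, hidden_state(W_specified, p, s), p, s):
            rejected_honest += 1
        if verify(head, hidden_state(W_substitute, p, s), p, s):
            accepted_dishonest += 1
    return rejected_honest / n_queries, accepted_dishonest / n_queries


head = train_proxy_head()
fnr, fpr = error_rates(head)
print(f"FNR={fnr:.3f}  FPR={fpr:.3f}")
```

Because the toy models are linear, the proxy head fits the specified model's hidden states essentially exactly, so the separation between honest and substituted responses is far cleaner here than it would be with real LLMs. The sketch is meant to show the shape of the check (the user computes a target, the provider returns hidden states, a head trained for one model scores them) and how the FNR and FPR criteria above are counted.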

SVIP: Towards Verifiable Inference of Open-source Large Language Models
