Abstract: Training Large Language Models (LLMs) requires immense computational power and vast amounts of data. As a result, protecting the intellectual property of these models through fingerprinting is essential for ownership authentication. While adding fingerprints to LLMs through fine-tuning has been attempted, it remains costly and unscalable. In this paper, we introduce FP-VEC, a pilot study on using fingerprint vectors as an efficient fingerprinting method for LLMs. Our approach generates a fingerprint vector that represents a confidential signature embedded in the model, allowing the same fingerprint to be seamlessly incorporated into an unlimited number of LLMs via vector addition. Results on several LLMs show that FP-VEC is lightweight by running on CPU-only devices for fingerprinting, scalable with a single training and unlimited fingerprinting process, and preserves the model's normal behavior. The project page is available at https://fingerprintvector.github.io.

What problem does this paper attempt to address?

The problem this paper addresses is how to embed fingerprints in large language models (LLMs) efficiently and scalably, so that model owners can protect their intellectual property and verify ownership.

### Problem Background

Training LLMs requires huge computational resources and large amounts of data, so protecting the intellectual property of these models through fingerprinting is crucial. Existing methods mainly add fingerprints through fine-tuning, which is both expensive and hard to scale.

### Solution Proposed in the Paper

The paper introduces a fingerprint embedding method, FP-VEC (Fingerprinting Large Language Models via Efficient Vector Addition). The method generates a fingerprint vector once and then adds that vector to multiple downstream models, which makes fingerprint embedding efficient. Specifically:

1. **Generation of Fingerprint Vector**: Generate a compact fingerprint vector by subtracting the parameters of the base model from the parameters of the fingerprinted model.
2. **Fingerprint Transfer**: Add the generated fingerprint vector to the parameters of other downstream models, so the fingerprint is embedded quickly without any further fine-tuning.

### Main Contributions

- **Efficiency**: FP-VEC can run on CPU-only devices and complete fingerprint embedding in just a few seconds, greatly reducing the demand for computational resources.
- **Scalability**: The fingerprint vector produced by a single training run can be applied to an unlimited number of downstream models.
- **Performance Preservation**: After fingerprint embedding, the model's normal behavior and performance are largely unaffected.
- **Robustness**: The fingerprinted model has strong resistance to key-guessing attacks.

### Experimental Results

Experiments show that FP-VEC successfully embeds fingerprints in multiple LLMs while maintaining the models' performance across different tasks. Fingerprint embedding completes in a short time and runs efficiently on CPU-only devices. Overall, FP-VEC provides a lightweight, scalable, and efficient solution for protecting the intellectual property of large language models.
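The two steps in the summary above (subtract the base model's parameters from the fingerprinted model's, then add the difference to a downstream model) can be illustrated with a minimal sketch. It is not the authors' implementation: the `Params` type, the function names `fingerprint_vector` and `apply_fingerprint`, the parameter name `layer.weight`, and all weight values are hypothetical, and plain Python lists stand in for real model tensors.

```python
from typing import Dict, List

# Hypothetical stand-in for a model state dict: parameter name -> flat weights.
Params = Dict[str, List[float]]


def fingerprint_vector(fingerprinted: Params, base: Params) -> Params:
    """Element-wise difference: fingerprinted-model weights minus base-model weights."""
    return {
        name: [f - b for f, b in zip(fingerprinted[name], base[name])]
        for name in base
    }


def apply_fingerprint(downstream: Params, vector: Params) -> Params:
    """Add the fingerprint vector to a downstream model with the same parameter layout."""
    return {
        name: [w + v for w, v in zip(downstream[name], vector[name])]
        for name in downstream
    }


# Toy weights, made up purely for illustration.
base = {"layer.weight": [0.5, -1.0, 2.0]}
fingerprinted = {"layer.weight": [0.75, -1.0, 1.5]}  # base after fingerprint fine-tuning
downstream = {"layer.weight": [1.0, 0.0, 3.0]}       # another fine-tune of the same base

tau = fingerprint_vector(fingerprinted, base)   # computed once
stamped = apply_fingerprint(downstream, tau)    # reusable for any number of models

print(tau)      # {'layer.weight': [0.25, 0.0, -0.5]}
print(stamped)  # {'layer.weight': [1.25, 0.0, 2.5]}
```

Both operations are a single element-wise pass over the parameters, with no gradient computation, which is consistent with the summary's claim that embedding can run on CPU-only devices. The sketch assumes the downstream model shares the base model's parameter names and shapes.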

FP-VEC: Fingerprinting Large Language Models via Efficient Vector Addition

A Fingerprint for Large Language Models

Instructional Fingerprinting of Large Language Models

UTF: Undertrained Tokens as Fingerprints A Novel Approach to LLM Identification

HuRef: HUman-REadable Fingerprint for Large Language Models

ProFLingo: A Fingerprinting-based Intellectual Property Protection Scheme for Large Language Models

LLMmap: Fingerprinting For Large Language Models

MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding

FP8-LM: Training FP8 Large Language Models

MetaV: A Meta-Verifier Approach to Task-Agnostic Model Fingerprinting

Hide and Seek: Fingerprinting Large Language Models with Evolutionary Learning

Hey, That's My Model! Introducing Chain & Hash, An LLM Fingerprinting Technique

Accelerating Vision-Language Pretraining with Free Language Modeling

FVT: Finger Vein Transformer for Authentication

ZipVL: Efficient Large Vision-Language Models with Dynamic Token Sparsification and KV Cache Compression

FoPru: Focal Pruning for Efficient Large Vision-Language Models

Estimating Fingerprint Pose Via Dense Voting.

MergePrint: Robust Fingerprinting against Merging Large Language Models

FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design

MMRAN: A novel model for finger vein recognition based on a residual attention mechanism

vTensor: Flexible Virtual Tensor Management for Efficient LLM Serving