Abstract: Large Language Models (LLMs) are changing the software development paradigm and have gained huge attention from both academia and industry. Researchers and developers collaboratively explore how to leverage the powerful problem-solving ability of LLMs for specific domain tasks. Due to the wide usage of LLM-based applications, e.g., ChatGPT, multiple works have been proposed to ensure the security of LLM systems. However, a comprehensive understanding of the entire process of LLM system construction (the LLM supply chain, LLM SC) is crucial, yet relevant works are limited. More importantly, the security issues hidden in the LLM SC, which could highly impact the reliable usage of LLMs, lack exploration. Existing works mainly focus on assuring the quality of LLMs at the model level; security assurance for the entire LLM SC is ignored. In this work, we take the first step to discuss the potential security risks in each component, as well as in the integration between components, of the LLM SC. We summarize 12 security-related risks and provide promising guidance to help build safer LLM systems. We hope our work can facilitate the evolution of artificial general intelligence with secure LLM ecosystems.

What problem does this paper attempt to address?

This paper addresses security risks in the large language model (LLM) supply chain. Existing research mainly focuses on ensuring the quality and security of the LLM itself, while the security of the entire LLM supply chain has received less attention. The authors point out that the LLM supply chain includes multiple components and parties, such as data providers, model developers, and third-party libraries, and that the dependencies between these components and their potential security risks have not been fully explored.

### Main problems of the paper

1. **Lack of comprehensive understanding of the entire LLM supply chain**: Most existing research focuses on security at the model level, ignoring the security risks that other components in the supply chain (such as data preparation, model training, and the deployment environment) may bring.
2. **Security risks in all links of the supply chain**: From data collection to final application deployment, every link may carry security risks, such as data selection attacks, data cleaning bypass, attacks on automatic annotation tools, vulnerabilities in frameworks and third-party libraries, exploitation of training techniques, and distribution conflicts.
3. **Limitations of existing research**: Most research focuses on specific components (such as ChatGPT) or specific tasks and fails to consider the security of the entire supply chain comprehensively.

### Goals of the paper

- **Identify and summarize potential security risks**: By analyzing each link of the LLM supply chain, the authors identify 12 potential security risks and describe them in detail.
- **Propose mitigation measures**: In response to these risks, the authors propose corresponding mitigation measures and guiding principles to help researchers and developers build more secure LLM systems.
- **Promote the development of more reliable artificial intelligence**: By increasing awareness of LLM supply chain security, the paper aims to promote the development and application of safer and more reliable artificial intelligence systems.

### Main contributions

1. **Explored, for the first time, the security risks of integrating all components in the LLM supply chain** and summarized 12 related security risks.
2. **Provided promising guidelines** to help mitigate these risks and support the development of more secure LLM systems.
3. **Emphasized the importance of the overall security of the supply chain**, rather than focusing only on the security of the model itself.

Through these efforts, the authors hope that their work can promote the development of a more secure LLM ecosystem, thereby facilitating the evolution of artificial general intelligence.

Large Language Model Supply Chain: Open Problems From the Security Perspective

Large Language Model Supply Chain: A Research Agenda

Lifting the Veil on the Large Language Model Supply Chain: Composition, Risks, and Mitigations

A Survey on Large Language Model (LLM) Security and Privacy: The Good, the Bad, and the Ugly

Securing Large Language Models: Threats, Vulnerabilities and Responsible Practices

Exploring Advanced Methodologies in Security Evaluation for LLMs

Recent Advances in Attack and Defense Approaches of Large Language Models

A New Era in LLM Security: Exploring Security Concerns in Real-World LLM-based Systems

Large language models in 6G security: challenges and opportunities

Unique Security and Privacy Threats of Large Language Model: A Comprehensive Survey

Safeguarding Large Language Models: A Survey

Large Language Models for Cyber Security: A Systematic Literature Review

Large Language Models and Security

Privacy in Large Language Models: Attacks, Defenses and Future Directions

Security and Privacy Challenges of Large Language Models: A Survey

Exploring Vulnerabilities and Threats in Large Language Models: Safeguarding Against Exploitation and Misuse

Exploring Vulnerabilities and Protections in Large Language Models: A Survey