Abstract: Many programs involve operations and logic that manipulate user privileges, which are essential to the security of an organization. One common goal of attackers is therefore to obtain or escalate privileges, causing privilege leakage. To protect the program and the organization against privilege leakage attacks, it is important to eliminate the vulnerabilities that can be exploited to carry out such attacks. Unfortunately, while memory vulnerabilities are less challenging to find, logic vulnerabilities are much more imminent, harmful, and difficult to identify. Accordingly, many analysts first locate user privilege related (UPR) variables and use them as starting points for investigating the code where those variables are used, to see whether any vulnerabilities exist, especially logic ones. In this paper, we introduce a large language model (LLM) workflow that can assist analysts in identifying such UPR variables, a task considered very time-consuming. Specifically, our tool audits all the variables in a program and outputs a UPR score for each one, which is the degree of relationship (closeness) between the variable and user privileges. The proposed approach avoids the drawbacks of directly prompting an LLM to find UPR variables by leveraging the LLM at the statement level instead of supplying it with very long code snippets. Variables with high UPR scores are potential UPR variables and should be manually investigated. Our experiments show that, with a typical UPR score threshold (i.e., UPR score > 0.8), the false positive rate (FPR) is only 13.49%, while the number of UPR variables found is significantly higher than with the heuristic-based method.
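The triage step described in the abstract, keeping only variables whose UPR score exceeds the threshold for manual review, could look roughly like the sketch below. Only the 0.8 threshold comes from the abstract; the function name `shortlist`, the variable names, and the scores are made up for illustration and are not taken from the paper.

```python
UPR_THRESHOLD = 0.8  # threshold quoted in the abstract ("UPR score > 0.8")


def shortlist(scores: dict[str, float],
              threshold: float = UPR_THRESHOLD) -> list[tuple[str, float]]:
    """Return (variable, score) pairs strictly above the threshold, highest first."""
    flagged = [(name, s) for name, s in scores.items() if s > threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical scores; in the described workflow these would come from the tool.
example_scores = {
    "is_admin": 0.97,
    "role_id": 0.91,
    "session_ttl": 0.42,
    "buf_len": 0.05,
}

review_queue = shortlist(example_scores)
for name, score in review_queue:
    print(f"{name}: {score:.2f}")
```

Sorting by score rather than returning a plain yes/no list lets an analyst start with the strongest candidates.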

What problem does this paper attempt to address?

The problem this paper attempts to solve is how to efficiently and accurately identify user privilege related (UPR) variables in programs of any size. Specifically, the authors focus on:

1. **Challenges in logic vulnerability detection**: Compared with memory vulnerabilities, logic vulnerabilities are more concealed, harder to find, and pose a greater security threat. Traditional automated tools such as fuzzers and static analysis tools are not effective at detecting logic vulnerabilities, because these tools rely on crashes or abnormal behavior to detect problems, while logic vulnerabilities do not interrupt the program.
2. **Limitations of existing methods**:
   - **Heuristic rule-based methods**: For example, using regular expressions to match variable names that may contain UPR information. This method suffers from both a high false-negative rate and a high false-positive rate, and it has difficulty handling complex application logic.
   - **The labor cost of code review**: For large codebases, manually reviewing all variables to find potential UPR variables is very time-consuming and impractical.
3. **Challenges in applying large language models (LLMs)**: Although LLMs perform well at understanding natural language and program code, feeding long code fragments directly into an LLM for UPR variable identification is not ideal. As the input context grows longer, LLM performance gradually deteriorates, resulting in more false positives and false negatives.

To solve these problems, the authors propose a workflow that combines static program analysis with an LLM, aiming to improve UPR variable identification in the following ways:

- **Reducing the impact of input length**: Dividing the code into smaller statement-level fragments avoids the effect of overly long input context on LLM performance.
- **Utilizing the program dependence graph (PDG)**: A subgraph is constructed for each variable to capture the control and data dependence relationships between variables, so that each variable is evaluated more comprehensively.
- **Introducing a quantitative scoring mechanism**: The workflow outputs a UPR score for each variable instead of a simple binary (yes/no) judgment, which handles borderline cases better and helps analysts prioritize high-scoring variables.

With this method, the researchers hope to significantly reduce manual review time while improving the accuracy and coverage of UPR variable identification.
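The three ideas above (statement-level scoring, a per-variable PDG subgraph, and a numeric score) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy program, the hand-written dependence edges, the keyword-based `score_statement` that stands in for an LLM call, and the hop-discounted maximum used to combine statement scores. The summary does not state how the paper actually aggregates scores, so this should not be read as the authors' implementation.

```python
import re
from collections import deque

# Hypothetical toy program: one statement per PDG node.
STATEMENTS = {
    "s1": "lvl = db.lookup(uid)",
    "s2": "is_admin = (lvl >= 3)",
    "s3": "if is_admin: grant_access(resource)",
    "s4": "msg = 'hello ' + name",
    "s5": "print(msg)",
}

# Hand-written data/control dependence edges, treated as undirected (assumed).
PDG_EDGES = {
    "s1": ["s2"],
    "s2": ["s1", "s3"],
    "s3": ["s2"],
    "s4": ["s5"],
    "s5": ["s4"],
}

# Statement in which each variable is defined.
DEFINED_AT = {"lvl": "s1", "is_admin": "s2", "msg": "s4"}

PRIVILEGE_HINTS = re.compile(r"admin|grant|privilege|role|permission", re.I)


def score_statement(stmt: str) -> float:
    """Stand-in for an LLM call that rates a single statement in [0, 1]."""
    return 1.0 if PRIVILEGE_HINTS.search(stmt) else 0.0


def subgraph(start: str, max_hops: int) -> dict[str, int]:
    """Breadth-first search over the PDG; maps node -> hop distance from start."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in PDG_EDGES.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def upr_score(var: str, max_hops: int = 2, decay: float = 0.9) -> float:
    """Assumed aggregation: best statement score, discounted per hop."""
    dist = subgraph(DEFINED_AT[var], max_hops)
    return max(score_statement(STATEMENTS[n]) * decay ** d for n, d in dist.items())


def name_heuristic(var: str) -> bool:
    """Regex-on-name baseline of the kind the summary criticizes."""
    return bool(PRIVILEGE_HINTS.search(var))


scores = {v: upr_score(v) for v in DEFINED_AT}
for var, s in scores.items():
    print(f"{var}: score={s:.2f}, name_heuristic={name_heuristic(var)}")
```

In this toy setup the name-only baseline misses `lvl`, because nothing in its name hints at privileges, while the dependence-aware score flags it since it feeds the `is_admin` check one hop away. Each scoring call sees a single statement, so the input stays short regardless of program size.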

A hybrid LLM workflow can help identify user privilege related variables in programs of any size

LLMs as Hackers: Autonomous Linux Privilege Escalation Attacks

Understanding the Effectiveness of Large Language Models in Detecting Security Vulnerabilities

Harnessing the Power of LLMs in Source Code Vulnerability Detection

LLMs Cannot Reliably Identify and Reason About Security Vulnerabilities (Yet?): A Comprehensive Evaluation, Framework, and Benchmarks

Can LLMs be Fooled? Investigating Vulnerabilities in LLMs

An Insight into Security Code Review with LLMs: Capabilities, Obstacles and Influential Factors

Prompt Leakage effect and defense strategies for multi-turn LLM interactions

Intention Analysis Makes LLMs A Good Jailbreak Defender

LLbezpeky: Leveraging Large Language Models for Vulnerability Detection

Can LLMs Deeply Detect Complex Malicious Queries? A Framework for Jailbreaking via Obfuscating Intent

A New Era in LLM Security: Exploring Security Concerns in Real-World LLM-based Systems

Software Vulnerability and Functionality Assessment using LLMs

Characterizing and Evaluating the Reliability of LLMs against Jailbreak Attacks

A survey on Large Language Model (LLM) security and privacy: The Good, The Bad, and The Ugly

Attention Is All You Need for LLM-based Code Vulnerability Localization

LLM4Vuln: A Unified Evaluation Framework for Decoupling and Enhancing LLMs' Vulnerability Reasoning

LPET -- Mining MS-Windows Software Privilege Escalation Vulnerabilities by Monitoring Interactive Behavior

Utilizing Precise and Complete Code Context to Guide LLM in Automatic False Positive Mitigation