Fine Grained Insider Risk Detection

Birkett Huber, Casper Neo, Keiran Sampson, Alex Kantchelian, Brett Ksobiech, Yanis Pavlidis
2024-11-05
Abstract: We present a method to detect departures from business-justified workflows among support agents. Our goal is to assist auditors in identifying agent actions that cannot be explained by the activity within their surrounding context, where normal activity patterns are established from historical data. We apply our method to help audit millions of actions of over three thousand support agents. We collect logs from the tools used by support agents and construct a bipartite graph of Actions and Entities representing all the actions of the agents, as well as background information about entities. From this graph, we sample subgraphs rooted on security-significant actions taken by the agents. Each subgraph captures the relevant context of the root action in terms of other actions, entities and their relationships. We then prioritize the rooted-subgraphs for auditor review using feed-forward and graph neural networks, as well as nearest neighbors techniques. To alleviate the issue of scarce labeling data, we use contrastive learning and domain-specific data augmentations. Expert auditors label the top ranked subgraphs as "worth auditing" or "not worth auditing" based on the company's business policies. This system finds subgraphs that are worth auditing with high enough precision to be used in production.
Cryptography and Security, Machine Learning
What problem does this paper attempt to address?
This paper addresses the problem of detecting support-agent actions that deviate from business-justified workflows while the agents handle customer support requests. Specifically, the authors aim to help auditors identify agent actions that cannot be explained by their surrounding context, particularly those that pose a potential security risk. Traditional methods typically build normal activity patterns from historical data and try to predict future access behavior. That approach works poorly for insider risk detection among support agents, because the work an agent does and the resources the agent needs depend on the support ticket currently being handled, which is driven by external factors and cannot be predicted in advance.

To solve this problem, the paper proposes a fine-grained method that captures agent actions and their context by constructing and analyzing a bipartite graph. The steps are as follows:

1. **Log collection and graph construction**: Collect logs from the tools used by support agents and construct a bipartite graph of "Actions" and "Entities". Each action represents a specific operation performed by an agent, such as replying to a ticket or querying data, while entities are persistent identifiers such as users, account IDs, and ticket IDs.
2. **Subgraph sampling**: Sample subgraphs from this graph, each rooted on a security-significant action. A subgraph contains the root action together with other related actions, entities, and their relationships, and so provides the relevant context for the root action.
3. **Priority ranking**: Use feed-forward neural networks, graph neural networks, and nearest neighbors techniques to prioritize the rooted subgraphs, so that auditors can review the ones most worth auditing first.
4. **Contrastive learning and data augmentation**: To address the scarcity of labeled data, use contrastive learning and domain-specific data augmentations to improve how well the model learns and generalizes.
5. **Expert review**: Finally, expert auditors label the top-ranked subgraphs as "worth auditing" or "not worth auditing" according to the company's business policies.

With this method, the system finds subgraphs that are worth auditing with high enough precision to be used in production, which supports the identification of insider risk among support agents.
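
The graph construction and subgraph sampling steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the log record layout (`action_id`, `entities`), the function names, the choice to treat the agent as just another entity, and the breadth-first expansion bounded by `max_hops` are all assumptions made for illustration, since the summary does not specify how subgraphs are actually sampled.

```python
from collections import defaultdict, deque


def build_bipartite_graph(log_records):
    """Build Action<->Entity adjacency maps from log records.

    Assumed (hypothetical) record layout:
    {"action_id": str, "entities": [entity ids touched by the action]}.
    """
    action_to_entities = defaultdict(set)
    entity_to_actions = defaultdict(set)
    for rec in log_records:
        for ent in rec["entities"]:
            action_to_entities[rec["action_id"]].add(ent)
            entity_to_actions[ent].add(rec["action_id"])
    return action_to_entities, entity_to_actions


def sample_rooted_subgraph(root_action, action_to_entities,
                           entity_to_actions, max_hops=2):
    """Collect the context of a root action by breadth-first expansion.

    Edges only connect actions to entities, so each hop alternates
    between the two node types. Nodes at depth max_hops are kept but
    not expanded further.
    """
    actions, entities = {root_action}, set()
    frontier = deque([("action", root_action, 0)])
    while frontier:
        kind, node, depth = frontier.popleft()
        if depth == max_hops:
            continue
        if kind == "action":
            for ent in action_to_entities.get(node, ()):
                if ent not in entities:
                    entities.add(ent)
                    frontier.append(("entity", ent, depth + 1))
        else:
            for act in entity_to_actions.get(node, ()):
                if act not in actions:
                    actions.add(act)
                    frontier.append(("action", act, depth + 1))
    return actions, entities


# Toy logs (invented for this example, not taken from the paper).
logs = [
    {"action_id": "a1", "entities": ["agent:7", "ticket:100"]},
    {"action_id": "a2", "entities": ["agent:7", "account:55"]},
    {"action_id": "a3", "entities": ["agent:9", "account:55"]},
    {"action_id": "a4", "entities": ["agent:9", "ticket:200"]},
]
a2e, e2a = build_bipartite_graph(logs)
ctx_actions, ctx_entities = sample_rooted_subgraph("a2", a2e, e2a, max_hops=2)
print(sorted(ctx_actions), sorted(ctx_entities))
```

In this toy data, rooting on the account lookup `a2` pulls in the same agent's ticket reply `a1` and another agent's access to the same account `a3`, which is the kind of surrounding context an auditor would want when judging whether the root action is explained. A production system would presumably also bound the expansion by time window and node degree, but those details are not given in the summary.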