Few-Shot API Attack Detection: Overcoming Data Scarcity with GAN-Inspired Learning

Udi Aharon, Revital Marbel, Ran Dubin, Amit Dvir, Chen Hajaj
2024-05-18
Abstract: Web applications and APIs face constant threats from malicious actors seeking to exploit vulnerabilities for illicit gains. These threats necessitate robust anomaly detection systems capable of identifying malicious API traffic efficiently despite limited and diverse datasets. This paper proposes a novel few-shot detection approach motivated by Natural Language Processing (NLP) and advanced Generative Adversarial Network (GAN)-inspired techniques. Leveraging state-of-the-art Transformer architectures, particularly RoBERTa, our method enhances the contextual understanding of API requests, leading to improved anomaly detection compared to traditional methods. We showcase the technique's versatility by demonstrating its effectiveness with both Out-of-Distribution (OOD) and Transformer-based binary classification methods on two distinct datasets: CSIC 2010 and ATRDF 2023. Our evaluations reveal consistently enhanced or, at worst, equivalent detection rates across various metrics in most vectors, highlighting the promise of our approach for improving API security.
Cryptography and Security
### What problem does this paper attempt to solve?

This paper addresses a central problem in API (Application Programming Interface) security: **how to detect malicious API traffic effectively when training data is scarce**. Specifically, the paper focuses on:

1. **The complexity of API security threats**: As API usage grows, so does the risk of attackers exploiting API vulnerabilities for unauthorized access, data leakage, and other malicious activities. Traditional anomaly detection systems struggle with these threats, especially when data is limited and diverse.
2. **The problem of data scarcity**: Most public APIs are managed and maintained by enterprises. For privacy and security reasons, enterprises usually do not share detailed data on their communication with clients. As a result, too few benign (normal) and malicious API requests are available for training deep-learning models, which reduces model accuracy.
3. **The limitations of existing methods**: Although some research has applied deep-learning techniques to the detection of API security threats, these methods are not very effective when data is scarce. Existing few-shot learning methods alleviate the problem to some extent, but they leave room for improvement.

To address these problems, the authors propose a new few-shot detection method inspired by GANs (Generative Adversarial Networks) and NLP (Natural Language Processing): **GIAAD (GAN-Inspired Anomalous API Detection)**. The method tackles data scarcity in the following ways:

- **Utilizing advanced Transformer architectures (such as RoBERTa)**: Enhancing the contextual understanding of API requests so that they can be generated and classified more accurately.
- **Introducing GAN-inspired techniques**: Generating additional API request samples to enrich the dataset and improve the model's generalization ability.
- **Combining few-shot learning**: Distinguishing between normal and abnormal API requests even when only a small number of samples is available.

The paper evaluates GIAAD on the CSIC 2010 and ATRDF 2023 datasets and reports detection rates that are consistently enhanced or, at worst, equivalent across various metrics in most vectors, which makes it a promising approach to improving API security when data is scarce.
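
To make the few-shot, out-of-distribution idea more concrete, the following is a minimal sketch and not the authors' implementation. It scores a request by its distance from a handful of benign examples. A bag of character trigrams stands in for the RoBERTa embedding, and the GAN-inspired sample generation is not reproduced at all. The function names (`trigram_vector`, `cosine`, `anomaly_score`) and the example requests are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def trigram_vector(request: str, n: int = 3) -> Counter:
    """Bag of character n-grams: a crude stand-in for a Transformer embedding."""
    text = request.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def anomaly_score(request: str, benign_support: list[str]) -> float:
    """1 - similarity to the closest benign example; higher is more anomalous."""
    query = trigram_vector(request)
    return 1.0 - max(cosine(query, trigram_vector(s)) for s in benign_support)


# Hypothetical few-shot support set of benign requests (not taken from
# CSIC 2010 or ATRDF 2023).
BENIGN_SUPPORT = [
    "GET /api/v1/users/42",
    "GET /api/v1/users/17/orders",
    "POST /api/v1/orders",
    "GET /api/v1/products?page=2",
]

# Hypothetical unseen requests: one benign-looking, one SQL-injection-like.
benign_like = "GET /api/v1/users/99/orders"
attack_like = "GET /api/v1/users/42?id=1' UNION SELECT password FROM accounts--"

for req in (benign_like, attack_like):
    print(f"{anomaly_score(req, BENIGN_SUPPORT):.3f}  {req}")
```

In this toy setting the injection-like request scores higher than the benign-looking one because it shares fewer trigrams with any support example. A real detector would replace `trigram_vector` with a learned contextual embedding and would need a decision threshold calibrated on held-out traffic.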