Evaluating few-shot and contrastive learning methods for code clone detection

Mohamad Khajezade, Fatemeh H. Fard, Mohamed S. Shehata
DOI: https://doi.org/10.1007/s10664-024-10441-z
IF: 3.762
2024-10-11
Empirical Software Engineering
Abstract: Code Clone Detection (CCD) is a software engineering task that is used for plagiarism detection, code search, and code comprehension. Recently, deep learning-based models have achieved an F1-score (a metric used to assess classifiers) of 95% on the CodeXGLUE benchmark. These models require large amounts of training data and are mainly fine-tuned on Java or C++ datasets. However, no previous study has evaluated the generalizability of these models when only a limited amount of annotated data is available.
computer science, software engineering
What problem does this paper attempt to address?