Abstract:

Background: Code comment generation techniques aim to generate natural language descriptions for source code. There are two orthogonal approaches to this task: information retrieval (IR) based methods and neural-based methods. Recent studies have focused on combining their strengths by feeding the input code, together with similar code snippets retrieved by the IR-based approach, to the neural-based approach. This enhances the neural-based approach’s ability to output low-frequency words and further improves performance.

Aim: Despite this progress, our pilot study reveals that the current combination is not generalizable and can lead to performance degradation. In this paper, we propose a straightforward but effective approach to address this weakness in existing combinations of the two comment generation approaches.

Method: Instead of binding IR-based and neural-based approaches statically, we combine them dynamically. Specifically, given an input code snippet, we first use an IR-based technique to retrieve a similar code snippet from the corpus. We then use a Cross-Encoder-based classifier to decide dynamically which comment generation method to use: if the retrieved code snippet is a true positive (i.e., it is semantically similar to the input), we directly use the IR-based technique. Otherwise, we pass the input to the neural-based model to generate the comment.

Results: We evaluate our approach on a large-scale dataset of Java projects. Experimental results show that our approach achieves a BLEU score of 25.45, improving on the state-of-the-art IR-based approach, neural-based approach, and their combination by 41%, 26%, and 7%, respectively.

Conclusions: We propose a straightforward but effective dynamic combination of IR-based and neural-based comment generation, which outperforms state-of-the-art approaches by a substantial margin.
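The dynamic combination described under Method can be illustrated with a minimal sketch. The abstract does not name the IR technique, the neural model, or the classifier's inputs beyond "Cross-Encoder based", so everything concrete here is an assumption made for illustration: token-level Jaccard similarity stands in for the IR step, a fixed similarity threshold stands in for the trained Cross-Encoder classifier, and a placeholder function stands in for the neural model. Only the control flow (retrieve, classify, then pick IR or neural) follows the text.

```python
import re
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (code snippet, its comment)


def tokenize(code: str) -> set:
    """Lowercased identifier and number tokens: a crude lexical view of code."""
    return set(re.findall(r"[A-Za-z_]\w*|\d+", code.lower()))


def jaccard(a: str, b: str) -> float:
    """Token-set overlap; an assumed stand-in for the paper's IR similarity."""
    ta, tb = tokenize(a), tokenize(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def retrieve(query: str, corpus: List[Pair]) -> Pair:
    """IR step: return the corpus pair whose code is most similar to the query."""
    return max(corpus, key=lambda pair: jaccard(query, pair[0]))


def generate_comment(
    query: str,
    corpus: List[Pair],
    is_true_positive: Callable[[str, str], bool],
    neural_model: Callable[[str], str],
) -> Tuple[str, str]:
    """Dynamic combination: reuse the retrieved comment when the classifier
    accepts the retrieved snippet, otherwise fall back to the neural model.
    Returns (comment, route) where route is "ir" or "neural"."""
    retrieved_code, retrieved_comment = retrieve(query, corpus)
    if is_true_positive(query, retrieved_code):
        return retrieved_comment, "ir"
    return neural_model(query), "neural"


# Hypothetical stand-ins. The paper uses a trained Cross-Encoder classifier
# and a neural generation model; neither is reproduced here.
def threshold_classifier(query: str, candidate: str) -> bool:
    return jaccard(query, candidate) >= 0.8  # assumed threshold


def placeholder_neural_model(query: str) -> str:
    return "<comment generated by the neural model>"


corpus = [
    ("public int add(int a, int b) { return a + b; }", "Adds two integers."),
    ("public void close() { stream.close(); }", "Closes the underlying stream."),
]

near_duplicate = "public static int add(int a, int b) { return a + b; }"
unrelated = "private String name() { return this.name; }"

print(generate_comment(near_duplicate, corpus, threshold_classifier, placeholder_neural_model))
print(generate_comment(unrelated, corpus, threshold_classifier, placeholder_neural_model))
```

In this toy corpus the near-duplicate query is routed to the IR branch and the unrelated query to the neural branch. Swapping in a real classifier only requires passing a different `is_true_positive` callable, so the routing logic stays independent of how similarity is judged.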

Towards Usable Neural Comment Generation Via Code-Comment Linkage Interpretation: Method and Empirical Study

Neural-machine-translation-based Commit Message Generation: How Far Are We?

Automating Just-In-Time Comment Updating

Retrieve and Refine: Exemplar-based Neural Comment Generation

Neural Comment Generation for Source Code with Auxiliary Code Classification Task

Augmenting Java Method Comments Generation with Context Information Based on Neural Networks

A Simple Retrieval-based Method for Code Comment Generation

Why My Code Summarization Model Does Not Work: Code Comment Improvement with Category Prediction

An Intra-Class Relation Guided Approach for Code Comment Generation

Yet another combination of IR- and neural-based comment generation

From Code to Natural Language: Type-Aware Sketch-Based Seq2Seq Learning

Integrating Extractive and Abstractive Models for Code Comment Generation

Code to Comment "Translation": Data, Metrics, Baselining & Evaluation

Leveraging Generative AI: Improving Software Metadata Classification with Generated Code-Comment Pairs

DeepCommenter: a Deep Code Comment Generation Tool with Hybrid Lexical and Syntactical Information

Towards Context-Aware Code Comment Generation

MESIA: Understanding and Leveraging Supplementary Nature of Method-level Comments for Automatic Comment Generation

CloCom: Mining existing source code for automatic comment generation

Enhancing Code Annotation Reliability: Generative AI's Role in Comment Quality Assessment Models

Deep Code Comment Generation with Hybrid Lexical and Syntactical Information

Code Attention: Translating Code to Comments by Exploiting Domain Features