Abstract: (Source) code summarization aims to automatically generate natural-language summaries/comments for given code snippets. Such summaries play a key role in helping developers understand and maintain source code. Existing code summarization techniques can be categorized into extractive methods and abstractive methods. Extractive methods use retrieval techniques to extract a subset of important statements and keywords from the code snippet, and generate a summary that preserves the factual details in those statements and keywords. However, such a subset may miss identifier or entity naming, and consequently the naturalness of the generated summary is usually poor. Abstractive methods can generate human-written-like summaries by leveraging encoder-decoder models. However, the generated summaries often miss important factual details. To generate human-written-like summaries that preserve factual details, we propose a novel extractive-and-abstractive framework. The extractive module in the framework performs extractive code summarization: it takes in the code snippet and predicts the important statements that contain key factual details. The abstractive module performs abstractive code summarization: it takes in the code snippet and the important statements in parallel and generates a succinct, human-written-like natural language summary. We evaluate the effectiveness of our technique, called EACS, by conducting extensive experiments on three datasets involving six programming languages. Experimental results show that EACS significantly outperforms state-of-the-art techniques on all three widely used metrics: BLEU, METEOR, and ROUGE-L. In addition, human evaluation demonstrates that the summaries generated by EACS have higher naturalness and informativeness and are more relevant to the given code snippets.
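
For readers unfamiliar with the metrics named above, the following is a minimal sketch of sentence-level ROUGE-L, which scores a generated summary by its longest common subsequence (LCS) with a reference summary. It is an illustration only and not the evaluation script used in the paper: the lowercased whitespace tokenization, the default `beta=1.0`, and the function names `lcs_length` and `rouge_l` are all assumptions made for this example.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Extend the best subsequence that ended before both tokens.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise carry over the better of the two neighbours.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference, beta=1.0):
    """Sentence-level ROUGE-L F-score.

    Assumption: tokens are lowercased, whitespace-separated words;
    published results typically rely on a fixed evaluation toolkit.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


if __name__ == "__main__":
    generated = "returns the sum of two numbers"
    reference = "return the sum of two integers"
    print(round(rouge_l(generated, reference), 3))
```

In this example the two summaries share the subsequence "the sum of two" (4 of 6 tokens on each side), so precision and recall are both 4/6. Because the LCS respects word order but does not require contiguity, ROUGE-L rewards summaries that keep the reference's factual tokens in sequence.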

Evaluating Code Summarization with Improved Correlation with Human Assessment.

Why My Code Summarization Model Does Not Work: Code Comment Improvement with Category Prediction

On the Evaluation of Neural Code Summarization

Improving Code Summarization Through Automated Quality Assurance

SummScore: A Comprehensive Evaluation Metric for Summary Quality Based on Cross-Encoder

A Statistical Analysis of Summarization Evaluation Metrics Using Resampling Methods

Can Large Language Models Serve as Evaluators for Code Summarization?

Interpretation-based Code Summarization.

EditSum: A Retrieve-and-Edit Framework for Source Code Summarization

Re-Examining System-Level Correlations of Automatic Summarization Evaluation Metrics

Code to Comment Translation: A Comparative Study on Model Effectiveness & Errors

EnCoSum: enhanced semantic features for multi-scale multi-modal source code summarization

Contextual Information Enhanced Source Code Summarization

CoCoSum: Contextual Code Summarization with Multi-Relational Graph Neural Network

An Extractive-and-Abstractive Framework for Source Code Summarization.

GA-SCS: Graph-Augmented Source Code Summarization

OpinSummEval: Revisiting Automated Evaluation for Opinion Summarization

CodeSum: Translate Program Language to Natural Language

Impact of Evaluation Methodologies on Code Summarization

Enhancing Trust in LLM-Generated Code Summaries with Calibrated Confidence Scores