Abstract: Commit messages summarize the code changes of each commit in natural language, helping developers understand those changes without digging into implementation details and playing an essential role in comprehending software evolution. To reduce the human effort of writing commit messages, researchers have proposed various automated generation techniques, including template-based, information-retrieval-based, and learning-based techniques. Although promising, previous techniques have limited effectiveness because they represent code changes at a coarse granularity. This work proposes a novel commit message generation technique, FIRA, which first represents code changes via fine-grained graphs and then learns to generate commit messages automatically. Unlike previous techniques, FIRA's graphs explicitly describe the code edit operations between the old version and the new version, as well as code tokens at different granularities (i.e., sub-tokens and integral tokens). Based on this graph-based representation, FIRA generates commit messages with a generation model that consists of a graph-neural-network-based encoder and a transformer-based decoder. To make both sub-tokens and integral tokens available for commit message generation, the decoder incorporates a novel dual copy mechanism. We perform an extensive study to evaluate the effectiveness of FIRA. Our quantitative results show that FIRA outperforms state-of-the-art techniques in terms of BLEU, ROUGE-L, and METEOR, and our ablation analysis shows that the major components of our technique each contribute positively to its effectiveness. In addition, we perform a human study to evaluate the quality of the generated commit messages from the perspective of developers, and the results consistently show that FIRA is more effective than the compared techniques.
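To make the representation described in the abstract more concrete, the following is a minimal illustrative sketch, not FIRA's actual implementation. It assumes a code change is given as two already-tokenized sequences, derives edit operations with Python's `difflib`, and links each changed integral token to its sub-tokens. The function names (`split_sub_tokens`, `edit_operations`, `build_change_graph`), the node and edge labels, and the restriction to changed tokens only are all assumptions made for illustration.

```python
import difflib
import re


def split_sub_tokens(token: str) -> list:
    """Split an integral token into lower-cased sub-tokens (camelCase, snake_case)."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", token)
    # Tokens without letters or digits (e.g. punctuation) are kept whole.
    return [p.lower() for p in parts] or [token]


def edit_operations(old_tokens, new_tokens):
    """Return (operation, old_span, new_span) triples, skipping unchanged spans."""
    matcher = difflib.SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)
    return [
        (tag, old_tokens[i1:i2], new_tokens[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def build_change_graph(old_tokens, new_tokens):
    """Build a toy graph: edit nodes -> integral tokens -> sub-tokens.

    Hypothetical simplification: only changed tokens become nodes, and the
    graph is a plain (nodes, edges) pair rather than a learned representation.
    """
    nodes, edges = [], []

    def add_node(kind, label):
        nodes.append((kind, label))
        return len(nodes) - 1

    def add_token(version, token):
        tok_id = add_node(f"{version}_token", token)
        for sub in split_sub_tokens(token):
            edges.append((tok_id, add_node("sub_token", sub), "has_sub_token"))
        return tok_id

    for tag, old_span, new_span in edit_operations(old_tokens, new_tokens):
        op_id = add_node("edit", tag)
        for tok in old_span:
            edges.append((op_id, add_token("old", tok), "edits_old"))
        for tok in new_span:
            edges.append((op_id, add_token("new", tok), "edits_new"))
    return nodes, edges


if __name__ == "__main__":
    old = ["return", "userName", ";"]
    new = ["return", "user_id", ";"]
    graph_nodes, graph_edges = build_change_graph(old, new)
    for node in graph_nodes:
        print(node)
```

In this sketch, both the integral token (`userName`) and its sub-tokens (`user`, `name`) appear as separate nodes, which mirrors the idea that a decoder could copy from either granularity; how FIRA's encoder and dual copy mechanism actually consume such a graph is not specified here.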

Combining Code Context and Fine-grained Code Difference for Commit Message Generation

Neural-machine-translation-based Commit Message Generation: How Far Are We?

Commit Message Generation for Source Code Changes

Context-aware Retrieval-based Deep Commit Message Generation

Automatically Generating Commit Messages from Diffs using Neural Machine Translation

CoreGen: Contextualized Code Representation Learning for Commit Message Generation

A Sketch-Based Neural Model for Generating Commit Messages from Diffs

Revisiting Learning-based Commit Message Generation

RAG-Enhanced Commit Message Generation

A large-scale empirical study of commit message generation: models, datasets and evaluation

Automated Commit Message Generation with Large Language Models: An Empirical Study and Beyond

COLARE: Commit Classification Via Fine-grained Context-aware Representation of Code Changes

COMET: Generating Commit Messages using Delta Graph Context Representation

RACE: Retrieval-Augmented Commit Message Generation

Understanding Code Change with Micro-Changes

Delving into Commit-Issue Correlation to Enhance Commit Message Generation Models

On the Evaluation of Commit Message Generation Models: An Experimental Study

From Code to Natural Language: Type-Aware Sketch-Based Seq2Seq Learning

Jointly Learning to Repair Code and Generate Commit Message

FIRA: Fine-Grained Graph-Based Code Change Representation for Automated Commit Message Generation

Mining version control system for automatically generating commit comment