Abstract: TODO comments play an important role in helping developers manage their tasks and communicate with other team members. Developers often introduce TODO comments as a form of technical debt, such as a reminder to add or remove features or a request to optimize an implementation. Each of these notifies developers that a current suboptimal solution needs to be revisited. TODO comments often bring short-term benefits (higher productivity or lower development cost) and indicate that attention must be paid to long-term software quality. Unfortunately, because of limited knowledge or experience, or under time constraints, developers sometimes forget to add TODO comments or are not even aware that an implementation is suboptimal. Missing TODO comments for these suboptimal solutions may hurt software quality and reliability in the long term. It is therefore beneficial to remind developers of suboptimal solutions whenever they change the code. In this work, we refer to this problem as the task of detecting TODO-missed commits, and we propose a novel approach named TDReminder (TODO comment Reminder) to address it. With the help of TDReminder, developers can identify possible TODO-missed commits just in time when submitting a commit. Our approach has two phases: offline training and online inference. In the offline training phase, we first embed the code change and the commit message into contextual vector representations using two separate neural encoders; the model then learns the association between these representations automatically. In the online inference phase, TDReminder leverages the trained model to compute the likelihood of a commit being a TODO-missed commit. We evaluate TDReminder on datasets crawled from 10k popular Python and Java repositories on GitHub, respectively. Our experimental results show that TDReminder outperforms a set of benchmarks by a large margin in TODO-missed commit detection.
Moreover, to better help developers use TDReminder in practice, we have integrated Large Language Models (LLMs) into our approach to provide explainable recommendations. A user study shows that our tool can effectively inform developers not only “when” to add TODOs but also “where” and “what” TODOs should be added, confirming the value of our tool in practice.
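
To make the two-phase pipeline concrete, the following is a minimal sketch and not the TDReminder implementation. It assumes hashed bag-of-token vectors in place of the two neural encoders and a logistic-regression scorer in place of the learned association model. All names (`embed`, `features`, `train`, `todo_missed_likelihood`, `TOY_COMMITS`) and the toy data are hypothetical.

```python
import hashlib
import math

DIM = 64  # hypothetical embedding size, chosen only for illustration


def embed(text, dim=DIM):
    """Stand-in encoder: L2-normalised hashed bag-of-tokens vector."""
    vec = [0.0] * dim
    for tok in text.split():
        idx = int(hashlib.md5(tok.encode("utf-8")).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def features(diff, message):
    """Encode the code change and the commit message separately, then
    join them; the element-wise product exposes their association."""
    d, m = embed(diff), embed(message)
    return d + m + [a * b for a, b in zip(d, m)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=200, lr=0.5):
    """Offline phase: fit a logistic-regression scorer with plain SGD."""
    w, b = [0.0] * (3 * DIM), 0.0
    for _ in range(epochs):
        for diff, message, label in examples:
            x = features(diff, message)
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            g = p - label  # gradient of the log loss w.r.t. the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b


def todo_missed_likelihood(model, diff, message):
    """Online phase: score one incoming commit with the trained model."""
    w, b = model
    x = features(diff, message)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Toy labelled commits (invented): 1 = TODO-missed, 0 = clean.
TOY_COMMITS = [
    ("- raise NotImplementedError + return None", "temporary workaround for parser crash", 1),
    ("+ def area(w, h): return w * h", "add area helper with tests", 0),
]
```

A caller would run `model = train(TOY_COMMITS)` once offline and then call `todo_missed_likelihood(model, diff, message)` at commit time, warning the developer when the score exceeds a chosen threshold. The threshold, like everything else in this sketch, is an assumption rather than a value reported in the paper.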
