Abstract: In software development, developers frequently perform maintenance activities that change only a few lines of source code in a single commit. A good understanding of the characteristics of such small changes can support quality assurance approaches (e.g., automated program repair): small changes are likely to address deficiencies in other changes, so understanding why they are created can help reveal the types of errors introduced. Ultimately, these reasons and error types can be used to enhance quality assurance approaches and improve code quality. Prior studies used code churn to characterize and investigate small changes, but such a definition has a critical limitation: it loses the information about which tokens changed within a line. For example, it fails to distinguish the following two one-line changes: (1) changing a string literal to fix a displayed message and (2) changing a function call and adding a new parameter. Both are maintenance activities, but we expect that researchers and practitioners are more interested in supporting the latter. To address this limitation, in this paper we define micro commits, a type of small change defined in terms of changed tokens. Our goal is to quantify small changes using changed tokens, which allow us to identify small changes more precisely; in particular, this token-level definition can distinguish the two changes in the above example. As the first empirical study on token-based micro commits, we investigate micro commits in four OSS projects to understand their characteristics. We find that micro commits mainly replace a single name or literal token and are more likely to be used to fix bugs. Additionally, we propose using token-based information to support software engineering approaches whose effectiveness is significantly affected by very small changes.
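To make the contrast between line-based churn and changed tokens concrete, the following is a minimal sketch of a token-level diff for a single modified line. It is not the tooling used in the study: the regex lexer `TOKEN_RE`, the helper names `tokenize` and `changed_tokens`, and the two example statements are all hypothetical, and the abstract does not state the threshold that makes a commit "micro", so none is assumed here.

```python
import re
from difflib import SequenceMatcher

# Hypothetical, deliberately simplistic lexer (an assumption for illustration):
# double-quoted string literals, identifiers, integer literals, and
# single punctuation characters.
TOKEN_RE = re.compile(r'"(?:\\.|[^"\\])*"|[A-Za-z_]\w*|\d+|\S')


def tokenize(line):
    """Split one source line into a list of token strings."""
    return TOKEN_RE.findall(line)


def changed_tokens(old_line, new_line):
    """Return (removed, added) token lists for a one-line modification."""
    old, new = tokenize(old_line), tokenize(new_line)
    removed, added = [], []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed.extend(old[i1:i2])
            added.extend(new[j1:j2])
    return removed, added


# Both edits below have the same line-level churn: one line deleted, one added.
examples = [
    # (1) fix a displayed message by editing a string literal
    ('print("Eror: file missing");', 'print("Error: file missing");'),
    # (2) change the called function and add a parameter
    ('result = compute(x);', 'result = calculate(x, y);'),
]

for old_line, new_line in examples:
    removed, added = changed_tokens(old_line, new_line)
    print(f"removed={removed} added={added}")
```

Under this sketch, the first edit replaces a single string-literal token, while the second replaces a name token and adds two more tokens (a comma and an identifier). A line-based measure reports the same churn for both edits; the token lists tell them apart.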
