Identifying Technical Debt and Its Types Across Diverse Software Projects Issues

Karthik Shivashankar, Mili Orucevic, Maren Maritsdatter Kruke, Antonio Martini
2024-08-17
Abstract: Technical Debt (TD) identification in software project issues is crucial for maintaining code quality, reducing long-term maintenance costs, and improving overall project health. This study advances TD classification using transformer-based models, addressing the critical need for accurate and efficient TD identification in large-scale software development. Our methodology employs multiple binary classifiers for TD and its type, combined through ensemble learning, to enhance accuracy and robustness in detecting various forms of TD. We train and evaluate these models on a comprehensive dataset from GitHub Archive Issues (2015-2024), supplemented with industrial data validation. We demonstrate that in-project fine-tuned transformer models significantly outperform task-specific fine-tuned models in TD classification, highlighting the importance of project-specific context in accurate TD identification. Our research also reveals the superiority of specialized binary classifiers over multi-class models for TD and its type identification, enabling more targeted debt resolution strategies. A comparative analysis shows that the smaller DistilRoBERTa model is more effective than larger language models like GPTs for TD classification tasks, especially after fine-tuning, offering insights into efficient model selection for specific TD detection tasks. The study also assesses generalization capabilities using metrics such as MCC, AUC ROC, Recall, and F1 score, focusing on model effectiveness, fine-tuning impact, and relative performance. By validating our approach on out-of-distribution and real-world industrial datasets, we ensure practical applicability, addressing the diverse nature of software projects.
Software Engineering, Artificial Intelligence
What problem does this paper attempt to address?
The problem this paper attempts to solve is: **how to effectively identify and classify technical debt (TD) and its different types in software projects**. Specifically, the paper aims to improve the accuracy and efficiency of technical debt identification by using Transformer-based models (such as DistilRoBERTa), and it compares different classification methods to address the shortcomings of traditional approaches in complex, large-scale software development.

### Main problem decomposition

1. **Challenges in identifying and classifying technical debt**:
   - Technical debt (TD) refers to the cost of additional future work incurred by choosing quick, suboptimal solutions, or by changing requirements and technological progress. If left unmanaged, technical debt increases maintenance costs, reduces code quality, and harms the long-term health of the project.
   - Traditional identification methods (such as manual code review and static analysis tools) are often time-consuming and error-prone, and they struggle with the scale and complexity of modern software systems.
2. **The need for automated technical debt detection**:
   - Automated tracking reduces the manual workload, helps surface technical debt that might otherwise be overlooked, and reports the quantity and type of debt, which supports better-informed decisions and proactive management.
   - Transformer-based deep learning models (such as BERT and GPT) perform well on natural language processing tasks and can understand and generate human-like text, which makes them suitable for interpreting the ambiguous, context-dependent descriptions found in software documents and issue trackers.
3. **Research objectives**:
   - Explore the effectiveness of Transformer-based models in technical debt classification.
   - Compare classification methods (multi-class classification versus ensemble learning over binary classifiers).
   - Evaluate how well the models generalize to unseen data, to ensure reliability in practical application scenarios.
   - Provide a public dataset to promote further research and development.

### Specific research questions (RQs)

- **RQ1**: How effective are Transformer-based models in classifying technical debt issues?
  - **RQ1.1**: Can fine-tuning the technical debt model on specific project data improve its performance?
- **RQ2**: How does the performance of GPT-like large language models (LLMs) differ from that of the DistilRoBERTa model in technical debt classification?
  - **RQ2.1**: Is task-specific fine-tuning of large language models (such as GPT) for technical debt more effective than the fine-tuned DistilRoBERTa?
- **RQ3**: Is an expert ensemble of multiple binary classifiers more effective than multi-class models at classifying different types of issues?
  - **RQ3.1**: How does the performance of GPT-like large language models differ from that of the DistilRoBERTa model across different types of technical debt?

Through these questions, the paper aims to provide more effective tools and methods for automatically identifying and classifying technical debt, thereby improving the maintenance strategies of software projects and enhancing software quality and development efficiency.
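
To make the expert-ensemble idea behind RQ3 more concrete, here is a minimal Python sketch of how several binary classifiers might be combined, together with the Matthews correlation coefficient (MCC) that the abstract lists among its metrics. It is an illustration under stated assumptions, not the authors' implementation: `keyword_scorer` is a hypothetical stand-in for a fine-tuned binary model, and the type names (`code`, `test`, `documentation`), the 0.5 threshold, and the `unspecified` fallback are all assumed for the example.

```python
from math import sqrt
from typing import Callable, Dict, List, Optional

# A binary "expert" maps issue text to a score in [0, 1].
Scorer = Callable[[str], float]


def keyword_scorer(keywords: List[str]) -> Scorer:
    """Hypothetical stand-in for a fine-tuned binary classifier."""
    def score(text: str) -> float:
        lowered = text.lower()
        hits = sum(1 for kw in keywords if kw in lowered)
        return min(1.0, hits / 2)  # two or more keyword hits saturate at 1.0
    return score


def classify(text: str, td_detector: Scorer,
             type_experts: Dict[str, Scorer],
             threshold: float = 0.5) -> Optional[str]:
    """Return None for non-TD issues, otherwise the best-scoring TD type."""
    if td_detector(text) < threshold:
        return None
    scores = {name: expert(text) for name, expert in type_experts.items()}
    best = max(scores, key=scores.get)
    # Assumed fallback: TD was detected but no type expert is confident.
    return best if scores[best] >= threshold else "unspecified"


def mcc(y_true: List[int], y_pred: List[int]) -> float:
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Illustrative experts; the keyword lists are assumptions, not paper data.
td_detector = keyword_scorer(["refactor", "workaround", "hack", "cleanup"])
type_experts = {
    "code": keyword_scorer(["refactor", "duplicate", "hack"]),
    "test": keyword_scorer(["test", "coverage"]),
    "documentation": keyword_scorer(["docs", "readme"]),
}

print(classify("Refactor this hack: duplicate parsing logic",
               td_detector, type_experts))
print(classify("Add dark mode to settings page", td_detector, type_experts))
```

In a real pipeline each scorer would be replaced by the probability output of a separately fine-tuned model. The binary decomposition lets each expert be trained and thresholded independently, which is the property the paper contrasts with a single multi-class model.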