Automated Detection of Algorithm Debt in Deep Learning Frameworks: An Empirical Study

Emmanuel Iko-Ojo Simon, Chirath Hettiarachchi, Alex Potanin, Hanna Suominen, Fatemeh Fard
2024-08-22
Abstract: Context: Previous studies demonstrate that Machine or Deep Learning (ML/DL) models can detect Technical Debt (TD) from source code comments, known as Self-Admitted Technical Debt (SATD). Despite the importance of ML/DL in software development, few studies focus on automated detection of a new SATD type: Algorithm Debt (AD). AD detection is important because it helps to identify TD early, facilitating research and learning, and preventing the accumulation of issues related to model degradation and lack of scalability. Aim: Our goal is to improve the AD detection performance of various ML/DL models. Method: We will perform empirical studies using the following approaches: TF-IDF, Count Vectorizer, Hash Vectorizer, and TD-indicative words, to identify features that improve AD detection with ML/DL classifiers under different data featurisations. We will use an existing dataset curated from seven DL frameworks in which comments were manually classified as AD, Compatibility, Defect, Design, Documentation, Requirement, and Test Debt. We will explore various word embedding methods to further enrich features for ML models. These embeddings will come from DL-based models such as ROBERTA and ALBERTv2, and from large language models (LLMs): INSTRUCTOR and VOYAGE AI. We will enrich the dataset by incorporating AD-related terms, then train various ML/DL classifiers: Support Vector Machine, Logistic Regression, Random Forest, ROBERTA, and ALBERTv2.
Software Engineering
What problem does this paper attempt to address?
The problem this paper attempts to address is improving the performance of automatic detection of Algorithm Debt (AD) within deep learning frameworks. Specifically, the paper aims to enhance the performance of various machine learning (ML) and deep learning (DL) models in detecting algorithm debt through empirical research. The paper points out that although existing research has demonstrated that ML/DL models can automatically detect technical debt (TD) from source code comments, there is still limited research on the automated detection of a new type of technical debt: algorithm debt (AD). Therefore, this study aims to explore different feature extraction methods and ML/DL models to improve the performance of AD detection.

### Main Issues:
1. **Improving AD Detection Performance**: How to improve the performance of AD detection through different feature extraction methods and ML/DL models?
2. **Performance Comparison of Different Models**: Which ML/DL models perform best in detecting AD?

### Research Background:
- **Technical Debt (TD)**: TD refers to compromises made during software development for short-term benefits, which may increase maintenance costs in the future.
- **Algorithm Debt (AD)**: AD specifically refers to suboptimal implementations of algorithm logic, which may lead to system performance degradation, model deterioration, and lack of scalability.
- **Existing Research**: While existing research has shown that ML/DL models can automatically detect technical debt in traditional software, there is relatively little research on the automated detection of AD, especially in the field of deep learning.

### Research Objectives:
- **Improving AD Detection Performance**: Through empirical research, explore different feature extraction methods and ML/DL models to improve the accuracy of AD detection.
- **Evaluating the Performance of Different Models**: Compare the performance of different ML/DL models in detecting AD to identify the most effective model.

### Methods:
- **Dataset**: Use the dataset compiled by Liu et al., which contains manually classified SATD (self-admitted technical debt) comments from seven deep learning frameworks.
- **Feature Extraction Methods**: Include TF-IDF, Count Vectorizer, Hash Vectorizer, and AD indicator words.
- **ML/DL Models**: Include SVM, Logistic Regression, Random Forest, ROBERTA, and ALBERTv2.
- **Data Augmentation**: Enrich the dataset by adding terms and definitions related to AD to provide more contextual information.
- **Model Training and Tuning**: Use Grid Search CV for parameter optimization and 10-fold cross-validation for model training and validation.
- **Performance Evaluation**: Evaluate model performance using metrics such as accuracy, recall, F1 score, and conduct statistical significance tests.

### Expected Contributions:
- **Improving AD Detection Performance**: Improve the accuracy of AD detection by exploring different feature extraction methods and ML/DL models.
- **Promoting AD Research**: Provide a foundation for understanding the characteristics and impacts of AD in deep learning systems, encouraging further research.
- **Practical Tools**: Provide developers with an automated tool to help early identification and management of AD, thereby improving system performance and maintainability.
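
### Illustrative Sketch:
The feature extraction and tuning steps listed under Methods could look roughly like the following scikit-learn sketch. It is not the authors' implementation. It assumes a simplified binary task (AD versus any other debt type, whereas the dataset has seven classes), uses Logistic Regression as the only classifier, and searches an arbitrary grid over `C`. The toy comments and the `evaluate_featurisers` helper are hypothetical, and the demo uses 3 folds only because the toy data is too small for the 10 folds named in the plan.

```python
from sklearn.feature_extraction.text import (
    CountVectorizer,
    HashingVectorizer,
    TfidfVectorizer,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline

# Hypothetical toy comments; the study itself uses SATD comments
# from seven DL frameworks. Label 1 = Algorithm Debt, 0 = other debt.
comments = [
    "TODO: this attention implementation is O(n^2), use a faster algorithm",
    "FIXME: naive convolution loop is slow, replace with an optimized kernel",
    "HACK: approximate the gradient here, the exact algorithm is too slow",
    "TODO: sorting is inefficient for large tensors, use a better algorithm",
    "XXX: suboptimal reduction algorithm, does not scale to many devices",
    "TODO: replace brute force search with a more efficient algorithm",
    "TODO: add unit tests for this module",
    "FIXME: update the docstring to describe the new argument",
    "TODO: remove this workaround once python 2 support is dropped",
    "HACK: skip this test on windows until the build is fixed",
    "TODO: document the expected input shape",
    "FIXME: this crashes when the config file is missing",
]
labels = [1] * 6 + [0] * 6


def evaluate_featurisers(comments, labels, n_splits=10):
    """Grid-search one classifier per featuriser and return
    {featuriser name: (best C, mean cross-validated F1)}."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    featurisers = {
        "tfidf": TfidfVectorizer(),
        "count": CountVectorizer(),
        # Non-negative hashed counts; the feature size is an arbitrary choice.
        "hash": HashingVectorizer(n_features=2**12, alternate_sign=False),
    }
    results = {}
    for name, vectorizer in featurisers.items():
        pipeline = Pipeline(
            [
                ("features", vectorizer),
                ("clf", LogisticRegression(max_iter=1000)),
            ]
        )
        # Assumed grid: only the regularisation strength is tuned here.
        search = GridSearchCV(
            pipeline,
            param_grid={"clf__C": [0.1, 1.0, 10.0]},
            scoring="f1",
            cv=cv,
        )
        search.fit(comments, labels)
        results[name] = (search.best_params_["clf__C"], search.best_score_)
    return results


# Demo on the toy data with 3 folds instead of 10.
for name, (best_c, f1) in evaluate_featurisers(comments, labels, n_splits=3).items():
    print(f"{name}: best C={best_c}, mean F1={f1:.2f}")
```

Placing the vectorizer inside the `Pipeline` means each cross-validation fold fits its vocabulary on the training split only, which avoids leaking test-fold terms into the features. Scores on twelve toy comments are not meaningful and say nothing about the study's results.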