Abstract: Issue resolution and bug-fixing processes are as essential to the development of machine-learning libraries as they are to software development in general, because they keep library functions correct and well optimized. Understanding how these processes work in machine-learning libraries can help developers identify areas for improvement and refine their strategies for resolving issues and fixing bugs. However, detailed studies on this topic are lacking. Therefore, we investigated the effectiveness of issue resolution for bug-fixing processes in six machine-learning libraries: TensorFlow, Keras, Theano, PyTorch, Caffe, and scikit-learn. We addressed seven research questions (RQs) using 16,921 issues extracted from the libraries' GitHub repositories via the GitHub REST API. We analyzed the RQs with several quantitative methods, including correlation, OLS regression, percentage and frequency counts, and heatmaps. Our empirical investigation found the following: (1) The most common categories of issues that arise in machine-learning libraries are bugs, documentation, optimization, crashes, enhancement, new feature requests, build/CI, support, and performance. (2) Effective strategies for addressing these problems include fixing critical bugs, optimizing performance, and improving documentation. (3) These categorized issues are related to testing and runtime and are common among all six machine-learning libraries. (4) Monitoring the total number of comments on issues can provide insights into the duration of the issues. (5) It is crucial to strike a balance between prioritizing critical issues and addressing other issues in a timely manner. We therefore conclude that efficient issue-tracking processes, effective communication, and collaboration are vital for effective issue resolution and bug-fixing processes in machine-learning libraries.
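Finding (4) relates the number of comments on an issue to how long the issue stays open. The following is a minimal sketch of how such a relationship could be checked with a Pearson correlation and a one-variable OLS fit. It is not the authors' analysis code: the four records in ISSUES are made-up illustrative values, and the helper names duration_days and pearson_and_ols are hypothetical. Only the field names comments, created_at, and closed_at follow the issue objects returned by the GitHub REST API.

```python
from datetime import datetime
from math import sqrt

# Made-up records for illustration only (not data from the study).
# Field names follow GitHub REST API issue objects.
ISSUES = [
    {"comments": 0, "created_at": "2023-01-01T00:00:00Z", "closed_at": "2023-01-02T00:00:00Z"},
    {"comments": 2, "created_at": "2023-01-01T00:00:00Z", "closed_at": "2023-01-04T00:00:00Z"},
    {"comments": 4, "created_at": "2023-01-01T00:00:00Z", "closed_at": "2023-01-08T00:00:00Z"},
    {"comments": 6, "created_at": "2023-01-01T00:00:00Z", "closed_at": "2023-01-10T00:00:00Z"},
]

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def duration_days(issue):
    """Return the time between opening and closing an issue, in days."""
    opened = datetime.strptime(issue["created_at"], TIMESTAMP_FORMAT)
    closed = datetime.strptime(issue["closed_at"], TIMESTAMP_FORMAT)
    return (closed - opened).total_seconds() / 86400.0


def pearson_and_ols(xs, ys):
    """Return (Pearson r, OLS slope, OLS intercept) for the model y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


comment_counts = [issue["comments"] for issue in ISSUES]
durations = [duration_days(issue) for issue in ISSUES]
r, slope, intercept = pearson_and_ols(comment_counts, durations)
print(f"r={r:.3f}, slope={slope:.2f} days per comment, intercept={intercept:.2f} days")
```

On real data the issue records would first have to be retrieved and filtered to closed issues, since open issues have no closed_at value.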

"Won't We Fix this Issue?" Qualitative Characterization and Automated Identification of Wontfix Issues on GitHub

The Future Can’t Help Fix the Past: Assessing Program Repair in the Wild

An Empirical Study of Bug Fixing Rate

Predicting Issue Types on GitHub

Software issues report for bug fixing process: An empirical study of machine-learning libraries

Characterizing Issue Management in Runtime Systems

What Do Users Ask in Open-Source AI Repositories? An Empirical Study of GitHub Issues

Automated Defects Detection and Fix in Logging Statement

Towards Understanding Fixes of SonarQube Static Analysis Violations: A Large-Scale Empirical Study

A Quantitative Study of Security Bug Fixes of GitHub Repositories

A Method To Identify And Correct Problematic Software Activity Data: Exploiting Capacity Constraints And Data Redundancies

Empirically Revisiting and Enhancing Automatic Classification of Bug and Non-Bug Issues

On the feasibility of automated prediction of bug and non-bug issues

An Empirical Study of False Negatives and Positives of Static Code Analyzers From the Perspective of Historical Issues

An Empirical Study on Learning Bug-Fixing Patches in the Wild via Neural Machine Translation

Problems with SZZ and Features: An empirical study of the state of practice of defect prediction data collection

Comparative analysis of real bugs in open-source Machine Learning projects -- A Registered Report

Bug Analysis Towards Bug Resolution Time Prediction

How Do Developers Fix Cross-Project Correlated Bugs? A Case Study on the GitHub Scientific Python Ecosystem

Examining the Effects of Developer Familiarity on Bug Fixing

CodeR: Issue Resolving with Multi-Agent and Task Graphs