API recommendation for machine learning libraries: how far are we?

Moshi Wei,Yuchao Huang,Junjie Wang,Jiho Shin,Nima Shiri Harzevili,Song Wang
DOI: https://doi.org/10.1145/3540250.3549124
2022-11-07
Abstract:Application Programming Interfaces (APIs) are designed to help developers build software more effectively. Recommending the right APIs for specific tasks is gaining increasing attention among researchers and developers. However, most of the existing approaches are mainly evaluated for general programming tasks using statically typed programming languages such as Java. Little is known about their practical effectiveness and usefulness for machine learning (ML) programming tasks with dynamically typed programming languages such as Python, whose paradigms are fundamentally different from general programming tasks. This is of great value considering the increasing popularity of ML and the large number of new questions appearing on question answering websites. In this work, we set out to investigate the effectiveness of existing API recommendation approaches for Python-based ML programming tasks from Stack Overflow (SO). Specifically, we conducted an empirical study of six widely-used Python-based ML libraries using two state-of-the-art API recommendation approaches, i.e., BIKER and DeepAPI. We found that the existing approaches perform poorly for two main reasons: (1) Python-based ML tasks often require significant long API sequences; and (2) there are common API usage patterns in Python-based ML programming tasks that existing approaches cannot handle. Inspired by our findings, we proposed a simple but effective frequent itemset mining-based approach, i.e., FIMAX, to boost API recommendation approaches, i.e., enhance existing API recommendation approaches for Python-based ML programming tasks by leveraging the common API usage information from SO questions. Our evaluation shows that FIMAX improves existing state-of-the-art API recommendation approaches by up to 54.3% and 57.4% in MRR and MAP, respectively. Our user study with 14 developers further demonstrates the practicality of FIMAX for API recommendation.
Computer Science
What problem does this paper attempt to address?