The JavaScript Package Selection Task: A Comparative Experiment Using an LLM-based Approach

Andres Diaz Pace, Antonela Tommasel, Hernan Ceferino Vazquez
DOI: https://doi.org/10.19153/cleiej.27.2.4
2024-07-21
CLEI electronic journal
Abstract: When developing JavaScript (JS) applications, assessing and selecting JS packages is challenging for developers because of the growing number of technology options available. Given a technology-related task, a common developer strategy is to query Web repositories (e.g., GitHub) via a search engine (e.g., NPM, Google) and then shortlist candidate JS packages. However, this search might return a long list of results, not all of which are relevant. Thus, the results often need to be (re-)ordered according to the developer’s criteria. To address these problems, in prior work we developed a recommender system called AIDT that assists developers in the package selection task. AIDT relies on meta-search and machine learning techniques to infer the relevant packages for a query. An initial evaluation of AIDT showed good search effectiveness, but the tool was unable to explain its choices to the developer. Research on Large Language Models (LLMs) has recently opened new opportunities for this kind of recommender system. Nevertheless, human developers should still judge whether the recommendations (e.g., JS packages) of these tools (either AIDT or LLMs) are fit for purpose. In this paper, we propose a Retrieval Augmented Generation (RAG) architecture for using LLMs in the domain of technology selection, which enhances the original AIDT design. Furthermore, we report on a user study using both AIDT and different LLM-based variants (ChatGPT, Cohere, Llama2) on a sample of JS-related queries, in which we compared their results and validated them against developers’ criteria for the task. Our findings show that, although the ranking capabilities of LLMs are not yet on par with AIDT or human efforts, the RAG architecture can achieve decent performance and is good at providing explanations for the package choices in the rankings. This ability to explain its choices makes it more transparent than AIDT and, thus, potentially more flexible in supporting developers’ tasks.