Simplifying the Search of npm Packages

Ahmad Abdellatif, Yi Zeng, Mohamed Elshafei, Emad Shihab, Weiyi Shang
DOI: https://doi.org/10.1016/j.infsof.2020.106365
IF: 3.9
2020-10-01
Information and Software Technology
Abstract:<p><strong>Context:</strong> Code reuse, generally done through software packages, allows developers to reduce time-to-market and improve code quality. The npm ecosystem is a Node.js package management system that contains more than 700K Node.js packages. To help developers find high-quality packages that meet their needs, npms developed a search engine that ranks Node.js packages in terms of quality, popularity, and maintenance. However, the current npms ranking mechanism tends to be arbitrary and contains many different equations, which increases complexity and computation. <strong>Objective:</strong> The goal of this paper is to empirically improve the efficiency of npms by simplifying its components without impacting the current npms package ranks. <strong>Method:</strong> We use feature selection methods to simplify npms' equations, removing the features that do not have a significant effect on a package's rank. Then, we study the impact of the simplified npms on the packages' ranks, the amount of resources saved compared to the original npms, and the performance of the simplified npms as npm evolves. <strong>Results:</strong> Our findings indicate that 1) 31% of the unique variables of npms' equation can be removed without breaking the original packages' ranks; 2) the simplified npms, on average, preserves the overlap of the packages by 98% and the ranking of those packages by 97%; 3) using the simplified npms saves 10% of package scoring time and more than 1.47 million network requests on each scoring run; 4) as npm evolves over a period of 12 months, the simplified npms achieves results similar to the original npms. <strong>Conclusion:</strong> Our results show that the simplified npms preserves the original ranks of packages and is more efficient than the original npms. We believe that using our approach helps the npms community speed up the scoring process by saving computational resources and time.</p>
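The abstract reports how well the simplified scoring preserves both the overlap and the ranking of packages, but it does not give the exact measures. As a rough illustration only, the sketch below shows one plausible way to compare two score tables: the share of top-k packages they have in common, and the share of shared package pairs that keep the same relative order. The function names, the sample scores, and the pairwise-agreement measure are all assumptions made for this example, not the paper's definitions.

```python
from typing import Dict, List


def top_k(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k highest-scoring package names, best first (ties broken by name)."""
    return sorted(scores, key=lambda name: (-scores[name], name))[:k]


def overlap_ratio(original: List[str], simplified: List[str]) -> float:
    """Fraction of the original top-k list that also appears in the simplified list."""
    if not original:
        return 0.0
    return len(set(original) & set(simplified)) / len(original)


def pairwise_order_agreement(original: List[str], simplified: List[str]) -> float:
    """Fraction of shared package pairs that keep the same relative order."""
    position = {name: i for i, name in enumerate(simplified)}
    shared = [name for name in original if name in position]
    pairs = 0
    agree = 0
    for i in range(len(shared)):
        for j in range(i + 1, len(shared)):
            pairs += 1
            if position[shared[i]] < position[shared[j]]:
                agree += 1
    # With fewer than two shared packages there is no order to disagree on.
    return agree / pairs if pairs else 1.0


# Hypothetical scores for illustration; real npms scores are not reproduced here.
original_scores = {"pkg-a": 0.90, "pkg-b": 0.80, "pkg-c": 0.70, "pkg-d": 0.60}
simplified_scores = {"pkg-a": 0.88, "pkg-b": 0.70, "pkg-c": 0.75, "pkg-d": 0.20}

original_top = top_k(original_scores, 3)
simplified_top = top_k(simplified_scores, 3)
overlap = overlap_ratio(original_top, simplified_top)
agreement = pairwise_order_agreement(original_top, simplified_top)
```

In this toy data the two top-3 lists contain the same packages, while pkg-b and pkg-c swap places, so the overlap is complete but the order agreement is not. A standard rank correlation such as Kendall's tau could be used in place of the pairwise measure.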
computer science, information systems, software engineering