Sustainability Forecasting for Deep Learning Packages
Junxiao Han,Yunkun Wang,Zhongxin Liu,Lingfeng Bao,Jiakun Liu,David Lo,Shuiguang Deng
DOI: https://doi.org/10.1109/saner60148.2024.00106
2024-01-01
Abstract: Deep Learning (DL) technologies have been widely adopted to tackle various tasks. In this process, software dependencies form a multi-layer DL supply chain (SC), with DL frameworks acting as the root, DL packages acting as the bridge nodes, and downstream DL projects acting as the periphery. However, most Open Source Software (OSS) projects may fail. Given the crucial position of DL packages in the DL SC, we aim to forecast the long-term sustainability of DL packages in order to foster the sustainable development of DL SCs and DL packages. Here, sustained activity is adopted as the main proxy for sustainability, and the sustainability status is classified as “sustainable” or “dormant”. A DL package is considered “sustainable” if it has sustained activity in its last 12 months; otherwise, it is deemed “dormant”. To this end, we propose an approach that begins by obtaining longitudinal features for each DL package in each month. We then develop a model that forecasts the sustainability of DL packages from these longitudinal features, achieving an accuracy of up to 0.81. Subsequently, an interpretable module is developed to identify the determinants (i.e., important features) that impact the sustainability of DL packages. Finally, we generate sustainability trajectories for each DL package to better understand the monthly changes in their sustainability status. Our findings reveal that, for most DL packages, fewer but more centralized developers and balanced collaboration are more likely to help sustain the package. Furthermore, although some DL packages are sustainable, their sustainability trajectories show statistically decreasing trends over time.
Based on the findings, we shed light on the dynamic sustainability of DL packages, highlight future research directions, and provide practical suggestions to DL package maintainers, developers, users, and software engineering researchers.
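
The labeling rule stated in the abstract (a package is “sustainable” if it has sustained activity in its last 12 months, otherwise “dormant”) can be illustrated with a minimal sketch. The abstract does not say how “sustained activity” is measured, so the sketch makes a hypothetical assumption: activity is a monthly commit count, and a month is active if it has at least one commit. The function name `label_sustainability` and the parameter `min_active_months` are illustrative only and are not taken from the paper.

```python
from typing import Sequence


def label_sustainability(
    monthly_commits: Sequence[int],
    window: int = 12,
    min_active_months: int = 12,
) -> str:
    """Label a package from its monthly activity history (oldest month first).

    ASSUMPTION (not from the paper): a month counts as active when it has at
    least one commit, and activity is "sustained" when at least
    `min_active_months` of the last `window` months are active.
    """
    if len(monthly_commits) < window:
        raise ValueError(f"need at least {window} months of history")

    # Only the most recent `window` months decide the label.
    recent = monthly_commits[-window:]
    active_months = sum(1 for commits in recent if commits > 0)

    return "sustainable" if active_months >= min_active_months else "dormant"
```

Lowering `min_active_months` relaxes the hypothetical threshold. For example, `label_sustainability(history, min_active_months=1)` treats any activity at all in the last 12 months as sustained.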