Abstract: Researchers and practitioners increasingly apply pre-trained models directly to their specific tasks. For example, researchers in software engineering (SE) have successfully used pre-trained language models to automatically generate source code and comments. However, benchmark datasets differ in domain, and data-driven (machine learning based) models trained on one benchmark may not perform well on others. Reusing pre-trained models therefore incurs substantial cost, along with the additional problem of checking whether an arbitrary pre-trained model is suitable for a specific task. In SE, code contracts help engineers maximize the reuse of existing software components and services. By analogy, the practice of software reuse could be extended to pre-trained models. Following the guidance that model cards and FactSheets give suppliers of pre-trained models on what information to publish, we propose model contracts, comprising the pre- and post-conditions of pre-trained models, to enable better model reuse. Moreover, although many pre-trained models are readily available in model repositories, many non-trivial and challenging issues around their reuse have not been fully investigated. Based on our model contract, we conduct an exploratory study of 1908 pre-trained models from six mainstream model repositories (TensorFlow Hub, PyTorch Hub, Model Zoo, Wolfram Neural Net Repository, Nvidia, and Hugging Face) to investigate the gap between the necessary pre- and post-condition information and the actual specifications.
Our results show that (1) the model repositories tend to provide confusing information about the pre-trained models, especially regarding the task type, model, and training set, and (2) the model repositories do not provide all of our proposed pre/post-condition information, especially the intended use, limitations, performance, and quantitative analysis. Based on these findings, we suggest that (1) developers of model repositories provide the necessary options (e.g., training dataset, model algorithm, and performance measures) for each pre/post-condition of pre-trained models in each task type, (2) future researchers and practitioners develop more efficient metrics for recommending suitable pre-trained models, and (3) suppliers of pre-trained models report their models in strict accordance with our proposed pre/post-conditions and with the characteristics of each condition already reported in the model repositories.
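
To make the idea of checking a repository entry against a model contract concrete, the following is a minimal sketch, not the authors' implementation. The class `ModelContract`, the function `missing_fields`, and the flat field layout are hypothetical; the field names are simply the information items named in the abstract, and the abstract does not say which of them are pre-conditions and which are post-conditions, so no such split is encoded here.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ModelContract:
    """Hypothetical record of the information items named in the abstract."""
    task_type: Optional[str] = None
    model: Optional[str] = None
    training_set: Optional[str] = None
    intended_use: Optional[str] = None
    limitation: Optional[str] = None
    performance: Optional[str] = None
    quantitative_analysis: Optional[str] = None


def missing_fields(contract: ModelContract) -> List[str]:
    """Return the names of items that are absent or blank, in declaration order."""
    missing = []
    for f in fields(contract):
        value = getattr(contract, f.name)
        if value is None or not value.strip():
            missing.append(f.name)
    return missing


# Illustrative entry; the values are made up, not taken from any repository.
entry = ModelContract(
    task_type="text classification",
    model="transformer encoder",
    training_set="",
)
print(missing_fields(entry))
```

A check of this kind could be run over a repository listing to count how often each item is unreported, which is the sort of gap the study describes.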

ZhiJian: A Unifying and Rapidly Deployable Toolbox for Pre-trained Model Reuse

Model Reuse with Domain Knowledge

An Empirical Study of Pre-Trained Model Reuse in the Hugging Face Deep Learning Model Registry

Towards Robust Model Reuse in the Presence of Latent Domains

Deep Learning for Fixed Model Reuse

Model Reuse with Reduced Kernel Mean Embedding Specification

Deep Learning Model Reuse in the HuggingFace Community: Challenges, Benefit and Trends

Reusing Pretrained Models by Multi-linear Operators for Efficient Training

Nonlinear Multi-Model Reuse

Improving Large Models with Small models: Lower Costs and Better Performance

Towards Efficient Task-Driven Model Reprogramming with Foundation Models

DyRep: Bootstrapping Training with Dynamic Re-parameterization

Rapid Performance Gain Through Active Model Reuse

Model Uncertainty-Aware Knowledge Amalgamation for Pre-Trained Language Models

What is the Intended Usage Context of This Model? An Exploratory Study of Pre-Trained Models on Various Model Repositories

Reusing Deep Neural Network Models through Model Re-engineering

From Mimicking to Integrating: Knowledge Integration for Pre-Trained Language Models

Data Shunt: Collaboration of Small and Large Models for Lower Costs and Better Performance

Modal Consistency Based Pre-Trained Multi-Model Reuse

ModelPS: An Interactive and Collaborative Platform for Editing Pre-trained Models at Scale

Jack and Masters of all Trades: One-Pass Learning Sets of Model Sets From Large Pre-Trained Models