Meta-learning without data via Wasserstein distributionally-robust model fusion.

Zhenyi Wang,Xiaoyang Wang,Li Shen,Qiuling Suo,Kaiqiang Song,Dong Yu,Yan Shen,Mingchen Gao
2022-01-01
Abstract:Existing meta-learning works assume that each task has available training and testing data. However, there are many available pre-trained models without accessing their training data in practice. We often need a single model to solve different tasks simultaneously as this is much more convenient to deploy the models. Our work aims to meta-learn a model initialization from these pre-trained models without using corresponding training data. We name this challenging problem setting as Data-Free Learning To Learn (DFL2L). We propose a distributionally robust optimization (DRO) framework to learn a black-box model to fuse and compress all the pre-trained models into a single network to address this problem. To encourage good generalization to the unseen new tasks, the proposed DRO framework diversifies the learned task embedding associated with each pre-trained model to cover the diversity in the underlying training task distributions. A model initialization is sampled from the black-box network during meta-testing as the meta learned initialization. Extensive experiments on offline and online DFL2L settings and several real image datasets demonstrate the effectiveness of the proposed methods.
What problem does this paper attempt to address?