Understanding the Transferability of Representations via Task-Relatedness

Akshay Mehra, Yunbei Zhang, Jihun Hamm
2024-10-29
Abstract: The growing popularity of transfer learning, due to the availability of models pre-trained on vast amounts of data, makes it imperative to understand when the knowledge of these pre-trained models can be transferred to obtain high-performing models on downstream target tasks. However, the exact conditions under which transfer learning succeeds in a cross-domain cross-task setting are still poorly understood. To bridge this gap, we propose a novel analysis that analyzes the transferability of the representations of pre-trained models to downstream tasks in terms of their relatedness to a given reference task. Our analysis leads to an upper bound on transferability in terms of task-relatedness, quantified using the difference between the class priors, label sets, and features of the two tasks. Our experiments using state-of-the-art pre-trained models show the effectiveness of task-relatedness in explaining transferability on various vision and language tasks. The efficient computability of task-relatedness even without labels of the target task and its high correlation with the model's accuracy after end-to-end fine-tuning on the target task makes it a useful metric for transferability estimation. Our empirical results of using task-relatedness to select the best pre-trained model from a model zoo for a target task highlight its utility for practical problems.
Machine Learning
What problem does this paper attempt to address?
This paper asks when the representations of pre-trained models can be transferred successfully to downstream tasks, yielding high-performing models in cross-domain, cross-task settings. The conditions under which such transfer succeeds are still poorly understood, and the paper aims to close that gap. To do so, the authors analyze the transferability of pre-trained representations to a downstream task through its task-relatedness to a reference task, and they derive an upper bound on transferability. The bound is quantified by the differences between the two tasks' class priors, label sets, and feature distributions.

### Main Contributions

1. **Rigorous Transferability Analysis**: The paper presents, reportedly for the first time in cross-domain, cross-task settings, a rigorous analysis of transferability for classification tasks in terms of task-relatedness, and it derives an upper bound on transferability.
2. **Efficient Transferability Computation**: The authors formulate an optimization problem for computing task-relatedness efficiently. The resulting value predicts performance after end-to-end fine-tuning even when the target task's labels are unavailable.
3. **Empirical Analysis**: Experiments with state-of-the-art pre-trained models on computer vision and natural language processing tasks show that task-relatedness accurately predicts transferability, and that improving transferability on related reference tasks improves transferability to unseen tasks.

### Method Overview

The paper maps the distribution of the reference task onto that of the target task through a series of transformations:

- **Class-Prior Transformation**: Adjusts the class-prior distribution of the reference task to bring it closer to that of the target task.
- **Label Transformation**: Matches the label set of the reference task to that of the target task.
- **Feature Transformation**: Adjusts the feature space so that the feature distribution of the reference task more closely resembles that of the target task.

### Theoretical Results

- **Theorem 1**: Gives an upper bound on the classifier loss after the class-prior and label transformations, consisting of the re-weighted reference loss and a label-mismatch term.
- **Theorem 2**: Bounds the loss gap caused by distribution mismatch, which is quantified by the Wasserstein distance.
- **Theorem 3**: Combines the two theorems above into the final upper bound on transferability, which has three parts: the re-weighted reference loss, the label-mismatch term, and the distribution-mismatch term.

### Experimental Verification

- **Relationship between Task-Relatedness and Actual Transferability**: Task-relatedness correlates highly with actual transferability and with the classifier's accuracy after end-to-end fine-tuning.
- **Influence of the Reference Task**: Choosing an appropriate reference task can significantly improve task-relatedness and, in turn, transferability.

### Conclusion

Through theoretical analysis and experiments, the paper shows that task-relatedness is an effective metric for predicting how well pre-trained models transfer to downstream tasks. The result is useful both for selecting the best pre-trained model for a target task and for understanding knowledge transfer in cross-domain, cross-task settings.
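
To make the three-part structure of the bound in Theorem 3 more concrete, the following is a minimal Python sketch under stated assumptions. It is not the authors' implementation. All function names are hypothetical, the label-mismatch term is taken as a given number rather than computed, the Wasserstein distance is approximated for one-dimensional, equal-size samples only, and any constants that appear in the paper's actual bound are omitted.

```python
import numpy as np


def reweighted_reference_loss(per_sample_loss, labels, ref_prior, new_prior):
    """Re-weight reference losses as if classes followed `new_prior`.

    Each reference sample with label y gets importance weight
    new_prior[y] / ref_prior[y] (an illustrative class-prior transformation).
    """
    per_sample_loss = np.asarray(per_sample_loss, dtype=float)
    labels = np.asarray(labels, dtype=int)
    ref_prior = np.asarray(ref_prior, dtype=float)
    new_prior = np.asarray(new_prior, dtype=float)
    weights = new_prior[labels] / ref_prior[labels]
    return float(np.mean(weights * per_sample_loss))


def wasserstein_1d(x, y):
    """1-Wasserstein distance between two equal-size 1-D samples.

    For 1-D empirical distributions with the same number of points,
    the optimal coupling matches the sorted samples in order.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equal-size samples")
    return float(np.mean(np.abs(x - y)))


def bound_sketch(reweighted_loss, label_mismatch, distribution_mismatch):
    """Sum of the three terms named in the summary of Theorem 3.

    Assumption: a plain sum; the paper's exact constants are not reproduced.
    """
    return reweighted_loss + label_mismatch + distribution_mismatch


# Toy inputs, chosen only to exercise the functions (not experimental data).
toy_loss = reweighted_reference_loss(
    per_sample_loss=[1.0, 1.0, 3.0, 3.0],
    labels=[0, 0, 1, 1],
    ref_prior=[0.5, 0.5],
    new_prior=[0.25, 0.75],
)
toy_shift = wasserstein_1d([0.0, 1.0, 2.0], [1.0, 2.0, 3.0])
toy_bound = bound_sketch(toy_loss, label_mismatch=0.1, distribution_mismatch=toy_shift)
```

In this toy setting, shifting prior mass toward the class with the higher loss raises the re-weighted loss, and a larger feature shift raises the distribution-mismatch term. Both effects loosen the bound, which matches the intuition that less related tasks transfer less well.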