Abstract: We study the fundamental problem of transfer learning where a learning algorithm collects data from some source distribution $P$ but needs to perform well with respect to a different target distribution $Q$. A standard change of measure argument implies that transfer learning happens when the density ratio $dQ/dP$ is bounded. Yet, prior thought-provoking works by Kpotufe and Martinet (COLT, 2018) and Hanneke and Kpotufe (NeurIPS, 2019) demonstrate cases where the ratio $dQ/dP$ is unbounded, but transfer learning is possible.
What problem does this paper attempt to address?
### Problems the paper attempts to solve
This paper addresses a fundamental problem in **transfer learning**: how to transfer effectively when the density ratio \(\frac{dQ}{dP}\) between the source distribution \(P\) and the target distribution \(Q\) is **unbounded**. Classical transfer learning theory usually assumes that the density ratio is bounded, but in many practical scenarios this assumption does not hold. The paper therefore asks whether transfer learning remains feasible under weaker conditions that allow an unbounded density ratio.
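The "standard change of measure argument" mentioned in the abstract can be written out in one line. This is a sketch under stated assumptions: \(Q\) is absolutely continuous with respect to \(P\), and \(\ell \ge 0\) denotes a generic nonnegative loss, a symbol introduced here for illustration rather than taken from the paper.

\[
\mathbb{E}_{x \sim Q}\bigl[\ell(x)\bigr]
= \mathbb{E}_{x \sim P}\!\left[\frac{dQ}{dP}(x)\,\ell(x)\right]
\le \left\|\frac{dQ}{dP}\right\|_{\infty} \mathbb{E}_{x \sim P}\bigl[\ell(x)\bigr].
\]

A small error under \(P\) thus forces a small error under \(Q\) whenever \(\left\|\frac{dQ}{dP}\right\|_{\infty}\) is finite. When the ratio is unbounded the right-hand side is infinite and the bound says nothing, which is the regime the paper targets.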
Specifically, this paper focuses on the transfer learning problem for low-degree polynomial estimators and raises the following core questions:
1. **Possibility of transfer learning under unbounded density ratio**: When traditional methods fail (i.e., \(\frac{dQ}{dP}\) is unbounded), are there other conditions under which transfer learning can still succeed?
2. **Transfer characteristics of low-degree polynomials**: For low-degree polynomial estimators, under what distributional conditions can they be transferred from the source distribution \(P\) to the target distribution \(Q\) with controllable error?
3. **Transfer learning in the discrete domain**: On the Boolean hypercube \(\{-1,1\}^n\) in particular, what are the transfer properties of low-degree Boolean functions, and what role does the maximum influence play?
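To make the notion of maximum influence in the third question concrete, here is a minimal brute-force sketch. It assumes the textbook definition under the uniform measure on \(\{-1,1\}^n\), namely \(\mathrm{Inf}_i(f) = \mathbb{E}_x\bigl[\bigl((f(x) - f(x^{\oplus i}))/2\bigr)^2\bigr]\), where \(x^{\oplus i}\) is \(x\) with coordinate \(i\) flipped. The paper may use a different measure or normalization, and the function names below are hypothetical.

```python
from itertools import product


def influences(f, n):
    """Return [Inf_1(f), ..., Inf_n(f)] for f: {-1,1}^n -> R.

    Assumption: uniform measure on the hypercube and
    Inf_i(f) = E_x[((f(x) - f(x with coordinate i flipped)) / 2)^2].
    Enumerates all 2^n points, so it is only practical for small n.
    """
    points = list(product((-1, 1), repeat=n))
    totals = [0.0] * n
    for x in points:
        fx = f(x)
        for i in range(n):
            # Flip coordinate i and compare the function values.
            flipped = x[:i] + (-x[i],) + x[i + 1:]
            totals[i] += ((fx - f(flipped)) / 2) ** 2
    return [t / len(points) for t in totals]


def max_influence(f, n):
    """Largest single-coordinate influence of f."""
    return max(influences(f, n))


def majority3(x):
    """Majority vote of three +/-1 bits."""
    return 1 if sum(x) > 0 else -1


def dictator(x):
    """Function that depends only on the first coordinate."""
    return x[0]


if __name__ == "__main__":
    print(influences(majority3, 3))
    print(influences(dictator, 3))
```

In this sketch the majority function spreads its dependence evenly across coordinates, while the dictator function concentrates all of it on one coordinate and so has the largest possible maximum influence for a \(\pm 1\)-valued function.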
### Research background and motivation
- **Limitations of traditional transfer learning**: Most existing analyses assume that \(\frac{dQ}{dP}\) is bounded, but this assumption often fails in practice. For example, in truncated regression and other distribution-shift problems, the density ratio may be unbounded.
- **Inspiration from existing work**: Kpotufe and Martinet (COLT, 2018) and Hanneke and Kpotufe (NeurIPS, 2019) show that transfer learning can still be possible in some cases where \(\frac{dQ}{dP}\) is unbounded. These results motivate the present paper.
- **New challenges and opportunities**: As deep learning and neural networks spread, transfer learning is applied more and more widely. Understanding transfer under unbounded density ratios matters in particular for out-of-distribution (OOD) data.
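The truncation case mentioned above shows how an unbounded ratio arises. As an illustrative setup (the symbols \(S\) and \(\alpha\) are introduced here and are assumptions, not the paper's notation), let \(S\) be a measurable set with \(\alpha = Q(S) \in (0,1)\) and let the source \(P\) be \(Q\) conditioned on \(S\):

\[
P(A) = \frac{Q(A \cap S)}{Q(S)}, \qquad
\frac{dP}{dQ}(x) = \frac{\mathbf{1}\{x \in S\}}{\alpha} \le \frac{1}{\alpha}.
\]

The target \(Q\) places mass \(1-\alpha > 0\) outside \(S\), where \(P\) places none, so \(Q\) is not absolutely continuous with respect to \(P\) and \(\frac{dQ}{dP}\) is infinite there. The inverse ratio \(\frac{dP}{dQ}\), by contrast, stays bounded by \(1/\alpha\).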
### Main contributions
1. **Generalized transfer inequality**: The paper proves a general transfer inequality for low-degree polynomial estimators that remains valid under very mild assumptions, going beyond the classical bounded density ratio assumption.
2. **Transferability of low-degree polynomials in Euclidean space**: It is proved that, when the target \(Q\) is log-concave and the source \(P\) is any continuous distribution, the error of a low-degree polynomial under \(Q\) can be controlled by its error under \(P\).
3. **Transferability in the discrete domain**: On the Boolean hypercube \(\{-1,1\}^n\), a bound on the maximum influence of low-degree Boolean functions is identified as a sufficient condition for transferability.
4. **Application examples**: Specific examples, such as truncated regression and distribution shift under the linear attention mechanism, demonstrate the practical value of the proposed transfer inequality.
### Conclusion
Using tools such as the Carbery-Wright anti-concentration inequality and the invariance principle, this paper moves beyond the bounded density ratio assumption of classical transfer learning and provides a theoretical basis for transfer learning under unbounded density ratios. This helps in understanding and improving the performance of machine learning models when the training and test distributions differ.