Beyond Model Adaptation at Test Time: A Survey

Zehao Xiao, Cees G. M. Snoek
2024-11-06
Abstract: Machine learning algorithms have achieved remarkable success across various disciplines, use cases and applications, under the prevailing assumption that training and test samples are drawn from the same distribution. Consequently, these algorithms struggle and become brittle even when samples in the test distribution start to deviate from the ones observed during training. Domain adaptation and domain generalization have been studied extensively as approaches to address distribution shifts across test and train domains, but each has its limitations. Test-time adaptation, a recently emerging learning paradigm, combines the benefits of domain adaptation and domain generalization by training models only on source data and adapting them to target data during test-time inference. In this survey, we provide a comprehensive and systematic review on test-time adaptation, covering more than 400 recent papers. We structure our review by categorizing existing methods into five distinct categories based on what component of the method is adjusted for test-time adaptation: the model, the inference, the normalization, the sample, or the prompt, providing detailed analysis of each. We further discuss the various preparation and adaptation settings for methods within these categories, offering deeper insights into the effective deployment for the evaluation of distribution shifts and their real-world application in understanding images, video and 3D, as well as modalities beyond vision. We close the survey with an outlook on emerging research opportunities for test-time adaptation.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The problem this paper addresses is the poor performance of machine learning models at test time when they encounter distribution shifts. Many machine learning algorithms assume that training and test data come from the same distribution, but in practice this assumption often fails. During inference, a model may face noisy sensor recordings, sudden changes in weather conditions, evolving user needs, or entirely new and unforeseen targets, all of which cause the test distribution to differ from the training distribution. When that happens, performance degrades significantly and the model becomes brittle.

To address this, the paper reviews a recently emerging paradigm called "test-time adaptation". Its goal is to adjust the model during the testing phase so as to reduce the negative impact of the distribution gap between training and test data. Test-time adaptation methods train the model using only source data. During the testing phase, they adjust the model parameters or the representation of the test data using a small amount of unlabeled target data, thereby improving performance and robustness on the specific test samples.

### Main Problem Definition

1. **Distribution Shift**: Test-time adaptation focuses on the distribution shift problem in machine learning, which is expressed as:

   \[ p(x_t, y_t)\neq p(x_s, y_s) \]

   where \(p(x_s, y_s)\) is the source distribution and \(p(x_t, y_t)\) is the target distribution. This mismatch leads to inaccurate predictions when the source-trained model \(f_{\theta_s}\) is applied to data from the target distribution.

2. **Four Common Types of Distribution Shifts**:

   - **Covariate Shifts**: Only the input distribution \(p(x)\) changes, while the distribution of labels given the inputs remains unchanged.

     \[ p(x_t)\neq p(x_s),\quad p(y_t|x_t) = p(y_s|x_s) \]

   - **Label Shifts**: Only the label distribution \(p(y)\) changes, while the distribution of inputs given the labels remains unchanged.

     \[ p(y_t)\neq p(y_s),\quad p(x_t|y_t) = p(x_s|y_s) \]

   - **Concept Shifts**: The input distribution is the same, but the conditional distribution of labels given inputs changes, for example because of noisy labels or different annotation methods.

     \[ p(x_t) = p(x_s),\quad p(y_t|x_t)\neq p(y_s|x_s) \]

   - **Conditional Shifts**: The label distribution remains unchanged, but the distribution of inputs given the labels changes.

     \[ p(y_t) = p(y_s),\quad p(x_t|y_t)\neq p(x_s|y_s) \]

3. **Test-time Adaptation**: Given a labeled source distribution \(S\) and an unlabeled target distribution \(T\), test-time adaptation trains the model \(f_{\theta_s}\) on the source distribution \(S\) only. During the testing phase, it adapts using the source-trained model \(f_{\theta_s}\) and the target data \(x_t\), and then makes predictions on \(x_t\). The adaptation can be carried out online or in batches and does not require a large amount of target data.

### Difference from Related Problems

- **Domain Adaptation**: Narrows the domain gap by accessing both source and target data, but assumes that target data is available during training.
- **Domain Generalization**: Avoids the need for target data during training.
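The test-time adaptation setting defined above can be illustrated with a small, self-contained sketch. It is a toy example, not a method from the survey. It assumes a one-dimensional, two-class problem in which the "model" is a normalization step followed by a fixed threshold. The shift is a hypothetical sensor gain and offset, which keeps \(p(y)\) fixed but changes \(p(x|y)\), a conditional shift under the definitions above. The adaptation step re-estimates the normalization statistics from unlabeled target inputs, a simplified analogue of adjusting the normalization. All function names and constants are illustrative.

```python
import random
import statistics


def sample(n, rng, gain=1.0, offset=0.0):
    """Draw n labeled points: class 0 ~ N(-1, 0.5), class 1 ~ N(+1, 0.5).

    gain and offset model a hypothetical sensor change applied to the inputs.
    """
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)            # balanced labels, so p(y) is unchanged
        x = rng.gauss(2 * y - 1, 0.5)    # class-conditional input
        data.append((gain * x + offset, y))
    return data


def fit_stats(xs):
    """Estimate normalization statistics (mean, std) from inputs only."""
    return statistics.fmean(xs), statistics.pstdev(xs)


def predict(x, mean, std):
    """Toy model: normalize the input, then threshold at zero."""
    return 1 if (x - mean) / std > 0.0 else 0


def accuracy(data, mean, std):
    return sum(predict(x, mean, std) == y for x, y in data) / len(data)


rng = random.Random(0)
source = sample(2000, rng)                         # training distribution
target = sample(2000, rng, gain=2.0, offset=3.0)   # shifted test distribution

# "Training": statistics are fit on source data only.
src_mean, src_std = fit_stats([x for x, _ in source])

# Test-time adaptation: re-estimate statistics from target inputs, no labels used.
tgt_mean, tgt_std = fit_stats([x for x, _ in target])

acc_source = accuracy(source, src_mean, src_std)
acc_no_adapt = accuracy(target, src_mean, src_std)
acc_adapted = accuracy(target, tgt_mean, tgt_std)

print(f"source: {acc_source:.3f}")
print(f"target, no adaptation: {acc_no_adapt:.3f}")
print(f"target, adapted: {acc_adapted:.3f}")
```

The target labels are used only to measure accuracy, never to adapt. The sketch works because the classes are balanced and the shift is affine. Real distribution shifts are rarely this simple, and re-estimating statistics alone would not correct a label shift or a concept shift.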