Forecast-PEFT: Parameter-Efficient Fine-Tuning for Pre-trained Motion Forecasting Models

Jifeng Wang, Kaouther Messaoud, Yuejiang Liu, Juergen Gall, Alexandre Alahi
2024-07-29
Abstract: Recent progress in motion forecasting has been substantially driven by self-supervised pre-training. However, adapting pre-trained models for specific downstream tasks, especially motion prediction, through extensive fine-tuning is often inefficient. This inefficiency arises because motion prediction closely aligns with the masked pre-training tasks, and traditional full fine-tuning methods fail to fully leverage this alignment. To address this, we introduce Forecast-PEFT, a fine-tuning strategy that freezes the majority of the model's parameters, focusing adjustments on newly introduced prompts and adapters. This approach not only preserves the pre-learned representations but also significantly reduces the number of parameters that need retraining, thereby enhancing efficiency. This tailored strategy, supplemented by our method's capability to efficiently adapt to different datasets, enhances model efficiency and ensures robust performance across datasets without the need for extensive retraining. Our experiments show that Forecast-PEFT outperforms traditional full fine-tuning methods in motion prediction tasks, achieving higher accuracy with only 17% of the trainable parameters typically required. Moreover, our comprehensive adaptation, Forecast-FT, further improves prediction performance, evidencing up to a 9.6% enhancement over conventional baseline methods. Code will be available at https://github.com/csjfwang/Forecast-PEFT.
Computer Vision and Pattern Recognition, Artificial Intelligence, Robotics
What problem does this paper attempt to address?
This paper addresses the inefficiency of fine-tuning pre-trained motion forecasting models. In autonomous driving, self-supervised pre-training models (such as Traj-MAE, SEPT, and Forecast-MAE) have made significant progress in trajectory prediction, but adapting them to specific downstream tasks, especially motion prediction, is inefficient. Traditional methods usually fine-tune the entire model, which consumes substantial computing resources and may lead to catastrophic forgetting, that is, loss of the knowledge learned during pre-training.

To solve these problems, the authors propose a parameter-efficient fine-tuning strategy named Forecast-PEFT. The method freezes most of the model parameters and adjusts only newly introduced prompts and adapters, thereby retaining the pre-trained representation while greatly reducing the number of parameters that need to be retrained and improving fine-tuning efficiency. Forecast-PEFT contains three key components:

1. **Contextual Embedding Prompt (CEP)**:
   \[ T'_E = \text{Encoder}(\text{concat}(T_H, P_{CE}, T_L) + PE) \]
   These prompts help the encoder better understand the context information of the input data during fine-tuning.
2. **Modality-Control Prompt (MCP)**:
   \[ M'_F = \text{Decoder}(\text{concat}(T'_E, P_{MC}, M_F) + PE) \]
   This prompt enables the decoder to generate multi-modal future trajectory predictions and adapt to different prediction tasks.
3. **Parallel Adapter (PA)**:
   \[ \text{Adapter}(f_{input}) = \text{Up}(\text{GeLU}(\text{Down}(f_{input}))) \]
   Parallel adapters are added to the multi-head self-attention module and the feed-forward network in each Transformer layer to capture local information and enhance the model's adaptability to new data.

In addition, Forecast-PEFT can be flexibly fine-tuned across datasets. It needs to be pre-trained only once on a large-scale dataset and can then be efficiently fine-tuned on different datasets without maintaining a separate set of weights for each dataset. Experimental results show that Forecast-PEFT performs well on multiple trajectory prediction datasets and achieves higher prediction accuracy with only about 17% of the trainable parameters required by traditional fine-tuning methods.

In summary, Forecast-PEFT aims to solve the problem of inefficient fine-tuning of existing motion prediction models. Through a parameter-efficient fine-tuning strategy, it significantly improves the performance and generalization ability of the model while maintaining pre-trained knowledge.
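The parallel adapter formula and the prompt concatenation described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation. The helper names (`make_adapter`, `parallel_block`, `with_prompts`), the zero initialisation of the up-projection, the omission of biases, and the choice to add the adapter output to the frozen sub-layer output are all assumptions made for the sketch.

```python
import math
import random


def gelu(x):
    """Exact GeLU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def linear(weights, vec):
    """Apply a weight matrix (nested lists, rows x cols) to a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def make_adapter(dim, bottleneck, rng):
    """Create Down (bottleneck x dim) and Up (dim x bottleneck) matrices.

    Assumption: Up starts at zero so the adapter initially contributes nothing.
    """
    down = [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(bottleneck)]
    up = [[0.0] * bottleneck for _ in range(dim)]
    return down, up


def adapter(f_input, down, up):
    """Adapter(f_input) = Up(GeLU(Down(f_input))), biases omitted."""
    hidden = [gelu(h) for h in linear(down, f_input)]
    return linear(up, hidden)


def parallel_block(f_input, frozen_sublayer, down, up):
    """Assumed parallel wiring: frozen sub-layer output plus adapter output."""
    frozen_out = frozen_sublayer(f_input)
    delta = adapter(f_input, down, up)
    return [a + b for a, b in zip(frozen_out, delta)]


def with_prompts(first_tokens, prompt_tokens, last_tokens):
    """concat(T_H, P_CE, T_L): insert learnable prompt tokens between two groups."""
    return first_tokens + prompt_tokens + last_tokens


rng = random.Random(0)
dim, bottleneck = 8, 2
down, up = make_adapter(dim, bottleneck, rng)

x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def frozen(v):
    """Stand-in for a frozen attention or feed-forward sub-layer."""
    return [2.0 * t for t in v]


y = parallel_block(x, frozen, down, up)

# Hypothetical token groups: 3 history tokens, 2 prompt tokens, 4 lane tokens.
history = [[0.1] * dim for _ in range(3)]
prompts = [[0.0] * dim for _ in range(2)]
lanes = [[0.2] * dim for _ in range(4)]
sequence = with_prompts(history, prompts, lanes)

# Only the adapter matrices and prompt tokens would be trained in this sketch.
trainable = 2 * dim * bottleneck + len(prompts) * dim
print(len(sequence), trainable)
```

With a zero-initialised up-projection, `parallel_block` initially returns exactly the frozen sub-layer's output, so fine-tuning starts from the pre-trained behaviour. Only the small `down` and `up` matrices and the prompt tokens would need gradients.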