Abstract: Recent progress in motion forecasting has been substantially driven by self-supervised pre-training. However, adapting pre-trained models for specific downstream tasks, especially motion prediction, through extensive fine-tuning is often inefficient. This inefficiency arises because motion prediction closely aligns with the masked pre-training tasks, and traditional full fine-tuning methods fail to fully leverage this alignment. To address this, we introduce Forecast-PEFT, a fine-tuning strategy that freezes the majority of the model's parameters, focusing adjustments on newly introduced prompts and adapters. This approach not only preserves the pre-learned representations but also significantly reduces the number of parameters that need retraining, thereby enhancing efficiency. This tailored strategy, supplemented by our method's capability to efficiently adapt to different datasets, enhances model efficiency and ensures robust performance across datasets without the need for extensive retraining. Our experiments show that Forecast-PEFT outperforms traditional full fine-tuning methods in motion prediction tasks, achieving higher accuracy with only 17% of the trainable parameters typically required. Moreover, our comprehensive adaptation, Forecast-FT, further improves prediction performance, evidencing up to a 9.6% enhancement over conventional baseline methods. Code will be available at https://github.com/csjfwang/Forecast-PEFT.

What problem does this paper attempt to address?

This paper addresses the inefficiency of fine-tuning self-supervised pre-trained models for motion prediction in autonomous driving. Existing pre-trained models such as Traj-MAE, SEPT, and Forecast-MAE have made significant progress in trajectory prediction, but adapting them to a specific downstream task usually requires fine-tuning the entire model. Full fine-tuning consumes a large amount of computing resources and may also lead to catastrophic forgetting, that is, losing the knowledge learned during pre-training.

To solve these problems, the authors propose a parameter-efficient fine-tuning strategy named Forecast-PEFT. The method freezes most of the model parameters and adjusts only newly introduced prompts and adapters, which retains the pre-trained representation while greatly reducing the number of parameters that need to be retrained. Forecast-PEFT contains the following three key components:

1. **Contextual Embedding Prompt (CEP)**: \[ T'_E = \text{Encoder}(\text{concat}(T_H, P_{CE}, T_L) + PE) \] These prompts help the encoder better understand the context of the input data during fine-tuning.
2. **Modality-Control Prompt (MCP)**: \[ M'_F = \text{Decoder}(\text{concat}(T'_E, P_{MC}, M_F) + PE) \] This prompt enables the decoder to generate multi-modal future trajectory predictions and adapt to different prediction tasks.
3. **Parallel Adapter (PA)**: \[ \text{Adapter}(f_{input}) = \text{Up}(\text{GeLU}(\text{Down}(f_{input}))) \] Parallel adapters are added to the multi-head self-attention module and the feed-forward network in each Transformer layer to capture local information and enhance the model's adaptability to new data.

In addition, Forecast-PEFT can be fine-tuned flexibly across datasets. The model needs to be pre-trained only once on a large-scale dataset and can then be fine-tuned efficiently on different datasets without maintaining a separate set of weights for each dataset. Experimental results show that Forecast-PEFT performs well on multiple trajectory prediction datasets and achieves higher prediction accuracy with only about 17% of the trainable parameters required by traditional fine-tuning methods.

In summary, Forecast-PEFT targets the inefficient fine-tuning of existing motion prediction models. Through a parameter-efficient fine-tuning strategy, it significantly improves the performance and generalization ability of the model while preserving pre-trained knowledge.
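The prompt concatenation and the adapter formula can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The token shapes, the bottleneck width, the zero-initialised up-projection, the omission of the positional encoding PE, and the way the adapter output is summed with the frozen sub-layer output are all assumptions made for illustration. Reading T_H and T_L as agent-history and lane tokens is also an assumption, and `frozen_sublayer` is a hypothetical stand-in for an attention or feed-forward block.

```python
import numpy as np

rng = np.random.default_rng(0)

def gelu(x):
    # Tanh approximation of GeLU
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

class ParallelAdapter:
    """Adapter(f) = Up(GeLU(Down(f))), run alongside a frozen sub-layer."""

    def __init__(self, dim, bottleneck):
        self.down = rng.normal(0.0, 0.02, size=(dim, bottleneck))
        # Assumption: zero-initialised up-projection, so the adapter starts as a no-op
        self.up = np.zeros((bottleneck, dim))

    def __call__(self, f):
        return gelu(f @ self.down) @ self.up

    def num_params(self):
        return self.down.size + self.up.size

def frozen_sublayer(f, w):
    # Hypothetical stand-in for a frozen attention or feed-forward sub-layer
    return f @ w

def concat_with_prompts(t_h, prompts, t_l):
    # concat(T_H, P_CE, T_L) along the token axis (PE omitted in this sketch)
    return np.concatenate([t_h, prompts, t_l], axis=0)

dim, bottleneck = 128, 16
t_h = rng.normal(size=(10, dim))                    # assumed: 10 history tokens
t_l = rng.normal(size=(30, dim))                    # assumed: 30 lane tokens
p_ce = rng.normal(0.0, 0.02, size=(4, dim))         # trainable prompt tokens
w_frozen = rng.normal(0.0, 0.02, size=(dim, dim))   # frozen pre-trained weights

tokens = concat_with_prompts(t_h, p_ce, t_l)
adapter = ParallelAdapter(dim, bottleneck)
# Assumption: the parallel branch is added to the frozen sub-layer output
out = frozen_sublayer(tokens, w_frozen) + adapter(tokens)

trainable = p_ce.size + adapter.num_params()
total = trainable + w_frozen.size
print(tokens.shape, out.shape, f"trainable fraction: {trainable / total:.3f}")
```

Only `p_ce` and the adapter weights would receive gradient updates in this setup, while `w_frozen` stays fixed. The fraction printed here comes from toy sizes and is not meant to reproduce the 17% figure reported in the paper.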

Forecast-PEFT: Parameter-Efficient Fine-Tuning for Pre-trained Motion Forecasting Models
