Yijun Zhai, Pengzhan Zhou, Yuepeng He, Fang Qu, Zhida Qin, Xianlong Jiao, Guiyan Liu, Songtao Guo
Abstract: The emerging federated learning enables distributed autonomous vehicles to train equipped deep learning models collaboratively without exposing their raw data, providing great potential for utilizing explosively growing autonomous driving data. However, considering the complicated traffic environments and driving scenarios, deploying federated learning for autonomous vehicles is inevitably challenged by non-independent and identically distributed (Non-IID) data of vehicles, which may lead to failed convergence and low training accuracy. In this paper, we propose a novel hierarchically Federated Region-learning framework of Autonomous Vehicles (FedRAV), a two-stage framework, which adaptively divides a large area containing vehicles into sub-regions based on the defined region-wise distance, and achieves personalized vehicular models and regional models. This approach ensures that the personalized vehicular model adopts the beneficial models while discarding the unprofitable ones. We validate our FedRAV framework against existing federated learning algorithms on three real-world autonomous driving datasets in various heterogeneous settings. The experiment results demonstrate that our framework outperforms those known algorithms, and improves the accuracy by at least 3.69%. The source code of FedRAV is available at: https://github.com/yjzhai-cs/FedRAV
### What problems does this paper attempt to solve?
This paper addresses the challenge that non-independent and identically distributed (Non-IID) data poses for autonomous vehicles (AVs) in federated learning (FL). Specifically, the data collected by autonomous vehicles across different regions and driving scenarios is highly heterogeneous, which can cause convergence failure and low accuracy during model training.
#### Main problems:
1. **Non-IID data problem**: Because traffic environments and driving scenarios are complex, the data collected by autonomous vehicles is often Non-IID. For example, data distributions differ widely between highways, urban areas, and rural areas, which makes it difficult for traditional federated learning methods to work effectively.
2. **How to perform adaptive partitioning in large areas**: To handle Non-IID data better, a large area needs to be divided into multiple sub-regions so that the data distributions within each sub-region are more similar. To this end, the paper proposes a partitioning mechanism based on the region-wise distance.
3. **How to personalize models for each autonomous vehicle**: For each autonomous vehicle to adapt to its local driving data, an effective personalization strategy is required. To this end, the paper introduces a hypernetwork that generates personalized mask vectors, which adjust the shared model to the needs of a specific vehicle.
#### Solution overview:
To solve the above problems, the paper proposes FedRAV (Hierarchically Federated Region-learning for Traffic Object Classification of Autonomous Vehicles), a hierarchical federated region-learning framework. Its main features are:
- **Adaptive partitioning mechanism**: Using the defined region-wise distance (RWD), a large area is divided into multiple sub-regions so that the data distributions within each sub-region are more similar.
- **Personalized model training**: A hypernetwork generates a personalized mask vector for each autonomous vehicle, which adjusts the shared model to the needs of that vehicle. Personalized regional models are also generated for each region, so that vehicles can benefit from regional data.
- **Experimental verification**: Experiments on three real-world autonomous driving datasets verify the effectiveness of the FedRAV framework and demonstrate its advantages over existing federated learning algorithms.
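
The mask-based personalization described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the paper's implementation: the hypernetwork is a single random linear layer followed by a sigmoid, the vehicle embeddings are made up, and the mask is applied by element-wise multiplication with a flattened shared parameter vector. The names `make_hypernetwork` and `personalize` are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Logistic function, used here to squash mask entries into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def make_hypernetwork(embed_dim, num_params, seed=0):
    """Build a hypothetical one-layer hypernetwork.

    It maps a vehicle embedding to a soft mask with one entry per
    shared-model parameter. A real hypernetwork would be trained;
    the weights here are random and only illustrate the data flow.
    """
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 0.1) for _ in range(embed_dim)]
               for _ in range(num_params)]

    def hypernetwork(embedding):
        return [sigmoid(sum(w * e for w, e in zip(row, embedding)))
                for row in weights]

    return hypernetwork


def personalize(shared_params, mask):
    """Assumed masking rule: element-wise product of parameters and mask."""
    return [p * m for p, m in zip(shared_params, mask)]


# Flattened shared parameters and two made-up vehicle embeddings.
shared_params = [0.5, -1.2, 0.8, 2.0]
hypernetwork = make_hypernetwork(embed_dim=3, num_params=len(shared_params))

mask_a = hypernetwork([1.0, 0.0, -1.0])
mask_b = hypernetwork([0.0, 1.0, 0.0])

personal_a = personalize(shared_params, mask_a)
personal_b = personalize(shared_params, mask_b)
```

Because each vehicle has its own embedding, each one receives a different mask and therefore a different view of the same shared parameters. How FedRAV actually parameterizes and applies its masks should be checked against the paper and its source code.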
#### Formula summary:
- Region-wise distance (RWD) formula:
\[
\text{RWD}(x, u)=\|V_x - V_u\|_2+\gamma\cdot[\zeta(C_x - C_u)^T W\zeta(C_x - C_u)]^{1/2}
\]
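where the first term is the spatial distance between vehicle $x$ and region center $u$ (with $V_x$ and $V_u$ presumably their position vectors), $\zeta(\cdot)$ denotes the element-wise absolute value, $\gamma$ is a hyperparameter that controls the relative importance of the spatial distance (first term) and the label distance (second term), $W$ is a weight matrix, and $C_x$ and $C_u$ are the relative abundance vectors of vehicle $x$ and region center $u$, respectively.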
- Label distance formula:
\[
d_{\text{label}}(i, j)=[\zeta(C_i - C_j)^T W\zeta(C_i - C_j)]^{1/2}
\]
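
The two formulas above can be evaluated directly, as in the following minimal sketch. It rests on several assumptions that are not stated in this summary: $V$ is treated as a 2-D position, $W$ is taken to be the identity matrix, the example vehicle and region centers are made up, and a vehicle is assigned to the region center with the smallest RWD. The function names are hypothetical.

```python
import math


def label_distance(c_i, c_j, w):
    """Square root of zeta(C_i - C_j)^T W zeta(C_i - C_j).

    zeta is the element-wise absolute value; w is a square weight
    matrix given as a list of rows (assumed positive semi-definite).
    """
    d = [abs(a - b) for a, b in zip(c_i, c_j)]
    quad = sum(d[r] * w[r][s] * d[s]
               for r in range(len(d)) for s in range(len(d)))
    return math.sqrt(quad)


def region_wise_distance(v_x, c_x, v_u, c_u, w, gamma):
    """RWD(x, u): spatial L2 distance plus gamma times the label distance."""
    return math.dist(v_x, v_u) + gamma * label_distance(c_x, c_u, w)


def nearest_region(v_x, c_x, centers, w, gamma):
    """Index of the center with the smallest RWD (assumed assignment rule)."""
    return min(
        range(len(centers)),
        key=lambda k: region_wise_distance(
            v_x, c_x, centers[k][0], centers[k][1], w, gamma),
    )


# Made-up example with two label classes and W = identity.
identity = [[1.0, 0.0], [0.0, 1.0]]
vehicle_pos, vehicle_labels = (0.0, 0.0), (0.9, 0.1)
centers = [
    ((1.0, 0.0), (0.1, 0.9)),  # spatially close, very different labels
    ((2.0, 0.0), (0.9, 0.1)),  # farther away, identical labels
]

by_space_only = nearest_region(vehicle_pos, vehicle_labels, centers,
                               identity, gamma=0.0)
with_labels = nearest_region(vehicle_pos, vehicle_labels, centers,
                             identity, gamma=2.0)
```

With `gamma=0.0` only the spatial term matters and the vehicle joins the closer center. With `gamma=2.0` the label term dominates and the vehicle joins the farther center whose label distribution matches its own, which is the trade-off that $\gamma$ controls.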
Through these methods, the FedRAV framework handles the Non-IID data problem among autonomous vehicles and improves model accuracy and robustness; the abstract reports an accuracy improvement of at least 3.69% over the compared federated learning algorithms.