FedVisual: Heterogeneity-Aware Model Aggregation for Federated Learning in Visual-Based Vehicular Crowdsensing

Wenjun Zhang, Xiaoli Liu, Ruoyi Zhang, Chao Zhu, Sasu Tarkoma
DOI: https://doi.org/10.1109/jiot.2024.3456751
IF: 10.6
2024-01-01
IEEE Internet of Things Journal
Abstract: With the advancement of assisted and autonomous driving technologies, vehicles are being outfitted with an ever-increasing number of sensors. Among these, visible light sensors, or dash-cameras, produce information-rich visual data. Analyzing this visual data through crowdsensing allows for low-cost and timely perception of urban road conditions, such as identifying dangerous driving behaviors and locating parking spaces. However, uploading such massive visual data to the cloud for centralized processing can strain bandwidth and raise privacy concerns among vehicle owners. Federated learning (FL), in which vehicles serve as both data generators and computing nodes, is a promising way to address these challenges. Nevertheless, urban roads are complex, and vehicles in different locations encounter completely different scenes, resulting in non-independent and identically distributed (non-i.i.d.) data. Additionally, the diversity in dash-cameras and onboard computation resources may lead to differences in the performance of locally trained models. Indiscriminately aggregating local models from all vehicles can degrade the global model's performance. To overcome these challenges, we introduce FedVisual, a model aggregation approach for FL in vehicular visual crowdsensing. FedVisual leverages a deep Q-network (DQN) to select appropriate local models, considering the heterogeneity in visual data content and vehicle specifications. By leveraging historical training experience, an effective model selection strategy can be obtained without complex mathematical modeling. In extensive simulations on our self-collected driving videos, FedVisual reduces model aggregation latency by up to 3.8% while improving the model's performance by up to 3.2% compared to reference works.
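The selective aggregation idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain sample-weighted average (FedAvg-style) over only the selected local models, and it takes the selection as a given boolean mask instead of reproducing the DQN policy. The names `aggregate_selected`, `num_samples`, and `selected` are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Sequence

# A model is represented here as a mapping from parameter name to a flat
# list of floats (an assumption made to keep the sketch dependency-free).
Weights = Dict[str, List[float]]


def aggregate_selected(
    local_models: Sequence[Weights],
    num_samples: Sequence[int],
    selected: Sequence[bool],
) -> Weights:
    """Sample-weighted average of the local models whose mask entry is True.

    In the paper the selection comes from a DQN; here `selected` is simply
    an input, so any selection policy could supply it.
    """
    if not (len(local_models) == len(num_samples) == len(selected)):
        raise ValueError("inputs must have the same length")

    # Keep only the models that the selection mask admits.
    chosen = [(m, n) for m, n, s in zip(local_models, num_samples, selected) if s]
    if not chosen:
        raise ValueError("at least one local model must be selected")

    total = sum(n for _, n in chosen)
    if total <= 0:
        raise ValueError("selected models must hold a positive number of samples")

    # Accumulate each parameter, weighting every model by its share of samples.
    result: Weights = {}
    for name, values in chosen[0][0].items():
        acc = [0.0] * len(values)
        for model, n in chosen:
            for i, v in enumerate(model[name]):
                acc[i] += v * n / total
        result[name] = acc
    return result


if __name__ == "__main__":
    models = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}, {"w": [100.0, 100.0]}]
    # The third vehicle's model is excluded by the mask and has no influence.
    print(aggregate_selected(models, [1, 3, 5], [True, True, False]))
```

Excluding a model through the mask removes both its parameters and its sample count from the average, so a poorly trained local model does not pull the global model toward it.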