Socially Intelligent Reinforcement Learning for Optimal Automated Vehicle Control in Traffic Scenarios

Hamid Taghavifar, Chongfeng Wei, Leyla Taghavifar
DOI: https://doi.org/10.1109/tase.2023.3347264
IF: 6.636
2024-01-01
IEEE Transactions on Automation Science and Engineering
Abstract: In this paper, a novel approach is presented for modeling the interaction dynamics between an ego car and a bicycle in a traffic scenario using a hybrid reinforcement learning framework combined with a social value orientation (SVO) model. The proposed framework leverages the SARSA algorithm to learn the optimal policy for the ego vehicle while incorporating risk cost as the negative log-likelihood of collision. Additionally, a customized SVO model is introduced to capture the social preferences of the ego car and the bicycle, defining the SVO of each agent as a continuous variable between egoistic and cooperative orientations. Furthermore, a weight parameter is incorporated in the framework to regulate the influence of the SVO model on the learning process. We demonstrate the effectiveness of our approach through extensive simulations, showing that the ego car can balance maximizing its reward against avoiding collisions while considering the social preferences of the agents. The obtained results are compared to other models in the literature, and it is shown that the proposed method contributes to the development of safe and efficient autonomous driving systems that interact with human-driven vehicles in a socially intelligent manner.
Note to Practitioners: The proposed framework is motivated by the pressing challenge of navigation for autonomous cars in complex urban driving scenarios and mixed traffic situations. With the increasing prevalence of autonomous vehicles on roads, developing intelligent navigation systems that can effectively interact with other road users has become essential. Our framework addresses this need by leveraging the SARSA algorithm to learn the optimal policy for the ego vehicle while incorporating risk cost as the negative log-likelihood of collision. Additionally, a customized SVO model is introduced to capture the social preferences of the ego car and the bicycle, defining the SVO of each agent as a continuous variable between egoistic and cooperative orientations. This enables autonomous vehicles to make informed decisions and navigate safely and efficiently. Our framework can benefit the field of autonomous vehicle navigation and contribute to developing safe, human-centric, and reliable transportation systems. The approach has the potential to support a network of autonomous vehicles interacting with multiple road users, thereby enhancing scalability. By leveraging machine learning, our solution provides a robust and adaptable approach that can handle the diverse and ever-changing conditions of urban driving scenarios.
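The learning scheme summarized in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything beyond the standard SARSA update is an assumption made here for illustration, not the authors' implementation: the SVO utility uses the common angle form cos(phi) * own reward + sin(phi) * other's reward; the risk cost is taken as -log(1 - p), the negative log-likelihood of the collision-free outcome, so that it grows with collision probability p; and the weight w is assumed to blend the plain ego reward with the SVO utility. All function names, state labels, and numeric values are hypothetical.

```python
import math
import random
from collections import defaultdict


def svo_utility(r_self, r_other, phi):
    # Common SVO angle form (assumed): phi = 0 is purely egoistic,
    # phi = pi/4 weighs own and other reward equally.
    return math.cos(phi) * r_self + math.sin(phi) * r_other


def risk_cost(p_collision, eps=1e-9):
    # Assumption: negative log-likelihood of the collision-free outcome,
    # clipped with eps to stay finite as p_collision approaches 1.
    return -math.log(max(1.0 - p_collision, eps))


def shaped_reward(r_ego, r_bike, p_collision, phi, w):
    # Assumption: w in [0, 1] regulates how strongly the SVO term
    # replaces the plain ego reward; the risk cost is subtracted.
    blended = (1.0 - w) * r_ego + w * svo_utility(r_ego, r_bike, phi)
    return blended - risk_cost(p_collision)


def epsilon_greedy(Q, state, actions, epsilon, rng):
    # Explore with probability epsilon, otherwise act greedily on Q.
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])


def sarsa_update(Q, s, a, r, s_next, a_next,
                 alpha=0.1, gamma=0.95, terminal=False):
    # Standard on-policy SARSA: the TD target uses the action
    # actually selected in s_next, not the greedy maximum.
    target = r if terminal else r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])
    return Q[(s, a)]


# Hypothetical single transition for an ego car near a bicycle.
Q = defaultdict(float)
rng = random.Random(0)
actions = ["brake", "keep", "accelerate"]
r = shaped_reward(r_ego=1.0, r_bike=0.5, p_collision=0.1,
                  phi=math.pi / 4, w=0.5)
a_next = epsilon_greedy(Q, "s1", actions, 0.1, rng)
sarsa_update(Q, "s0", "keep", r, "s1", a_next)
```

Setting w = 0 in this sketch recovers a purely self-interested learner with a risk penalty, while larger w gives the bicycle's reward more influence on the learned policy.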
automation & control systems