Counter-Intuitive Effects of Q-Learning Exploration in a Congestion Dilemma

Cesare Carissimo
DOI: https://doi.org/10.1109/access.2024.3358608
IF: 3.9
2024-02-07
IEEE Access
Abstract: Exploration is an integral part of learning dynamics which allows algorithms to search a space of solutions. When many algorithms explore simultaneously, this can lead to counter-intuitive effects. This paper contributes an analysis of the influence that exploration has on a multi-agent system of Q-learners in a famous congestion dilemma, the Braess paradox. I find ranges of the exploration rate ε for which ε-greedy Q-learners show chaotic and oscillatory dynamics which do not converge, and which yield better than Nash equilibrium results. I decouple the dynamics endogenous to Q-learning from the exogenous exploration rate ε, and find that Q-learners implicitly coordinate at low exploration rates, but that their coordination is disrupted at larger exploration rates. The best implicit coordination leads to a 20% reduction in average travel times, which approaches the social optimum. I discuss how these results may inform multi-agent algorithm design, fit within a cognitive science perspective of cognitive noise during learning, and provide a mechanistic hypothesis for the lack of empirical evidence of the Braess paradox in traffic systems.
computer science, information systems, telecommunications, engineering, electrical & electronic
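The setting described in the abstract can be illustrated with a minimal sketch. It assumes the textbook Braess network (two load-dependent links costing flow/n, two fixed links costing 1, and a free shortcut) and stateless ε-greedy Q-learners. The network normalization, the function names `path_costs`, `average_time` and `simulate`, and the parameters `n`, `alpha`, `steps` and `seed` are all illustrative assumptions, not the paper's actual setup.

```python
import random

def path_costs(counts, n):
    """Travel time of each path, given how many agents chose it.

    Paths (assumed textbook Braess network):
      0 = up    (load-dependent link, then fixed link)
      1 = down  (fixed link, then load-dependent link)
      2 = cross (both load-dependent links via the free shortcut)
    """
    up, down, cross = counts
    first = (up + cross) / n   # load-dependent link shared by up and cross
    last = (down + cross) / n  # load-dependent link shared by down and cross
    return [first + 1.0, 1.0 + last, first + last]

def average_time(counts, n):
    """Population-average travel time for a given split over the paths."""
    costs = path_costs(counts, n)
    return sum(c * k for c, k in zip(counts, costs)) / n

def simulate(n=100, epsilon=0.01, alpha=0.1, steps=2000, seed=0):
    """Run n stateless epsilon-greedy Q-learners; return average time per step.

    Hypothetical parameter choices; reward is the negative travel time.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0, 0.0] for _ in range(n)]
    history = []
    for _ in range(steps):
        actions = []
        for qi in q:
            if rng.random() < epsilon:
                # Explore: pick a path uniformly at random.
                actions.append(rng.randrange(3))
            else:
                # Exploit: pick a highest-valued path, breaking ties randomly.
                best = max(qi)
                actions.append(rng.choice([a for a in range(3) if qi[a] == best]))
        counts = [actions.count(a) for a in range(3)]
        costs = path_costs(counts, n)
        for qi, a in zip(q, actions):
            # Move the chosen path's value toward the observed reward.
            qi[a] += alpha * (-costs[a] - qi[a])
        history.append(average_time(counts, n))
    return history

if __name__ == "__main__":
    n = 100
    print("all agents on the shortcut path:", average_time([0, 0, n], n))
    print("even split, shortcut unused:", average_time([n // 2, n // 2, 0], n))
    tail = simulate(n=n, epsilon=0.01)[-200:]
    print("mean of last 200 steps:", sum(tail) / len(tail))
```

In this normalization, everyone taking the shortcut path (the Nash equilibrium of the textbook network) gives an average time of 2, and an even split that ignores the shortcut gives 1.5. Comparing the tail of `simulate` across several values of `epsilon` against those two reference points is one way to probe the abstract's claim, though this toy setup is not guaranteed to reproduce the paper's results.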