All by Myself: Learning Individualized Competitive Behaviour with a Contrastive Reinforcement Learning optimization

Pablo Barros, Alessandra Sciutti
DOI: https://doi.org/10.1016/j.neunet.2022.03.013
2023-10-02
Abstract: In a competitive game scenario, a set of agents have to learn decisions that maximize their goals and minimize their adversaries' goals at the same time. Besides dealing with the increased dynamics of the scenarios due to the opponents' actions, they usually have to understand how to overcome the opponent's strategies. Most of the common solutions, usually based on continual learning or centralized multi-agent experiences, however, do not allow the development of personalized strategies to face individual opponents. In this paper, we propose a novel model composed of three neural layers that learn a representation of a competitive game, learn how to map the strategy of specific opponents, and how to disrupt them. The entire model is trained online, using a composed loss based on a contrastive optimization, to learn competitive and multiplayer games. We evaluate our model on a Pokémon duel scenario and the four-player competitive Chef's Hat card game. Our experiments demonstrate that our model achieves better performance when playing against offline, online, and competitive-specific models, in particular when playing against the same opponent multiple times. We also present a discussion on the impact of our model, in particular on how well it deals with specific strategy learning for each of the two scenarios.
Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
This paper addresses how agents (players) in competitive games can learn decisions that maximize their own goals while minimizing those of their opponents. Most existing solutions, such as continual learning or centralized multi-agent experience, struggle to develop personalized strategies against specific opponents. The paper therefore proposes a new model that, through contrastive reinforcement learning optimization, learns to represent a competitive game, to map the strategies of specific opponents, and to disrupt those strategies. Specifically, the goals of the paper are:

1. **Personalized Adaptation**: Develop mechanisms that adjust the agent's play style to specific opponents and identify the strategies those opponents tend to repeat during the game.
2. **Maintain Generalization Ability**: Ensure that agents adapt quickly over multiple matches against a specific opponent while still performing well against different opponents, avoiding typical transfer learning problems such as catastrophic forgetting.
3. **Improve Performance**: Use contrastive optimization to improve the agents' learning efficiency so that they perform better against offline, online, and competitive-specific models.

To verify the effectiveness of the proposed model, the authors evaluate it in two competitive learning scenarios: the Pokémon Duel Simulator (PokEnv) and the four-player Chef's Hat card game. A series of experiments, including benchmark tests, strategy prediction accuracy evaluations, individual performance measurements, and tests of knowledge retention after long time intervals, demonstrates the model's advantages in personalized strategy learning and long-term competition.
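
As a rough illustration of what a "composed loss based on a contrastive optimization" could look like, the sketch below combines a squared temporal-difference error with an InfoNCE-style contrastive term over opponent-strategy embeddings. This is not the authors' formulation: the choice of InfoNCE, the function names `info_nce` and `composed_loss`, the `temperature` and `weight` parameters, and the toy embeddings are all assumptions made for illustration only.

```python
import numpy as np

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one anchor embedding (illustrative assumption).

    The loss is low when the anchor is more similar to the positive
    (e.g. the same opponent's strategy) than to any of the negatives.
    """
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    sims = [cos(anchor, positive)] + [cos(anchor, n) for n in negatives]
    logits = np.array(sims) / temperature
    logits = logits - logits.max()  # shift for numerical stability
    log_prob_positive = logits[0] - np.log(np.exp(logits).sum())
    return float(-log_prob_positive)

def composed_loss(td_error, contrastive, weight=0.5):
    """Hypothetical composition: squared TD error plus weighted contrastive term."""
    return 0.5 * td_error ** 2 + weight * contrastive

# Toy 2-D embeddings (made up): the anchor matches opponent A, not opponent B.
opponent_a = np.array([1.0, 0.0])
opponent_b = np.array([0.0, 1.0])
anchor = np.array([0.9, 0.1])

matched = info_nce(anchor, opponent_a, [opponent_b])
mismatched = info_nce(anchor, opponent_b, [opponent_a])
total = composed_loss(td_error=2.0, contrastive=matched)
```

In this toy setup the contrastive term is smaller when the anchor is paired with the embedding it actually resembles, which is the property a personalized opponent model would try to exploit. How the paper weights and schedules its loss terms is not stated in this summary.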