Abstract: In a competitive game scenario, a set of agents have to learn decisions that maximize their goals and minimize their adversaries' goals at the same time. Besides dealing with the increased dynamics of the scenarios due to the opponents' actions, they usually have to understand how to overcome the opponent's strategies. Most of the common solutions, usually based on continual learning or centralized multi-agent experiences, however, do not allow the development of personalized strategies to face individual opponents. In this paper, we propose a novel model composed of three neural layers that learn a representation of a competitive game, learn how to map the strategy of specific opponents, and how to disrupt them. The entire model is trained online, using a composed loss based on a contrastive optimization, to learn competitive and multiplayer games. We evaluate our model on a Pokémon duel scenario and the four-player competitive Chef's Hat card game. Our experiments demonstrate that our model achieves better performance when playing against offline, online, and competitive-specific models, in particular when playing against the same opponent multiple times. We also present a discussion on the impact of our model, in particular on how well it deals with specific strategy learning for each of the two scenarios.

What problem does this paper attempt to address?

This paper addresses how agents (players) in competitive games can learn decisions that maximize their own goals while minimizing those of their opponents. Most existing solutions, such as continual learning or centralized multi-agent experience, struggle to develop personalized strategies against specific opponents. The paper therefore proposes a new model, trained with a contrastive reinforcement learning optimization, that learns to represent a competitive game, to map the strategies of specific opponents, and to disrupt those strategies.

Specifically, the goals of this paper are:

1. **Personalized Adaptation**: Develop mechanisms that adjust the agent's play style to a specific opponent and identify the strategies that opponent tends to repeat during a game.
2. **Maintain Generalization Ability**: Ensure that agents adapt quickly over multiple matches against a specific opponent while still performing well against different opponents, avoiding typical transfer learning problems such as catastrophic forgetting.
3. **Improve Performance**: Use contrastive optimization to improve the agents' learning efficiency so that they perform better against offline, online, and competitive-specific models.

To verify the effectiveness of the proposed model, the authors evaluated it in two competitive learning scenarios: the Pokémon duel simulator (PokEnv) and the four-player Chef's Hat card game. A series of experiments, including benchmark tests, strategy prediction accuracy evaluations, individual performance measurements, and tests of knowledge retention after long intervals, demonstrates the model's advantages in personalized strategy learning and long-term competition.
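
To make the idea of a "composed loss based on a contrastive optimization" more concrete, here is a minimal sketch of how a contrastive term could be added to a reinforcement learning loss. It is not the authors' implementation: the InfoNCE-style formulation, the function names (`cosine`, `info_nce`, `composed_loss`), the `temperature` and `weight` parameters, and the toy embeddings are all assumptions chosen for illustration. The intuition is that two embeddings of the same opponent's behaviour should score as more similar than embeddings of different opponents.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style contrastive loss (an assumed stand-in, not the paper's loss).

    anchor / positive: embeddings assumed to come from the same opponent.
    negatives: embeddings assumed to come from other opponents.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    # Negative log-probability of picking the positive among all candidates.
    return log_sum - logits[0]

def composed_loss(rl_loss, contrastive_loss, weight=0.5):
    """Hypothetical weighted sum of an RL loss and a contrastive term."""
    return rl_loss + weight * contrastive_loss

# Toy embeddings: the anchor matches the positive and differs from the negative.
well_separated = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
confused = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
total = composed_loss(rl_loss=1.0, contrastive_loss=well_separated)
```

In this toy setting, `well_separated` is close to zero while `confused` is large, so minimizing the contrastive term pushes the representation to distinguish opponents. How the paper actually weights and combines its loss terms is not stated in this summary.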

All by Myself: Learning Individualized Competitive Behaviour with a Contrastive Reinforcement Learning optimization

Self-play Reinforcement Learning with Comprehensive Critic in Computer Games

Learning from Learners: Adapting Reinforcement Learning Agents to be Competitive in a Card Game

Moody Learners -- Explaining Competitive Behaviour of Reinforcement Learning Agents

You Were Always on My Mind: Introducing Chef's Hat and COPPER for Personalized Reinforcement Learning

Incorporating Rivalry in Reinforcement Learning for a Competitive Game

Opponent Modeling in Deep Reinforcement Learning

Hierarchical Deep Reinforcement Learning Agent with Counter Self-play on Competitive Games

Personalized Dynamic Difficulty Adjustment -- Imitation Learning Meets Reinforcement Learning

Multi-agent Reinforcement Learning with Approximate Model Learning for Competitive Games.

Neural Auto-Curricula

Neural Auto-Curricula in Two-Player Zero-Sum Games.

Efficient Competitive Self-Play Policy Optimization

Incomplete Information Competition Strategy Based on Improved Asynchronous Advantage Actor Critical Model.

Infer Your Enemies and Know Yourself, Learning in Real-Time Bidding with Partially Observable Opponents

Online meta-learning by parallel algorithm competition

Mimicking To Dominate: Imitation Learning Strategies for Success in Multiagent Competitive Games

Competitive Multi-agent Deep Reinforcement Learning with Counterfactual Thinking

Offline Fictitious Self-Play for Competitive Games

Curriculum Learning for Cooperation in Multi-Agent Reinforcement Learning

In-Context Exploiter for Extensive-Form Games