RAMario: Experimental Approach to Reptile Algorithm -- Reinforcement Learning for Mario

Sanyam Jain

2023-05-17

Abstract: This research paper presents an experimental approach to using the Reptile algorithm for reinforcement learning to train a neural network to play Super Mario Bros. We implement the Reptile algorithm using the Super Mario Bros Gym library and TensorFlow in Python, creating a neural network model with a single convolutional layer, a flatten layer, and a dense layer. We define the optimizer and use the Reptile class to create an instance of the Reptile meta-learning algorithm. We train the model over multiple tasks and episodes: in each step we choose an action using the current weights of the neural network model, take that action in the environment, and update the model weights using the Reptile algorithm. We evaluate the performance of the algorithm by printing the total reward for each episode. In addition, we compare the Reptile approach to two other popular reinforcement learning algorithms, Proximal Policy Optimization (PPO) and Deep Q-Network (DQN), applied to the same Super Mario Bros task. Our results demonstrate that the Reptile algorithm provides a promising approach to few-shot learning in video game AI, with comparable or even better performance than the other two algorithms, particularly in terms of moves versus distance covered by the agent over 1M episodes of training. The results show that the best total distances for world 1-2 in the game environment were ~1732 (PPO), ~1840 (DQN), and ~2300 (RAMario). Full code is available at https://github.com/s4nyam/RAMario.
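The weight update described in the abstract can be illustrated with a minimal sketch of a Reptile-style meta-update. This is not the author's TensorFlow implementation: the toy quadratic loss stands in for the reinforcement learning objective, the task distribution is invented for illustration, and the names `inner_sgd`, `reptile_step`, and `train` are hypothetical.

```python
import random


def inner_sgd(theta, target, steps=5, lr=0.1):
    """Adapt a copy of theta to one task with a few plain SGD steps.

    Assumption: the per-task loss is the toy quadratic
    0.5 * ||w - target||^2, whose gradient is (w - target).
    """
    w = list(theta)
    for _ in range(steps):
        grad = [wi - ti for wi, ti in zip(w, target)]
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


def reptile_step(theta, target, meta_lr=0.5, **inner_kwargs):
    """One Reptile meta-update: move theta part of the way toward
    the weights obtained after adapting to the sampled task."""
    adapted = inner_sgd(theta, target, **inner_kwargs)
    return [t + meta_lr * (a - t) for t, a in zip(theta, adapted)]


def train(num_meta_steps=200, seed=0):
    """Repeat the meta-update over tasks drawn from a toy distribution."""
    rng = random.Random(seed)
    theta = [0.0, 0.0]
    for _ in range(num_meta_steps):
        # Hypothetical task distribution: targets scattered around (3, -2).
        target = [3.0 + rng.uniform(-1, 1), -2.0 + rng.uniform(-1, 1)]
        theta = reptile_step(theta, target)
    return theta


theta = train()
print(theta)  # meta-learned initialization, close to the centre of the tasks
```

In a setting like the one the abstract describes, each task would presumably be a set of game episodes and the inner loop would run an optimizer on the network weights, but the outer interpolation step keeps the same form.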

Machine Learning, Multiagent Systems

What problem does this paper attempt to address?

The problem this paper attempts to address is how to use the Reptile algorithm for reinforcement learning to train neural networks to play Super Mario Bros and improve few-shot learning capabilities in video game AI. Specifically, the paper implements the Reptile algorithm and compares it with two popular reinforcement learning algorithms, Proximal Policy Optimization (PPO) and Deep Q-Network (DQN), to evaluate its performance in training Mario game agents. The main objectives of the study include:

1. **Improving few-shot learning capabilities**: The Reptile algorithm can quickly adapt to new tasks with limited data, which is particularly important in video game AI.
2. **Enhancing training efficiency and stability**: Compared to PPO and DQN, the Reptile algorithm demonstrates better convergence and stability during training.
3. **Optimizing movement and distance performance**: Experimental results show that the Reptile algorithm outperforms PPO and DQN in terms of the number of movements and distance covered in the Mario game.

Through these objectives, the paper aims to explore the potential applications of the Reptile algorithm in video game AI and provide a reference for future few-shot learning research.
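The few-shot claim in the first objective rests on the form of the Reptile update. As stated in the original Reptile paper (Nichol et al., 2018), and using that paper's notation rather than anything specific to RAMario, the rule is:

```latex
\tilde{\phi} = U_{\tau}^{k}(\phi), \qquad
\phi \leftarrow \phi + \epsilon \, (\tilde{\phi} - \phi)
```

Here $\phi$ is the shared initialization, $\tau$ is a sampled task, $U_{\tau}^{k}$ denotes $k$ steps of a gradient-based optimizer on that task, and $\epsilon$ is the meta step size. Adapting to a new task at test time is then just the inner step $U_{\tau}^{k}(\phi)$, which is why only a small amount of task data is needed.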
