Abstract: Contemporary neural networks frequently suffer from catastrophic forgetting, in which new learning overwrites and erases previously learned information. Continual learning offers a promising solution by enabling intelligent systems to retain and build upon acquired knowledge over time. This paper introduces a novel approach within the continual learning framework, employing deep reinforcement learning agents that process raw pixel data and interact with microcircuit-like components. These agents autonomously advance through a series of learning stages, culminating in a neural network system optimized for predictive performance in the game of tic-tac-toe. The agents operate sequentially, and each is tasked with forward-looking objectives based on Bellman's principles of reinforcement learning. Knowledge is retained through dedicated microcircuits that store the insights gained by each agent. During training, these microcircuits work in concert, employing high-energy, sparse encoding techniques to improve learning efficiency and effectiveness. The core contribution of this paper is an artificial neural network system capable of accurately predicting tic-tac-toe moves, akin to the observational strategies employed by humans. Our experimental results demonstrate that after approximately 5000 cycles of backpropagation, the system significantly reduced the training loss to , thereby increasing the expected cumulative reward. This gain in training efficiency translates into superior predictive capabilities, enabling the system to secure consistent victories by anticipating up to four moves ahead.
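To make the notion of a forward-looking Bellman objective concrete, the following is a minimal sketch of a depth-limited Bellman backup on tic-tac-toe. It is not the neural, pixel-based system described above: it assumes a symbolic nine-character board string, a reward of 1 for a winning move, and a discount factor of 0.9. The names `q_values`, `state_value`, and `best_move` are hypothetical and chosen only for illustration.

```python
from typing import Dict, Optional

# The eight winning lines on a 3x3 board indexed 0..8, row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board: str) -> Optional[str]:
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None


def q_values(board: str, player: str, depth: int,
             gamma: float = 0.9) -> Dict[int, float]:
    """One Bellman backup per legal move for the player about to move.

    Q(s, a) = 1 if the move wins immediately; otherwise the discounted,
    negated value of the resulting position from the opponent's view
    (the game is zero-sum). Assumes `board` is not already won.
    """
    other = "O" if player == "X" else "X"
    q: Dict[int, float] = {}
    for i, cell in enumerate(board):
        if cell != ".":
            continue
        nxt = board[:i] + player + board[i + 1:]
        if winner(nxt) == player:
            q[i] = 1.0
        else:
            q[i] = -gamma * state_value(nxt, other, depth - 1, gamma)
    return q


def state_value(board: str, player: str, depth: int,
                gamma: float = 0.9) -> float:
    """V(s) = max_a Q(s, a), cut off at zero when the lookahead budget ends."""
    if depth <= 0 or "." not in board:
        return 0.0
    return max(q_values(board, player, depth, gamma).values())


def best_move(board: str, player: str, depth: int = 4,
              gamma: float = 0.9) -> int:
    """Greedy move under a `depth`-move lookahead (board must have an empty cell)."""
    q = q_values(board, player, depth, gamma)
    return max(q, key=q.get)
```

For example, `best_move("XX.OO....", "X")` selects cell 2, which completes the top row. The default `depth=4` mirrors the four-move lookahead mentioned in the abstract, but the recursion here is an exhaustive search rather than a learned predictor.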

Continual learning, deep reinforcement learning, and microcircuits: a novel method for clever game playing
