Abstract: Recently, triggered by the impressive results of Google DeepMind in video games and the game of Go, end-to-end reinforcement learning (RL) has been attracting attention. Although it is little known, the author's group has advocated this framework for around 20 years and has already shown various functions that emerge in a neural network (NN) through RL. This paper takes the occasion to introduce them again. The "function modularization" approach has subconsciously become deeply ingrained. The inputs and outputs of a learning system can be raw sensor signals and motor commands, yet the "state space" and "action space" generally used in RL presuppose the existence of functional modules. That has limited reinforcement learning to learning only for the action-planning module. To extend RL to learning of the entire function over the huge degrees of freedom of a massively parallel learning system, and to explain or develop human-like intelligence, the author has believed that end-to-end RL from sensors to motors using a recurrent NN (RNN) is an essential key. Especially for higher functions, this approach is very effective because it is free from the need to decide their inputs and outputs. The functions that we have confirmed to emerge through RL using an NN cover a broad range, from real robot learning with raw camera pixel inputs to the acquisition of dynamic functions in an RNN. They are (1) image recognition, (2) color constancy (optical illusion), (3) sensor motion (active recognition), (4) hand-eye coordination and hand reaching movement, (5) explanation of brain activities, (6) communication, (7) knowledge transfer, (8) memory, (9) selective attention, (10) prediction, and (11) exploration. End-to-end RL enables the emergence of very flexible, comprehensive functions that consider many things in parallel, although it is difficult to draw the boundary of each function clearly.

Functions that Emerge through End-to-End Reinforcement Learning - The Direction for Artificial General Intelligence -
