Abstract: Developing a reinforcement learning (RL) agent often involves identifying effective values for a large number of parameters, covering the policy, reward function, environment, and the agent's internal architecture, such as parameters controlling how the peripheral vision and memory modules work. Critically, since these parameters are interrelated in complex ways, optimizing them can be viewed as a black box optimization problem, which is especially challenging for non-experts. Although existing optimization-as-a-service platforms (e.g., Vizier, Optuna) can handle such problems, they are impractical for RL systems, as users must manually map each parameter to different components, making the process cumbersome and error-prone. They also require deep understanding of the optimization process, limiting their application outside ML experts and restricting access for fields like cognitive science, which models human decision-making. To tackle these challenges, we present AgentForge, a flexible low-code framework to optimize any parameter set across an RL system. AgentForge allows the user to perform individual or joint optimization of parameter sets. An optimization problem can be defined in a few lines of code and handed to any of the interfaced optimizers. We evaluated its performance in a challenging vision-based RL problem. AgentForge enables practitioners to develop RL agents without requiring extensive coding or deep expertise in optimization.

What problem does this paper attempt to address?

This paper addresses how to effectively optimize the large number of parameters involved in developing reinforcement learning (RL) agents. These parameters span the policy, the reward function, the environment, and the agent's internal architecture, for example the parameters that control the peripheral vision and memory modules. Because the parameters are interrelated in complex ways, optimizing them can be treated as a black-box optimization problem, which is especially challenging for non-expert users.

Existing optimization-as-a-service platforms (such as Vizier and Optuna) can handle this type of problem, but they are impractical for RL systems because users must manually map each parameter to the component it belongs to, a process that is both cumbersome and error-prone. These platforms also require a deep understanding of the optimization process, which limits their use in fields such as cognitive science, where human decision-making is modeled.

To solve these problems, the authors propose **AgentForge**, a flexible low-code platform for optimizing any parameter set across an RL system. AgentForge lets users optimize parameter sets individually or jointly, and an optimization problem can be defined in just a few lines of code and handed to any of the interfaced optimizers. Users can therefore develop and optimize RL agents without deep expertise in optimization techniques, which simplifies the agent design process.

### Main problem summary:

1. **Complexity of parameter optimization**: RL agents involve a large number of interrelated parameters, and optimizing them is a complex black-box optimization problem.
2. **Limitations of existing tools**: Existing optimization platforms are poorly suited to RL systems, are cumbersome to use, and require in-depth expertise.
3. **Need for interdisciplinary applications**: Fields such as cognitive science use RL to model human behavior, but the expertise that existing optimization tools demand restricts their access.

By providing a low-code, easy-to-use optimization framework, AgentForge aims to enable practitioners who are not machine learning experts to develop and optimize RL agents efficiently, thereby broadening the application of RL to more fields.
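To make the idea of joint black-box optimization concrete, here is a minimal sketch in plain Python. It is not AgentForge's API: the search-space layout, the parameter names, the `toy_objective` function, and the use of random search are all assumptions chosen for illustration. In a real setup the objective would train and evaluate an agent, which is far more expensive than the closed-form stand-in used here.

```python
import random

# Hypothetical search space, grouped by RL-system component.
# Names and ranges are illustrative assumptions, not AgentForge's schema.
SEARCH_SPACE = {
    "policy": {"learning_rate": (1e-5, 1e-2)},
    "reward": {"step_penalty": (0.0, 1.0)},
    "agent": {"peripheral_vision_radius": (1.0, 10.0)},
}


def sample(space, rng):
    """Draw one joint configuration uniformly from every group's ranges."""
    return {
        group: {name: rng.uniform(lo, hi) for name, (lo, hi) in params.items()}
        for group, params in space.items()
    }


def toy_objective(config):
    """Stand-in for 'train the agent and return its evaluation score'.

    The product term couples the reward and agent parameters, so the
    best value of one depends on the value chosen for the other.
    """
    lr = config["policy"]["learning_rate"]
    penalty = config["reward"]["step_penalty"]
    radius = config["agent"]["peripheral_vision_radius"]
    return -((lr - 3e-3) ** 2) * 1e4 - (penalty * radius - 2.0) ** 2


def random_search(objective, space, n_trials=200, seed=0):
    """Treat the objective as a black box and keep the best sampled config."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = sample(space, rng)
        score = objective(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


if __name__ == "__main__":
    best_config, best_score = random_search(toy_objective, SEARCH_SPACE)
    print(best_score, best_config)
```

Grouping parameters by component keeps the mapping from each parameter to its part of the system explicit, which is the bookkeeping the summary describes as error-prone when done by hand. Restricting `SEARCH_SPACE` to a single group would correspond to optimizing that parameter set individually rather than jointly.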

AgentForge: A Flexible Low-Code Platform for Reinforcement Learning Agent Design
