A Two-stage Reinforcement Learning-based Approach for Multi-entity Task Allocation

Aicheng Gong, Kai Yang, Jiafei Lyu, Xiu Li
2024-06-30
Abstract: Task allocation is a key combinatorial optimization problem, crucial for modern applications such as multi-robot cooperation and resource scheduling. Decision makers must allocate entities to tasks reasonably across different scenarios. However, traditional methods assume static attributes and numbers of tasks and entities, often relying on dynamic programming and heuristic algorithms for solutions. In reality, task allocation resembles Markov decision processes, with dynamically changing task and entity attributes. Thus, algorithms must dynamically allocate tasks based on their states. To address this issue, we propose a two-stage task allocation algorithm based on similarity, utilizing reinforcement learning to learn allocation strategies. The proposed pre-assign strategy allows entities to preselect appropriate tasks, effectively avoiding local optima and thereby better finding the optimal allocation. We also introduce an attention mechanism and a hyperparameter network structure to adapt to the changing number and attributes of entities and tasks, enabling our network structure to generalize to new tasks. Experimental results across multiple environments demonstrate that our algorithm effectively addresses the challenges of dynamic task allocation in practical applications. Compared to heuristic algorithms like genetic algorithms, our reinforcement learning approach better solves dynamic allocation problems and achieves zero-shot generalization to new tasks with good performance. The code is available at https://github.com/yk7333/TaskAllocation.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses multi-entity task allocation in dynamic environments. Traditional methods usually assume that the number and attributes of tasks and entities are static, and rely on dynamic programming and heuristic algorithms to solve the problem. In practical applications, however, task allocation resembles a Markov Decision Process (MDP) in which the attributes of tasks and entities change over time, so the allocation strategy must adapt to the current state. To this end, the authors propose a two-stage, similarity-based reinforcement learning algorithm for task allocation. Its main contributions are:
1. A pre-allocation method in which entities are first pre-assigned to tasks as candidates, and entities are then selected based on the correlation between tasks and entities. This effectively avoids local optima and makes it easier to find the global optimum.
2. An Actor-Critic structure for the combinatorial optimization problem, together with a Two-head Attention Module (TAM). The module computes the Actor and Critic values from the similarity between entity and task attributes, handles a variable number of entities and tasks, and achieves zero-shot generalization to new tasks.
3. An attention-based hyperparameter network structure that estimates the overall value of key outputs for different numbers of entities, which facilitates fine-tuning with a variable number of entities in new scenarios.
4. A method similar to the seq2seq structure, akin to PointNet, for selecting pre-allocated entities. It can select an appropriate number of entities with suitable attributes for each task.
These innovations enable the proposed algorithm to outperform traditional heuristic algorithms on dynamic task allocation problems, handle changes in the number and attributes of entities, and generalize well without retraining.
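To make the idea of similarity-based allocation more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the Two-head Attention Module in the paper is a learned network with separate Actor and Critic heads, whereas this sketch uses raw scaled dot-product similarity with no learned weights. The function names (`allocation_probs`, `pre_assign`) and the argmax pre-assignment rule are assumptions introduced for illustration only. What it does show is why a similarity score handles any number of entities and tasks: each entity-task pair is scored independently, so no dimension of the computation is tied to a fixed count.

```python
import math
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length attribute vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def allocation_probs(entities: List[Vector], tasks: List[Vector]) -> List[List[float]]:
    """For each entity, return a distribution over tasks.

    Scores are scaled dot products between entity and task attributes
    (hypothetical stand-in for a learned attention score).
    """
    scale = math.sqrt(len(tasks[0]))
    return [softmax([dot(e, t) / scale for t in tasks]) for e in entities]


def pre_assign(entities: List[Vector], tasks: List[Vector]) -> List[int]:
    """Assumed pre-assignment rule: each entity picks its most similar task."""
    probs = allocation_probs(entities, tasks)
    return [max(range(len(tasks)), key=lambda j: p[j]) for p in probs]


# Usage: three entities, two tasks, two attributes each.
entities = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
tasks = [[1.0, 0.0], [0.0, 1.0]]
print(pre_assign(entities, tasks))  # [0, 1, 0]
```

Adding a third task or a fourth entity requires no change to the code, only longer input lists, which is the property that lets a trained similarity-based policy be applied to scenarios with different counts.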