Abstract: We consider a new variant of the multi-robot task allocation problem: Inverse Risk-sensitive Multi-Robot Task Allocation (IR-MRTA). "Forward" MRTA, the process of deciding which robot should perform a task given the reward (cost)-related parameters, is widely studied in the multi-robot literature. In this setting, the reward (cost)-related parameters are assumed to be known in advance: domain experts first fix the parameters offline, and the robots are then coordinated online. What if these parameters need to be adjusted by non-expert human supervisors who oversee the robots during tasks, in order to adapt to new situations? We are interested in the case where the human supervisor's perception of the allocation risk may change, leading the supervisor to suggest robot allocations that differ from those produced by the MRTA algorithm. In such cases, the robots need to change the parameters of the allocation problem based on evolving human preferences. We study such problems through the lens of inverse task allocation, i.e., the process of finding parameters given solutions to the problem. Specifically, we propose a new formulation, IR-MRTA, in which we aim to find a new set of parameters of the human behavioral risk model that deviates minimally from the current MRTA parameters and makes a greedy task allocation algorithm allocate robot resources in line with the allocations suggested by humans. We show that even in a simple case, this problem is a non-convex optimization problem. We propose a Branch $\&$ Bound algorithm (BB-IR-MRTA) to solve such problems. In numerical simulations of a case study on multi-robot target capture, we demonstrate how to use BB-IR-MRTA and show that the proposed algorithm achieves significant advantages in running time and peak memory usage compared to a brute-force baseline.
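
To make the inverse problem concrete, the following is a minimal illustrative sketch, not the paper's formulation or its BB-IR-MRTA algorithm. It assumes a simple mean-variance utility (mean - lam * var) as a stand-in for the human behavioral risk model, a single scalar risk parameter `lam`, and a plain grid search in the spirit of a brute-force baseline. All names (`greedy_allocate`, `inverse_by_grid`, `MEAN`, `VAR`) and all numbers are hypothetical.

```python
from typing import Dict, Optional, Sequence

Matrix = Sequence[Sequence[float]]

def greedy_allocate(mean: Matrix, var: Matrix, lam: float) -> Dict[int, int]:
    """Forward problem: greedily pair robots with tasks.

    Utility of robot r on task t is mean[r][t] - lam * var[r][t]
    (an assumed mean-variance model, not the paper's risk model).
    """
    free_robots = set(range(len(mean)))
    free_tasks = set(range(len(mean[0])))
    allocation: Dict[int, int] = {}
    while free_robots and free_tasks:
        # Pick the best remaining (robot, task) pair under the current lam.
        r, t = max(
            ((r, t) for r in sorted(free_robots) for t in sorted(free_tasks)),
            key=lambda rt: mean[rt[0]][rt[1]] - lam * var[rt[0]][rt[1]],
        )
        allocation[r] = t
        free_robots.remove(r)
        free_tasks.remove(t)
    return allocation

def inverse_by_grid(
    mean: Matrix,
    var: Matrix,
    lam0: float,
    target: Dict[int, int],
    candidates: Sequence[float],
) -> Optional[float]:
    """Inverse problem by brute force: among candidate risk parameters,
    return the one closest to lam0 whose greedy allocation equals the
    human-suggested target, or None if no candidate reproduces it."""
    feasible = [
        lam for lam in candidates
        if greedy_allocate(mean, var, lam) == target
    ]
    if not feasible:
        return None
    return min(feasible, key=lambda lam: abs(lam - lam0))

# Hypothetical 2-robot, 2-task instance: expected rewards and variances.
MEAN = [[10.0, 8.0], [9.0, 7.0]]
VAR = [[8.0, 1.0], [2.0, 1.0]]

if __name__ == "__main__":
    print(greedy_allocate(MEAN, VAR, 0.0))
    print(inverse_by_grid(MEAN, VAR, 0.0, {0: 1, 1: 0},
                          [0.0, 0.25, 0.5, 0.75, 1.0]))
```

In this toy instance, the risk-neutral setting (`lam = 0`) assigns robot 0 to task 0, while a supervisor who prefers robot 0 on task 1 is matched by the smallest grid value that flips the greedy choice. A grid search like this scales poorly with the number of parameters, which is the kind of cost a branch-and-bound search aims to reduce.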

Risk-Averse Biased Human Policies in Assistive Multi-Armed Bandit Settings

When Humans Aren't Optimal: Robots that Collaborate with Risk-Aware Humans

Human-AI Learning Performance in Multi-Armed Bandits

Multi-Armed Bandits with Fairness Constraints for Distributing Resources to Human Teammates

A Risk-Averse Framework for Non-Stationary Stochastic Multi-Armed Bandits

Mixed-Initiative Human-Robot Teaming under Suboptimality with Online Bayesian Adaptation

A Survey of Risk-Aware Multi-Armed Bandits

Multiarmed Bandits Problem Under the Mean-Variance Setting

Risk-Calibrated Human-Robot Interaction via Set-Valued Intent Prediction

Desperate Times Call for Desperate Measures: Towards Risk-Adaptive Task Allocation

Game-Theoretic Modeling of Human Adaptation in Human-Robot Collaboration

HR-Bandit: Human-AI Collaborated Linear Recourse Bandit

Efficient Resource Allocation with Fairness Constraints in Restless Multi-Armed Bandits

Provable Benefits of Policy Learning from Human Preferences in Contextual Bandit Problems

Risk-Sensitive Cooperative Games for Human-Machine Systems

A General Framework for Bandit Problems Beyond Cumulative Objectives

Entropic Risk Measure in Policy Search

Minimax-optimal trust-aware multi-armed bandits

Competing for Shareable Arms in Multi-Player Multi-Armed Bandits

On the Theory of Risk-Aware Agents: Bridging Actor-Critic and Economics

Inverse Risk-sensitive Multi-Robot Task Allocation