A Behavioral Model for Exploration vs. Exploitation: Theoretical Framework and Experimental Evidence

Jingying Ding,Yifan Feng,Ying Rong
2024-12-24
Abstract:How do people navigate the exploration-exploitation (EE) trade-off when making repeated choices with unknown rewards? We study this question through the lens of multi-armed bandit problems and introduce a novel behavioral model, Quantal Choice with Adaptive Reduction of Exploration (QCARE). It generalizes Thompson Sampling, allowing for a principled way to quantify the EE trade-off and reflect human decision-making patterns. The model adaptively reduces exploration as information accumulates, with the reduction rate serving as a parameter to quantify the EE trade-off dynamics. We theoretically analyze how varying reduction rates influence decision quality, shedding light on the effects of ``over-exploration'' and ``under-exploration.'' Empirically, we validate QCARE through experiments collecting behavioral data from human participants. QCARE not only captures critical behavioral patterns in the EE trade-off but also outperforms alternative models in predictive power. Our analysis reveals a behavioral tendency toward over-exploration.
Optimization and Control
What problem does this paper attempt to address?