Photonic Decision-Making for Arbitrary-Number-armed Bandit Problem Utilizing Parallel Chaos Generation.

Jiafa Peng, Ning Jiang, Anke Zhao, Shiqin Liu, Yiqun Zhang, Kun Qiu, Qianwu Zhang
DOI: https://doi.org/10.1364/oe.432956
IF: 3.8
2021-01-01
Optics Express
Abstract: In this paper, we propose and experimentally demonstrate a novel scheme for solving the arbitrary-number-armed bandit problem by using two chaotic signals generated simultaneously in parallel together with the epsilon (ɛ)-greedy strategy. In the proposed scheme, the two chaotic signals are generated experimentally and then digitized by 8-bit analog-to-digital conversion (ADC), with the 4 least significant bits (LSBs) retained, to produce two sequences with uniform amplitude distributions for decision-making. A mapping rule designed according to the ɛ-greedy strategy establishes the correspondence between these two random sequences and the different arms. Based on this, decision-making for an exemplary 5-armed bandit problem is successfully performed; moreover, the influences of the mapping rule and the unknown reward probabilities on the correct decision rate (CDR) for 4-armed to 7-armed bandit problems are investigated. This work provides a novel way of solving the arbitrary-number-armed bandit problem.
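The abstract does not spell out the mapping rule, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes that one 4-bit sequence decides between exploring and exploiting, and that the other picks the arm during exploration. A seeded pseudo-random generator stands in for the two chaotic signals, and the names `four_lsbs`, `run_bandit`, and `explore_levels` are hypothetical. Because each value has only 16 levels, this sketch handles 2 to 16 arms.

```python
import random


def four_lsbs(sample_8bit):
    """Keep the 4 least significant bits of an 8-bit sample (result in 0..15)."""
    return sample_8bit & 0x0F


def run_bandit(reward_probs, n_plays=2000, explore_levels=2, seed=0):
    """Epsilon-greedy play with epsilon = explore_levels / 16 (assumed mapping)."""
    n_arms = len(reward_probs)
    assert 2 <= n_arms <= 16, "a 4-bit value can only address up to 16 arms"
    rng = random.Random(seed)  # stand-in for the two chaotic sources
    pulls = [0] * n_arms
    wins = [0] * n_arms
    # Largest multiple of n_arms that fits in 16 levels; values at or above it
    # are redrawn so every arm is equally likely during exploration.
    usable = 16 - 16 % n_arms

    def estimate(arm):
        # Untried arms get an optimistic estimate so each is tried at least once.
        return wins[arm] / pulls[arm] if pulls[arm] else 1.0

    for _ in range(n_plays):
        s1 = four_lsbs(rng.randrange(256))  # sequence 1: explore or exploit
        if s1 < explore_levels:
            s2 = four_lsbs(rng.randrange(256))  # sequence 2: which arm to explore
            while s2 >= usable:
                s2 = four_lsbs(rng.randrange(256))
            arm = s2 % n_arms
        else:
            arm = max(range(n_arms), key=estimate)
        pulls[arm] += 1
        # Simulated slot machine: the arm pays out with its hidden probability.
        if rng.random() < reward_probs[arm]:
            wins[arm] += 1
    return pulls, wins


if __name__ == "__main__":
    pulls, wins = run_bandit([0.1, 0.2, 0.3, 0.4, 0.9])
    print("pulls per arm:", pulls)
    print("wins per arm: ", wins)
```

With 5 arms, 15 of the 16 levels are usable (three per arm) and one level is redrawn, which keeps the exploratory choice uniform. The reward probabilities in the usage example are made-up values for illustration and are not taken from the paper.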