Online Sequential Channel Accessing Control: A Double Exploration Vs. Exploitation Problem
Panlong Yang, Bowen Li, Jinlong Wang, Xiang-Yang Li, Zhiyong Du, Yubo Yan, Yan Xiong
DOI: https://doi.org/10.1109/twc.2015.2424413
IF: 10.4
2015-01-01
IEEE Transactions on Wireless Communications
Abstract: In opportunistic channel access, the user must decide in real time, under uncertainty, when and on which channel to transmit. Assuming perfect channel statistics, several studies have applied optimal stopping theory to derive control strategies for sequential sensing/probing-based opportunistic access (s-SPA), which exploits temporary opportunities among multiple channels. Meanwhile, numerous multi-armed bandit (MAB)-based approaches have been proposed for online learning of channel selection in periodic sensing/accessing systems; however, these schemes fail to exploit short-term opportunistic diversity. In this paper, we investigate online learning of optimal control in s-SPA systems, where statistics learning and temporary opportunity utilization are considered jointly. An effective and efficient online policy, called IE-OSP, is proposed, which theoretically guarantees that the system converges to the optimal s-SPA strategy with bounded probability. Experimental results further show that the regret of IE-OSP grows at a nearly optimal logarithmic rate over time and sublinearly with the number of channels. Compared with existing solutions, the proposed algorithm achieves a 25% to 30% throughput gain in typical scenarios.
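To make the optimal-stopping idea behind s-SPA concrete, the following is a minimal sketch of the known-statistics case only. It is not the IE-OSP policy, which the abstract does not specify. The model is an assumption made for illustration: channels are probed in a fixed order, each probe consumes a fixed fraction `probe_cost` of the slot, and each channel's rate follows a known discrete distribution. The names `continuation_values` and `should_stop` are hypothetical. Backward induction gives the expected reward of continuing, and the user stops as soon as transmitting on the probed channel is worth at least as much.

```python
from typing import List, Sequence, Tuple

# A channel's rate distribution as (rate, probability) pairs (assumed known).
Dist = Sequence[Tuple[float, float]]


def continuation_values(dists: List[Dist], probe_cost: float) -> List[float]:
    """Return V, where V[k] is the expected reward just before probing channel k.

    V[n] = 0 because no channel is left. After probing channel k (0-indexed),
    the fraction of the slot left for transmission is 1 - (k + 1) * probe_cost.
    """
    n = len(dists)
    values = [0.0] * (n + 1)
    for k in range(n - 1, -1, -1):
        time_left = max(0.0, 1.0 - (k + 1) * probe_cost)
        # For each possible rate, take the better of stopping now or continuing.
        values[k] = sum(
            p * max(rate * time_left, values[k + 1]) for rate, p in dists[k]
        )
    return values


def should_stop(k: int, observed_rate: float, values: List[float],
                probe_cost: float) -> bool:
    """Stop on channel k if transmitting now beats the value of probing on."""
    time_left = max(0.0, 1.0 - (k + 1) * probe_cost)
    return observed_rate * time_left >= values[k + 1]


# Illustrative numbers only: two channels, each idle (rate 1) or busy (rate 0)
# with equal probability, and each probe costing 10% of the slot.
channels = [[(0.0, 0.5), (1.0, 0.5)], [(0.0, 0.5), (1.0, 0.5)]]
V = continuation_values(channels, probe_cost=0.1)
```

In an online setting such as the one the abstract describes, the distributions would not be known in advance and would have to be estimated from observations, which is where the learning component enters.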