Sensor Activation Policy Optimization for Opacity Enforcement Based on Reinforcement Learning

Jiahan He, Deguang Wang, Ming Yang, Chengbin Liang
DOI: https://doi.org/10.1109/jsen.2024.3471931
IF: 4.3
2024-01-01
IEEE Sensors Journal
Abstract: As a confidentiality property, opacity characterises the ability of an external intruder to infer the secret information of a system. Opacity can be enforced through dynamic sensor activation, which manages event observability. By controlling which sensors are active and which events are observable, the system can prevent the exposure of sensitive information, ensuring that the confidential parts of its behavior remain opaque. In practice, the event hiding and sensor switching involved in dynamic sensor activation are recognized as costly operations. This study addresses the numerical optimization of the sensor activation policy (SAP) for opacity enforcement using reinforcement learning. A most permissive observer (MPO) is used to incorporate all valid SAPs that ensure opacity. The quantitative objective of the optimization problem is to minimize the maximum discounted total cost. A systematic procedure is provided to convert the MPO into a Markov game, facilitating the use of reinforcement learning techniques. Minimax Q-learning is then used to derive an optimal policy for sensor activation/deactivation decisions from the converged Q-table. Finally, the effectiveness and applicability of the proposed method are demonstrated on a location-tracking problem in a smart factory setting.
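To make the "minimize the maximum discounted total cost" objective concrete, the following is a minimal sketch of a min-max Q-learning update on a toy game. It is not the paper's procedure: the two-state game, its cost table `COST`, its transition table `NEXT_STATE`, and the function name `minimax_q_learning` are all hypothetical, the transitions are assumed deterministic, and the value of a state is taken over pure strategies (min over the controller's actions, max over the opponent's) rather than the mixed-strategy linear program of general minimax Q-learning.

```python
import random

# Hypothetical toy game (not from the paper): COST[s][a][o] is the cost when
# the controller picks action a and the opponent picks o in state s.
COST = [[[1, 3], [2, 2]],
        [[0, 1], [4, 0]]]
# Hypothetical deterministic transitions: state 0 always leads to 1, and back.
NEXT_STATE = [[[1, 1], [1, 1]],
              [[0, 0], [0, 0]]]


def minimax_q_learning(cost, next_state, gamma=0.5, alpha=0.5,
                       steps=5000, seed=0):
    """Pure-strategy min-max Q-learning on a deterministic two-player game."""
    rng = random.Random(seed)
    n_states = len(cost)
    # Q-table indexed as q[state][controller action][opponent action].
    q = [[[0.0 for _ in row] for row in cost[s]] for s in range(n_states)]

    def value(s):
        # Worst-case (max over opponent) cost of the best controller action.
        return min(max(row) for row in q[s])

    for _ in range(steps):
        # Uniform random exploration over (state, action, opponent action).
        s = rng.randrange(n_states)
        a = rng.randrange(len(q[s]))
        o = rng.randrange(len(q[s][a]))
        target = cost[s][a][o] + gamma * value(next_state[s][a][o])
        q[s][a][o] += alpha * (target - q[s][a][o])

    # Greedy policy: the action whose worst-case Q-value is smallest.
    policy = [min(range(len(q[s])), key=lambda a: max(q[s][a]))
              for s in range(n_states)]
    return q, policy


q, policy = minimax_q_learning(COST, NEXT_STATE)
```

For this toy game with a discount factor of 0.5, the fixed point can be checked by hand: the worst-case values satisfy V0 = 2 + 0.5 V1 and V1 = 1 + 0.5 V0, giving V0 = 10/3 and V1 = 8/3, and the learned table should approach these values.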