TY - GEN
T1 - Adapting strategies to dynamic environments in controllable Stackelberg security games
AU - Trejo, Kristal K.
AU - Clempner, Julio B.
AU - Poznyak, Alexander S.
N1 - Publisher Copyright:
© 2016 IEEE.
PY - 2016/12/27
Y1 - 2016/12/27
N2 - There is growing interest in applying Stackelberg games to model resource allocation for patrolling security problems, in which defenders must allocate limited security resources to protect targets from attack by adversaries. Real-world adversaries are sophisticated and present dynamic strategies. Most existing approaches for computing defender strategies solve the game against fixed behavioral models of adversaries and cannot ensure success in the realization of the game. To address this shortcoming, this paper presents a novel approach for adapting preferred strategies in controlled Stackelberg security games using a reinforcement learning (RL) approach for attackers and defenders employing an average reward. We propose a common framework that combines prior knowledge and the temporal-difference method in reinforcement learning. The overall RL architecture involves two high-level components: the adaptive primary learning architecture and the actor-critic architecture. In this work we consider a Stackelberg security game with a metric state space for a class of time-discrete ergodic controllable Markov chain games. To compute the equilibrium point we employ the extraproximal method. Finally, a game-theoretic example illustrates the main results and the effectiveness of the method.
AB - There is growing interest in applying Stackelberg games to model resource allocation for patrolling security problems, in which defenders must allocate limited security resources to protect targets from attack by adversaries. Real-world adversaries are sophisticated and present dynamic strategies. Most existing approaches for computing defender strategies solve the game against fixed behavioral models of adversaries and cannot ensure success in the realization of the game. To address this shortcoming, this paper presents a novel approach for adapting preferred strategies in controlled Stackelberg security games using a reinforcement learning (RL) approach for attackers and defenders employing an average reward. We propose a common framework that combines prior knowledge and the temporal-difference method in reinforcement learning. The overall RL architecture involves two high-level components: the adaptive primary learning architecture and the actor-critic architecture. In this work we consider a Stackelberg security game with a metric state space for a class of time-discrete ergodic controllable Markov chain games. To compute the equilibrium point we employ the extraproximal method. Finally, a game-theoretic example illustrates the main results and the effectiveness of the method.
UR - http://www.scopus.com/inward/record.url?scp=85010748463&partnerID=8YFLogxK
U2 - 10.1109/CDC.2016.7799111
DO - 10.1109/CDC.2016.7799111
M3 - Conference contribution
AN - SCOPUS:85010748463
T3 - 2016 IEEE 55th Conference on Decision and Control, CDC 2016
SP - 5484
EP - 5489
BT - 2016 IEEE 55th Conference on Decision and Control, CDC 2016
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 55th IEEE Conference on Decision and Control, CDC 2016
Y2 - 12 December 2016 through 14 December 2016
ER -