Repeated Stackelberg security games: Learning with incomplete state information

Guillermo Alcantara-Jiménez; Julio B. Clempner

doi:10.1016/j.ress.2019.106695

Escuela Superior de Física y Matemáticas (ESFM)

Research output: Contribution to journal › Article › peer-review

12 Scopus citations

Abstract

Existing applications of Stackelberg Security Games (SSGs) have made use of Reinforcement Learning (RL) approaches to learn and adapt defenders-attackers behavior. The learning process for defenders-attackers is represented by randomized strategies for the defenders applied against adversarial strategies of the attackers, which acquire feedback on their strategies by observing the target that was defended-attacked. However, most of the existing SSG RL models feature strong assumptions, including that the defenders and attackers have perfect information about the behavioral model, producing inconsistencies. We address these problems by proposing a practical framework for representing real-world security problems by empowering SSGs with an RL approach considering incomplete state information. The players’ behavior and rationality are restricted to a class of partially observed Markov games (POMG). We develop an algorithm that considers randomized strategies for both defenders and attackers and obtains feedback on their partially observed states. We propose adaptive rules for computing the estimated transition matrices and utilities considering the number of unobserved experiences in the game. Furthermore, we study the problems of convergence of the estimated transition matrices and utilities in SSGs. For the realization of the SSG, we propose a new partially observed random walk technique for the randomization in the scheduling of the patrol planning. Results are applied to security games between defenders and attackers, where the noncooperative behaviors are well characterized by the features of the learning process in Stackelberg games.

Original language	English
Article number	106695
Journal	Reliability Engineering and System Safety
Volume	195
DOIs	https://doi.org/10.1016/j.ress.2019.106695
State	Published - Mar 2020

Keywords

Incomplete information
Reinforcement learning
Security games

Access to Document

10.1016/j.ress.2019.106695

Cite this

@article{38a88ee7764b4b8e81b2639a83f51a30,

title = "Repeated Stackelberg security games: Learning with incomplete state information",

abstract = "Existing applications of Stackelberg Security Games (SSGs) have made use of Reinforcement Learning (RL) approaches to learn and adapt defenders-attackers behavior. The learning process for defenders-attackers is represented by randomized strategies for the defenders applied against adversarial strategies of the attackers, which acquire feedback on their strategies by observing the target that was defended-attacked. However, most of the existing SSG RL models feature strong assumptions, including that the defenders and attackers have perfect information about the behavioral model, producing inconsistencies. We address these problems by proposing a practical framework for representing real-world security problems by empowering SSGs with an RL approach considering incomplete state information. The players{\textquoteright} behavior and rationality are restricted to a class of partially observed Markov games (POMG). We develop an algorithm that considers randomized strategies for both defenders and attackers and obtains feedback on their partially observed states. We propose adaptive rules for computing the estimated transition matrices and utilities considering the number of unobserved experiences in the game. Furthermore, we study the problems of convergence of the estimated transition matrices and utilities in SSGs. For the realization of the SSG, we propose a new partially observed random walk technique for the randomization in the scheduling of the patrol planning. Results are applied to security games between defenders and attackers, where the noncooperative behaviors are well characterized by the features of the learning process in Stackelberg games.",

keywords = "Incomplete information, Reinforcement learning, Security games",

author = "Alcantara-Jim{\'e}nez, Guillermo and Clempner, {Julio B.}",

note = "Publisher Copyright: {\textcopyright} 2019 Elsevier Ltd",

year = "2020",

month = mar,

doi = "10.1016/j.ress.2019.106695",

language = "English",

volume = "195",

journal = "Reliability Engineering and System Safety",

issn = "0951-8320",

}

TY - JOUR

T1 - Repeated Stackelberg security games

T2 - Learning with incomplete state information

AU - Alcantara-Jiménez, Guillermo

AU - Clempner, Julio B.

PY - 2020/3

Y1 - 2020/3

N2 - Existing applications of Stackelberg Security Games (SSGs) have made use of Reinforcement Learning (RL) approaches to learn and adapt defenders-attackers behavior. The learning process for defenders-attackers is represented by randomized strategies for the defenders applied against adversarial strategies of the attackers, which acquire feedback on their strategies by observing the target that was defended-attacked. However, most of the existing SSG RL models feature strong assumptions, including that the defenders and attackers have perfect information about the behavioral model, producing inconsistencies. We address these problems by proposing a practical framework for representing real-world security problems by empowering SSGs with an RL approach considering incomplete state information. The players’ behavior and rationality are restricted to a class of partially observed Markov games (POMG). We develop an algorithm that considers randomized strategies for both defenders and attackers and obtains feedback on their partially observed states. We propose adaptive rules for computing the estimated transition matrices and utilities considering the number of unobserved experiences in the game. Furthermore, we study the problems of convergence of the estimated transition matrices and utilities in SSGs. For the realization of the SSG, we propose a new partially observed random walk technique for the randomization in the scheduling of the patrol planning. Results are applied to security games between defenders and attackers, where the noncooperative behaviors are well characterized by the features of the learning process in Stackelberg games.

AB - Existing applications of Stackelberg Security Games (SSGs) have made use of Reinforcement Learning (RL) approaches to learn and adapt defenders-attackers behavior. The learning process for defenders-attackers is represented by randomized strategies for the defenders applied against adversarial strategies of the attackers, which acquire feedback on their strategies by observing the target that was defended-attacked. However, most of the existing SSG RL models feature strong assumptions, including that the defenders and attackers have perfect information about the behavioral model, producing inconsistencies. We address these problems by proposing a practical framework for representing real-world security problems by empowering SSGs with an RL approach considering incomplete state information. The players’ behavior and rationality are restricted to a class of partially observed Markov games (POMG). We develop an algorithm that considers randomized strategies for both defenders and attackers and obtains feedback on their partially observed states. We propose adaptive rules for computing the estimated transition matrices and utilities considering the number of unobserved experiences in the game. Furthermore, we study the problems of convergence of the estimated transition matrices and utilities in SSGs. For the realization of the SSG, we propose a new partially observed random walk technique for the randomization in the scheduling of the patrol planning. Results are applied to security games between defenders and attackers, where the noncooperative behaviors are well characterized by the features of the learning process in Stackelberg games.

KW - Incomplete information

KW - Reinforcement learning

KW - Security games

UR - http://www.scopus.com/inward/record.url?scp=85073571521&partnerID=8YFLogxK

U2 - 10.1016/j.ress.2019.106695

DO - 10.1016/j.ress.2019.106695

M3 - Article

SN - 0951-8320

VL - 195

JO - Reliability Engineering and System Safety

JF - Reliability Engineering and System Safety

M1 - 106695

ER -
