Adapting attackers and defenders patrolling strategies: A reinforcement learning approach for Stackelberg security games

Kristal K. Trejo; Julio B. Clempner; Alexander S. Poznyak

doi:10.1016/j.jcss.2017.12.004

Escuela Superior de Física y Matemáticas (ESFM)

Research output: Contribution to journal › Article › peer-review

18 Citations (Scopus)

Abstract

This paper presents a novel approach for adapting attackers' and defenders' preferred patrolling strategies using reinforcement learning (RL) based on average rewards in Stackelberg security games. We propose a framework that combines three different paradigms: prior knowledge, imitation, and the temporal-difference method. The overall RL architecture involves two top-level components: the Adaptive Primary Learning architecture and the Actor–critic architecture. In this work we consider that defenders and attackers form coalitions in the Stackelberg security game; these are reached by computing the Strong Lp-Stackelberg/Nash equilibrium. We present a numerical example that validates the proposed RL approach by measuring the benefits for security resource allocation.
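As a rough illustration of the leader-follower idea behind a Stackelberg security game, the following minimal Python sketch lets a defender (leader) commit to a patrol and an attacker (follower) best-respond to it. It is not the paper's method: the paper learns strategies with RL and computes a Strong Lp-Stackelberg/Nash equilibrium over coalitions, whereas this sketch only enumerates pure strategies in a two-player bimatrix game. The function name `strong_stackelberg_pure` and all payoff numbers are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple


def strong_stackelberg_pure(
    defender_payoff: List[List[float]],
    attacker_payoff: List[List[float]],
) -> Tuple[int, int, float]:
    """Pure-strategy strong Stackelberg outcome of a bimatrix game.

    Rows index the defender's (leader's) action, columns the attacker's
    (follower's) action. Returns (defender action, attacker action,
    defender payoff). Assumption: both matrices have the same shape.
    """
    best: Optional[Tuple[int, int, float]] = None
    for d, (d_row, a_row) in enumerate(zip(defender_payoff, attacker_payoff)):
        # The attacker observes the commitment d and maximizes its own payoff.
        top = max(a_row)
        best_responses = [t for t, value in enumerate(a_row) if value == top]
        # "Strong" tie-breaking: among equally good attacks, pick the one
        # that is best for the defender.
        a = max(best_responses, key=lambda t: d_row[t])
        # The defender keeps the commitment with the highest induced payoff.
        if best is None or d_row[a] > best[2]:
            best = (d, a, d_row[a])
    if best is None:
        raise ValueError("payoff matrices must be non-empty")
    return best


# Hypothetical two-target example: the defender patrols target 0 or 1,
# the attacker attacks target 0 or 1.
defender = [[1, 5], [0, 0]]
attacker = [[2, 2], [1, 0]]
print(strong_stackelberg_pure(defender, attacker))
```

Security games are usually solved over mixed (randomized) patrols; restricting the sketch to pure strategies keeps it short but will generally give the defender a worse outcome than a mixed commitment.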

Original language	English
Pages (from-to)	35-54
Number of pages	20
Journal	Journal of Computer and System Sciences
Volume	95
DOI	https://doi.org/10.1016/j.jcss.2017.12.004
Status	Published - Aug 2018

Access to document

10.1016/j.jcss.2017.12.004

Other files and links

Link to publication in Scopus

Cite this

@article{a298c51fde834f879f8c942d81eadf71,

title = "Adapting attackers and defenders patrolling strategies: A reinforcement learning approach for Stackelberg security games",

abstract = "This paper presents a novel approach for adapting attackers' and defenders' preferred patrolling strategies using reinforcement learning (RL) based on average rewards in Stackelberg security games. We propose a framework that combines three different paradigms: prior knowledge, imitation, and the temporal-difference method. The overall RL architecture involves two top-level components: the Adaptive Primary Learning architecture and the Actor–critic architecture. In this work we consider that defenders and attackers form coalitions in the Stackelberg security game; these are reached by computing the Strong Lp-Stackelberg/Nash equilibrium. We present a numerical example that validates the proposed RL approach by measuring the benefits for security resource allocation.",

keywords = "Behavioral games, Multiple players, Reinforcement learning, Security games, Stackelberg games, Strong Stackelberg/Nash equilibrium",

author = "Trejo, {Kristal K.} and Clempner, {Julio B.} and Poznyak, {Alexander S.}",

note = "Publisher Copyright: {\textcopyright} 2018 Elsevier Inc.",

year = "2018",

month = aug,

doi = "10.1016/j.jcss.2017.12.004",

language = "English",

volume = "95",

pages = "35--54",

journal = "Journal of Computer and System Sciences",

issn = "0022-0000",

}

TY - JOUR

T1 - Adapting attackers and defenders patrolling strategies

T2 - A reinforcement learning approach for Stackelberg security games

AU - Trejo, Kristal K.

AU - Clempner, Julio B.

AU - Poznyak, Alexander S.

PY - 2018/8

Y1 - 2018/8

AB - This paper presents a novel approach for adapting attackers' and defenders' preferred patrolling strategies using reinforcement learning (RL) based on average rewards in Stackelberg security games. We propose a framework that combines three different paradigms: prior knowledge, imitation, and the temporal-difference method. The overall RL architecture involves two top-level components: the Adaptive Primary Learning architecture and the Actor–critic architecture. In this work we consider that defenders and attackers form coalitions in the Stackelberg security game; these are reached by computing the Strong Lp-Stackelberg/Nash equilibrium. We present a numerical example that validates the proposed RL approach by measuring the benefits for security resource allocation.

KW - Behavioral games

KW - Multiple players

KW - Reinforcement learning

KW - Security games

KW - Stackelberg games

KW - Strong Stackelberg/Nash equilibrium

UR - http://www.scopus.com/inward/record.url?scp=85040372581&partnerID=8YFLogxK

U2 - 10.1016/j.jcss.2017.12.004

DO - 10.1016/j.jcss.2017.12.004

M3 - Article

SN - 0022-0000

VL - 95

SP - 35

EP - 54

JO - Journal of Computer and System Sciences

JF - Journal of Computer and System Sciences

ER -
