Adapting attackers and defenders patrolling strategies: A reinforcement learning approach for Stackelberg security games

Kristal K. Trejo, Julio B. Clempner, Alexander S. Poznyak

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

© 2018 Elsevier Inc. This paper presents a novel approach for adapting attackers' and defenders' preferred patrolling strategies using reinforcement learning (RL) based on average rewards in Stackelberg security games. We propose a framework that combines three different paradigms: prior knowledge, imitation, and the temporal-difference method. The overall RL architecture involves two main components: the Adaptive Primary Learning architecture and the Actor–critic architecture. In this work we consider that defenders and attackers form coalitions in the Stackelberg security game; these are reached by computing the Strong Lp-Stackelberg/Nash equilibrium. We present a numerical example that validates the proposed RL approach, measuring the benefits for security resource allocation.
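The abstract describes an average-reward reinforcement-learning scheme with an actor–critic component for adapting patrolling strategies. As a rough illustration only, the sketch below implements a generic average-reward actor–critic temporal-difference update for a single defender choosing among a few patrol targets. The toy state/reward model, the step sizes, and the fixed (non-adaptive) attacker mix are assumptions made for illustration; this is not the paper's Adaptive Primary Learning architecture, coalition model, or Strong Lp-Stackelberg/Nash computation.

```python
# Minimal sketch (illustrative assumptions, not the authors' implementation):
# average-reward actor-critic TD learning of a defender's mixed patrolling
# strategy over a small set of targets.
import numpy as np

rng = np.random.default_rng(0)

n_targets = 3                       # hypothetical patrol targets
theta = np.zeros(n_targets)         # actor: softmax preference parameters
v = np.zeros(n_targets)             # critic: differential value per target
rho = 0.0                           # running estimate of the average reward
alpha, beta, eta = 0.05, 0.1, 0.01  # actor / critic / average-reward step sizes

attack_probs = np.array([0.6, 0.2, 0.2])  # fixed attacker mix for this toy;
                                          # in the paper the attacker coalition adapts too

def policy(theta):
    """Softmax mixed strategy over patrol targets."""
    z = np.exp(theta - theta.max())
    return z / z.sum()

def reward(defended, attacked):
    """Illustrative payoff: +1 if the patrolled target is attacked, -1 otherwise."""
    return 1.0 if defended == attacked else -1.0

state = 0
for t in range(10_000):
    pi = policy(theta)
    defended = rng.choice(n_targets, p=pi)
    attacked = rng.choice(n_targets, p=attack_probs)
    r = reward(defended, attacked)
    next_state = attacked

    # Average-reward (undiscounted) temporal-difference error
    delta = r - rho + v[next_state] - v[state]

    rho += eta * delta                           # update average-reward estimate
    v[state] += beta * delta                     # critic update
    grad_log = -pi                               # gradient of log softmax ...
    grad_log[defended] += 1.0                    # ... at the chosen target
    theta += alpha * delta * grad_log            # actor (policy-gradient) update

    state = next_state

print("learned patrolling strategy:", np.round(policy(theta), 3))
```

Under these assumptions the defender's mixed strategy drifts toward the most-attacked target; in the paper both sides adapt, so the learning dynamics and the equilibrium computation are considerably richer.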
Original language: American English
Pages (from-to): 35-54
Number of pages: 29
Journal: Journal of Computer and System Sciences
DOI: 10.1016/j.jcss.2017.12.004
State: Published - 1 Aug 2018

Fingerprint

Reinforcement learning
Game
Stackelberg Equilibrium
Imitation
Coalitions
Prior Knowledge
Reward
Nash Equilibrium
Resource Allocation
Difference Method
Paradigm
Numerical Examples
Computing
Architecture
Strategy

Cite this

@article{a298c51fde834f879f8c942d81eadf71,
title = "Adapting attackers and defenders patrolling strategies: A reinforcement learning approach for Stackelberg security games",
abstract = "{\circledC} 2018 Elsevier Inc. This paper presents a novel approach for adapting attackers' and defenders' preferred patrolling strategies using reinforcement learning (RL) based on average rewards in Stackelberg security games. We propose a framework that combines three different paradigms: prior knowledge, imitation, and the temporal-difference method. The overall RL architecture involves two main components: the Adaptive Primary Learning architecture and the Actor–critic architecture. In this work we consider that defenders and attackers form coalitions in the Stackelberg security game; these are reached by computing the Strong Lp-Stackelberg/Nash equilibrium. We present a numerical example that validates the proposed RL approach, measuring the benefits for security resource allocation.",
author = "Trejo, {Kristal K.} and Clempner, {Julio B.} and Poznyak, {Alexander S.}",
year = "2018",
month = "8",
day = "1",
doi = "10.1016/j.jcss.2017.12.004",
language = "American English",
pages = "35--54",
journal = "Journal of Computer and System Sciences",
issn = "0022-0000",
publisher = "Academic Press Inc.",

}

Adapting attackers and defenders patrolling strategies: A reinforcement learning approach for Stackelberg security games. / Trejo, Kristal K.; Clempner, Julio B.; Poznyak, Alexander S.

In: Journal of Computer and System Sciences, 01.08.2018, p. 35-54.

Research output: Contribution to journal › Article

TY - JOUR

T1 - Adapting attackers and defenders patrolling strategies: A reinforcement learning approach for Stackelberg security games

AU - Trejo, Kristal K.

AU - Clempner, Julio B.

AU - Poznyak, Alexander S.

PY - 2018/8/1

Y1 - 2018/8/1

N2 - © 2018 Elsevier Inc. This paper presents a novel approach for adapting attackers' and defenders' preferred patrolling strategies using reinforcement learning (RL) based on average rewards in Stackelberg security games. We propose a framework that combines three different paradigms: prior knowledge, imitation, and the temporal-difference method. The overall RL architecture involves two main components: the Adaptive Primary Learning architecture and the Actor–critic architecture. In this work we consider that defenders and attackers form coalitions in the Stackelberg security game; these are reached by computing the Strong Lp-Stackelberg/Nash equilibrium. We present a numerical example that validates the proposed RL approach, measuring the benefits for security resource allocation.

AB - © 2018 Elsevier Inc. This paper presents a novel approach for adapting attackers' and defenders' preferred patrolling strategies using reinforcement learning (RL) based on average rewards in Stackelberg security games. We propose a framework that combines three different paradigms: prior knowledge, imitation, and the temporal-difference method. The overall RL architecture involves two main components: the Adaptive Primary Learning architecture and the Actor–critic architecture. In this work we consider that defenders and attackers form coalitions in the Stackelberg security game; these are reached by computing the Strong Lp-Stackelberg/Nash equilibrium. We present a numerical example that validates the proposed RL approach, measuring the benefits for security resource allocation.

UR - https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85040372581&origin=inward

UR - https://www.scopus.com/inward/citedby.uri?partnerID=HzOxMe3b&scp=85040372581&origin=inward

U2 - 10.1016/j.jcss.2017.12.004

DO - 10.1016/j.jcss.2017.12.004

M3 - Article

SP - 35

EP - 54

JO - Journal of Computer and System Sciences

JF - Journal of Computer and System Sciences

SN - 0022-0000

ER -