Traffic-signal control reinforcement learning approach for continuous-time Markov games

Román Aragon-Gómez; Julio B. Clempner

doi:10.1016/j.engappai.2019.103415

Escuela Superior de Física y Matemáticas (ESFM)

Research output: Contribution to journal › Article › peer-review

22 Scopus citations

Abstract

Traffic-Signal Control (TSC) models have been transformed from simple pre-timed isolated indications to a more complex form of actuated and coordinated TSC models for highways, railroads, etc. However, existing TSC models cannot always manage problems such as over-saturation, incident-related delays, and weather-related congestion, which is why this is still an open area of research. An important challenge is to propose a TSC solution model for multiple intersections that adapts traffic signal timing according to real-time traffic. This paper introduces a novel Reinforcement Learning (RL) approach for solving the Traffic-Signal Control problem for multiple intersections using Continuous-Time Markov Games (CTMG). The RL model is based on a temporal difference method. To estimate the transition rates of the Markov model, we use non-degenerate randomized Markov laws, such that the connected chain is shown to be ergodic and to visit all states infinitely often, using all the controls in every state. Our reinforcement learning model assumes complete information. The transition rates are estimated as the number of transitions observed over a time interval divided by the total holding time. The rewards are estimated as the arithmetic mean of the observed rewards. We consider a non-cooperative game model for solving the multiple intersections problem. For computing the Nash equilibrium, we employ an iterative proximal gradient method. As our final contribution, we present a numerical example for validating our model and concretely measure the benefits of the TSC model.
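The two estimators described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `estimate_model`, the tuple layout of the trajectory, and the state and action labels are all assumptions made for illustration. It reads the abstract literally, taking the transition rate as a count of observed jumps divided by the total holding time, and the reward as the arithmetic mean of observed rewards, both per state-action pair.

```python
from collections import defaultdict


def estimate_model(trajectory):
    """Estimate transition rates and mean rewards from observed jumps.

    trajectory: iterable of (state, action, holding_time, next_state, reward)
    tuples. This layout is a hypothetical choice, not taken from the paper.
    """
    counts = defaultdict(int)        # (i, k, j) -> number of jumps i -> j under action k
    holding = defaultdict(float)     # (i, k) -> total time held in i under action k
    reward_sum = defaultdict(float)  # (i, k) -> sum of observed rewards
    visits = defaultdict(int)        # (i, k) -> number of observations

    for state, action, dt, next_state, reward in trajectory:
        counts[(state, action, next_state)] += 1
        holding[(state, action)] += dt
        reward_sum[(state, action)] += reward
        visits[(state, action)] += 1

    # Rate estimate: number of transitions divided by total holding time.
    rates = {
        (i, k, j): n / holding[(i, k)]
        for (i, k, j), n in counts.items()
        if holding[(i, k)] > 0
    }
    # Reward estimate: arithmetic mean of the observed rewards.
    rewards = {ik: reward_sum[ik] / visits[ik] for ik in visits}
    return rates, rewards


if __name__ == "__main__":
    # Hypothetical observations for a single intersection.
    observed = [
        ("low", "green", 2.0, "high", 1.0),
        ("high", "red", 0.5, "low", 0.0),
        ("low", "green", 2.0, "high", 3.0),
    ]
    rates, rewards = estimate_model(observed)
    print(rates)
    print(rewards)
```

State-action pairs that were never observed simply do not appear in the returned dictionaries, which is one reason the abstract requires randomized laws that visit every state and use every control.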

Original language	English
Article number	103415
Journal	Engineering Applications of Artificial Intelligence
Volume	89
DOIs	https://doi.org/10.1016/j.engappai.2019.103415
State	Published - Mar 2020

Keywords

Continuous-time
Markov models
Nash games
Traffic signal

Cite this

@article{428525768e7b42c1b9d977ef090a3ca9,
title = "Traffic-signal control reinforcement learning approach for continuous-time Markov games",
abstract = "Traffic-Signal Control (TSC) models have been transformed from simple pre-timed isolated indications to a more complex form of actuated and coordinated TSC models for highways, railroads, etc. However, existing TSC models cannot always manage problems such as over-saturation, incident-related delays, and weather-related congestion, which is why this is still an open area of research. An important challenge is to propose a TSC solution model for multiple intersections that adapts traffic signal timing according to real-time traffic. This paper introduces a novel Reinforcement Learning (RL) approach for solving the Traffic-Signal Control problem for multiple intersections using Continuous-Time Markov Games (CTMG). The RL model is based on a temporal difference method. To estimate the transition rates of the Markov model, we use non-degenerate randomized Markov laws, such that the connected chain is shown to be ergodic and to visit all states infinitely often, using all the controls in every state. Our reinforcement learning model assumes complete information. The transition rates are estimated as the number of transitions observed over a time interval divided by the total holding time. The rewards are estimated as the arithmetic mean of the observed rewards. We consider a non-cooperative game model for solving the multiple intersections problem. For computing the Nash equilibrium, we employ an iterative proximal gradient method. As our final contribution, we present a numerical example for validating our model and concretely measure the benefits of the TSC model.",
keywords = "Continuous-time, Markov models, Nash games, Traffic signal",
author = "Aragon-G{\'o}mez, Rom{\'a}n and Clempner, {Julio B.}",
note = "Publisher Copyright: {\textcopyright} 2019 Elsevier Ltd",
year = "2020",
month = mar,
doi = "10.1016/j.engappai.2019.103415",
language = "English",
volume = "89",
journal = "Engineering Applications of Artificial Intelligence",
issn = "0952-1976",
}

TY - JOUR
T1 - Traffic-signal control reinforcement learning approach for continuous-time Markov games
AU - Aragon-Gómez, Román
AU - Clempner, Julio B.
PY - 2020/3
Y1 - 2020/3
N2 - Traffic-Signal Control (TSC) models have been transformed from simple pre-timed isolated indications to a more complex form of actuated and coordinated TSC models for highways, railroads, etc. However, existing TSC models cannot always manage problems such as over-saturation, incident-related delays, and weather-related congestion, which is why this is still an open area of research. An important challenge is to propose a TSC solution model for multiple intersections that adapts traffic signal timing according to real-time traffic. This paper introduces a novel Reinforcement Learning (RL) approach for solving the Traffic-Signal Control problem for multiple intersections using Continuous-Time Markov Games (CTMG). The RL model is based on a temporal difference method. To estimate the transition rates of the Markov model, we use non-degenerate randomized Markov laws, such that the connected chain is shown to be ergodic and to visit all states infinitely often, using all the controls in every state. Our reinforcement learning model assumes complete information. The transition rates are estimated as the number of transitions observed over a time interval divided by the total holding time. The rewards are estimated as the arithmetic mean of the observed rewards. We consider a non-cooperative game model for solving the multiple intersections problem. For computing the Nash equilibrium, we employ an iterative proximal gradient method. As our final contribution, we present a numerical example for validating our model and concretely measure the benefits of the TSC model.
AB - Traffic-Signal Control (TSC) models have been transformed from simple pre-timed isolated indications to a more complex form of actuated and coordinated TSC models for highways, railroads, etc. However, existing TSC models cannot always manage problems such as over-saturation, incident-related delays, and weather-related congestion, which is why this is still an open area of research. An important challenge is to propose a TSC solution model for multiple intersections that adapts traffic signal timing according to real-time traffic. This paper introduces a novel Reinforcement Learning (RL) approach for solving the Traffic-Signal Control problem for multiple intersections using Continuous-Time Markov Games (CTMG). The RL model is based on a temporal difference method. To estimate the transition rates of the Markov model, we use non-degenerate randomized Markov laws, such that the connected chain is shown to be ergodic and to visit all states infinitely often, using all the controls in every state. Our reinforcement learning model assumes complete information. The transition rates are estimated as the number of transitions observed over a time interval divided by the total holding time. The rewards are estimated as the arithmetic mean of the observed rewards. We consider a non-cooperative game model for solving the multiple intersections problem. For computing the Nash equilibrium, we employ an iterative proximal gradient method. As our final contribution, we present a numerical example for validating our model and concretely measure the benefits of the TSC model.
KW - Continuous-time
KW - Markov models
KW - Nash games
KW - Traffic signal
UR - http://www.scopus.com/inward/record.url?scp=85076713535&partnerID=8YFLogxK
U2 - 10.1016/j.engappai.2019.103415
DO - 10.1016/j.engappai.2019.103415
M3 - Article
SN - 0952-1976
VL - 89
JO - Engineering Applications of Artificial Intelligence
JF - Engineering Applications of Artificial Intelligence
M1 - 103415
ER -
