TY - JOUR
T1 - Traffic-signal control reinforcement learning approach for continuous-time Markov games
AU - Aragon-Gómez, Román
AU - Clempner, Julio B.
N1 - Publisher Copyright:
© 2019 Elsevier Ltd
PY - 2020/3
Y1 - 2020/3
N2 - Traffic-Signal Control (TSC) models have been transformed from simple pre-timed isolated indications to more complex actuated and coordinated TSC models for highways, railroads, etc. However, existing TSC models cannot always manage issues such as over-saturation, delays caused by incidents, and congestion caused by weather conditions, which is why this remains an open area of research. An important challenge is to propose a TSC solution model for multiple intersections that adapts traffic signal timing according to real-time traffic. This paper introduces a novel Reinforcement Learning (RL) approach for solving the Traffic-Signal Control problem for multiple intersections using Continuous-Time Markov Games (CTMG). The RL model is based on a temporal difference method. For estimating the transition rates of the Markov model, we use non-degenerate randomized Markov laws, such that the connected chain is shown to be ergodic and to visit all states infinitely often, using all the controls in every state. Our reinforcement learning model assumes complete information. The transition rates are estimated as the number of transitions over a time interval divided by the total holding time. The rewards are estimated as the arithmetic mean of the observed rewards. We consider a non-cooperative game model for solving the multiple-intersections problem. For computing the Nash equilibrium, we employ an iterative proximal gradient method. As our final contribution, we present a numerical example for validating our model and concretely measuring the benefits of the TSC model.
AB - Traffic-Signal Control (TSC) models have been transformed from simple pre-timed isolated indications to more complex actuated and coordinated TSC models for highways, railroads, etc. However, existing TSC models cannot always manage issues such as over-saturation, delays caused by incidents, and congestion caused by weather conditions, which is why this remains an open area of research. An important challenge is to propose a TSC solution model for multiple intersections that adapts traffic signal timing according to real-time traffic. This paper introduces a novel Reinforcement Learning (RL) approach for solving the Traffic-Signal Control problem for multiple intersections using Continuous-Time Markov Games (CTMG). The RL model is based on a temporal difference method. For estimating the transition rates of the Markov model, we use non-degenerate randomized Markov laws, such that the connected chain is shown to be ergodic and to visit all states infinitely often, using all the controls in every state. Our reinforcement learning model assumes complete information. The transition rates are estimated as the number of transitions over a time interval divided by the total holding time. The rewards are estimated as the arithmetic mean of the observed rewards. We consider a non-cooperative game model for solving the multiple-intersections problem. For computing the Nash equilibrium, we employ an iterative proximal gradient method. As our final contribution, we present a numerical example for validating our model and concretely measuring the benefits of the TSC model.
KW - Continuous-time
KW - Markov models
KW - Nash games
KW - Traffic signal
UR - http://www.scopus.com/inward/record.url?scp=85076713535&partnerID=8YFLogxK
U2 - 10.1016/j.engappai.2019.103415
DO - 10.1016/j.engappai.2019.103415
M3 - Article
SN - 0952-1976
VL - 89
JO - Engineering Applications of Artificial Intelligence
JF - Engineering Applications of Artificial Intelligence
M1 - 103415
ER -