TY - GEN

T1 - A parallel rollout algorithm for wildfire suppression

AU - Montenegro, Mauro

AU - López, Roberto

AU - Menchaca-Méndez, Rolando

AU - Becerra, Emanuel

AU - Menchaca-Méndez, Ricardo

N1 - Publisher Copyright:
© 2020, Springer Nature Switzerland AG.

PY - 2020

Y1 - 2020

N2 - In this paper, we formulate the problem of optimal Wildfire Suppression as an infinite horizon Decision Process (DP) problem where an agent (e.g., a robotic firefighter) decides which areas to intervene to extinguish the fire. The dynamics of the wildfire is modeled by a cellular automaton whose state at time k is defined as a bi-dimensional grid $$x:k$$ where each cell in this grid describes the state of a rectangular geographic region of the wildland. The proposed algorithm, which is based on a non-parametric reinforcement learning (RL) methodology, computes optimized control policies that determine the agent’s actions that minimize a cost function that aims to preserve most of the cells with trees. From a given state $$x:k$$, the proposed algorithms employs rollout to take advantage of heuristic solutions to approximate, in polynomial-time, the future cost function. Two different heuristics approaches were applied: A corrective-based model that only takes into account surrounding burning cells, and a predictive strategy that calculates a coefficient-based metric over nearby trees and empty cells. We implemented a parallel sampler using CUDA to simulate the trajectories that rollout generates. This parallel implementation allows us to increase the number of lookahead steps without incurring in large computing times. Our experimental results show that the rollout strategy outperforms the base heuristics and that effectively suppresses wildfires.

AB - In this paper, we formulate the problem of optimal Wildfire Suppression as an infinite horizon Decision Process (DP) problem where an agent (e.g., a robotic firefighter) decides which areas to intervene to extinguish the fire. The dynamics of the wildfire is modeled by a cellular automaton whose state at time k is defined as a bi-dimensional grid $$x:k$$ where each cell in this grid describes the state of a rectangular geographic region of the wildland. The proposed algorithm, which is based on a non-parametric reinforcement learning (RL) methodology, computes optimized control policies that determine the agent’s actions that minimize a cost function that aims to preserve most of the cells with trees. From a given state $$x:k$$, the proposed algorithms employs rollout to take advantage of heuristic solutions to approximate, in polynomial-time, the future cost function. Two different heuristics approaches were applied: A corrective-based model that only takes into account surrounding burning cells, and a predictive strategy that calculates a coefficient-based metric over nearby trees and empty cells. We implemented a parallel sampler using CUDA to simulate the trajectories that rollout generates. This parallel implementation allows us to increase the number of lookahead steps without incurring in large computing times. Our experimental results show that the rollout strategy outperforms the base heuristics and that effectively suppresses wildfires.

UR - http://www.scopus.com/inward/record.url?scp=85096615013&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-62554-2_18

DO - 10.1007/978-3-030-62554-2_18

M3 - Contribución a la conferencia

AN - SCOPUS:85096615013

SN - 9783030625535

T3 - Communications in Computer and Information Science

SP - 244

EP - 255

BT - Telematics and Computing - 9th International Congress, WITCOM 2020, Proceedings

A2 - Mata-Rivera, Miguel Félix

A2 - Zagal-Flores, Roberto

A2 - Barria-Huidobro, Cristian

PB - Springer Science and Business Media Deutschland GmbH

Y2 - 2 November 2020 through 6 November 2020

ER -