TY - GEN
T1 - A parallel rollout algorithm for wildfire suppression
AU - Montenegro, Mauro
AU - López, Roberto
AU - Menchaca-Méndez, Rolando
AU - Becerra, Emanuel
AU - Menchaca-Méndez, Ricardo
N1 - Publisher Copyright:
© 2020, Springer Nature Switzerland AG.
PY - 2020
Y1 - 2020
N2 - In this paper, we formulate the problem of optimal Wildfire Suppression as an infinite horizon Decision Process (DP) problem where an agent (e.g., a robotic firefighter) decides which areas to intervene to extinguish the fire. The dynamics of the wildfire is modeled by a cellular automaton whose state at time k is defined as a bi-dimensional grid $$x:k$$ where each cell in this grid describes the state of a rectangular geographic region of the wildland. The proposed algorithm, which is based on a non-parametric reinforcement learning (RL) methodology, computes optimized control policies that determine the agent’s actions that minimize a cost function that aims to preserve most of the cells with trees. From a given state $$x:k$$, the proposed algorithms employs rollout to take advantage of heuristic solutions to approximate, in polynomial-time, the future cost function. Two different heuristics approaches were applied: A corrective-based model that only takes into account surrounding burning cells, and a predictive strategy that calculates a coefficient-based metric over nearby trees and empty cells. We implemented a parallel sampler using CUDA to simulate the trajectories that rollout generates. This parallel implementation allows us to increase the number of lookahead steps without incurring in large computing times. Our experimental results show that the rollout strategy outperforms the base heuristics and that effectively suppresses wildfires.
AB - In this paper, we formulate the problem of optimal Wildfire Suppression as an infinite horizon Decision Process (DP) problem where an agent (e.g., a robotic firefighter) decides which areas to intervene to extinguish the fire. The dynamics of the wildfire is modeled by a cellular automaton whose state at time k is defined as a bi-dimensional grid $$x:k$$ where each cell in this grid describes the state of a rectangular geographic region of the wildland. The proposed algorithm, which is based on a non-parametric reinforcement learning (RL) methodology, computes optimized control policies that determine the agent’s actions that minimize a cost function that aims to preserve most of the cells with trees. From a given state $$x:k$$, the proposed algorithms employs rollout to take advantage of heuristic solutions to approximate, in polynomial-time, the future cost function. Two different heuristics approaches were applied: A corrective-based model that only takes into account surrounding burning cells, and a predictive strategy that calculates a coefficient-based metric over nearby trees and empty cells. We implemented a parallel sampler using CUDA to simulate the trajectories that rollout generates. This parallel implementation allows us to increase the number of lookahead steps without incurring in large computing times. Our experimental results show that the rollout strategy outperforms the base heuristics and that effectively suppresses wildfires.
UR - http://www.scopus.com/inward/record.url?scp=85096615013&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-62554-2_18
DO - 10.1007/978-3-030-62554-2_18
M3 - Contribución a la conferencia
AN - SCOPUS:85096615013
SN - 9783030625535
T3 - Communications in Computer and Information Science
SP - 244
EP - 255
BT - Telematics and Computing - 9th International Congress, WITCOM 2020, Proceedings
A2 - Mata-Rivera, Miguel Félix
A2 - Zagal-Flores, Roberto
A2 - Barria-Huidobro, Cristian
PB - Springer Science and Business Media Deutschland GmbH
T2 - 9th International Congress on Telematics and Computing, WITCOM 2020
Y2 - 2 November 2020 through 6 November 2020
ER -