TY - JOUR
T1 - Leveraging Expert Demonstration Features for Deep Reinforcement Learning in Floor Cleaning Robot Navigation
AU - Cimurs, Reinis
AU - Merchán-Cruz, Emmanuel Alejandro
PY - 2022/10/12
Y1 - 2022/10/12
AB - In this paper, a Deep Reinforcement Learning (DRL)-based approach for learning mobile cleaning robot navigation commands that leverage experience from expert demonstrations is presented. First, expert demonstrations of robot motion trajectories in simulation in the cleaning robot domain are collected. The relevant motion features with regard to the distance to obstacles and the heading difference towards the navigation goal are extracted. Each feature weight is optimized with respect to the collected data, and the obtained values are assumed as representing the optimal motion of the expert navigation. A reward function is created based on the feature values to train a policy with semi-supervised DRL, where an immediate reward is calculated based on the closeness to the expert navigation. The presented results show the viability of this approach with regard to robot navigation as well as the reduced training time.
KW - Deep Reinforcement Learning
KW - autonomous cleaning robots
KW - mobile robot navigation
KW - semi-supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85140933449&partnerID=8YFLogxK
U2 - 10.3390/s22207750
DO - 10.3390/s22207750
M3 - Article
C2 - 36298101
AN - SCOPUS:85140933449
SN - 1424-8220
VL - 22
JO - Sensors (Switzerland)
JF - Sensors (Switzerland)
IS - 20
M1 - 7750
ER -