Learning an efficient gait cycle of a biped robot based on reinforcement learning and artificial neural networks

Cristyan R. Gil, Hiram Calvo, Humberto Sossa

Research output: Contribution to journal › Article › peer-review

33 Scopus citations

Abstract

Programming robots to perform different activities requires calculating sequences of joint values while taking into account many factors, such as stability and efficiency, at the same time. For walking in particular, state-of-the-art techniques for approximating these sequences are based on reinforcement learning (RL). In this work we propose a multi-level system in which the same RL method is used first to learn the configurations of the robot's joints (poses) that allow it to stand with stability, and then, at a second level, to find the sequence of poses that lets the robot travel the furthest distance in the shortest time while avoiding falls and keeping a straight path. To evaluate this, we measure the time it takes the robot to travel a given distance. To our knowledge, this is the first work that addresses both speed and trajectory precision at the same time. We implement our model in a simulated environment using Q-learning. Compared with the built-in walking modes of a NAO robot, our approach improves on the normal-speed mode and offers greater robustness at fast speed. The proposed model can be extended to other tasks and is independent of a particular robot model.
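The paper's implementation details are not reproduced here; as a rough, hypothetical illustration of the tabular Q-learning update that underlies both levels of the proposed system, a minimal sketch (with placeholder environment functions and action space, not the authors' NAO setup) might look like:

```python
import random
from collections import defaultdict

# Minimal tabular Q-learning sketch. The environment interface, state
# encoding, action set, and rewards are hypothetical placeholders.
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1
ACTIONS = list(range(4))  # e.g. candidate poses at the current level

def q_learning(env_reset, env_step, episodes=500):
    """env_reset() -> state; env_step(state, action) -> (next_state, reward, done)."""
    q = defaultdict(lambda: [0.0] * len(ACTIONS))
    for _ in range(episodes):
        state, done = env_reset(), False
        while not done:
            # epsilon-greedy action selection
            if random.random() < EPSILON:
                action = random.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward, done = env_step(state, action)
            # standard Q-learning update rule
            best_next = 0.0 if done else max(q[next_state])
            q[state][action] += ALPHA * (reward + GAMMA * best_next - q[state][action])
            state = next_state
    return q
```

In the two-level scheme described in the abstract, such an update could be run first with a reward that favors stable standing poses, and then again with a reward that favors distance covered per unit time along a straight path; these reward choices are assumptions for illustration only.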

Original language: English
Article number: 502
Journal: Applied Sciences (Switzerland)
Volume: 9
Issue number: 3
DOIs
State: Published - 1 Feb 2019

Keywords

  • Biped robots
  • Gait cycle
  • Q-learning
  • Q-networks
  • Reinforcement learning
