A deep reinforcement learning algorithm based on modified Twin delay DDPG method for robotic applications

Carlos Vasquez-Jalpa; Mariko Nakano-Miyatake; Enrique Escamilla-Hernandez

doi:10.23919/ICCAS52745.2021.9649882

Escuela Superior de Ingeniería Mecánica y Eléctrica (ESIME), Unidad Culhuacán

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

This paper proposes a deep reinforcement learning algorithm for autonomous robotics in which we modify the twin delayed deep deterministic policy gradient (TD3) method to suit autonomous robots with a higher degree of freedom in movement. To let the robot move freely in 2D space without colliding with obstacles such as walls, it is equipped with three cameras. The images captured by the cameras are used to train convolutional neural networks (CNNs) to recognize whether the environment presents a collision or not. We add two parameters, the observation O (the images obtained from the cameras) and the degree of turn deg, to the original TD3 parameters, which comprise four values: [state s, reward r, action a, next state s']. To determine the next action with a higher reward from the observation, two additional neural networks are constructed: the first determines an action from the observation, and the second determines the degree of turn from the observation and the action. Simulation results in three environments built with CoppeliaSim show that the proposed algorithm performs well, reaching the target with higher rewards even though the environments are unknown to the robot.
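The six-value transition described in the abstract can be pictured as a replay-buffer record. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the names `Transition` and `ReplayBuffer`, the field order, and the uniform sampling are all hypothetical choices made for illustration.

```python
import random
from collections import deque
from typing import NamedTuple, Sequence


class Transition(NamedTuple):
    """One stored step. Field names and order are illustrative assumptions."""
    state: Sequence[float]       # s
    action: Sequence[float]      # a
    reward: float                # r
    next_state: Sequence[float]  # s'
    observation: object          # O: camera images (any array-like object)
    deg: float                   # degree of turn


class ReplayBuffer:
    """Fixed-capacity buffer; the oldest transitions are dropped first."""

    def __init__(self, capacity: int) -> None:
        self._items = deque(maxlen=capacity)

    def add(self, transition: Transition) -> None:
        self._items.append(transition)

    def sample(self, batch_size: int) -> list:
        # Uniform sampling without replacement, capped at the buffer size.
        items = list(self._items)
        return random.sample(items, min(batch_size, len(items)))

    def __len__(self) -> int:
        return len(self._items)
```

A training loop would call `add` once per environment step and `sample` before each network update; how the two additional networks consume `observation` and `deg` is not specified in the abstract and is left out here.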

Original language	English
Title of host publication	2021 21st International Conference on Control, Automation and Systems, ICCAS 2021
Publisher	IEEE Computer Society
Pages	743-748
Number of pages	6
ISBN (Electronic)	9788993215212
DOIs	https://doi.org/10.23919/ICCAS52745.2021.9649882
State	Published - 2021
Event	21st International Conference on Control, Automation and Systems, ICCAS 2021 - Jeju, Korea, Republic of. Duration: 12 Oct 2021 → 15 Oct 2021

Publication series

Name	International Conference on Control, Automation and Systems
Volume	2021-October
ISSN (Print)	1598-7833

Conference

Conference	21st International Conference on Control, Automation and Systems, ICCAS 2021
Country/Territory	Korea, Republic of
City	Jeju
Period	12/10/21 → 15/10/21

Keywords

Actor-Critic
Deep Q-Learning
Deep Reinforcement Learning
Policy Gradient
Robot Vision

Access to Document

10.23919/ICCAS52745.2021.9649882

Cite this

Vasquez-Jalpa, C., Nakano-Miyatake, M., & Escamilla-Hernandez, E. (2021). A deep reinforcement learning algorithm based on modified Twin delay DDPG method for robotic applications. In 2021 21st International Conference on Control, Automation and Systems, ICCAS 2021 (pp. 743-748). (International Conference on Control, Automation and Systems; Vol. 2021-October). IEEE Computer Society. https://doi.org/10.23919/ICCAS52745.2021.9649882

Vasquez-Jalpa, Carlos; Nakano-Miyatake, Mariko; Escamilla-Hernandez, Enrique. / A deep reinforcement learning algorithm based on modified Twin delay DDPG method for robotic applications. 2021 21st International Conference on Control, Automation and Systems, ICCAS 2021. IEEE Computer Society, 2021. pp. 743-748 (International Conference on Control, Automation and Systems).

@inproceedings{e815cb8637ac489dbdcd8cab272900e7,

title = "A deep reinforcement learning algorithm based on modified Twin delay DDPG method for robotic applications",

abstract = "This paper proposes a deep reinforcement learning algorithm for autonomous robotics in which we modify the twin delayed deep deterministic policy gradient (TD3) method to suit autonomous robots with a higher degree of freedom in movement. To let the robot move freely in 2D space without colliding with obstacles such as walls, it is equipped with three cameras. The images captured by the cameras are used to train convolutional neural networks (CNNs) to recognize whether the environment presents a collision or not. We add two parameters, the observation O (the images obtained from the cameras) and the degree of turn deg, to the original TD3 parameters, which comprise four values: [state s, reward r, action a, next state s']. To determine the next action with a higher reward from the observation, two additional neural networks are constructed: the first determines an action from the observation, and the second determines the degree of turn from the observation and the action. Simulation results in three environments built with CoppeliaSim show that the proposed algorithm performs well, reaching the target with higher rewards even though the environments are unknown to the robot.",

keywords = "Actor-Critic, Deep Q-Learning, Deep Reinforcement Learning, Policy Gradient, Robot Vision",

author = "Carlos Vasquez-Jalpa and Mariko Nakano-Miyatake and Enrique Escamilla-Hernandez",

note = "Publisher Copyright: {\textcopyright} 2021 ICROS.; 21st International Conference on Control, Automation and Systems, ICCAS 2021 ; Conference date: 12-10-2021 Through 15-10-2021",

year = "2021",

doi = "10.23919/ICCAS52745.2021.9649882",

language = "English",

series = "International Conference on Control, Automation and Systems",

publisher = "IEEE Computer Society",

pages = "743--748",

booktitle = "2021 21st International Conference on Control, Automation and Systems, ICCAS 2021",

address = "United States",

}

Vasquez-Jalpa, C, Nakano-Miyatake, M & Escamilla-Hernandez, E 2021, A deep reinforcement learning algorithm based on modified Twin delay DDPG method for robotic applications. in 2021 21st International Conference on Control, Automation and Systems, ICCAS 2021. International Conference on Control, Automation and Systems, vol. 2021-October, IEEE Computer Society, pp. 743-748, 21st International Conference on Control, Automation and Systems, ICCAS 2021, Jeju, Korea, Republic of, 12/10/21. https://doi.org/10.23919/ICCAS52745.2021.9649882

A deep reinforcement learning algorithm based on modified Twin delay DDPG method for robotic applications. / Vasquez-Jalpa, Carlos; Nakano-Miyatake, Mariko; Escamilla-Hernandez, Enrique.
2021 21st International Conference on Control, Automation and Systems, ICCAS 2021. IEEE Computer Society, 2021. p. 743-748 (International Conference on Control, Automation and Systems; Vol. 2021-October).

TY - GEN

T1 - A deep reinforcement learning algorithm based on modified Twin delay DDPG method for robotic applications

AU - Vasquez-Jalpa, Carlos

AU - Nakano-Miyatake, Mariko

AU - Escamilla-Hernandez, Enrique

PY - 2021

Y1 - 2021

AB - This paper proposes a deep reinforcement learning algorithm for autonomous robotics in which we modify the twin delayed deep deterministic policy gradient (TD3) method to suit autonomous robots with a higher degree of freedom in movement. To let the robot move freely in 2D space without colliding with obstacles such as walls, it is equipped with three cameras. The images captured by the cameras are used to train convolutional neural networks (CNNs) to recognize whether the environment presents a collision or not. We add two parameters, the observation O (the images obtained from the cameras) and the degree of turn deg, to the original TD3 parameters, which comprise four values: [state s, reward r, action a, next state s']. To determine the next action with a higher reward from the observation, two additional neural networks are constructed: the first determines an action from the observation, and the second determines the degree of turn from the observation and the action. Simulation results in three environments built with CoppeliaSim show that the proposed algorithm performs well, reaching the target with higher rewards even though the environments are unknown to the robot.

KW - Actor-Critic

KW - Deep Q-Learning

KW - Deep Reinforcement Learning

KW - Policy Gradient

KW - Robot Vision

UR - http://www.scopus.com/inward/record.url?scp=85124227977&partnerID=8YFLogxK

U2 - 10.23919/ICCAS52745.2021.9649882

DO - 10.23919/ICCAS52745.2021.9649882

M3 - Conference contribution

AN - SCOPUS:85124227977

T3 - International Conference on Control, Automation and Systems

SP - 743

EP - 748

BT - 2021 21st International Conference on Control, Automation and Systems, ICCAS 2021

PB - IEEE Computer Society

T2 - 21st International Conference on Control, Automation and Systems, ICCAS 2021

Y2 - 12 October 2021 through 15 October 2021

ER -

Vasquez-Jalpa C, Nakano-Miyatake M, Escamilla-Hernandez E. A deep reinforcement learning algorithm based on modified Twin delay DDPG method for robotic applications. In 2021 21st International Conference on Control, Automation and Systems, ICCAS 2021. IEEE Computer Society. 2021. p. 743-748. (International Conference on Control, Automation and Systems). doi: 10.23919/ICCAS52745.2021.9649882
