Better Sampling Strategy for Locomotion Control Tasks
Authors
Junning Huang, Zhifeng Hao
Corresponding Author
Junning Huang
Available Online May 2018.
- DOI
- 10.2991/ncce-18.2018.135
- Keywords
- TRPO; OU; high dimensional; control tasks; sampling strategy; performance; convergence.
- Abstract
Recently, model-free reinforcement learning algorithms such as TRPO have achieved great success in solving locomotion control tasks. However, for difficult locomotion problems with high-dimensional visual observations, these algorithms are not sample efficient. This paper proposes an Ornstein-Uhlenbeck (OU) process sampling strategy for locomotion control tasks. Experimental results show that TRPO with the OU process sampling strategy achieves better performance and better convergence than TRPO without it.
- Copyright
- © 2018, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
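To make the OU process named in the abstract concrete, here is a minimal Python sketch of temporally correlated exploration noise generated by a discretized Ornstein-Uhlenbeck process. The abstract does not say how the authors combine this noise with TRPO, so everything here is an assumption made for illustration: the class name `OUNoise`, the helper `perturb_action`, the Euler-Maruyama discretization, and the parameter values (`theta`, `sigma`, `dt`).

```python
import math
import random


class OUNoise:
    """Discretized Ornstein-Uhlenbeck process (illustrative, not the paper's code).

    Continuous form: dx = theta * (mu - x) dt + sigma dW.
    Each call to sample() advances the state by one Euler-Maruyama step.
    """

    def __init__(self, dim, mu=0.0, theta=0.15, sigma=0.2, dt=1e-2, seed=None):
        # All default values are assumptions chosen for illustration only.
        self.dim = dim
        self.mu = mu
        self.theta = theta
        self.sigma = sigma
        self.dt = dt
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # Restart the process at its long-run mean (e.g. at episode start).
        self.state = [self.mu] * self.dim

    def sample(self):
        # Mean-reverting drift plus Gaussian diffusion scaled by sqrt(dt).
        scale = self.sigma * math.sqrt(self.dt)
        self.state = [
            x + self.theta * (self.mu - x) * self.dt + scale * self.rng.gauss(0.0, 1.0)
            for x in self.state
        ]
        return list(self.state)


def perturb_action(mean_action, noise):
    """Add one OU noise sample to a policy's mean action (hypothetical helper)."""
    return [a + n for a, n in zip(mean_action, noise.sample())]
```

Because each sample depends on the previous state, consecutive perturbations are correlated in time, unlike independent Gaussian noise drawn afresh at every step. That correlation is the usual motivation for OU noise in control tasks with inertia; whether the paper uses it in exactly this way is not stated in the abstract.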
Cite this article
TY - CONF
AU - Junning Huang
AU - Zhifeng Hao
PY - 2018/05
DA - 2018/05
TI - Better Sampling Strategy for Locomotion Control Tasks
BT - Proceedings of the 2018 International Conference on Network, Communication, Computer Engineering (NCCE 2018)
PB - Atlantis Press
SP - 819
EP - 824
SN - 1951-6851
UR - https://doi.org/10.2991/ncce-18.2018.135
DO - 10.2991/ncce-18.2018.135
ID - Huang2018/05
ER -