Comparison of Q-learning and SARSA Reinforcement Learning Models on Cliff Walking Problem
- DOI
- 10.2991/978-94-6463-370-2_23
- Keywords
- SARSA; Q-learning; Reinforcement learning
- Abstract
In this work, the performance of two value-based reinforcement learning algorithms, State-Action-Reward-State-Action (SARSA) and Q-learning, is evaluated on the cliff walking problem. Both algorithms are implemented in Python with the NumPy library, and their policy maps and reward curves are compared and analyzed. The experimental results show that SARSA is the more conservative algorithm: it tends to choose a path away from the cliff, reducing risk at the cost of more steps and time. Q-learning is the greedier algorithm: it tends to choose a path close to the cliff, increasing the per-episode reward but also the fluctuation and instability of the learning curve. This paper discusses the balance between exploration and exploitation in the two algorithms, their performance under different parameter settings, and their adaptability and generalization ability in complex environments. It also notes several limitations and directions for future work: only a simple grid world is used as the experimental environment, rather than more complex or realistic settings; only two value-based reinforcement learning algorithms are compared, to the exclusion of other families of reinforcement learning methods; and only one exploration policy, the ε-greedy policy, is considered. These comparisons provide a useful reference point for the reinforcement learning field.
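The two algorithms the abstract contrasts differ only in their bootstrap target: SARSA (on-policy) bootstraps on the action it actually takes next, while Q-learning (off-policy) bootstraps on the greedy action. The following minimal sketch illustrates that difference on a standard 4×12 cliff-walking grid; the hyperparameters (α = 0.5, γ = 0.9, ε = 0.1, 500 episodes) and the environment code are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

ROWS, COLS, ACTIONS = 4, 12, 4          # actions: 0=up, 1=down, 2=left, 3=right
START, GOAL = (3, 0), (3, 11)           # bottom-left start, bottom-right goal
rng = np.random.default_rng(0)

def step(state, action):
    """Move one cell; stepping onto the cliff gives -100 and resets to START."""
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    r = min(max(state[0] + moves[action][0], 0), ROWS - 1)
    c = min(max(state[1] + moves[action][1], 0), COLS - 1)
    if r == 3 and 0 < c < 11:           # cliff cells along the bottom row
        return START, -100.0
    return (r, c), -1.0                 # every ordinary step costs -1

def eps_greedy(Q, s, eps=0.1):
    """ε-greedy exploration policy (the only policy the paper considers)."""
    if rng.random() < eps:
        return int(rng.integers(ACTIONS))
    return int(np.argmax(Q[s]))

def run(algorithm, episodes=500, alpha=0.5, gamma=0.9):
    Q = np.zeros((ROWS, COLS, ACTIONS))
    for _ in range(episodes):
        s = START
        a = eps_greedy(Q, s)
        while s != GOAL:
            s2, r = step(s, a)
            a2 = eps_greedy(Q, s2)
            if algorithm == "sarsa":    # on-policy: target uses the action taken
                target = r + gamma * Q[s2][a2]
            else:                       # Q-learning: target uses the greedy action
                target = r + gamma * Q[s2].max()
            Q[s][a] += alpha * (target - Q[s][a])
            s, a = s2, a2
    return Q

Q_sarsa = run("sarsa")
Q_qlearn = run("q-learning")
```

Because SARSA's target includes the ε-greedy action it will actually take, the occasional exploratory step off the cliff edge is priced into its values, which pushes its greedy path away from the cliff; Q-learning's max target ignores exploration, so its greedy path hugs the cliff edge, matching the paper's observed conservative-versus-greedy behavior.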
- Copyright
- © 2024 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY  - CONF
AU  - Lv Zhong
PY  - 2024
DA  - 2024/02/14
TI  - Comparison of Q-learning and SARSA Reinforcement Learning Models on Cliff Walking Problem
BT  - Proceedings of the 2023 International Conference on Data Science, Advanced Algorithm and Intelligent Computing (DAI 2023)
PB  - Atlantis Press
SP  - 207
EP  - 213
SN  - 1951-6851
UR  - https://doi.org/10.2991/978-94-6463-370-2_23
DO  - 10.2991/978-94-6463-370-2_23
ID  - Zhong2024
ER  -