Proceedings of the 2023 International Conference on Data Science, Advanced Algorithm and Intelligent Computing (DAI 2023)

Comparison of Q-learning and SARSA Reinforcement Learning Models on Cliff Walking Problem

Authors
Lv Zhong1, *
1Brunel London School, North China University of Technology, Beijing, 100144, China
*Corresponding author. Email: 21190010330@mail.ncut.edu.cn
Available Online 14 February 2024.
DOI
10.2991/978-94-6463-370-2_23
Keywords
SARSA; Q-learning; Reinforcement learning
Abstract

In this work, the performance of two value-based reinforcement learning algorithms, State-Action-Reward-State-Action (SARSA) and Q-learning, is evaluated on the cliff walking problem. Both algorithms are implemented in Python with the NumPy library, and their policy maps and reward curves are compared and analyzed. The experimental results show that SARSA is a conservative algorithm: it tends to choose a path away from the cliff, which reduces risk but increases the number of steps and the time required. Q-learning is a greedy algorithm: it tends to choose a path close to the cliff, which increases the reward but also increases fluctuation and instability. The paper discusses the balance between exploration and exploitation in the two algorithms, their performance under different parameter settings, and their adaptability and generalization ability in complex environments. It also notes several limitations and directions for future work: only a simple grid world is used as the experimental environment, without considering more complex or realistic environments; only two value-based reinforcement learning algorithms are compared, without considering other types of reinforcement learning algorithms; and only one exploration policy, the ε-greedy policy, is used, without considering other exploration policies. The paper offers some valuable contributions and insights to the reinforcement learning field.
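As a concrete illustration of the comparison described in the abstract, the following is a minimal, self-contained sketch (not the paper's implementation) of tabular SARSA and Q-learning with an ε-greedy policy on the standard 4×12 cliff-walking grid, written with NumPy as the abstract mentions. The hyperparameter values (α, γ, ε, number of episodes) are illustrative assumptions, not the settings used in the paper.

```python
# Minimal sketch (assumed setup, not the paper's code): tabular SARSA vs
# Q-learning on a 4x12 cliff-walking grid. Reward is -1 per step and -100
# for stepping onto the cliff, which resets the agent to the start.
import numpy as np

ROWS, COLS = 4, 12
START, GOAL = (3, 0), (3, 11)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(state, a):
    """One environment transition."""
    r, c = state
    dr, dc = ACTIONS[a]
    r, c = min(max(r + dr, 0), ROWS - 1), min(max(c + dc, 0), COLS - 1)
    if r == 3 and 0 < c < 11:                  # cliff cells
        return START, -100, False
    return (r, c), -1, (r, c) == GOAL

def eps_greedy(Q, s, eps, rng):
    """ε-greedy action selection over the Q-table row for state s."""
    if rng.random() < eps:
        return int(rng.integers(len(ACTIONS)))
    return int(np.argmax(Q[s]))

def run(algo, episodes=500, alpha=0.5, gamma=1.0, eps=0.1, seed=0):
    rng = np.random.default_rng(seed)
    Q = np.zeros((ROWS, COLS, len(ACTIONS)))
    returns = []
    for _ in range(episodes):
        s, total, done = START, 0, False
        a = eps_greedy(Q, s, eps, rng)
        while not done:
            s2, r, done = step(s, a)
            total += r
            a2 = eps_greedy(Q, s2, eps, rng)
            if algo == "sarsa":                # on-policy: bootstrap on the action actually taken
                target = r + gamma * Q[s2][a2] * (not done)
            else:                              # Q-learning: bootstrap on the greedy action
                target = r + gamma * np.max(Q[s2]) * (not done)
            Q[s][a] += alpha * (target - Q[s][a])
            s, a = s2, a2
        returns.append(total)
    return Q, returns

_, sarsa_ret = run("sarsa")
_, q_ret = run("qlearning")
print("mean return over last 100 episodes:",
      "SARSA", np.mean(sarsa_ret[-100:]),
      "Q-learning", np.mean(q_ret[-100:]))
```

The only difference between the two update rules is the bootstrap target: SARSA uses the value of the next action it actually selects (on-policy), while Q-learning uses the maximum value over next actions (off-policy), which is what drives SARSA toward the safer path and Q-learning toward the cliff edge described above.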

Copyright
© 2024 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.


Volume Title
Proceedings of the 2023 International Conference on Data Science, Advanced Algorithm and Intelligent Computing (DAI 2023)
Series
Advances in Intelligent Systems Research
Publication Date
14 February 2024
ISBN
978-94-6463-370-2
ISSN
1951-6851
DOI
10.2991/978-94-6463-370-2_23

Cite this article

TY  - CONF
AU  - Lv Zhong
PY  - 2024
DA  - 2024/02/14
TI  - Comparison of Q-learning and SARSA Reinforcement Learning Models on Cliff Walking Problem
BT  - Proceedings of the 2023 International Conference on Data Science, Advanced Algorithm and Intelligent Computing (DAI 2023)
PB  - Atlantis Press
SP  - 207
EP  - 213
SN  - 1951-6851
UR  - https://doi.org/10.2991/978-94-6463-370-2_23
DO  - 10.2991/978-94-6463-370-2_23
ID  - Zhong2024
ER  -