An Investigation into Hyperparameter Adjustment and Learning Rate Optimization Algorithm Utilizing Normal Distribution and Greedy Heuristics in Parallel Training
- DOI
- 10.2991/978-94-6463-300-9_16
- Keywords
- Data Parallel; Deep Learning; LRSA
- Abstract
In the era of large models, traditional training methods can no longer meet the massive demands for computing power and data. Distributed training alleviates this problem to some extent, but it also increases the difficulty and complexity of hyperparameter tuning. Dedicated distributed tuning algorithms and strategies are therefore needed to find the best hyperparameter combination faster. This paper proposes the Learning Rate Search Algorithm (LRSA) to quickly determine an initial learning rate, making hyperparameter tuning more efficient. The analysis shows that multi-GPU data parallelism slows down at small batch sizes because resources are spent mainly on inter-GPU communication and related overhead, which lowers effective GPU utilization and lengthens training. The paper further explores why accuracy decreases at large batch sizes and shows that LRSA effectively mitigates this degradation. It also proposes an empirical rule for determining the lower bound of the batch size. Experiments on several deep learning models indicate that LRSA improves both training efficiency and accuracy. For example, applying LRSA to VGG16 yields a learning rate with which the model matches the accuracy of a smaller batch size while training faster.
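The chapter itself details the LRSA procedure; the abstract only names its two ingredients, normal-distribution sampling and a greedy heuristic. As a hypothetical illustration of how those ingredients can combine (not the paper's exact algorithm), the sketch below samples candidate learning rates from a normal distribution centered on the current best, greedily accepts any candidate that lowers an evaluation loss, and narrows the distribution each round. The `evaluate` callback and the toy objective are stand-ins for a short training run that returns a validation loss.

```python
import random

def lrsa_search(evaluate, mu=0.1, sigma=0.05, rounds=30, samples=5, seed=0):
    """Greedy normal-distribution search for an initial learning rate.

    Hypothetical sketch: sample candidates around the current best rate,
    keep any candidate that lowers the evaluation loss, and shrink the
    sampling spread each round to focus the search.
    """
    rng = random.Random(seed)
    best_lr, best_loss = mu, evaluate(mu)
    for _ in range(rounds):
        for _ in range(samples):
            lr = rng.gauss(best_lr, sigma)
            if lr <= 0:               # learning rates must stay positive
                continue
            loss = evaluate(lr)
            if loss < best_loss:      # greedy acceptance of improvements
                best_lr, best_loss = lr, loss
        sigma *= 0.9                  # narrow the distribution over time
    return best_lr

# Stand-in for "train briefly, return validation loss": a toy objective
# whose minimum lies at lr = 0.03.
toy_loss = lambda lr: (lr - 0.03) ** 2
found_lr = lrsa_search(toy_loss)
```

Because candidates are only accepted when they improve the loss, the returned rate is never worse than the starting guess; in practice `evaluate` would be the expensive step, so `rounds` and `samples` trade search quality against compute.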
- Copyright
- © 2023 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY  - CONF
AU  - Weixuan Qiao
PY  - 2023
DA  - 2023/11/27
TI  - An Investigation into Hyperparameter Adjustment and Learning Rate Optimization Algorithm Utilizing Normal Distribution and Greedy Heuristics in Parallel Training
BT  - Proceedings of the 2023 International Conference on Image, Algorithms and Artificial Intelligence (ICIAAI 2023)
PB  - Atlantis Press
SP  - 153
EP  - 163
SN  - 2352-538X
UR  - https://doi.org/10.2991/978-94-6463-300-9_16
DO  - 10.2991/978-94-6463-300-9_16
ID  - Qiao2023
ER  -