Investigation Related to the Influence of Two Data Parallel Strategies on Pytorch-Based Convolutional Neural Network Model Training
- DOI
- 10.2991/978-94-6463-300-9_62
- Keywords
- CNN; Data Parallel; Deep Learning
- Abstract
The escalating prevalence of Convolutional Neural Networks (CNNs), coupled with the continual growth of both model variants and datasets, makes a judicious data parallelism strategy essential for accelerating model training; choosing one is a significant challenge for developers and researchers alike. This paper compares data parallelism and distributed data parallelism. Experiments were designed using the CIFAR-10 dataset and the VGG16 model. It is found that multi-GPU training time under the data parallel strategy is not ideal. The causes are analyzed by studying the impact of hardware and hyperparameters on the data parallel strategy; the experimental results show that neither is the main reason for the unsatisfactory training time. Instead, judging from the unbalanced GPU utilization observed under the data parallel strategy, data path dependence may be the main factor affecting its training time. Training with the distributed data parallel strategy is then compared against the data parallel strategy, the differences between their results are analyzed, and advice on choosing a data parallel strategy is provided. The distributed data parallel strategy achieves better training time than the data parallel strategy, but accuracy must be ensured when training with multiple GPUs.
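The two strategies compared in the abstract correspond to PyTorch's `torch.nn.DataParallel` and `torch.nn.parallel.DistributedDataParallel`. Below is a minimal sketch of wrapping a model with each; it uses a tiny stand-in CNN rather than the paper's VGG16, and a single-process "gloo" group (an assumption made so the snippet also runs on CPU), not the paper's actual training setup.

```python
import torch
import torch.nn as nn
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# Tiny stand-in CNN (illustrative only; the paper trains VGG16 on CIFAR-10).
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 10),  # 10 classes, as in CIFAR-10
)
batch = torch.randn(4, 3, 32, 32)  # CIFAR-10-sized images

# Strategy 1: DataParallel -- a single process scatters each batch across
# the visible GPUs and gathers outputs on the default device; it falls
# back to plain CPU execution when no GPU is present.
dp_model = nn.DataParallel(model)
dp_out = dp_model(batch)

# Strategy 2: DistributedDataParallel -- one process per device, with
# gradients synchronized by all-reduce. Shown here as a single-process
# "gloo" group so it runs anywhere; real multi-GPU training would launch
# one rank per GPU (e.g. via torchrun) with the "nccl" backend.
dist.init_process_group(
    "gloo", init_method="tcp://127.0.0.1:29500", rank=0, world_size=1
)
ddp_model = DDP(model)
ddp_out = ddp_model(batch)
dist.destroy_process_group()

print(dp_out.shape, ddp_out.shape)  # both torch.Size([4, 10])
```

In practice the key difference the paper measures follows from these two designs: `DataParallel` funnels all scatter/gather traffic through one process, while `DistributedDataParallel` gives each GPU its own process and overlaps gradient all-reduce with the backward pass.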
- Copyright
- © 2023 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY  - CONF
AU  - Hao Xue
PY  - 2023
DA  - 2023/11/27
TI  - Investigation Related to the Influence of Two Data Parallel Strategies on Pytorch-Based Convolutional Neural Network Model Training
BT  - Proceedings of the 2023 International Conference on Image, Algorithms and Artificial Intelligence (ICIAAI 2023)
PB  - Atlantis Press
SP  - 600
EP  - 608
SN  - 2352-538X
UR  - https://doi.org/10.2991/978-94-6463-300-9_62
DO  - 10.2991/978-94-6463-300-9_62
ID  - Xue2023
ER  -