Optimizing Synchronous Stochastic Gradient Descent with Local Efficient Sign and Model Averaging Correction

28 Pages Posted: 24 Sep 2024

Dongyang Liu

affiliation not provided to SSRN

Zeqiang Chen

affiliation not provided to SSRN

Nengcheng Chen

affiliation not provided to SSRN

Abstract

Synchronous Stochastic Gradient Descent (SSGD) is a key method in distributed deep learning, but its many synchronization rounds and the large volume of weight parameters exchanged per round create a communication bottleneck. To address this problem, this paper proposes a communication-optimized approach: Synchronous Stochastic Gradient Descent with Local Efficient Sign and Model Averaging Correction (LEFS-SGDM). LEFS-SGDM combines delayed communication with the gradient-compression technique Sign Stochastic Gradient Descent (signSGD) to reduce both communication frequency and communication volume. It further incorporates error-accumulation and global-model-constraint mechanisms to preserve training accuracy. Experiments were carried out on the Residual Network-20 (ResNet-20), Visual Geometry Group-11 (VGG-11), and Dense Convolutional Network-40 (DenseNet-40) models with the CIFAR-10 and CIFAR-100 (Canadian Institute for Advanced Research, 10 and 100 classes) datasets. Compared with existing Local Stochastic Gradient Descent, LEFS-SGDM reduces the volume of communicated data by 97.04% while improving test accuracy by 0.46%-2.32%. These results demonstrate the effectiveness of the method and its potential applicability in distributed deep learning.
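The four mechanisms named in the abstract (delayed communication via local steps, signSGD-style compression, error accumulation, and a global-model constraint that pulls local models toward their average) can be illustrated with a minimal toy sketch. This is not the authors' implementation: the objective, hyperparameters, and mixing rule (`beta`) are illustrative assumptions, written for a simulated cluster of four workers on a quadratic problem.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: worker i holds the quadratic objective f_i(w) = 0.5 * ||w - t_i||^2,
# so the minimizer of the average objective is the mean of the targets t_i.
num_workers, dim = 4, 10
targets = rng.normal(0.0, 1.0, size=(num_workers, dim))

def local_grad(w, i):
    return w - targets[i]  # gradient of f_i at w

w_global = np.zeros(dim)
locals_ = np.tile(w_global, (num_workers, 1))
errors = np.zeros_like(locals_)          # per-worker error-accumulation buffers

lr, local_steps, rounds, beta = 0.05, 5, 100, 0.5

for _ in range(rounds):
    # 1) Delayed communication: each worker runs several local SGD steps
    #    before talking to the server (instead of syncing every step).
    for i in range(num_workers):
        for _ in range(local_steps):
            locals_[i] -= lr * local_grad(locals_[i], i)

    # 2) signSGD-style compression with 3) error accumulation: each worker
    #    transmits only the sign of its model delta (1 bit per parameter);
    #    the untransmitted residual is carried into the next round.
    deltas = locals_ - w_global + errors
    signs = np.sign(deltas)
    errors = deltas - lr * signs

    # Server averages the sign updates into the global model.
    w_global = w_global + lr * signs.mean(axis=0)

    # 4) Global model constraint: pull each local model toward the
    #    averaged global model to keep workers from drifting apart.
    locals_ = beta * locals_ + (1 - beta) * w_global

opt = targets.mean(axis=0)               # minimizer of the average objective
print(float(np.linalg.norm(w_global - opt)))
```

Relative to naive SSGD, each round here sends 1 bit per parameter every `local_steps` steps instead of a 32-bit value every step, which is the intuition behind the paper's reported communication savings; the exact 97.04% figure depends on the paper's configuration, not this sketch.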

Keywords: Distributed Deep Learning, Synchronous Stochastic Gradient Descent, Local Efficient Sign, Model Averaging Correction, Gradient Compression, Communication Bottleneck

Suggested Citation

Liu, Dongyang and Chen, Zeqiang and Chen, Nengcheng, Optimizing Synchronous Stochastic Gradient Descent with Local Efficient Sign and Model Averaging Correction. Available at SSRN: https://ssrn.com/abstract=4965637 or http://dx.doi.org/10.2139/ssrn.4965637

Zeqiang Chen (Contact Author)
