A Dynamics Theory of Implicit Regularization in Deep Low-Rank Matrix Factorization
Abstract
Implicit regularization induced by gradient-based optimization is an important lens for understanding generalization in neural networks. Recent theory explains implicit regularization in the deep matrix factorization (DMF) model by analyzing the trajectory of discrete gradient dynamics during optimization. These discrete gradient dynamics can mathematically characterize the practical learning rates of adaptive gradient methods such as RMSProp. Discrete gradient dynamics analysis has been applied successfully to shallow networks, but the computation becomes prohibitively complex for deep networks. In this work, we introduce an alternative discrete gradient dynamics approach, landscape analysis, to explain the implicit regularization of RMSProp. It focuses on critical regions of the loss landscape, such as saddle points and local minima. We find that increasing the learning rate benefits the saddle-point escaping (SPE) stages. By elucidating implicit regularization through the convergence of RMSProp, we prove that for rank-R matrix reconstruction, DMF converges to a second-order critical point after R stages of SPE. This conclusion is further verified experimentally on a low-rank matrix reconstruction problem.
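To make the setting concrete, the sketch below trains a deep matrix factorization with RMSProp on a low-rank matrix reconstruction task and tracks a rough effective-rank proxy as training proceeds. It is a minimal illustration assuming PyTorch; the depth, matrix size, rank, learning rate, and observation mask are illustrative choices, not the paper's experimental configuration.

```python
# Minimal sketch (not the paper's code): deep matrix factorization (DMF)
# trained with RMSProp to reconstruct a low-rank matrix from partially
# observed entries. All hyperparameters below are illustrative assumptions.
import torch

torch.manual_seed(0)
n, rank, depth = 30, 3, 3          # matrix size, ground-truth rank, DMF depth

# Ground-truth rank-R matrix and a random observation mask.
U = torch.randn(n, rank)
V = torch.randn(rank, n)
M_true = U @ V
mask = (torch.rand(n, n) < 0.3).float()

# DMF: the end-to-end matrix is a product of `depth` square factors,
# initialized near zero so training starts in a low-norm (low-rank) regime.
factors = [torch.nn.Parameter(1e-3 * torch.randn(n, n)) for _ in range(depth)]

def end_to_end(factors):
    W = factors[0]
    for F in factors[1:]:
        W = W @ F
    return W

opt = torch.optim.RMSprop(factors, lr=1e-3)
for step in range(20001):
    opt.zero_grad()
    W = end_to_end(factors)
    # Reconstruction loss on observed entries only.
    loss = ((mask * (W - M_true)) ** 2).sum() / mask.sum()
    loss.backward()
    opt.step()
    if step % 5000 == 0:
        # Effective-rank proxy: singular values above a relative threshold.
        svals = torch.linalg.svdvals(W.detach())
        eff_rank = int((svals > 1e-2 * svals[0]).sum())
        print(f"step {step:6d}  loss {loss.item():.4e}  approx. rank {eff_rank}")
```

Under this kind of setup, the printed effective rank typically grows in discrete stages as training escapes successive saddle regions, which is the qualitative behavior the abstract's R-stage SPE result describes.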
Keywords: Deep learning, implicit regularization, low-rank matrix factorization, discrete gradient dynamics, saddle point