A Dynamics Theory of Implicit Regularization in Deep Low-Rank Matrix Factorization

25 Pages. Posted: 13 Dec 2023

Jian Cao

Xiamen University

Chen Qian

Xiamen University

Yihui Huang

Xiamen University

Dicheng Chen

Xiamen University

Yuncheng Gao

Xiamen University

Jiyang Dong

Xiamen University - Department of Electronic Science

Di Guo

Xiamen University of Technology

Xiaobo Qu

Xiamen University

Abstract

Implicit regularization induced by gradient-based optimization is an important lens for understanding generalization in neural networks. Recent theory explains implicit regularization using the deep matrix factorization (DMF) model and analyzes the trajectory of discrete gradient dynamics during optimization. These discrete gradient dynamics can mathematically characterize the practical learning rate of adaptive gradient optimizers such as RMSProp. Discrete gradient dynamics analysis has been applied successfully to shallow networks, but it becomes computationally complex for deep networks. In this work, we introduce another discrete gradient dynamics approach, landscape analysis, to explain the implicit regularization of RMSProp. It focuses mainly on gradient regions such as saddle points and local minima. We find that increasing learning rates benefit the saddle point escaping (SPE) stages. In elucidating implicit regularization through the convergence of RMSProp, we prove that, for rank-R matrix reconstruction, DMF converges to a second-order critical point after R stages of SPE. This conclusion is further verified experimentally on a low-rank matrix reconstruction problem.
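
For readers who want to experiment with the setting described in the abstract, the following is a minimal sketch, not the authors' code, of a depth-3 matrix factorization fitted to a low-rank target with RMSProp. Several things here are assumptions made only for illustration: the target is fully observed, all factors are square, and the function names (`end_to_end`, `factor_gradients`, `train_dmf_rmsprop`) and hyperparameters (learning rate, decay `beta`, initialization scale, step count) are hypothetical and not taken from the paper.

```python
import numpy as np


def end_to_end(factors):
    """Multiply the factors into the end-to-end matrix W_L ... W_1."""
    out = factors[0]
    for W in factors[1:]:
        out = W @ out
    return out


def factor_gradients(factors, target):
    """Gradients of 0.5 * ||W_L ... W_1 - target||_F^2 for each factor."""
    n = target.shape[0]
    residual = end_to_end(factors) - target
    grads = []
    for i in range(len(factors)):
        # Product of the factors applied before W_i (identity if none).
        right = end_to_end(factors[:i]) if i > 0 else np.eye(n)
        # Product of the factors applied after W_i (identity if none).
        left = end_to_end(factors[i + 1:]) if i + 1 < len(factors) else np.eye(n)
        grads.append(left.T @ residual @ right.T)
    return grads


def train_dmf_rmsprop(target, depth=3, lr=1e-3, beta=0.9, eps=1e-8,
                      init_scale=1e-3, steps=2000, seed=0):
    """Fit a deep factorization to `target` with plain RMSProp.

    All hyperparameter defaults are illustrative assumptions.
    Returns the trained factors and the loss recorded before each step.
    """
    rng = np.random.default_rng(seed)
    n = target.shape[0]
    # Small random initialization near the origin.
    factors = [init_scale * rng.standard_normal((n, n)) for _ in range(depth)]
    second_moments = [np.zeros((n, n)) for _ in range(depth)]
    losses = []
    for _ in range(steps):
        losses.append(0.5 * np.linalg.norm(end_to_end(factors) - target) ** 2)
        grads = factor_gradients(factors, target)
        for W, v, g in zip(factors, second_moments, grads):
            # RMSProp: running average of squared gradients, then a
            # per-entry normalized step (in-place updates).
            v *= beta
            v += (1.0 - beta) * g ** 2
            W -= lr * g / (np.sqrt(v) + eps)
    return factors, np.array(losses)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, rank = 10, 2
    target = rng.standard_normal((n, rank)) @ rng.standard_normal((rank, n))
    factors, losses = train_dmf_rmsprop(target)
    print("initial loss:", losses[0], "final loss:", losses[-1])
    singular_values = np.linalg.svd(end_to_end(factors), compute_uv=False)
    print("leading singular values:", singular_values[:4])
```

Plotting `losses` on a log scale, or tracking the singular values of `end_to_end(factors)` over training, is one way to look for plateau-and-drop behavior; whether it matches the R-stage picture stated in the abstract depends on the paper's actual setup, which this sketch does not reproduce.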

Keywords: Deep learning, implicit regularization, low-rank matrix factorization, discrete gradient dynamics, saddle point

Suggested Citation

Cao, Jian and Qian, Chen and Huang, Yihui and Chen, Dicheng and Gao, Yuncheng and Dong, Jiyang and Guo, Di and Qu, Xiaobo, A Dynamics Theory of Implicit Regularization in Deep Low-Rank Matrix Factorization. Available at SSRN: https://ssrn.com/abstract=4663079 or http://dx.doi.org/10.2139/ssrn.4663079

Jian Cao

Xiamen University

Xiamen, 361005
China

Chen Qian

Xiamen University

Xiamen, 361005
China

Yihui Huang

Xiamen University

Xiamen, 361005
China

Dicheng Chen

Xiamen University

Xiamen, 361005
China

Yuncheng Gao

Xiamen University

Xiamen, 361005
China

Jiyang Dong

Xiamen University - Department of Electronic Science

Xiamen
China

Di Guo

Xiamen University of Technology

Xiamen
China

Xiaobo Qu (Contact Author)

Xiamen University

Xiamen, 361005
China
