In-Person Poster presentation / poster accept
Uniform-in-time propagation of chaos for the mean-field gradient Langevin dynamics
Taiji Suzuki · Atsushi Nitanda · Denny Wu
MH1-2-3-4 #143
Keywords: [ mean-field regime ] [ Neural network optimization ] [ interacting particle system ] [ propagation of chaos ] [ Theory ]
Abstract:
The mean-field Langevin dynamics is characterized by a stochastic differential equation that arises from (noisy) gradient descent on an infinite-width two-layer neural network, which can be viewed as an interacting particle system. In this work, we establish a quantitative weak propagation of chaos result for the system, with a finite-particle discretization error of $\mathcal{O}(1/N)$ uniformly over time, where $N$ is the width of the neural network. This allows us to directly transfer the optimization guarantee for infinite-width networks to practical finite-width models without excessive overparameterization. On the technical side, our analysis differs from most existing studies on similar mean-field dynamics in that we do not require the interaction between particles to be sufficiently weak to obtain a uniform propagation of chaos, because such assumptions may not be satisfied in neural network optimization. Instead, we make use of a logarithmic Sobolev-type condition which can be verified in appropriate regularized risk minimization settings.
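To make the interacting particle system concrete, the following is a minimal sketch of a finite-width ($N$-particle) noisy gradient descent on a two-layer network, written as an Euler-Maruyama style update. It is an illustration under stated assumptions and not the authors' implementation: the `tanh` neuron, the squared loss, the $L^2$ penalty, the toy regression data, and all function and parameter names (`mfld_step`, `run_mfld`, `l2`, `temp`, and so on) are hypothetical choices made for this example.

```python
import math
import random


def neuron(particle, z):
    """One neuron h(x, z) = tanh(w * z + b), where the particle is x = (w, b)."""
    w, b = particle
    return math.tanh(w * z + b)


def network(particles, z):
    """Mean-field scaling: the output is the average of the N neurons."""
    return sum(neuron(p, z) for p in particles) / len(particles)


def regularized_risk(particles, data, l2):
    """Squared loss plus an L2 penalty averaged over particles (assumed objective)."""
    fit = sum((network(particles, z) - y) ** 2 for z, y in data) / (2 * len(data))
    penalty = l2 * sum(w * w + b * b for w, b in particles) / (2 * len(particles))
    return fit + penalty


def mfld_step(particles, data, step, l2, temp, rng):
    """One noisy gradient step applied to every particle.

    Particles interact only through the shared residuals of the averaged
    network. Gaussian noise of scale sqrt(2 * step * temp) plays the role
    of the Langevin noise term.
    """
    n = len(data)
    residuals = [network(particles, z) - y for z, y in data]
    noise_scale = math.sqrt(2.0 * step * temp)
    updated = []
    for w, b in particles:
        # Gradient of the L2 penalty with respect to this particle.
        grad_w, grad_b = l2 * w, l2 * b
        for (z, _), r in zip(data, residuals):
            slope = 1.0 - math.tanh(w * z + b) ** 2  # derivative of tanh
            grad_w += r * slope * z / n
            grad_b += r * slope / n
        updated.append((
            w - step * grad_w + noise_scale * rng.gauss(0.0, 1.0),
            b - step * grad_b + noise_scale * rng.gauss(0.0, 1.0),
        ))
    return updated


def run_mfld(particles, data, n_steps, step, l2, temp, rng):
    """Iterate the noisy update for n_steps and return the final particles."""
    for _ in range(n_steps):
        particles = mfld_step(particles, data, step, l2, temp, rng)
    return particles


def make_data(n, rng):
    """Toy 1-D regression data y = sin(2 z), chosen only for illustration."""
    return [(z, math.sin(2.0 * z)) for z in (rng.uniform(-1.0, 1.0) for _ in range(n))]


def init_particles(n_particles, rng):
    """Draw N particles i.i.d. from a standard Gaussian initial distribution."""
    return [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n_particles)]
```

A typical call would build `data = make_data(20, random.Random(0))` and `init = init_particles(50, random.Random(1))`, then run `run_mfld(init, data, n_steps=200, step=0.1, l2=0.01, temp=0.01, rng=random.Random(2))`. Varying the number of particles while holding the other arguments fixed gives an informal way to observe how the finite-width system behaves as $N$ grows. Such an experiment is only an illustration and does not verify the $\mathcal{O}(1/N)$ bound.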