Department of Mathematics

Applied Mathematics

  •  Boris Hanin, Princeton University
  •  Random Fully Connected Neural Networks as Perturbatively Solvable Models
  •  03/03/2022
  •  2:30 PM - 3:30 PM
  •  Online (virtual meeting)
  •  Olga Turanova (turanova@msu.edu)

Fully connected networks are roughly described by two structural parameters: a depth L and a width n. It is well known that, with some important caveats about the scale of initialization, in the regime of fixed L and infinite n, neural networks at the start of training are a free (i.e. Gaussian) field and network optimization is kernel regression for the so-called neural tangent kernel (NTK). This is a striking and insightful simplification of infinitely overparameterized networks. However, in this infinite-width limit neural networks cannot learn data-dependent features, and feature learning is perhaps their most important empirical capability. To understand feature learning one must therefore study networks at finite width. In this talk I will do just that. I will report on recent joint work with Dan Roberts and Sho Yaida (done at a physics level of rigor), as well as ongoing, more mathematical work, which allows one to compute, perturbatively in 1/n and recursively in L, all correlation functions of the neural network function (and its derivatives) at initialization. An important upshot is the emergence of L/n, rather than L alone, as the effective network depth. This cutoff parameter provably measures the extent of feature learning and the distance at initialization to the large-n free theory.
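
As a rough numerical companion to the infinite-width claim in the abstract, the sketch below (not from the talk; the ReLU nonlinearity, the 2/fan_in initialization scale, and the specific widths are illustrative assumptions) samples the scalar output of randomly initialized fully connected networks at a fixed input and estimates its excess kurtosis, which vanishes for a Gaussian and is expected to shrink as the width n grows at fixed depth L.

```python
import numpy as np

def random_relu_network_output(x, width, depth, rng):
    """One forward pass of a randomly initialized fully connected ReLU network.

    Weights are drawn i.i.d. N(0, 2/fan_in), a common "critical" scale choice;
    the talk's precise caveats about the initialization scale are not reproduced here.
    """
    h = x
    for _ in range(depth):
        W = rng.normal(0.0, np.sqrt(2.0 / h.shape[0]), size=(width, h.shape[0]))
        h = np.maximum(W @ h, 0.0)            # hidden layers use ReLU
    w_out = rng.normal(0.0, np.sqrt(1.0 / h.shape[0]), size=h.shape[0])
    return w_out @ h                           # scalar network output

rng = np.random.default_rng(0)
x = np.ones(10) / np.sqrt(10)                  # fixed unit-norm input
depth = 3                                      # fixed depth L

for width in (4, 64, 1024):
    samples = np.array([random_relu_network_output(x, width, depth, rng)
                        for _ in range(2000)])
    # Excess kurtosis is 0 for a Gaussian; at finite width it is nonzero and
    # should decay with n, consistent with the perturbative picture in 1/n.
    kurt = np.mean((samples - samples.mean())**4) / samples.var()**2 - 3.0
    print(f"width n = {width:5d}: excess kurtosis of output ~ {kurt:+.3f}")
```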

