## Applied Mathematics

•  Rebecca Willett, University of Chicago
•  09/24/2020
•  2:30 PM - 3:30 PM
•  Online (virtual meeting)

A growing body of research illustrates that neural network generalization performance is less dependent on the network size (i.e., the number of weights or parameters) and more dependent on the magnitude of the weights. That is, generalization is not achieved by limiting the size of the network, but rather by explicitly or implicitly controlling the magnitude of the weights. To better understand this phenomenon, we will explore how neural networks represent functions as the number of weights in the network approaches infinity. Specifically, we characterize the norm required to realize a function f as a single-hidden-layer ReLU network with an unbounded number of units (infinite width), but where the Euclidean norm of the weights is bounded, including precisely characterizing which functions can be realized with finite norm. This was settled for univariate functions in Savarese et al. (2019), where it was shown that the required norm is determined by the L1-norm of the second derivative of the function. We extend the characterization to multivariate functions (i.e., networks with d input units), relating the required norm to the L1-norm of the Radon transform of a (d+1)/2-power Laplacian of the function. This characterization allows us to show that all functions in certain Sobolev spaces can be represented with bounded norm and to obtain a depth separation result. These results have important implications for understanding generalization performance and the distinction between neural networks and more traditional kernel learning.
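
The univariate statement can be checked numerically on a small example. The sketch below is an illustration only, not code from the talk or from Savarese et al. (2019): the weights `w`, `b`, `a` are hypothetical values chosen so that the three ReLU kinks sit at distinct points, and the function names are made up for this example. It compares the squared Euclidean norm of the (rescaled) weights of one particular network with the L1-norm of the second derivative of the function it computes, estimated as the total variation of the slope on a grid. It does not attempt to show that this norm is minimal over all networks representing the same function.

```python
import numpy as np

def relu_net(x, w, b, a):
    """Evaluate f(x) = sum_i a_i * relu(w_i * x + b_i) on a 1-D array x."""
    return np.maximum(np.outer(x, w) + b, 0.0) @ a

# Hypothetical weights (an assumption for illustration); kinks at x = -b/w.
w = np.array([1.0, -2.0, 0.5])
b = np.array([0.5, 1.0, -1.0])
a = np.array([2.0, 1.0, -3.0])

# Sum of |a_i| * |w_i|: each unit contributes a slope jump of this size.
path_norm = np.sum(np.abs(a) * np.abs(w))

# Rescale each unit by c_i > 0: relu(c * z) = c * relu(z), so the function
# is unchanged, while c_i = sqrt(|a_i| / |w_i|) balances |a_i| and |w_i|.
c = np.sqrt(np.abs(a) / np.abs(w))
w_bal, b_bal, a_bal = c * w, c * b, a / c
balanced_cost = 0.5 * np.sum(w_bal**2 + a_bal**2)

# Estimate the L1-norm of f'' as the total variation of the slope on a grid
# wide enough to contain every kink.
x = np.linspace(-5.0, 5.0, 20001)
f = relu_net(x, w, b, a)
slopes = np.diff(f) / np.diff(x)
tv = np.sum(np.abs(np.diff(slopes)))

same_function = np.allclose(f, relu_net(x, w_bal, b_bal, a_bal))

print("sum |a_i||w_i|         :", path_norm)
print("balanced weight norm   :", balanced_cost)
print("total variation of f'  :", tv)
print("rescaling preserves f  :", same_function)
```

In this example the three quantities agree because the kinks are distinct, so no slope jumps cancel. If two units shared a kink location with opposite signs, the weight norm of that particular network would exceed the L1-norm of the second derivative.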

## Contact

Department of Mathematics
Michigan State University