Talk_id  Date  Speaker  Title 
29055

Thursday 5/6 4:30 AM

Michael Kwok-Po Ng, University of Hong Kong

Low Rank Tensor Completion with Poisson Observations
 Michael Kwok-Po Ng, University of Hong Kong
 Low Rank Tensor Completion with Poisson Observations
 05/06/2021
 4:30 AM  5:30 AM
 Online (virtual meeting)
(Virtual Meeting Link)
 Olga Turanova ()
Poisson observations for videos are important models in video processing and computer vision. In this talk, we study the third-order tensor completion problem with Poisson observations. The main aim is to recover a tensor based on a small number of its Poisson-observed entries. An existing matrix-based method may be applied to this problem via the matricized version of the tensor. However, this method does not leverage the global low-rankness of the tensor and may be substantially suboptimal. We employ a transformed tensor nuclear norm ball constraint and a bound constraint on each entry, where the transformed tensor nuclear norm is used to obtain a tensor of lower transformed multi-rank with suitable unitary transformation matrices. We show that the upper bound on the error of the estimator of the proposed model is less than that of the existing matrix-based method. Numerical experiments on synthetic data and real-world datasets demonstrate the effectiveness of our proposed model compared with existing tensor completion methods.
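The Poisson observation model underlying the talk can be sketched as a toy instance (hypothetical sizes; the transformed tensor nuclear norm machinery itself is not shown):

```python
import numpy as np

rng = np.random.default_rng(0)

# A rank-1 third-order ground-truth tensor with positive entries
# (a real instance would also impose the entry bound and the
# transformed tensor nuclear norm ball constraint).
a = rng.uniform(0.5, 1.0, 10)
b = rng.uniform(0.5, 1.0, 10)
c = rng.uniform(0.5, 1.0, 10)
X = 50.0 * np.einsum('i,j,k->ijk', a, b, c)

# Observe roughly half of the entries through a Poisson channel
mask = rng.random(X.shape) < 0.5
Y = np.where(mask, rng.poisson(X), 0)

# Negative Poisson log-likelihood on the observed entries: the
# data-fit term (up to constants in Y) a completion model minimizes
def neg_loglik(Z):
    Z = np.clip(Z, 1e-9, None)
    return float(np.sum(mask * (Z - Y * np.log(Z))))
```

A completion model would minimize `neg_loglik` over tensors of low transformed multi-rank; note that the ground truth fits the observed counts better than, say, a constant tensor.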

29057

Thursday 5/13 2:30 PM

Simone Brugiapaglia, Concordia University

The curse of dimensionality and the blessings of sparsity and Monte Carlo sampling: From polynomial to deep neural network approximation in high dimensions
 Simone Brugiapaglia, Concordia University
 The curse of dimensionality and the blessings of sparsity and Monte Carlo sampling: From polynomial to deep neural network approximation in high dimensions
 05/13/2021
 2:30 PM  3:30 PM
 Online (virtual meeting)
(Virtual Meeting Link)
 Olga Turanova (turanova@msu.edu)
Approximating multidimensional functions from pointwise samples is a ubiquitous task in data science and scientific computing. This task is made intrinsically difficult by the presence of four contemporary challenges: (i) the target function is usually defined over a high- or infinite-dimensional domain; (ii) generating samples can be very expensive; (iii) samples are corrupted by unknown sources of errors; (iv) the target function might take values in a function space. In this talk, we will show how these challenges can be substantially overcome thanks to the "blessings" of sparsity and Monte Carlo sampling.
First, we will consider the case of sparse polynomial approximation via compressed sensing. Focusing on the case where the target function is smooth (e.g., holomorphic), but possibly highly anisotropic, we will show how to obtain sample complexity bounds only mildly affected by the curse of dimensionality, near-optimal accuracy guarantees, stability to unknown errors corrupting the data, and rigorous convergence rates of algebraic and exponential type.
Then, we will illustrate how the mathematical toolkit of sparse polynomial approximation via compressed sensing can be employed to obtain a practical existence theorem for Deep Neural Network (DNN) approximation of high-dimensional Hilbert-valued functions. This result shows not only the existence of a DNN with desirable approximation properties, but also how to compute it via a suitable training procedure in order to achieve best-in-class performance guarantees.
We will conclude by discussing open research questions.
The presentation is based on joint work with Ben Adcock, Casie Bao, Nick Dexter, Sebastian Moraga, and Clayton G. Webster.
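The sparse-recovery engine at the heart of compressed sensing can be illustrated in its simplest form: recovering a sparse coefficient vector from a small number of random samples. A minimal orthogonal matching pursuit sketch (a standard greedy solver, not the talk's specific method; all dimensions are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)

n, m, s = 64, 32, 3                             # ambient dim, samples, sparsity
A = rng.standard_normal((m, n)) / np.sqrt(m)    # random sampling matrix
x_true = np.zeros(n)
x_true[rng.choice(n, s, replace=False)] = rng.standard_normal(s)
y = A @ x_true                                  # noiseless samples

# Orthogonal matching pursuit: greedily pick the column most correlated
# with the residual, then re-fit by least squares on the chosen support.
def omp(A, y, s):
    support, r = [], y.copy()
    for _ in range(s):
        support.append(int(np.argmax(np.abs(A.T @ r))))
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        r = y - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

x_hat = omp(A, y, s)
```

In the polynomial setting, the columns of A would be orthonormal polynomials evaluated at Monte Carlo sample points, and x_true the (approximately sparse) coefficient vector.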

29056

Thursday 5/20 4:30 AM

Philipp Grohs, University of Vienna

A Proof of the Theory-to-Practice Gap in Deep Learning
 Philipp Grohs, University of Vienna
 A Proof of the Theory-to-Practice Gap in Deep Learning
 05/20/2021
 4:30 AM  5:30 AM
 Online (virtual meeting)
(Virtual Meeting Link)
 Olga Turanova ()
It is now well understood that the approximation properties of neural networks surpass those of virtually all other known approximation methods. On the other hand, the numerical realization of these theoretical approximation properties seems quite challenging. Recent empirical results suggest that there might exist a tradeoff between the superior approximation properties and the efficient algorithmic realizability of neural-network-based methods. We prove the existence of this tradeoff by establishing hardness results for the problems of approximation and integration on neural network approximation spaces. These results in particular show that high approximation rates cannot be algorithmically realized by commonly used algorithms such as stochastic gradient descent and its variants. (Joint work with Felix Voigtlaender.)

29070

Thursday 5/27 2:30 PM

Akram Aldroubi, Vanderbilt University

Dynamical Sampling: A Sampling Tour
 Akram Aldroubi, Vanderbilt University
 Dynamical Sampling: A Sampling Tour
 05/27/2021
 2:30 PM  3:30 PM
 Online (virtual meeting)
(Virtual Meeting Link)
 Olga Turanova (turanova@msu.edu)
Dynamical sampling is a term describing an emerging set of problems related to recovering signals and evolution operators from space-time samples. For example, one problem is to recover a function f from space-time samples {(A_{t_i}f)(x_i)} of its time evolution f_t = (A_t f)(x), where {A_t}_{t∈T} is a known evolution operator and {(x_i, t_i)} ⊂ R^d × R^+.
Another example is a system identification problem in which both f and the evolution family {A_t}_{t∈T} are unknown. Applications of dynamical sampling include inverse problems in sensor networks and source term recovery from physically driven evolutionary systems. Dynamical sampling problems are tightly connected to frame theory as well as to more classical areas of mathematics such as approximation theory and functional analysis. In this talk, I will describe a few problems in dynamical sampling, some results, and open problems.
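A minimal instance of the first problem can be sketched numerically: a signal evolving under a known circulant (convolution) operator is undersampled in space but sampled at several times, and the stacked space-time samples determine it uniquely (toy size and filter are hypothetical choices):

```python
import numpy as np

rng = np.random.default_rng(2)

n = 16
f = rng.standard_normal(n)                 # unknown signal

# Known evolution operator: one step of a circulant (convolution)
# filter, a standard toy model in dynamical sampling.
h = np.array([0.5, 0.3, 0.2])
A = np.zeros((n, n))
for i in range(n):
    for k, hk in enumerate(h):
        A[i, (i + k) % n] += hk

# Sample only every 4th spatial location, but at time steps t = 0..3
idx = np.arange(0, n, 4)
powers = [np.linalg.matrix_power(A, t) for t in range(4)]
M = np.vstack([P[idx, :] for P in powers])   # stacked space-time samples
y = M @ f

# The stacked system is square and (for this filter) invertible,
# so the space-time samples determine f uniquely.
f_hat, *_ = np.linalg.lstsq(M, y, rcond=None)
```

Undersampling in space is compensated by oversampling in time; whether this works depends on the spectrum of the evolution operator, which is exactly the kind of question dynamical sampling studies.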

29071

Thursday 6/3 2:30 PM

Gerlind Plonka-Hoch, University of Göttingen

Recovery of sparse signals from their Fourier coefficients
 Gerlind Plonka-Hoch, University of Göttingen
 Recovery of sparse signals from their Fourier coefficients
 06/03/2021
 2:30 PM  3:30 PM
 Online (virtual meeting)
(Virtual Meeting Link)
 Olga Turanova (turanova@msu.edu)
In this talk, we study a new recovery procedure for non-harmonic signals, or more generally for extended exponential sums y(t), which are determined by a finite number of parameters. For the reconstruction we employ a finite set of classical Fourier coefficients of y(t). Our new recovery method is based on the observation that the Fourier coefficients of y(t) possess a special rational structure. We apply the recently proposed AAA algorithm by Nakatsukasa et al. (2018) to recover this rational structure in a stable way. If a sufficiently large set of Fourier coefficients of y(t) is available, then our recovery method automatically detects the correct number of terms M of the exponential sum y(t) and reconstructs all unknown parameters of the signal model. Our method provides a new, stable alternative to the known numerical approaches for the recovery of exponential sums that are based on Prony's method.
These results have been obtained jointly with Markus Petz and Nadiia Derevianko.
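For context, the classical Prony baseline that such methods are compared against can be sketched in its minimal, noiseless form (an illustrative textbook version, not the authors' AAA-based method):

```python
import numpy as np

# Classical Prony: recover nodes z_j and coefficients c_j in
# y(k) = sum_j c_j z_j^k from 2M uniform samples.
def prony(samples, M):
    samples = np.asarray(samples, dtype=complex)
    # Linear prediction (Hankel) system for the Prony polynomial
    H = np.array([samples[i:i + M] for i in range(M)])
    rhs = -samples[M:2 * M]
    p = np.linalg.solve(H, rhs)
    # The roots of the Prony polynomial are the nodes z_j
    z = np.roots(np.concatenate(([1.0], p[::-1])))
    # Least squares for the coefficients c_j
    V = np.vander(z, N=2 * M, increasing=True).T
    c, *_ = np.linalg.lstsq(V, samples, rcond=None)
    return z, c

# Example with two exponential terms
z_true = np.array([0.9 * np.exp(1j * 0.7), 0.8 * np.exp(-1j * 0.3)])
c_true = np.array([2.0, 1.0 + 0.5j])
y = np.array([np.sum(c_true * z_true ** k) for k in range(4)])
z_hat, c_hat = prony(y, 2)
```

The instability of the root-finding step under noise is one motivation for rational-approximation alternatives such as the AAA-based approach of the talk.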

29076

Thursday 6/10 2:30 PM

Piotr Indyk, MIT

Learning-Based Sampling and Streaming
 Piotr Indyk, MIT
 Learning-Based Sampling and Streaming
 06/10/2021
 2:30 PM  3:30 PM
 Online (virtual meeting)
(Virtual Meeting Link)
 Olga Turanova (turanova@msu.edu)
Classical algorithms typically provide "one size fits all" performance, and do not leverage properties or patterns in their inputs. A recent line of work aims to address this issue by developing algorithms that use machine learning predictions to improve their performance. In this talk I will present two examples of this type, in the context of streaming and sampling algorithms. In particular, I will show how to use machine learning predictions to improve the performance of (a) low-memory streaming algorithms for frequency estimation (ICLR'19), and (b) sampling algorithms for estimating the support size of a distribution (ICLR'21). Both algorithms use an ML-based predictor that, given a data item, estimates the number of times the item occurs in the input data set.
The talk will cover material from papers co-authored with T. Eden, C.-Y. Hsu, D. Katabi, S. Narayanan, R. Rubinfeld, S. Silwal, T. Wagner, and A. Vakilian.
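The predictor-augmented idea for frequency estimation can be sketched as follows: items the predictor flags as heavy get dedicated exact counters, while all other items share a small Count-Min sketch. This is a toy in the spirit of the approach, not the papers' exact construction; the sizes and the oracle predictor are hypothetical:

```python
import random
from collections import Counter

class LearnedCountMin:
    def __init__(self, width, depth, is_heavy, seed=0):
        rnd = random.Random(seed)
        self.salts = [rnd.randrange(1 << 30) for _ in range(depth)]
        self.table = [[0] * width for _ in range(depth)]
        self.width = width
        self.is_heavy = is_heavy          # the ML predictor (a callable)
        self.exact = Counter()            # exact counters for predicted-heavy items

    def add(self, item):
        if self.is_heavy(item):
            self.exact[item] += 1
        else:
            for row, salt in zip(self.table, self.salts):
                row[hash((salt, item)) % self.width] += 1

    def estimate(self, item):
        if self.is_heavy(item):
            return self.exact[item]
        # Count-Min estimate: minimum over rows (never underestimates)
        return min(row[hash((salt, item)) % self.width]
                   for row, salt in zip(self.table, self.salts))

# Usage with a perfect "oracle" predictor, for demonstration only
heavy = {"the", "of"}
cm = LearnedCountMin(width=64, depth=3, is_heavy=lambda w: w in heavy)
stream = ["the"] * 100 + ["of"] * 50 + ["rare%d" % i for i in range(30)]
for w in stream:
    cm.add(w)
```

Routing predicted heavy hitters out of the sketch removes exactly the items that cause the largest hash collisions, which is where the learned predictor pays off.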

29077

Thursday 6/17 2:30 PM

Wenjing Liao, Georgia Tech

Regression and doubly robust off-policy learning on low-dimensional manifolds by neural networks
 Wenjing Liao, Georgia Tech
 Regression and doubly robust off-policy learning on low-dimensional manifolds by neural networks
 06/17/2021
 2:30 PM  3:30 PM
 Online (virtual meeting)
(Virtual Meeting Link)
 Olga Turanova (turanova@msu.edu)
Many data in real-world applications lie in a high-dimensional space but exhibit low-dimensional structures. In mathematics, these data can be modeled as random samples on a low-dimensional manifold. Our goal is to estimate a target function or learn an optimal policy using neural networks. This talk is based on an efficient approximation theory of deep ReLU networks for functions supported on a low-dimensional manifold. We further establish the sample complexity for regression and off-policy learning with finite samples of data. When data are sampled on a low-dimensional manifold, the sample complexity crucially depends on the intrinsic dimension of the manifold instead of the ambient dimension of the data. These results demonstrate that deep neural networks are adaptive to low-dimensional geometric structures of data sets. This is joint work with Minshuo Chen, Haoming Jiang, Liu Hao, and Tuo Zhao at Georgia Institute of Technology.

29078

Thursday 6/24 2:30 PM

Qiang Ye, University of Kentucky

Batch Normalization and Preconditioning for Neural Network Training
 Qiang Ye, University of Kentucky
 Batch Normalization and Preconditioning for Neural Network Training
 06/24/2021
 2:30 PM  3:30 PM
 Online (virtual meeting)
(Virtual Meeting Link)
 Olga Turanova (turanova@msu.edu)
Batch normalization (BN) is a popular and ubiquitous method in deep neural network training that has been shown to decrease training time and improve generalization performance. Despite its success, BN is not theoretically well understood, and it is not suitable for use with very small mini-batch sizes or online learning. In this talk, we will review BN and present a preconditioning method called Batch Normalization Preconditioning (BNP) to accelerate neural network training. We will analyze the effects of mini-batch statistics of a hidden variable on the Hessian matrix of a loss function and propose a parameter transformation that is equivalent to normalizing the hidden variables, improving the conditioning of the Hessian. Compared with BN, one benefit of BNP is that it is not constrained by the mini-batch size and works in the online learning setting. We will present several experiments demonstrating the competitiveness of BNP. Furthermore, we will discuss a connection to BN which provides theoretical insights on how BN improves training and how BN is applied to special architectures such as convolutional neural networks.
The talk is based on a joint work with Susanna Lange and Kyle Helfrich.
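For reference, the per-unit mini-batch normalization that BN applies, and that the preconditioning viewpoint reinterprets as a parameter transformation, can be sketched in its standard form (scale gamma and shift beta; the BNP method itself is not shown):

```python
import numpy as np

# Standard batch normalization over a mini-batch of activations X,
# applied independently to each feature (column).
def batch_norm(X, gamma, beta, eps=1e-5):
    mu = X.mean(axis=0)                   # per-feature mini-batch mean
    var = X.var(axis=0)                   # per-feature mini-batch variance
    X_hat = (X - mu) / np.sqrt(var + eps)
    return gamma * X_hat + beta

rng = np.random.default_rng(3)
X = rng.standard_normal((128, 4)) * 5.0 + 2.0   # badly scaled activations
Y = batch_norm(X, gamma=np.ones(4), beta=np.zeros(4))
```

After normalization each feature has (approximately) zero mean and unit variance over the mini-batch; BNP achieves a comparable conditioning effect through the optimizer rather than the architecture, which is why it does not need a large mini-batch.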

29079

Thursday 7/15 2:30 PM

Qi (Rose) Yu, University of California, San Diego

University of California, San Diego
 Qi (Rose) Yu, University of California, San Diego
 University of California, San Diego
 07/15/2021
 2:30 PM  3:30 PM
 Online (virtual meeting)
(Virtual Meeting Link)
 Olga Turanova (turanova@msu.edu)
Applications such as climate science and transportation require learning complex dynamics from large-scale spatiotemporal data. Existing machine learning frameworks are still insufficient to learn spatiotemporal dynamics as they often fail to exploit the underlying physics principles. Representation theory can be used to describe and exploit the symmetry of the dynamical system. We will show how to design neural networks that are equivariant to various symmetries for learning spatiotemporal dynamics. Our methods demonstrate significant improvement in prediction accuracy, generalization, and sample efficiency in forecasting turbulent flows and predicting real-world trajectories. This is joint work with Robin Walters, Rui Wang, and Jinxi Li.
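Equivariance in its simplest form can be checked numerically: a convolutional layer commutes with translations, so shifting the input shifts the output. A minimal sketch (the talk concerns much richer symmetry groups for spatiotemporal dynamics):

```python
import numpy as np

# Circular convolution: the prototypical translation-equivariant map
def circ_conv(x, h):
    y = np.zeros_like(x)
    for k, hk in enumerate(h):
        y += hk * np.roll(x, k)   # y[i] = sum_k h[k] * x[i - k]
    return y

x = np.arange(8.0)
h = np.array([1.0, -2.0, 0.5])

# Equivariance check: convolving a shifted signal equals
# shifting the convolved signal.
lhs = circ_conv(np.roll(x, 3), h)
rhs = np.roll(circ_conv(x, h), 3)
```

Designing layers so that an analogous identity holds for rotations, scalings, or other physical symmetries is what yields the gains in generalization and sample efficiency described above.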

29080

Thursday 7/22 2:30 PM

Yaniv Plan, University of British Columbia

A family of measurement matrices for generalized compressed sensing
 Yaniv Plan, University of British Columbia
 A family of measurement matrices for generalized compressed sensing
 07/22/2021
 2:30 PM  3:30 PM
 Online (virtual meeting)
(Virtual Meeting Link)
 Olga Turanova (turanova@msu.edu)
We consider the problem of recovering a structured signal x that lies close to a subset of interest T in R^n, from its random noisy linear measurements y = B A x + w, i.e., a generalization of the classic compressed sensing problem. Above, B is a fixed matrix and A has independent subgaussian rows. By varying B, and the subgaussian distribution of A, this gives a family of measurement matrices which may have heavy tails, dependent rows and columns, and singular values with large dynamic range. Typically, generalized compressed sensing assumes a random measurement matrix with nearly uniform singular values (with high probability), and asks: How many measurements are needed to recover x? In our setting, this question becomes: What properties of B allow good recovery? We show that the "effective rank" of B may be used as a surrogate for the number of measurements, and if this exceeds the squared Gaussian complexity of T, then accurate recovery is guaranteed. We also derive the optimal dependence on the subgaussian norm of the rows of A; to our knowledge, this was not known previously even in the case when B is the identity. We allow model mismatch and inexact optimization with the aim of creating an easily accessible theory that specializes well to the case when T is the range of an expansive neural net.
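A generic way to attempt recovery in the model y = B A x + w is to run a standard sparse solver on the combined matrix M = BA. The sketch below uses iterative hard thresholding with T taken to be the set of s-sparse vectors (an illustrative special case, not the paper's estimator; the choice of B and all sizes are hypothetical, and the measurements are noiseless for clarity):

```python
import numpy as np

rng = np.random.default_rng(4)

n, m, s = 64, 40, 3
A = rng.standard_normal((m, n)) / np.sqrt(m)   # independent subgaussian rows
B = np.diag(rng.uniform(0.5, 2.0, m))          # a simple fixed matrix B
x_true = np.zeros(n)
x_true[rng.choice(n, s, replace=False)] = rng.standard_normal(s)
y = B @ A @ x_true

M = B @ A

# Iterative hard thresholding: gradient step on ||y - Mx||^2,
# then projection onto the s-sparse set.
def iht(M, y, s, steps=200, eta=None):
    if eta is None:
        eta = 1.0 / np.linalg.norm(M, 2) ** 2  # safe (contractive) step size
    x = np.zeros(M.shape[1])
    for _ in range(steps):
        x = x + eta * M.T @ (y - M @ x)
        keep = np.argsort(np.abs(x))[-s:]      # keep s largest entries
        mask = np.zeros_like(x)
        mask[keep] = 1.0
        x *= mask
    return x

x_hat = iht(M, y, s)
```

How B affects such a solver is exactly the question above: here B is diagonal and well conditioned, but a B with large dynamic range in its singular values degrades the effective number of measurements.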

29081

Thursday 7/29 2:30 PM

Wotao Yin, UCLA

Learning to Optimize: Fixed-Point Networks
 Wotao Yin, UCLA
 Learning to Optimize: Fixed-Point Networks
 07/29/2021
 2:30 PM  3:30 PM
 Online (virtual meeting)
(Virtual Meeting Link)
 Olga Turanova (turanova@msu.edu)
Many applications require repeatedly solving a certain optimization problem, each time with new but similar data. "Learning to optimize," or L2O, is an approach to develop algorithms that solve these similar problems very efficiently. L2O-generated algorithms have achieved significant success in signal processing and inverse-problem applications. This talk introduces the motivation for L2O and gives a quick overview of different types of L2O approaches for continuous optimization. Then, we will introduce Fixed-Point Networks (FPNs), which incorporate fixed-point iterations into deep neural networks and provide abilities such as physics-based inversion, data-driven regularization, encoding hard constraints, and infinite depth. FPNs are easy to train with a new Jacobian-free backpropagation (JFB) scheme.
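The fixed-point idea can be sketched with a toy contractive layer: iterate x ↦ tanh(Wx + Ud) to a fixed point, which plays the role of an infinitely deep network's output. This is illustrative only; W, U, and the contraction scaling are hypothetical, and a real FPN learns these weights:

```python
import numpy as np

rng = np.random.default_rng(5)

n = 8
W = rng.standard_normal((n, n))
W *= 0.5 / np.linalg.norm(W, 2)        # rescale so the map is a contraction
U = rng.standard_normal((n, 4))
d = rng.standard_normal(4)             # input data

# Forward pass: iterate the layer map to its fixed point
# ("infinite depth" realized by iteration to convergence).
def forward(W, U, d, tol=1e-10, max_iter=500):
    x = np.zeros(n)
    for _ in range(max_iter):
        x_new = np.tanh(W @ x + U @ d)
        if np.linalg.norm(x_new - x) < tol:
            break
        x = x_new
    return x_new

x_star = forward(W, U, d)
# Jacobian-free backpropagation (JFB) would differentiate only the last
# application x -> tanh(W x_star + U d), treating x_star as constant,
# avoiding backprop through the whole iteration.
```

Because only one application of the layer map is differentiated, training cost does not grow with the number of forward iterations, which is the practical appeal of the JFB scheme.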

29087

Thursday 8/12 2:30 PM

Andrea Bertozzi, UCLA

TBA
