Speaker: Alexander Veselov, Loughborough U., UK and MSU, Russia

Title: Variations on Conway and Markov themes, https://msu.zoom.us/j/94925518997

Date: 05/07/2020

Time: 9:30 AM - 10:30 AM

Place: C204A Wells Hall

In the 1990s John H. Conway proposed a "topographic" approach to describing the values of binary quadratic forms, which can also be applied to the description of the celebrated Markov triples. The growth of the orbits on the corresponding trivalent tree depends on the choice of branch, and the branches can be labelled by the points of the real projective line.
I will show that the function on the real projective line describing the growth of Markov triples and their tropical versions has some very interesting properties. Markov's classical results and the link with hyperbolic geometry, going back to Gorshkov and Cohn, will play a crucial role here.
The talk is based on joint works with K. Spalding.
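As a small computational aside (my own sketch, not material from the talk): the Markov triples are the positive integer solutions of a^2 + b^2 + c^2 = 3abc, and they form a trivalent tree under the Vieta involutions, e.g. (a, b, c) -> (3bc - a, b, c). A few lines of Python generate the tree:

```python
def is_markov(t):
    """Check the Markov equation a^2 + b^2 + c^2 = 3abc."""
    a, b, c = t
    return a * a + b * b + c * c == 3 * a * b * c

def neighbours(t):
    """The three Vieta involutions giving the neighbours in the trivalent tree."""
    a, b, c = t
    return [(3 * b * c - a, b, c), (a, 3 * a * c - b, c), (a, b, 3 * a * b - c)]

def markov_numbers(depth):
    """Breadth-first walk of the Markov tree from the root triple (1, 1, 1)."""
    seen = {(1, 1, 1)}
    frontier = [(1, 1, 1)]
    for _ in range(depth):
        frontier = [n for t in frontier for n in neighbours(t) if n not in seen]
        seen.update(frontier)
    return sorted({max(t) for t in seen})

print(markov_numbers(3))
```

Walking deeper along different branches makes the branch-dependence of the growth rate, the subject of the talk, directly visible.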

There is a huge system of parallels between various geometric and topological properties of the moduli space of curves of genus $0$ with marked points $M_{0,1+n}$ and the so-called brick manifolds $B_n$, defined a few years ago by Laura Escobar. In fact, it appears that all interesting properties relevant to topological field theory (considered as general phenomena surrounding $M_{0,1+n}$) find their non-commutative analogs in $B_n$:
-- stratification, description of cohomology (that implies the WDVV equations), psi-classes, structure of topological operad, cohomological field theories, Givental group action, surrounding topological operads (little disks, framed little disks, gravity), algebraic structures implied by their representations (hypercommutative algebra, also known in the literature as flat F-manifold, BV/Gerstenhaber algebras, etc.), and so on.
(To this end, my favourite alternative title for this talk would be "What is a non-commutative hypercommutative algebra?" )
In addition, the brick manifolds have a very rich geometric structure: they are the toric varieties of Loday's realisations of the associahedra and can be interpreted as wonderful models of De Concini-Procesi.
In the talk, we'll try to show some of the properties mentioned above, constructing them step by step both on the side of moduli spaces of curves and on the side of brick manifolds (along the lines of my joint work with Volodya Dotsenko and Bruno Vallette).

Title: Dilogarithm identities and cluster algebras, https://msu.zoom.us/j/94925518997

Date: 05/21/2020

Time: 9:30 AM - 10:30 AM

Place: C204A Wells Hall

The dilogarithm function was introduced by Euler. It has long been known that the function satisfies various functional identities, which we call dilogarithm identities. In this talk I will explain from scratch how and why a dilogarithm identity is associated with any periodic sequence of mutations in cluster algebras. Part of the talk is based on joint work with M. Gekhtman and D. Rupel.
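As a concrete illustration (my own sketch; the talk's construction is far more general): in type A2 the cluster exchange relation x_{n+1} = (1 + x_n)/x_{n-1} is periodic with period 5, and this periodicity underlies Abel's five-term dilogarithm identity. Exact rational arithmetic makes the periodicity easy to verify:

```python
from fractions import Fraction

def a2_orbit(x1, x2, steps):
    """Iterate the A2 cluster exchange relation x_{n+1} = (1 + x_n) / x_{n-1}."""
    xs = [Fraction(x1), Fraction(x2)]
    for _ in range(steps):
        xs.append((1 + xs[-1]) / xs[-2])
    return xs

# The sequence repeats with period 5 for any (nonzero) initial data.
print(a2_orbit(2, 3, 7))
```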

Title: Balancing covariates in randomized experiments using the Gram–Schmidt walk

Date: 05/21/2020

Time: 2:30 PM - 3:30 PM

Place:

(Part of the One World MINDS series: https://sites.google.com/view/minds-seminar/home)
In randomized experiments, such as medical trials, we randomly assign the treatment, such as a drug or a placebo, that each experimental subject receives. Randomization can help us accurately estimate the difference in treatment effects with high probability. We also want the two groups to be similar: ideally, they would be similar in every statistic we can measure beforehand. Recent advances in algorithmic discrepancy theory allow us to divide subjects into groups with similar statistics.
By exploiting the recent Gram-Schmidt Walk algorithm of Bansal, Dadush, Garg, and Lovett, we can obtain random assignments of low discrepancy. These allow us to obtain more accurate estimates of treatment effects when the information we measure about the subjects is predictive, while also bounding the worst-case behavior when it is not.
We will explain the experimental design problem we address, the Gram-Schmidt walk algorithm, and the major ideas behind our analyses. This is joint work with Chris Harshaw, Fredrik Sävje, and Peng Zhang.
Paper: https://arxiv.org/abs/1911.03071
Code: https://github.com/crharshaw/GSWDesign.jl
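A toy illustration only (this is not the Gram-Schmidt walk; see the paper and the GSWDesign.jl repository above for the real algorithm): the covariate imbalance of an assignment z in {-1, +1}^n with covariate matrix X is X^T z, and even a naive greedy sign choice typically achieves much lower imbalance than i.i.d. coin flips:

```python
import random

random.seed(0)
n, d = 200, 3
X = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]  # covariates

def imbalance(z):
    """Worst covariate imbalance max_j |sum_i z_i X_ij| of an assignment z."""
    return max(abs(sum(z[i] * X[i][j] for i in range(n))) for j in range(d))

# Pure randomization: i.i.d. coin flips.
z_random = [random.choice([-1, 1]) for _ in range(n)]

# Greedy heuristic: give each subject the sign that keeps the running
# covariate sums closest to zero (a stand-in for low-discrepancy designs).
totals = [0.0] * d
z_greedy = []
for i in range(n):
    plus = sum((totals[j] + X[i][j]) ** 2 for j in range(d))
    minus = sum((totals[j] - X[i][j]) ** 2 for j in range(d))
    s = 1 if plus <= minus else -1
    z_greedy.append(s)
    totals = [totals[j] + s * X[i][j] for j in range(d)]

print(imbalance(z_random), imbalance(z_greedy))
```

The Gram-Schmidt walk achieves a principled version of this trade-off: it keeps the assignment nearly uniformly random while provably controlling the imbalance.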

Speaker: Pierre-Guy Plamondon, Université Paris-Sud XI

Title: Realizations of associahedra and minimal relations between g-vectors, https://msu.zoom.us/j/94925518997

Date: 05/28/2020

Time: 9:30 AM - 10:30 AM

Place:

The associahedron is a famous polytope that is connected to Catalan families. In cluster theory, it can be realized as a polytope whose normal fan is the g-vector fan in type A. Other Dynkin types yield other polytopes, called generalized associahedra. Recently, the space of all polytopal realizations of the g-vector fan of a cluster algebra of Dynkin type has been of interest to physicists. In this talk, I will show how this space is governed by minimal relations between g-vectors. I will then sketch how these minimal relations can be computed using the additive categorification of cluster algebras. This is a report on joint work with Arnau Padrol, Yann Palu and Vincent Pilaud.
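A small aside (my own illustration): the vertices of the associahedron are counted by the Catalan numbers; for instance, the 2-dimensional associahedron is a pentagon with C_3 = 5 vertices, matching the 5 triangulations of a pentagon. The Catalan recurrence C_0 = 1, C_{m+1} = sum_i C_i C_{m-i} is a few lines:

```python
def catalan(n):
    """n-th Catalan number via the convolution recurrence."""
    C = [1]
    for m in range(n):
        C.append(sum(C[i] * C[m - i] for i in range(m + 1)))
    return C[n]

print([catalan(k) for k in range(8)])  # 1, 1, 2, 5, 14, 42, 132, 429
```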

Title: The Analytic Geometries of Data; zoom link @ https://sites.google.com/view/minds-seminar/home

Date: 05/28/2020

Time: 2:30 PM - 3:30 PM

Place:

(Part of the One World MINDS series: https://sites.google.com/view/minds-seminar/home)
We will describe methodologies for building data geometries designed to simultaneously analyze and process databases. The different geometries, or affinity metrics, arise naturally as we learn to contextualize and conceptualize, i.e., to relate data regions and data features (which we extend to data tensors). Moreover, we generate tensorial multiscale structures.
We will indicate connections to analysis by deep nets and describe applications to modeling observations of dynamical systems, from stochastic molecular dynamics to calcium imaging of brain activity.

Speaker: Sergey Shadrin, Korteweg-de Vries Instituut

Title: Non-commutative Gerstenhaber, BV, (framed) little disks, and other algebraic structures, https://msu.zoom.us/j/94925518997

Date: 06/04/2020

Time: 9:30 AM - 10:30 AM

Place:

I want to discuss the setup in which a number of algebraic structures (the ones mentioned in the title) get their natural non-commutative analogs. Surprisingly, these non-commutative analogs (which emerged in the literature for completely different reasons) replicate all essential properties of the original structures --- for instance, I can very explicitly show how from an nc-BV structure one can get, via the homotopy quotient of the circle action, a new structure that it is natural to call the nc-Hycomm algebra / nc-flat F-manifold structure.
What is also very interesting --- while the geometric models for these structures (nc-Gerst and nc-BV) are quite simple, they are related to brick manifolds (which I discussed a few weeks ago, but now I'll give a quite different, independent definition) in exactly the same way as (framed) little disks are related to the moduli spaces of curves of genus $0$ --- via the real oriented blow-up along the boundary.
I intend to make this talk completely independent from the previous one.
Along the lines of my joint work with Volodya Dotsenko and Bruno Vallette.

Title: The troublesome kernel: instabilities in deep learning for inverse problems; zoom link @ https://sites.google.com/view/minds-seminar/home

Date: 06/04/2020

Time: 2:30 PM - 3:30 PM

Place:

(Part of One World MINDS seminar:
https://sites.google.com/view/minds-seminar/home)
Due to their stunning success in traditional machine learning applications such as classification, techniques based on deep learning have recently begun to be actively investigated for problems in computational science and engineering. One of the key areas at the forefront of this trend is inverse problems, and specifically, inverse problems in imaging. The last few years have witnessed the emergence of many neural network-based algorithms for important imaging modalities such as MRI and X-ray CT. These claim to achieve competitive, and sometimes even superior, performance to current state-of-the-art techniques.
However, there is a problem. Techniques based on deep learning are typically unstable. For example, small perturbations in the data can lead to a myriad of artifacts in the recovered images. Such artifacts can be hard to dismiss as obviously unphysical, meaning that this phenomenon has potentially serious consequences for the safe deployment of deep learning in practice. In this talk, I will first showcase the instability phenomenon empirically in a range of examples. I will then focus on its mathematical underpinnings, the consequences of these insights when it comes to potential remedies, and the future possibilities for computing genuinely stable neural networks for inverse problems in imaging.
This is joint work with Vegard Antun, Nina M. Gottschling, Anders C. Hansen, Clarice Poon, and Francesco Renna.
Papers:
https://www.pnas.org/content/early/2020/05/08/1907377117
https://arxiv.org/abs/2001.01258
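A minimal numerical illustration of the "troublesome kernel" (my own toy example, not taken from the papers above): when the forward map of an inverse problem has a nontrivial null space, two very different signals produce identical measurements, so any reconstruction method, learned or not, must fail badly on at least one of them:

```python
# Underdetermined forward map A: R^3 -> R^2 (rows are the measurements).
A = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]

def measure(x):
    """Apply the forward operator A to a signal x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

x = [1.0, 2.0, 0.0]
kernel_vec = [1.0, 1.0, -1.0]       # A @ kernel_vec = 0 (null-space direction)
x_perturbed = [xi + 5.0 * ki for xi, ki in zip(x, kernel_vec)]

# Very different signals, identical data: both measure as [1.0, 2.0].
print(measure(x), measure(x_perturbed))
```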

Title: Optimal terminal dimensionality reduction in Euclidean space; zoom link @ https://sites.google.com/view/minds-seminar/home

Date: 06/11/2020

Time: 2:30 PM - 3:30 PM

Place:

(Part of One World MINDS seminar:
https://sites.google.com/view/minds-seminar/home)
The Johnson-Lindenstrauss lemma states that for any subset X of R^d with |X| = n and for any epsilon, there exists a map f: X --> R^m with m = O(log n / epsilon^2) such that for all x, y in X, (1-epsilon)|x - y|_2 <= |f(x) - f(y)|_2 <= (1+epsilon)|x - y|_2. We show that this statement can be strengthened: the above claim holds even if "for all y in X" is replaced with "for all y in R^d". Joint work with Shyam Narayanan.
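A quick numerical sketch (my illustration; the dimensions below are for demonstration only): a random Gaussian map x -> Gx / sqrt(m) approximately preserves all pairwise distances among a finite point set, as the lemma predicts:

```python
import math, random

random.seed(1)
d, m, n = 50, 400, 6
points = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
G = [[random.gauss(0, 1) for _ in range(d)] for _ in range(m)]

def f(x):
    """Random Gaussian projection R^d -> R^m, scaled to preserve norms."""
    return [sum(g * xi for g, xi in zip(row, x)) / math.sqrt(m) for row in G]

def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

images = [f(x) for x in points]
worst = max(abs(dist(images[i], images[j]) / dist(points[i], points[j]) - 1)
            for i in range(n) for j in range(i + 1, n))
print("worst relative distortion:", worst)
```

The distortion concentrates around 1/sqrt(m) in scale, which is the content of m = O(log n / epsilon^2).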

Title: Understanding Deep Neural Networks: From Generalization to Interpretability; zoom link @ https://sites.google.com/view/minds-seminar/home

Date: 06/18/2020

Time: 2:30 PM - 3:30 PM

Place:

(Part of One World MINDS seminar:
https://sites.google.com/view/minds-seminar/home)
Deep neural networks have recently seen an impressive comeback with applications both in the public sector and the sciences. However, despite their outstanding success, a comprehensive theoretical foundation of deep neural networks is still missing.

One main goal in deriving a theoretical understanding of deep neural networks is to analyze their generalization ability, i.e., their performance on unseen data. In the case of graph convolutional neural networks, which are today heavily used, for instance, in recommender systems, even the generalization capability to signals on graphs unseen in the training set, typically called transferability, had not been rigorously analyzed. In this talk, we will prove that spectral graph convolutional neural networks are indeed transferable, thereby also debunking a common misconception about this type of graph convolutional neural network.

If such theoretical approaches fail, or if one is given only a trained neural network without knowledge of how it was trained, interpretability approaches become necessary. These aim to "break open the black box" by identifying the features of the input that are most relevant for the observed output. Aiming at a theoretically founded approach to this problem, we introduced a novel method based on rate-distortion theory, coined Rate-Distortion Explanation (RDE), which not only provides state-of-the-art explanations but also allows first theoretical insights into the complexity of such problems. In this talk we will discuss this approach and show that it also gives a precise mathematical meaning to the previously vague notion of relevant parts of the input.

Title: Mad Max: Affine Spline Insights into Deep Learning; zoom link @ https://sites.google.com/view/minds-seminar/home

Date: 06/25/2020

Time: 2:30 PM - 3:30 PM

Place:

(Part of the One World MINDS series: https://sites.google.com/view/minds-seminar/home)
We build a rigorous bridge between deep networks (DNs) and approximation theory via spline functions and operators. Our key result is that a large class of DNs can be written as a composition of max-affine spline operators (MASOs), which provide a powerful portal through which to view and analyze their inner workings. For instance, conditioned on the input signal, the output of a MASO DN can be written as a simple affine transformation of the input. This implies that a DN constructs a set of signal-dependent, class-specific templates against which the signal is compared via a simple inner product; we explore the links to the classical theory of optimal classification via matched filters and the effects of data memorization. Going further, we propose a simple penalty term that can be added to the cost function of any DN learning algorithm to force the templates to be orthogonal to each other; this leads to significantly improved classification performance and reduced overfitting with no change to the DN architecture. The spline partition of the input signal space that is implicitly induced by a MASO directly links DNs to the theory of vector quantization (VQ) and K-means clustering, which opens up a new geometric avenue to study how DNs organize signals in a hierarchical fashion. To validate the utility of the VQ interpretation, we develop and validate a new distance metric for signals and images that quantifies the difference between their VQ encodings.
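A minimal sketch of the affine-spline viewpoint (my own toy example, with made-up weights): conditioned on the input, a ReLU network is an affine map, and the input-dependent "template" (A, b) read off from the activation pattern reproduces the network output exactly:

```python
# A tiny two-layer ReLU network R^2 -> R^1 with fixed (made-up) weights.
W1 = [[1.0, -2.0], [0.5, 1.0], [-1.0, 0.3]]   # layer 1: R^2 -> R^3
b1 = [0.1, -0.2, 0.05]
W2 = [[1.0, -1.0, 2.0]]                        # layer 2: R^3 -> R^1
b2 = [0.3]

def matvec(W, x, b):
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def forward(x):
    """Ordinary forward pass with ReLU activations."""
    h = [max(0.0, v) for v in matvec(W1, x, b1)]
    return matvec(W2, h, b2)

def induced_affine(x):
    """The affine map (A, b) selected by x's ReLU activation pattern."""
    pattern = [1.0 if v > 0 else 0.0 for v in matvec(W1, x, b1)]
    # Mask inactive units, then compose the two affine layers.
    W1m = [[p * w for w in row] for p, row in zip(pattern, W1)]
    b1m = [p * bi for p, bi in zip(pattern, b1)]
    A = [[sum(W2[i][k] * W1m[k][j] for k in range(3)) for j in range(2)]
         for i in range(1)]
    b = [sum(W2[i][k] * b1m[k] for k in range(3)) + b2[i] for i in range(1)]
    return A, b

x = [0.7, -0.4]
A, b = induced_affine(x)
affine_out = [sum(a * xi for a, xi in zip(A[0], x)) + b[0]]
print(forward(x), affine_out)   # identical outputs
```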

Title: Augmentations, Fillings, and Clusters of Positive Braid Closures, https://msu.zoom.us/j/94925518997

Date: 07/02/2020

Time: 9:30 AM - 10:30 AM

Place:

Consider the standard contact structure on $\mathbb{R}^3_{xyz}$ with contact 1-form $\alpha=dz-ydx$. A Legendrian link $\Lambda$ is a link in $\mathbb{R}^3$ along which $\alpha$ vanishes. Chekanov associated a dga (also known as the Chekanov-Eliashberg dga) to every Legendrian link $\Lambda$ and proved that the stable-tame equivalence class of this dga is invariant under Legendrian isotopy. The augmentation variety of $\Lambda$ is defined to be the moduli space of augmentations of its CE dga.
Let $\mathbb{R}^4_{xyzt}$ be the symplectization of the standard contact $\mathbb{R}^3_{xyz}$ and consider the symplectic field theory of exact Lagrangian cobordisms between Legendrian links placed at different constant-$t$ slices. Ekholm, Honda, and Kalman constructed a contravariant functor that maps an exact Lagrangian cobordism to a dga homomorphism between the CE dga's associated to the Legendrian links at the two ends. This induces a covariant functor that maps exact Lagrangian cobordisms to morphisms between augmentation varieties. In the special case of an exact Lagrangian filling of a Legendrian link $\Lambda$, i.e., an exact Lagrangian cobordism from the empty link to $\Lambda$, this covariant functor defines an algebraic torus inside the augmentation variety of $\Lambda$.
In this talk, I will focus on the case of rainbow closures of positive braids, which are naturally Legendrian links. By using an enhanced version of the CE dga, my collaborators and I construct a cluster $K_2$ structure on the augmentation variety of the rainbow closure $\Lambda_\beta$ for any positive braid $\beta$, and prove that the algebraic tori arising from the exact Lagrangian fillings constructed by pinching crossings are cluster charts in this cluster structure. We also relate the Kalman automorphism to the cluster Donaldson-Thomas (DT) transformation on these augmentation varieties. As an application of this new cluster structure, we give a sufficient condition, based on the order of the DT transformation, under which $\Lambda_\beta$ admits infinitely many non-Hamiltonian-isotopic exact Lagrangian fillings, solving a conjecture on the existence of Legendrian links that admit infinitely many fillings. This is joint work in progress with Honghao Gao and Linhui Shen.

Title: Beyond Sparsity: Non-Linear Harmonic Analysis with Phase for Deep Networks; zoom link @ https://sites.google.com/view/minds-seminar/home

Date: 07/02/2020

Time: 2:30 PM - 3:30 PM

Place:

(Part of One World MINDS seminar:
https://sites.google.com/view/minds-seminar/home)
Understanding the properties of deep neural networks is not just about applying standard harmonic analysis tools with a bit of optimization. It is shaking our understanding of non-linear harmonic analysis and opening new horizons. By considering image generation and classification problems of different complexities, I will show that sparsity is not always the answer and that phase plays an important role in capturing important structures, including symmetries, within multiscale representations. This talk will raise more questions than answers.

Title: Convergence of gradient flows for learning deep linear neural networks; zoom link @ https://sites.google.com/view/minds-seminar/home

Date: 07/09/2020

Time: 2:30 PM - 3:30 PM

Place:

(Part of One World MINDS seminar:
https://sites.google.com/view/minds-seminar/home)
Learning neural networks amounts to minimizing a loss function over given training data. Gradient descent algorithms are often used for this task, but their convergence properties are not yet well understood. In order to make progress, we consider the simplified setting of linear networks optimized via gradient flows. We show that such a gradient flow, defined with respect to the layers (factors), can in certain cases be reinterpreted as a Riemannian gradient flow on the manifold of rank-r matrices. The gradient flow always converges to a critical point of the underlying loss functional and, for almost all initializations, it converges to a global minimum on the manifold of rank-k matrices for some k <= r.
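A numerical sketch (mine; the talk analyzes the continuous gradient flow, while this uses its discretization, gradient descent): training a two-layer linear network f(x) = W2 W1 x against a rank-1 target drives the product W2 W1 to a global minimizer:

```python
import random

random.seed(2)
M = [[2.0, 1.0], [4.0, 2.0]]                           # rank-1 target matrix
W1 = [[random.uniform(0.1, 0.5) for _ in range(2)]]    # factor, 1 x 2
W2 = [[random.uniform(0.1, 0.5)] for _ in range(2)]    # factor, 2 x 1

def product():
    return [[W2[i][0] * W1[0][j] for j in range(2)] for i in range(2)]

def loss():
    P = product()
    return sum((P[i][j] - M[i][j]) ** 2 for i in range(2) for j in range(2))

lr = 0.01
for _ in range(5000):
    P = product()
    E = [[P[i][j] - M[i][j] for j in range(2)] for i in range(2)]  # residual
    # Gradients of ||W2 W1 - M||_F^2: dW2 = 2 E W1^T, dW1 = 2 W2^T E.
    dW2 = [[2 * sum(E[i][j] * W1[0][j] for j in range(2))] for i in range(2)]
    dW1 = [[2 * sum(W2[i][0] * E[i][j] for i in range(2)) for j in range(2)]]
    W2 = [[W2[i][0] - lr * dW2[i][0]] for i in range(2)]
    W1 = [[W1[0][j] - lr * dW1[0][j] for j in range(2)]]

print("final loss:", loss())
```

Since the target here has rank 1, the rank-1 factorization can reach zero loss, matching the global-minimum statement of the abstract in its simplest instance.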

Title: Multi-Domain Data Integration: From Observations to Mechanistic Insights; zoom link @ https://sites.google.com/view/minds-seminar/home

Date: 07/16/2020

Time: 2:30 PM - 3:30 PM

Place:

(Part of One World MINDS seminar: https://sites.google.com/view/minds-seminar/home)
Massive data collection holds the promise of a better understanding of complex phenomena and, ultimately, of better decisions. An exciting opportunity in this regard stems from the growing availability of perturbation / intervention data (manufacturing, advertisement, education, genomics, etc.). In order to obtain mechanistic insights from such data, a major challenge is the integration of different data modalities (video, audio, interventional, observational, etc.). Using genomics, and in particular the problem of identifying drugs for repurposing against COVID-19, as an example, I will first discuss our recent work on coupling autoencoders in the latent space to integrate and translate between data of very different modalities such as sequencing and imaging. I will then present a framework for integrating observational and interventional data for causal structure discovery and characterize the causal relationships that are identifiable from such data. We end with a theoretical analysis of autoencoders linking overparameterization to memorization. In particular, I will characterize the implicit bias of overparameterized autoencoders and show that such networks trained using standard optimization methods implement associative memory. Collectively, our results have major implications for planning and learning from interventions in various application domains.

Speaker: Tamara Kolda, Sandia National Laboratories

Title: Practical Leverage-Based Sampling for Low-Rank Tensor Decomposition; zoom link @ https://sites.google.com/view/minds-seminar/home

Date: 07/23/2020

Time: 2:30 PM - 3:30 PM

Place:

(Part of One World MINDS seminar:
https://sites.google.com/view/minds-seminar/home)
Conventional algorithms for finding low-rank canonical polyadic (CP) tensor decompositions are unwieldy for large sparse tensors. The CP decomposition can be computed by solving a sequence of overdetermined least squares problems with special Khatri-Rao structure. In this work, we present an application of randomized algorithms to fitting the CP decomposition of sparse tensors, solving a significantly smaller sampled least squares problem at each iteration, with probabilistic guarantees on the approximation errors. Prior work has shown that sketching is effective in the dense case, but the prior approach cannot be applied to the sparse case because a fast Johnson-Lindenstrauss transform (e.g., using a fast Fourier transform) must be applied in each mode, causing the sparse tensor to become dense. Instead, we perform sketching through leverage score sampling, crucially relying on the fact that the structure of the Khatri-Rao product allows sampling from overestimates of the leverage scores without forming the full product or the corresponding probabilities. Naive application of leverage score sampling is ineffective because we often have cases where a few scores are quite large, so we propose a novel hybrid of deterministic and random leverage-score sampling which consistently yields improved fits. Numerical results on real-world large-scale tensors show the method is significantly faster than competing methods without sacrificing accuracy. This is joint work with Brett Larsen, Stanford University.
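A toy sketch of the sampling idea in a plain least-squares setting (my illustration, not the paper's tensor algorithm): rows are sampled with probability proportional to their leverage scores h_i = x_i^T (X^T X)^{-1} x_i and reweighted, and the small sampled problem approximates the full solution:

```python
import random

random.seed(3)
n = 400
X = [[1.0, random.gauss(0, 1)] for _ in range(n)]       # 400 x 2 design
true_b = [2.0, -1.0]
y = [sum(x * b for x, b in zip(row, true_b)) + random.gauss(0, 0.1)
     for row in X]

def solve_2d(rows, rhs, w=None):
    """Weighted normal equations for a 2-column design matrix."""
    w = w or [1.0] * len(rows)
    a = b = c = p = q = 0.0
    for wt, (x0, x1), yi in zip(w, rows, rhs):
        a += wt * x0 * x0; b += wt * x0 * x1; c += wt * x1 * x1
        p += wt * x0 * yi; q += wt * x1 * yi
    det = a * c - b * b
    return [(c * p - b * q) / det, (a * q - b * p) / det]

b_full = solve_2d(X, y)

# Leverage scores h_i = x_i^T (X^T X)^{-1} x_i; they sum to the rank (2).
a = sum(r[0] * r[0] for r in X); bb = sum(r[0] * r[1] for r in X)
c = sum(r[1] * r[1] for r in X); det = a * c - bb * bb
lev = [(r[0] * (c * r[0] - bb * r[1]) + r[1] * (a * r[1] - bb * r[0])) / det
       for r in X]

# Sample a small subproblem with importance reweighting 1 / (m * p_i).
m = 60
idx = random.choices(range(n), weights=lev, k=m)
w = [1.0 / (m * lev[i] / sum(lev)) for i in idx]
b_sub = solve_2d([X[i] for i in idx], [y[i] for i in idx], w)
print(b_full, b_sub)
```

The paper's contribution is doing this for the Khatri-Rao structured problems inside CP fitting, where the leverage scores can be overestimated without forming the full product.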

Title: Sparse Recovery for Orthogonal Polynomial Transforms; zoom link @ https://sites.google.com/view/minds-seminar/home

Date: 07/30/2020

Time: 2:30 PM - 3:30 PM

Place:

(Part of One World MINDS seminar:
https://sites.google.com/view/minds-seminar/home)
Consider the following sparse recovery problem. We have query access to a vector x in R^N such that \hat{x} = Fx is k-sparse (or nearly k-sparse) for some orthogonal transform F. The goal is to output an approximation to \hat{x} in sublinear time. This problem has been well-studied in the special case that F is the Discrete Fourier Transform (DFT), and a long line of work has resulted in sparse Fast Fourier Transforms that run in time O(k polylog(N)). However, for transforms F other than the DFT (or closely related transforms like the Discrete Cosine Transform), the question is much less settled.
In this talk I'll describe sublinear-time algorithms---running in time poly(k log N)---for solving the sparse recovery problem for orthogonal transforms F that arise from orthogonal polynomials. More precisely, our algorithm works for any F that is an orthogonal polynomial transform derived from Jacobi polynomials. The Jacobi polynomials are a large class of classical orthogonal polynomials (and include Chebyshev and Legendre polynomials as special cases), and show up extensively in applications like numerical analysis and signal processing. One caveat of our work is that we require an assumption on the sparsity structure of the sparse vector, although we note that vectors with random support have this property with high probability.
Our approach is to give a very general reduction from the k-sparse sparse recovery problem to the 1-sparse sparse recovery problem, which works for any "flat" orthogonal polynomial transform.
This is joint work with Anna Gilbert, Albert Gu, Chris Re, and Atri Rudra.
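A small illustration of the setting (mine; this is the dense O(N^2) transform, whereas the talk's algorithms are sublinear): Chebyshev polynomials are the Jacobi special case alpha = beta = -1/2, and a vector that is k-sparse in the Chebyshev basis is exactly recovered by the discrete Chebyshev transform at the Chebyshev nodes:

```python
import math

def chebyshev_coeffs(f_vals, N):
    """Dense discrete Chebyshev transform via orthogonality at the nodes
    x_j = cos((j + 1/2) pi / N); exact for polynomials of degree < N."""
    coeffs = []
    for k in range(N):
        s = sum(f_vals[j] * math.cos(k * (j + 0.5) * math.pi / N)
                for j in range(N))
        coeffs.append(s * (1 if k == 0 else 2) / N)
    return coeffs

def T(k, x):
    """Chebyshev polynomial via T_k(cos t) = cos(k t)."""
    return math.cos(k * math.acos(x))

N = 16
sparse = {2: 1.5, 7: -0.75}    # a 2-sparse coefficient vector \hat{x}
nodes = [math.cos((j + 0.5) * math.pi / N) for j in range(N)]
f_vals = [sum(c * T(k, x) for k, c in sparse.items()) for x in nodes]

recovered = chebyshev_coeffs(f_vals, N)
```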

Title: Reconciling Statistical Queries and the Low Degree Likelihood Ratio; zoom link @ https://sites.google.com/view/minds-seminar/home

Date: 08/06/2020

Time: 2:30 PM - 3:30 PM

Place:

(Part of One World MINDS seminar: https://sites.google.com/view/minds-seminar/home)
In many high-dimensional statistics problems, we observe information-computation tradeoffs: given access to more data, statistical estimation and inference tasks require fewer computational resources. Though this phenomenon is ubiquitous, we lack rigorous evidence that it is inherent. Today, to prove that a statistical estimation task is computationally intractable, researchers must prove lower bounds against each type of algorithm, one by one, resulting in a "proliferation of lower bounds". We scientists dream of a more general theory which unifies these lower bounds and explains computational intractability in an algorithm-independent way.
In this talk, I will make one small step towards realizing this dream. I will demonstrate general conditions under which two popular frameworks yield the same information-computation tradeoffs for high-dimensional hypothesis testing: the first being statistical queries in the "SDA" framework, and the second being hypothesis testing with low-degree hypothesis tests, also known as the low-degree-likelihood ratio. Our equivalence theorems capture numerous well-studied high-dimensional learning problems: sparse PCA, tensor PCA, community detection, planted clique, and more.
Based on joint work with Matthew Brennan, Guy Bresler, Samuel B. Hopkins and Jerry Li.