Title: ZOOM TALK (password the smallest prime > 100) - Towards Intrinsically Low-Dimensional Models in Wasserstein Space: Geometry, Statistics, and Learning

Date: 02/02/2023

Time: 2:30 PM - 3:30 PM

Place: C304 Wells Hall

Contact: Mark A Iwen

We study efficient modeling and representation learning for probability distributions in Wasserstein space. We consider a general barycentric coding model in which data are represented as Wasserstein-2 (W2) barycenters of a set of fixed reference measures. Leveraging the Riemannian structure of W2-space, we develop a tractable optimization program to learn the barycentric coordinates when given access to the densities of the underlying measures. We provide a consistent statistical procedure for learning these coordinates when the measures are accessed only through i.i.d. samples. Our consistency results and algorithms exploit entropic regularization of the optimal transport problem, thereby allowing our barycentric modeling approach to scale efficiently. We also consider the problem of learning the reference measures from observed data. Our regularized approach to dictionary learning in Wasserstein space addresses core problems of ill-posedness and, in practice, learns interpretable dictionary elements and coefficients useful for downstream tasks. Applications to image and natural language processing will be shown throughout the talk.
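
For orientation, the barycentric coding model named in the abstract can be written in standard optimal transport notation. This is a sketch under assumptions, not necessarily the speaker's formulation: the symbols \mu_1, ..., \mu_m (reference measures), \lambda (barycentric coordinates), \Delta^{m-1} (probability simplex), and \varepsilon (regularization strength) are introduced here for illustration, and the entropic term shown is one common form of the regularization, which the abstract does not specify.

```latex
% Squared Wasserstein-2 distance between measures \mu and \nu on \mathbb{R}^d,
% where \Pi(\mu,\nu) is the set of couplings with marginals \mu and \nu.
W_2^2(\mu,\nu) = \inf_{\pi \in \Pi(\mu,\nu)} \int \|x - y\|^2 \, d\pi(x,y)

% Barycentric coding model: a data measure is the W2 barycenter of the
% fixed references \mu_1,\dots,\mu_m with coordinates \lambda.
\nu_\lambda \in \operatorname*{arg\,min}_{\nu \in \mathcal{P}_2(\mathbb{R}^d)}
  \sum_{i=1}^{m} \lambda_i \, W_2^2(\nu, \mu_i),
\qquad
\lambda \in \Delta^{m-1}
  = \Bigl\{ \lambda \in \mathbb{R}^m : \lambda_i \ge 0,\ \sum_{i=1}^{m} \lambda_i = 1 \Bigr\}

% One common entropic regularization of the transport problem (assumed form).
W_{2,\varepsilon}^2(\mu,\nu) = \inf_{\pi \in \Pi(\mu,\nu)}
  \int \|x - y\|^2 \, d\pi(x,y) + \varepsilon \, \mathrm{KL}(\pi \,\|\, \mu \otimes \nu)
```

In this notation, learning the barycentric coordinates means recovering \lambda from an observed \nu_\lambda and the references, and dictionary learning means estimating the references \mu_1, ..., \mu_m themselves from data.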