Department of Mathematics

Applied Mathematics

  •  Anna Little, University of Utah
  •  Clustering High-dimensional Data with Path Metrics: A Balance of Density and Geometry; zoom link @ https://sites.google.com/view/minds-seminar/home
  •  12/10/2020
  •  2:30 PM - 3:30 PM
  •  Online (virtual meeting) (Virtual Meeting Link)
  •  Olga Turanova (turanova@msu.edu)

This talk discusses multiple methods for clustering high-dimensional data, and explores the delicate balance between utilizing data density and data geometry. I will first present path-based spectral clustering, a novel approach which combines a density-based metric with graph-based clustering. This density-based path metric allows for fast algorithms and strong theoretical guarantees when clusters concentrate around low-dimensional sets. However, the method suffers from a loss of geometric information, information which is preserved by simple linear dimension reduction methods such as classic multidimensional scaling (CMDS). The second part of the talk will explore when CMDS followed by a simple clustering algorithm can exactly recover all cluster labels with high probability. However, scaling conditions become increasingly restrictive as the ambient dimension increases, and the method will fail for irregularly shaped clusters. Finally, I will discuss how a more general family of path metrics, combined with MDS, give low-dimensional embeddings which respect both data density and data geometry. This new method exhibits promising performance on single cell RNA sequence data and can be computed efficiently by restriction to a sparse graph.

 

Contact

Department of Mathematics
Michigan State University
619 Red Cedar Road
C212 Wells Hall
East Lansing, MI 48824

Phone: (517) 353-0844
Fax: (517) 432-1562

College of Natural Science