The Mathematics and Deep Learning Collective: Exploratory Data Analysis for Data Objects on a Metric Space via Tukey’s Depth

April 2, 2021, 4:10-5:00pm | Zoom

Exploratory data analysis involves looking at the data and understanding what can be done with them. Non-standard data objects such as directions, covariance matrices, trees, functions, and images have become increasingly common in modern practice. Such complex data objects are hard to examine because they lack a natural ordering and efficient visualization tools. We develop a novel exploratory tool, based on data depth, for data objects lying on a metric space, extending the celebrated Tukey’s depth for Euclidean data. The proposed metric halfspace depth assigns a depth value to each data point, characterizing its centrality or outlyingness. The depth values also yield an interpretable center-outward ranking, which can be used to construct rank tests. I will demonstrate two applications: one revealing differential brain connectivity patterns in an Alzheimer’s disease study, and another inferring the phylogenetic history of seven pathogenic parasites and identifying outlying phylogenies.