On Rank Energy Statistics via Optimal Transport: Continuity, Convergence, and Change Point Detection
Matthew Werenski, Shoaib Bin Masud, James M. Murphy, Shuchin Aeron
DOI: https://doi.org/10.1109/tit.2024.3367182
IF: 2.5
2024-01-01
IEEE Transactions on Information Theory
Abstract: This paper considers the use of recently proposed optimal transport-based multivariate goodness-of-fit (GoF) test statistics, namely rank energy and its variant the soft rank energy derived from entropy-regularized optimal transport, for unsupervised non-parametric change point detection (CPD) in multivariate time series data. We show that the soft rank energy enjoys both fast rates of statistical convergence and robust continuity properties which lead to strong performance on real datasets. Our analyses remove the need for resampling and out-of-sample extensions previously required to obtain such rates. Our theoretical results show that the rank energy suffers from the curse of dimensionality in statistical estimation and moreover can signal a change point from arbitrarily small perturbations, which leads to a high rate of false alarms in CPD. Additionally, under mild regularity conditions, we quantify the discrepancy between soft rank energy and rank energy in terms of the regularization parameter. Finally, we show our approach performs favorably in numerical experiments compared to several other optimal transport-based methods as well as maximum mean discrepancy (MMD), which is a popular multivariate GoF statistic.
computer science, information systems; engineering, electrical & electronic
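
To give a rough sense of the idea behind a rank-based energy statistic, the sketch below works out the one-dimensional special case only. In one dimension the optimal transport map onto the uniform distribution on [0, 1] reduces to the cumulative distribution function, so a sample "rank" can be taken as the normalized position in the pooled, sorted sample. The function names (`pooled_ranks`, `mean_abs_diff`, `rank_energy_1d`), the i/n rank normalization, and the no-ties assumption are illustrative choices, not taken from the paper; the multivariate and entropy-regularized (soft) versions discussed in the abstract are not implemented here.

```python
from itertools import product


def pooled_ranks(xs, ys):
    """Map each 1-D sample to its normalized rank in the pooled sample.

    Assumption: all values are distinct (no ties), and ranks are i/n.
    """
    pooled = sorted(xs + ys)
    n = len(pooled)
    rank = {v: (i + 1) / n for i, v in enumerate(pooled)}
    return [rank[v] for v in xs], [rank[v] for v in ys]


def mean_abs_diff(a, b):
    """Average of |u - v| over all pairs (u, v) with u in a and v in b."""
    return sum(abs(u - v) for u, v in product(a, b)) / (len(a) * len(b))


def rank_energy_1d(xs, ys):
    """Energy distance between the pooled ranks of two 1-D samples."""
    rx, ry = pooled_ranks(xs, ys)
    return (2 * mean_abs_diff(rx, ry)
            - mean_abs_diff(rx, rx)
            - mean_abs_diff(ry, ry))


if __name__ == "__main__":
    interleaved = rank_energy_1d([1, 3, 5, 7], [2, 4, 6, 8])
    separated = rank_energy_1d([1, 2, 3, 4], [5, 6, 7, 8])
    print(interleaved, separated)
```

In this toy setting the statistic depends only on the ordering of the pooled sample, so rescaling the data by any strictly increasing transformation leaves it unchanged, and well-separated samples score higher than interleaved ones.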