Relationships Between Tail Entropies and Local Intrinsic Dimensionality and Their Use for Estimation and Feature Representation

James Bailey, Michael E. Houle, Xingjun Ma
DOI: https://doi.org/10.1016/j.is.2023.102245
IF: 3.18
2023-01-01
Information Systems
Abstract: The local intrinsic dimensionality (LID) model assesses the complexity of data within the vicinity of a query point, through the growth rate of the probability measure within an expanding neighborhood. In this paper, we show how LID is asymptotically related to the entropy of the lower tail of the distribution of distances from the query. We establish relationships for cumulative Shannon entropy, entropy power, Bregman formulation of cumulative Kullback–Leibler divergence, and generalized Tsallis entropy variants. Leveraging these relationships, we propose four new estimators of LID, one of them expressible in an intuitive analytic form. We investigate the effectiveness of these new estimators, as well as the effectiveness of entropy power as the basis for feature representations in classification.
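For orientation, the "growth rate of the probability measure" is commonly formalized in the LID literature as LID_F(r) = r F'(r) / F(r), with the local value taken in the limit r → 0, where F is the cumulative distribution function of distances from the query. The sketch below is not one of the four entropy-based estimators proposed in the paper, whose forms the abstract does not give. It illustrates the classical maximum-likelihood (Hill-type) LID estimator that such estimators are usually compared against, under the assumption that the neighbor distances are positive and that the largest one serves as the neighborhood radius. The function names `lid_mle` and `sample_distances` are hypothetical and chosen only for this illustration.

```python
import math
import random


def lid_mle(distances):
    """Classical maximum-likelihood (Hill-type) LID estimate.

    `distances` holds positive distances from a query point to its
    neighbors. The largest distance is used as the neighborhood radius w.
    This is an illustrative baseline, not an estimator from the paper.
    """
    positive = [d for d in distances if d > 0]
    if len(positive) < 2:
        raise ValueError("need at least two positive distances")
    w = max(positive)
    # Average log-ratio of each distance to the neighborhood radius.
    mean_log = sum(math.log(d / w) for d in positive) / len(positive)
    if mean_log == 0.0:
        raise ValueError("all distances are equal; estimate is undefined")
    return -1.0 / mean_log


def sample_distances(dim, n, seed=0):
    """Draw n distances whose CDF is F(r) = r**dim on (0, 1].

    This is the distance distribution from the center of a unit ball
    under a uniform density in `dim` dimensions, so the true LID is `dim`.
    """
    rng = random.Random(seed)
    # 1 - rng.random() lies in (0, 1], which keeps every distance positive.
    return [(1.0 - rng.random()) ** (1.0 / dim) for _ in range(n)]


if __name__ == "__main__":
    estimate = lid_mle(sample_distances(dim=5, n=20000, seed=1))
    print(f"estimated LID: {estimate:.3f} (true value 5)")
```

On synthetic distances of this kind the estimate should land close to the true dimension, which gives a simple sanity check before trying any estimator on real neighbor distances.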