Abstract: Defining a successful notion of a multivariate quantile has been an open problem for more than half a century, motivating a plethora of possible solutions. Of these, the approach of [8] and [25], which leads to M-quantiles, is very appealing for its mathematical elegance, combining elements of convex analysis and probability theory. The key idea is the description of a convex function (the K-function) whose gradient (the K-transform) is a one-to-one correspondence between all of R^d and the unit ball in R^d. By analogy with the d=1 case, where the K-transform is a cumulative distribution function-like object (an M-distribution), the guaranteed existence of its inverse provides a natural basis for defining a quantile function for all d>=1. Over the past twenty years, the resulting M-quantiles have seen applications in a variety of fields, primarily for detecting outliers in multidimensional spaces. In this article we prove that for odd d>=3, it is not the gradient but a poly-Laplacian of the K-function that is (almost everywhere) proportional to the density function. For even d, one cannot establish a differential equation connecting the K-function with the density. These results show that using the K-transform for outlier detection in higher odd dimensions is flawed in principle, as the K-transform does not originate from inversion of a true M-distribution. We demonstrate these conclusions in two dimensions through examples from non-standard asymmetric distributions. Our examples illustrate a feature of the K-transform whereby regions of the domain with higher density map to larger volumes in the co-domain, producing a magnification effect that moves inliers closer to the boundary of the co-domain than outliers. This feature disrupts any outlier detection mechanism that relies on the inverse K-transform.
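To make the K-transform concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not state a formula, so the code assumes the spatial-sign form commonly associated with this line of work, K(x) = E[|x - X| - |X|], whose gradient is the mean unit vector E[(x - X)/|x - X|]. The function name `k_transform` and the test distributions are hypothetical choices made for illustration. Under that assumption, the sketch checks two properties mentioned above: in d=1 the empirical K-transform coincides with 2*F_n(x) - 1, a rescaled empirical distribution function, and in d=2 its values lie inside the unit ball.

```python
import numpy as np

def k_transform(x, sample):
    """Empirical K-transform under the ASSUMED spatial-sign form:
    the average of unit vectors pointing from each sample point to x."""
    x = np.asarray(x, dtype=float)
    sample = np.asarray(sample, dtype=float)   # shape (n, d)
    diff = x - sample                          # broadcasts to shape (n, d)
    norms = np.linalg.norm(diff, axis=1)
    units = np.zeros_like(diff)
    nonzero = norms > 0                        # skip sample points equal to x
    units[nonzero] = diff[nonzero] / norms[nonzero, None]
    return units.mean(axis=0)

rng = np.random.default_rng(0)

# d = 1: compare with the rescaled empirical CDF, 2 * F_n(x) - 1.
sample_1d = rng.normal(size=(1000, 1))
x0 = 0.5
ecdf = np.mean(sample_1d[:, 0] <= x0)
k1 = k_transform([x0], sample_1d)[0]

# d = 2: an asymmetric (exponential) sample; the image lies in the unit ball.
sample_2d = rng.exponential(size=(1000, 2))
k2 = k_transform([3.0, 3.0], sample_2d)
k2_norm = np.linalg.norm(k2)
```

Here `k1` and `2 * ecdf - 1` agree up to floating-point error whenever `x0` is not itself a sample point, and `k2_norm` stays below 1. The sketch only illustrates the mapping into the unit ball; it does not reproduce the paper's poly-Laplacian result or its two-dimensional examples.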

STAR_outliers: a python package that separates univariate outliers from non-normal distributions

Check your outliers! An introduction to identifying statistical outliers in R with easystats

A Novel Outlier Detection Method for Multivariate Data

SDROF: outlier detection algorithm based on relative skewness density ratio outlier factor

A robust distance-based approach for detecting multidimensional outliers

An experimental study of existing tools for outlier detection and cleaning in trajectories

Onion-Peeling Outlier Detection in 2-D data Sets

An outlier map for Support Vector Machine classification

MaxSkew and MultiSkew: Two R Packages for Detecting, Measuring and Removing Multivariate Skewness

Mean-shift outlier detection and filtering

Finding Outliers in Gaussian Model-based Clustering

Anomaly Detection by Robust Statistics

Evaluating outlier probabilities: assessing sharpness, refinement, and calibration using stratified and weighted measures

A method for outlier detection based on cluster analysis and visual expert criteria

Efficient Generation of Hidden Outliers for Improved Outlier Detection

The orthogonal skew model: computationally efficient multivariate skew-normal and skew-t distributions with applications to model-based clustering

On the use of the M-quantiles for outlier detection in multivariate data

Multivariate outlier detection based on a robust Mahalanobis distance with shrinkage estimators

Data-driven cluster analysis method: a novel outliers detection method in multivariate data

Outliers Detection in Networks with Missing Links

Simultaneous Transformation and Rounding (STAR) Models for Integer-Valued Data