MS2OD: Outlier Detection Using Minimum Spanning Tree and Medoid Selection

Jia Li, Jiangwei Li, Chenxu Wang, Fons J Verbeek, Tanja Schultz, Hui Liu
DOI: https://doi.org/10.1088/2632-2153/ad2492
2024-02-02
Machine Learning: Science and Technology
Abstract: Outlier detection is an essential data mining task that identifies abnormal patterns in numerous applications. Clustering-based outlier detection is among the most popular approaches because it is effective at detecting cluster-related outliers, especially in medical applications. This article presents a method for extracting cluster-based outliers that combines a scaled minimum spanning tree (MST) data structure with a new medoid selection method: (1) we compute a scaled MST and iteratively cut the current longest edge to obtain clusters; (2) we apply a new medoid selection method that accounts for the effect of noise to improve the quality of cluster-based outlier identification. Experimental results on real-world data, including extensive medical corpora and other semantically meaningful datasets, demonstrate the wide applicability of the proposed method and its superior performance on the evaluation metrics.
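The two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the scaling applied to the MST or the noise-aware medoid rule, so the code below assumes a plain Euclidean MST and the ordinary medoid (the cluster member with the smallest total distance to the other members). The function names `mst_clusters` and `medoid_distance_scores`, the parameter `n_cuts`, and the toy data are all hypothetical.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components, minimum_spanning_tree
from scipy.spatial.distance import cdist


def mst_clusters(X, n_cuts):
    """Cluster points by removing the n_cuts longest edges of a Euclidean MST.

    For a plain (unscaled) MST, cutting the current longest edge n_cuts times
    is the same as dropping the n_cuts longest edges at once.
    Assumes no duplicate points: SciPy treats zero distances as missing edges.
    """
    dist = cdist(X, X)                          # dense pairwise distances
    mst = minimum_spanning_tree(dist).tocoo()   # n - 1 edges for n points
    order = np.argsort(mst.data)                # edges from shortest to longest
    keep = order[: len(order) - n_cuts]         # drop the longest n_cuts edges
    pruned = coo_matrix(
        (mst.data[keep], (mst.row[keep], mst.col[keep])), shape=mst.shape
    )
    _, labels = connected_components(pruned, directed=False)
    return labels


def medoid_distance_scores(X, labels):
    """Score each point by its distance to the medoid of its own cluster.

    Uses the ordinary medoid, an assumption standing in for the paper's
    noise-aware medoid selection.
    """
    scores = np.zeros(len(X))
    for lab in np.unique(labels):
        members = np.flatnonzero(labels == lab)
        d = cdist(X[members], X[members])
        medoid = members[np.argmin(d.sum(axis=1))]
        scores[members] = np.linalg.norm(X[members] - X[medoid], axis=1)
    return scores


# Toy data: two compact groups plus one point, (4, 4), that drifts from the first.
X = np.array(
    [[0, 0], [0, 1], [1, 0], [1, 1], [4, 4],
     [20, 20], [20, 21], [21, 20], [21, 21]],
    dtype=float,
)
labels = mst_clusters(X, n_cuts=1)
scores = medoid_distance_scores(X, labels)
print(labels, np.round(scores, 2))
```

With a single cut, the long edge between the two groups is removed, and the point that sits farthest from its cluster's medoid receives the largest score. How many edges to cut and how to threshold the scores are left open here, since the abstract does not specify them.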