Abstract:While the classical KNN (k nearest neighbor) shares its avoidance of the consistent distribution assumption between training and testing samples to achieve fast prediction, it still faces two challenges: (a) its generalization ability heavily depends on an appropriate number k of nearest neighbors; (b) its prediction behavior lacks interpretability. In order to address the two challenges, a novel Bayes-decisive linear KNN with adaptive nearest neighbors (i.e., BLA-KNN) is proposed to obtain the following three merits: (a) a diagonal matrix is introduced to adaptively select the nearest neighbors and simultaneously improve the generalization capability of the proposed BLA-KNN method; (b) the proposed BLA-KNN method owns the group effect, which inherits and extends the group property of the sum of squares for total deviations by reflecting the training sample class-aware information in the group effect regularization term; (c) the prediction behavior of the proposed BLA-KNN method can be interpreted from the Bayes-decision-rule perspective. In order to do so, we first use a diagonal matrix to weigh each training sample so as to obtain the importance of the sample, while constraining the importance weights to ensure that the adaptive k value is carried out efficiently. Second, we introduce a class-aware information regularization term in the objective function to obtain the nearest neighbor group effect of the samples. Finally, we introduce linear expression weights related to the distance measure between the testing and training samples in the regularization term to ensure that the interpretation of Bayes-decision-rule can be performed smoothly. We also optimize the proposed objective function using an alternating optimization strategy. We experimentally demonstrate the effectiveness of the proposed BLA-KNN method by comparing it with 7 comparative methods on 15 benchmark datasets.

Reachable Distance Function for KNN Classification

Accelerating Exact Nearest Neighbor Search in High Dimensional Euclidean Space Via Block Vectors

Image Retrieval and Recognition Based on Generalized Local Distance Functions

Introduction to Machine Learning: K-Nearest Neighbors

K-Ns: A Classifier by the Distance to the Nearest Subspace.

A new distance measurement and its application in K-Means Algorithm

Efficient Data-aware Distance Comparison Operations for High-Dimensional Approximate Nearest Neighbor Search

A Generalization of Proximity Functions for K-Means

Hybrid Metric K-Nearest Neighbor Algorithm and Applications

A Novel Effective Distance Measure and a Relevant Algorithm for Optimizing the Initial Cluster Centroids of K-means

Optimization of distance formula in K-Nearest Neighbor method

Adaptive Nearest Neighbor: A General Framework for Distance Metric Learning

A Fast and Easy Regression Technique for k-NN Classification Without Using Negative Pairs

A Novel Pseudo Nearest Neighbor Classification Method Using Local Harmonic Mean Distance

Lazylsh: Approximate Nearest Neighbor Search For Multiple Distance Functions With A Single Index

Hausdorff distance with k-nearest neighbors

A Novel Adaptive Hit-Distance For Pattern Recognition

FedKNN: Secure Federated k-Nearest Neighbor Search

Efficient index-based KNN join processing for high-dimensional data

Bayes-Decisive Linear KNN with Adaptive Nearest Neighbors

p -adic distance and k -Nearest Neighbor classification