Abstract: Much research has been devoted to learning a Mahalanobis distance metric, which can effectively improve the performance of kNN classification. Most approaches are iterative and computationally expensive, and linear rigidity still critically limits how well metric learning algorithms can perform. We propose a computationally economical framework to learn multiple metrics in closed form.
What problem does this paper attempt to address?
This paper addresses the limited accuracy of the k-nearest neighbor (kNN) classifier when it relies on a traditional distance metric such as the Euclidean distance. Specifically, the paper focuses on:
1. **Linear Limitations**: Traditional Mahalanobis distance metric learning methods have linear rigidity, which restricts their performance when dealing with datasets with inherent nonlinear structures.
2. **Computational Efficiency**: Most existing metric learning methods rely on iterative numerical solvers, which are computationally costly and prone to converging to local optima.
3. **The Gap between Metric Learning and the kNN Classifier**: Even a well-optimized objective function does not necessarily yield a significant improvement in kNN accuracy.
To solve these problems, the author proposes a new framework, **Multiple Closed-Form Local Metric Learning (CFLML)**. This framework aims to improve the performance of the kNN classifier in the following ways:
- **Multi-Metric Learning**: Instead of learning a single global metric, learn multiple local metrics to adapt to different regions of the data.
- **Closed-Form Solution**: Adopt a closed-form solution to generate new metrics, thus avoiding the high computational cost of iterative solving.
- **Complementary Performance**: The newly generated sub-metrics can complement the deficiencies of the parent metric, thereby improving the overall classification performance.
### Specific Problem Description
The paper points out that traditional kNN classifiers usually use the simple Euclidean distance to calculate the similarity between instances. However, this method ignores the statistical regularities that can be estimated from a large number of labeled training samples. To overcome this limitation, many researchers have proposed learning the Mahalanobis distance, that is, the quadratic metric, to improve the performance of the kNN classifier.
However, existing methods have two major problems:
1. **High Computational Complexity**: Most metric learning methods need to optimize the objective function through iterative solvers, which leads to high computational costs.
2. **Linear Limitations**: The Mahalanobis distance is essentially linear, and for datasets with nonlinear structures, linear metrics may perform poorly.
Therefore, the author proposes a more computationally economical framework that can learn multiple metrics in closed form and can effectively improve the performance of the kNN classifier.
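To make the contrast between closed-form and iterative learning concrete, here is a minimal sketch of a metric obtained in one pass over the data, with no solver loop. It is not the CFLML solution, which this summary does not spell out. It substitutes a simple, well-known heuristic: a diagonal metric whose weights are the inverse pooled within-class variance of each feature. The function names `class_means` and `diagonal_metric` and the regularizer `eps` are assumptions made for illustration.

```python
def class_means(X, labels):
    """Per-class mean vector, computed in a single pass over the data."""
    groups = {}
    for x, c in zip(X, labels):
        groups.setdefault(c, []).append(x)
    return {c: [sum(col) / len(rows) for col in zip(*rows)]
            for c, rows in groups.items()}


def diagonal_metric(X, labels, eps=1e-9):
    """Closed-form diagonal metric (illustrative stand-in, not CFLML).

    Each feature is weighted by the inverse of its pooled within-class
    variance, so features that vary little inside a class count more.
    eps is an assumed regularizer that guards against zero variance.
    """
    means = class_means(X, labels)
    dims = len(X[0])
    var = [0.0] * dims
    for x, c in zip(X, labels):
        for j in range(dims):
            var[j] += (x[j] - means[c][j]) ** 2 / len(X)
    return [1.0 / (v + eps) for v in var]


# Toy data: feature 0 separates the classes, feature 1 is mostly noise.
X = [[0.0, 0.0], [1.0, 4.0], [3.0, 0.0], [4.0, 4.0]]
labels = ["a", "a", "b", "b"]
weights = diagonal_metric(X, labels)
```

On this toy data the noisy second feature receives a much smaller weight than the first, and the result comes from direct formulas rather than from an iterative optimizer.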
### Related Formulas
The Mahalanobis distance is defined as follows:
\[ d(x, y)=\sqrt{(x - y)^{T} S(x - y)} \]
where \( S \) is a positive semi-definite matrix and can be represented by eigenvalue decomposition as:
\[ S = U^{T} \Lambda U = L^{T} L \]
where \( \Lambda \) is a diagonal matrix composed of the non-zero eigenvalues of \( S \), \( U \) is the matrix whose rows are the corresponding eigenvectors, and \( L=\Lambda^{1 / 2} U \).
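The factorization \( S = L^{T} L \) is what makes the metric linear: the Mahalanobis distance under \( S \) equals the Euclidean distance after mapping every point through \( L \). The sketch below checks this numerically. It starts from a hand-picked \( L \) (an assumption chosen for illustration) and builds \( S \) from it, rather than running an eigenvalue decomposition; the helper names are likewise hypothetical.

```python
import math


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def gram(L):
    """Return S = L^T L as a list of rows."""
    n = len(L[0])
    return [[sum(row[i] * row[j] for row in L) for j in range(n)]
            for i in range(n)]


def mahalanobis(x, y, S):
    """d(x, y) = sqrt((x - y)^T S (x - y))."""
    diff = [a - b for a, b in zip(x, y)]
    return math.sqrt(sum(d * s for d, s in zip(diff, matvec(S, diff))))


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Illustrative 2x3 map: S = L^T L is a rank-2 positive semi-definite matrix.
L = [[2.0, 0.0, 1.0],
     [0.0, 1.0, -1.0]]
S = gram(L)

x = [1.0, 2.0, 3.0]
y = [0.0, -1.0, 1.0]

d_metric = mahalanobis(x, y, S)                    # distance under S
d_linear = euclidean(matvec(L, x), matvec(L, y))   # Euclidean after mapping by L
```

Both quantities agree, which is why a single Mahalanobis metric cannot do more than a global linear transformation of the input space.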
By introducing multiple local metrics, the author hopes to use different metrics in different regions, thereby better capturing the local characteristics of the data. Specifically, each training sample is associated with one or more metrics, and classification is performed by selecting the metric with the least ambiguity.
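The selection step can be sketched as follows. This summary does not define how ambiguity is measured, so the code assumes one plausible choice: the fraction of the k nearest neighbors that disagree with the majority label under a given metric. It also simplifies by trying every candidate metric on every query instead of tracking which metrics belong to which training samples. The names `knn_vote` and `classify` are hypothetical.

```python
from collections import Counter


def sq_dist(x, y, S):
    """Squared Mahalanobis distance (x - y)^T S (x - y)."""
    diff = [a - b for a, b in zip(x, y)]
    n = len(diff)
    return sum(diff[i] * S[i][j] * diff[j] for i in range(n) for j in range(n))


def knn_vote(query, X, labels, S, k):
    """Return (majority label, ambiguity) of the k nearest neighbors under S.

    Ambiguity is an assumed measure: the share of the k neighbors that
    disagree with the majority label (0.0 means a unanimous vote).
    """
    order = sorted(range(len(X)), key=lambda i: sq_dist(query, X[i], S))
    votes = Counter(labels[i] for i in order[:k])
    label, count = votes.most_common(1)[0]
    return label, 1.0 - count / k


def classify(query, X, labels, metrics, k=3):
    """Classify with the candidate metric whose vote is least ambiguous."""
    results = [knn_vote(query, X, labels, S, k) for S in metrics]
    return min(results, key=lambda r: r[1])[0]


# Toy data: the class depends only on the first coordinate.
X = [[0.0, 0.0], [0.0, 5.0], [0.0, 10.0],
     [1.0, 9.0], [1.0, 10.0], [1.0, 11.0]]
labels = ["a", "a", "a", "b", "b", "b"]

identity = [[1.0, 0.0], [0.0, 1.0]]      # plain Euclidean distance
first_only = [[1.0, 0.0], [0.0, 0.0]]    # ignores the second coordinate

query = [0.4, 10.0]
prediction = classify(query, X, labels, [identity, first_only])
```

Under the Euclidean metric the query's three nearest neighbors split two to one in favor of class `b`, while the metric that ignores the second coordinate produces a unanimous vote for class `a`, so the less ambiguous metric decides the label.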
In summary, the goal of this paper is to overcome the computational complexity and linear limitations of existing methods by proposing a new multi-metric learning framework, thereby improving the performance of the kNN classifier.