Kernel Wasserstein Distance

Jung Hun Oh, Maryam Pouryahya, Aditi Iyer, Aditya P. Apte, Allen Tannenbaum, Joseph O. Deasy
DOI: https://doi.org/10.1016/j.compbiomed.2020.103731
2019-05-23
Abstract: The Wasserstein distance is a powerful metric based on the theory of optimal transport. It gives a natural measure of the distance between two distributions with a wide range of applications. In contrast to a number of the common divergences on distributions such as Kullback-Leibler or Jensen-Shannon, it is (weakly) continuous, and thus ideal for analyzing corrupted data. To date, however, no kernel methods for dealing with nonlinear data have been proposed via the Wasserstein distance. In this work, we develop a novel method to compute the L2-Wasserstein distance in a kernel space implemented using the kernel trick. The latter is a general method in machine learning employed to handle data in a nonlinear manner. We evaluate the proposed approach in identifying computerized tomography (CT) slices with dental artifacts in head and neck cancer, performing unsupervised hierarchical clustering on the resulting Wasserstein distance matrix that is computed on imaging texture features extracted from each CT slice. Our experiments show that the kernel approach outperforms classical non-kernel approaches in identifying CT slices with artifacts.
Machine Learning
What problem does this paper attempt to address?
This paper addresses the problem of identifying computed tomography (CT) slices that contain artifacts caused by high-density materials such as dental metal fillings or crowns. The authors work with CT images of head and neck cancer patients. Although the Wasserstein distance is a powerful metric used in a wide range of applications, the authors note that no kernel method for handling nonlinear data via the Wasserstein distance had been proposed. The main contribution of the paper is therefore a method that computes the L2-Wasserstein distance in a kernel space by means of the kernel trick. The resulting distance matrix, computed on imaging texture features extracted from each CT slice, is passed to unsupervised hierarchical clustering to separate slices with artifacts from those without. The experiments show that the kernel approach outperforms classical non-kernel approaches in identifying CT slices with artifacts.
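To make the idea of computing the L2-Wasserstein distance in kernel space concrete, here is a minimal sketch in Python. It is not taken from the paper and may differ from the authors' formulation. It assumes each sample set is summarized by a Gaussian in feature space, so that the closed-form Gaussian L2-Wasserstein distance applies, and it rewrites every term of that formula using kernel (Gram) matrices only. The function names `rbf_kernel`, `linear_kernel`, and `kernel_w2_squared`, the `gamma` parameter, and the use of biased (1/n) covariances are all illustrative assumptions.

```python
import numpy as np


def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def linear_kernel(A, B):
    """Linear kernel; with it the feature space is the input space itself."""
    return A @ B.T


def kernel_w2_squared(X, Y, kernel):
    """Squared L2-Wasserstein distance between Gaussians fitted in feature space.

    Assumption: uses the closed form
        ||m1 - m2||^2 + tr(C1) + tr(C2) - 2 tr((C1^{1/2} C2 C1^{1/2})^{1/2})
    with biased (1/n) covariances, expressed through Gram matrices only.
    """
    n, m = len(X), len(Y)
    Kxx, Kyy, Kxy = kernel(X, X), kernel(Y, Y), kernel(X, Y)

    # Centering matrices: J = I - (1/n) 1 1^T.
    Jn = np.eye(n) - np.ones((n, n)) / n
    Jm = np.eye(m) - np.ones((m, m)) / m

    # Squared distance between the two feature-space means.
    mean_term = Kxx.mean() + Kyy.mean() - 2.0 * Kxy.mean()

    # Traces of the feature-space covariances.
    tr_x = np.trace(Jn @ Kxx @ Jn) / n
    tr_y = np.trace(Jm @ Kyy @ Jm) / m

    # Cross term: the square roots of the nonzero eigenvalues of C1 C2 are the
    # singular values of the centered cross-Gram matrix, scaled by 1/sqrt(n m).
    singular_values = np.linalg.svd(Jn @ Kxy @ Jm, compute_uv=False)
    cross = singular_values.sum() / np.sqrt(n * m)

    return mean_term + tr_x + tr_y - 2.0 * cross


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(30, 3))
    Y = rng.normal(size=(40, 3)) + 1.0
    print(kernel_w2_squared(X, Y, rbf_kernel))
```

With `linear_kernel`, the result reduces to the classical (non-kernel) Gaussian L2-Wasserstein distance on the raw features, which gives a convenient sanity check. Evaluating the function on every pair of feature sets would yield a distance matrix of the kind that the paper feeds into hierarchical clustering.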