Diffusion Representation for Asymmetric Kernels

Alvaro Almeida Gomez, Antonio Silva Neto, Jorge Zubelli
2024-01-21
Abstract: We extend the diffusion-map formalism to data sets that are induced by asymmetric kernels. Analytical convergence results for the resulting expansion are proved, and an algorithm is proposed to perform the dimensional reduction. We study data sets whose geometric structure is induced by an asymmetric kernel. We use an a priori coordinate system to represent this geometry and thereby improve the computational complexity of reducing the dimensionality of data sets. A coordinate system connected to the tensor product of Fourier bases is used to represent the underlying geometric structure obtained by the diffusion map, thus reducing the dimensionality of the data set and making use of the speedup provided by the two-dimensional Fast Fourier Transform algorithm (2-D FFT). We compare our results with those obtained by other eigenvalue expansions, and verify the efficiency of the algorithms with synthetic data, as well as with real data from applications including climate change studies.
Machine Learning, Image and Video Processing
What problem does this paper attempt to address?
The paper sets out to develop a methodology for dimension reduction of data sets whose geometry is induced by asymmetric kernels. Specifically:

1. **A New Method for Dimension Reduction of Asymmetric Kernel Data**: The paper proposes a framework for dimension reduction of asymmetric kernel data based on an efficient FFT algorithm. The method uses the two-dimensional fast Fourier transform (2-D FFT) to represent the geometric structure obtained by the diffusion map, thereby reducing the dimension of the data set while benefiting from the speedup that the 2-D FFT provides.
2. **Representation of Diffusion Distance for Asymmetric Kernels**: Traditional diffusion-map theory applies mainly to symmetric kernels; this paper extends it to asymmetric kernels. The authors represent the diffusion distance under an asymmetric kernel by expressing the induced diffusion geometry in a tensor product of Fourier bases.
3. **Efficiency Improvement**: Compared with methods based on eigenvalue decomposition, this method has a significant advantage in computing time. For an \(n\times n\) matrix, the complexity of the 2-D FFT is \(O(n^{2}\log n)\), while the complexity of the eigenvector-based representation is \(O(n^{3})\).
4. **Verification in Practical Applications**: The paper verifies the effectiveness and efficiency of the method on synthetic data and on real data (such as meteorological data from climate change research). The experimental results show that the method performs well in handling the geometric features of complex data sets, especially in identifying areas with the largest temperature changes.

In conclusion, the main objective of the paper is to develop an efficient dimension-reduction method applicable to asymmetric kernel data sets in order to improve the speed and accuracy of data processing.
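The idea of representing an asymmetric kernel in a tensor-product Fourier basis can be illustrated with a small NumPy sketch. This is not the authors' algorithm: the kernel form, the `drift` and `eps` parameters, the row normalization, and the rule of keeping the largest-magnitude coefficients are all assumptions chosen for illustration. The sketch only shows that a 2-D FFT of a non-symmetric kernel matrix yields a fixed (a priori) coordinate system in which the matrix can be truncated, without any eigendecomposition.

```python
import numpy as np

def asymmetric_kernel(points, eps=0.5, drift=0.3):
    """Hypothetical asymmetric Gaussian-type kernel (illustrative only).

    The drift term makes k(x, y) != k(y, x).
    """
    diff = points[:, None] - points[None, :]  # diff[i, j] = x_i - x_j
    return np.exp(-((diff + drift) ** 2) / eps)

def markov_normalize(K):
    """Row-normalize K so that every row sums to 1 (assumed normalization)."""
    return K / K.sum(axis=1, keepdims=True)

def truncated_fft2(P, keep):
    """Zero all but (at least) the `keep` largest-magnitude 2-D Fourier coefficients.

    Ties in magnitude at the threshold are all kept, so the number of
    retained coefficients can slightly exceed `keep`.
    """
    coeffs = np.fft.fft2(P)
    magnitudes = np.abs(coeffs)
    threshold = np.partition(magnitudes.ravel(), -keep)[-keep]
    mask = magnitudes >= threshold
    return coeffs * mask, mask

# Synthetic one-dimensional sample points (assumed data).
points = np.linspace(0.0, 1.0, 64)
K = asymmetric_kernel(points)
P = markov_normalize(K)

# Reduced representation: a small subset of Fourier coefficients.
coeffs, mask = truncated_fft2(P, keep=200)
P_approx = np.fft.ifft2(coeffs).real
rel_err = np.linalg.norm(P - P_approx) / np.linalg.norm(P)
print(f"kept {int(mask.sum())} of {P.size} coefficients, relative error {rel_err:.3e}")
```

The transform step costs \(O(n^{2}\log n)\) for an \(n\times n\) matrix, which matches the complexity comparison in item 3. How many coefficients to keep, and which ones, is a design choice that this sketch makes by magnitude alone.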