Isolation Kernel: The X Factor in Efficient and Effective Large Scale Online Kernel Learning

Kai Ming Ting, Jonathan R. Wells, Takashi Washio
DOI: https://doi.org/10.48550/arXiv.1907.01104
2019-09-24
Abstract: Large scale online kernel learning aims to build an efficient and scalable kernel-based predictive model incrementally from a sequence of potentially infinite data points. A current key approach focuses on ways to produce an approximate finite-dimensional feature map, assuming that the kernel used has a feature map with intractable dimensionality, an assumption traditionally held in kernel-based methods. While this approach can deal with large scale datasets efficiently, this outcome is achieved by compromising predictive accuracy because of the approximation. We offer an alternative approach which overrides the assumption and puts the kernel used at the heart of the approach. It focuses on creating an exact, sparse and finite-dimensional feature map of a kernel called Isolation Kernel. Using this new approach, to achieve the above aim of large scale online kernel learning becomes extremely simple: simply use Isolation Kernel instead of a kernel having a feature map with intractable dimensionality. We show that, using Isolation Kernel, large scale online kernel learning can be achieved efficiently without sacrificing accuracy.
Machine Learning
What problem does this paper attempt to address?
This paper addresses the trade-off between efficiency and accuracy in large-scale online kernel learning. Specifically:

1. **Limitations of traditional methods**: Existing large-scale online kernel learning methods rely mainly on two approaches to improve efficiency:
   - **Limiting the number of support vectors**: This approach cannot cope with a potentially unbounded number of support vectors and performs poorly on high-dimensional datasets.
   - **Using an approximate feature map**: Approximating the feature map reduces computational cost, but it also lowers predictive accuracy.
2. **The underlying assumption**: Both approaches assume that the kernel in use has a feature map of intractable dimensionality. Under that assumption, approximation is unavoidable on large-scale data, and accuracy is sacrificed as a result.
3. **Proposed new method**: The paper drops this assumption by using the Isolation Kernel, which has the following characteristics:
   - **Exact, sparse, and finite-dimensional feature map**: This lets online kernel learning handle large-scale data efficiently without sacrificing accuracy.
   - **No approximation required**: Because the feature map of the Isolation Kernel is exact, the accuracy loss caused by approximation is avoided.
4. **Specific contributions**:
   - A new approach to online kernel learning is proposed that does not assume the kernel has a feature map of intractable dimensionality.
   - The Isolation Kernel is shown to have an exact, sparse, and finite-dimensional feature map.
   - The Isolation Kernel is shown to be the key to handling large-scale online kernel learning efficiently while maintaining high accuracy.
   - Experiments with Online Gradient Descent (OGD) and Support Vector Machine (SVM) learners evaluate the Isolation Kernel in practice and demonstrate strong performance on high-dimensional datasets.

In summary, the paper resolves the tension between efficiency and accuracy in large-scale online kernel learning by replacing a kernel whose feature map must be approximated with the Isolation Kernel, whose feature map is exact.
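
To make the idea of an exact, sparse, finite-dimensional feature map more concrete, here is a minimal Python sketch. The summary above does not say how the map is built, so the sketch assumes the nearest-neighbour (Voronoi) partitioning variant of Isolation Kernel: each of `t` partitionings is defined by `psi` points sampled from the data, and a point is encoded by the cell it falls into in each partitioning. The names `IsolationFeatureMap`, `isolation_kernel`, and `ogd_hinge_step`, the parameter values, the hinge-loss update, and the toy data are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


class IsolationFeatureMap:
    """Hypothetical sketch: t random Voronoi partitionings, each with psi cells."""

    def __init__(self, data, t=100, psi=8, seed=0):
        rng = np.random.default_rng(seed)
        self.t, self.psi = t, psi
        # For each partitioning, sample psi distinct data points as cell centres.
        self.centres = np.stack(
            [data[rng.choice(len(data), size=psi, replace=False)] for _ in range(t)]
        )  # shape (t, psi, d)

    def active_indices(self, x):
        """Indices (out of t * psi) of the t entries that equal 1 for point x."""
        # Squared distance from x to every centre in every partitioning: (t, psi).
        d2 = ((self.centres - x) ** 2).sum(axis=2)
        cells = d2.argmin(axis=1)  # nearest centre = Voronoi cell per partitioning
        return np.arange(self.t) * self.psi + cells

    def transform(self, x):
        """Dense binary feature vector of length t * psi with exactly t ones."""
        phi = np.zeros(self.t * self.psi)
        phi[self.active_indices(x)] = 1.0
        return phi


def isolation_kernel(fmap, x, y):
    """Kernel value as the inner product of explicit feature vectors, divided by t."""
    return fmap.transform(x) @ fmap.transform(y) / fmap.t


def ogd_hinge_step(w, fmap, x, label, lr=0.1):
    """One online gradient descent step on the hinge loss; label is +1 or -1."""
    idx = fmap.active_indices(x)
    margin = label * w[idx].sum()  # w . phi(x) only touches t entries
    if margin < 1.0:
        w[idx] += lr * label  # sparse update of the linear model
    return w


# Toy usage on two synthetic 2-D clusters (illustrative data only).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2, 1, (200, 2)), rng.normal(2, 1, (200, 2))])
y = np.array([-1] * 200 + [1] * 200)

fmap = IsolationFeatureMap(X, t=100, psi=8)
w = np.zeros(fmap.t * fmap.psi)
for i in rng.permutation(len(X)):
    w = ogd_hinge_step(w, fmap, X[i], y[i])
```

In this sketch the feature vector has `t * psi` dimensions but only `t` non-zero entries, so both the prediction and the update in `ogd_hinge_step` cost O(t) once the cells are known, and no support vectors need to be stored. Because the map is explicit, the kernel value is obtained exactly from an inner product rather than through an approximation.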