Scalable Nonlinear Mappings for Classifying Large Sparse Data

Xiang Li, Xiao Li
DOI: https://doi.org/10.1007/978-3-030-95408-6_30
2022-01-01
Abstract: Classifying very large sparse data is important in many real-world applications, such as predicting users' gender and profitability from product ratings. Traditionally, such datasets are classified with linear models (logistic regression and linear SVM) for scalability, but predictive accuracy may suffer. Existing applications demonstrate that large sparse datasets often have a low-rank structure. By computing a polynomial approximation to the low-rank data space, previous work [9] developed a kernelized classifier, KARMA, to improve performance on sparse data. However, this method does not scale well to large datasets. In this paper, we develop scalable feature mappings that efficiently approximate the kernels used in KARMA. In experiments, our method inherits the good predictive performance of KARMA on small- to medium-sized data while showing significantly better scalability on large datasets.
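The general idea of replacing a kernel with an explicit feature mapping, so that a linear model can be trained on the mapped data, can be illustrated with a minimal sketch. This is not the mapping proposed in the paper and not the KARMA kernel: it assumes a plain homogeneous degree-2 polynomial kernel, k(x, y) = (x · y)^2, and represents sparse vectors as Python dicts from index to value. The function names `sparse_dot` and `quadratic_feature_map` are hypothetical, chosen only for this illustration.

```python
import math


def sparse_dot(a, b):
    """Dot product of two sparse vectors stored as {index: value} dicts."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter vector
    return sum(v * b[k] for k, v in a.items() if k in b)


def quadratic_feature_map(x):
    """Explicit map phi with phi(x) . phi(y) == (x . y) ** 2.

    Illustrative assumption: a homogeneous degree-2 polynomial kernel,
    not the kernels used in KARMA. Only pairs of nonzero entries are
    materialized, so the output stays sparse.
    """
    items = sorted(x.items())
    phi = {}
    for p, (i, xi) in enumerate(items):
        phi[(i, i)] = xi * xi
        for j, xj in items[p + 1:]:
            # sqrt(2) accounts for the symmetric pair (i, j) and (j, i)
            phi[(i, j)] = math.sqrt(2.0) * xi * xj
    return phi


# Two toy sparse vectors (made-up values, for illustration only)
x = {0: 1.0, 3: 2.0, 7: -1.0}
y = {3: 0.5, 7: 4.0, 9: 1.0}

kernel_value = sparse_dot(x, y) ** 2
mapped_value = sparse_dot(quadratic_feature_map(x), quadratic_feature_map(y))
```

Here the inner product of the mapped vectors matches the kernel value up to floating-point error, so a linear classifier on the mapped features behaves like a kernel classifier. The number of mapped features grows quadratically with the number of nonzero entries in this exact construction, which is the kind of cost that an approximate, lower-dimensional mapping is meant to avoid.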