Model-free Global Likelihood Subsampling for Massive Data

Si-Yu Yi, Yong-Dao Zhou
DOI: https://doi.org/10.1007/s11222-022-10185-0
IF: 2.3241
2022-01-01
Statistics and Computing
Abstract: Most existing studies on subsampling depend heavily on a specified model. If the assumed model is incorrect, the performance of the subsample may be poor. This paper focuses on a model-free subsampling method, called global likelihood subsampling, such that the subsample is robust to different model choices. It leverages the idea of the global likelihood sampler, which is an effective and robust method for sampling from a given continuous distribution. Furthermore, we accelerate the algorithm for large-scale datasets and extend it to deal with high-dimensional data with relatively low computational complexity. Simulations and real data studies are conducted to apply the proposed method to regression and classification problems. The results illustrate that this method is robust against different modeling methods and has promising performance compared with some existing model-free subsampling methods for data compression.