Subsampling under distributional constraints

Florian Combes, Ricardo Fraiman, Badih Ghattas
DOI: https://doi.org/10.1002/sam.11661
2024-02-01
Abstract: Complex models are frequently employed to describe physical and mechanical phenomena. In this setting, we have an input X in a general space and an output Y = f(X), where f is a very complicated function whose evaluation for every new input has a very high computational cost and may also be very expensive. We are given two sets of observations of X, S_1 and S_2, of different sizes, such that only f(S_1) is available. We tackle the problem of selecting a subset S_3 ⊂ S_2 of smaller size on which to run the complex model f, such that the empirical distribution of f(S_3) is close to that of f(S_1). We suggest three algorithms to solve this problem and show their efficiency using simulated datasets and the Airfoil self-noise data set.
computer science, artificial intelligence, interdisciplinary applications, statistics & probability
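
The selection problem stated in the abstract can be made concrete with a small sketch. This is not one of the paper's three algorithms, which the abstract does not describe; it is a naive baseline assumed purely for illustration. It fits a k-nearest-neighbour surrogate on the set whose outputs are known, predicts outputs on the second set, and greedily picks points whose predicted outputs match the quantiles of the known outputs. The toy function `f`, the helper names `knn_predict`, `select_subset`, and `ks_distance`, and the sample sizes are all hypothetical.

```python
import numpy as np

def f(x):
    # Hypothetical stand-in for the expensive model (cheap here for illustration).
    return np.sin(3.0 * x[:, 0]) + x[:, 1] ** 2

def knn_predict(x_train, y_train, x_query, k=5):
    # Surrogate: average the outputs of the k nearest training inputs.
    d = np.linalg.norm(x_query[:, None, :] - x_train[None, :, :], axis=2)
    idx = np.argsort(d, axis=1)[:, :k]
    return y_train[idx].mean(axis=1)

def select_subset(y_target, y_pred, m):
    # Greedy quantile matching: for each of m target quantiles of the known
    # outputs, take the unused candidate whose predicted output is closest.
    targets = np.quantile(y_target, (np.arange(m) + 0.5) / m)
    available = np.ones(len(y_pred), dtype=bool)
    chosen = []
    for t in targets:
        gaps = np.where(available, np.abs(y_pred - t), np.inf)
        j = int(np.argmin(gaps))
        chosen.append(j)
        available[j] = False
    return np.array(chosen)

def ks_distance(a, b):
    # Two-sample Kolmogorov-Smirnov distance between empirical CDFs.
    grid = np.sort(np.concatenate([a, b]))
    cdf_a = np.searchsorted(np.sort(a), grid, side="right") / len(a)
    cdf_b = np.searchsorted(np.sort(b), grid, side="right") / len(b)
    return float(np.max(np.abs(cdf_a - cdf_b)))

rng = np.random.default_rng(0)
s1 = rng.normal(0.0, 0.5, size=(200, 2))     # inputs with known outputs
s2 = rng.uniform(-2.0, 2.0, size=(1000, 2))  # inputs without outputs
y1 = f(s1)

y2_pred = knn_predict(s1, y1, s2)
s3_idx = select_subset(y1, y2_pred, m=50)
y3 = f(s2[s3_idx])  # only these 50 runs of the expensive model are needed

print(ks_distance(y1, y3))
```

The Kolmogorov-Smirnov distance is used here only as one convenient way to measure how close the two empirical output distributions are; the quality of the selection depends heavily on how well the surrogate predicts outputs on the second set.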