Sampling strategies for mining in data-scarce domains

N. Ramakrishnan, C. Bailey-Kellogg
DOI: https://doi.org/10.1109/mcise.2002.1014978
2002-07-01
Abstract: A novel framework leverages physical properties for mining in data-scarce domains. It interleaves bottom-up data mining with top-down data collection, leading to effective and explainable sampling strategies. This article describes focused sampling strategies for mining scientific data. Our approach is based on the Spatial Aggregation Language (SAL), which supports construction of data interpretation and control design applications for spatially distributed physical systems in a bottom-up manner. Used as a basis for describing data mining algorithms, SAL programs also help exploit knowledge of physical properties such as continuity and locality in data fields. We also introduce a top-down sampling strategy that focuses data collection on only those regions deemed most important to support a data mining objective.
computer science, interdisciplinary applications