Knowledge Discovery in Very Large Databases.

Xindong Wu
DOI: https://doi.org/10.1145/568760.568764
2002-01-01
Abstract: Dealing with very large databases is one of the defining challenges in data mining research and development. When a database is not a static repository of data, or when the data come from different sources and putting them all together would amass a huge database for centralized processing, knowledge discovery in such data environments cannot be a one-time process. Existing techniques include data sampling, windowing, bagging, boosting, batch learning, hierarchical meta-learning, and parallel and distributed data mining. This talk will provide a review of these techniques and present our own recent research efforts on multi-layer induction and on synthesizing association rules from different data sources.