Distributed data mining: a survey

Li Zeng, Ling Li, Lian Duan, Kevin Lu, Zhongzhi Shi, Maoguang Wang, Wenjuan Wu, Ping Luo
DOI: https://doi.org/10.1007/s10799-012-0124-y
2012-01-01
Information Technology and Management
Abstract: Most data mining approaches assume that the data can be provided from a single source. When data are produced at many physically distributed locations, as at Wal-Mart, these methods require a data center that gathers the data from those locations. Transmitting large amounts of data to a data center is sometimes expensive or even impractical. Distributed and parallel data mining algorithms were therefore developed to solve this problem. In this paper, we survey the state-of-the-art algorithms and applications in distributed data mining and discuss future research opportunities.