Privacy-Preserving Data Mining For Medical Data: Application Of Data Partition Methods

Yi Peng, Gang Kou, Yong Shi, Zhengxin Chen
DOI: https://doi.org/10.1007/978-3-540-78733-4_20
2008-01-01
Abstract: Medical data mining has been a popular data mining topic of late. Compared with other data mining applications, medical data mining has some unique characteristics. Because medical records relate to human subjects, privacy protection is taken more seriously than in other data mining tasks. This paper applies two data separation techniques, vertical and horizontal partition, to preserve privacy in medical data classification. In the vertical partition approach, each site uses a portion of the attributes to compute its results, and the distributed results are assembled at a central trusted party using a majority-vote ensemble method. In the horizontal partition approach, the data records are distributed among several sites. Each site computes results on its own data, and a central trusted party integrates these results using an ensemble method. We implement these two approaches using medical datasets from the UCI Machine Learning archive and report the experimental results.
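The two partition schemes described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the per-site classifier, so a simple nearest-centroid model is swapped in here as an assumption, and the toy data, the function names, and the assumption that every site can see the class labels are all hypothetical. In both schemes the raw attribute values stay at the sites, and only each site's predicted label is passed to the central party for the majority vote.

```python
from collections import Counter

def train_centroids(rows, labels):
    """Assumed per-site model: the mean feature vector of each class."""
    sums, counts = {}, Counter(labels)
    for row, label in zip(rows, labels):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(model, row):
    """Return the class whose centroid is closest to the row."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(sorted(model), key=lambda label: sq_dist(model[label]))

def majority_vote(votes):
    """Central trusted party: keep the most frequent site prediction."""
    return Counter(votes).most_common(1)[0][0]

def vertical_predict(rows, labels, column_groups, query):
    """Vertical partition: each site sees all records but only some attributes."""
    votes = []
    for cols in column_groups:
        site_rows = [[row[c] for c in cols] for row in rows]
        model = train_centroids(site_rows, labels)
        votes.append(predict(model, [query[c] for c in cols]))
    return majority_vote(votes)

def horizontal_predict(rows, labels, row_groups, query):
    """Horizontal partition: each site sees all attributes but only some records."""
    votes = []
    for indices in row_groups:
        site_rows = [rows[i] for i in indices]
        site_labels = [labels[i] for i in indices]
        model = train_centroids(site_rows, site_labels)
        votes.append(predict(model, query))
    return majority_vote(votes)

# Hypothetical toy data (not from the UCI datasets used in the paper).
rows = [
    [1.0, 1.2, 0.9], [0.8, 1.0, 1.1], [1.1, 0.9, 1.0],
    [3.0, 3.2, 2.9], [2.9, 3.1, 3.0], [3.1, 2.8, 3.2],
]
labels = [0, 0, 0, 1, 1, 1]
column_groups = [[0], [1], [2]]       # three sites, one attribute each
row_groups = [[0, 3], [1, 4], [2, 5]]  # three sites, two records each

vertical_label = vertical_predict(rows, labels, column_groups, [1.0, 1.0, 1.0])
horizontal_label = horizontal_predict(rows, labels, row_groups, [3.0, 3.0, 3.0])
```

An odd number of sites is used in this sketch so that a two-class vote cannot tie. The sketch covers only the data flow; it does not by itself provide any formal privacy guarantee.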