A DPk-medoids Clustering Algorithm with Differential Privacy Protection

Yu GAO, Feng TIAN, Zhen-qiang WU
DOI: https://doi.org/10.3969/j.issn.1673-629X.2017.10.025
2017-01-01
Abstract: Cluster analysis is one of the major research areas in data mining. Because it is well suited to identifying the internal structure of data and to data preprocessing and analysis, it is used in fields such as image processing and pattern recognition. Users' sensitive information may be leaked, however, when organizations that hold large datasets, such as medical companies, use mining tools to extract personal information. Therefore, taking the characteristics of differential privacy into account, a DPk-medoids algorithm based on differential privacy protection is proposed. It uses the Laplace mechanism to add noise to the center points and releases only the noised centers, so that personal privacy and clustering effectiveness can both be ensured to a certain degree. Experimental results on real datasets show that the algorithm can be applied to datasets of different scales and dimensions, and that its effectiveness ratio relative to the original clustering algorithm reaches 0.9 to 1 once the privacy budget reaches a certain value.
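The noise-then-release step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `laplace_noise` and `noisy_medoids`, the `sensitivity` parameter, and the choice to spend the whole privacy budget `epsilon` on a single release are assumptions made for illustration, since the abstract does not specify how the budget is allocated across iterations.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two independent exponential variables with
    rate 1/scale follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_medoids(medoids, epsilon, sensitivity, rng=None):
    """Return a copy of the center points with Laplace noise on every coordinate.

    Assumptions (not taken from the paper): `sensitivity` is the L1
    sensitivity of the released centers, and the full budget `epsilon`
    is spent on this one release.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    scale = sensitivity / epsilon  # smaller budget means larger noise
    return [[x + laplace_noise(scale, rng) for x in m] for m in medoids]


# Hypothetical centers produced by an ordinary k-medoids run.
centers = [[0.2, 0.4], [0.7, 0.9]]
released = noisy_medoids(centers, epsilon=1.0, sensitivity=2.0)
```

Because the noise scale is `sensitivity / epsilon`, a larger privacy budget yields released centers closer to the true ones, which is consistent with the abstract's observation that the effectiveness ratio approaches 1 once the budget is large enough.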