Non-linear Dimensionality Reduction for Privacy-Preserving Data Classification

Khaled Alotaibi, V. J. Rayward-Smith, Wenjia Wang, Beatriz de la Iglesia
DOI: https://doi.org/10.1109/socialcom-passat.2012.76
2012-01-01
Abstract: Many techniques have been proposed to protect the privacy of data outsourced for analysis by external parties. However, most of these techniques distort the underlying data properties and therefore hinder data mining algorithms from discovering patterns. The aim of Privacy-Preserving Data Mining (PPDM) is to generate a data-friendly transformation that maintains both the privacy and the utility of the data. We have proposed a novel privacy-preserving framework based on non-linear dimensionality reduction (i.e. non-metric multidimensional scaling) to perturb the original data. The perturbed data exhibited good utility in terms of distance preservation between objects. This was tested on a clustering task with good results. In this paper, we test our novel PPDM approach on a classification task using a k-Nearest Neighbour (k-NN) classification algorithm. We compare the classification results obtained from the original and the perturbed data and find them to be much the same, particularly for the lower dimensions. We show that, for distance-based classification, our approach preserves the utility of the data while hiding the private details.
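The claim that a distance-based classifier keeps its utility under a perturbation can be illustrated with a small sketch. It is not the authors' framework and does not implement non-metric multidimensional scaling. It only demonstrates the property such a perturbation aims for: k-NN depends solely on the rank order of distances, so any strictly increasing distortion of the distances leaves its predictions unchanged. The toy data, the `monotone` distortion and all function names are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def euclidean(a, b):
    """Plain Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def monotone(d):
    """Hypothetical strictly increasing distortion of a distance value."""
    return math.log1p(d ** 3)


def distorted(a, b):
    """Distance whose values change but whose rank order does not."""
    return monotone(euclidean(a, b))


def knn_predict(query, points, labels, k, dist):
    """Majority vote among the k training points closest to `query`."""
    # Rank training points by distance; ties are broken by index.
    order = sorted(range(len(points)), key=lambda i: (dist(query, points[i]), i))
    votes = Counter(labels[i] for i in order[:k])
    # Most frequent label wins; equal counts are broken by label value.
    return min(votes, key=lambda lab: (-votes[lab], lab))


# Toy data (an assumption): two Gaussian blobs with 30 points each.
rng = random.Random(0)
train = [(rng.gauss(c, 1.0), rng.gauss(c, 1.0)) for c in (0.0, 3.0) for _ in range(30)]
labels = [0] * 30 + [1] * 30
queries = [(rng.uniform(-2, 5), rng.uniform(-2, 5)) for _ in range(20)]

pred_original = [knn_predict(q, train, labels, 5, euclidean) for q in queries]
pred_distorted = [knn_predict(q, train, labels, 5, distorted) for q in queries]

# Reports whether the two sets of predictions agree.
print(pred_original == pred_distorted)
```

In practice a low-dimensional embedding preserves the rank order only approximately, so some disagreement between the two sets of predictions is to be expected. This is consistent with the abstract describing the results as much the same rather than identical.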