Privacy-preserving k-NN interpolation over two encrypted databases

Murat Osmanoglu,Salih Demir,Bulent Tugrul
DOI: https://doi.org/10.7717/peerj-cs.965
2022-05-31
Abstract:Cloud computing enables users to outsource their databases and the computing functionalities to a cloud service provider to avoid the cost of maintaining a private storage and computational requirements. It also provides universal access to data, applications, and services without location dependency. While cloud computing provides many benefits, it possesses a number of security and privacy concerns. Outsourcing data to a cloud service provider in encrypted form may help to overcome these concerns. However, dealing with the encrypted data makes it difficult for the cloud service providers to perform some operations over the data that will especially be required in query processing tasks. Among the techniques employed in query processing task, the k-nearest neighbor method draws attention due to its simplicity and efficiency, particularly on massive data sets. A number of k-nearest neighbor algorithms for query processing task on a single encrypted database have been proposed. However, the performance of k-nearest neighbor algorithms on a single database may create accuracy and reliability problems. It is a fact that collaboration among different cloud service providers yields more accurate and more reliable results in query processing. By considering this fact, we focus on the k-nearest neighbor (k-NN) problem over two encrypted databases. We introduce a secure two-party k-NN interpolation protocol that enables a query owner to extract the interpolation of the k-nearest neighbors of a query point from two different databases outsourced to two different cloud service providers. We also show that our protocol protects the confidentiality of the data and the query point, and hides data access patterns. Furthermore, we conducted a number of experiment to demonstrate the efficiency of our protocol. The results show that the running time of our protocol is linearly dependent on both the number of nearest neighbours and data size.
What problem does this paper attempt to address?