Towards Private and Scalable Cross-Media Retrieval
Shengshan Hu, Leo Yu Zhang, Qian Wang, Zhan Qin, Cong Wang
DOI: https://doi.org/10.1109/tdsc.2019.2926968
2020-01-01
IEEE Transactions on Dependable and Secure Computing
Abstract: Cross-media retrieval (CMR) is an attractive networked application in which a server responds to queries with retrieval results of different modalities. Unlike traditional information retrieval, CMR relies on a richer set of machine learning techniques to produce semantic models that project multimodal data into a common space. A larger training dataset usually yields more accurate models and thus better retrieval results. Although CMR is very promising, with potential underpinnings in network analytics and multimedia applications, applying it in such contexts also faces severe privacy challenges, because the data scattered among multiple parties may be sensitive and may not be shared publicly. Studies that jointly consider cross-media analytics, privacy protection, collaborative learning, and distributed networking contexts are relatively sparse. In this work, we propose the first practical system for privacy-preserving cross-media retrieval by utilizing trusted processors. Our scheme enables secure aggregation of data from distinct parties and secure canonical correlation analysis (CCA) over the aggregated data to obtain semantic models. Verification mechanisms are designed to defend against active attacks from a malicious adversary. Furthermore, to deal with large datasets, we provide a set of optimization methods to accommodate limited trusted memory and improve the efficiency of the training process in CMR. We consider issues such as data block splitting to manage memory overhead, ordering of operations as well as parameter reuse and release to simplify I/O, and parallel computation to speed up dual operations. Our experiments over both synthetic and real datasets show that our solution is very efficient in practice, outperforms the existing solutions, and performs comparably with the original CMR system.
computer science, information systems, software engineering, hardware & architecture