VPPLR: Privacy-preserving logistic regression on vertically partitioned data using vectorization sharing

Yuhao Zhang, Min Tang
DOI: https://doi.org/10.1016/j.jisa.2024.103725
IF: 4.96
2024-05-01
Journal of Information Security and Applications
Abstract: Building high-precision machine learning models relies on large-scale data collection from IoT devices. Because the training data contains sensitive user information, a privacy-preserving machine learning (PPML) paradigm is needed to obtain reliable models without leaking data. Owing to the diversity of IoT devices, data collected from multiple sources often has different attributes, that is, the data is vertically partitioned. PPML in vertical settings (abbreviated VPPML) is more challenging than the horizontal case, since VPPML requires both secure feature integration and sample aggregation. Most existing VPPML approaches require multi-round interactions among multiple users and servers and therefore incur high computation and communication overheads. To close this gap, we propose VPPLR, an efficient framework for secure training of logistic regression. We employ a well-designed structure to reformulate the gradient update rules, ensuring that any user can go offline after uploading its rearranged local data to the cloud server in secret-shared form. In the training phase, we introduce a vectorization approach that completes the global parameter update in the plain domain over a common channel, thereby outperforming approaches that depend on homomorphic encryption in efficiency. We illustrate the performance of VPPLR in developing classifiers on private IoT data in vertically partitioned scenarios by comparing it with advanced PPLR methods.
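The idea of uploading vertically partitioned data in secret-shared form can be illustrated with a minimal sketch. It is not the VPPLR protocol: it assumes generic two-party additive secret sharing with fixed-point encoding, two non-colluding share holders, and a publicly known weight vector, all of which are simplifications made here for illustration. Every name in it (MODULUS, SCALE, share_matrix, local_dot, and so on) is hypothetical. It shows only that each user can split its feature columns into shares once, and that the linear part of logistic regression can then be evaluated on the shares without the users taking part.

```python
import random

MODULUS = 2**61 - 1   # assumed public modulus for additive shares
SCALE = 10**4         # assumed fixed-point scale for real values


def encode(x):
    """Map a real number to a fixed-point residue modulo MODULUS."""
    return round(x * SCALE) % MODULUS


def decode(v, scale=SCALE):
    """Map a residue back to a signed real number."""
    if v > MODULUS // 2:
        v -= MODULUS
    return v / scale


def share(value, rng):
    """Split one residue into two additive shares that sum to it."""
    r = rng.randrange(MODULUS)
    return r, (value - r) % MODULUS


def share_matrix(rows, rng):
    """Secret-share every entry of a user's local feature block."""
    s0, s1 = [], []
    for row in rows:
        pairs = [share(encode(x), rng) for x in row]
        s0.append([p[0] for p in pairs])
        s1.append([p[1] for p in pairs])
    return s0, s1


def hstack(a, b):
    """Join the feature columns of two users sample by sample."""
    return [ra + rb for ra, rb in zip(a, b)]


def local_dot(share_rows, w_enc):
    """Each share holder computes its share of X @ w on its own."""
    return [sum(x * w for x, w in zip(row, w_enc)) % MODULUS
            for row in share_rows]


rng = random.Random(0)  # seeded only to make the sketch reproducible
user_a = [[1.5, -2.0], [0.25, 4.0]]  # user A: features 0-1 of two samples
user_b = [[3.0], [-1.0]]             # user B: feature 2 of the same samples

# Each user uploads one share to each holder and can then go offline.
a0, a1 = share_matrix(user_a, rng)
b0, b1 = share_matrix(user_b, rng)
holder0 = hstack(a0, b0)
holder1 = hstack(a1, b1)

w = [0.5, -1.0, 2.0]                 # public weights (a simplification)
w_enc = [encode(v) for v in w]
z0 = local_dot(holder0, w_enc)
z1 = local_dot(holder1, w_enc)

# Adding the two result shares recovers X @ w, scaled by SCALE squared.
z = [decode((p + q) % MODULUS, SCALE * SCALE) for p, q in zip(z0, z1)]
```

Because additive shares are linear, the reconstructed z equals the product of the joined feature matrix and w, while each holder alone sees only uniformly random residues. The nonlinear sigmoid and the gradient update are outside this sketch.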
computer science, information systems