SecureMLDebugger: A Privacy-Preserving Machine Learning Debugging Tool.

Peiyi Han, Chaozheng Wang, Chuanyi Liu, Shaoming Duan, Hezhong Pan, Pengshuai Luo
DOI: https://doi.org/10.1109/dsc50466.2020.00027
2020-01-01
Abstract: Data privacy is uniquely challenging in machine learning, which requires large datasets. Privacy-preserving machine learning methods that isolate model training from data scientists have become a hot topic in recent years. To protect data privacy, the training data is completely isolated from data scientists. Although this approach protects data privacy, data scientists cannot observe any training information from the data nodes during training, which makes it difficult to debug machine learning models. Existing works provide data collection APIs that collect and display metadata during training to help data scientists debug their models. However, malicious data scientists can obtain private data through these APIs. In this paper, we propose SecureMLDebugger (SMLD), a novel secure machine learning debugging tool based on a non-intrusive metadata collection scheme, which automatically collects, stores, and manages non-private metadata during training without exposing any data collection API. Our tool speeds up users' machine learning experiments while protecting data privacy. We achieve this by transparently tracing each function call in the machine learning code and automatically extracting metadata such as model hyperparameters, training runs, evaluations, and neural network layouts. SMLD is integrated with popular frameworks such as scikit-learn and PyTorch, and meets the demands of various privacy-preserving training scenarios in practice.
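The idea of tracing function calls to extract metadata without a collection API can be illustrated with a minimal Python sketch. This is not SMLD's implementation, which the abstract does not describe in detail. It assumes tracing is done with the standard `sys.setprofile` hook and, as a deliberately crude stand-in for a real privacy policy, treats only scalar arguments as non-private metadata. The names `collect_metadata`, `ToyModel`, and `train` are hypothetical.

```python
import sys

# Assumption: only plain scalar arguments count as non-private metadata.
SCALAR_TYPES = (int, float, str, bool, type(None))


def collect_metadata(func, *args, **kwargs):
    """Run func and record the scalar arguments of every Python-level call."""
    records = []

    def profiler(frame, event, arg):
        if event != "call":  # ignore returns and calls into C functions
            return
        code = frame.f_code
        # Positional and keyword-only parameter names come first in co_varnames.
        n_params = code.co_argcount + code.co_kwonlyargcount
        scalars = {
            name: frame.f_locals[name]
            for name in code.co_varnames[:n_params]
            if name in frame.f_locals
            and isinstance(frame.f_locals[name], SCALAR_TYPES)
        }
        if scalars:
            records.append((code.co_name, scalars))

    sys.setprofile(profiler)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.setprofile(None)  # always remove the hook, even on errors
    return result, records


class ToyModel:
    """Hypothetical stand-in for a framework model; contains no tracing code."""

    def __init__(self, learning_rate=0.1, epochs=3):
        self.learning_rate = learning_rate
        self.epochs = epochs
        self.weight = 0.0

    def fit(self, xs, ys):
        for _ in range(self.epochs):
            for x, y in zip(xs, ys):
                self.weight += self.learning_rate * (y - self.weight * x) * x
        return self


def train(xs, ys):
    # The training script calls no logging or collection API.
    return ToyModel(learning_rate=0.05, epochs=2).fit(xs, ys)


model, records = collect_metadata(train, [1.0, 2.0], [2.0, 4.0])
print(records)
```

In this sketch the hyperparameters passed to `ToyModel.__init__` are captured, while the training lists passed to `train` and `fit` are skipped because they are not scalars. A real tool would need a much stricter rule for deciding what is non-private, since scalar values can also leak information.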