STDA: Secure Time Series Data Analytics with Practical Efficiency in Wide-Area Network

Xiaoguo Li, Zixi Huang, Bowen Zhao, Guomin Yang, Tao Xiang, Robert H. Deng
DOI: https://doi.org/10.1109/tifs.2023.3336512
IF: 7.231
2024-01-01
IEEE Transactions on Information Forensics and Security
Abstract: Time series data analytics technology significantly benefits modern scientific research, especially in fields such as medical health, financial investment, and transportation. Unfortunately, privacy issues hinder people from handing over data to a third party for analytical tasks, because the data may reveal sensitive individual information, e.g., disease information from medical data, investment tendencies from financial data, or daily trajectories from transportation data. To break down this barrier, secure computation approaches have shown their importance in processing sensitive data and have attracted much attention from industry and the research community. However, for secure time series data analytics (e.g., DTW similarity), we are still far from achieving high efficiency, due to either high round complexity in communication or expensive computation. We observe that DTW involves many comparison operations and that existing approaches to secure comparison incur high communication costs. To this end, this paper studies secure DTW-based analytics with practical efficiency over time series data. Specifically, we propose a framework for secure time series data analytics (STDA) and formulate the problem of top-k queries over outsourced time series data. Based on threshold Paillier encryption, we present a top-k query protocol that uses the DTW distance as its metric, together with its security analysis, optimizations, and performance evaluation. The experimental results demonstrate that, in a wide-area network with 10 ms latency, our top-k approach outperforms the state of the art by 3x, while the DTW calculation outperforms it by 9x. Correspondingly, the optimized F DTW achieves a 17x improvement, and the optimized top-k achieves a 4-10x improvement.
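To illustrate why DTW is comparison-heavy, the following is a minimal plaintext sketch of DTW distance and a top-k query. It is not the paper's secure protocol: nothing here is encrypted, and the squared-difference cost, the function names `dtw_distance` and `top_k`, and the comparison counter are assumptions made for illustration only. Each cell of the dynamic-programming table needs one three-way minimum, so two series of lengths n and m require n*m such minimum operations, which is the step a secure protocol would have to evaluate over ciphertexts.

```python
import heapq


def dtw_distance(a, b):
    """Plaintext DTW between two numeric sequences.

    Returns (distance, min_ops), where min_ops counts the three-way
    minimum operations performed (one per table cell).
    Assumption: squared difference is used as the local cost.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    prev = [inf] * (m + 1)  # row i-1 of the DP table
    prev[0] = 0.0           # empty prefix aligned with empty prefix
    min_ops = 0
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)  # row i; cur[0] stays infinite
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            # Three-way minimum over the insertion, deletion, and match cells.
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
            min_ops += 1
        prev = cur
    return prev[m], min_ops


def top_k(query, database, k):
    """Return the k (distance, index) pairs with the smallest DTW distance."""
    scored = [(dtw_distance(query, s)[0], idx) for idx, s in enumerate(database)]
    return heapq.nsmallest(k, scored)


if __name__ == "__main__":
    db = [[1, 2, 3], [5, 5, 5], [1, 2, 4]]
    print(top_k([1, 2, 3], db, k=2))
```

In a secure setting both the per-cell minimum and the final ranking reduce to comparisons on protected values, which is where the abstract locates the communication cost.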