PrivShape: Extracting Shapes in Time Series under User-Level Local Differential Privacy

Yulian Mao, Qingqing Ye, Haibo Hu, Qi Wang, Kai Huang
2024-04-05
Abstract: Time series have numerous applications in finance, healthcare, IoT, and smart cities. In many of these applications, time series typically contain personal data, so privacy infringement may occur if they are released directly to the public. Recently, local differential privacy (LDP) has emerged as the state-of-the-art approach to protecting data privacy. However, existing works on LDP-based collections cannot preserve the shape of time series. A recent work, PatternLDP, attempts to address this problem, but it can only protect a finite group of elements in a time series due to its $\omega$-event level privacy guarantee. In this paper, we propose PrivShape, a trie-based mechanism under user-level LDP to protect all elements. PrivShape first transforms a time series to reduce its length, and then adopts trie-expansion and two-level refinement to improve utility. By extensive experiments on real-world datasets, we demonstrate that PrivShape outperforms PatternLDP when adapted for offline use, and can effectively extract frequent shapes.
Cryptography and Security
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper addresses the problem of extracting time-series shapes under user-level local differential privacy (LDP). Time-series data are widely used in finance, healthcare, the Internet of Things (IoT), and smart cities. However, the time series in these applications usually contain personal data, so releasing them directly may leak private information. Existing LDP-based data collection methods cannot preserve the shape of a time series. PatternLDP attempts to solve this problem, but it protects only a limited set of elements in a time series because its privacy guarantee holds at the $\omega$-event level. Under the stricter user-level guarantee, the performance of PatternLDP drops significantly: a single privacy budget $\epsilon$ must cover the entire time series, so each selected element receives only a very small share of the budget and the original shape is severely distorted. To solve these problems, the paper proposes PrivShape, a trie-based mechanism that protects all elements under user-level LDP and improves utility through trie expansion and two-level refinement. Experimental results show that PrivShape outperforms PatternLDP adapted for offline use and can effectively extract frequent shapes.

### Main contributions

1. **Propose PrivShape**: This is the first mechanism to extract time-series shapes under user-level local differential privacy.
2. **Optimization strategies**: Two optimization strategies, trie-expansion pruning and two-level refinement, are designed to use the privacy budget efficiently.
3. **Experimental verification**: Extensive experiments on two benchmark datasets verify the effectiveness of PrivShape and show that it significantly outperforms existing mechanisms.
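The budget-dilution argument in the problem statement can be made concrete with a small calculation. The sketch below assumes an even split of the user-level budget across the perturbed elements (sequential composition) and uses the noise scale of the Laplace mechanism purely as an illustration. Neither assumption is taken from PatternLDP or PrivShape, and the function names are hypothetical.

```python
import math


def per_element_budget(total_epsilon: float, num_elements: int) -> float:
    """Evenly split one user-level budget over the perturbed elements.

    Assumption: sequential composition with an even split, so each element
    is perturbed with total_epsilon / num_elements.
    """
    if total_epsilon <= 0 or num_elements <= 0:
        raise ValueError("budget and element count must be positive")
    return total_epsilon / num_elements


def laplace_scale(epsilon: float, sensitivity: float = 1.0) -> float:
    """Noise scale b = sensitivity / epsilon of the Laplace mechanism.

    Used here only to show how noise grows as the budget shrinks.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return sensitivity / epsilon


if __name__ == "__main__":
    total_epsilon = 1.0
    for k in (1, 10, 100):
        eps_k = per_element_budget(total_epsilon, k)
        print(f"elements={k:4d}  per-element eps={eps_k:.4f}  "
              f"Laplace scale={laplace_scale(eps_k):.1f}")
```

Under these assumptions the noise scale grows linearly with the number of perturbed elements, which is one way to see why shortening the sequence before perturbation helps utility.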
### Key technical points

- **Compressive SAX**: Reduces the number of elements in a time series while preserving shape information.
- **Trie structure**: Generates candidate shapes; pruning and refinement strategies improve utility.
- **Exponential Mechanism**: Performs the privacy-preserving selection on the user side.

### Formula summary

- Definition of $\epsilon$-LDP: for any two inputs $v$, $v'$ and any output $v^*$, the mechanism $A$ satisfies
  \[ \Pr(A(v)=v^*) \leq e^\epsilon\times\Pr(A(v') = v^*) \]
  where $\epsilon$ is the privacy budget.
- Probability formula of the Exponential Mechanism:
  \[ \Pr[\Psi(S_{ui}) = F_{cj}]=\frac{\exp\left(\frac{\epsilon\, S(S_{ui}, F_{cj})}{2\Delta}\right)}{\sum_{F_{cz}\in F_c}\exp\left(\frac{\epsilon\, S(S_{ui}, F_{cz})}{2\Delta}\right)} \]
  where $S(\cdot,\cdot)$ is the score function, $\Delta$ is its sensitivity, $S_{ui}$ is the user's input, and $F_c$ is the candidate set.

Through these techniques and strategies, PrivShape can effectively extract shape features in time series while strictly protecting user privacy.
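As an illustration of the user-side selection step, the sketch below implements the standard exponential mechanism, in which each candidate is chosen with probability proportional to $\exp(\epsilon \cdot \text{score} / (2\Delta))$. The score function `toy_score`, the string encoding of shapes, and all function names are hypothetical stand-ins chosen for this example; they are not the score function or data representation used by PrivShape.

```python
import math
import random


def exponential_mechanism_probs(scores, epsilon, sensitivity=1.0):
    """Selection probabilities proportional to exp(eps * score / (2 * sensitivity))."""
    if not scores:
        raise ValueError("scores must be non-empty")
    top = max(scores)  # shift by the max for numerical stability; ratios are unchanged
    weights = [math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def toy_score(user_seq, candidate):
    """Hypothetical score: fraction of aligned positions that match.

    The value lies in [0, 1], so the sensitivity is at most 1.
    """
    n = max(len(user_seq), len(candidate))
    if n == 0:
        return 1.0
    return sum(a == b for a, b in zip(user_seq, candidate)) / n


def select_candidate(user_seq, candidates, epsilon, rng=None):
    """Privately pick one candidate shape for the user's sequence."""
    rng = rng or random.Random()
    scores = [toy_score(user_seq, c) for c in candidates]
    probs = exponential_mechanism_probs(scores, epsilon, sensitivity=1.0)
    return rng.choices(candidates, weights=probs, k=1)[0]


if __name__ == "__main__":
    candidates = ["abc", "acb", "cba"]
    print(exponential_mechanism_probs(
        [toy_score("abc", c) for c in candidates], epsilon=2.0))
    print(select_candidate("abc", candidates, epsilon=2.0, rng=random.Random(0)))
```

Because the score is bounded in [0, 1], changing the user's sequence shifts the numerator and the normalizing sum by at most a factor of $e^{\epsilon/2}$ each, which gives the $e^\epsilon$ bound required by $\epsilon$-LDP.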