Abstract: Differential privacy is the de facto privacy standard in data analysis. The classic model of differential privacy considers the data to be static. The dynamic setting, called differential privacy under continual observation, captures many applications more realistically. In this work we consider several natural dynamic data structure problems under continual observation, where we want to maintain information about a changing data set such that we can answer certain sets of queries at any given time while satisfying $\epsilon$-differential privacy. The problems we consider include (a) maintaining a histogram and various extensions of histogram queries such as quantile queries, (b) maintaining a predecessor search data structure of a dynamically changing set in a given ordered universe, and (c) maintaining the cardinality of a dynamically changing set. For (a), we give new error bounds parameterized by $c_{\max}$, the maximum output of any query: our algorithm gives an upper bound of $O(d\log^2 dc_{\max}+\log T)$ for computing the histogram, the maximum and minimum column sums, quantiles on the column sums, and related queries. The bound holds for unknown $c_{\max}$ and $T$. For (b), we give a general reduction to orthogonal range counting. Further, we give an improvement for the case where only insertions are allowed. We obtain a data structure that, for a given query, returns an interval that contains the predecessor and at most $O(\log^2 u \sqrt{\log T})$ more elements, where $u$ is the size of the universe. The bound holds for unknown $T$. Lastly, for (c), we give a parameterized upper bound of $O(\min(d,\sqrt{K\log T}))$, where $K$ is an upper bound on the number of updates, and we show a matching lower bound. We also show how to extend the bound for (c) to unknown $K$ and $T$.

Extracting Sparse Data Via Histogram Queries

Data Extraction Via Histogram and Arithmetic Mean Queries: Fundamental Limits and Algorithms

Differentially Private Histogram Publication for Dynamic Datasets: an Adaptive Sampling Approach

Image Feature Extraction Method Based On Sparse Coding Framework Of Hypergraph

Efficient data gathering using Compressed Sparse Functions

Selectivity Estimation by Batch-Query Based Histogram and Parametric Method

Multi-resolution Algorithms for Building Spatial Histograms

Efficient Histogram-Based Similarity Search In Ultra-High Dimensional Space

Towards answering analytical query over hierarchical histogram under untrusted servers

SUM-optimal Histograms for Approximate Query Processing

LHist: Towards Learning Multi-dimensional Histogram for Massive Spatial Data

On Linear-Spline Based Histograms

Efficiently Collecting Histograms over RFID Tags

Histogram Publication over Numerical Values under Local Differential Privacy

Summarizing spatial relations – a hybrid histogram

Simultaneous Estimation of Number of Clusters and Feature Sparsity in Clustering High-Dimensional Data

Personalized Privacy-Preserving Data Aggregation for Histogram Estimation

Federated Heavy Hitter Recovery under Linear Sketching

Differentially Private Histogram, Predecessor, and Set Cardinality under Continual Observation

On the Number of Graphs with a Given Histogram

Query Sampling Based High Dimensional Hybrid Index