Private Set Intersection: A Multi-Message Symmetric Private Information Retrieval Perspective
Zhusheng Wang, Karim Banawan, Sennur Ulukus
DOI: https://doi.org/10.1109/tit.2021.3125006
IF: 2.5
2022-03-01
IEEE Transactions on Information Theory
Abstract: We study the problem of private set intersection (PSI). In this problem, there are two entities $E_i$, for $i = 1, 2$, each storing a set $\mathcal{P}_i$, whose elements are picked from a finite set $\mathbb{S}_K$, on $N_i$ replicated and non-colluding databases. It is required to determine the set intersection $\mathcal{P}_1 \cap \mathcal{P}_2$ without leaking any information about the remaining elements to the other entity, and to do this with the least amount of downloaded bits. We first show that the PSI problem can be recast as a multi-message symmetric private information retrieval (MM-SPIR) problem with certain added restrictions. Next, as a stand-alone result, we derive the information-theoretic sum capacity of MM-SPIR, $C_{\text{MM-SPIR}}$. We show that with $K$ messages, $N$ databases, and a given size of the desired message set $P$, the exact capacity of MM-SPIR is $C_{\text{MM-SPIR}} = 1 - \frac{1}{N}$ when $P \leq K-1$, provided that the entropy of the common randomness $S$ satisfies $H(S) \geq \frac{P}{N-1}$ per desired symbol. When $P = K$, the MM-SPIR capacity is trivially 1 without the need for any common randomness $S$. This result implies that there is no gain for MM-SPIR over successive single-message SPIR (SM-SPIR). For the MM-SPIR problem, we present a novel capacity-achieving scheme which builds seamlessly over the near-optimal scheme of Banawan-Ulukus originally proposed for the multi-message PIR (MM-PIR) problem without any database privacy constraints. Surprisingly, our scheme here is exactly optimal for the MM-SPIR problem for any $P$, in contrast to the scheme for the MM-PIR problem, which was proved only to be near-optimal. Our scheme is an alternative to the successive usage of the SM-SPIR scheme of Sun-Jafar.
Based on this capacity result for the MM-SPIR problem, and after addressing the added requirements in its conversion to the PSI problem, we show that the optimal download cost for the PSI problem is given by $\min \left\{ \left\lceil \frac{P_1 N_2}{N_2 - 1} \right\rceil, \left\lceil \frac{P_2 N_1}{N_1 - 1} \right\rceil \right\}$, where $P_i$ is the cardinality of set $\mathcal{P}_i$.
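The closed-form expressions stated in the abstract can be evaluated numerically. The following is a minimal Python sketch, not code from the paper: the function names (`mm_spir_capacity`, `min_common_randomness`, `psi_download_cost`) are hypothetical, and the requirement that every database count be at least 2 is an assumption made here so that the denominators $N - 1$ are nonzero. Exact rational arithmetic is used to avoid floating-point rounding in the ceiling.

```python
from fractions import Fraction


def _check_databases(n):
    # Assumption of this sketch: at least 2 databases, so N - 1 > 0.
    if n < 2:
        raise ValueError("number of databases must be at least 2")


def mm_spir_capacity(K, N, P):
    """MM-SPIR sum capacity as stated in the abstract:
    1 - 1/N when P <= K - 1, and 1 when P == K."""
    _check_databases(N)
    if not 1 <= P <= K:
        raise ValueError("need 1 <= P <= K")
    if P == K:
        return Fraction(1)
    return 1 - Fraction(1, N)


def min_common_randomness(K, N, P):
    """Smallest H(S) per desired symbol quoted in the abstract:
    P / (N - 1) when P <= K - 1, and 0 when P == K."""
    _check_databases(N)
    if not 1 <= P <= K:
        raise ValueError("need 1 <= P <= K")
    if P == K:
        return Fraction(0)
    return Fraction(P, N - 1)


def psi_download_cost(P1, N1, P2, N2):
    """Optimal PSI download cost from the abstract:
    min(ceil(P1*N2/(N2-1)), ceil(P2*N1/(N1-1)))."""
    _check_databases(N1)
    _check_databases(N2)
    if P1 < 0 or P2 < 0:
        raise ValueError("set sizes must be non-negative")

    def ceil_div(a, b):
        # Integer ceiling division without floating point.
        return -(-a // b)

    cost_via_e2 = ceil_div(P1 * N2, N2 - 1)  # uses the N2 databases of E_2
    cost_via_e1 = ceil_div(P2 * N1, N1 - 1)  # uses the N1 databases of E_1
    return min(cost_via_e2, cost_via_e1)


if __name__ == "__main__":
    print(mm_spir_capacity(K=10, N=4, P=3))
    print(min_common_randomness(K=10, N=4, P=3))
    print(psi_download_cost(P1=3, N1=2, P2=5, N2=4))
```

For instance, with $P_1 = 3$, $N_1 = 2$, $P_2 = 5$, $N_2 = 4$, the two candidate costs are $\lceil 12/3 \rceil = 4$ and $\lceil 10/1 \rceil = 10$, so the formula gives a download cost of 4.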
computer science, information systems; engineering, electrical & electronic