Sparse Recovery with Shuffled Labels: Statistical Limits and Practical Estimators

Hang Zhang, Ping Li
2023-03-21
Abstract: This paper considers sparse recovery with shuffled labels, i.e., $\mathbf{y} = \boldsymbol{\Pi}^{\natural} \mathbf{X} \boldsymbol{\beta}^{\natural} + \mathbf{w}$, where $\mathbf{y} \in \mathbb{R}^n$, $\boldsymbol{\Pi}^{\natural} \in \mathbb{R}^{n\times n}$, $\mathbf{X} \in \mathbb{R}^{n\times p}$, $\boldsymbol{\beta}^{\natural} \in \mathbb{R}^p$, and $\mathbf{w} \in \mathbb{R}^n$ denote the sensing result, the unknown permutation matrix, the design matrix, the sparse signal, and the additive noise, respectively. Our goal is to reconstruct both the permutation matrix $\boldsymbol{\Pi}^{\natural}$ and the sparse signal $\boldsymbol{\beta}^{\natural}$. We investigate this problem from both the statistical and the computational perspective. On the statistical side, we first establish minimax lower bounds on the sample number $n$ and the \emph{signal-to-noise ratio} ($\mathsf{SNR}$) required for correct recovery of the permutation matrix $\boldsymbol{\Pi}^{\natural}$ and the support set $\mathrm{supp}(\boldsymbol{\beta}^{\natural})$; specifically, $n \gtrsim k\log p$ and $\log \mathsf{SNR} \gtrsim \log n + \frac{k\log p}{n}$, where $k$ is the sparsity level of $\boldsymbol{\beta}^{\natural}$. We then confirm the tightness of these lower bounds by presenting an exhaustive-search-based estimator whose performance matches them up to multiplicative constants. On the computational side, we impose a parsimony assumption on the number of permuted rows and propose a computationally efficient estimator accordingly. We show that the proposed estimator recovers the ground truth $(\boldsymbol{\Pi}^{\natural}, \mathrm{supp}(\boldsymbol{\beta}^{\natural}))$ under mild conditions, and we provide numerical experiments to corroborate our claims.
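To make the observation model in the abstract concrete, the following is a minimal simulation sketch, not the authors' code. It assumes a standard Gaussian design, a $k$-sparse signal with $\pm 1$ nonzero entries, a uniformly random permutation, and the convention SNR = (squared norm of the signal) / (noise variance). None of these choices is stated in the abstract, and the function name `simulate_shuffled_sparse` is hypothetical.

```python
import numpy as np


def simulate_shuffled_sparse(n, p, k, snr, seed=0):
    """Draw one instance of y = Pi X beta + w (illustrative assumptions only)."""
    rng = np.random.default_rng(seed)

    # Assumed design: i.i.d. standard Gaussian entries.
    X = rng.standard_normal((n, p))

    # Assumed signal: k nonzero entries, each +1 or -1, on a random support.
    beta = np.zeros(p)
    support = rng.choice(p, size=k, replace=False)
    beta[support] = rng.choice([-1.0, 1.0], size=k)

    # Permutation matrix: row i of (Pi @ X) equals row perm[i] of X.
    perm = rng.permutation(n)
    Pi = np.eye(n)[perm]

    # Assumed SNR convention: snr = ||beta||^2 / sigma^2.
    sigma = np.linalg.norm(beta) / np.sqrt(snr)
    w = sigma * rng.standard_normal(n)

    y = Pi @ X @ beta + w
    return y, X, beta, Pi, np.sort(support)


if __name__ == "__main__":
    y, X, beta, Pi, support = simulate_shuffled_sparse(n=50, p=200, k=5, snr=100.0)
    print("y shape:", y.shape, "| support:", support)
```

An estimator for this problem would take only `y` and `X` as input and try to return `Pi` and `support`. The abstract's conditions $n \gtrsim k\log p$ and $\log \mathsf{SNR} \gtrsim \log n + \frac{k\log p}{n}$ indicate the scale of `n` and `snr` at which that is possible, up to unspecified constants.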
Information Theory, Machine Learning