Tight Bounds for $\ell_p$ Oblivious Subspace Embeddings

Ruosong Wang, David P. Woodruff
2018-01-01
Abstract: An $\ell_p$ oblivious subspace embedding is a distribution over $r \times n$ matrices $\Pi$ such that for any fixed $n \times d$ matrix $A$, $\Pr_{\Pi}[\text{for all } x,\ \|Ax\|_p \le \|\Pi Ax\|_p \le \kappa \|Ax\|_p] \ge 9/10$, where $r$ is the dimension of the embedding, $\kappa$ is the distortion of the embedding, and for an $n$-dimensional vector $y$, $\|y\|_p = \left(\sum_{i=1}^{n} |y_i|^p\right)^{1/p}$ is the $\ell_p$-norm. Another important property is the sparsity of $\Pi$, that is, the maximum number of non-zero entries per column, as this determines the running time of computing $\Pi \cdot A$. While for $p = 2$ there are nearly optimal tradeoffs in terms of the dimension, distortion, and sparsity, for the important case of $1 \le p < 2$, much less was known. In this paper we obtain nearly optimal tradeoffs for $\ell_p$ oblivious subspace embeddings for every $1 \le p < 2$. Our main results are as follows:
1. We show for every $1 \le p < 2$, any oblivious subspace embedding with dimension $r$ has distortion $\kappa = \Omega\left(\frac{1}{(1/d)^{1/p} \cdot \log^{2/p} r + (r/n)^{1/p - 1/2}}\right)$. When $r = \mathrm{poly}(d) \ll n$ in applications, this gives a $\kappa = \Omega(d^{1/p} \log^{-2/p} d)$ lower bound, and shows the oblivious subspace embedding of Sohler and Woodruff (STOC, 2011) for $p = 1$ and the oblivious subspace embedding of Meng and Mahoney (STOC, 2013) for $1 < p < 2$ are optimal up to $\mathrm{poly}(\log d)$ factors.
2. We give sparse oblivious subspace embeddings for every $1 \le p < 2$ which are optimal in dimension and distortion, up to $\mathrm{poly}(\log d)$ factors. Importantly for $p = 1$, we achieve $r = O(d \log d)$, $\kappa = O(d \log d)$ and $s = O(\log d)$ non-zero entries per column. The best previous construction with $s \le \mathrm{poly}(\log d)$ is due to Woodruff and Zhang (COLT, 2013), giving $\kappa = \Omega(d\,\mathrm{poly}(\log d))$ or $\kappa = \Omega(d \sqrt{\log n} \cdot \mathrm{poly}(\log d))$ and $r \ge d \cdot \mathrm{poly}(\log d)$; in contrast our $r = O(d \log d)$ and $\kappa = O(d \log d)$ are optimal up to $\mathrm{poly}(\log d)$ factors even for dense matrices.
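As a worked step (a sketch for orientation, not taken from the paper's proofs), the general lower bound in item 1 can be specialized to the regime $r = \mathrm{poly}(d) \ll n$. The assumption made here is that $n$ is large enough for the second term in the denominator to be dominated by the first; this is possible because $1/p - 1/2 > 0$ for $p < 2$, so that term shrinks as $n$ grows.

```latex
\begin{align*}
\kappa
  &= \Omega\!\left(\frac{1}{(1/d)^{1/p}\,\log^{2/p} r + (r/n)^{1/p-1/2}}\right) \\
  &= \Omega\!\left(\frac{1}{d^{-1/p}\,\log^{2/p} d}\right)
     && \text{using } \log r = \Theta(\log d) \text{ and } (r/n)^{1/p-1/2} \le d^{-1/p}\log^{2/p} d \\
  &= \Omega\!\left(d^{1/p}\,\log^{-2/p} d\right).
\end{align*}
```

For $p = 1$ this reads $\kappa = \Omega(d / \log^2 d)$, which is within $\mathrm{poly}(\log d)$ factors of the $\kappa = O(d \log d)$ upper bound stated in item 2.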
We also give (1) nearly-optimal $\ell_p$ oblivious subspace embeddings with an expected $1+\varepsilon$ number of non-zero entries per column for arbitrarily small $\varepsilon > 0$, and (2) the first oblivious subspace embeddings for $1 \le p < 2$ with $O(1)$-distortion and dimension independent of $n$. Oblivious subspace embeddings are crucial for distributed and streaming environments, as well as entrywise $\ell_p$ low rank approximation. Our results give improved algorithms for these applications.
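To make the three quantities in the abstract concrete (the $\ell_p$-norm, the sparsity $s$, and the distortion $\kappa$), here is a minimal Python sketch. It is illustrative only: `sparse_sign_matrix` is a hypothetical stand-in for $\Pi$ and is not the construction from the paper, and all function names are assumptions introduced for this example. `empirical_distortion` samples random directions $x$, so it can only underestimate the true distortion, which is a supremum over all $x$.

```python
import numpy as np


def lp_norm(y, p):
    """l_p norm (sum_i |y_i|^p)^(1/p) of a vector y."""
    return float(np.sum(np.abs(y) ** p) ** (1.0 / p))


def column_sparsity(Pi):
    """Sparsity s: the maximum number of non-zero entries in any column of Pi."""
    return int(np.max(np.count_nonzero(Pi, axis=0)))


def sparse_sign_matrix(r, n, s, rng):
    """Hypothetical r x n matrix with exactly s random +/-1 entries per column.

    This is only a placeholder for Pi, not the paper's embedding.
    """
    Pi = np.zeros((r, n))
    for j in range(n):
        rows = rng.choice(r, size=s, replace=False)  # s distinct rows
        Pi[rows, j] = rng.choice([-1.0, 1.0], size=s)
    return Pi


def empirical_distortion(Pi, A, p, trials, rng):
    """Largest over smallest observed ratio ||Pi A x||_p / ||A x||_p.

    After rescaling Pi so the smallest ratio equals 1, the largest ratio
    plays the role of kappa. Random sampling gives a lower estimate only.
    """
    ratios = []
    for _ in range(trials):
        x = rng.standard_normal(A.shape[1])
        Ax = A @ x
        ratios.append(lp_norm(Pi @ Ax, p) / lp_norm(Ax, p))
    return max(ratios) / min(ratios)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d, r, s, p = 2000, 5, 60, 3, 1
    A = rng.standard_normal((n, d))
    Pi = sparse_sign_matrix(r, n, s, rng)
    print("sparsity s:", column_sparsity(Pi))
    print("observed distortion:", empirical_distortion(Pi, A, p, 200, rng))
```

The sizes `n`, `d`, `r`, and `s` in the usage section are arbitrary; swapping in a different candidate $\Pi$ only requires replacing the call to `sparse_sign_matrix`.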