Nearly-optimal bounds for sparse recovery in generic norms, with applications to k-median sketching
Abstract
We initiate the study of trade-offs between sparsity and the number of measurements in sparse recovery schemes for generic norms. Specifically, for a norm || · ||, sparsity parameter k, approximation factor K > 0, and probability of failure P > 0, we ask: what is the minimal value of m so that there is a distribution over m × n matrices A with the property that for any x, given Ax, we can recover a k-sparse approximation to x in the given norm with probability at least 1 - P? We give a partial answer to this problem, by showing that for norms that admit efficient linear sketches, the optimal number of measurements m is closely related to the doubling dimension of the metric induced by the norm || · || on the set of all k-sparse vectors. By applying our result to specific norms, we cast known measurement bounds in our general framework (for the ℓ_p norms, p ∈ [1,2]) as well as provide new, measurement-efficient schemes (for the Earth-Mover Distance norm). The latter result directly implies more succinct linear sketches for the well-studied planar k-median clustering problem. Finally, our lower bound for the doubling dimension of the EMD norm enables us to resolve the open question of [Frahling-Sohler, STOC'05] about the space complexity of clustering problems in the dynamic streaming model.