Achieving Pareto-optimal MI-based privacy-utility tradeoffs under full data
Abstract
We study a fine-grained model in which a perturbed version of some data (D) is to be disclosed, with the aim of permitting the receiver to accurately infer some useful aspects (X=f(D)) of it while preventing her from inferring other private aspects (Y=g(D)). Correlation between the bases for these two inferences necessitates a compromise between the goals. Determining exactly how the disclosure (M) is to be probabilistically generated from D, trading off between making I(M;X) large and I(M;Y) small, is cast as an algorithmic optimization problem. In 2013, Chakraborty et al. provided optimal solutions for the two extreme points on these objectives' Pareto frontier: maximizing I(M;X) s.t. I(M;Y)=0 ('perfect privacy,' via linear programming (LP)) and minimizing I(M;Y) s.t. H(X|M)=0 ('perfect utility,' for which the trivial solution M=X is optimal). We show that when minimizing I(M;Y) - β I(M;X), we can restrict ourselves w.l.o.g. to solutions satisfying several normal-form conditions, which leads to 1) an alternative convex programming formulation for β ∈ [0,1], for which we provide a practical optimal algorithm, and 2) a proof that M=X is actually optimal for all β ≥ 1. This solves the primary open problem posed by Chakraborty et al. (It also provides a faster solution than Chakraborty et al.'s LP for the 'perfect privacy' special case.)
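To make the objective concrete, the following is a minimal sketch (not taken from the paper) of how I(M;Y) - β I(M;X) can be evaluated for a candidate disclosure mechanism, assuming D is discrete, X=f(D) and Y=g(D) are deterministic functions of D, and the mechanism is a row-stochastic channel giving P(M=m | D=d). The function names, the toy distribution, and the choice M=X (the 'perfect utility' point) are illustrative assumptions.

```python
import numpy as np

def mutual_information(joint):
    """I(A;B) in nats, for a joint distribution given as a 2-D array."""
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / (pa @ pb)[mask])))

def objective(p_D, channel, f, g, beta):
    """I(M;Y) - beta * I(M;X), where channel[d, m] = P(M=m | D=d)."""
    n_d, n_m = channel.shape
    n_x = max(f(d) for d in range(n_d)) + 1
    n_y = max(g(d) for d in range(n_d)) + 1
    joint_mx = np.zeros((n_m, n_x))   # joint distribution of (M, X)
    joint_my = np.zeros((n_m, n_y))   # joint distribution of (M, Y)
    for d in range(n_d):
        joint_mx[:, f(d)] += p_D[d] * channel[d]
        joint_my[:, g(d)] += p_D[d] * channel[d]
    return mutual_information(joint_my) - beta * mutual_information(joint_mx)

# Toy example: D uniform on {0,1,2,3}, X = low bit (useful), Y = high bit (private).
p_D = np.full(4, 0.25)
f = lambda d: d % 2
g = lambda d: d // 2
# Mechanism M = X: disclose the useful aspect exactly.
channel = np.array([[1, 0], [0, 1], [1, 0], [0, 1]], dtype=float)
print(objective(p_D, channel, f, g, beta=1.0))  # I(M;Y)=0, I(M;X)=log 2, so ≈ -0.693
```

In this toy instance X and Y happen to be independent, so disclosing M=X leaks nothing about Y; when they are correlated, I(M;Y) becomes positive and the optimization over the channel must trade the two terms off.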