On the "Semantics" of Differential Privacy: A Bayesian Formulation


Authors: Shiva Prasad Kasiviswanathan, Adam Smith

On the "Semantics" of Differential Privacy: A Bayesian Formulation*

Shiva Prasad Kasiviswanathan†
General Electric Global Research
kasivisw@gmail.com

Adam Smith‡
Department of Computer Science and Engineering
The Pennsylvania State University
ads22@bu.edu

Abstract

Differential privacy is a definition of "privacy" for algorithms that analyze and publish information about statistical databases. It is often claimed that differential privacy provides guarantees against adversaries with arbitrary side information. In this paper, we provide a precise formulation of these guarantees in terms of the inferences drawn by a Bayesian adversary. We show that this formulation is satisfied by both ε-differential privacy as well as a relaxation, (ε, δ)-differential privacy. Our formulation follows the ideas originally due to Dwork and McSherry, stated implicitly in [5]. This paper is, to our knowledge, the first place such a formulation appears explicitly. The analysis of the relaxed definition is new to this paper, and provides guidance for setting the δ parameter when using (ε, δ)-differential privacy.

1 Introduction

Privacy is an increasingly important aspect of data publishing. Reasoning about privacy, however, is fraught with pitfalls. One of the most significant is the auxiliary information (also called external knowledge, background knowledge, or side information) that an adversary gleans from other channels such as the web, public records, or domain knowledge. Schemes that retain privacy guarantees in the presence of independent releases are said to compose securely. The terminology, borrowed from cryptography (which borrowed, in turn, from software engineering), stems from the fact that schemes that compose securely can be designed in a stand-alone fashion without explicitly taking other releases into account.
Thus, understanding independent releases is essential for enabling modular design. In fact, one would like schemes that compose securely not only with independent instances of themselves, but with arbitrary external knowledge. Certain randomization-based notions of privacy (such as differential privacy, due to Dwork, McSherry, Nissim, and Smith [10]) are viewed as providing meaningful guarantees even in the presence of arbitrary side information. In this paper, we give a precise formulation of this statement. First, we provide a Bayesian formulation of "pure" differential privacy which explicitly models side information. Second, we prove that the relaxed definitions of Blum et al. [2], Dwork et al. [9], and Machanavajjhala et al. [16] imply the Bayesian formulation. The proof is non-trivial, and relies on the "continuity" of Bayes' rule with respect to certain distance measures on probability distributions. Our result means that techniques satisfying the relaxed definitions can be used with the same sort of assurances as in the case of pure differentially-private algorithms, as long as parameters are set appropriately. Specifically, (ε, δ)-differential privacy provides meaningful guarantees whenever δ, the additive error parameter, is smaller than about ε²/n, where n is the size of the data set.

* Preliminary statements of the main results from this paper appeared in Ganta et al. [11]. This paper contains strengthened results as well as full proofs.
† S.K.'s work at Penn State was partly supported by NSF award TF-0729171. S.K. is now at Amazon Research.
‡ A.S. was supported by NSF awards TF-0729171, CDI-0941553, PECASE CCF-0747294, and a Google Faculty Award. A.S. is now at Boston University.

Organization. After introducing the basic definitions, we state and discuss our main results in Section 2.
In Section 2.1, we relate our approach to other efforts, subsequent to the initial version of this work, that sought to pin down mathematically precise formulations of the "meaning" of differential privacy. Section 3 proves our main theorems. Along the way, we develop lemmas about (ε, δ)-indistinguishability, the notion of similarity that underlies (ε, δ)-differential privacy, which we believe are of independent interest. The most useful of these, which we dub the Conditioning Lemma, is given in Section 3.3. Finally, we provide further discussion of our approach in Section 4.

1.1 Differential Privacy

Databases are assumed to be vectors in D^n for some domain D. The Hamming distance d_H(x, y) on D^n is the number of positions in which the vectors x, y differ. We let Pr[·] and E[·] denote probability and expectation, respectively. Given a randomized algorithm A, we let A(x) be the random variable (or probability distribution on outputs) corresponding to input x. If X and Y are probability distributions (or random variables) on a discrete space D, the statistical difference (a.k.a. total variation distance) between X and Y is defined as:

SD(X, Y) = max_{S ⊆ D} |Pr[X ∈ S] − Pr[Y ∈ S]|.

Definition 1.1 (ε-differential privacy [10]). A randomized algorithm A is said to be ε-differentially private if for all databases x, y ∈ D^n at Hamming distance at most 1, and for all subsets S of outputs,

Pr[A(x) ∈ S] ≤ e^ε · Pr[A(y) ∈ S].

This definition states that changing a single individual's data in the database leads to a small change in the distribution on outputs. Unlike more standard measures of distance such as statistical difference or Kullback-Leibler divergence, the metric here is multiplicative, and so even very unlikely events must have approximately the same probability under the distributions A(x) and A(y).
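As a concrete illustration (ours, not from the paper), the following sketch checks the multiplicative condition of Definition 1.1 for randomized response on a one-bit database. For a finite output space and no additive slack, verifying the condition on single outputs suffices, since it then holds for every subset S.

```python
import math

# Toy mechanism (our own illustration): report the true bit with
# probability p, flip it otherwise.
def randomized_response_dist(bit, p=0.75):
    return {bit: p, 1 - bit: 1 - p}

def privacy_loss(dist_x, dist_y):
    """Smallest eps with Pr[A(x) = t] <= e^eps * Pr[A(y) = t] for all t."""
    return max(math.log(dist_x[t] / dist_y[t]) for t in dist_x)

# Neighboring one-bit databases x = 0 and y = 1.
d0, d1 = randomized_response_dist(0), randomized_response_dist(1)
eps = max(privacy_loss(d0, d1), privacy_loss(d1, d0))
print(eps)  # ln(0.75 / 0.25) = ln 3 ≈ 1.0986
```

So this mechanism is ε-differentially private for ε = ln 3, illustrating how the multiplicative guarantee constrains even low-probability outputs.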
This condition was relaxed somewhat in other papers [4, 8, 2, 9, 3, 17, 16]. The schemes in all those papers, however, satisfy the following relaxation (first formulated by Dwork, Kenthapadi, McSherry, Mironov, and Naor [9]):

Definition 1.2 ((ε, δ)-differential privacy [9]). A randomized algorithm A is (ε, δ)-differentially private if for all databases x, y ∈ D^n that differ in one entry, and for all subsets S of outputs,

Pr[A(x) ∈ S] ≤ e^ε · Pr[A(y) ∈ S] + δ.

2 Semantics of Differential Privacy

There is a crisp, semantically-flavored¹ interpretation of differential privacy, due to Dwork and McSherry, explained in [5]: Regardless of external knowledge, an adversary with access to the sanitized database draws the same conclusions whether or not my data is included in the original database. One might hope for a stronger statement, namely that the adversary draws the same conclusions whether or not the data is used at all. However, such a strong statement is impossible to provide in the presence of arbitrary external information (Dwork and Naor [6], Dwork [5]; see also Kifer and Machanavajjhala [14]), as illustrated by the following example.

¹ The use of the term "semantic" for definitions that deal directly with adversarial knowledge dates back to semantic security of encryption [12].

Example 1. Consider a clinical study that explores the relationship between smoking and lung disease. A health insurance company which had no a priori understanding of that relationship might dramatically alter its "beliefs" (as encoded by insurance premiums) to account for the results of the study. The study would cause the company to raise premiums for smokers and lower them for nonsmokers, regardless of whether they participated in the study.
In this case, the conclusions drawn by the company about the riskiness of any one individual (say Alice) are strongly affected by the results of the study. This occurs regardless of whether Alice's data are included in the study. ♦

In this section, we develop a formalization of Dwork and McSherry's interpretation and explore its relation to standard definitions. To proceed, we require a mathematical formulation of "external knowledge" and of "drawing conclusions". The first is captured via a prior probability distribution b on D^n (b is a mnemonic for "beliefs"). Conclusions are modeled by the corresponding posterior distribution: given a transcript t, the adversary updates his belief b about the database x using Bayes' rule to obtain a posterior b̄:

b̄[x | t] = Pr[A(x) = t] b[x] / Σ_z Pr[A(z) = t] b[z].  (1)

When the mechanism A is interactive, the definition of A depends on the adversary's choices; for legibility we omit the dependence on the adversary in the notation. Also, for simplicity, we discuss only discrete probability distributions. Our results extend directly to the interactive, continuous case.

For a database x, define x_{−i} to be the same vector except that position i has been replaced by some fixed, default value in D. Any valid value in D will do for the default value. We define n + 1 related games, numbered 0 through n. In Game 0, the adversary interacts with A(x). This is the interaction that actually takes place between the adversary and the randomized algorithm A. The distribution b̄_0 is just the distribution b̄ as defined in (1). In Game i (for 1 ≤ i ≤ n), the adversary interacts with A(x_{−i}). Game i describes the hypothetical scenario where person i's data is not used.²
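The Bayesian update in equation (1) is mechanical to compute for discrete distributions. The following sketch (ours, not from the paper; the helper names are placeholders) computes the posterior over candidate databases given a transcript, reusing a randomized-response-style likelihood for illustration.

```python
# Posterior over databases given transcript t, per equation (1).
def posterior(prior, likelihood, t):
    """prior: {database: prob}; likelihood(db, t) = Pr[A(db) = t]."""
    joint = {z: likelihood(z, t) * p for z, p in prior.items()}
    total = sum(joint.values())
    return {z: w / total for z, w in joint.items()}

# Illustration: uniform prior over two one-bit databases; the mechanism
# reports the bit correctly with probability 0.75.
def rr_likelihood(db, t, p=0.75):
    return p if db[0] == t else 1 - p

post = posterior({(0,): 0.5, (1,): 0.5}, rr_likelihood, 1)
print(post)  # {(0,): 0.25, (1,): 0.75}
```

Seeing output 1 shifts the adversary's belief toward the database (1,), but only by a bounded factor, which is exactly what the definitions below quantify.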
In Game i > 0, given a transcript t, the adversary updates his belief b about the database x, again using Bayes' rule, to obtain a posterior b̄_i as follows:

b̄_i[x | t] = Pr[A(x_{−i}) = t] b[x] / Σ_z Pr[A(z_{−i}) = t] b[z].  (2)

Through these n + 1 games, we get n + 1 a posteriori distributions b̄_0, ..., b̄_n, where b̄_0 is the same as b̄ (defined in (1)), and b̄_i (i > 0) is the posterior distribution obtained when the adversary interacts with A(x_{−i}) and uses this interaction to update his belief distribution (defined in (2)).

Given a particular transcript t, we say privacy has been breached if the adversary would draw different conclusions about the world and, in particular, about a person i, depending on whether or not i's data was used. One could formally define "different" in many ways. In this paper, we choose a weak (but popular) measure of distance between probability distributions, namely statistical difference. We say the adversary has learned something if, for some transcript t, the distributions b̄_0[· | t] and b̄_i[· | t] are far apart in statistical difference. We would like to prevent this from happening for any potential participant. This is captured by the following definition.

² It could happen by coincidence that person i's data equals the default value and hence that x = x_{−i}. This doesn't affect the meaning of the result, since the default value is chosen independently of the data. Readers bothered by the possible coincidence may choose to think of the default value as a special value ⊥ (e.g., "no data") that does not correspond to any real record.

Definition 2.1 (ε-semantic privacy). A randomized algorithm A is said to be ε-semantically private if for all belief distributions b on D^n, for all possible transcripts t, and for all i = 1, ..., n:

SD(b̄_0[· | t], b̄_i[· | t]) ≤ ε.
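The distance used in Definition 2.1 is easy to compute for discrete distributions. A small helper (ours, not from the paper): for discrete distributions, the maximum over subsets S in the definition of SD equals half the L1 distance between the probability vectors.

```python
# Statistical difference (total variation distance) between two discrete
# distributions given as {outcome: probability} dictionaries.
def statistical_difference(p, q):
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in support)

# Two posteriors that disagree by 0.25 on each of two outcomes are at
# statistical difference 0.25.
sd = statistical_difference({0: 0.5, 1: 0.5}, {0: 0.25, 1: 0.75})
print(sd)  # 0.25
```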
Our formulation of semantic privacy is inspired by Dwork and McSherry's interpretation of differential privacy [5]. We now formally show that the notions of ε-differential privacy (Definition 1.1) and ε-semantic privacy (Definition 2.1) are essentially equivalent.

Theorem 2.2. For all ε > 0, ε-differential privacy implies ε̄-semantic privacy, where ε̄ = e^ε − 1. For 0 < ε ≤ 0.45, ε/2-semantic privacy implies 3ε-differential privacy.

The proof of this and all other results in this section may be found in Section 3.

We can extend the previous Bayesian formulation to capture situations where bad events can occur with some negligible probability. Specifically, we formulate (ε, δ)-semantic privacy and show that it is closely related to (ε, δ)-differential privacy.

Definition 2.3 ((ε, δ)-semantic privacy). A randomized algorithm is (ε, δ)-semantically private if for all belief distributions b on D^n, with probability at least 1 − δ over t ∼ A(x) (t drawn from A(x)), where the database x is drawn according to b, and for all i = 1, ..., n:

SD(b̄_0[· | t], b̄_i[· | t]) ≤ ε.

The (ε, δ)-privacy definition is most interesting when ε ≫ δ, since every (ε, δ)-private algorithm is also (0, δ + (e^ε − 1))-differentially private. Below, we assume ε > δ. In fact, many of our results are meaningful only when δ is less than 1/n, while ε must generally be much larger than 1/n to allow for useful algorithms.

Theorem 2.4 (Main Theorem). (1) If ε, δ > 0 and δ < (1 − e^{−ε})²/n, then (ε, δ)-differential privacy implies (ε′, δ′)-semantic privacy on databases of size n with ε′ = e^{3ε} − 1 + 2√(nδ) and δ′ = 4√(nδ). (2) If ε, δ > 0 and ε ≤ 0.45, then (ε, δ)-semantic privacy implies (3ε, 2δ)-differential privacy.
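To get a feel for the parameter translation in Theorem 2.4(1), the following numerical sketch (ours) evaluates the theorem's formulas; the function name and the example numbers are our own choices, not from the paper.

```python
import math

# Given (eps, delta)-differential privacy on databases of size n, with
# delta < (1 - e^{-eps})^2 / n, Theorem 2.4(1) yields semantic-privacy
# parameters eps' = e^{3 eps} - 1 + 2 sqrt(n delta), delta' = 4 sqrt(n delta).
def semantic_params(eps, delta, n):
    assert delta < (1 - math.exp(-eps)) ** 2 / n, "theorem hypothesis violated"
    eps_prime = math.exp(3 * eps) - 1 + 2 * math.sqrt(n * delta)
    delta_prime = 4 * math.sqrt(n * delta)
    return eps_prime, delta_prime

# For eps = 0.1 and n = 10^6, delta must be well below eps^2 / n ~ 1e-8
# for the guarantee to be meaningful.
ep, dp = semantic_params(0.1, 1e-10, 10**6)
print(ep, dp)
```

Note how δ′ grows like √(nδ): this is the source of the rule of thumb that δ should be set below about ε²/n.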
In Appendix B, we discuss a stronger notion of (ε, δ)-semantic privacy and show that (ε, δ)-differential privacy need not imply this stronger semantic privacy guarantee.

Remark 1. The implications in Theorems 2.2 and 2.4 would not hold if differential privacy were defined in terms of statistical difference (total variation distance) or mutual information instead of the multiplicative metric used in Definitions 1.1 and 1.2. For example, one could change the last line of Definition 1.2 to

Pr[A(x) ∈ S] ≤ Pr[A(y) ∈ S] + ε_SD.  (3)

For this modified definition to allow publishing useful information, one would need ε_SD = Ω(1/n) (otherwise, data sets that differ in all n elements would still be hard to distinguish). However, in that parameter range there is a mechanism that satisfies the new definition but does not satisfy "semantic" privacy for any reasonable parameters. Namely, consider the mechanism which on input x = (x_1, ..., x_n) samples a uniformly random index i ∈ {1, ..., n} and outputs (i, x_i). This mechanism is intuitively unsatisfactory, since it always outputs some individual's data in the clear. It also does not satisfy semantic privacy for any pair (ε, δ) where ε < 1 and δ < 1. Nevertheless, it does satisfy the requirement of (3) with ε_SD = 1/n. The same mechanism also satisfies the natural variant of differential privacy based on mutual information (for example, where the mutual information between A(x) and x_i is required to be small for all indices i and product distributions on x). ♦

2.1 Related Approaches

Prior-to-Posterior Comparisons. In the original paper on differential privacy, Dwork et al. [10] defined a notion of "semantic" privacy that involved comparing the prior and posterior distributions of the adversary.
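The counterexample mechanism in Remark 1 is two lines of code; this sketch (ours) makes explicit why it is unsatisfactory: changing one entry x_i moves the output distribution by only 1/n in statistical difference, so the additive variant (3) is satisfied with ε_SD = 1/n, even though every output reveals someone's entry exactly.

```python
import random

# The mechanism from Remark 1: publish a uniformly random index together
# with that individual's entry, in the clear.
def random_entry_mechanism(x):
    i = random.randrange(len(x))
    return (i, x[i])

x = (0, 1, 1, 0, 1)
i, value = random_entry_mechanism(x)
assert value == x[i]  # the published value is always a real entry
```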
In the language of the preceding section, they require that SD(b[·], b̄_i[· | t]) ≤ ε for a subclass of belief distributions, called "informed beliefs", in which all but one of the data set entries are fixed (constant). They show that this definition is equivalent to differential privacy. Kifer and Machanavajjhala [15] use this prior-to-posterior approach to generalize differential privacy to other settings.

However, the impossibility results of Dwork and Naor [5, 7] and Kifer and Machanavajjhala [14], exemplified by the smoking example in Example 1, imply that no mechanism that provides nontrivial information about the data set satisfies such a prior-to-posterior definition for all distributions.

This impossibility motivated the posterior-to-posterior comparison espoused in this paper, and subsequently generalized by Bassily et al. [1]. In contrast to the prior-to-posterior approach, the framework discussed in this paper does generalize to arbitrary distributions on the data (and, hence, to arbitrary side information). Bassily et al. [1] suggest the term "inference-based" for definitions which explicitly discuss the posterior distributions constructed by Bayesian adversaries.

Hypothesis Testing. Wasserman and Zhou [18] relate differential privacy to the type I and type II errors of a hypothesis test. Specifically, fix an ε-differentially private mechanism A, an i.i.d. distribution on the data x, an index i, and disjoint sets S and T of possible values for the i-th entry x_i of x.
Wasserman and Zhou [18] show that any hypothesis test (given A(x), and full knowledge of the input product distribution on x and the differentially private mechanism A) for the hypothesis H_0: x_i ∈ S versus the alternative H_1: x_i ∈ T must satisfy

1 − β ≤ e^ε α,  (4)

where α is the significance level (the maximum type I error) and 1 − β is the power of the test (β being the type II error). In other words, the test rejects the hypothesis with approximately the same probability regardless of whether the hypothesis is true. This perspective was extended to (ε, δ)-differential privacy by Hall et al. [13].

This is a reasonable requirement. Note, however, that it holds only for product distributions, which limits its applicability. More importantly, a very similar statement can be proven for the statistical-difference-based definition discussed in Remark 1. Specifically, one can show that

1 − β ≤ α + ε_SD  (5)

when the mechanism satisfies the definition of Remark 1. Equation (4) has the same natural-language interpretation as equation (5), namely, "the test rejects the hypothesis with approximately the same probability regardless of whether the hypothesis is true". However, as mentioned in Remark 1, the statistical-difference-based definition allows mechanisms that publish detailed personal data in the clear. This makes the meaning of a hypothesis-testing-based definition hard to evaluate intuitively. We hope the definitions provided here are easier to interpret.

3 Proofs of Main Results

We begin this section by defining (ε, δ)-indistinguishability and stating a few of its basic properties (Section 3.1, with proofs in Appendix A). Section 3.2 gives the proof of our main result for ε-differential privacy.
In Section 3.3 we state and prove the Conditioning Lemma, the main tool which allows us to prove our results about (ε, δ)-differential privacy (Section 3.4).

3.1 (ε, δ)-Indistinguishability and its Basic Properties

The relaxed notion of (ε, δ)-differential privacy implicitly uses a two-parameter distance measure on probability distributions (or random variables), which we call (ε, δ)-indistinguishability. In this section, we develop a few basic properties of this measure. These properties, listed in Lemma 3.3, will play an important role in establishing the proofs of Theorems 2.2 and 2.4.

Definition 3.1 ((ε, δ)-indistinguishability). Two random variables X, Y taking values in a set D are (ε, δ)-indistinguishable if for all sets S ⊆ D,

Pr[X ∈ S] ≤ e^ε Pr[Y ∈ S] + δ  and  Pr[Y ∈ S] ≤ e^ε Pr[X ∈ S] + δ.

We will also be using a variant of (ε, δ)-indistinguishability, which we call point-wise (ε, δ)-indistinguishability. Lemma 3.3 (Parts 1 and 2) shows that (ε, δ)-indistinguishability and point-wise (ε, δ)-indistinguishability are almost equivalent.

Definition 3.2 (Point-wise (ε, δ)-indistinguishability). Two random variables X and Y are point-wise (ε, δ)-indistinguishable if with probability at least 1 − δ over a drawn from either X or Y, we have:

e^{−ε} Pr[Y = a] ≤ Pr[X = a] ≤ e^ε Pr[Y = a].

Lemma 3.3. Indistinguishability satisfies the following properties:

1. If X, Y are point-wise (ε, δ)-indistinguishable, then they are (ε, δ)-indistinguishable.

2. If X, Y are (ε, δ)-indistinguishable, then they are point-wise (2ε, 2δ/(1 − e^{−ε}))-indistinguishable.

3. Let X be a random variable on D. Suppose that for every a ∈ D, A(a) and A′(a) are (ε, δ)-indistinguishable (for some randomized algorithms A and A′). Then the pairs (X, A(X)) and (X, A′(X)) are (ε, δ)-indistinguishable.
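For discrete distributions, Definition 3.1 can be checked exactly: the worst-case set in each direction is { t : Pr[X = t] > e^ε Pr[Y = t] } (and symmetrically), so the smallest feasible δ is the total probability mass exceeding the e^ε envelope. The following helper (ours, not from the paper) computes it.

```python
import math

# Smallest delta for which discrete distributions p, q (dicts of
# {outcome: probability}) are (eps, delta)-indistinguishable.
def min_delta(p, q, eps):
    support = set(p) | set(q)
    excess_pq = sum(max(0.0, p.get(t, 0.0) - math.exp(eps) * q.get(t, 0.0))
                    for t in support)
    excess_qp = sum(max(0.0, q.get(t, 0.0) - math.exp(eps) * p.get(t, 0.0))
                    for t in support)
    return max(excess_pq, excess_qp)

p, q = {0: 0.75, 1: 0.25}, {0: 0.25, 1: 0.75}
d_ln3 = min_delta(p, q, math.log(3))  # 0: the pair is (ln 3, 0)-indistinguishable
d_zero = min_delta(p, q, 0.0)         # 0.5: with eps = 0, delta must cover the gap
```

This makes the trade-off between the two parameters concrete: shrinking ε forces δ to absorb the remaining multiplicative gap.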
4. Let X be a random variable. Suppose that with probability at least 1 − δ_1 over a ∼ X, A(a) and A′(a) are (ε, δ)-indistinguishable (for some randomized algorithms A and A′). Then the pairs (X, A(X)) and (X, A′(X)) are (ε, δ + δ_1)-indistinguishable.

5. If X, Y are (ε, δ)-indistinguishable (or X, Y are point-wise (ε, δ)-indistinguishable), then SD(X, Y) ≤ ε̄ + δ, where ε̄ = e^ε − 1.

The lemma is proved in Appendix A.

3.2 Case of ε-Differential Privacy: Proof of Theorem 2.2

Theorem 2.2 (restated) (Dwork-McSherry). ε/2-differential privacy implies ε̄-semantic privacy, where ε̄ = e^ε − 1. ε/2-semantic privacy implies 3ε-differential privacy as long as ε ≤ 0.45.

Proof. Consider any database x ∈ D^n. Let A be an ε/2-differentially private algorithm. Consider any belief distribution b. Let the posterior distributions b̄_0[x | t] and b̄_i[x | t] for some fixed i and t be as defined in (1) and (2). ε/2-differential privacy implies that for every database z ∈ D^n,

e^{−ε/2} Pr[A(z_{−i}) = t] ≤ Pr[A(z) = t] ≤ e^{ε/2} Pr[A(z_{−i}) = t].

These inequalities imply that the ratio of b̄_0[x | t] and b̄_i[x | t] (defined in (1) and (2)) is within e^{±ε}. Since these inequalities hold for every x, we get:

∀ x ∈ D^n: e^{−ε} b̄_i[x | t] ≤ b̄_0[x | t] ≤ e^ε b̄_i[x | t].

This implies that the random variables (distributions) b̄_0[· | t] and b̄_i[· | t] are point-wise (ε, 0)-indistinguishable. Applying Lemma 3.3 (Part 5) with δ = 0 gives SD(b̄_0[· | t], b̄_i[· | t]) ≤ ε̄. Repeating the above arguments for every belief distribution, for every i, and for every t shows that A is ε̄-semantically private.

To see that ε-semantic privacy implies 3ε-differential privacy, consider a belief distribution b which is uniform over two databases x, y which are at Hamming distance one.
Let i be the position in which x and y differ. Fix a transcript t. The distribution b̄_i[· | t] will be uniform over x and y, since they induce the same distribution on transcripts in Game i. This means that b̄_0[· | t] will assign probabilities 1/2 ± ε to each of the two databases (by Definition 2.1). Working through Bayes' rule shows that (note that b[x] = b[y])

Pr[A(x) = t] / Pr[A(y) = t] = b̄_0[x | t] / b̄_0[y | t] ≤ (1/2 + ε) / (1/2 − ε) ≤ e^{3ε}  (since ε ≤ 0.45).  (6)

Since the bound in (6) holds for every t, A(x) and A(y) are point-wise (3ε, 0)-indistinguishable. Lemma 3.3 (Part 1) then implies that A(x) and A(y) are (3ε, 0)-indistinguishable. Since this relationship holds for every pair of neighboring databases x and y, A is 3ε-differentially private.

3.3 A Useful Tool: The Conditioning Lemma

We will use the following lemma to establish connections between (ε, δ)-differential privacy and (ε, δ)-semantic privacy. Let B | A = a denote the conditional distribution of B given that A = a, for jointly distributed random variables A and B.

Lemma 3.4 (Conditioning Lemma). Suppose the pair of random variables (A, B) is (ε, δ)-indistinguishable from the pair (A′, B′). Then, for ε̂ = 3ε and for every δ̂ > 0, the following holds: with probability at least 1 − δ′′ over t ∼ B (or, alternatively, over t ∼ B′), the random variables A | B = t and A′ | B′ = t are (ε̂, δ̂)-indistinguishable, where

δ′′ = 2δ/δ̂ + 2δ/(1 − e^{−ε}).

We can satisfy the conditions of the preceding lemma by setting δ̂ = δ′′ = O(√δ) for any constant ε. However, the proof of our main theorem will use a slightly different setting (with δ′′ smaller than δ̂).

Proof. Let (A, B) and (A′, B′) take values in the set D × E. In the remainder of the proof, we will use the notation A|t for A | B = t and A′|t for A′ | B′ = t.
Define

Bad_1 = { t ∈ E : ∃ S_t ⊆ D such that Pr[A|t ∈ S_t] > e^{ε̂} Pr[A′|t ∈ S_t] + δ̂ },
Bad_2 = { t ∈ E : ∃ S_t ⊆ D such that Pr[A′|t ∈ S_t] > e^{ε̂} Pr[A|t ∈ S_t] + δ̂ }.

To prove the lemma, it suffices to show that the probabilities Pr[B ∈ Bad_1 ∪ Bad_2] and Pr[B′ ∈ Bad_1 ∪ Bad_2] are each at most δ′′. To do so, we first consider the set

Bad_0 = { t ∈ E : Pr[B = t] < e^{−2ε} Pr[B′ = t] or Pr[B = t] > e^{2ε} Pr[B′ = t] }.

We will separately bound the probabilities of Bad_0, Bad′_1 = Bad_1 \ Bad_0, and Bad′_2 = Bad_2 \ Bad_0.

To bound the mass of Bad_0, note that B and B′ are (ε, δ)-indistinguishable (since they are functions of (A, B) and (A′, B′)).³ Since (ε, δ)-indistinguishability implies point-wise (2ε, 2δ/(1 − e^{−ε}))-indistinguishability (Lemma 3.3, Part 2), we have

Pr[B ∈ Bad_0] ≤ 2δ/(1 − e^{−ε}).

We now turn to Bad′_1 = Bad_1 \ Bad_0. For each t ∈ Bad′_1, let S_t be any set that witnesses t's membership in Bad_1 (that is, for which Pr[A|t ∈ S_t] exceeds e^{ε̂} Pr[A′|t ∈ S_t] + δ̂). Consider the critical set

T_1 = ∪_{t ∈ Bad′_1} (S_t × {t}).

Intuitively, this set will have large mass if Bad′_1 does. Specifically, by the definition of S_t, we get a lower bound on the probability of T_1:

Pr[(A, B) ∈ T_1] = Σ_{t ∈ Bad′_1} Pr[A|t ∈ S_t] Pr[B = t]
                 > Σ_{t ∈ Bad′_1} (e^{ε̂} Pr[A′|t ∈ S_t] + δ̂) Pr[B = t]
                 = ( Σ_{t ∈ Bad′_1} e^{ε̂} Pr[A′|t ∈ S_t] Pr[B = t] ) + δ̂ Pr[B ∈ Bad′_1].

Because Bad′_1 does not contain points in Bad_0, we know that Pr[B = t] ≥ e^{−2ε} Pr[B′ = t].
Substituting this into the bound above and using the fact that ε̂ = 3ε and Pr[A′|t ∈ S_t] = Pr[A′ ∈ S_t | B′ = t], we get

Pr[(A, B) ∈ T_1] ≥ Σ_{t ∈ Bad′_1} e^{ε̂} Pr[A′ ∈ S_t | B′ = t] e^{−2ε} Pr[B′ = t] + δ̂ Pr[B ∈ Bad′_1]
                 = e^ε Pr[(A′, B′) ∈ T_1] + δ̂ Pr[B ∈ Bad′_1].

By (ε, δ)-indistinguishability, Pr[(A, B) ∈ T_1] ≤ e^ε Pr[(A′, B′) ∈ T_1] + δ. Combining the upper and lower bounds on the probability that (A, B) ∈ T_1, we have δ̂ Pr[B ∈ Bad′_1] ≤ δ, which implies that Pr[B ∈ Bad′_1] ≤ δ/δ̂. By a similar argument, one gets that Pr[B ∈ Bad′_2] ≤ δ/δ̂. Finally,

Pr[B ∈ Bad_1 ∪ Bad_2] ≤ Pr[B ∈ Bad_0] + Pr[B ∈ Bad′_1] + Pr[B ∈ Bad′_2] = 2δ/(1 − e^{−ε}) + δ/δ̂ + δ/δ̂ = 2δ/(1 − e^{−ε}) + 2δ/δ̂ = δ′′.

By symmetry, we also have Pr[B′ ∈ Bad_1 ∪ Bad_2] ≤ 2δ/(1 − e^{−ε}) + 2δ/δ̂. Therefore, with probability at least 1 − δ′′, A|t and A′|t are (ε̂, δ̂)-indistinguishable, as claimed.

³ Note: Even if we started with the stronger assumption that the pairs (A, B) and (A′, B′) are point-wise indistinguishable, we would still have to make a nontrivial argument to bound Bad_0, since point-wise indistinguishability is not, in general, closed under postprocessing.

3.4 The General Case: Proof of Theorem 2.4

Theorem 2.4 (restated). (1) If ε, δ > 0 and δ < (1 − e^{−ε})²/n, then (ε, δ)-differential privacy implies (ε′, δ′)-semantic privacy on databases of size n with ε′ = e^{3ε} − 1 + 2√(nδ) and δ′ = 4√(nδ). (2) If ε, δ > 0 and ε ≤ 0.45, then (ε, δ)-semantic privacy implies (3ε, 2δ)-differential privacy.

Proof. (1) Let A be an (ε, δ)-differentially private algorithm. Let b be any belief distribution and let x ∼ b. Let A_i(x) = A(x_{−i}); i.e., A_i on input x constructs x_{−i} and then applies A to it.
From Lemma 3.3 (Part 3), we know that (x, A(x)) and (x, A_i(x)) are (ε, δ)-indistinguishable for every index i = 1, ..., n. Apply Lemma 3.4 with A(X) = A(x), A′(X) = A_i(x), ε̂ = 3ε, and δ̂ = √(nδ). We get that with probability at least 1 − δ′′ over t ∼ A(x), the random variables x | A(x) = t and x | A_i(x) = t are (ε̂, δ̂)-indistinguishable, where δ′′ ≤ 2δ/δ̂ + 2δ/(1 − e^{−ε}) ≤ 4√(δ/n). Note that 1 − e^{−ε} > δ̂ = √(nδ) (a condition assumed in the theorem). Let δ′ = n δ′′; note that δ′ ≤ 4√(nδ). Taking a union bound over all n choices of the index i, we get that with probability at least 1 − δ′ over the choice of t ∼ A(x), all n variables x | A_i(x) = t (for different i's) are (ε̂, δ̂)-indistinguishable from x | A(x) = t. To complete the proof of (1), recall that (ε̂, δ̂)-indistinguishability implies statistical distance at most e^{3ε} − 1 + δ̂ ≤ ε′ (Lemma 3.3, Part 5).

(2) To see that (ε, δ)-semantic privacy implies (3ε, 2δ)-differential privacy, consider a belief distribution b which is uniform over two databases x, y which are at Hamming distance one. The proof idea is the same as in Theorem 2.2. Let i be the position in which x and y differ. Let A be an algorithm that satisfies (ε, δ)-semantic privacy. In Game i, x and y induce the same distribution on transcripts, so the distribution b̄_i[· | t] will be uniform over x and y (for all transcripts t). We now turn to Game 0 (the real world). Let E denote the set of transcripts t such that b̄_0[· | t] assigns probabilities in 1/2 ± ε to each of the two databases x and y. Let Ā denote the (random) output of A when run on a database sampled from distribution b. The semantic privacy of A implies that E occurs with probability at least 1 − δ over t ∼ Ā.
Working through Bayes' rule as in Theorem 2.2 shows that e^{−3ε} Pr[A(y) = t] ≤ Pr[A(x) = t] ≤ e^{3ε} Pr[A(y) = t] for all t ∈ E. (This last step uses the assumption that ε ≤ 0.45.) Moreover, since Ā is an equal mixture of A(x) and A(y), the event E must occur with probability at least 1 − 2δ under both t ∼ A(x) and t ∼ A(y). Hence, A(x) and A(y) are (3ε, 2δ)-indistinguishable. Since this relationship holds for every pair of neighboring databases x and y, A is (3ε, 2δ)-differentially private.

4 Further Discussion

Theorem 2.4 states that the relaxations of differential privacy in some previous work still provide meaningful guarantees in the face of arbitrary side information. This is not the case for all possible relaxations, even very natural ones, as noted in Remark 1.

Calibrating Noise to a High-Probability Bound on Local Sensitivity. In a different vein, the techniques used to prove Theorem 2.4 can also be used to analyze schemes that do not provide privacy for all pairs of neighboring databases x and y, but rather only for most such pairs (recall that neighboring databases are the ones that differ in one entry). Specifically, it is sufficient that those databases where the indistinguishability condition fails occur only with small probability. We first define a weakening of Definition 2.3 so that it only holds for specific belief distributions.

Definition 4.1 ((ε, δ)-local semantic privacy). A randomized algorithm is (ε, δ)-locally semantically private for a belief distribution b on D^n if with probability at least 1 − δ over t ∼ A(x) (t drawn from A(x)), where the database x is drawn according to b, and for all i = 1, ..., n:

SD(b̄_0[· | t], b̄_i[· | t]) ≤ ε.

Theorem 4.2. Let A be a randomized algorithm. Let E = { x : for all neighbors y of x, A(x) and A(y) are (ε, δ)-indistinguishable }.
Then A satisfies (ǫ′, δ′)-local semantic privacy for any belief distribution b such that b[E] = Pr_{x∼b}[x ∈ E] ≥ 1 − δ₁, with ǫ′ = e^{3ǫ} − 1 + √(nδ₂) and δ′ ≤ 4√(nδ₂), as long as ǫ > √(nδ₂), where δ₂ = δ + δ₁.

Proof. The proof is similar to that of Theorem 2.4 (1). Let b be a belief distribution with b[E] ≥ 1 − δ₁, and let x ∼ b. From Lemma 3.3 (Part 4), we know that (x, A(x)) and (x, A_i(x)) are (ǫ, δ + δ₁)-indistinguishable, where A_i(x) = A(x_{−i}). The remaining proof follows exactly as in Theorem 2.4 (1).

We now discuss a simple consequence of the above theorem for the technique of adding noise according to the local sensitivity of a function.

Definition 4.3 (Local Sensitivity, [17]). For a function f : D^n → R and x ∈ D^n, the local sensitivity of f at x is:

    LS_f(x) = max_{y : d_H(x,y)=1} |f(x) − f(y)|.

Let Lap(λ) denote the Laplace distribution. This distribution has density function h(y) ∝ exp(−|y|/λ), mean 0, and standard deviation √2·λ. Using the Laplace noise addition procedure of [10, 17], along with Theorem 4.2, we get⁴:

Corollary 4.4. Let E = {x : LS_f(x) ≤ s}. Let A(x) = f(x) + Lap(s/ǫ). Let b be a belief distribution such that b[E] = Pr_{x∼b}[x ∈ E] ≥ 1 − δ₁. Then A satisfies (ǫ′, δ′)-local semantic privacy for belief distribution b with ǫ′ = e^{3ǫ} − 1 + √(nδ₁) and δ′ ≤ 4√(nδ₁), as long as ǫ > √(nδ₁).

Proof. Let x ∼ b. If x ∈ E, then it follows from [10, 17] that A(x) and A(x_{−i}) are (ǫ, 0)-indistinguishable for every index i = 1, ..., n. Applying Theorem 4.2 completes the proof.

The approach discussed here was generalized significantly by Bassily et al. [1]; we refer to their work for a detailed discussion.

⁴ Similar corollaries can be derived for other differentially private mechanisms, such as those that add Gaussian noise instead of Laplace noise.
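To make Corollary 4.4 concrete, here is a minimal Python sketch of the mechanism A(x) = f(x) + Lap(s/ǫ). The helper names (`local_sensitivity`, `release`) and the brute-force neighbor enumeration are our own illustration, not part of the paper; in practice s is chosen as a bound on LS_f that holds except with probability δ₁ under the belief distribution b.

```python
import random

def local_sensitivity(f, x, domain):
    """Brute-force LS_f(x) = max over neighbors y (d_H(x, y) = 1) of |f(x) - f(y)|,
    enumerating all single-entry changes over a small finite domain."""
    fx = f(x)
    ls = 0.0
    for i in range(len(x)):
        for v in domain:
            if v == x[i]:
                continue
            y = list(x)
            y[i] = v  # neighbor: differs from x only in entry i
            ls = max(ls, abs(fx - f(tuple(y))))
    return ls

def release(f, x, s, eps, rng=random):
    """Corollary 4.4 mechanism: A(x) = f(x) + Lap(s / eps), where s bounds
    LS_f(x) except with probability delta_1 over x ~ b. Lap(scale) is sampled
    as the difference of two independent Exp(1/scale) variables."""
    scale = s / eps
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return f(x) + noise
```

For example, for f(x) = Σ_i x_i over binary entries, every database has LS_f(x) = 1, so s = 1 works with δ₁ = 0 and the corollary reduces to the usual global-sensitivity Laplace mechanism.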
Acknowledgements

We are grateful for helpful discussions with Cynthia Dwork, Daniel Kifer, Ashwin Machanavajjhala, Frank McSherry, Moni Naor, Kobbi Nissim, and Sofya Raskhodnikova.

References

[1] Raef Bassily, Adam Groce, Jonathan Katz, and Adam Smith. Coupled-worlds privacy: Exploiting adversarial uncertainty in private data analysis. In Foundations of Computer Science (FOCS), 2013.
[2] Avrim Blum, Cynthia Dwork, Frank McSherry, and Kobbi Nissim. Practical privacy: The SuLQ framework. In PODS, pages 128–138. ACM Press, 2005.
[3] Kamalika Chaudhuri and Nina Mishra. When random sampling preserves privacy. In CRYPTO, pages 198–213. Springer, 2006.
[4] Irit Dinur and Kobbi Nissim. Revealing information while preserving privacy. In PODS, pages 202–210. ACM Press, 2003.
[5] Cynthia Dwork. Differential privacy. In ICALP, pages 1–12. Springer, 2006.
[6] Cynthia Dwork and Moni Naor. On the difficulties of disclosure prevention in statistical databases or the case for differential privacy. Journal of Privacy and Confidentiality, 2(1), 2010.
[7] Cynthia Dwork and Moni Naor. On the difficulties of disclosure prevention, or the case for differential privacy. Journal of Privacy and Confidentiality, 2(1), 2010.
[8] Cynthia Dwork and Kobbi Nissim. Privacy-preserving datamining on vertically partitioned databases. In CRYPTO, pages 528–544. Springer, 2004.
[9] Cynthia Dwork, Krishnaram Kenthapadi, Frank McSherry, Ilya Mironov, and Moni Naor. Our data, ourselves: Privacy via distributed noise generation. In EUROCRYPT, pages 486–503. Springer, 2006.
[10] Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. Calibrating noise to sensitivity in private data analysis. In TCC, pages 265–284. Springer, 2006.
[11] Srivatsava Ranjit Ganta, Shiva Prasad Kasiviswanathan, and Adam Smith.
Composition attacks and auxiliary information in data privacy. In KDD, pages 265–273. ACM, 2008.
[12] Shafi Goldwasser and Silvio Micali. Probabilistic encryption. Journal of Computer and System Sciences, 28(2):270–299, 1984.
[13] Rob Hall, Alessandro Rinaldo, and Larry Wasserman. Differential privacy for functions and functional data. The Journal of Machine Learning Research, 14(1):703–727, 2013.
[14] Daniel Kifer and Ashwin Machanavajjhala. No free lunch in data privacy. In SIGMOD, pages 193–204, 2011.
[15] Daniel Kifer and Ashwin Machanavajjhala. A rigorous and customizable framework for privacy. In PODS, pages 77–88, 2012.
[16] Ashwin Machanavajjhala, Daniel Kifer, John Abowd, Johannes Gehrke, and Lars Vilhuber. Privacy: From theory to practice on the map. In ICDE, pages 277–286. IEEE Computer Society, 2008.
[17] Kobbi Nissim, Sofya Raskhodnikova, and Adam Smith. Smooth sensitivity and sampling in private data analysis. In STOC, pages 75–84. ACM Press, 2007.
[18] Larry Wasserman and Shuheng Zhou. A statistical framework for differential privacy. Journal of the American Statistical Association, 105(489):375–389, 2010.

Appendix

A Proof of Lemma 3.3

Proof of Part 1. Let Bad be the set of bad values of a, that is, Bad = {a : Pr[X = a] < e^{−ǫ} Pr[Y = a] or Pr[X = a] > e^{ǫ} Pr[Y = a]}. By definition, Pr[X ∈ Bad] ≤ δ. Now consider any set S of outcomes:

    Pr[X ∈ S] ≤ Pr[X ∈ S \ Bad] + Pr[X ∈ Bad].

The first term is at most e^{ǫ} Pr[Y ∈ S \ Bad] ≤ e^{ǫ} Pr[Y ∈ S]. Hence, Pr[X ∈ S] ≤ e^{ǫ} Pr[Y ∈ S] + δ, as required. The case of Pr[Y ∈ S] is symmetric. Therefore, X and Y are (ǫ, δ)-indistinguishable.

Proof of Part 2. Let S = {a : Pr[X = a] > e^{2ǫ} Pr[Y = a]}. Then Pr[X ∈ S] > e^{2ǫ} Pr[Y ∈ S]. By (ǫ, δ)-indistinguishability, we have

    δ ≥ Pr[X ∈ S] − e^{ǫ} Pr[Y ∈ S] > (e^{2ǫ} − e^{ǫ}) Pr[Y ∈ S].
Equivalently,

    Pr[Y ∈ S] < δ/(e^{2ǫ} − e^{ǫ}) = δ/(e^{2ǫ}(1 − e^{−ǫ})).    (7)

Now consider the set S′ = {a : Pr[X = a] < e^{−2ǫ} Pr[Y = a]}. A symmetric argument to the one above shows that Pr[X ∈ S′] < δ/(e^{2ǫ} − e^{ǫ}). Again using indistinguishability, we get

    Pr[Y ∈ S′] ≤ e^{ǫ} Pr[X ∈ S′] + δ < e^{ǫ} · δ/(e^{2ǫ} − e^{ǫ}) + δ = δ/(1 − e^{−ǫ}).    (8)

The bound of (8) is always larger than that of (7), so we have

    Pr[Y ∈ S ∪ S′] ≤ δ · 2e^{ǫ}/(e^{ǫ} − 1).

We get the same bound on Pr[X ∈ S ∪ S′] by symmetry. Therefore, with probability at least 1 − δ · 2e^{ǫ}/(e^{ǫ} − 1), for a drawn from the distribution of either X or Y, we have:

    e^{−2ǫ} Pr[Y = a] ≤ Pr[X = a] ≤ e^{2ǫ} Pr[Y = a].

Proof of Part 3. Let (X, A(X)) and (X, A′(X)) be random variables on D × E. Let S be an arbitrary subset of D × E and, for every a ∈ D, define S_a = {b ∈ E : (a, b) ∈ S}. Then

    Pr[(X, A(X)) ∈ S] ≤ Σ_{a∈D} Pr[A(X) ∈ S_a | X = a] · Pr[X = a]
                      ≤ Σ_{a∈D} (e^{ǫ} Pr[A′(X) ∈ S_a | X = a] + δ) · Pr[X = a]
                      ≤ δ + e^{ǫ} Pr[(X, A′(X)) ∈ S].

By symmetry, we also have Pr[(X, A′(X)) ∈ S] ≤ δ + e^{ǫ} Pr[(X, A(X)) ∈ S]. Since these inequalities hold for every choice of S, it follows that (X, A(X)) and (X, A′(X)) are (ǫ, δ)-indistinguishable.

Proof of Part 4. Let (X, A(X)) and (X, A′(X)) be random variables on D × E. Let T ⊆ D be the set of a's for which A(a) and A′(a) are (ǫ, δ)-indistinguishable, so that Pr[X ∉ T] ≤ δ₁. Now let S be an arbitrary subset of D × E and, for every a ∈ D, define S_a = {b ∈ E : (a, b) ∈ S}. Then

    Pr[(X, A(X)) ∈ S] = Σ_{a∉T} Pr[A(X) ∈ S_a | X = a] · Pr[X = a] + Σ_{a∈T} Pr[A(X) ∈ S_a | X = a] · Pr[X = a]
                      ≤ Σ_{a∉T} Pr[X = a] + Σ_{a∈T} Pr[A(X) ∈ S_a | X = a] · Pr[X = a]
                      = Pr[X ∉ T] + Σ_{a∈T} Pr[A(X) ∈ S_a | X = a] · Pr[X = a]
                      ≤ δ₁ + Σ_{a∈T} (e^{ǫ} Pr[A′(X) ∈ S_a | X = a] + δ) · Pr[X = a]
                      ≤ δ + δ₁ + e^{ǫ} Pr[(X, A′(X)) ∈ S].
By symmetry, we also have Pr[(X, A′(X)) ∈ S] ≤ δ + δ₁ + e^{ǫ} Pr[(X, A(X)) ∈ S]. Since these inequalities hold for every choice of S, it follows that (X, A(X)) and (X, A′(X)) are (ǫ, δ + δ₁)-indistinguishable.

Proof of Part 5. Let X and Y be random variables on D. By definition, SD(X, Y) = max_{S⊂D} |Pr[X ∈ S] − Pr[Y ∈ S]|. For any set S ⊂ D,

    2 |Pr[X ∈ S] − Pr[Y ∈ S]| = |Pr[X ∈ S] − Pr[Y ∈ S]| + |Pr[X ∉ S] − Pr[Y ∉ S]|
                               = |Σ_{c∈S} (Pr[X = c] − Pr[Y = c])| + |Σ_{c∉S} (Pr[X = c] − Pr[Y = c])|
                               ≤ Σ_{c∈S} |Pr[X = c] − Pr[Y = c]| + Σ_{c∉S} |Pr[X = c] − Pr[Y = c]|
                               = Σ_{c∈D} |Pr[X = c] − Pr[Y = c]|.

Splitting D into D₁ = {c : Pr[X = c] ≥ Pr[Y = c]} and D₂ = D \ D₁, and applying (ǫ, δ)-indistinguishability to each piece, we get

    Σ_{c∈D} |Pr[X = c] − Pr[Y = c]| = (Pr[X ∈ D₁] − Pr[Y ∈ D₁]) + (Pr[Y ∈ D₂] − Pr[X ∈ D₂])
                                    ≤ (e^{ǫ} Pr[Y ∈ D₁] + δ − Pr[Y ∈ D₁]) + (e^{ǫ} Pr[X ∈ D₂] + δ − Pr[X ∈ D₂])
                                    = 2δ + (e^{ǫ} − 1) Pr[Y ∈ D₁] + (e^{ǫ} − 1) Pr[X ∈ D₂]
                                    ≤ 2(e^{ǫ} − 1) + 2δ = 2ǭ + 2δ.

This implies that |Pr[X ∈ S] − Pr[Y ∈ S]| ≤ ǭ + δ. Since the above inequality holds for every S ⊂ D, it immediately follows that the statistical difference between X and Y is at most ǭ + δ.

B Another View of Semantic Privacy

In this section, we discuss another possible definition of (ǫ, δ)-semantic privacy. Even though this definition seems to be the more desirable one, it also appears hard to achieve.

Definition A.1 (reality-oblivious (ǫ, δ)-semantic privacy). A randomized algorithm A is reality-oblivious (ǫ, δ)-semantically private if for all belief distributions b on D^n, for all databases x ∈ D^n, with probability at least 1 − δ over transcripts t drawn from A(x), for all i = 1, ..., n: SD(b̄_0[·|t], b̄_i[·|t]) ≤ ǫ.

We prove that if the adversary has arbitrary beliefs, then (ǫ, δ)-differential privacy does not provide any reasonable reality-oblivious (ǫ′, δ′)-semantic privacy guarantee.

Theorem A.2.
(ǫ, δ)-differential privacy does not imply reality-oblivious (ǫ′, δ′)-semantic privacy for any reasonable values of ǫ′ and δ′.

Proof. This counterexample is due to Dwork and McSherry. Suppose that the belief distribution is uniform over {(0^n), (1, 0^{n−1})}, but that the real database is (1^n). Let the database be x = (x_1, ..., x_n), and say we want to reveal f(x) = Σ_i x_i. Adding Gaussian noise with variance σ² = log(1/δ)/ǫ² satisfies (ǫ, δ)-differential privacy (see [10, 17] for details). However, with overwhelming probability the output will be close to n, and this will in turn induce a very non-uniform distribution over {(0^n), (1, 0^{n−1})}, since (1, 0^{n−1}) is exponentially (in n) more likely to generate a value near n than (0^n). More precisely, due to the Gaussian noise added,

    Pr[A(x) = n | x = (0^n)] / Pr[A(x) = n | x = (1, 0^{n−1})] = exp(−n²/(2σ²)) / exp(−(n−1)²/(2σ²)) = exp(−(2n − 1)/(2σ²)).

Therefore, given that the output is close to n, the posterior distribution of the adversary will be exponentially more biased toward (1, 0^{n−1}) than (0^n); hence it is exponentially far away from the prior distribution, which was uniform. On the other hand, on x_{−1}, no update occurs and the posterior distribution remains uniform over {(0^n), (1, 0^{n−1})} (the same as the prior). Since the posterior distributions in these two situations are exponentially far apart (one exponentially far from uniform, the other uniform), this shows that (ǫ, δ)-differential privacy does not imply any reasonable guarantee on reality-oblivious semantic privacy.

The counterexample of Theorem A.2 implies that adversaries whose belief distribution is very different from the real database may observe a large change in their posterior distributions.
We do not consider this a violation of "privacy", since the issue lies in the adversary's incorrect beliefs, not in the mechanism per se.
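The Bayesian update in the counterexample above is easy to simulate numerically. The sketch below is our own illustration (the function name and the parameter values n, ǫ, δ are arbitrary choices, not from the paper): it computes the adversary's posterior over the two candidate databases {0^n, (1, 0^{n−1})} after seeing an output near n, working in log-odds form to avoid floating-point underflow for large n.

```python
import math

def posterior_pr_one(t, sigma):
    """Posterior probability of database (1, 0^{n-1}) versus 0^n under a
    uniform prior, after observing t = f(x) + N(0, sigma^2) with f = sum.
    The two candidates have f-values 1 and 0, so the Gaussian log-odds are
    log N(t; 1, sigma^2) - log N(t; 0, sigma^2) = (2t - 1) / (2 sigma^2)."""
    log_odds = (2.0 * t - 1.0) / (2.0 * sigma ** 2)
    return 1.0 / (1.0 + math.exp(-log_odds))

# Illustrative parameters: n entries, (eps, delta)-DP Gaussian noise with
# variance sigma^2 = log(1/delta) / eps^2, as in the proof.
n, eps, delta = 1000, 0.5, 1e-6
sigma = math.sqrt(math.log(1.0 / delta)) / eps

# The real database is 1^n, so the released value concentrates near t = n;
# the posterior then collapses almost entirely onto (1, 0^{n-1}):
p1 = posterior_pr_one(t=n, sigma=sigma)
# By contrast, in Game 1 (entry 1 suppressed) both candidates induce the same
# transcript distribution, so the adversary's posterior there stays uniform.
```

Because σ is fixed while t grows with n, the log-odds (2n − 1)/(2σ²) grow linearly in n, matching the "exponentially far from uniform" behavior in the proof.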
