Differential Privacy versus Quantitative Information Flow


Authors: Mário S. Alvim (INRIA and LIX, École Polytechnique), Konstantinos Chatzikokolakis (Technical University of Eindhoven), Pierpaolo Degano (Università di Pisa), Catuscia Palamidessi (INRIA and LIX, École Polytechnique)

Differential Privacy versus Quantitative Information Flow⋆

Mário S. Alvim¹, Konstantinos Chatzikokolakis², Pierpaolo Degano³, and Catuscia Palamidessi¹

¹ INRIA and LIX, École Polytechnique, France.
² Technical University of Eindhoven, The Netherlands.
³ Dipartimento di Informatica, Università di Pisa, Italy.

Abstract. Differential privacy is a notion of privacy that has become very popular in the database community. Roughly, the idea is that a randomized query mechanism provides sufficient privacy protection if the ratio between the probabilities that two different entries originate a certain answer is bounded by e^ε. In the fields of anonymity and information flow there is a similar concern for controlling information leakage, i.e. limiting the possibility of inferring the secret information from the observables. In recent years, researchers have proposed to quantify the leakage in terms of the information-theoretic notion of mutual information. There are two main approaches that fall in this category: one based on Shannon entropy, and one based on Rényi's min-entropy. The latter has a connection with the so-called Bayes risk, which expresses the probability of guessing the secret. In this paper, we show how to model the query system in terms of an information-theoretic channel, and we compare the notion of differential privacy with that of mutual information. We show that the notion of differential privacy is strictly stronger, in the sense that it implies a bound on the mutual information, but not vice versa.

1 Introduction

The growth of information technology raises significant concerns about the vulnerability of sensitive information.
The possibility of collecting and storing data in large amounts and the availability of powerful data processing techniques open the way to the threat of inferring private and secret information, to an extent that fully justifies the users' worries.

1.1 Differential privacy

The area of statistical databases has been, naturally, one of the first communities to consider the issues related to the protection of information. Already some decades ago, Dalenius [10] proposed a famous "ad omnia" privacy desideratum: nothing about an individual should be learnable from the database that cannot be learned without access to the database.

⋆ This work has been partially supported by the project ANR-09-BLAN-0169-01 PANDA and by the INRIA DRI Équipe Associée PRINTEMPS.

Dalenius' property, however, is too strong to be useful in practice: it has been shown by Dwork [11] that no useful database can provide it. As a replacement, Dwork has proposed the notion of differential privacy, which has had an extraordinary impact in the community. Intuitively, this notion is based on the idea that the presence or absence of an item in the database should not change in a significant way the probability of obtaining a certain answer for a given query [11–14]. In order to explain the concept more precisely, let us consider the typical scenario: we have databases whose entries are values (possibly tuples) taken from a given universe. A database can be queried by users who have honest purposes, but also by attackers trying to infer secret or private data. In order to control the leakage of secret information, the curator uses some randomized mechanism, which causes a certain lack of precision in the answers.
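For a mechanism with finitely many possible answers, this intuition can be checked directly by comparing, answer by answer, the output distributions induced by two databases that differ in one entry. A minimal sketch in Python (the helper name `dp_parameter` and the two distributions are our own illustrative choices, not from the paper):

```python
import math

def dp_parameter(p1, p2):
    """Smallest eps such that p1[k] <= e^eps * p2[k] and vice versa
    for every output k, i.e. the worst-case log-ratio of the two
    output distributions (math.inf if exactly one probability is 0)."""
    eps = 0.0
    for a, b in zip(p1, p2):
        if a > 0 and b > 0:
            eps = max(eps, abs(math.log(a / b)))
        elif a != b:  # one probability is zero, the other is not
            return math.inf
    return eps

# Hypothetical output distributions of K(D) and K(D') for two
# databases differing in one entry (three possible answers).
p_D      = [0.50, 0.30, 0.20]
p_Dprime = [0.40, 0.35, 0.25]

eps = dp_parameter(p_D, p_Dprime)  # the mechanism is eps-DP on this pair
```

The returned value is the smallest ε for which the ratio condition holds on this particular pair of databases; the actual privacy parameter is the worst case over all adjacent pairs.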
Clearly, there is a trade-off between the need to obtain answers as precise as possible for legitimate use, and the need to introduce some fuzziness for the purpose of confusing the attacker. Let K be the randomized function that provides the answers to the queries. We say that K provides ε-differential privacy if for all databases D and D′, such that one is a subset of the other and the larger contains a single additional entry, and for all S ⊆ Range(K), the ratio between the probability that the result of K(D) is in S and the probability that the result of K(D′) is in S is at most e^ε.

Dwork has also studied sufficient conditions for a randomized function K to implement a mechanism satisfying ε-differential privacy. It suffices to consider a Laplacian distribution with variance depending on ε, and mean equal to the correct answer [13]. This technique is widely used in practice.

1.2 Quantitative information flow and anonymity

The problem of preventing the leakage of secret information has been a pressing concern also in the area of software systems, and has motivated a very active line of research called secure information flow. Similarly to the case of privacy, also in this field the initial goal was ambitious: to ensure non-interference, which means complete absence of leakage. But, as with Dalenius' notion of privacy, non-interference is too strong to be attainable in practice, and the community has started exploring weaker notions. Some of the most popular approaches are the quantitative ones, based on information theory; see for instance [6–8, 15–17, 21]. Independently, the field of anonymity, which is concerned with the protection of the identity of agents performing certain tasks, has evolved towards similar approaches.
In the case of anonymity it is even more important to consider a quantitative formulation, because anonymity protocols typically use randomization to obfuscate the link between the culprit (i.e. the agent who performs the task) and the observable effects of the task. The first notion of anonymity, due to Chaum [5], required that the observation should not change the probability of an individual being the culprit. In other words, the protocol should guarantee that the observation does not increase the chances of learning the identity of the culprit. This is very similar to Dalenius' notion of privacy, and equally unattainable in practice (at least, in the majority of real situations). Also in this case, researchers in the area have started considering weaker notions based on information theory; see for instance [3, 18, 22].

If we abstract from the kind of secrets and observables, anonymity and information flow are similar problems: there is some information that we want to keep secret, there is a system that produces some kind of observable information depending on the secret one, and we want to prevent as much as possible an attacker from inferring the secrets from the observables. It is therefore not surprising that the foundations of the two fields have converged towards the same information-theoretic approaches. The majority of these approaches are based on the idea of representing the system (or protocol) as an information-theoretic channel taking the secrets as input (X) and producing the observables as output (Y). The entropy of X, H(X), represents the converse of the a priori vulnerability, i.e. the chance of the attacker to find out the secret. Similarly, the conditional entropy of X given Y, H(X|Y), represents the converse of the a posteriori vulnerability, i.e.
the chance of the attacker to find out the secret after having observed the output. The mutual information between X and Y, I(X;Y) = H(X) − H(X|Y), represents the gain for the adversary provided by the observation, and is taken as the definition of the information leakage of the system. Sometimes we may want to abstract from the distribution of X, in which case we can use the capacity of the channel, defined as the maximum of I(X;Y) over all possible distributions on X. This represents the worst case for leakage.

The various approaches in the literature differ, mainly, in the notion of entropy. Such a notion is related to the kind of attackers we want to model, and to how we measure their success (see [15] for an illuminating discussion of this relation). Shannon entropy [20], on which most of the approaches are based, represents an adversary who tries to find out the secret x by asking questions of the form "does x belong to set S?". Shannon entropy is precisely the average number of questions necessary to find out the exact value of x with an optimal strategy (i.e. an optimal choice of the S's). The other most popular notion of entropy (in this area) is Rényi's min-entropy [19]. The corresponding notion of attack is a single try of the form "is x equal to v?". Rényi's min-entropy is precisely the negative log of the probability of guessing the true value with the optimal strategy, which consists, of course, in selecting the v with the highest probability. Approaches based on this notion include [21] and [2].

It is worth noting that, while the Rényi min-entropy of X, H∞(X), represents the a priori probability of success (of the single-try attack), the Rényi min conditional entropy of X given Y, H∞(X|Y), represents the a posteriori probability of success¹.
This a posteriori probability is the converse of the Bayes risk [9], which has also been used as a measure of leakage [1, 4].

1.3 Goal of the paper

From a mathematical point of view, privacy presents many similarities with information flow and anonymity. The private data of the entry constitute the secret, the answer to the query gives the observation, and the goal is to prevent as much as possible the inference of the secret from the observable. Differential privacy can be seen as a quantitative definition of the degree of leakage. The main goal of this paper is to explore the relation with the alternative definitions based on information theory, with the purpose of getting a better understanding of the notion of differential privacy, of the specific problems related to privacy, and of the models of attack used to formalize the notion of privacy, in relation to those used for anonymity and information flow.

1.4 Contribution

The contribution of this paper is as follows:

– We show how the problem of privacy can be formulated in an information-theoretic setting. More precisely, we show how the answer function K can be associated to an information-theoretic channel.
– We prove that ε-differential privacy implies a bound on the Shannon mutual information of the channel, and that this bound approaches 0 as ε approaches 0. The same holds for Rényi min mutual information.
– We show that the converse of the above point does not hold, i.e. that Shannon and Rényi min mutual information (and also the corresponding capacities) can approach 0 while the ε parameter of differential privacy approaches infinity.

1.5 Plan of the paper

The next section introduces some necessary background notions. Section 3 proposes an information-theoretic view of database query systems.
Section 4 shows the main results of the paper, namely that differential privacy implies a bound on Shannon and Rényi min mutual information, but not vice versa. Section 5 concludes and presents some ideas for future work. The proofs of the results are in the appendix. The appendix will not be included in the proceedings version (for reasons of space), but the proofs will be made available online.

¹ We should mention that Rényi did not define the conditional version of min-entropy, and that there have been various different proposals in the literature for this notion. We use here the one proposed by Smith in [21].

2 Preliminaries

2.1 Differential privacy

We assume a fixed finite universe U in which the entries of databases may range. The concept of differential privacy is tightly connected to the concept of adjacent (or neighbor) databases.

Definition 1 ([13]). A pair of databases (D′, D′′) is considered adjacent (or neighbors) if one is a proper subset of the other and the larger database contains just one additional entry.

Dwork's definition of differential privacy is the following:

Definition 2 ([11]). A randomized function K satisfies ε-differential privacy if for all pairs of adjacent databases D′ and D′′, and all S ⊆ Range(K),

    Pr[K(D′) ∈ S] ≤ e^ε × Pr[K(D′′) ∈ S]    (1)

2.2 Information theory and interpretation in terms of attacks

In the following, X, Y denote two discrete random variables with carriers X = {x1, ..., xn}, Y = {y1, ..., ym}, and probability distributions pX(·), pY(·), respectively. An information-theoretic channel is constituted by an input X, an output Y, and the matrix of conditional probabilities pY|X(·|·), where pY|X(y|x) represents the probability that Y is y given that X is x.
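Concretely, such a channel can be represented as a row-stochastic matrix; the following sketch (with purely illustrative numbers) also computes the output distribution induced by a prior on the inputs:

```python
# A channel is a row-stochastic matrix: rows are secrets x_i, columns
# are observables y_j, and entry [i][j] is p(y_j | x_i).
channel = [
    [0.7, 0.2, 0.1],   # p(. | x1)
    [0.1, 0.8, 0.1],   # p(. | x2)
]
prior = [0.5, 0.5]     # input distribution p(x)

# Every row of the matrix must sum to 1.
assert all(abs(sum(row) - 1.0) < 1e-9 for row in channel)

# Joint distribution p(x, y) = p(x) * p(y|x), and output
# distribution p(y) obtained by marginalizing over x.
joint = [[px * pyx for pyx in row] for px, row in zip(prior, channel)]
p_y = [sum(col) for col in zip(*joint)]
```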
We will use X ∧ Y to represent the random variable with carrier X × Y and joint probability distribution pX∧Y(x, y) = pX(x) · pY|X(y|x). We shall omit the subscripts on the probabilities when they are clear from the context.

2.3 Shannon entropy

The Shannon entropy of X is defined as

    H(X) = − Σ_{x∈X} p(x) log p(x)

The minimum value H(X) = 0 is obtained when p(·) is concentrated on a single value (i.e. when p(·) is a Dirac delta). The maximum value H(X) = log |X| is obtained when p(·) is the uniform distribution. Usually the base of the logarithm is set to 2 and, correspondingly, the entropy is measured in bits.

The conditional entropy of X given Y is

    H(X|Y) = Σ_{y∈Y} p(y) H(X|Y = y)

where

    H(X|Y = y) = − Σ_{x∈X} p(x|y) log p(x|y)

One can prove that 0 ≤ H(X|Y) ≤ H(X). The minimum value, 0, is obtained when X is completely determined by Y. The maximum value H(X) is obtained when Y reveals no information about X, i.e. when X and Y are independent.

The mutual information between X and Y is defined as

    I(X;Y) = H(X) − H(X|Y)    (2)

and it measures the amount of information about X that we gain by observing Y. It can be shown that I(X;Y) = I(Y;X) and 0 ≤ I(X;Y) ≤ H(X). Shannon capacity is defined as the maximum mutual information over all possible input distributions:

    C = max_{pX(·)} I(X;Y)

2.4 Rényi min-entropy

In [19], Rényi introduced a one-parameter family of entropy measures, intended as a generalization of Shannon entropy. The Rényi entropy of order α (α > 0, α ≠ 1) of a random variable X is defined as

    H_α(X) = (1 / (1 − α)) log Σ_{x∈X} p(x)^α

We are particularly interested in the limit of H_α as α approaches ∞. This is called min-entropy.
It can be proven that

    H∞(X) ≝ lim_{α→∞} H_α(X) = − log max_{x∈X} p(x)

Rényi also defined the α-generalization of other information-theoretic notions, like the Kullback-Leibler divergence. However, he did not define the α-generalization of the conditional entropy, and there is no agreement on what it should be. For the case α = ∞, we adopt here the definition of conditional entropy proposed by Smith in [21]:

    H∞(X|Y) = − log Σ_{y∈Y} p(y) max_{x∈X} p(x|y)    (3)

Analogously to (2), we can define the min mutual information I∞ as H∞(X) − H∞(X|Y), and the capacity C∞ as max_{pX(·)} I∞(X;Y). It has been proven in [2] that C∞ is obtained at the uniform input distribution, and that it is equal to the logarithm of the sum of the maxima of each column in the channel matrix:

    C∞ = log Σ_{y∈Y} max_{x∈X} p(y|x)

3 An information-theoretic model of privacy

In this section we show how to represent a database query system (of the kind considered in differential privacy) in terms of an information-theoretic channel. According to [11] and [13], differential privacy can be implemented by adding some appropriately chosen random noise to the answer x = f(D), where f is the query function and D is the database. The function can operate on the entire database at once, and even though the query may be composed of a chain of sub-queries, we assume that subsequent sub-queries depend only on the true answers to previous sub-queries. Under this constraint, no matter how complex the query is, it is still a function f of the database D. The scenario where subsequent sub-queries can depend on the reported answers to previous queries corresponds to adaptive adversaries [11], and is not considered in this paper.

After the true answer x to the query is obtained from D, some noise is introduced in order to produce a reported answer y.
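The quantities recalled in Section 2 can be computed mechanically once a finite channel matrix is given. A small Python sketch on two illustrative toy channels (the function names are ours, not the paper's):

```python
import math

def mutual_information(prior, channel):
    """Shannon mutual information I(X;Y) = H(Y) - H(Y|X), in bits,
    for a finite channel given as a row-stochastic matrix p(y|x)."""
    p_y = [sum(px * row[j] for px, row in zip(prior, channel))
           for j in range(len(channel[0]))]
    h_y = -sum(p * math.log2(p) for p in p_y if p > 0)
    h_y_x = -sum(px * sum(p * math.log2(p) for p in row if p > 0)
                 for px, row in zip(prior, channel))
    return h_y - h_y_x

def min_capacity(channel):
    """C_inf: log of the sum of the column maxima of the channel
    matrix (the characterization proved in [2])."""
    return math.log2(sum(max(col) for col in zip(*channel)))

noiseless = [[1.0, 0.0], [0.0, 1.0]]   # Y determines X completely
useless   = [[0.5, 0.5], [0.5, 0.5]]   # Y independent of X
```

On the noiseless channel both measures give 1 bit (the whole secret leaks); on the useless channel both give 0.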
The reported answer can be seen as a random variable Y depending on the random variable X corresponding to the real answer, and the two random variables are related by a conditional probability distribution pY|X(·|·). The conditional probabilities pY|X(y|x) constitute the matrix of an information-theoretic channel from X to Y. Figure 1 shows the scheme of implementation of a differential privacy scheme.

[Fig. 1. The channel corresponding to a differential privacy scheme: the database D ∪ {r} is mapped by the query to the true answer x = f(D ∪ r), which is fed to the channel pY|X(·|·) to produce the randomized reported answer y = K(f(D ∪ r)).]

In [11] it has been proved that a way to define the values of pY|X(·|·) so as to ensure ε-differential privacy is by using the Laplace distribution:

    P((Y = y) | (X = x), Δf/ε) = (ε / 2Δf) e^{−|y−x| ε/Δf}    (4)

where Δf is the L1-sensitivity of f, defined as²

    Δf = max_{D′,D′′ adjacent} |f(D′) − f(D′′)|

² We give here the definition for the case in which the range of f is ℝ. In the more general case in which the range is ℝⁿ we should replace |f(D′) − f(D′′)| by the 1-norm of the vector f(D′) − f(D′′).

4 Relation between differential privacy and mutual information

In this section we investigate the relation between differential privacy and information-theoretic notions. We start by considering an equivalent definition of differential privacy, easier to handle for our purposes.

4.1 Testing single elements

Definition 2 considers tests which check whether the result of K(D) belongs to a certain set or not. We prefer to simplify this definition by considering only tests over single elements:

Definition 3.
A randomized function K gives δ-differential privacy if for all pairs of adjacent datasets D′ and D′′, and all k ∈ Range(K),

    Pr[K(D′) = k] ≤ e^δ × Pr[K(D′′) = k]    (5)

The following result shows that our definition of differential privacy is equivalent to the classical one.

Theorem 1. A function K gives ε-differential privacy iff it gives δ-differential privacy, with ε = δ.

4.2 Databases with the same number of entries and differing in at most one entry

Consider two databases D′ and D′′ that have the same number of entries and differ in at most one entry, as in Figure 2. Let D be the common part shared by both databases, and let r′ and r′′ be the rows in which they differ, namely D′ = D ∪ {r′} and D′′ = D ∪ {r′′}.

[Fig. 2. Two databases differing in exactly one entry: D′ = D ∪ {r′} and D′′ = D ∪ {r′′}.]

We prove that δ-differential privacy imposes also a bound on the comparison between databases with the same number of entries which differ in the value of only one entry.

Lemma 1. Let K be a function that gives δ-differential privacy for all pairs of adjacent databases. Given two databases D′ and D′′ that have the same number of entries and differ in the value of at most one entry, then:

    Pr[K(D′) = k] ≤ e^{2δ} × Pr[K(D′′) = k]

4.3 Shannon mutual information

We now prove that δ-differential privacy imposes a bound on Shannon mutual information, and that this bound approaches 0 as the parameter δ approaches 0.

Theorem 2. If a randomized function K gives δ-differential privacy according to Definition 3, then for every result x* of the function f the Shannon mutual information between the true answers X (i.e. the results of f) and the reported answers Y (i.e.
the results of K) is bounded by:

    I(X;Y) ≤ (e^{2δ} + e^{−2δ}) δ log(e) + (e^{2δ} − e^{−2δ}) Σ_y p(y|x*) log(p(y|x*))

It is easy to see that the expression which bounds I(X;Y) from above converges to 0 as δ approaches 0.

The converse of Theorem 2 does not hold. One reason is that mutual information is sensitive to the values of the input distribution, while differential privacy is not. The next example illustrates this point.

Example 1. Let n be the number of elements of the universe, and m the cardinality of the set of possible answers of f. Assume that p(x1) = α and p(xi) = (1−α)/(n−1) for 2 ≤ i ≤ n. Let p(y1|x1) = β, p(yj|x1) = (1−β)/(m−1) for 2 ≤ j ≤ m, and p(yj|xi) = 1/m otherwise. This channel is represented in Table 1(a). It is easy to see that the Shannon mutual information approaches 0 as α approaches 0, independently of the value of β. Differential privacy, however, depends only on the value of β; more precisely, the parameter of differential privacy is

    max { ln(1/(mβ)), ln(mβ), ln((m−1)/(m(1−β))), ln(m(1−β)/(m−1)) }

and it is easy to see that this parameter is unbounded and goes to infinity as β approaches 0.

The reasoning in the counterexample above is no longer valid if we consider capacity instead of mutual information. However, there is another reason why the converse of Theorem 2 does not hold, and this remains the case also if we consider capacity. The situation is illustrated by the following example.

Example 2. Let n be the number of elements of the universe, and m the cardinality of the set of possible answers of f. Assume that p(yi|xi) = β and p(yi|xj) = (1−β)/(m−1) for i ≠ j. This channel is represented in Table 1(b).
It is easy to see that the Shannon capacity is

    C = log m − (1−β) log(m−1) + β log β + (1−β) log(1−β)

and that C approaches 0 as β approaches 0 and m becomes large. Differential privacy, however, goes in the other direction when β approaches 0, and it is not very sensitive to the value of m. More precisely, the parameter of differential privacy is

    max { ln((1−β)/(β(m−1))), ln(β(m−1)/(1−β)) }

and it is easy to see that this parameter is unbounded and goes to infinity as β approaches 0, independently of the value of m.

(a) Example 1 (with input distribution p(x1) = α, p(xi) = (1−α)/(n−1) for i ≥ 2):

            y1             y2             ...   ym
    x1      β              (1−β)/(m−1)    ...   (1−β)/(m−1)
    x2      1/m            1/m            ...   1/m
    ...     ...            ...            ...   ...
    xn      1/m            1/m            ...   1/m

(b) Example 2:

            y1             y2             ...   ym
    x1      β              (1−β)/(m−1)    ...   (1−β)/(m−1)
    x2      (1−β)/(m−1)    β              ...   (1−β)/(m−1)
    ...     ...            ...            ...   ...
    xn      (1−β)/(m−1)    (1−β)/(m−1)    ...   β

Table 1. The channels of Examples 1 and 2

4.4 Rényi min mutual information

We now show that a result analogous to that of Section 4.3 holds also in the case of Rényi min-entropy.

Theorem 3. If a randomized function K gives δ-differential privacy according to Definition 3, then the Rényi min mutual information between the true answer of the function X and the reported answer Y is bounded by

    I∞(X;Y) ≤ 2δ log e.

The converse of Theorem 3 does not hold, not even if we consider capacity instead of mutual information. It is easy to prove, in fact, that Examples 1 and 2 lead to counterexamples also in the case of Rényi min mutual information and capacity.

5 Conclusion and future work

In this paper we have shown that the problem of privacy in statistical databases can be formulated in information-theoretic terms, in a way analogous to what has been done for information flow and anonymity: the database query system can be seen as a noisy channel, in the information-theoretic sense.
Then we have considered Dwork's notion of differential privacy, and we have shown that it is strictly stronger than requiring the channel to have low capacity, both for the cases of Shannon and Rényi min-entropy. It is natural to wonder, then, whether a weaker notion would give enough privacy guarantees. As future work, we intend to investigate this question. We first need to understand, of course, what are the constraints that could be relaxed in the notion of differential privacy. To this aim, Example 2 is quite interesting: whenever we get an answer y, there are n−1 possible inputs (entries) which are equally likely to have generated that answer, and one input x that is much less likely than the others (p(x|y) = α, where α is a very small value). The existence of the latter seems quite harmless, yet it is exactly that entry that causes differential privacy to fail (in the sense that its parameter is unbounded). The notion of Rényi min capacity seems a plausible candidate for a notion of privacy: its relation with the Bayes risk ensures that a bound on C∞ can be seen as a bound on the probability of guessing the right value of x (given the observable). In some scenarios, this may be exactly what we want.

Acknowledgement

We wish to thank Daniel Le Métayer for having pointed out to us the notion of differential privacy, and brought to our attention the possible relation with quantitative information flow.

References

1. Christelle Braun, Konstantinos Chatzikokolakis, and Catuscia Palamidessi. Compositional methods for information-hiding. In Proc. of FOSSACS, volume 4962 of LNCS, pages 443–457. Springer, 2008.
2. Christelle Braun, Konstantinos Chatzikokolakis, and Catuscia Palamidessi. Quantitative notions of leakage for one-try attacks. In Proc. of MFPS, volume 249 of ENTCS, pages 75–91. Elsevier, 2009.
3.
Konstantinos Chatzikokolakis, Catuscia Palamidessi, and Prakash Panangaden. Anonymity protocols as noisy channels. Inf. and Comp., 206(2–4):378–401, 2008.
4. Konstantinos Chatzikokolakis, Catuscia Palamidessi, and Prakash Panangaden. On the Bayes risk in information-hiding protocols. J. of Comp. Security, 16(5):531–571, 2008.
5. David Chaum. The dining cryptographers problem: Unconditional sender and recipient untraceability. Journal of Cryptology, 1:65–75, 1988.
6. David Clark, Sebastian Hunt, and Pasquale Malacaria. Quantitative analysis of the leakage of confidential data. In Proc. of QAPL, volume 59(3) of Electr. Notes Theor. Comput. Sci., pages 238–251. Elsevier, 2001.
7. David Clark, Sebastian Hunt, and Pasquale Malacaria. Quantitative information flow, relations and polymorphic types. J. of Logic and Computation, 18(2):181–199, 2005.
8. Michael R. Clarkson, Andrew C. Myers, and Fred B. Schneider. Belief in information flow. J. of Comp. Security, 17(5):655–701, 2009.
9. Thomas M. Cover and Joy A. Thomas. Elements of Information Theory. J. Wiley & Sons, Inc., second edition, 2006.
10. Tore Dalenius. Towards a methodology for statistical disclosure control. Statistik Tidskrift, 15:429–444, 1977.
11. Cynthia Dwork. Differential privacy. In Automata, Languages and Programming, 33rd Int. Colloquium, ICALP 2006, Venice, Italy, July 10-14, 2006, Proc., Part II, volume 4052 of LNCS, pages 1–12. Springer, 2006.
12. Cynthia Dwork. Differential privacy in new settings. In Proc. of the Twenty-First Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2010, Austin, Texas, USA, January 17-19, 2010, pages 174–183. SIAM, 2010.
13. Cynthia Dwork. A firm foundation for private data analysis. Communications of the ACM, 2010. To appear.
14. Cynthia Dwork and Jing Lei. Differential privacy and robust statistics. In Proc.
of the 41st Annual ACM Symposium on Theory of Computing, STOC 2009, Bethesda, MD, USA, May 31 - June 2, 2009, pages 371–380. ACM, 2009.
15. Boris Köpf and David A. Basin. An information-theoretic model for adaptive side-channel attacks. In Proc. of CCS, pages 286–296. ACM, 2007.
16. Pasquale Malacaria. Assessing security threats of looping constructs. In Proc. of POPL, pages 225–235. ACM, 2007.
17. Pasquale Malacaria and Han Chen. Lagrange multipliers and maximum information leakage in different observational models. In Proc. of PLAS, pages 135–146. ACM, 2008.
18. Ira S. Moskowitz, Richard E. Newman, Daniel P. Crepeau, and Allen R. Miller. Covert channels and anonymizing networks. In Proc. of PES, pages 79–88. ACM, 2003.
19. Alfréd Rényi. On measures of entropy and information. In Proc. of the 4th Berkeley Symposium on Mathematics, Statistics, and Probability, pages 547–561, 1961.
20. Claude E. Shannon. A mathematical theory of communication. Bell System Technical Journal, 27:379–423, 623–656, 1948.
21. Geoffrey Smith. On the foundations of quantitative information flow. In Proc. of FOSSACS, volume 5504 of LNCS, pages 288–302. Springer, 2009.
22. Ye Zhu and Riccardo Bettati. Anonymity vs. information leakage in anonymity systems. In Proc. of ICDCS, pages 514–524. IEEE, 2005.

Appendix

Theorem 4 (Theorem 1 in the paper). A function K gives ε-differential privacy iff it gives δ-differential privacy, with ε = δ.

Proof. (⇒) Let k ∈ Range(K).
Then for all pairs of adjacent databases D′, D′′ we have

    Pr[K(D′) = k] = Pr[K(D′) ∈ {k}]          (taking S to be a singleton set)
                  ≤ e^ε Pr[K(D′′) ∈ {k}]     (by Definition 2)
                  = e^ε Pr[K(D′′) = k]

(⇐) Let S ⊆ Range(K). Then

    Pr[K(D′) ∈ S] = Σ_{k∈S} Pr[K(D′) = k]         (by union of elements)
                  ≤ Σ_{k∈S} e^δ Pr[K(D′′) = k]    (by Definition 3)
                  = e^δ Σ_{k∈S} Pr[K(D′′) = k]    (by distributivity)
                  = e^δ Pr[K(D′′) ∈ S]            (by union of elements)

Lemma 2 (Lemma 1 in the paper). Let K be a function that gives δ-differential privacy for all pairs of adjacent databases. Given two databases D′ and D′′ that have the same number of entries and differ in the value of at most one entry, then:

    Pr[K(D′) = k] ≤ e^{2δ} × Pr[K(D′′) = k]

Proof. Let us call D the common part that D′ and D′′ share, and let us call r′ and r′′ the entries in which they differ, in such a way that D′ = D ∪ {r′} and D′′ = D ∪ {r′′}. Then

    Pr[K(D ∪ {r′}) = k] ≤ e^δ × Pr[K(D) = k]                 (by Definition 3)
                        ≤ e^δ × e^δ × Pr[K(D ∪ {r′′}) = k]   (by Definition 3)
                        = e^{2δ} × Pr[K(D′′) = k]

Theorem 5 (Theorem 2 in the paper). If a randomized function K gives δ-differential privacy according to Definition 3, then for every result x* of the function f the Shannon mutual information between the true answers X (i.e. the results of f) and the reported answers Y (i.e. the results of K) is bounded by:

    I(X;Y) ≤ (e^{2δ} + e^{−2δ}) δ log(e) + (e^{2δ} − e^{−2δ}) Σ_y p(y|x*) log(p(y|x*))

Proof. Let us calculate the Shannon mutual information using the formula I(X;Y) = H(Y) − H(Y|X).

    H(Y) = − Σ_y p(y) log p(y)                                              (by definition)
         = − Σ_y (Σ_x p(x,y)) log(Σ_x p(x,y))                               (by probability laws)
         = − Σ_y (Σ_x p(x) p(y|x)) log(Σ_x p(x) p(y|x))                     (by probability laws)
         ≤ − Σ_y (Σ_x p(x) e^{−2δ} p(y|x*)) log(Σ_x p(x) e^{−2δ} p(y|x*))   (by Definition 3 and Lemma 1)
         = − Σ_y e^{−2δ} p(y|x*) (Σ_x p(x)) log(e^{−2δ} p(y|x*) Σ_x p(x))
         = − Σ_y e^{−2δ} p(y|x*) log(e^{−2δ} p(y|x*))                       (by probability laws)
         = δ e^{−2δ} log e − e^{−2δ} Σ_y p(y|x*) log p(y|x*)                (by distributivity)   (6)

    H(Y|X) = − Σ_x p(x) Σ_y p(y|x) log p(y|x)                          (by definition)
           ≥ − Σ_x p(x) Σ_y e^{2δ} p(y|x*) log(e^{2δ} p(y|x*))         (by Definition 3 and Lemma 1)
           = − (Σ_y e^{2δ} p(y|x*) log(e^{2δ} p(y|x*))) Σ_x p(x)       (by distributivity)
           = − Σ_y e^{2δ} p(y|x*) log(e^{2δ} p(y|x*))                  (by probability laws)
           = − δ e^{2δ} log e − e^{2δ} Σ_y p(y|x*) log p(y|x*)         (by distributivity)   (7)

    I(X;Y) = H(Y) − H(Y|X)                                             (by definition)
           ≤ δ e^{−2δ} log e − e^{−2δ} Σ_y p(y|x*) log p(y|x*)
             + δ e^{2δ} log e + e^{2δ} Σ_y p(y|x*) log p(y|x*)         (by (6) and (7))
           = (e^{2δ} + e^{−2δ}) δ log(e) + (e^{2δ} − e^{−2δ}) Σ_y p(y|x*) log(p(y|x*))   (by distributivity)

Theorem 6 (Theorem 3 in the paper). If a randomized function K gives δ-differential privacy according to Definition 3, then the Rényi min mutual information between the true answer of the function X and the reported answer Y is bounded by:

    I∞(X;Y) ≤ 2δ log e.

Proof. Let us calculate the Rényi min mutual information using the formula I∞(X;Y) = H∞(X) − H∞(X|Y).

    H∞(X) = − log max_x p(x)    (by definition)    (8)

    H∞(X|Y) = − log Σ_y p(y) max_x p(x|y)                 (by definition)
            = − log Σ_y max_x p(y) p(x|y)
            = − log Σ_y max_x p(x) p(y|x)                 (by probability laws)
            ≥ − log Σ_y max_x p(x) e^{2δ} p(y|x*)         (by Definition 3 and Lemma 1)
            = − log (e^{2δ} (max_x p(x)) Σ_y p(y|x*))
            = − log (e^{2δ} max_x p(x))                   (by probability laws)
            = − 2δ log e − log max_x p(x)                 (9)

    I∞(X;Y) = H∞(X) − H∞(X|Y)                             (by definition)
            ≤ − log max_x p(x) + 2δ log e + log max_x p(x)   (by (8) and (9))
            = 2δ log e
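The phenomenon behind Example 2 is easy to reproduce numerically: for small β the min-entropy capacity C∞ of the channel is close to 0, while the differential-privacy parameter (the worst-case log-ratio between entries in the same column, as computed in Example 2) is large. A sketch, with illustrative values of n, m and β (function names are ours):

```python
import math

def example2_channel(n, m, beta):
    """The channel of Example 2: p(y_i|x_i) = beta and
    p(y_i|x_j) = (1-beta)/(m-1) for i != j."""
    off = (1.0 - beta) / (m - 1)
    return [[beta if i == j else off for j in range(m)] for i in range(n)]

def min_capacity(channel):
    # C_inf = log of the sum of the column maxima (result from [2])
    return math.log2(sum(max(col) for col in zip(*channel)))

def dp_parameter(channel):
    # Worst-case log-ratio between entries of the same column,
    # i.e. the smallest delta compatible with Definition 3.
    return max(math.log(max(col) / min(col)) for col in zip(*channel))

ch = example2_channel(n=100, m=100, beta=1e-8)
cap = min_capacity(ch)    # nearly 0: the channel leaks almost nothing
eps = dp_parameter(ch)    # yet the differential-privacy parameter is large
```

Decreasing β further leaves cap essentially unchanged while eps grows without bound, which is exactly the separation stated in Section 4.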
