Hash Property and Coding Theorems for Sparse Matrices and Maximum-Likelihood Coding
The aim of this paper is to prove the achievability of several coding problems by using sparse matrices (the maximum column weight grows logarithmically in the block length) and maximum-likelihood (ML) coding. These problems are the Slepian-Wolf problem, the Gel'fand-Pinsker problem, the Wyner-Ziv problem, and the One-helps-one problem (source coding with partial side information at the decoder).
Authors: Jun Muramatsu, Shigeki Miyake
Index Terms: Shannon theory, hash functions, linear codes, sparse matrix, maximum-likelihood encoding/decoding, the Slepian-Wolf problem, the Gel'fand-Pinsker problem, the Wyner-Ziv problem, the One-helps-one problem

I. INTRODUCTION

The aim of this paper is to prove the achievability of several coding problems by using sparse matrices (the maximum column weight grows logarithmically in the block length) and maximum-likelihood (ML) coding^1, namely the Slepian-Wolf problem [39] (Fig. 1), the Gel'fand-Pinsker problem [13] (Fig. 2), the Wyner-Ziv problem [47] (Fig. 3), and the One-helps-one problem (source coding with partial side information at the decoder) [44][46] (Fig. 4). To prove these theorems, we first introduce the notion of a hash property for an ensemble of functions, where the functions are not assumed to be linear. This notion is a sufficient condition for the achievability of coding theorems. Next, we prove that an ensemble of $q$-ary sparse matrices, which is an extension of [21], satisfies the hash property.
Finally, based on the hash property, we prove that the rate of the codes can achieve the optimal rate. This implies that the rate of codes using sparse matrices and ML coding can achieve the optimal rate. It should be noted here that there is a practical approximation method of ML coding by using sparse matrices and the linear programming technique introduced by [11].

J. Muramatsu is with NTT Communication Science Laboratories, NTT Corporation, 2-4, Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0237, Japan (E-mail: pure@cslab.kecl.ntt.co.jp). S. Miyake is with NTT Network Innovation Laboratories, NTT Corporation, 1-1, Hikarinooka, Yokosuka-shi, Kanagawa 239-0847, Japan (E-mail: miyake.shigeki@lab.ntt.co.jp). This paper is submitted to IEEE Transactions on Information Theory and a part of this paper is submitted to the IEEE International Symposium on Information Theory (ISIT2008, ISIT2009).

^1 This operation is usually called ML decoding. We use the word 'coding' because this operation is also used in the construction of an encoder.

September 7, 2021 DRAFT

The contributions of this paper are summarized in the following.
• The notion of a hash property is introduced. It is an extension of the notion of a universal class of hash functions introduced in [7]. The single source coding problem is studied in [22, Section 14.2][17] by using the hash function. We prove that an ensemble of $q$-ary sparse matrices has a hash property, while a weak version of the hash property is proved implicitly in [26][2][10][33][34]. It should be noted that our definition of the hash property is also an extension of the definition of random bin coding introduced in [3], where the set of all sequences is partitioned at random.
On the other hand, random codebook (a set of codewords/representations) generation was introduced for the proofs of the original channel coding theorem [37] and the lossy source coding theorem [38]. Here it is proved that random bin coding and the partitioning determined by a randomly generated matrix can be applied to the original channel coding theorem and the lossy source coding theorem.
• The proof of the achievability of the Slepian-Wolf problem is demonstrated based on the hash property. It is an extension of [22, Section 14.2][17] and provides a new proof of [3][5][33]. By applying the theorem to the coding theorem of a channel with additive (symmetric) noise, it also provides a new proof of [26][2][10][33].
• The optimality of a code using sparse matrices and ML coding is proved for the Gel'fand-Pinsker problem. We prove the $q$-ary and asymmetric version of the theorem, while a binary and symmetric version is studied in [24]. It should be noted here that the column/row weight of the matrices used in [24] is constant with respect to the block length, while it grows logarithmically in our construction. The detailed difference from [24] is stated in Section V-B. As a corollary, we have the optimality of codes using sparse matrices for the coding problem of an arbitrary ($q$-ary and asymmetric) channel, while a symmetric channel is assumed in many of the channel coding theorems that use sparse matrices. The construction is based on the coset code presented in [33][29], which is different from that presented in [12][2]. When our theorem is applied to the ensemble of sparse matrices, our proof is simpler than that in [29].
• The optimality of a code using sparse matrices and ML coding is proved for the Wyner-Ziv problem.
We prove the version of the theorem for a $q$-ary biased source and non-additive side information, while a binary unbiased source and additive side information are assumed in [24]. As a corollary, we have the optimality of codes using sparse matrices for the lossy coding problem of an arbitrary ($q$-ary and biased) source and a distortion measure. In [25][36][23][27][14], a lossy code is proposed by using sparse matrices called low density generator matrices (LDGM), assuming an unbiased source and the Hamming distortion. The column/row weight of the matrices used in [24] is constant with respect to the block length, while it grows logarithmically in our construction. Lower bounds on the rate-distortion function are discussed in [8][19]. It should be noted that our construction of the codes is different from those presented in [25][36][23][24][27][14]. The detailed difference is stated in Section V-C. Our construction is based on the code presented in [28][30] and is similar to the codes presented in [42][43][49]. When our theorem is applied to the ensemble of sparse matrices, our proof is simpler than that in [28].
• The achievability of the One-helps-one problem is proved by using sparse matrices and ML coding.

Fig. 1. Slepian-Wolf problem: encoders $\varphi_X$, $\varphi_Y$ and a joint decoder $\varphi^{-1}$, with rates $R_X > H(X|Y)$, $R_Y > H(Y|X)$, and $R_X + R_Y > H(X,Y)$.

Fig. 2. Gel'fand-Pinsker problem: encoder $\varphi$ with state $Z$, channel $\mu_{Y|XZ}$, decoder $\varphi^{-1}$, and rate $R < I(W;Y) - I(W;Z)$.

Fig. 3. Wyner-Ziv problem: encoder $\varphi$ with rate $R > I(X;Y) - I(Y;Z)$, decoder $\varphi^{-1}$ with side information $Z$, and distortion $D > E_{XYZ}[\rho(X, f(Y,Z))]$.

II. DEFINITIONS AND NOTATIONS

Throughout this paper, we use the following definitions and notations. Column vectors and sequences are denoted in boldface. Let $A\mathbf{u}$ denote the value taken by a function $A : \mathcal{U}^n \to \mathcal{U}_A$ at $\mathbf{u} \in \mathcal{U}^n$, where $\mathcal{U}^n$ is the domain of the function.
It should be noted that $A$ may be nonlinear. When $A$ is a linear function expressed by an $l \times n$ matrix, we assume that $\mathcal{U} \equiv \mathrm{GF}(q)$ is a finite field and the range of the function is $\mathcal{U}_A \equiv \mathcal{U}^l$. It should be noted that this assumption is not essential for general (nonlinear) functions because the discussion is unchanged if $l \log |\mathcal{U}|$ is replaced by $\log |\mathcal{U}_A|$. For a set $\mathcal{A}$ of functions, let $\mathrm{Im}\mathcal{A}$ be defined as
$$\mathrm{Im}\mathcal{A} \equiv \bigcup_{A \in \mathcal{A}} \{A\mathbf{u} : \mathbf{u} \in \mathcal{U}^n\}.$$
The cardinality of a set $\mathcal{U}$ is denoted by $|\mathcal{U}|$, $\mathcal{U}^c$ denotes the complement of $\mathcal{U}$, and $\mathcal{U} \setminus \mathcal{V} \equiv \mathcal{U} \cap \mathcal{V}^c$ denotes the set difference. We define the sets $C_A(\mathbf{c})$ and $C_{AB}(\mathbf{c}, \mathbf{b})$ as
$$C_A(\mathbf{c}) \equiv \{\mathbf{u} : A\mathbf{u} = \mathbf{c}\}$$
$$C_{AB}(\mathbf{c}, \mathbf{b}) \equiv \{\mathbf{u} : A\mathbf{u} = \mathbf{c},\ B\mathbf{u} = \mathbf{b}\}.$$
In the context of linear codes, $C_A(\mathbf{c})$ is called a coset determined by $\mathbf{c}$.

Fig. 4. One-helps-one problem: encoders $\varphi_X$, $\varphi_Y$ with rates $R_X > H(X|Z)$, $R_Y > I(Y;Z)$, and a decoder $\varphi^{-1}$ reproducing $X$.

Let $p$ and $p'$ be probability distributions and let $q$ and $q'$ be conditional probability distributions. Then the entropy $H(p)$, the conditional entropy $H(q|p)$, the divergence $D(p \| p')$, and the conditional divergence $D(q \| q' | p)$ are defined as
$$H(p) \equiv \sum_u p(u) \log \frac{1}{p(u)}$$
$$H(q|p) \equiv \sum_{u,v} q(u|v) p(v) \log \frac{1}{q(u|v)}$$
$$D(p \| p') \equiv \sum_u p(u) \log \frac{p(u)}{p'(u)}$$
$$D(q \| q' | p) \equiv \sum_v p(v) \sum_u q(u|v) \log \frac{q(u|v)}{q'(u|v)},$$
where we assume base 2 for the logarithm when the subscript of $\log$ is omitted. Let $\mu_{UV}$ be the joint probability distribution of random variables $U$ and $V$. Let $\mu_U$ and $\mu_V$ be the respective marginal distributions and $\mu_{U|V}$ be the conditional probability distribution. Then the entropy $H(U)$, the conditional entropy $H(U|V)$, and the mutual information $I(U;V)$ of the random variables are defined as
$$H(U) \equiv H(\mu_U)$$
$$H(U|V) \equiv H(\mu_{U|V} | \mu_V)$$
$$I(U;V) \equiv H(U) - H(U|V).$$
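The information measures defined above translate directly into code. The following Python sketch (the function names are ours; distributions are represented as dictionaries over a finite alphabet) mirrors the definitions, using base-2 logarithms as in the paper.

```python
import math

def entropy(p):
    """Shannon entropy H(p) in bits (base-2 logarithm, as in the paper)."""
    return sum(pu * math.log2(1.0 / pu) for pu in p.values() if pu > 0)

def divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits; assumes q(u) > 0 wherever p(u) > 0."""
    return sum(pu * math.log2(pu / q[u]) for u, pu in p.items() if pu > 0)

def conditional_entropy(q_cond, p):
    """H(q|p) = sum_{u,v} q(u|v) p(v) log 1/q(u|v); q_cond[v] is a distribution over u."""
    return sum(p[v] * entropy(q_cond[v]) for v in p if p[v] > 0)
```

With these, $H(U|V)$ and $I(U;V)$ follow from the marginal and conditional distributions exactly as in the definitions above.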
A set of typical sequences $T_{U,\gamma}$ and a set of conditionally typical sequences $T_{U|V,\gamma}(\mathbf{v})$ are defined as
$$T_{U,\gamma} \equiv \{\mathbf{u} : D(\nu_{\mathbf{u}} \| \mu_U) < \gamma\}$$
$$T_{U|V,\gamma}(\mathbf{v}) \equiv \{\mathbf{u} : D(\nu_{\mathbf{u}|\mathbf{v}} \| \mu_{U|V} | \nu_{\mathbf{v}}) < \gamma\},$$
respectively, where $\nu_{\mathbf{u}}$ and $\nu_{\mathbf{u}|\mathbf{v}}$ are defined as
$$\nu_{\mathbf{u}}(u) \equiv \frac{|\{1 \le i \le n : u_i = u\}|}{n}$$
$$\nu_{\mathbf{u}|\mathbf{v}}(u|v) \equiv \frac{\nu_{\mathbf{u}\mathbf{v}}(u,v)}{\nu_{\mathbf{v}}(v)}.$$
We define $\chi(\cdot)$ as
$$\chi(a = b) \equiv \begin{cases} 1, & \text{if } a = b \\ 0, & \text{if } a \ne b \end{cases} \qquad \chi(a \ne b) \equiv \begin{cases} 1, & \text{if } a \ne b \\ 0, & \text{if } a = b. \end{cases}$$
Finally, for $\gamma, \gamma' > 0$, we define
$$\lambda_{\mathcal{U}} \equiv \frac{|\mathcal{U}| \log[n+1]}{n} \tag{1}$$
$$\zeta_{\mathcal{U}}(\gamma) \equiv \gamma - \sqrt{2\gamma} \log \frac{\sqrt{2\gamma}}{|\mathcal{U}|} \tag{2}$$
$$\zeta_{\mathcal{U}|\mathcal{V}}(\gamma'|\gamma) \equiv \gamma' - \sqrt{2\gamma'} \log \frac{\sqrt{2\gamma'}}{|\mathcal{U}||\mathcal{V}|} + \sqrt{2\gamma} \log |\mathcal{U}| \tag{3}$$
$$\eta_{\mathcal{U}}(\gamma) \equiv -\sqrt{2\gamma} \log \frac{\sqrt{2\gamma}}{|\mathcal{U}|} + \frac{|\mathcal{U}| \log[n+1]}{n} \tag{4}$$
$$\eta_{\mathcal{U}|\mathcal{V}}(\gamma'|\gamma) \equiv -\sqrt{2\gamma'} \log \frac{\sqrt{2\gamma'}}{|\mathcal{U}||\mathcal{V}|} + \sqrt{2\gamma} \log |\mathcal{U}| + \frac{|\mathcal{U}||\mathcal{V}| \log[n+1]}{n}. \tag{5}$$
It should be noted here that the product set $\mathcal{U} \times \mathcal{V}$ is denoted by $\mathcal{UV}$ when it appears in the subscript of these functions.

III. $(\boldsymbol{\alpha}, \boldsymbol{\beta})$-HASH PROPERTY

In this section, we introduce the notion of the $(\boldsymbol{\alpha}, \boldsymbol{\beta})$-hash property, which is a sufficient condition for coding theorems, where the linearity of functions is not assumed. The $(\boldsymbol{\alpha}, \boldsymbol{\beta})$-hash property of an ensemble of linear (sparse) matrices will be discussed in Section IV. In Section V, we provide coding theorems for the Slepian-Wolf problem, the Gel'fand-Pinsker problem, the Wyner-Ziv problem, and the One-helps-one problem. Before stating the formal definition, we explain the random coding arguments and two implications which introduce the intuition behind the hash property.

A. Two types of random coding

We review the random coding argument introduced by [37]. Most coding theorems are proved by using a combination of the following two types of random coding.
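The empirical distribution $\nu_{\mathbf{u}}$ and the membership test $D(\nu_{\mathbf{u}} \| \mu_U) < \gamma$ for $T_{U,\gamma}$ above can be sketched as follows (a toy illustration; the function names are ours):

```python
import math
from collections import Counter

def empirical(u):
    """Empirical distribution nu_u of a sequence u (a tuple over a finite alphabet)."""
    n = len(u)
    return {a: c / n for a, c in Counter(u).items()}

def divergence(p, q):
    """D(p || q) in bits; infinite if p puts mass where q does not."""
    d = 0.0
    for a, pa in p.items():
        if pa > 0:
            if q.get(a, 0.0) == 0.0:
                return math.inf
            d += pa * math.log2(pa / q[a])
    return d

def is_typical(u, mu, gamma):
    """Membership test for T_{U,gamma}: D(nu_u || mu_U) < gamma."""
    return divergence(empirical(u), mu) < gamma
```

The conditional test for $T_{U|V,\gamma}(\mathbf{v})$ is analogous, with the joint empirical distribution $\nu_{\mathbf{u}\mathbf{v}}$ in place of $\nu_{\mathbf{u}}$.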
Random codebook generation: In the proofs of the original channel coding theorem [37] and the lossy source coding theorem [38], a codebook (a set of codewords/representations) is randomly generated and shared by the encoder and the decoder. It should be noted that the randomly generated codebook represents a list of typical sequences. In the encoding step of channel coding, a message is mapped to one of the randomly generated codewords as a channel input. In the decoding step, we use the maximum-likelihood decoder, which guesses the most probable channel input from the channel output. In the encoding step of lossy source coding, we find a member of the randomly generated representations that satisfies the fidelity criterion with respect to a message, and then we let the index of this member be the codeword. In the decoding step, the codeword (index) is mapped to the reproduction. It should be noted that the encoder and the decoder have to share a table of large size (exponential in the block length) which indicates the correspondence between an index and a member of the randomly generated codewords/representations. The time complexity of the encoding and decoding steps of channel coding and of the encoding step of lossy source coding is exponential in the block length. These are obstacles to implementation.

Fig. 5. Properties connecting the number of bins and items (black dots, messages). (a) Collision-resistant property: every bin contains at most one item. (b) Saturating property: every bin contains at least one item. (c) Pigeonhole principle: there is at least one bin which contains two or more items.

Random partitioning (random bin coding): In the proof of the Slepian-Wolf theorem [3], the set of all sequences is partitioned at random and the partition is shared by the encoder and the decoder. In the encoding step, a pair of messages
are mapped independently to the indices of the bins which contain the messages. In the decoding step, we use the maximum-likelihood decoder, which guesses the most probable pair of messages. Random partitioning by the cosets determined by a randomly generated matrix can be considered a kind of random bin coding, where the syndrome corresponds to the index of a bin. This approach was introduced in [9] for the coding of a symmetric channel and applied to the ensemble of sparse matrices in [26][2][10]. This argument is also applied to the coding theorem for the Slepian-Wolf problem in [45][5][33]. It should be noted that the time complexity of the decoding step is exponential in the block length, but there are practical approximation methods using sparse matrices and the techniques introduced by [1][18][11]. By using a randomly generated matrix, the size of the tables shared by the encoder and the decoder is at most of square order with respect to the block length. One of the aims of introducing the hash property is to replace the random codebook generation by the random partitioning. In other words, it is a unification of these two random coding arguments. It is expected that the space and time complexity can be reduced compared to the random codebook generation.

B. Two implications of the hash property

We introduce the following two implications of the hash property, which connect the number of bins and messages (items) and are essential for coding by using the random partitioning. In Section III-D, these two properties are derived from the hash property by adjusting the number of bins, taking account of the number of sequences.

Collision-resistant property: A good code assigns a message to a codeword which is different from the codewords of other messages, where the loss (error probability) is as small as possible.
The collision-resistant property is one nature of the hash property. Figure 5 (a) represents the ideal situation of this property, where the black dots represent messages we want to distinguish. When the number of bins is greater than the number of black dots, we can find a good function that allocates the black dots to different bins. This is because the hash property tends to avoid collisions. It should be noted that for coding problems it is enough to satisfy this property for 'almost all (close to probability one)' black dots by letting the ratio [the number of black dots]/[the number of bins] go to zero. This property is used for the estimation of the decoding error probability of lossless source coding with a maximum-likelihood decoder. In this situation, the black dots correspond to the typical sequences.

Saturating property: To replace the random codebook generation by the random partitioning, we prepare a method for finding a typical sequence in each bin. The saturating property is the other nature of the hash property. Figure 5 (b) represents the ideal situation of this property. When the number of bins is smaller than the number of black dots, we can find a good function such that every bin has at least one black dot. This is because the hash property tends to avoid collisions. It should be noted that this property is different from the pigeonhole principle: there is at least one bin which includes two or more black dots. Figure 5 (c) represents an unusual situation, which does not contradict the pigeonhole principle, while the hash property tends to avoid this situation. It should be noted that for coding problems it is enough to satisfy this property for 'almost all (close to probability one)' bins by letting the ratio [the number of bins]/[the number of black dots] go to zero.
To find a typical sequence in each bin, we use the maximum-likelihood/minimum-divergence coding introduced in Section III-D. In this situation, the black dots correspond to the typical sequences.

C. Formal definition of the $(\boldsymbol{\alpha}, \boldsymbol{\beta})$-hash property

In this section, we introduce the formal definition of the hash property. In the proofs of the fixed-rate source coding theorem given in [3][5][17], it is proved implicitly that there is a probability distribution $p_A$ on a set of functions $A : \mathcal{U}^n \to \mathcal{U}^l$ such that
$$p_A(\{A : \exists \mathbf{u}' \in \mathcal{G} \setminus \{\mathbf{u}\},\ A\mathbf{u}' = A\mathbf{u}\}) \le \frac{|\mathcal{G}|}{|\mathcal{U}|^l} \tag{6}$$
for any $\mathbf{u} \in \mathcal{U}^n$, where
$$\mathcal{G} \equiv \{\mathbf{u}' : \mu(\mathbf{u}') \ge \mu(\mathbf{u}),\ \mathbf{u} \ne \mathbf{u}'\} \tag{7}$$
and $\mu$ is the probability distribution of a source or the probability distribution of the additive noise of a channel. In the proofs of the coding theorems for sparse matrices given in [2][10][26][33][34], it is proved implicitly that there are a probability distribution on a set of $l \times n$ sparse matrices and sequences $\boldsymbol{\alpha} \equiv \{\alpha(n)\}_{n=1}^\infty$ and $\boldsymbol{\beta} \equiv \{\beta(n)\}_{n=1}^\infty$ satisfying
$$\lim_{n\to\infty} \frac{\log \alpha(n)}{n} = 0 \qquad \lim_{n\to\infty} \beta(n) = 0$$
such that
$$p_A(\{A : \exists \mathbf{u}' \in \mathcal{G} \setminus \{\mathbf{u}\},\ A\mathbf{u}' = A\mathbf{u}\}) \le \frac{|\mathcal{G}| \alpha(n)}{|\mathcal{U}|^l} + \beta(n) \tag{8}$$
for any $\mathbf{u} \in \mathcal{U}^n$, where $\alpha(n)$ measures how the ensemble of $l \times n$ sparse matrices differs from the ensemble of all $l \times n$ matrices, and $\beta(n)$ measures the probability that the code determined by an $l \times n$ sparse matrix has low-weight codewords. It should be noted that the collision-resistant property can be derived from (6) and (8); this is shown in Section III-D. The aim of this paper is not only to unify the above results, but also to provide several coding theorems under general settings, such as an asymmetric channel for channel coding and a biased source for lossy source coding. To this end, we define an $(\boldsymbol{\alpha}, \boldsymbol{\beta})$-hash property in the following.
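The coset partitioning behind (8), in which a randomly generated matrix partitions $\mathcal{U}^n$ into bins indexed by the syndrome, can be sketched in Python over GF(2) (a brute-force toy illustration; the function names are ours):

```python
import numpy as np

def random_parity_matrix(l, n, rng):
    """A uniformly random l x n matrix over GF(2); its cosets partition {0,1}^n into bins."""
    return rng.integers(0, 2, size=(l, n), dtype=np.uint8)

def syndrome(A, u):
    """Bin index of u: the syndrome A u computed over GF(2)."""
    return tuple((A @ u) % 2)

def coset(A, c):
    """C_A(c) = {u : A u = c}, enumerated by brute force (exponential in n; toy only)."""
    n = A.shape[1]
    return [u for u in np.ndindex(*(2,) * n)
            if syndrome(A, np.array(u, dtype=np.uint8)) == tuple(c)]
```

Every sequence lands in exactly one bin, so the cosets $C_A(\mathbf{c})$ form a partition of $\{0,1\}^n$, which is the random bin coding structure discussed above.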
Definition 1: Let $\mathcal{A}$ be a set of functions $A : \mathcal{U}^n \to \mathcal{U}_A$ and assume that
$$\lim_{n\to\infty} \frac{1}{n} \log \frac{|\mathcal{U}_A|}{|\mathrm{Im}\mathcal{A}|} = 0. \tag{H1}$$
For a probability distribution^2 $p_A$ on $\mathcal{A}$, we call the pair $(\mathcal{A}, p_A)$ an ensemble^3. Then an ensemble $(\mathcal{A}, p_A)$ has an $(\boldsymbol{\alpha}_A, \boldsymbol{\beta}_A)$-hash property if there are two sequences^4 $\boldsymbol{\alpha}_A \equiv \{\alpha_A(n)\}_{n=1}^\infty$ and $\boldsymbol{\beta}_A \equiv \{\beta_A(n)\}_{n=1}^\infty$ such that
$$\lim_{n\to\infty} \alpha_A(n) = 1 \tag{H2}$$
$$\lim_{n\to\infty} \beta_A(n) = 0, \tag{H3}$$
and
$$\sum_{\substack{\mathbf{u} \in \mathcal{T} \\ \mathbf{u}' \in \mathcal{T}'}} p_A(\{A : A\mathbf{u} = A\mathbf{u}'\}) \le |\mathcal{T} \cap \mathcal{T}'| + \frac{|\mathcal{T}||\mathcal{T}'| \alpha_A(n)}{|\mathrm{Im}\mathcal{A}|} + \min\{|\mathcal{T}|, |\mathcal{T}'|\} \beta_A(n) \tag{H4}$$
for any $\mathcal{T}, \mathcal{T}' \subset \mathcal{U}^n$. Throughout this paper, we omit the dependence on $n$ of $\alpha_A$ and $\beta_A$ when $n$ is fixed.

It should be noted that we obtain (8) from (H4) by letting $\mathcal{T} \equiv \{\mathbf{u}\}$ and $\mathcal{T}' \equiv \mathcal{G}$, and that (6) is the case where $\alpha_A(n) \equiv 1$ and $\beta_A(n) \equiv 0$. On the right hand side of the inequality (H4), the first term corresponds to the sum of $p_A(\{A : A\mathbf{u} = A\mathbf{u}\}) = 1$ over all $\mathbf{u} \in \mathcal{T} \cap \mathcal{T}'$, the second term bounds the sum of the probabilities $p_A(\{A : A\mathbf{u} = A\mathbf{u}'\})$ which are approximately $1/|\mathrm{Im}\mathcal{A}|$ for $\mathbf{u} \ne \mathbf{u}'$, and the third term bounds the sum of the probabilities $p_A(\{A : A\mathbf{u} = A\mathbf{u}'\})$ far greater than $1/|\mathrm{Im}\mathcal{A}|$ for $\mathbf{u} \ne \mathbf{u}'$. This intuition is explained in Section IV for the ensemble of matrices. In the following, we present two examples of ensembles that have a hash property.

Example 1: Our terminology 'hash' is derived from a universal class of hash functions introduced in [7]. We call a set $\mathcal{A}$ of functions $A : \mathcal{U}^n \to \mathcal{U}_A$ a universal class of hash functions if
$$|\{A : A\mathbf{u} = A\mathbf{u}'\}| \le \frac{|\mathcal{A}|}{|\mathcal{U}_A|}$$
for any $\mathbf{u} \ne \mathbf{u}'$. For example, the set of all functions on $\mathcal{U}^n$ and the set of all linear functions $A : \mathcal{U}^n \to \mathcal{U}^{l_A}$ are universal classes of hash functions (see [7]).
Furthermore, for $\mathcal{U}^n \equiv \mathrm{GF}(2^n)$, the set
$$\mathcal{A} \equiv \{A : A\mathbf{u} \equiv [\text{the first } l_A \text{ bits of } a\mathbf{u}]\}_{a \in \mathrm{GF}(2^n)}$$
is a universal class of hash functions, where $a\mathbf{u}$ is the multiplication of the two elements $a, \mathbf{u} \in \mathrm{GF}(2^n)$. It should be noted that every example above satisfies $\mathrm{Im}\mathcal{A} = \mathcal{U}_A$. When $\mathcal{A}$ is a universal class of hash functions and $p_A$ is the uniform distribution on $\mathcal{A}$, we have
$$\sum_{\substack{\mathbf{u} \in \mathcal{T} \\ \mathbf{u}' \in \mathcal{T}'}} p_A(\{A : A\mathbf{u} = A\mathbf{u}'\}) \le |\mathcal{T} \cap \mathcal{T}'| + \frac{|\mathcal{T}||\mathcal{T}'|}{|\mathrm{Im}\mathcal{A}|}.$$
This implies that $(\mathcal{A}, p_A)$ has a $(\mathbf{1}, \mathbf{0})$-hash property, where $\mathbf{1}(n) \equiv 1$ and $\mathbf{0}(n) \equiv 0$ for every $n$.

^2 It should be noted that $p_A$ does not depend on a particular function $A$. Strictly speaking, the subscript $A$ of $p$ represents the random variable of a function. We use this ambiguous notation when $A$ appears in the subscript of $p$ because random variables are always denoted in Roman letters.
^3 In the standard definition, an ensemble is defined as a set of functions, and a uniform distribution is assumed on this set. It should be noted that in this paper an ensemble is defined by the probability distribution on a set of functions.
^4 It should be noted that $\boldsymbol{\alpha}_A$ and $\boldsymbol{\beta}_A$ do not depend on a particular function $A$ but may depend on the ensemble $(\mathcal{A}, p_A)$. Strictly speaking, the subscript $A$ represents the random variable.

Example 2: In this example, we consider a set of linear functions $A : \mathcal{U}^n \to \mathcal{U}^{l_A}$. It was discussed in the above example that the uniform distribution on the set of all linear functions has a $(\mathbf{1}, \mathbf{0})$-hash property. The hash property of an ensemble of $q$-ary sparse matrices will be discussed in Section IV. The binary version of this ensemble was introduced in [21].

D. Basic lemmas of the hash property

In the following, basic lemmas of the $(\boldsymbol{\alpha}, \boldsymbol{\beta})$-hash property are introduced. All lemmas are proved in Section VI-A. Let $\mathcal{A}$ (resp. $\mathcal{B}$) be a set of functions $A : \mathcal{U}^n \to \mathcal{U}_A$ (resp. $B : \mathcal{U}^n \to \mathcal{U}_B$).
We assume that $(\mathcal{A}, p_A)$ (resp. $(\mathcal{B}, p_B)$) has an $(\boldsymbol{\alpha}_A, \boldsymbol{\beta}_A)$-hash (resp. $(\boldsymbol{\alpha}_B, \boldsymbol{\beta}_B)$-hash) property. We also assume that $p_C$ is the uniform distribution on $\mathrm{Im}\mathcal{A}$ and that the random variables $A$, $B$, and $\mathbf{c}$ are mutually independent, that is,
$$p_C(\mathbf{c}) = \begin{cases} \frac{1}{|\mathrm{Im}\mathcal{A}|}, & \text{if } \mathbf{c} \in \mathrm{Im}\mathcal{A} \\ 0, & \text{if } \mathbf{c} \in \mathcal{U}_A \setminus \mathrm{Im}\mathcal{A} \end{cases}$$
$$p_{ABC}(A, B, \mathbf{c}) = p_A(A) p_B(B) p_C(\mathbf{c})$$
for any $A$, $B$, and $\mathbf{c}$. First, we demonstrate that the collision-resistant property and the saturating property are derived from the $(\boldsymbol{\alpha}, \boldsymbol{\beta})$-hash property. The first lemma introduces the collision-resistant property.

Lemma 1: For any $\mathcal{G} \subset \mathcal{U}^n$ and $\mathbf{u} \in \mathcal{U}^n$,
$$p_A(\{A : [\mathcal{G} \setminus \{\mathbf{u}\}] \cap C_A(A\mathbf{u}) \ne \emptyset\}) \le \frac{|\mathcal{G}| \alpha_A}{|\mathrm{Im}\mathcal{A}|} + \beta_A.$$

We derive the collision-resistant property from Lemma 1. Let $\mu_U$ be a probability distribution on $\mathcal{G} \subset \mathcal{U}^n$. We have
$$E_A[\mu_U(\{\mathbf{u} : [\mathcal{G} \setminus \{\mathbf{u}\}] \cap C_A(A\mathbf{u}) \ne \emptyset\})] \le \sum_{\mathbf{u} \in \mathcal{G}} \mu_U(\mathbf{u})\, p_A(\{A : [\mathcal{G} \setminus \{\mathbf{u}\}] \cap C_A(A\mathbf{u}) \ne \emptyset\}) \le \sum_{\mathbf{u} \in \mathcal{G}} \mu_U(\mathbf{u}) \left[\frac{|\mathcal{G}| \alpha_A}{|\mathrm{Im}\mathcal{A}|} + \beta_A\right] \le \frac{|\mathcal{G}| \alpha_A}{|\mathrm{Im}\mathcal{A}|} + \beta_A.$$
By assuming that $|\mathcal{G}|/|\mathrm{Im}\mathcal{A}|$ vanishes as $n \to \infty$, we have the fact that there is a function $A$ such that
$$\mu_U(\{\mathbf{u} : [\mathcal{G} \setminus \{\mathbf{u}\}] \cap C_A(A\mathbf{u}) \ne \emptyset\}) < \delta$$
for any $\delta > 0$ and sufficiently large $n$. Since the relation $[\mathcal{G} \setminus \{\mathbf{u}\}] \cap C_A(A\mathbf{u}) \ne \emptyset$ corresponds to the event that there is $\mathbf{u}' \in \mathcal{G}$, $\mathbf{u}' \ne \mathbf{u}$, such that $\mathbf{u}$ and $\mathbf{u}'$ are members of the same bin (have the same codeword determined by $A$), we have the fact that the members of $\mathcal{G}$ are located in different bins (the members of $\mathcal{G}$ can be decoded correctly) with high probability. In the proof of fixed-rate source coding, $\mathcal{G}$ is defined by (7) for a given probability distribution $\mu_U$ of a source $U$, where $\mu_U(\mathcal{G}^c)$ is close to zero. In the linear coding of a channel with additive noise, the additive noise $\mathbf{u}$ can be specified by the syndrome $A\mathbf{u}$, obtained by applying the parity check matrix $A$ to the channel output.
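The additive-noise argument above can be sketched concretely: given the syndrome $A\mathbf{u}$ of the noise, a maximum-likelihood guess picks the most probable sequence in the coset $C_A(A\mathbf{u})$. The following Python sketch is a brute-force toy over GF(2) with an i.i.d. noise model (the function name is ours; exponential in $n$, for illustration only):

```python
import itertools
import numpy as np

def ml_noise_estimate(A, c, mu_noise):
    """Guess the additive noise u from its syndrome c = A u: the most probable
    sequence in the coset C_A(c) under an i.i.d. noise distribution mu_noise."""
    n = A.shape[1]
    best, best_p = None, -1.0
    for u in itertools.product((0, 1), repeat=n):
        if tuple((A @ np.array(u)) % 2) != tuple(c):
            continue  # u is not in the coset C_A(c)
        p = np.prod([mu_noise[ui] for ui in u])  # i.i.d. noise probability
        if p > best_p:
            best, best_p = u, p
    return best
```

When the noise is biased toward 0, the ML estimate is the lowest-weight member of the coset, which matches the classical syndrome-decoding picture.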
It should be noted that, when Lemma 1 is applied, it is sufficient to assume that $\lim_{n\to\infty} \frac{\log \alpha_A(n)}{n} = 0$ instead of (H2), because it is usually assumed that $|\mathcal{G}|/|\mathrm{Im}\mathcal{A}|$ vanishes exponentially as $n \to \infty$. It is implicitly proved in [26][2][10][33] that some ensembles of sparse linear matrices have this weak hash property. In fact, the condition (H2) is required for the saturating property. The second lemma introduces the saturating property. We use the following lemma in the proofs of the coding theorems for the Gel'fand-Pinsker problem, the Wyner-Ziv problem, and the One-helps-one problem.

Lemma 2: If $\mathcal{T} \ne \emptyset$, then
$$p_{AC}(\{(A, \mathbf{c}) : \mathcal{T} \cap C_A(\mathbf{c}) = \emptyset\}) \le \alpha_A - 1 + \frac{|\mathrm{Im}\mathcal{A}| [\beta_A + 1]}{|\mathcal{T}|}.$$

We derive the saturating property from Lemma 2. We have
$$E_A[p_C(\{\mathbf{c} : \mathcal{T} \cap C_A(\mathbf{c}) = \emptyset\})] = p_{AC}(\{(A, \mathbf{c}) : \mathcal{T} \cap C_A(\mathbf{c}) = \emptyset\}) \le \alpha_A - 1 + \frac{|\mathrm{Im}\mathcal{A}| [\beta_A + 1]}{|\mathcal{T}|}.$$
By assuming that $|\mathrm{Im}\mathcal{A}|/|\mathcal{T}|$ vanishes as $n \to \infty$, we have the fact that there is a function $A$ such that $p_C(\{\mathbf{c} : \mathcal{T} \cap C_A(\mathbf{c}) = \emptyset\}) < \delta$ for any $\delta > 0$ and sufficiently large $n$. Since the relation $\mathcal{T} \cap C_A(\mathbf{c}) = \emptyset$ corresponds to the event that there is no $\mathbf{u} \in \mathcal{T}$ in the bin $C_A(\mathbf{c})$, we have the fact that we can find a member of $\mathcal{T}$ in a randomly selected bin with high probability. It should be noted that, when Lemma 2 is applied, it is sufficient to assume that $\lim_{n\to\infty} \frac{\log \beta_A(n)}{n} = 0$ instead of (H3), because it is usually assumed that $|\mathrm{Im}\mathcal{A}|/|\mathcal{T}|$ vanishes exponentially as $n \to \infty$. In fact, the condition (H3) is required for the collision-resistant property.

Next, we prepare the lemmas used in the proofs of the coding theorems. The following lemmas come from Lemma 1.

Lemma 3: If $\mathcal{G} \subset \mathcal{U}^n$ and $\mathbf{u} \notin \mathcal{G}$, then
$$p_{AC}\left(\left\{(A, \mathbf{c}) : \begin{array}{l} \mathcal{G} \cap C_A(\mathbf{c}) \ne \emptyset \\ \mathbf{u} \in C_A(\mathbf{c}) \end{array}\right\}\right) \le \frac{|\mathcal{G}| \alpha_A}{|\mathrm{Im}\mathcal{A}|^2} + \frac{\beta_A}{|\mathrm{Im}\mathcal{A}|}.$$

Lemma 4: Assume that $\mathbf{u}_{A,\mathbf{c}} \in \mathcal{U}^n$ depends on $A$ and $\mathbf{c}$.
Then
$$p_{ABC}(\{(A, B, \mathbf{c}) : [\mathcal{G} \setminus \{\mathbf{u}_{A,\mathbf{c}}\}] \cap C_{AB}(\mathbf{c}, B\mathbf{u}_{A,\mathbf{c}}) \ne \emptyset\}) \le \frac{|\mathcal{G}| \alpha_B}{|\mathrm{Im}\mathcal{A}||\mathrm{Im}\mathcal{B}|} + \beta_B$$
for any $\mathcal{G} \subset \mathcal{U}^n$.

Finally, we introduce the method for finding a typical sequence in a bin. The probability of the event that the function finds a conditionally typical sequence is evaluated by the following lemmas. They are the key lemmas for the coding theorems for the Gel'fand-Pinsker problem, the Wyner-Ziv problem, and the One-helps-one problem. These lemmas are proved by using Lemma 2. For $\varepsilon > 0$, let
$$l_A \equiv \frac{n[H(U|V) - \varepsilon]}{\log |\mathcal{U}|}$$
and assume that $\mathcal{A}$ is a set of functions $A : \mathcal{U}^n \to \mathcal{U}^{l_A}$ and that $\mathbf{v} \in T_{V,\gamma}$.

Lemma 5: We define a maximum-likelihood (ML) coding function $g_A$ under the constraint $\mathbf{u} \in C_A(\mathbf{c})$ as
$$g_A(\mathbf{c}|\mathbf{v}) \equiv \arg\max_{\mathbf{u} \in C_A(\mathbf{c})} \mu_{U|V}(\mathbf{u}|\mathbf{v}) = \arg\max_{\mathbf{u} \in C_A(\mathbf{c})} \mu_{UV}(\mathbf{u}, \mathbf{v})$$
and assume that a set $\mathcal{T}(\mathbf{v}) \subset T_{U|V,2\varepsilon}(\mathbf{v})$ satisfies:
• $\mathcal{T}(\mathbf{v})$ is not empty, and
• if $\mathbf{u} \in \mathcal{T}(\mathbf{v})$ and $\mathbf{u}'$ satisfies $\mu_{U|V}(\mathbf{u}|\mathbf{v}) \le \mu_{U|V}(\mathbf{u}'|\mathbf{v}) \le 2^{-n[H(U|V) - 2\varepsilon]}$, then $\mathbf{u}' \in \mathcal{T}(\mathbf{v})$.
In fact, we can construct such a $\mathcal{T}(\mathbf{v})$ by taking $|\mathcal{T}(\mathbf{v})|$ elements from $T_{U|V,2\varepsilon}(\mathbf{v})$ in the order of probability rank. If an ensemble $(\mathcal{A}, p_A)$ of a set of functions $A : \mathcal{U}^n \to \mathcal{U}^{l_A}$ has an $(\boldsymbol{\alpha}_A, \boldsymbol{\beta}_A)$-hash property, then
$$p_{AC}(\{(A, \mathbf{c}) : g_A(\mathbf{c}|\mathbf{v}) \notin \mathcal{T}(\mathbf{v})\}) \le \alpha_A - 1 + \frac{|\mathrm{Im}\mathcal{A}| [\beta_A + 1]}{|\mathcal{T}(\mathbf{v})|} + \frac{2^{-n\varepsilon} |\mathcal{U}|^{l_A}}{|\mathrm{Im}\mathcal{A}|}$$
for any $\mathbf{v}$ satisfying $T_{U|V,2\varepsilon}(\mathbf{v}) \ne \emptyset$.

Lemma 6: We define a minimum-divergence (MD) coding function $\widehat{g}_A$ under the constraint $\mathbf{u} \in C_A(\mathbf{c})$ as
$$\widehat{g}_A(\mathbf{c}|\mathbf{v}) \equiv \arg\min_{\mathbf{u} \in C_A(\mathbf{c})} D(\nu_{\mathbf{u}|\mathbf{v}} \| \mu_{U|V} | \nu_{\mathbf{v}}) = \arg\min_{\mathbf{u} \in C_A(\mathbf{c})} D(\nu_{\mathbf{u}\mathbf{v}} \| \mu_{UV})$$
and assume that, for $\gamma > 0$, a set $\mathcal{T} \subset T_{U|V,\gamma}(\mathbf{v})$ satisfies that if $\mathbf{u} \in \mathcal{T}$ and $\mathbf{u}'$ satisfies $D(\nu_{\mathbf{u}'|\mathbf{v}} \| \mu_{U|V} | \nu_{\mathbf{v}}) \le D(\nu_{\mathbf{u}|\mathbf{v}} \| \mu_{U|V} | \nu_{\mathbf{v}})$, then $\mathbf{u}' \in \mathcal{T}$.
In fact, we can construct such a $\mathcal{T}$ by picking $|\mathcal{T}|$ elements from $T_{U|V,\gamma}(\mathbf{v})$ in ascending order of the conditional divergence. Then
$$p_{AC}(\{(A, \mathbf{c}) : \widehat{g}_A(\mathbf{c}|\mathbf{v}) \notin \mathcal{T}\}) \le \alpha_A - 1 + \frac{|\mathrm{Im}\mathcal{A}| [\beta_A + 1]}{|\mathcal{T}|}$$
for any $\mathbf{v}$ satisfying $T_{U|V,\gamma}(\mathbf{v}) \ne \emptyset$.

In Section V, we construct codes by using the maximum-likelihood coding function. It should be noted that we could replace the maximum-likelihood coding function by the minimum-divergence coding function and prove theorems more simply than presented in this paper.

IV. HASH PROPERTY FOR ENSEMBLES OF MATRICES

In this section, we discuss the hash property of an ensemble of (sparse) matrices. First, we introduce the average spectrum of an ensemble of matrices given in [2]. Let $\mathcal{U}$ be a finite field and $p_A$ be a probability distribution on a set of $l_A \times n$ matrices. It should be noted that $A$ represents the corresponding linear function $A : \mathcal{U}^n \to \mathcal{U}^{l_A}$, where we define $\mathcal{U}_A \equiv \mathcal{U}^{l_A}$. Let $t(\mathbf{u})$ be the type^5 of $\mathbf{u} \in \mathcal{U}^n$, where a type is characterized by the number $n\nu_{\mathbf{u}}$ of occurrences of each symbol in the sequence $\mathbf{u}$. Let $\mathcal{H}$ be the set of all types of length $n$ except $t(\mathbf{0})$, where $\mathbf{0}$ is the zero vector. For the probability distribution $p_A$ on a set of $l_A \times n$ matrices, let $S(p_A, t)$ be defined as
$$S(p_A, t) \equiv \sum_A p_A(A) |\{\mathbf{u} \in \mathcal{U}^n : A\mathbf{u} = \mathbf{0},\ t(\mathbf{u}) = t\}|.$$
For $\widehat{\mathcal{H}} \subset \mathcal{H}$, we define $\alpha_A(n)$ and $\beta_A(n)$ as
$$\alpha_A(n) \equiv \frac{|\mathrm{Im}\mathcal{A}|}{|\mathcal{U}|^{l_A}} \cdot \max_{t \in \widehat{\mathcal{H}}} \frac{S(p_A, t)}{S(u_A, t)} \tag{9}$$
$$\beta_A(n) \equiv \sum_{t \in \mathcal{H} \setminus \widehat{\mathcal{H}}} S(p_A, t), \tag{10}$$
where $u_A$ denotes the uniform distribution on the set of all $l_A \times n$ matrices.
When $\mathcal{U} \equiv \mathrm{GF}(2)$ and $\widehat{\mathcal{H}}$ is a set of high-weight types, $\alpha_A$ measures how the ensemble $(\mathcal{A}, p_A)$ differs from the ensemble of all $l_A \times n$ matrices with respect to the high-weight part of the average spectrum, and $\beta_A$ provides an upper bound on the probability that the code $\{\mathbf{u} \in \mathcal{U}^n : A\mathbf{u} = \mathbf{0}\}$ has low-weight codewords. It should be noted that
$$\widetilde{\alpha}_A(n) \equiv \max_{t \in \widehat{\mathcal{H}}} \frac{S(p_A, t)}{S(u_A, t)} \tag{11}$$
is introduced in [26][2][10][33][34] instead of $\alpha_A(n)$. We multiply by the coefficient $|\mathrm{Im}\mathcal{A}|/|\mathcal{U}|^{l_A}$ so that $\alpha_A$ satisfies (H2). We have the following theorem.

Theorem 1: Let $(\mathcal{A}, p_A)$ be an ensemble of matrices and assume that $p_A(\{A : A\mathbf{u} = \mathbf{0}\})$ depends on $\mathbf{u}$ only through the type $t(\mathbf{u})$. If $|\mathcal{U}_A|/|\mathrm{Im}\mathcal{A}|$ satisfies (H1) and $(\alpha_A(n), \beta_A(n))$, defined by (9) and (10), satisfies (H2) and (H3), then $(\mathcal{A}, p_A)$ has an $(\boldsymbol{\alpha}_A, \boldsymbol{\beta}_A)$-hash property.

The proof is given in Section VI-B. Next, we consider the independent combination of two ensembles $(\mathcal{A}, p_A)$ and $(\mathcal{B}, p_B)$, of $l_A \times n$ and $l_B \times n$ matrices, respectively. We assume that $(\mathcal{A}, p_A)$ has an $(\boldsymbol{\alpha}_A, \boldsymbol{\beta}_A)$-hash property, where $(\alpha_A(n), \beta_A(n))$ is defined by (9) and (10). Similarly, we define $(\alpha_B(n), \beta_B(n))$ for an ensemble $(\mathcal{B}, p_B)$ and assume that $(\mathcal{B}, p_B)$ has an $(\boldsymbol{\alpha}_B, \boldsymbol{\beta}_B)$-hash property. Let $p_{AB}$ be the joint distribution defined as $p_{AB}(A, B) \equiv p_A(A) p_B(B)$. We have the following two lemmas. The proofs are given in Section VI-B.

^5 As in [6], the type is usually defined in terms of the empirical probability distribution $\nu_{\mathbf{u}}$. In our definition, the type is the number $n\nu_{\mathbf{u}}$ of occurrences, which is different from the empirical probability distribution.

Lemma 7: Let $(\alpha_{AB}(n), \beta_{AB}(n))$ be defined as
$$\alpha_{AB}(n) \equiv \alpha_A(n) \alpha_B(n)$$
$$\beta_{AB}(n) \equiv \min\{\beta_A(n), \beta_B(n)\}.$$
Then the ensemble $(\mathcal{A} \times \mathcal{B}, p_{AB})$ of functions $A \oplus B : \mathcal{U}^n \to \mathcal{U}^{l_A + l_B}$ defined as $A \oplus B(\boldsymbol{u}) \equiv (A\boldsymbol{u}, B\boldsymbol{u})$ has an $(\alpha_{AB}, \beta_{AB})$-hash property.

Lemma 8: Let $(\alpha_{AB}(n), \beta'_{AB}(n))$ be defined as
$$\alpha_{AB}(n) \equiv \alpha_A(n)\,\alpha_B(n)$$
$$\beta'_{AB}(n) \equiv \frac{\alpha_A(n)\,\beta_B(n)}{|\mathrm{Im}\,\mathcal{A}|} + \frac{\alpha_B(n)\,\beta_A(n)}{|\mathrm{Im}\,\mathcal{B}|} + \beta_A(n)\,\beta_B(n).$$
Then the ensemble $(\mathcal{A} \times \mathcal{B}, p_{AB})$ of functions $A \otimes B : \mathcal{U}^n \times \mathcal{V}^n \to \mathcal{U}^{l_A} \times \mathcal{V}^{l_B}$ defined as $A \otimes B(\boldsymbol{u}, \boldsymbol{v}) \equiv (A\boldsymbol{u}, B\boldsymbol{v})$ has an $(\alpha_{AB}, \beta'_{AB})$-hash property.

Finally, we introduce an ensemble of $q$-ary sparse matrices, where the binary version of this ensemble is proposed in [21]. In the following, let $\mathcal{U} \equiv \mathrm{GF}(q)$ and $l_A \equiv nR$. We generate an $l_A \times n$ matrix $A$ with the following procedure, where at most $\tau$ random nonzero elements are introduced in every column.
1) Start from an all-zero matrix.
2) For each $i \in \{1, \ldots, n\}$, repeat the following procedure $\tau$ times:
   a) Choose $(j, a) \in \{1, \ldots, l_A\} \times [\mathrm{GF}(q) \setminus \{0\}]$ uniformly at random.
   b) Add $a$ to the $(j, i)$ component of $A$.

Let $(\mathcal{A}, p_A)$ be the ensemble corresponding to the above procedure. It is proved in Section VI-C that $p_A(\{A : A\boldsymbol{u} = \boldsymbol{0}\})$ depends on $\boldsymbol{u}$ only through the type $t(\boldsymbol{u})$. Let $(\alpha_A(n), \beta_A(n))$ be defined by (9) and (10) for this ensemble. We assume that the column weight $\tau = O(\log n)$ is even. Let $w(t)$ be the weight of a type $t = (t(0), \ldots, t(q-1))$ defined as
$$w(t) \equiv \sum_{i=1}^{q-1} t(i)$$
and let $w(\boldsymbol{u})$ be defined as $w(\boldsymbol{u}) \equiv w(t(\boldsymbol{u}))$. We define
$$\widehat{\mathcal{H}} \equiv \{t : w(t) > \xi l_A\}. \qquad (12)$$
Then it is also proved in Section VI-C that
$$\mathrm{Im}\,\mathcal{A} = \begin{cases} \{\boldsymbol{u} \in \mathcal{U}^{l_A} : w(\boldsymbol{u}) \text{ is even}\}, & \text{if } q = 2 \\ \mathcal{U}^{l_A}, & \text{if } q > 2 \end{cases}$$
$$\frac{|\mathcal{U}|^{l_A}}{|\mathrm{Im}\,\mathcal{A}|} = \begin{cases} 2, & \text{if } q = 2 \\ 1, & \text{if } q > 2 \end{cases}$$
and there is $\xi > 0$ such that $(\alpha_A, \beta_A)$ satisfies (H2) and (H3). From Theorem 1, we have the following theorem.
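The generation procedure above translates directly into code. The following is a minimal sketch (the function name `sparse_matrix` and the 0-based indexing are our own conventions, and `q` is assumed prime so that modular arithmetic realizes GF(q)):

```python
import random

def sparse_matrix(n, l_A, q, tau, rng=random):
    """Generate an l_A x n matrix over GF(q) by the procedure above: for each
    column i, tau times choose a row j and a nonzero a uniformly at random
    and add a to the (j, i) component (arithmetic modulo q, q prime)."""
    A = [[0] * n for _ in range(l_A)]
    for i in range(n):                      # step 2: for each i in {1, ..., n}
        for _ in range(tau):                # ... repeat tau times
            j = rng.randrange(l_A)          # step 2a: (j, a) uniform,
            a = rng.randrange(1, q)         #          a in GF(q) \ {0}
            A[j][i] = (A[j][i] + a) % q     # step 2b: add a to component (j, i)
    return A

A = sparse_matrix(n=12, l_A=4, q=3, tau=2)
# at most tau entries per column are touched, so every column weight is <= tau
assert all(sum(A[j][i] != 0 for j in range(4)) <= 2 for i in range(12))
```

Note that additions may cancel modulo $q$, so the realized column weight can be strictly less than $\tau$; only the upper bound "at most $\tau$ nonzero elements per column" is guaranteed.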
Theorem 2: For the above ensemble $(\mathcal{A}, p_A)$ of sparse matrices, let $(\alpha_A(n), \beta_A(n))$ be defined by (9), (10), (12), and suitable $\{\tau(n)\}_{n=1}^{\infty}$ and $\xi > 0$. Then $(\mathcal{A}, p_A)$ has an $(\alpha_A, \beta_A)$-hash property.

We prove the theorem in Section VI-C. It should be noted here that, as we can see in the proof of the theorem, the asymptotic behavior (convergence speed) of $(\alpha_A, \beta_A)$ depends on the weight $\tau$.

Remark 1: It is proved in [26][33] that $(\widetilde{\alpha}_A(n), \beta_A(n))$, defined by (11) and (10), satisfies the weaker properties
$$\lim_{n \to \infty} \frac{\log \widetilde{\alpha}_A(n)}{n} = 0 \qquad (13)$$
and (H3) when $q = 2$. It is proved in [2, Section III, Eqs. (23), (82)] that $(\widetilde{\alpha}_A(n), \beta_A(n))$ of another ensemble of modulo-$q$ LDPC matrices satisfies the weaker properties (13) and (H3).

V. CODING THEOREMS

In this section, we present several coding theorems. We prove these theorems in Section VI based on the hash property. Throughout this section, the encoder and decoder are denoted by $\varphi$ and $\varphi^{-1}$, respectively. We assume that the dimension of the vectors $\boldsymbol{x}$, $\boldsymbol{y}$, $\boldsymbol{z}$, and $\boldsymbol{w}$ is $n$.

A. Slepian-Wolf Problem

In this section, we consider the Slepian-Wolf problem illustrated in Fig. 1. The achievable rate region for this problem is given by the set of encoding rate pairs $(R_X, R_Y)$ satisfying
$$R_X \ge H(X|Y)$$
$$R_Y \ge H(Y|X)$$
$$R_X + R_Y \ge H(X, Y).$$
The achievability of the Slepian-Wolf problem is proved in [3] and [5] for the ensemble of bin-coding and of all $q$-ary linear matrices, respectively. Constructions of encoders using sparse matrices are studied in [35][40][34], and the achievability is proved in [33] by using ML decoding. The aim of this section is to demonstrate the proof of the coding theorem based on the hash property. The proof is given in Section VI-D.

We fix functions
$$A : \mathcal{X}^n \to \mathcal{X}^{l_A}$$
$$B : \mathcal{Y}^n \to \mathcal{Y}^{l_B}$$
which are available to construct the encoders and the decoder.
We define the encoders and the decoder (illustrated in Fig. 6)
$$\varphi_X : \mathcal{X}^n \to \mathcal{X}^{l_A}$$
$$\varphi_Y : \mathcal{Y}^n \to \mathcal{Y}^{l_B}$$
$$\varphi^{-1} : \mathcal{X}^{l_A} \times \mathcal{Y}^{l_B} \to \mathcal{X}^n \times \mathcal{Y}^n$$
as
$$\varphi_X(\boldsymbol{x}) \equiv A\boldsymbol{x}$$
$$\varphi_Y(\boldsymbol{y}) \equiv B\boldsymbol{y}$$
$$\varphi^{-1}(\boldsymbol{b}_X, \boldsymbol{b}_Y) \equiv g_{AB}(\boldsymbol{b}_X, \boldsymbol{b}_Y),$$
where
$$g_{AB}(\boldsymbol{b}_X, \boldsymbol{b}_Y) \equiv \arg\max_{(\boldsymbol{x}', \boldsymbol{y}') \in \mathcal{C}_A(\boldsymbol{b}_X) \times \mathcal{C}_B(\boldsymbol{b}_Y)} \mu_{XY}(\boldsymbol{x}', \boldsymbol{y}').$$

Fig. 6. Construction of Slepian-Wolf Source Code

The encoding rate pair $(R_X, R_Y)$ is given by
$$R_X \equiv \frac{l_A \log|\mathcal{X}|}{n}$$
$$R_Y \equiv \frac{l_B \log|\mathcal{Y}|}{n}$$
and the error probability $\mathrm{Error}_{XY}(A, B)$ is given by
$$\mathrm{Error}_{XY}(A, B) \equiv \mu_{XY}(\{(\boldsymbol{x}, \boldsymbol{y}) : \varphi^{-1}(\varphi_X(\boldsymbol{x}), \varphi_Y(\boldsymbol{y})) \neq (\boldsymbol{x}, \boldsymbol{y})\}).$$
We have the following theorem. It should be noted that the alphabets $\mathcal{X}$ and $\mathcal{Y}$ may not be binary and the correlation of the two sources may not be symmetric.

Theorem 3: Assume that $(\mathcal{A}, p_A)$, $(\mathcal{B}, p_B)$, and $(\mathcal{A} \times \mathcal{B}, p_A \times p_B)$ have hash properties. Let $(X, Y)$ be a pair of stationary memoryless sources. If $(R_X, R_Y)$ satisfies
$$R_X > H(X|Y) \qquad (14)$$
$$R_Y > H(Y|X) \qquad (15)$$
$$R_X + R_Y > H(X, Y), \qquad (16)$$
then for any $\delta > 0$ and all sufficiently large $n$ there are functions (sparse matrices) $A \in \mathcal{A}$ and $B \in \mathcal{B}$ such that
$$\mathrm{Error}_{XY}(A, B) \le \delta.$$

Fig. 7. Channel Coding

Remark 2: In [3][5], random (linear) bin-coding is used to prove the achievability of the above theorem. In fact, random bin-coding is equivalent to a uniform ensemble on a set of all (linear) functions, and it has a $(1, 0)$-hash property.

Remark 3: The above theorem includes the fixed-rate coding of a single source $X$ as a special case of the Slepian-Wolf problem with $|\mathcal{Y}| \equiv 1$. This implies that the encoding rate can achieve the entropy of a source. It should be noted that source coding using a class of hash functions is studied in [22, Section 14.2][17].
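The syndrome construction above can be exercised end-to-end on a toy example. The following is a minimal sketch, assuming tiny hand-picked binary matrices (far too small to be "sparse" in any meaningful sense) and a hypothetical correlated pmf $\mu_{XY}$ with $P(X=1)=0.7$ and $Y$ a noisy copy of $X$; the decoder $g_{AB}$ is realized by brute-force search over the product of cosets.

```python
from itertools import product

q, n = 2, 3
# small illustrative parity-check matrices over GF(2)
A = [[1, 0, 1], [0, 1, 1]]
B = [[1, 1, 0], [0, 1, 1]]

def mat_vec(M, v):
    return tuple(sum(m * x for m, x in zip(row, v)) % q for row in M)

# hypothetical i.i.d. joint pmf: P(X=1)=0.7, Y flips each X with prob. 0.1
def mu_XY(x, y):
    p = 1.0
    for xi, yi in zip(x, y):
        p *= (0.7 if xi else 0.3) * (0.9 if xi == yi else 0.1)
    return p

def encode_X(x): return mat_vec(A, x)      # phi_X(x) = Ax
def encode_Y(y): return mat_vec(B, y)      # phi_Y(y) = By

def decode(bX, bY):
    # g_AB: maximize mu_XY over the product of cosets C_A(bX) x C_B(bY)
    cands = [(x, y)
             for x in product(range(q), repeat=n) if mat_vec(A, x) == bX
             for y in product(range(q), repeat=n) if mat_vec(B, y) == bY]
    return max(cands, key=lambda xy: mu_XY(*xy))

x, y = (1, 0, 1), (1, 0, 1)
assert decode(encode_X(x), encode_Y(y)) == (x, y)
```

At these tiny block lengths decoding can of course fail for unlucky source pairs; the theorem only guarantees a vanishing error probability for large $n$ and rates inside the region (14)-(16).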
Remark 4: Assuming $|\mathcal{Y}| \equiv 1$, we can prove the coding theorem for a channel with additive noise $X$ by letting $A$ and $\{\boldsymbol{x} : A\boldsymbol{x} = \boldsymbol{0}\}$ be a parity-check matrix and a set of codewords (channel inputs), respectively. This implies that the encoding rate of the channel can achieve the channel capacity. The coding theorem for a channel with additive noise is proved by using a low-density parity-check (LDPC) matrix in [26][2][10].

B. Gel'fand-Pinsker Problem

In this section we consider the Gel'fand-Pinsker problem illustrated in Fig. 2. First, we construct a code for the standard channel coding problem illustrated in Fig. 7, which is a special case of the Gel'fand-Pinsker problem. A channel is given by the conditional probability distribution $\mu_{Y|X}$, where $X$ and $Y$ are random variables corresponding to the channel input and channel output, respectively. The capacity of a channel is given by
$$\mathrm{Capacity} \equiv \max_{\mu_X} I(X; Y),$$
where the maximum is taken over all probability distributions $\mu_X$ and the joint distribution of the random variables $(X, Y)$ is given by $\mu_{XY}(x, y) \equiv \mu_{Y|X}(y|x)\mu_X(x)$.

The code for this problem is given below (illustrated in Fig. 8). We fix functions
$$A : \mathcal{X}^n \to \mathcal{X}^{l_A}$$
$$B : \mathcal{X}^n \to \mathcal{X}^{l_B}$$
and a vector $\boldsymbol{c} \in \mathcal{X}^{l_A}$ available to construct an encoder and a decoder, where
$$l_A \equiv \frac{n[H(X|Y) + \varepsilon_A]}{\log|\mathcal{X}|}$$
$$l_B \equiv \frac{n[I(X; Y) - \varepsilon_B]}{\log|\mathcal{X}|}.$$
We define the encoder and the decoder
$$\varphi : \mathcal{X}^{l_B} \to \mathcal{X}^n$$
$$\varphi^{-1} : \mathcal{Y}^n \to \mathcal{X}^{l_B}$$
as
$$\varphi(\boldsymbol{m}) \equiv g_{AB}(\boldsymbol{c}, \boldsymbol{m})$$
$$\varphi^{-1}(\boldsymbol{y}) \equiv B g_A(\boldsymbol{c}|\boldsymbol{y}),$$
where
$$g_{AB}(\boldsymbol{c}, \boldsymbol{m}) \equiv \arg\max_{\boldsymbol{x}' \in \mathcal{C}_{AB}(\boldsymbol{c}, \boldsymbol{m})} \mu_X(\boldsymbol{x}')$$
$$g_A(\boldsymbol{c}|\boldsymbol{y}) \equiv \arg\max_{\boldsymbol{x}' \in \mathcal{C}_A(\boldsymbol{c})} \mu_{X|Y}(\boldsymbol{x}'|\boldsymbol{y}).$$

Fig. 8. Construction of Channel Code

Let $M$ be the random variable corresponding to the message $\boldsymbol{m}$, where the probability $p_M(\boldsymbol{m})$ is given by
$$p_M(\boldsymbol{m}) \equiv \begin{cases} \frac{1}{|\mathrm{Im}\,\mathcal{B}|}, & \text{if } \boldsymbol{m} \in \mathrm{Im}\,\mathcal{B} \\ 0, & \text{if } \boldsymbol{m} \notin \mathrm{Im}\,\mathcal{B}. \end{cases}$$
The rate $R(B)$ of this code is given by
$$R(B) \equiv \frac{\log|\mathrm{Im}\,\mathcal{B}|}{n} = \frac{l_B \log|\mathcal{X}|}{n} - \frac{\log\frac{|\mathcal{X}|^{l_B}}{|\mathrm{Im}\,\mathcal{B}|}}{n}$$
and the decoding error probability $\mathrm{Error}_{Y|X}(A, B, \boldsymbol{c})$ is given by
$$\mathrm{Error}_{Y|X}(A, B, \boldsymbol{c}) \equiv \sum_{\boldsymbol{m}, \boldsymbol{y}} p_M(\boldsymbol{m}) \mu_{Y|X}(\boldsymbol{y}|\varphi(\boldsymbol{m})) \chi(\varphi^{-1}(\boldsymbol{y}) \neq \boldsymbol{m}).$$
In the following, we provide an intuitive interpretation of the construction of the code, which is illustrated in Fig. 8. Assume that $\boldsymbol{c}$ is shared by the encoder and the decoder. For $\boldsymbol{c}$ and a message $\boldsymbol{m}$, the function $g_{AB}$ generates a typical sequence $\boldsymbol{x} \in \mathcal{T}_{X,\gamma}$ as a channel input. The decoder reproduces the channel input $\boldsymbol{x}$ by using $g_A$ from $\boldsymbol{c}$ and a channel output $\boldsymbol{y}$. Since $(\boldsymbol{x}, \boldsymbol{y})$ is jointly typical and $B\boldsymbol{x} = \boldsymbol{m}$, the decoding succeeds if the amount of information of $\boldsymbol{c}$ is greater than $H(X|Y)$, so as to satisfy the collision-resistant property. On the other hand, the total rate of $\boldsymbol{c}$ and $\boldsymbol{m}$ should be less than $H(X)$ to satisfy the saturation property. Then we can set the encoding rate of $\boldsymbol{m}$ close to $H(X) - H(X|Y) = I(X; Y)$.

We have the following theorem. It should be noted that the alphabets $\mathcal{X}$ and $\mathcal{Y}$ are allowed to be non-binary, and the channel is allowed to be asymmetric.

Theorem 4: For given $\varepsilon_A, \varepsilon_B > 0$ satisfying
$$\varepsilon_B - \varepsilon_A \le \sqrt{6[\varepsilon_B - \varepsilon_A]}\,\log|\mathcal{X}| < \varepsilon_A,$$
assume that $(\mathcal{A}, p_A)$ and $(\mathcal{A} \times \mathcal{B}, p_A \times p_B)$ have hash properties. Let $\mu_{Y|X}$ be the conditional probability distribution of a stationary memoryless channel. Then, for all $\delta > 0$ and sufficiently large $n$ there are functions (sparse matrices) $A \in \mathcal{A}$, $B \in \mathcal{B}$, and a vector $\boldsymbol{c} \in \mathrm{Im}\,\mathcal{A}$ such that
$$R(B) \ge I(X; Y) - \varepsilon_B - \delta$$
$$\mathrm{Error}_{Y|X}(A, B, \boldsymbol{c}) < \delta.$$
By assuming that $\mu_X$ attains the channel capacity and letting $\delta \to 0$, $\varepsilon_B \to 0$, the rate of the proposed code approaches the capacity.

Next, we consider the Gel'fand-Pinsker problem illustrated in Fig. 2.
A channel with side information is given by the conditional probability distribution $\mu_{Y|XZ}$, where $X$, $Y$, and $Z$ are random variables corresponding to the channel input, channel output, and channel side information, respectively. The capacity of a channel with side information is given by
$$\mathrm{Capacity} \equiv \max_{\mu_{XW|Z}} [I(W; Y) - I(W; Z)],$$
where the maximum is taken over all conditional probability distributions $\mu_{XW|Z}$ and the joint distribution of the random variables $(X, Y, Z, W)$ is given by
$$\mu_{XYZW}(x, y, z, w) \equiv \mu_{XW|Z}(x, w|z) \mu_{Y|XZ}(y|x, z) \mu_Z(z). \qquad (17)$$
In the following, we assume that $\mu_{XW|Z}$ is fixed. We fix functions
$$A : \mathcal{W}^n \to \mathcal{W}^{l_A}$$
$$B : \mathcal{W}^n \to \mathcal{W}^{l_B}$$
$$\widehat{A} : \mathcal{X}^n \to \mathcal{X}^{l_{\widehat{A}}}$$
and vectors $\boldsymbol{c} \in \mathcal{W}^{l_A}$ and $\widehat{\boldsymbol{c}} \in \mathcal{X}^{l_{\widehat{A}}}$ available to construct an encoder and a decoder, where
$$l_A \equiv \frac{n[H(W|Y) + \varepsilon_A]}{\log|\mathcal{W}|}$$
$$l_B \equiv \frac{n[H(W|Z) - H(W|Y) - \varepsilon_B]}{\log|\mathcal{W}|} = \frac{n[I(W; Y) - I(W; Z) - \varepsilon_B]}{\log|\mathcal{W}|}$$
$$l_{\widehat{A}} \equiv \frac{n[H(X|Z, W) - \varepsilon_{\widehat{A}}]}{\log|\mathcal{X}|}.$$

Fig. 9. Construction of Gel'fand-Pinsker Channel Code

We define the encoder and the decoder
$$\varphi : \mathcal{W}^{l_B} \times \mathcal{Z}^n \to \mathcal{X}^n$$
$$\varphi^{-1} : \mathcal{Y}^n \to \mathcal{W}^{l_B}$$
as
$$\varphi(\boldsymbol{m}|\boldsymbol{z}) \equiv g_{\widehat{A}}(\widehat{\boldsymbol{c}}|\boldsymbol{z}, g_{AB}(\boldsymbol{c}, \boldsymbol{m}|\boldsymbol{z}))$$
$$\varphi^{-1}(\boldsymbol{y}) \equiv B g_A(\boldsymbol{c}|\boldsymbol{y}),$$
where
$$g_{AB}(\boldsymbol{c}, \boldsymbol{m}|\boldsymbol{z}) \equiv \arg\max_{\boldsymbol{w}' \in \mathcal{C}_{AB}(\boldsymbol{c}, \boldsymbol{m})} \mu_{W|Z}(\boldsymbol{w}'|\boldsymbol{z})$$
$$g_{\widehat{A}}(\widehat{\boldsymbol{c}}|\boldsymbol{z}, \boldsymbol{w}) \equiv \arg\max_{\boldsymbol{x}' \in \mathcal{C}_{\widehat{A}}(\widehat{\boldsymbol{c}})} \mu_{X|ZW}(\boldsymbol{x}'|\boldsymbol{z}, \boldsymbol{w})$$
$$g_A(\boldsymbol{c}|\boldsymbol{y}) \equiv \arg\max_{\boldsymbol{w}' \in \mathcal{C}_A(\boldsymbol{c})} \mu_{W|Y}(\boldsymbol{w}'|\boldsymbol{y}).$$
Let $M$ be the random variable corresponding to the message $\boldsymbol{m}$, where the probabilities $p_M(\boldsymbol{m})$ and $p_{MZ}(\boldsymbol{m}, \boldsymbol{z})$ are given by
$$p_M(\boldsymbol{m}) \equiv \begin{cases} \frac{1}{|\mathrm{Im}\,\mathcal{B}|}, & \text{if } \boldsymbol{m} \in \mathrm{Im}\,\mathcal{B} \\ 0, & \text{if } \boldsymbol{m} \notin \mathrm{Im}\,\mathcal{B} \end{cases}$$
$$p_{MZ}(\boldsymbol{m}, \boldsymbol{z}) \equiv p_M(\boldsymbol{m}) \mu_Z(\boldsymbol{z}).$$
The rate $R(B)$ of this code is given by
$$R(B) \equiv \frac{\log|\mathrm{Im}\,\mathcal{B}|}{n} = \frac{l_B \log|\mathcal{W}|}{n} - \frac{\log\frac{|\mathcal{W}|^{l_B}}{|\mathrm{Im}\,\mathcal{B}|}}{n}$$
and the decoding error probability $\mathrm{Error}_{Y|XZ}(A, B, \widehat{A}, \boldsymbol{c}, \widehat{\boldsymbol{c}})$ is given by
$$\mathrm{Error}_{Y|XZ}(A, B, \widehat{A}, \boldsymbol{c}, \widehat{\boldsymbol{c}}) \equiv \sum_{\boldsymbol{m}, \boldsymbol{y}, \boldsymbol{z}} p_M(\boldsymbol{m}) \mu_Z(\boldsymbol{z}) \mu_{Y|XZ}(\boldsymbol{y}|\varphi(\boldsymbol{m}|\boldsymbol{z}), \boldsymbol{z}) \chi(\varphi^{-1}(\boldsymbol{y}) \neq \boldsymbol{m}).$$
In the following, we provide an intuitive interpretation of the construction of the code, which is illustrated in Fig. 9. Assume that $\boldsymbol{c}$ is shared by the encoder and the decoder. For $\boldsymbol{c}$, a message $\boldsymbol{m}$, and side information $\boldsymbol{z}$, the function $g_{AB}$ generates a typical sequence $\boldsymbol{w} \in \mathcal{T}_{W|Z,\gamma}(\boldsymbol{z})$ and the function $g_{\widehat{A}}$ generates a typical sequence $\boldsymbol{x} \in \mathcal{T}_{X|WZ,\gamma}(\boldsymbol{w}, \boldsymbol{z})$ as a channel input. The decoder reproduces $\boldsymbol{w}$ by using $g_A$ from $\boldsymbol{c}$ and a channel output $\boldsymbol{y}$. Since $(\boldsymbol{w}, \boldsymbol{y})$ is jointly typical and $B\boldsymbol{w} = \boldsymbol{m}$, the decoding succeeds if the rate of $\boldsymbol{c}$ is greater than $H(W|Y)$, so as to satisfy the collision-resistant property. On the other hand, the rate of $\widehat{\boldsymbol{c}}$ should be less than $H(X|Z, W)$ and the total rate of $\boldsymbol{c}$ and $\boldsymbol{m}$ should be less than $H(W|Z)$ to satisfy the saturation property. Then we can set the encoding rate of $\boldsymbol{m}$ close to $H(W|Z) - H(W|Y) = I(W; Y) - I(W; Z)$.

We have the following theorem. It should be noted that the alphabets $\mathcal{X}$, $\mathcal{Y}$, $\mathcal{W}$, and $\mathcal{Z}$ are allowed to be non-binary, and the channel is allowed to be asymmetric.

Theorem 5: For given $\varepsilon_A, \varepsilon_B, \varepsilon_{\widehat{A}} > 0$ satisfying
$$\varepsilon_B - \varepsilon_A \le \sqrt{6[\varepsilon_B - \varepsilon_A]}\,\log|\mathcal{Z}||\mathcal{W}| < \varepsilon_A \qquad (18)$$
$$2\zeta_{YW}(6\varepsilon_{\widehat{A}}) < \varepsilon_A, \qquad (19)$$
assume that $(\mathcal{A}, p_A)$, $(\mathcal{A} \times \mathcal{B}, p_A \times p_B)$, and $(\widehat{\mathcal{A}}, p_{\widehat{A}})$ have hash properties. Let $\mu_{Y|XZ}$ be the conditional probability distribution of a stationary memoryless channel.
Then, for all $\delta > 0$ and sufficiently large $n$ there are functions (sparse matrices) $A \in \mathcal{A}$, $B \in \mathcal{B}$, $\widehat{A} \in \widehat{\mathcal{A}}$, and vectors $\boldsymbol{c} \in \mathrm{Im}\,\mathcal{A}$, $\widehat{\boldsymbol{c}} \in \mathrm{Im}\,\widehat{\mathcal{A}}$ such that
$$R(B) \ge I(W; Y) - I(W; Z) - \varepsilon_B - \delta$$
$$\mathrm{Error}_{Y|XZ}(A, B, \widehat{A}, \boldsymbol{c}, \widehat{\boldsymbol{c}}) < \delta.$$
By assuming that $\mu_{XW|Z}$ attains the Gel'fand-Pinsker bound, and letting $\delta \to 0$, $\varepsilon_B \to 0$, the rate of the proposed code approaches this bound.

The proof is given in Section VI-E. It should be noted that Theorem 4 is a special case of the Gel'fand-Pinsker problem with $|\mathcal{Z}| \equiv 1$ and $W \equiv X$.

Remark 5: In [24], a code for the Gel'fand-Pinsker problem is proposed by using a combination of two sparse matrices when all the alphabets are binary and the channel side information and noise are additive. In their constructed encoder, they obtain a vector called the 'middle layer' by using one of the two matrices, and obtain a channel input by operating the other matrix on the middle layer and adding the side information. In our construction, we obtain $\boldsymbol{w}$ by using the two matrices $A$, $B$, and $g_{AB}$, where the dimension of $\boldsymbol{w}$ differs from that of the middle layer. We obtain the channel input $\boldsymbol{x}$ by using $\widehat{A}$ and $g_{\widehat{A}}$ instead of adding the side information. It should be noted that our approach is based on the construction of the channel code presented in [33][29], which is also different from the construction presented in [12][2].

Fig. 10. Lossy Source Coding

Fig. 11. Construction of Lossy Source Code

C. Wyner-Ziv Problem

In this section we consider the Wyner-Ziv problem introduced in [47] (illustrated in Fig. 3). First, we construct a code for the standard lossy source coding problem illustrated in Fig. 10, which is a special case of the Wyner-Ziv problem.
Let $\rho : \mathcal{X} \times \mathcal{Y} \to [0, \infty)$ be a distortion measure satisfying $\rho_{\max} \equiv \max_{x, y} \rho(x, y) < \infty$. We define $\rho_n(\boldsymbol{x}, \boldsymbol{y})$ as
$$\rho_n(\boldsymbol{x}, \boldsymbol{y}) \equiv \sum_{i=1}^n \rho(x_i, y_i)$$
for each $\boldsymbol{x} \equiv (x_1, \ldots, x_n)$ and $\boldsymbol{y} \equiv (y_1, \ldots, y_n)$. For a probability distribution $\mu_X$, the rate-distortion function $R_X(D)$ is given by
$$R_X(D) = \min_{\mu_{Y|X} : E_{XY}[\rho(X, Y)] \le D} I(X; Y),$$
where the minimum is taken over all conditional probability distributions $\mu_{Y|X}$ and the joint distribution $\mu_{XY}$ of $(X, Y)$ is given by $\mu_{XY}(x, y) \equiv \mu_X(x)\mu_{Y|X}(y|x)$.

The code for this problem is given in the following (illustrated in Fig. 11). We fix functions
$$A : \mathcal{Y}^n \to \mathcal{Y}^{l_A}$$
$$B : \mathcal{Y}^n \to \mathcal{Y}^{l_B}$$
and a vector $\boldsymbol{c} \in \mathcal{Y}^{l_A}$ available to construct an encoder and a decoder, where
$$l_A \equiv \frac{n[H(Y|X) - \varepsilon_A]}{\log|\mathcal{Y}|}$$
$$l_B \equiv \frac{n[I(Y; X) + \varepsilon_B]}{\log|\mathcal{Y}|}.$$
We define the encoder and the decoder
$$\varphi : \mathcal{X}^n \to \mathcal{Y}^{l_B}$$
$$\varphi^{-1} : \mathcal{Y}^{l_B} \to \mathcal{Y}^n$$
as
$$\varphi(\boldsymbol{x}) \equiv B g_A(\boldsymbol{c}|\boldsymbol{x})$$
$$\varphi^{-1}(\boldsymbol{b}) \equiv g_{AB}(\boldsymbol{c}, \boldsymbol{b}),$$
where
$$g_A(\boldsymbol{c}|\boldsymbol{x}) \equiv \arg\max_{\boldsymbol{y}' \in \mathcal{C}_A(\boldsymbol{c})} \mu_{Y|X}(\boldsymbol{y}'|\boldsymbol{x})$$
$$g_{AB}(\boldsymbol{c}, \boldsymbol{b}) \equiv \arg\max_{\boldsymbol{y}' \in \mathcal{C}_{AB}(\boldsymbol{c}, \boldsymbol{b})} \mu_Y(\boldsymbol{y}').$$
In the following, we provide an intuitive interpretation of the construction of the code, which is illustrated in Fig. 11. Assume that $\boldsymbol{c}$ is shared by the encoder and the decoder. For $\boldsymbol{c}$ and $\boldsymbol{x}$, the function $g_A$ generates $\boldsymbol{y}$ such that $A\boldsymbol{y} = \boldsymbol{c}$ and $(\boldsymbol{x}, \boldsymbol{y})$ is a jointly typical sequence. The rate of $\boldsymbol{c}$ should be less than $H(Y|X)$ to satisfy the saturation property. Then the encoder obtains the codeword $B\boldsymbol{y}$. The decoder obtains the reproduction $\boldsymbol{y}$ by using $g_{AB}$ from $\boldsymbol{c}$ and the codeword $B\boldsymbol{y}$ if the rate of $\boldsymbol{c}$ and $B\boldsymbol{y}$ is greater than $H(Y)$, so as to satisfy the collision-resistant property. Then we can set the encoding rate close to $H(Y) - H(Y|X) = I(X; Y)$. Since $(\boldsymbol{x}, \boldsymbol{y})$ is jointly typical, $\rho_n(\boldsymbol{x}, \boldsymbol{y})/n$ is close to the distortion criterion. We have the following theorem.
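For concreteness, the additive extension $\rho_n$ can be written out directly. The following is a minimal sketch assuming binary alphabets and Hamming distortion, which is only one admissible choice of $\rho$ (with $\rho_{\max} = 1$):

```python
# one admissible per-letter distortion: Hamming on X = Y = {0, 1}
def rho(x, y):
    return 0.0 if x == y else 1.0

def rho_n(xs, ys):
    # rho_n(x, y) = sum_{i=1}^{n} rho(x_i, y_i)
    return sum(rho(xi, yi) for xi, yi in zip(xs, ys))

assert rho_n((0, 1, 1, 0), (0, 0, 1, 1)) == 2.0
assert rho_n((0, 1), (0, 1)) == 0.0
```

Note that the theorems below state bounds on the normalized distortion $\rho_n/n$, which is the per-letter average compared against the distortion criterion $D$.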
It should be noted that the source is allowed to be non-binary and biased, and the distortion measure $\rho$ is arbitrary.

Theorem 6: For given $\varepsilon_A, \varepsilon_B > 0$ satisfying
$$\varepsilon_A + 2\zeta_Y(3\varepsilon_A) < \varepsilon_B,$$
assume that $(\mathcal{A}, p_A)$ and $(\mathcal{B}, p_B)$ have hash properties. Let $X$ be a stationary memoryless source. Then for all sufficiently large $n$ there are functions (sparse matrices) $A \in \mathcal{A}$, $B \in \mathcal{B}$, and a vector $\boldsymbol{c} \in \mathrm{Im}\,\mathcal{A}$ such that
$$R(B) = I(X; Y) + \varepsilon_B$$
$$\frac{E_X[\rho_n(X^n, \varphi^{-1}(\varphi(X^n)))]}{n} \le E_{XY}[\rho(X, Y)] + 3|\mathcal{X}||\mathcal{Y}|\rho_{\max}\sqrt{\varepsilon_A}.$$
By assuming that $\mu_{Y|X}$ attains the rate-distortion bound and letting $\varepsilon_A, \varepsilon_B \to 0$, the rate-distortion pair of the proposed code approaches this bound.

Fig. 12. Construction of Wyner-Ziv Source Code

Next, we consider the Wyner-Ziv problem introduced in [47] (illustrated in Fig. 3). Let $\rho : \mathcal{X} \times \mathcal{W} \to [0, \infty)$ be a distortion measure satisfying $\rho_{\max} \equiv \max_{x, w} \rho(x, w) < \infty$. We define $\rho_n(\boldsymbol{x}, \boldsymbol{w})$ as
$$\rho_n(\boldsymbol{x}, \boldsymbol{w}) \equiv \sum_{i=1}^n \rho(x_i, w_i)$$
for each $\boldsymbol{x} \equiv (x_1, \ldots, x_n)$ and $\boldsymbol{w} \equiv (w_1, \ldots, w_n)$. For a probability distribution $\mu_{XZ}$, the rate-distortion function $R_{X|Z}(D)$ is given by
$$R_{X|Z}(D) = \min_{\mu_{Y|X}, f : E_{XYZ}[\rho(X, f(Y, Z))] \le D} [I(X; Y) - I(Y; Z)],$$
where the minimum is taken over all conditional probability distributions $\mu_{Y|X}$ and functions $f : \mathcal{Y} \times \mathcal{Z} \to \mathcal{W}$, and the joint distribution $\mu_{XYZ}$ of $(X, Y, Z)$ is given by
$$\mu_{XYZ}(x, y, z) \equiv \mu_{XZ}(x, z) \mu_{Y|X}(y|x). \qquad (20)$$
In the following, we assume that $\mu_{Y|X}$ is fixed. We fix functions
$$A : \mathcal{Y}^n \to \mathcal{Y}^{l_A}$$
$$B : \mathcal{Y}^n \to \mathcal{Y}^{l_B}$$
and a vector $\boldsymbol{c} \in \mathcal{Y}^{l_A}$ available to construct an encoder and a decoder, where
$$l_A \equiv \frac{n[H(Y|X) - \varepsilon_A]}{\log|\mathcal{Y}|}$$
$$l_B \equiv \frac{n[H(Y|Z) - H(Y|X) + \varepsilon_B]}{\log|\mathcal{Y}|} = \frac{n[I(X; Y) - I(Y; Z) + \varepsilon_B]}{\log|\mathcal{Y}|}.$$
We define the encoder and the decoder (illustrated in Fig. 12)
$$\varphi : \mathcal{X}^n \to \mathcal{Y}^{l_B}$$
$$\varphi^{-1} : \mathcal{Y}^{l_B} \times \mathcal{Z}^n \to \mathcal{W}^n$$
as
$$\varphi(\boldsymbol{x}) \equiv B g_A(\boldsymbol{c}|\boldsymbol{x})$$
$$\varphi^{-1}(\boldsymbol{b}|\boldsymbol{z}) \equiv f^n(g_{AB}(\boldsymbol{c}, \boldsymbol{b}|\boldsymbol{z}), \boldsymbol{z}),$$
where
$$g_A(\boldsymbol{c}|\boldsymbol{x}) \equiv \arg\max_{\boldsymbol{y}' \in \mathcal{C}_A(\boldsymbol{c})} \mu_{Y|X}(\boldsymbol{y}'|\boldsymbol{x})$$
$$g_{AB}(\boldsymbol{c}, \boldsymbol{b}|\boldsymbol{z}) \equiv \arg\max_{\boldsymbol{y}' \in \mathcal{C}_{AB}(\boldsymbol{c}, \boldsymbol{b})} \mu_{Y|Z}(\boldsymbol{y}'|\boldsymbol{z})$$
and we define $f^n(\boldsymbol{y}, \boldsymbol{z}) \equiv (w_1, \ldots, w_n)$ by $w_i \equiv f(y_i, z_i)$ for each $\boldsymbol{y} \equiv (y_1, \ldots, y_n)$ and $\boldsymbol{z} \equiv (z_1, \ldots, z_n)$. The rate $R(B)$ of this code is given by
$$R(B) \equiv \frac{l_B \log|\mathcal{Y}|}{n}.$$
In the following, we provide an intuitive interpretation of the construction of the code, which is illustrated in Fig. 12. Assume that $\boldsymbol{c}$ is shared by the encoder and the decoder. For $\boldsymbol{c}$ and $\boldsymbol{x}$, the function $g_A$ generates $\boldsymbol{y}$ such that $A\boldsymbol{y} = \boldsymbol{c}$ and $(\boldsymbol{x}, \boldsymbol{y})$ is a jointly typical sequence. The rate of $\boldsymbol{c}$ should be less than $H(Y|X)$ to satisfy the saturation property. Then the encoder obtains the codeword $B\boldsymbol{y}$. The decoder obtains the reproduction $\boldsymbol{y}$ by using $g_{AB}$ from $\boldsymbol{c}$, the codeword $B\boldsymbol{y}$, and the side information $\boldsymbol{z}$ if the rate of $\boldsymbol{c}$ and $B\boldsymbol{y}$ is greater than $H(Y|Z)$, so as to satisfy the collision-resistant property. Then we can set the encoding rate close to $H(Y|Z) - H(Y|X) = I(X; Y) - I(Y; Z)$. Since $(\boldsymbol{x}, \boldsymbol{y}, \boldsymbol{z})$ is jointly typical, $\rho_n(\boldsymbol{x}, f^n(\boldsymbol{y}, \boldsymbol{z}))/n$ is close to the distortion criterion. We have the following theorem. It should be noted that the source is allowed to be non-binary and biased, the side information is allowed to be asymmetric, and the distortion measure $\rho$ is arbitrary.

Theorem 7: For given $\varepsilon_A, \varepsilon_B > 0$ satisfying
$$\varepsilon_A + 2\zeta_{YZ}(3\varepsilon_A) < \varepsilon_B, \qquad (21)$$
assume that $(\mathcal{A}, p_A)$ and $(\mathcal{B}, p_B)$ have hash properties. Let $(X, Z)$ be a pair of stationary memoryless sources.
Then for all sufficiently large $n$ there are functions (sparse matrices) $A \in \mathcal{A}$, $B \in \mathcal{B}$, and a vector $\boldsymbol{c} \in \mathrm{Im}\,\mathcal{A}$ such that
$$R(B) = I(X; Y) - I(Y; Z) + \varepsilon_B$$
$$\frac{E_{XZ}[\rho_n(X^n, \varphi^{-1}(\varphi(X^n), Z^n))]}{n} \le E_{XYZ}[\rho(X, f(Y, Z))] + 3|\mathcal{X}||\mathcal{Y}||\mathcal{Z}|\rho_{\max}\sqrt{\varepsilon_A}.$$
By assuming that $\mu_{Y|X}$ and $f$ attain the Wyner-Ziv bound and letting $\varepsilon_A, \varepsilon_B \to 0$, the rate-distortion pair of the proposed code approaches this bound.

Fig. 13. Construction of One-helps-one Source Code

The proof is given in Section VI-F. It should be noted that Theorem 6 is a special case of the Wyner-Ziv problem with $|\mathcal{Z}| \equiv 1$, $\mathcal{W} \equiv \mathcal{Y}$, and $f(y, z) \equiv y$.

Remark 6: In [25][36][23][27][14], lossy source codes are proposed using sparse matrices for the binary alphabet and Hamming distance. In the encoders proposed in [36][23][27][14], they obtain a codeword vector called the 'middle layer' (see [23]) by using a matrix. In their constructed decoder, they operate another matrix on the codeword vector. In our construction of the decoder, we obtain the reproduction $\boldsymbol{y}$ by using a sparse matrix $A$ and $g_A$, and compress $\boldsymbol{y}$ with another matrix $B$. It should be noted that the dimension of $\boldsymbol{y}$ is different from that of the middle layer, and we need the ML decoder $g_{AB}$ in the construction of the decoder because $\boldsymbol{y}$ is compressed by using $B$. In [24], a code for the Wyner-Ziv problem is proposed, and there are similar differences. Our approach is based on the code presented in [28][30] and is similar to the codes presented in [42][43][49].

D. One-helps-one Problem

In this section, we consider the One-helps-one problem illustrated in Fig. 4.
The achievable rate region for this problem is given by the set of encoding rate pairs $(R_X, R_Y)$ satisfying
$$R_X \ge H(X|Z)$$
$$R_Y \ge I(Y; Z),$$
where the joint distribution $\mu_{XYZ}$ is given by
$$\mu_{XYZ}(x, y, z) = \mu_{XY}(x, y) \mu_{Z|Y}(z|y). \qquad (22)$$
In the following, we construct a code by combining a Slepian-Wolf code and a lossy source code. We assume that $\mu_{Z|Y}$ is fixed. We fix functions
$$\widehat{B} : \mathcal{X}^n \to \mathcal{X}^{l_{\widehat{B}}}$$
$$A : \mathcal{Z}^n \to \mathcal{Z}^{l_A}$$
$$B : \mathcal{Z}^n \to \mathcal{Z}^{l_B}$$
and a vector $\boldsymbol{c} \in \mathcal{Z}^{l_A}$ available to construct the encoders and the decoder, where
$$l_{\widehat{B}} \equiv \frac{n[H(X|Z) + \varepsilon_{\widehat{B}}]}{\log|\mathcal{X}|}$$
$$l_A \equiv \frac{n[H(Z|Y) - \varepsilon_A]}{\log|\mathcal{Z}|}$$
$$l_B \equiv \frac{n[I(Y; Z) + \varepsilon_B]}{\log|\mathcal{Z}|}.$$
We define the encoders and the decoder (illustrated in Fig. 13)
$$\varphi_X : \mathcal{X}^n \to \mathcal{X}^{l_{\widehat{B}}}$$
$$\varphi_Y : \mathcal{Y}^n \to \mathcal{Z}^{l_B}$$
$$\varphi^{-1} : \mathcal{X}^{l_{\widehat{B}}} \times \mathcal{Z}^{l_B} \to \mathcal{X}^n$$
as
$$\varphi_X(\boldsymbol{x}) \equiv \widehat{B}\boldsymbol{x}$$
$$\varphi_Y(\boldsymbol{y}) \equiv B g_A(\boldsymbol{c}|\boldsymbol{y})$$
$$\varphi^{-1}(\boldsymbol{b}_X, \boldsymbol{b}_Y) \equiv g_{\widehat{B}}(\boldsymbol{b}_X|g_{AB}(\boldsymbol{c}, \boldsymbol{b}_Y)),$$
where
$$g_A(\boldsymbol{c}|\boldsymbol{y}) \equiv \arg\max_{\boldsymbol{z}' \in \mathcal{C}_A(\boldsymbol{c})} \mu_{Z|Y}(\boldsymbol{z}'|\boldsymbol{y})$$
$$g_{AB}(\boldsymbol{c}, \boldsymbol{b}_Y) \equiv \arg\max_{\boldsymbol{z}' \in \mathcal{C}_{AB}(\boldsymbol{c}, \boldsymbol{b}_Y)} \mu_Z(\boldsymbol{z}')$$
$$g_{\widehat{B}}(\boldsymbol{b}_X|\boldsymbol{z}) \equiv \arg\max_{\boldsymbol{x}' \in \mathcal{C}_{\widehat{B}}(\boldsymbol{b}_X)} \mu_{X|Z}(\boldsymbol{x}'|\boldsymbol{z}).$$
The pair of encoding rates $(R_X, R_Y)$ is given by
$$R_X \equiv \frac{l_{\widehat{B}} \log|\mathcal{X}|}{n}$$
$$R_Y \equiv \frac{l_B \log|\mathcal{Z}|}{n}$$
and the decoding error probability $\mathrm{Error}_{XY}(A, B, \widehat{B}, \boldsymbol{c})$ is given by
$$\mathrm{Error}_{XY}(A, B, \widehat{B}, \boldsymbol{c}) \equiv \mu_{XY}(\{(\boldsymbol{x}, \boldsymbol{y}) : \varphi^{-1}(\varphi_X(\boldsymbol{x}), \varphi_Y(\boldsymbol{y})) \neq \boldsymbol{x}\}).$$
We have the following theorem.

Theorem 8: For given $\varepsilon_A, \varepsilon_B, \varepsilon_{\widehat{B}} > 0$ satisfying
$$\varepsilon_B > \varepsilon_A + \zeta_Z(3\varepsilon_A) \qquad (23)$$
$$\varepsilon_{\widehat{B}} > 2\zeta_{XZ}(3\varepsilon_A), \qquad (24)$$
assume that $(\mathcal{A}, p_A)$, $(\mathcal{B}, p_B)$, and $(\widehat{\mathcal{B}}, p_{\widehat{B}})$ have hash properties. Let $(X, Y)$ be a pair of stationary memoryless sources. Then, for any $\delta > 0$ and all sufficiently large $n$, there are functions (sparse matrices) $A \in \mathcal{A}$, $B \in \mathcal{B}$, $\widehat{B} \in \widehat{\mathcal{B}}$, and a vector $\boldsymbol{c} \in \mathrm{Im}\,\mathcal{A}$ such that
$$R_X = H(X|Z) + \varepsilon_{\widehat{B}}$$
$$R_Y = I(Y; Z) + \varepsilon_B$$
$$\mathrm{Error}_{XY}(A, B, \widehat{B}, \boldsymbol{c}) \le \delta.$$
The proof is given in Section VI-G.

VI. PROOF OF LEMMAS AND THEOREMS

In the proofs, we use the method of types, which is given in the Appendix. Throughout this section, we assume that the probability distributions $p_C$, $p_{\widehat{C}}$, $p_M$ are uniform and that the random variables $A$, $B$, $\widehat{A}$, $\widehat{B}$, $C$, $\widehat{C}$, and $M$ are mutually independent.

A. Proof of Lemmas 1-6

We prepare the following lemma, which comes from the fact that $p_C$ is the uniform distribution on $\mathrm{Im}\,\mathcal{A}$ and the random variables $A$ and $C$ are mutually independent.

Lemma 9: Let $p_A$ be the distribution on the set of functions and $p_C$ be the uniform distribution on $\mathrm{Im}\,\mathcal{A}$. We assume that a joint distribution $p_{AC}$ satisfies $p_{AC}(A, \boldsymbol{c}) = p_A(A) p_C(\boldsymbol{c})$ for any $A$ and $\boldsymbol{c} \in \mathrm{Im}\,\mathcal{A}$. Then
$$\sum_{\boldsymbol{c}} p_C(\boldsymbol{c}) \chi(A\boldsymbol{u} = \boldsymbol{c}) = \frac{1}{|\mathrm{Im}\,\mathcal{A}|} \qquad (25)$$
for any $A$ and $\boldsymbol{u} \in \mathcal{U}^n$,
$$\sum_{A, \boldsymbol{c}} p_{AC}(A, \boldsymbol{c}) \chi(A\boldsymbol{u} = \boldsymbol{c}) = \frac{1}{|\mathrm{Im}\,\mathcal{A}|} \qquad (26)$$
for any $\boldsymbol{u} \in \mathcal{U}^n$, and
$$\sum_{A, \boldsymbol{c}} p_{AC}(A, \boldsymbol{c}) |\mathcal{G} \cap \mathcal{C}_A(\boldsymbol{c})| = \sum_{\boldsymbol{u} \in \mathcal{G}} \sum_{A, \boldsymbol{c}} p_{AC}(A, \boldsymbol{c}) \chi(A\boldsymbol{u} = \boldsymbol{c}) = \frac{|\mathcal{G}|}{|\mathrm{Im}\,\mathcal{A}|} \qquad (27)$$
$$p_{AC}(\{(A, \boldsymbol{c}) : \mathcal{G} \cap \mathcal{C}_A(\boldsymbol{c}) \neq \emptyset\}) \le \frac{|\mathcal{G}|}{|\mathrm{Im}\,\mathcal{A}|} \qquad (28)$$
for any $\mathcal{G} \subset \mathcal{U}^n$.

Proof: First, we prove (25). Since $A\boldsymbol{u}$ is determined uniquely, we have $\sum_{\boldsymbol{c}} \chi(A\boldsymbol{u} = \boldsymbol{c}) = 1$. Then we have
$$\sum_{\boldsymbol{c}} p_C(\boldsymbol{c}) \chi(A\boldsymbol{u} = \boldsymbol{c}) = \sum_{\boldsymbol{c}} \frac{\chi(A\boldsymbol{u} = \boldsymbol{c})}{|\mathrm{Im}\,\mathcal{A}|} = \frac{1}{|\mathrm{Im}\,\mathcal{A}|}.$$
Next, we prove (26). From (25), we have
$$\sum_{A, \boldsymbol{c}} p_{AC}(A, \boldsymbol{c}) \chi(A\boldsymbol{u} = \boldsymbol{c}) = \sum_A p_A(A) \sum_{\boldsymbol{c}} p_C(\boldsymbol{c}) \chi(A\boldsymbol{u} = \boldsymbol{c}) = \sum_A \frac{p_A(A)}{|\mathrm{Im}\,\mathcal{A}|} = \frac{1}{|\mathrm{Im}\,\mathcal{A}|}.$$
Next, we prove (27). From (26), we have
$$\sum_{A, \boldsymbol{c}} p_{AC}(A, \boldsymbol{c}) |\mathcal{G} \cap \mathcal{C}_A(\boldsymbol{c})| = \sum_{A, \boldsymbol{c}} p_{AC}(A, \boldsymbol{c}) \sum_{\boldsymbol{u} \in \mathcal{G}} \chi(A\boldsymbol{u} = \boldsymbol{c}) = \sum_{\boldsymbol{u} \in \mathcal{G}} \sum_{A, \boldsymbol{c}} p_{AC}(A, \boldsymbol{c}) \chi(A\boldsymbol{u} = \boldsymbol{c}) = \frac{|\mathcal{G}|}{|\mathrm{Im}\,\mathcal{A}|}.$$
Finally, we prove (28). From (27), we have
$$\begin{aligned} p_{AC}(\{(A, \boldsymbol{c}) : \mathcal{G} \cap \mathcal{C}_A(\boldsymbol{c}) \neq \emptyset\}) &= p_{AC}(\{(A, \boldsymbol{c}) : \exists \boldsymbol{u} \in \mathcal{G} \cap \mathcal{C}_A(\boldsymbol{c})\}) \\ &\le \sum_{\boldsymbol{u} \in \mathcal{G}} p_{AC}(\{(A, \boldsymbol{c}) : \boldsymbol{u} \in \mathcal{C}_A(\boldsymbol{c})\}) \\ &= \sum_{\boldsymbol{u} \in \mathcal{G}} \sum_{A, \boldsymbol{c}} p_{AC}(A, \boldsymbol{c}) \chi(A\boldsymbol{u} = \boldsymbol{c}) \\ &= \frac{|\mathcal{G}|}{|\mathrm{Im}\,\mathcal{A}|}. \end{aligned}$$
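Identities (25)-(28) lend themselves to exhaustive verification on a toy ensemble. The following is a minimal sketch checking (26); it assumes (our choice, not the paper's) the uniform ensemble of surjective $2 \times 2$ matrices over GF(2), so that $\mathrm{Im}\,\mathcal{A}$ is the same set for every member and $p_C$ can be a single uniform distribution on it:

```python
from itertools import product
from fractions import Fraction

q, n, lA = 2, 2, 2

def mat_vec(A, u):
    return tuple(sum(a * x for a, x in zip(row, u)) % q for row in A)

def image(A):
    return {mat_vec(A, v) for v in product(range(q), repeat=n)}

# restrict to surjective matrices so Im A is a common set for every member,
# matching the lemma's assumption that c is uniform on Im A
ens = [A for A in product(product(range(q), repeat=n), repeat=lA)
       if len(image(A)) == q ** lA]
pA = Fraction(1, len(ens))
imA = list(product(range(q), repeat=lA))
pC = Fraction(1, len(imA))

def lhs(u):
    # left-hand side of (26): sum_{A,c} p_A(A) p_C(c) chi(Au = c)
    return sum(pA * pC for A in ens for c in imA if mat_vec(A, u) == c)

# (26): the sum equals 1/|Im A| for every u
assert all(lhs(u) == Fraction(1, q ** lA) for u in product(range(q), repeat=n))
```

The check works because, for every fixed $A$, exactly one $\boldsymbol{c}$ satisfies $A\boldsymbol{u} = \boldsymbol{c}$, which is precisely the observation behind (25).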
Proof of Lemma 1: Since $(\mathcal{A}, p_A)$ has an $(\alpha_A, \beta_A)$-hash property, we have
$$\begin{aligned} p_A(\{A : [\mathcal{G} \setminus \{\boldsymbol{u}\}] \cap \mathcal{C}_A(A\boldsymbol{u}) \neq \emptyset\}) &\le \sum_{\boldsymbol{u}' \in \mathcal{G} \setminus \{\boldsymbol{u}\}} p_A(\{A : A\boldsymbol{u} = A\boldsymbol{u}'\}) \\ &\le |\{\boldsymbol{u}\} \cap [\mathcal{G} \setminus \{\boldsymbol{u}\}]| + \frac{|\mathcal{G} \setminus \{\boldsymbol{u}\}| \alpha_A}{|\mathrm{Im}\,\mathcal{A}|} + \min\{|\{\boldsymbol{u}\}|, |\mathcal{G} \setminus \{\boldsymbol{u}\}|\} \beta_A \\ &\le \frac{|\mathcal{G}| \alpha_A}{|\mathrm{Im}\,\mathcal{A}|} + \beta_A. \end{aligned}$$

Proof of Lemma 2: First, since $(\mathcal{A}, p_A)$ has an $(\alpha_A, \beta_A)$-hash property, we have
$$\begin{aligned} \sum_{\boldsymbol{u}, \boldsymbol{u}' \in T} \sum_{A, \boldsymbol{c}} p_{AC}(A, \boldsymbol{c}) \chi(A\boldsymbol{u} = \boldsymbol{c}) \chi(A\boldsymbol{u}' = \boldsymbol{c}) &= \sum_{\boldsymbol{u}, \boldsymbol{u}' \in T} \sum_{A, \boldsymbol{c}} p_{AC}(A, \boldsymbol{c}) \chi(A\boldsymbol{u} = \boldsymbol{c}) \chi(A\boldsymbol{u} = A\boldsymbol{u}') \\ &= \sum_{\boldsymbol{u}, \boldsymbol{u}' \in T} \sum_A p_A(A) \chi(A\boldsymbol{u} = A\boldsymbol{u}') \sum_{\boldsymbol{c}} p_C(\boldsymbol{c}) \chi(A\boldsymbol{u} = \boldsymbol{c}) \\ &= \frac{1}{|\mathrm{Im}\,\mathcal{A}|} \sum_{\boldsymbol{u}, \boldsymbol{u}' \in T} \sum_A p_A(A) \chi(A\boldsymbol{u} = A\boldsymbol{u}') \\ &\le \frac{|T|}{|\mathrm{Im}\,\mathcal{A}|} + \frac{|T|^2 \alpha_A}{|\mathrm{Im}\,\mathcal{A}|^2} + \frac{|T| \beta_A}{|\mathrm{Im}\,\mathcal{A}|}, \end{aligned} \qquad (29)$$
where the third equality comes from (25). Next, we have
$$\begin{aligned} \sum_{A, \boldsymbol{c}} p_{AC}(A, \boldsymbol{c}) \left[\sum_{\boldsymbol{u} \in T} \chi(A\boldsymbol{u} = \boldsymbol{c}) - \frac{|T|}{|\mathrm{Im}\,\mathcal{A}|}\right]^2 &= \sum_{A, \boldsymbol{c}} p_{AC}(A, \boldsymbol{c}) \left[\sum_{\boldsymbol{u} \in T} \chi(A\boldsymbol{u} = \boldsymbol{c})\right]^2 - \frac{2|T|}{|\mathrm{Im}\,\mathcal{A}|} \sum_{A, \boldsymbol{c}} p_{AC}(A, \boldsymbol{c}) \sum_{\boldsymbol{u} \in T} \chi(A\boldsymbol{u} = \boldsymbol{c}) + \frac{|T|^2}{|\mathrm{Im}\,\mathcal{A}|^2} \\ &= \sum_{\boldsymbol{u}, \boldsymbol{u}' \in T} \sum_{A, \boldsymbol{c}} p_{AC}(A, \boldsymbol{c}) \chi(A\boldsymbol{u} = \boldsymbol{c}) \chi(A\boldsymbol{u}' = \boldsymbol{c}) - \frac{2|T|}{|\mathrm{Im}\,\mathcal{A}|} \sum_{\boldsymbol{u} \in T} \sum_{A, \boldsymbol{c}} p_{AC}(A, \boldsymbol{c}) \chi(A\boldsymbol{u} = \boldsymbol{c}) + \frac{|T|^2}{|\mathrm{Im}\,\mathcal{A}|^2} \\ &\le \frac{|T|^2 [\alpha_A - 1]}{|\mathrm{Im}\,\mathcal{A}|^2} + \frac{|T| [\beta_A + 1]}{|\mathrm{Im}\,\mathcal{A}|}, \end{aligned} \qquad (30)$$
where the last inequality comes from (27) and (29). Finally, from the fact that $T \neq \emptyset$, we have
$$\begin{aligned} p_{AC}(\{(A, \boldsymbol{c}) : T \cap \mathcal{C}_A(\boldsymbol{c}) = \emptyset\}) &= p_{AC}(\{(A, \boldsymbol{c}) : \forall \boldsymbol{u} \in T,\ A\boldsymbol{u} \neq \boldsymbol{c}\}) \\ &= p_{AC}\left(\left\{(A, \boldsymbol{c}) : \sum_{\boldsymbol{u} \in T} \chi(A\boldsymbol{u} = \boldsymbol{c}) = 0\right\}\right) \\ &\le p_{AC}\left(\left\{(A, \boldsymbol{c}) : \left|\sum_{\boldsymbol{u} \in T} \chi(A\boldsymbol{u} = \boldsymbol{c}) - \frac{|T|}{|\mathrm{Im}\,\mathcal{A}|}\right| \ge \frac{|T|}{|\mathrm{Im}\,\mathcal{A}|}\right\}\right) \\ &\le \frac{\sum_{A, \boldsymbol{c}} p_{AC}(A, \boldsymbol{c}) \left[\sum_{\boldsymbol{u} \in T} \chi(A\boldsymbol{u} = \boldsymbol{c}) - \frac{|T|}{|\mathrm{Im}\,\mathcal{A}|}\right]^2}{\frac{|T|^2}{|\mathrm{Im}\,\mathcal{A}|^2}} \\ &\le \alpha_A - 1 + \frac{|\mathrm{Im}\,\mathcal{A}| [\beta_A + 1]}{|T|}, \end{aligned}$$
where the second inequality comes from the Markov inequality and the third inequality comes from (30).
Proof of Lemma 3: Since $(\mathcal{A}, p_A)$ has an $(\alpha_A, \beta_A)$-hash property, we have
$$\begin{aligned} p_{AC}(\{(A, \boldsymbol{c}) : \mathcal{G} \cap \mathcal{C}_A(\boldsymbol{c}) \neq \emptyset,\ \boldsymbol{u} \in \mathcal{C}_A(\boldsymbol{c})\}) &= p_{AC}(\{(A, \boldsymbol{c}) : \mathcal{G} \cap \mathcal{C}_A(A\boldsymbol{u}) \neq \emptyset,\ \boldsymbol{u} \in \mathcal{C}_A(\boldsymbol{c})\}) \\ &= \sum_A p_A(A) \chi(\mathcal{G} \cap \mathcal{C}_A(A\boldsymbol{u}) \neq \emptyset) \sum_{\boldsymbol{c}} p_C(\boldsymbol{c}) \chi(A\boldsymbol{u} = \boldsymbol{c}) \\ &= \frac{p_A(\{A : \mathcal{G} \cap \mathcal{C}_A(A\boldsymbol{u}) \neq \emptyset\})}{|\mathrm{Im}\,\mathcal{A}|} \\ &\le \frac{|\mathcal{G}| \alpha_A}{|\mathrm{Im}\,\mathcal{A}|^2} + \frac{\beta_A}{|\mathrm{Im}\,\mathcal{A}|}, \end{aligned}$$
where the second equality comes from the fact that the random variables $A$ and $C$ are independent, the third equality comes from (25), and the inequality comes from Lemma 1.

Proof of Lemma 4: By applying Lemma 1 to the set $[\mathcal{G} \setminus \{\boldsymbol{u}_{A,\boldsymbol{c}}\}] \cap \mathcal{C}_A(\boldsymbol{c})$, we have
$$\begin{aligned} p_{ABC}(\{(A, B, \boldsymbol{c}) : [\mathcal{G} \setminus \{\boldsymbol{u}_{A,\boldsymbol{c}}\}] \cap \mathcal{C}_{AB}(\boldsymbol{c}, B\boldsymbol{u}_{A,\boldsymbol{c}}) \neq \emptyset\}) &\le \sum_{A, \boldsymbol{c}} p_{AC}(A, \boldsymbol{c}) \left[\frac{|[\mathcal{G} \setminus \{\boldsymbol{u}_{A,\boldsymbol{c}}\}] \cap \mathcal{C}_A(\boldsymbol{c})| \alpha_B}{|\mathrm{Im}\,\mathcal{B}|} + \beta_B\right] \\ &\le \sum_{A, \boldsymbol{c}} p_{AC}(A, \boldsymbol{c}) \frac{|\mathcal{G} \cap \mathcal{C}_A(\boldsymbol{c})| \alpha_B}{|\mathrm{Im}\,\mathcal{B}|} + \beta_B \\ &= \frac{|\mathcal{G}| \alpha_B}{|\mathrm{Im}\,\mathcal{A}||\mathrm{Im}\,\mathcal{B}|} + \beta_B, \end{aligned}$$
where the last equality comes from (27).

Proof of Lemma 5: Let $\mathcal{G}(\boldsymbol{v}) \equiv \{\boldsymbol{u} : \mu_{U|V}(\boldsymbol{u}|\boldsymbol{v}) > 2^{-n[H(U|V) - 2\varepsilon]}\}$. If $\mathcal{G}(\boldsymbol{v}) \cap \mathcal{C}_A(\boldsymbol{c}) = \emptyset$ and $T(\boldsymbol{v}) \cap \mathcal{C}_A(\boldsymbol{c}) \neq \emptyset$, then there is $\boldsymbol{u} \in T(\boldsymbol{v}) \cap \mathcal{C}_A(\boldsymbol{c})$ and $g_A(\boldsymbol{c}|\boldsymbol{v})$ satisfies $\mu_{U|V}(g_A(\boldsymbol{c}|\boldsymbol{v})|\boldsymbol{v}) \le 2^{-n[H(U|V) - 2\varepsilon]}$. Since $\boldsymbol{u} \in \mathcal{C}_A(\boldsymbol{c})$, we have $\mu_{U|V}(g_A(\boldsymbol{c}|\boldsymbol{v})|\boldsymbol{v}) \ge \mu_{U|V}(\boldsymbol{u}|\boldsymbol{v})$. This implies that $g_A(\boldsymbol{c}|\boldsymbol{v}) \in T(\boldsymbol{v})$ from the assumption on $T(\boldsymbol{v})$. From Lemma 2 and (28), we have
$$\begin{aligned} p_{AC}(\{(A, \boldsymbol{c}) : g_A(\boldsymbol{c}|\boldsymbol{v}) \notin T(\boldsymbol{v})\}) &\le 1 - p_{AC}(\{(A, \boldsymbol{c}) : g_A(\boldsymbol{c}|\boldsymbol{v}) \in T(\boldsymbol{v})\}) \\ &\le 1 - p_{AC}(\{(A, \boldsymbol{c}) : T(\boldsymbol{v}) \cap \mathcal{C}_A(\boldsymbol{c}) \neq \emptyset,\ \mathcal{G}(\boldsymbol{v}) \cap \mathcal{C}_A(\boldsymbol{c}) = \emptyset\}) \\ &\le p_{AC}(\{(A, \boldsymbol{c}) : T(\boldsymbol{v}) \cap \mathcal{C}_A(\boldsymbol{c}) = \emptyset\}) + p_{AC}(\{(A, \boldsymbol{c}) : \mathcal{G}(\boldsymbol{v}) \cap \mathcal{C}_A(\boldsymbol{c}) \neq \emptyset\}) \\ &\le \alpha_A - 1 + \frac{|\mathrm{Im}\,\mathcal{A}| [\beta_A + 1]}{|T(\boldsymbol{v})|} + \frac{|\mathcal{G}(\boldsymbol{v})|}{|\mathrm{Im}\,\mathcal{A}|} \\ &\le \alpha_A - 1 + \frac{|\mathrm{Im}\,\mathcal{A}| [\beta_A + 1]}{|T(\boldsymbol{v})|} + \frac{2^{-n\varepsilon} |\mathcal{U}|^{l_A}}{|\mathrm{Im}\,\mathcal{A}|}, \end{aligned}$$
where the last inequality comes from the fact that $|\mathcal{G}(\boldsymbol{v})| \le 2^{n[H(U|V) - 2\varepsilon]}$.
Proof of Lemma 6: When $T(\boldsymbol{v}) \cap \mathcal{C}_A(\boldsymbol{c}) \neq \emptyset$, we can always find a member of $T(\boldsymbol{v})$ by using $\widehat{g}_A$. From Lemma 2, we have
$$p_{AC}(\{(A, \boldsymbol{c}) : \widehat{g}_A(\boldsymbol{c}|\boldsymbol{v}) \notin T(\boldsymbol{v})\}) \le p_{AC}(\{(A, \boldsymbol{c}) : T(\boldsymbol{v}) \cap \mathcal{C}_A(\boldsymbol{c}) = \emptyset\}) \le \alpha_A - 1 + \frac{|\mathrm{Im}\,\mathcal{A}| [\beta_A + 1]}{|T(\boldsymbol{v})|}.$$

B. Proof of Theorem 1 and Lemmas 7 and 8

For a type $t$, let $\mathcal{C}_t$ be defined as $\mathcal{C}_t \equiv \{\boldsymbol{u} \in \mathcal{U}^n : t(\boldsymbol{u}) = t\}$. We assume that $p_A(\{A : A\boldsymbol{u} = \boldsymbol{0}\})$ depends on $\boldsymbol{u}$ only through the type $t(\boldsymbol{u})$. For a given $\boldsymbol{u} \in \mathcal{C}_t$, we define
$$u_{A,t} \equiv u_A(\{A : A\boldsymbol{u} = \boldsymbol{0}\})$$
$$p_{A,t} \equiv p_A(\{A : A\boldsymbol{u} = \boldsymbol{0}\}),$$
where $u_A$ denotes the uniform distribution on the set of all $l_A \times n$ matrices, and we omit $\boldsymbol{u}$ from the left-hand side because the probabilities $u_A(\{A : A\boldsymbol{u} = \boldsymbol{0}\})$ and $p_A(\{A : A\boldsymbol{u} = \boldsymbol{0}\})$ depend on $\boldsymbol{u} \in \mathcal{C}_t$ only through the type $t$. We use the following lemma in the proof.

Lemma 10:
$$\alpha_A(n) = |\mathrm{Im}\,\mathcal{A}| \max_{t \in \widehat{\mathcal{H}}} p_{A,t} \qquad (31)$$
$$\beta_A(n) = \sum_{t \in \mathcal{H} \setminus \widehat{\mathcal{H}}} |\mathcal{C}_t| p_{A,t}, \qquad (32)$$
where $\mathcal{H}$ is the set of all types of length $n$ except the type of the zero vector.

Proof: Since we can find $|\mathcal{U}|^{[n-1]l_A}$ matrices $A$ satisfying $A\boldsymbol{u} = \boldsymbol{0}$ for $\boldsymbol{u} \in \mathcal{C}_t$, we have
$$u_{A,t} = \frac{|\mathcal{U}|^{[n-1]l_A}}{|\mathcal{U}|^{nl_A}} = |\mathcal{U}|^{-l_A}.$$
We have
$$S(p_A, t) = \sum_A p_A(A) \sum_{\substack{\boldsymbol{u} \in \mathcal{C}_t \\ A\boldsymbol{u} = \boldsymbol{0}}} 1 = \sum_{\boldsymbol{u} \in \mathcal{C}_t} \sum_{A : A\boldsymbol{u} = \boldsymbol{0}} p_A(A) = |\mathcal{C}_t| p_{A,t}.$$
Similarly, we have $S(u_A, t) = |\mathcal{C}_t| u_{A,t}$. The lemma follows immediately from (9), (10), and the above equalities.

Proof of Theorem 1: Without loss of generality, we can assume that $|T| \le |T'|$.
We have

Σ_{u∈T, u′∈T′} p_A({A : Au = Au′})
  = Σ_{u∈T, u′∈T′} p_A({A : A[u − u′] = 0})
  ≤ Σ_{u∈T∩T′} p_A({A : A0 = 0}) + Σ_{t∈H} Σ_{u∈T, u′∈T′ : t(u−u′)=t} p_{A,t}
  ≤ Σ_{u∈T∩T′} 1 + Σ_{t∈Ĥ} Σ_{u∈T, u′∈T′ : t(u−u′)=t} p_{A,t} + Σ_{t∈H\Ĥ} Σ_{u∈T} |C_t| p_{A,t}
  ≤ |T ∩ T′| + Σ_{u∈T, u′∈T′ : t(u−u′)∈Ĥ} α_A(n) / |Im A| + |T| Σ_{t∈H\Ĥ} S(p_A, t)
  ≤ |T ∩ T′| + |T||T′| α_A(n) / |Im A| + |T| β_A(n)
  = |T ∩ T′| + |T||T′| α_A(n) / |Im A| + min{|T|, |T′|} β_A(n),

where the third inequality comes from (31) and the last equality comes from the assumption |T| ≤ |T′|. Since (α_A, β_A) satisfies (H2) and (H3), we have the fact that (A, p_A) has an (α_A, β_A)-hash property.

Proof of Lemma 7: Without loss of generality, we can assume that |T| ≤ |T′| and β_A(n) ≤ β_B(n). Similarly to the proof of Theorem 1, we have

Σ_{u∈T, u′∈T′} p_{AB}({(A, B) : (Au, Bu) = (Au′, Bu′)})
  = Σ_{u∈T∩T′} p_A({A : A0 = 0}) p_B({B : B0 = 0}) + Σ_{t∈H} Σ_{u∈T, u′∈T′ : t(u−u′)=t} p_{A,t} p_{B,t}
  ≤ |T ∩ T′| + Σ_{u∈T, u′∈T′ : t(u−u′)∈Ĥ} α_A(n) α_B(n) / [|Im A||Im B|] + |T| Σ_{t∈H\Ĥ} S(p_A, t)
  ≤ |T ∩ T′| + |T||T′| α_A(n) α_B(n) / [|Im A||Im B|] + |T| β_A(n)
  = |T ∩ T′| + |T||T′| α_{AB}(n) / [|Im A||Im B|] + min{|T|, |T′|} β_{AB}(n),

where the first inequality comes from the fact that p_{B,t} ≤ 1. Since (α_{AB}(n), β_{AB}(n)) satisfies (H2) and (H3), (A × B, p_{AB}) has an (α_{AB}, β_{AB})-hash property.

Proof of Lemma 8: Without loss of generality, we can assume that |T| ≤ |T′|. Let H_U and H_V be defined similarly to the definition of H, and let Ĥ_U and Ĥ_V be defined similarly to the definition of Ĥ.
Similarly to the proof of Theorem 1, we have

Σ_{(u,v)∈T, (u′,v′)∈T′} p_{AB}({(A, B) : (Au, Bv) = (Au′, Bv′)})
  = Σ_{(u,v)∈T∩T′} p_A({A : A0 = 0}) p_B({B : B0 = 0})
    + Σ_{t_U∈H_U, t_V∈H_V} Σ_{(u,v)∈T, (u′,v′)∈T′ : t(u−u′)=t_U, t(v−v′)=t_V} p_{A,t_U} p_{B,t_V}
  ≤ Σ_{(u,v)∈T∩T′} 1
    + Σ_{t_U∈Ĥ_U, t_V∈Ĥ_V} Σ_{(u,v)∈T, (u′,v′)∈T′ : t(u−u′)=t_U, t(v−v′)=t_V} α_A(n) α_B(n) / [|Im A||Im B|]
    + Σ_{t_U∈Ĥ_U, t_V∈H_V\Ĥ_V} Σ_{(u,v)∈T, (u′,v′)∈T′ : t(u−u′)=t_U, t(v−v′)=t_V} α_A(n) p_{B,t_V} / |Im A|
    + Σ_{t_U∈H_U\Ĥ_U, t_V∈Ĥ_V} Σ_{(u,v)∈T, (u′,v′)∈T′ : t(u−u′)=t_U, t(v−v′)=t_V} α_B(n) p_{A,t_U} / |Im B|
    + Σ_{t_U∈H_U\Ĥ_U, t_V∈H_V\Ĥ_V} Σ_{(u,v)∈T, (u′,v′)∈T′ : t(u−u′)=t_U, t(v−v′)=t_V} p_{A,t_U} p_{B,t_V}
  ≤ Σ_{(u,v)∈T∩T′} 1
    + Σ_{t_U∈Ĥ_U, t_V∈Ĥ_V} Σ_{(u,v)∈T, (u′,v′)∈T′ : t(u−u′)=t_U, t(v−v′)=t_V} α_A(n) α_B(n) / [|Im A||Im B|]
    + Σ_{(u,v)∈T} [α_A(n) / |Im A|] Σ_{t_V∈H_V\Ĥ_V} |C_{t_V}| p_{B,t_V}
    + Σ_{(u,v)∈T} [α_B(n) / |Im B|] Σ_{t_U∈H_U\Ĥ_U} |C_{t_U}| p_{A,t_U}
    + Σ_{(u,v)∈T} [Σ_{t_U∈H_U\Ĥ_U} |C_{t_U}| p_{A,t_U}] [Σ_{t_V∈H_V\Ĥ_V} |C_{t_V}| p_{B,t_V}]
  ≤ |T ∩ T′| + |T||T′| α_A(n) α_B(n) / [|Im A||Im B|] + |T| [α_A(n) β_B(n)/|Im A| + α_B(n) β_A(n)/|Im B| + β_A(n) β_B(n)]
  = |T ∩ T′| + |T||T′| α_{AB}(n) / [|Im A||Im B|] + min{|T|, |T′|} β′_{AB}(n),

where the first inequality comes from the facts that p_{A,t_U} ≤ 1 and p_{B,t_V} ≤ 1. Since (α_{AB}, β′_{AB}) satisfies (H2) and (H3), (A × B, p_{AB}) has an (α_{AB}, β′_{AB})-hash property.

C. Proof of Theorem 2

Throughout this section, let U ≡ GF(q), l ≡ nR, and let p_A be an ensemble of l × n sparse matrices as specified in Section IV, where we omit the dependence of the ensemble on l from the notation.
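The analysis of p_{A,t} in this section rests on a random walk over GF(q)^l whose n-step distribution has the Krawtchouk-type closed form (33). That closed form can be sanity-checked numerically against brute-force convolution for tiny parameters. The sketch below is ours (function names are not from the paper) and assumes each step adds a uniformly chosen nonzero increment at a uniformly chosen position, as in Lemma 11:

```python
from itertools import product
from math import comb

def step_dist(q, l):
    # one step: pick a position i and a nonzero increment u uniformly at random
    d = {}
    for i in range(l):
        for u in range(1, q):
            e = tuple(u if j == i else 0 for j in range(l))
            d[e] = d.get(e, 0.0) + 1.0 / (l * (q - 1))
    return d

def exact_Pn(q, l, n):
    # brute-force n-fold convolution over (Z_q)^l, starting from the zero vector
    P = {tuple([0] * l): 1.0}
    S = step_dist(q, l)
    for _ in range(n):
        Q = {}
        for c, pc in P.items():
            for s, ps in S.items():
                c2 = tuple((a + b) % q for a, b in zip(c, s))
                Q[c2] = Q.get(c2, 0.0) + pc * ps
        P = Q
    return P

def formula_Pn(q, l, n, c):
    # closed form (33): spectral expansion with Krawtchouk coefficients
    w = sum(1 for x in c if x != 0)
    total = 0.0
    for k in range(l + 1):
        eig = (1 - q * k / ((q - 1) * l)) ** n
        kraw = sum(comb(w, kp) * comb(l - w, k - kp) * (-1) ** kp * (q - 1) ** (k - kp)
                   for kp in range(w + 1) if 0 <= k - kp <= l - w)
        total += eig * kraw
    return total / q ** l
```

For small q, l, and n the two computations agree exactly (up to floating-point error), which is a useful check when re-deriving (33) from the Fourier argument in the proof of Lemma 11.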
It should be noted that l → ∞ as n → ∞. First, we prepare lemmas that provide an analytic expression for p_{A,t}.

Lemma 11: Consider a random walk on GF(q)^l defined as follows. Let c_n ∈ GF(q)^l be the position after n steps. At each step, the position is updated by the following rule:
1) Choose (i, u) ∈ {1, ..., l} × [GF(q) \ {0}] uniformly at random.
2) Add u to the i-th element of c_n.
Then the probability P_n(c) of the position c after n steps starting from the zero vector is given by

P_n(c) = q^{−l} Σ_{k=0}^{l} (1 − qk/[(q−1)l])^n Σ_{k′=0}^{w(c)} C(w(c), k′) C(l−w(c), k−k′) (−1)^{k′} (q−1)^{k−k′}.   (33)

Proof: Let Ĉ ⊂ GF(q)^l be the set of vectors with exactly one non-zero element, that is,

Ĉ ≡ {(0, ..., 0, c, 0, ..., 0), with c in the j-th position : j ∈ {1, ..., l}, c ∈ GF(q) \ {0}}.

Then the transition rule of this random walk is equivalent to the following:
1) Choose ĉ ∈ Ĉ uniformly at random.
2) Add ĉ to c_n, that is, c_{n+1} ≡ c_n + ĉ.
We have the following recursion formula for P_n(c):

P_1(c) = 1/[(q−1)l] if c ∈ Ĉ, and P_1(c) = 0 otherwise,
P_{n+1}(c) = Σ_{c′∈GF(q)^l} P_n(c′) P_1(c − c′) = [P_n ∗ P_1](c),

where P_n ∗ P_1 denotes the convolution. We obtain (33) by using the formulas

F P_n = [F P_{n−1}][F P_1] = ··· = [F P_1]^n
P_n = F^{−1} F P_n = F^{−1}[[F P_1]^n],

where F is the discrete Fourier transform over GF(q)^l and F^{−1} is its inverse.

Lemma 12: The probability p_A({A : Au = 0}) depends on u only through the type t(u); that is, if w(t) = w(t′) then p_{A,t} = p_{A,t′}. Furthermore,

p_{A,t} = q^{−l} Σ_{k=0}^{l} (1 − qk/[(q−1)l])^{w(t)τ} C(l, k) (q−1)^k.

Proof: For u ≡ (u_1, ..., u_n), we define u* ≡ (u*_1, ..., u*_n) by u*_i ≡ 1 if u_i ≠ 0, and u*_i ≡ 0 if u_i = 0.
Similarly to the proof of [10, Lemma 1], we can prove that the two sets {A : Au = 0} and {A : Au* = 0} are in one-to-one correspondence. Then we have p_A({A : Au = 0}) = p_A({A : Au* = 0}); that is, p_A({A : Au = 0}) depends on u only through w(t). Since p_A({A : Au* = 0}) is equal to the probability that the random walk defined in Lemma 11, started from the zero vector, returns to the zero vector after w(u*)τ steps, we have

p_{A,t} = P_{w(t)τ}(0) = q^{−l} Σ_{k=0}^{l} (1 − qk/[(q−1)l])^{w(t)τ} C(l, k) (q−1)^k.

Next, we prove the following lemma.

Lemma 13: If the column weight τ is even, then

Im A = {u ∈ U^l : w(u) is even} if q = 2, and Im A = U^l if q > 2,

which implies

|U|^l / |Im A| = 2 if q = 2, and |U|^l / |Im A| = 1 if q > 2.

Proof: Let a_{i,j} be the (i, j) element of A. First, we assume that q = 2. Then it is sufficient to prove that w(Au) is even for every possible A and u ∈ U^n, because

Σ_{c : w(c) even} 1 − Σ_{c : w(c) odd} 1 = Σ_{w=0}^{l} C(l, w) [−1]^w = 0,

which implies that |Im A| = |U|^l / 2. Without loss of generality, we can assume that w(u) = w and u = (1, ..., 1, 0, ..., 0). Let a_i ≡ (a_{i,1}, ..., a_{i,w(u)}) be the i-th row of A restricted to the first w(u) coordinates. Since every column vector has an even weight, we have the fact that Σ_{i=1}^{l} w(a_i) is even. In addition, we have

Σ_{i : w(a_i) odd} w(a_i) = Σ_{i=1}^{l} w(a_i) − Σ_{i : w(a_i) even} w(a_i).

This implies that the number of odd-weight vectors a_i is even, because the right-hand side of the above equality is even. Since w(Au) equals the number of odd-weight vectors a_i, we have the fact that w(Au) is even for every A and u ∈ U^n.

Next, we assume that q > 2. It is sufficient to prove that, for every c = (c_1, ..., c_l) ∈ U^l, there are a matrix A generated by the scheme and a vector u ∈ U^n such that Au = c. This fact implies that Im A = U^l. Let u = (1, ..., 1).
It is possible to generate A satisfying

a_{i,j} = 2a_i if i = j, and a_{i,j} = 0 if i ≠ j,

where each a_i ∈ GF(q) is arbitrary. Since q > 2, we have Au = c by letting a_i ≡ c_i / 2.

Finally, we prove that (α_A, β_A) satisfies (H2) and (H3). We define the function h as

h(θ) ≡ −θ log_e(θ) − [1 − θ] log_e(1 − θ),

where e is the base of the natural logarithm. We use the following lemmas to derive the asymptotic behavior of (α_A, β_A).

Lemma 14: Let a be a real number. Then

max_{0≤θ≤1} [h(θ) + aθ] ≤ log_e(1 + e^a).

If a ≤ −log_e(l − 1), then

max_{1/l≤θ≤1} [h(θ) + aθ] ≤ h(1/l) + a/l.

Lemma 15: l h(1/l) ≤ 1 + log_e l.

Lemma 16:

Σ_{k=1}^{l−1} (1 − 2k/l)^{wτ} C(l, k) ≤ 2 Σ_{k=1}^{⌊l/2⌋} exp(−2kwτ/l) C(l, k).

Proof: Since (1 − 2k/l)^{wτ} C(l, k) = (1 − 2[l−k]/l)^{wτ} C(l, l−k), we have

Σ_{k=1}^{l−1} (1 − 2k/l)^{wτ} C(l, k) ≤ 2 Σ_{k=1}^{⌊l/2⌋} (1 − 2k/l)^{wτ} C(l, k) ≤ 2 Σ_{k=1}^{⌊l/2⌋} exp(−2kwτ/l) C(l, k),

where the last inequality comes from the fact that 0 ≤ 2k/l ≤ 1 for k ≤ ⌊l/2⌋.

Lemma 17:

Σ_{k=1}^{l} (1 − qk/[(q−1)l])^{wτ} C(l, k) (q−1)^k
  ≤ Σ_{k=1}^{⌊(q−1)l/q⌋} exp(−qkwτ/[(q−1)l]) C(l, k) (q−1)^k + Σ_{k=⌈(q−1)l/q⌉}^{l} C(l, k) (q−1)^{k−wτ}.   (34)

Proof: We can show the lemma from the facts that 0 ≤ qk/[(q−1)l] ≤ 1 when k ≤ (q−1)l/q, and

|1 − qk/[(q−1)l]| = [qk − ql + l]/[(q−1)l] ≤ l/[(q−1)l] = [q−1]^{−1}

when (q−1)l/q < k ≤ l.

Let τ be the parameter given in the procedure used for generating a sparse matrix. We assume that τ and ξ satisfy

τ ≡ 2[log_e l]² / R   (35)
h(ξR)/R + ξ log_e(q−1) < 1/3.   (36)

Then we have

ξτ ≥ 3 log_e l   (37)

for all sufficiently large l. Now we are in a position to prove the following two lemmas, which provide the proof of Theorem 2.

Lemma 18: lim_{n→∞} α_A(n) = 1.

Proof: In the following, we first show that

lim_{l→∞} Σ_{k=1}^{⌊(q−1)l/q⌋} exp(−qkwτ/[(q−1)l]) C(l, k) (q−1)^k = 0   (38)

for all q ≥ 2 and w > ξl.
By assuming w > ξl, we have

Σ_{k=1}^{⌊(q−1)l/q⌋} exp(−qkwτ/[(q−1)l]) C(l, k) (q−1)^k
  ≤ l max_{1/l≤θ≤1} exp(−wτθ) exp(l h(θ)) (q−1)^{lθ}
  ≤ l max_{1/l≤θ≤1} exp(−ξlτθ + l h(θ) + lθ log_e(q−1))
  ≤ l max_{1/l≤θ≤1} exp(l [h(θ) + [log_e(q−1) − ξτ] θ])
  ≤ l exp(l [h(1/l) + [log_e(q−1) − ξτ]/l])
  ≤ exp(1 + log_e l + log_e(q−1) − ξτ + log_e l)
  ≤ exp(−ξτ + 2 log_e l + log_e[(q−1)e]),

where the fourth inequality comes from (37) and Lemma 14, and the fifth inequality comes from Lemma 15. Hence we have (38) for all q ≥ 2 and w > ξl.

Next, we show the lemma by assuming that q = 2. From Lemma 16, (38), and the fact that wτ is even, we have

lim_{l→∞} max_{w>ξl} Σ_{k=1}^{l−1} (1 − 2k/l)^{wτ} C(l, k)
  ≤ 2 lim_{l→∞} max_{w>ξl} Σ_{k=1}^{⌊l/2⌋} exp(−2kwτ/l) C(l, k) = 0.

From (31) and Lemma 13, we have

lim_{n→∞} α_A(n) = lim_{n→∞} (1/2) max_{w>ξl} Σ_{k=0}^{l} (1 − 2k/l)^{wτ} C(l, k)
  = 1 + (1/2) lim_{n→∞} max_{w>ξl} Σ_{k=1}^{l−1} (1 − 2k/l)^{wτ} C(l, k) = 1.

Finally, we show the lemma by assuming that q > 2. From (38), the first term on the right-hand side of (34) vanishes by letting l → ∞. Since ⌈(q−1)l/q⌉ ≥ l/2, the second term on the right-hand side of (34) is evaluated by

Σ_{k=⌈(q−1)l/q⌉}^{l} C(l, k) (q−1)^{k−wτ}
  ≤ l C(l, ⌈(q−1)l/q⌉) (q−1)^{l−wτ}
  ≤ l exp(l h((q−1)/q)) (q−1)^{l−wτ}
  ≤ l exp(l log_e(eq) − wτ log_e(q−1))
  < exp(−l [ξτ log_e(q−1) − log_e(eq) − log_e l]),

where the third inequality comes from h(θ) ≤ 1. From q > 2 and (37), the second term on the right-hand side of (34) vanishes by letting l → ∞. From the above two observations and the fact that wτ is even, we have

lim_{l→∞} max_{w>ξl} Σ_{k=1}^{l} (1 − qk/[(q−1)l])^{wτ} C(l, k) (q−1)^k
  = lim_{l→∞} max_{w>ξl} Σ_{k=1}^{l} |1 − qk/[(q−1)l]|^{wτ} C(l, k) (q−1)^k = 0.
From (31) and Lemma 13, we have

lim_{n→∞} α_A(n) = lim_{n→∞} max_{w>ξl} Σ_{k=0}^{l} (1 − qk/[(q−1)l])^{wτ} C(l, k) (q−1)^k
  = 1 + lim_{n→∞} max_{w>ξl} Σ_{k=1}^{l} (1 − qk/[(q−1)l])^{wτ} C(l, k) (q−1)^k = 1.

Lemma 19: lim_{n→∞} β_A(n) = 0.

Proof: Let C_w ≡ {x : w(x) = w}. Then we have

|C_w| = C(n, w) (q−1)^w   (39)
     ≤ exp(n h(w/n) + w log_e(q−1)).   (40)

In the following, we first show that

lim_{l→∞} Σ_{w=1}^{ξl} [|C_w| / q^l] Σ_{k=0}^{l} exp(−qkwτ/[(q−1)l]) C(l, k) (q−1)^k = 0.   (41)

We have

Σ_{w=1}^{ξl} [|C_w| / q^l] Σ_{k=0}^{l} exp(−qkwτ/[(q−1)l]) C(l, k) (q−1)^k
  = Σ_{w=1}^{ξl} [|C_w| / q^l] (1 + [q−1] exp(−qwτ/[(q−1)l]))^l
  = Σ_{w=1}^{⌊l/(2τ)⌋} [|C_w| / q^l] (1 + [q−1] exp(−qwτ/[(q−1)l]))^l
    + Σ_{w=⌈l/(2τ)⌉}^{ξl} [|C_w| / q^l] (1 + [q−1] exp(−qwτ/[(q−1)l]))^l.   (42)

The first term on the right-hand side of (42) is evaluated by

Σ_{w=1}^{⌊l/(2τ)⌋} [|C_w| / q^l] (1 + [q−1] exp(−qwτ/[(q−1)l]))^l
  ≤ Σ_{w=1}^{⌊l/(2τ)⌋} [|C_w| / q^l] (1 + [q−1][1 − qwτ/(2[q−1]l)])^l
  = Σ_{w=1}^{⌊l/(2τ)⌋} C(n, w) (q−1)^w (1 − wτ/(2l))^l
  ≤ Σ_{w=1}^{⌊l/(2τ)⌋} C(n, w) q^w exp(−wτ/2)
  ≤ Σ_{w=1}^{⌊l/(2τ)⌋} n q exp(−τ/2)
  ≤ [l/(2τ)] n q exp(−τ/2)
  = [q/(2τ)] exp(log_e(l²/R) − τ/2)
  ≤ qR / [4 (log_e l)²].   (43)

The first inequality comes from the fact that exp(−x) ≤ 1 − x/2 for 0 ≤ x ≤ 1. The first equality comes from (39). The second inequality comes from the fact that [1 + x]^l ≤ exp(lx). The third inequality comes from the fact that C(n, w) q^w exp(−wτ/2) is a non-increasing function of w. The last inequality comes from (35). From (43), the first term on the right-hand side of (42) vanishes by letting l → ∞.
The second term on the right-hand side of (42) is evaluated by

Σ_{w=⌈l/(2τ)⌉}^{ξl} [|C_w| / q^l] (1 + [q−1] exp(−qwτ/[(q−1)l]))^l
  ≤ Σ_{w=⌈l/(2τ)⌉}^{ξl} [|C_w| / q^l] (1 + [q−1] exp(−q/(2[q−1])))^l
  ≤ Σ_{w=⌈l/(2τ)⌉}^{ξl} |C_w| exp(−l/3)
  ≤ ξl exp(l [h(ξR)/R + ξ log_e(q−1) − 1/3]),   (44)

where the second inequality comes from the fact that

[1 + [q−1] exp(−q/(2[q−1]))] / q ≤ e^{−1/3}

and the third inequality comes from (40). From (36) and (44), the second term on the right-hand side of (42) vanishes by letting l → ∞. From the above two observations, we have (41).

Next, we show the lemma by assuming that q = 2. From (32), the fact that wτ is even, and Lemma 16, we have

β_A(n) = Σ_{w=1}^{ξl} [|C_w| / 2^l] Σ_{k=0}^{l} (1 − 2k/l)^{wτ} C(l, k)
  ≤ 2 Σ_{w=1}^{ξl} [|C_w| / 2^l] Σ_{k=0}^{⌊l/2⌋} exp(−2kwτ/l) C(l, k)
  ≤ 2 Σ_{w=1}^{ξl} [|C_w| / 2^l] Σ_{k=0}^{l} exp(−2kwτ/l) C(l, k).

From (41), we have the lemma for q = 2.

Finally, we show the lemma by assuming that q > 2. From (32) and Lemmas 12 and 17, we have

β_A(n) = Σ_{w=1}^{ξl} [|C_w| / q^l] Σ_{k=0}^{l} (1 − qk/[(q−1)l])^{wτ} C(l, k) (q−1)^k
  ≤ Σ_{w=1}^{ξl} [|C_w| / q^l] Σ_{k=0}^{l} exp(−qkwτ/[(q−1)l]) C(l, k) (q−1)^k
    + Σ_{w=1}^{ξl} [|C_w| / q^l] Σ_{k=0}^{l} C(l, k) (q−1)^{k−wτ}
  = Σ_{w=1}^{ξl} [|C_w| / q^l] Σ_{k=0}^{l} exp(−qkwτ/[(q−1)l]) C(l, k) (q−1)^k
    + Σ_{w=1}^{ξl} |C_w| (q−1)^{−wτ}.   (45)

From (41), the first term on the right-hand side of (45) vanishes by letting l → ∞. From (40), the second term on the right-hand side of (45) is evaluated by

Σ_{w=1}^{ξl} |C_w| (q−1)^{−wτ}
  ≤ Σ_{w=1}^{ξl} exp(n h(w/n) + w [1−τ] log_e(q−1))
  ≤ ξl exp(n max_{1/n≤θ≤1} [h(θ) + [1−τ] log_e(q−1) θ])
  ≤ ξl exp(n [h(1/n) + [1−τ] log_e(q−1)/n])
  ≤ exp(1 + log_e n + [1−τ] log_e(q−1) + log_e(ξl)),

where the third inequality comes from Lemma 14 and the fact that [1−τ] log_e(q−1) < −log_e(n−1) for all sufficiently large n and q > 2, and the fourth inequality comes from Lemma 15.
From (35), we have

1 + log_e n + [1−τ] log_e(q−1) + log_e(ξl) → −∞

by letting n → ∞. Then the second term on the right-hand side of (45) vanishes by letting n → ∞. Hence we have the lemma for q > 2.

D. Proof of Theorem 3

We define the set T as

T ≡ {(x, y) : −(1/n) log μ_{X|Y}(x|y) ≤ H(X|Y) + γ,
             −(1/n) log μ_{Y|X}(y|x) ≤ H(Y|X) + γ,
             −(1/n) log μ_{XY}(x, y) ≤ H(XY) + γ}.

It should be noted that the above definition can be replaced by the one given in [33]. This implies that the theorem is valid for general correlated sources. Let (x, y) be the output of the correlated sources. We define the conditions

(SW1) (x, y) ∉ T
(SW2) there is x′ ≠ x such that x′ ∈ C_A(Ax) and μ_{XY}(x′, y) ≥ μ_{XY}(x, y)
(SW3) there is y′ ≠ y such that y′ ∈ C_B(By) and μ_{XY}(x, y′) ≥ μ_{XY}(x, y)
(SW4) there is (x′, y′) ≠ (x, y) such that x′ ∈ C_A(Ax), y′ ∈ C_B(By), and μ_{XY}(x′, y′) ≥ μ_{XY}(x, y).

Since a decoding error occurs only when at least one of the conditions (SW1)–(SW4) is satisfied, the error probability is upper bounded by

Error_{XY}(A, B) ≤ μ_{XY}(E_1) + μ_{XY}(E_1^c ∩ E_2) + μ_{XY}(E_1^c ∩ E_3) + μ_{XY}(E_1^c ∩ E_4),   (46)

where we define E_i ≡ {(x, y) : (SWi)}. First, we evaluate E_{AB}[μ_{XY}(E_1)]. From Lemma 26, we have

E_{AB}[μ_{XY}(E_1)] ≤ δ/4   (47)

for all sufficiently large n. Next, we evaluate E_{AB}[μ_{XY}(E_1^c ∩ E_2)] and E_{AB}[μ_{XY}(E_1^c ∩ E_3)]. Since

μ_{X|Y}(x′|y) ≥ μ_{X|Y}(x|y) ≥ 2^{−n[H(X|Y)+γ]}

when (SW1) does not hold and (SW2) holds, we have [G(y) \ {x}] ∩ C_A(Ax) ≠ ∅, where

G(y) ≡ {x : μ_{X|Y}(x|y) ≥ 2^{−n[H(X|Y)+γ]}}.
From Lemma 1, we have

E_{AB}[μ_{XY}(E_1^c ∩ E_2)]
  = Σ_{(x,y)∈T} μ_{XY}(x, y) p_{AB}({(A, B) : (SW2)})
  ≤ Σ_{(x,y)∈T} μ_{XY}(x, y) p_A({A : [G(y) \ {x}] ∩ C_A(Ax) ≠ ∅})
  ≤ Σ_{(x,y)∈T} μ_{XY}(x, y) [|G(y)| α_A / |Im A| + β_A]
  ≤ Σ_{(x,y)∈T} μ_{XY}(x, y) [2^{n[H(X|Y)+γ]} α_A / |Im A| + β_A]
  ≤ 2^{n[H(X|Y)+γ]} |X|^{−l_A} · |X|^{l_A} α_A / |Im A| + β_A
  ≤ δ/4   (48)

for all sufficiently large n by taking an appropriate γ > 0, where the last inequality comes from (14) and the (α_A, β_A)-hash property of (A, p_A). Similarly, we have

E_{AB}[μ_{XY}(E_1^c ∩ E_3)] ≤ 2^{n[H(Y|X)+γ]} |Y|^{−l_B} · |Y|^{l_B} α_B / |Im B| + β_B ≤ δ/4   (49)

for all sufficiently large n by taking an appropriate γ > 0, where the last inequality comes from (15) and the (α_B, β_B)-hash property of (B, p_B).

Next, we evaluate E_{AB}[μ_{XY}(E_1^c ∩ E_4)]. When (SW1) does not hold and (SW4) holds, we have [G \ {(x, y)}] ∩ [C_A(Ax) × C_B(By)] ≠ ∅, where

G ≡ {(x, y) : μ_{XY}(x, y) ≥ 2^{−n[H(XY)+γ]}}.

Applying Lemma 1 to the joint ensemble (A × B, p_{AB}) of the set of functions AB : X^n × Y^n → X^{l_A} × Y^{l_B}, we have

E_{AB}[μ_{XY}(E_1^c ∩ E_4)]
  = Σ_{(x,y)∈T} μ_{XY}(x, y) p_{AB}({(A, B) : (SW4)})
  ≤ Σ_{(x,y)∈T} μ_{XY}(x, y) p_{AB}({(A, B) : [G \ {(x, y)}] ∩ [C_A(Ax) × C_B(By)] ≠ ∅})
  ≤ Σ_{(x,y)∈T} μ_{XY}(x, y) [|G| α_{AB} / (|Im A||Im B|) + β′_{AB}]
  ≤ 2^{n[H(X,Y)+γ]} |X|^{−l_A} |Y|^{−l_B} · |X|^{l_A} |Y|^{l_B} α_{AB} / (|Im A||Im B|) + β′_{AB}
  ≤ δ/4   (50)

for all sufficiently large n by taking an appropriate γ > 0, where the last inequality comes from (16) and the (α_{AB}, β′_{AB})-hash property of (A × B, p_A × p_B).

Finally, from (46)–(50), for all δ > 0 and all sufficiently large n there are A and B such that

Error_{XY}(A, B) < δ.

E.
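The Slepian-Wolf decoder analyzed above searches the coset C_A(Ax) for the most likely source word given the side information. As a concrete miniature of this search (our own toy example, not from the paper): take A to be the parity-check matrix of the [7,4] Hamming code. For a binary pair with Y = X ⊕ noise over a BSC(p) with p < 1/2, maximizing μ_{X|Y}(x′|y) over the coset is the same as minimizing the Hamming distance to y, and every single-bit discrepancy between x and y is then corrected:

```python
from itertools import product

# parity-check matrix of the [7,4] Hamming code (columns are 1..7 in binary)
H = [(0, 0, 0, 1, 1, 1, 1),
     (0, 1, 1, 0, 0, 1, 1),
     (1, 0, 1, 0, 1, 0, 1)]

def syndrome(A, x):
    return tuple(sum(a * b for a, b in zip(row, x)) % 2 for row in A)

def ml_decode(A, s, y):
    # ML decoding with side information y from a BSC(p), p < 1/2:
    # maximizing mu_{X|Y}(x'|y) over the coset {x': Ax' = s} is the same
    # as minimizing the Hamming distance between x' and y
    best, best_d = None, None
    for x in product((0, 1), repeat=len(y)):
        if syndrome(A, x) != s:
            continue
        d = sum(a != b for a, b in zip(x, y))
        if best_d is None or d < best_d:
            best, best_d = x, d
    return best

x = (1, 0, 1, 1, 0, 0, 1)
s = syndrome(H, x)                  # the encoder sends only this 3-bit syndrome
for i in range(7):                  # any single-bit discrepancy in y is corrected
    y = tuple(b ^ (j == i) for j, b in enumerate(x))
    assert ml_decode(H, s, y) == x
```

This works because every nonzero codeword of the Hamming code has weight at least 3, so within the coset x is the unique word at distance 1 from y; the theorem shows that sparse matrices achieve the same effect at asymptotically optimal rates.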
Proof of Theorem 5

For β_A satisfying lim_{n→∞} β_A(n) = 0, let κ ≡ {κ(n)}_{n=1}^{∞} be a sequence satisfying

lim_{n→∞} κ(n) = ∞   (51)
lim_{n→∞} κ(n) β_A(n) = 0   (52)
lim_{n→∞} [log κ(n)]/n = 0.   (53)

For example, such a κ exists: for every n, let

κ(n) ≡ n^ξ if β_A(n) = o(n^{−ξ}), and κ(n) ≡ 1/√(β_A(n)) otherwise.

If β_A(n) is not o(n^{−ξ}), there is κ′ > 0 such that β_A(n) n^ξ > κ′ and

[log κ(n)]/n = [log(1/β_A(n))]/(2n) ≤ [log(n^ξ/κ′)]/(2n) = [ξ log n − log κ′]/(2n)

for all sufficiently large n. This implies that κ satisfies (53). In the following, κ denotes κ(n).

Let ε ≡ ε_B − ε_A. Then, from (53), there are γ and γ′ such that

0 < γ < √(2γ) log|Z| < ε   (54)
0 < γ′ ≤ 2ε   (55)
η_{W|Z}(γ′|γ) ≤ ε − [log κ]/(2n)   (56)
η_{X|ZW}(γ′|γ + 2ε) ≤ ε_Â − [log κ]/(2n)   (57)

for all sufficiently large n. It should be noted here that such γ and γ′ exist by assuming (18). Let z be the output of the channel side information, let x and y be an input and an output of the channel, respectively, and let m be a message. From (55), we have T_{W|Z,γ′}(z) ≠ ∅ for all z and sufficiently large n. Then we have

|T_{W|Z,2ε}(z)| ≥ |T_{W|Z,γ′}(z)|
  ≥ 2^{n[H(W|Z) − η_{W|Z}(γ′|γ)]}
  ≥ √κ 2^{n[H(W|Z) − ε_B + ε_A]}
  = √κ |W|^{l_A + l_B}
  ≥ √κ |Im A| |Im B|

for all z ∈ T_{Z,γ} and sufficiently large n, where the first inequality comes from (55), the second inequality comes from Lemma 27, the third inequality comes from (56), and the last inequality comes from the facts that Im A ⊂ W^{l_A} and Im B ⊂ W^{l_B}. This implies that for all z ∈ T_{Z,γ} there is T_{W|Z}(z) ⊂ T_{W|Z,2ε}(z) such that

√κ ≤ |T_{W|Z}(z)| / [|Im A| |Im B|] ≤ 2√κ   (58)

for all z ∈ T_{Z,γ} and sufficiently large n. We assume that T_{W|Z}(z) satisfies the assumption described in Lemma 5.
Similarly, from (57), we obtain T_{X|ZW}(z, w) ⊂ T_{X|ZW,2ε_Â}(z, w) such that

√κ ≤ |T_{X|ZW}(z, w)| / |Im Â| ≤ 2√κ   (59)

for all (z, w) ∈ T_{ZW,2ε} and sufficiently large n. We define the conditions

(GP1) z ∈ T_{Z,γ}
(GP2) g_{AB}(c, m|z) ∈ T_{W|Z}(z)
(GP3) g_Â(ĉ|z, g_{AB}(c, m|z)) ∈ T_{X|ZW}(z, g_{AB}(c, m|z))
(GP4) y ∈ T_{Y|XZ,γ}(g_Â(ĉ|z, g_{AB}(c, m|z)), z)
(GP5) g_A(c|y) = g_{AB}(c, m|z).

Under condition (GP5), we have

φ^{−1}(y) = B g_A(c|y) = B g_{AB}(c, m|z) = m,

which implies that the decoding succeeds. Then the error probability is upper bounded by

Error_{Y|XZ}(A, B, Â, c, ĉ)
  = Σ_{m,z,y} p_M(m) μ_Z(z) μ_{Y|XZ}(y | g_Â(ĉ|z, g_{AB}(c, m|z)), z) χ(g_A(c|y) ≠ g_{AB}(c, m|z))
  ≤ p_{MYZ}(S_1^c) + p_{MYZ}(S_1 ∩ S_2^c) + p_{MYZ}(S_1 ∩ S_2 ∩ S_3^c) + p_{MYZ}(S_4^c) + p_{MYZ}(S_1 ∩ S_2 ∩ S_3 ∩ S_4 ∩ S_5^c),   (60)

where S_i ≡ {(m, y, z) : (GPi)}. Let δ be an arbitrary positive number. First, we evaluate E_{ABÂCĈ}[p_{MYZ}(S_1^c)] and E_{ABÂCĈ}[p_{MYZ}(S_4^c)]. From Lemma 26, we have

E_{ABÂCĈ}[p_{MYZ}(S_1^c)] ≤ δ/5   (61)
E_{ABÂCĈ}[p_{MYZ}(S_4^c)] ≤ δ/5   (62)

for sufficiently large n. Next, we evaluate E_{ABC}[p_{MYZ}(S_1 ∩ S_2^c)] and E_{ÂĈ}[p_{MYZ}(S_1 ∩ S_2 ∩ S_3^c)]. From Lemma 5, we have

E_{ABC}[p_{MYZ}(S_1 ∩ S_2^c)]
  = Σ_{z∈T_{Z,γ}} μ_Z(z) p_{ABCM}({(A, B, c, m) : g_{AB}(c, m|z) ∉ T_{W|Z}(z)})
  ≤ Σ_{z∈T_{Z,γ}} μ_Z(z) [α_{AB} − 1 + |Im A||Im B| [β_{AB} + 1] / |T_{W|Z}(z)| + 2^{−nε} |W|^{l_A+l_B} / (|Im A||Im B|)]
  ≤ α_{AB} − 1 + [β_{AB} + 1]/√κ + 2^{−nε} |W|^{l_A+l_B} / (|Im A||Im B|)
  ≤ δ/5   (63)

for all sufficiently large n, where the second inequality comes from (58), and the last inequality comes from (51) and the properties of (α_{AB}, β_{AB}) and |W|^{l_A}/|Im A|.
Similarly, by using (59), we have

E_{ÂĈ}[p_{MYZ}(S_1 ∩ S_2 ∩ S_3^c)] ≤ α_Â − 1 + [β_Â + 1]/√κ + 2^{−nε_Â} |X|^{l_Â} / |Im Â| ≤ δ/5   (64)

for all sufficiently large n. Next, we evaluate E_{ABÂCĈ}[p_{MYZ}(S_1 ∩ S_2 ∩ S_3 ∩ S_4 ∩ S_5^c)]. In the following, we assume that

• z ∈ T_{Z,γ}
• w ∈ T_{W|Z}(z) ⊂ T_{W|Z,2ε}(z)
• x ∈ T_{X|ZW}(z, w) ⊂ T_{X|ZW,2ε_Â}(z, w)
• y ∈ T_{Y|XZ,γ}(x, z) = T_{Y|XZW,γ}(x, z, w)
• g_A(c|y) ≠ w,

where the relation T_{Y|XZ,γ}(x, z) = T_{Y|XZW,γ}(x, z, w) comes from (17). From Lemma 23, (18), and (54), we have (x, y, z, w) ∈ T_{XYZW,6ε_Â}. Then there is w′ ∈ C_A(c) such that w′ ≠ w and

μ_{W|Y}(w′|y) ≥ μ_{W|Y}(w|y) = μ_{WY}(w, y) / μ_Y(y)
  ≥ 2^{−n[H(W,Y)+ζ_{YW}(6ε_Â)]} / 2^{−n[H(Y)−ζ_Y(6ε_Â)]}
  ≥ 2^{−n[H(W|Y)+2ζ_{YW}(6ε_Â)]},

where the second inequality comes from Lemma 26. This implies that [G(y) \ {w}] ∩ C_A(c) ≠ ∅, where

G(y) ≡ {w′ : μ_{W|Y}(w′|y) ≥ 2^{−n[H(W|Y)+2ζ_{YW}(6ε_Â)]}}.
Then we have

E_{ABÂCĈ}[p_{MYZ}(S_1 ∩ S_2 ∩ S_3 ∩ S_4 ∩ S_5^c)]
  ≤ E_{ABÂCMĈ}[ Σ_{z∈T_{Z,γ}} μ_Z(z) Σ_{w∈T_{W|Z}(z)} χ(g_{AB}(c, m|z) = w) Σ_{x∈T_{X|ZW}(z,w)} χ(g_Â(ĉ|z, w) = x) Σ_{y∈T_{Y|XZ,γ}(x,z)} μ_{Y|XZ}(y|x, z) χ(g_A(c|y) ≠ w) ]
  ≤ E_{ABÂCMĈ}[ Σ_{z∈T_{Z,γ}} μ_Z(z) Σ_{w∈T_{W|Z}(z)} χ(Aw = c) χ(Bw = m) Σ_{x∈T_{X|ZW}(z,w)} χ(Âx = ĉ) Σ_{y∈T_{Y|XZ,γ}(x,z)} μ_{Y|XZ}(y|x, z) χ(g_A(c|y) ≠ w) ]
  = Σ_{z∈T_{Z,γ}} μ_Z(z) Σ_{w∈T_{W|Z}(z)} Σ_{x∈T_{X|ZW}(z,w)} Σ_{y∈T_{Y|XZ,γ}(x,z)} μ_{Y|XZ}(y|x, z)
    · E_{AC}[ χ(g_A(c|y) ≠ w) χ(Aw = c) E_{BÂMĈ}[ χ(Bw = m) χ(Âx = ĉ) ] ]
  ≤ [1 / (|Im B||Im Â|)] Σ_{z∈T_{Z,γ}} μ_Z(z) Σ_{w∈T_{W|Z}(z)} Σ_{x∈T_{X|ZW}(z,w)} Σ_{y∈T_{Y|XZ,γ}(x,z)} μ_{Y|XZ}(y|x, z)
    · p_{AC}({(A, c) : [G(y) \ {w}] ∩ C_A(c) ≠ ∅ and w ∈ C_A(c)})
  ≤ [1 / (|Im B||Im Â|)] Σ_{z∈T_{Z,γ}} μ_Z(z) Σ_{w∈T_{W|Z}(z)} Σ_{x∈T_{X|ZW}(z,w)} Σ_{y∈T_{Y|XZ,γ}(x,z)} μ_{Y|XZ}(y|x, z)
    · [2^{n[H(W|Y)+2ζ_{YW}(6ε_Â)]} α_A / |Im A|² + β_A / |Im A|]
  ≤ [2^{n[H(W|Y)+2ζ_{YW}(6ε_Â)]} α_A / |Im A| + β_A] Σ_{z∈T_{Z,γ}} μ_Z(z) Σ_{w∈T_{W|Z}(z)} [1 / (|Im A||Im B|)] Σ_{x∈T_{X|ZW}(z,w)} [1 / |Im Â|]
  ≤ 4κ |W|^{l_A} 2^{−n[ε_A − 2ζ_{YW}(6ε_Â)]} α_A / |Im A| + 4κ β_A
  ≤ δ/5,   (65)

where the third inequality comes from (9), the fourth inequality comes from Lemma 3 and the fact that |G(y)| ≤ 2^{n[H(W|Y)+2ζ_{YW}(6ε_Â)]}, the sixth inequality comes from (58) and (59), and the last inequality comes from (19), (52), and the properties of (α_A, β_A) and |W|^{l_A}/|Im A|.

Finally, from (60)–(65), we have the fact that for all δ > 0 and sufficiently large n there are A ∈ A, B ∈ B, Â ∈ Â, c ∈ Im A, and ĉ such that

Error_{Y|XZ}(A, B, Â, c, ĉ) ≤ δ.

F.
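The construction above relies on the nested coset structure C_{AB}(c, m) = {w : Aw = c, Bw = m}: the bins {C_{AB}(c, m)}_m partition C_A(c), and when the stacked matrix [A; B] has full row rank every bin has the same size |W|^{n − l_A − l_B}. The following exhaustive check over GF(2) uses tiny matrices of our own choosing, purely for illustration:

```python
from itertools import product

A = [(1, 0, 1, 0), (0, 1, 1, 1)]   # l_A x n
B = [(1, 1, 0, 0)]                 # l_B x n ; the stacked [A; B] has rank 3

def image(M, w):
    return tuple(sum(a * b for a, b in zip(row, w)) % 2 for row in M)

words = list(product((0, 1), repeat=4))
for c in product((0, 1), repeat=2):
    coset = [w for w in words if image(A, w) == c]
    bins = {}
    for w in coset:
        bins.setdefault(image(B, w), []).append(w)
    # the bins indexed by m partition C_A(c) into equal-size cells
    assert sorted(sum(bins.values(), [])) == sorted(coset)
    assert all(len(v) == 2 ** (4 - 2 - 1) for v in bins.values())
```

In the Gel'fand-Pinsker scheme the encoder searches C_{AB}(c, m) for a typical w, and the decoder recovers m as B g_A(c|y); the equal-size partition is what makes the message rate l_B log|W|/n.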
Proof of Theorem 7

We define the conditions

(WZ1) (x, z) ∈ T_{XZ,γ}
(WZ2) g_A(c|x) ∈ T_{Y|XZ,2ε_A}(x, z)
(WZ3) g_{AB}(c, B g_A(c|x)|z) = g_A(c|x)

and assume that γ > 0 satisfies

γ + √(2γ) log(|X||Z|) < ε_A.   (66)

We prove the following lemma.

Lemma 20: For any (x, z) satisfying (WZ1),

p_{ABC}({(A, B, c) : (WZ2) and not (WZ3)}) ≤ 2^{−n[ε_B − ε_A − 2ζ_{YZ}(3ε_A)]} |Y|^{l_A+l_B} α_B / (|Im A||Im B|) + β_B.

Proof: If (x, z, A, B, c) satisfies (WZ1) and (WZ2) but not (WZ3), there is y′ ∈ C_{AB}(c, B g_A(c|x)) such that y′ ≠ g_A(c|x). Then, from (66) and Lemmas 23 and 25, we have (x, g_A(c|x), z) ∈ T_{XYZ,3ε_A} and

μ_{Y|Z}(y′|z) ≥ μ_{Y|Z}(g_A(c|x)|z) = μ_{YZ}(g_A(c|x), z) / μ_Z(z)
  ≥ 2^{−n[H(Y,Z)+ζ_{YZ}(3ε_A)]} / 2^{−n[H(Z)−ζ_Z(3ε_A)]}
  ≥ 2^{−n[H(Y|Z)+2ζ_{YZ}(3ε_A)]}.

This implies that [G \ {g_A(c|x)}] ∩ C_{AB}(c, B g_A(c|x)) ≠ ∅, where

G ≡ {y′ : μ_{Y|Z}(y′|z) ≥ 2^{−n[H(Y|Z)+2ζ_{YZ}(3ε_A)]}}.

Let y_{A,c} ≡ g_A(c|x). From Lemma 4, we have

p_{ABC}({(A, B, c) : (WZ2) and not (WZ3)})
  ≤ p_{ABC}({(A, B, c) : [G \ {y_{A,c}}] ∩ C_{AB}(c, B y_{A,c}) ≠ ∅})
  ≤ |G| α_B / (|Im A||Im B|) + β_B
  ≤ 2^{n[H(Y|Z)+2ζ_{YZ}(3ε_A)]} α_B / (|Im A||Im B|) + β_B
  = 2^{−n[ε_B − ε_A − 2ζ_{YZ}(3ε_A)]} |Y|^{l_A+l_B} α_B / (|Im A||Im B|) + β_B,

where the second inequality comes from Lemma 4 and the third inequality comes from the fact that |G| ≤ 2^{n[H(Y|Z)+2ζ_{YZ}(3ε_A)]}.

Proof of Theorem 7: Let Error_{XZ}(A, B, c) be defined as

Error_{XZ}(A, B, c) ≡ μ_{XZ}({(x, z) : (x, φ^{−1}(φ(x), z), z) ∉ T_{XYZ,3ε_A}}).

Since

(x, φ^{−1}(φ(x), z), z) = (x, g_{AB}(c, B g_A(c|x)|z), z) = (x, g_A(c|x), z) ∈ T_{XYZ,3ε_A}
under the conditions (WZ1)–(WZ3), we have

Error_{XZ}(A, B, c) ≤ μ_{XZ}(S_1^c) + μ_{XZ}(S_1 ∩ S_2^c) + μ_{XZ}(S_1 ∩ S_2 ∩ S_3^c),   (67)

where S_i ≡ {(x, z) : (WZi)}. Let δ > 0 be an arbitrary positive number. First, we evaluate E_{ABC}[μ_{XZ}(S_1^c)]. From Lemma 26, we have

E_{ABC}[μ_{XZ}(S_1^c)] ≤ δ/3   (68)

for all sufficiently large n. Next, we evaluate E_{ABC}[μ_{XZ}(S_1 ∩ S_2^c)]. From (20), we have

μ_{XYZ}(x, y, z) = μ_{XZ}(x, z) μ_{Y|X}(y|x) = μ_{XZ}(x, z) μ_{XY}(x, y) / μ_X(x) = μ_{XY}(x, y) μ_{Z|X}(z|x)

and

argmax_{y′∈C_B(c)} μ_{XY}(x, y′) = argmax_{y′∈C_B(c)} μ_{XY}(x, y′) μ_{Z|X}(z|x) = argmax_{y′∈C_B(c)} μ_{XYZ}(x, y′, z).

This implies that ML coding using μ_{XY} is equivalent to that using μ_{XYZ}. Since γ > 0 satisfies (66), there is γ′ > 0 such that

η_{Y|XZ}(γ′|γ) ≤ ε_A − γ

for all sufficiently large n. We have T_{Y|XZ,γ′}(x, z) ≠ ∅ for all (x, z) ∈ T_{XZ,γ} and sufficiently large n. Then, from Lemma 27, we have

|T_{Y|XZ,2ε_A}(x, z)| ≥ |T_{Y|XZ,γ′}(x, z)| ≥ 2^{n[H(Y|XZ) − η_{Y|XZ}(γ′|γ)]} ≥ 2^{n[H(Y|XZ) − ε_A + γ]} ≥ 2^{nγ} |Y|^{l_A} ≥ 2^{nγ} |Im A|

for all (x, z) ∈ T_{XZ,γ} and sufficiently large n. This implies that there is T(x, z) ⊂ T_{Y|XZ,2ε_A}(x, z) such that

|T(x, z)| ≥ 2^{nγ} |Im A|.   (69)

We assume that T(x, z) satisfies the assumption described in Lemma 5.
Then, from Lemma 5, we have

E_{ABC}[μ_{XZ}(S_1 ∩ S_2^c)]
  = Σ_{(x,z)∈T_{XZ,γ}} μ_{XZ}(x, z) p_{AC}({(A, c) : g_A(c|x) ∉ T_{Y|XZ,2ε_A}(x, z)})
  ≤ Σ_{(x,z)∈T_{XZ,γ}} μ_{XZ}(x, z) [α_A − 1 + |Im A| [β_A + 1] / |T(x, z)| + 2^{−nε_A} |Y|^{l_A} / |Im A|]
  ≤ Σ_{(x,z)∈T_{XZ,γ}} μ_{XZ}(x, z) [α_A − 1 + 2^{−nγ} [β_A + 1] + 2^{−nε_A} |Y|^{l_A} / |Im A|]
  ≤ α_A − 1 + 2^{−nγ} [β_A + 1] + 2^{−nε_A} |Y|^{l_A} / |Im A|
  ≤ δ/3   (70)

for all sufficiently large n, where the second inequality comes from (69), and the last inequality comes from the properties of (α_A, β_A) and |Y|^{l_A}/|Im A|.

Finally, we evaluate E_{ABC}[μ_{XZ}(S_1 ∩ S_2 ∩ S_3^c)]. From Lemma 20, we have

E_{ABC}[μ_{XZ}(S_1 ∩ S_2 ∩ S_3^c)]
  ≤ Σ_{(x,z)∈T_{XZ,γ}} μ_{XZ}(x, z) p_{ABC}({(A, B, c) : (WZ2) and not (WZ3)})
  ≤ 2^{−n[ε_B − ε_A − 2ζ_{YZ}(3ε_A)]} |Y|^{l_A+l_B} α_B / (|Im A||Im B|) + β_B
  ≤ δ/3   (71)

for all sufficiently large n, where the last inequality comes from (21) and the properties of (α_B, β_B), |Y|^{l_A}/|Im A|, and |Y|^{l_B}/|Im B|.

From (67)–(71), we have the fact that for any δ > 0 and all sufficiently large n there are A, B, and c such that

Error_{XZ}(A, B, c) ≤ δ.

From Lemma 24, we have

ρ_n(x, f_n(y, z))/n = Σ_{(x,y,z)∈X×Y×Z} ν_{xyz}(x, y, z) ρ(x, f(y, z))
  ≤ Σ_{(x,y,z)∈X×Y×Z} [μ_{XYZ}(x, y, z) + √(6ε_A)] ρ(x, f(y, z))
  ≤ Σ_{(x,y,z)∈X×Y×Z} μ_{XYZ}(x, y, z) ρ(x, f(y, z)) + |X||Y||Z| ρ_max √(6ε_A)
  = E_{XYZ}[ρ(X, f(Y, Z))] + |X||Y||Z| ρ_max √(6ε_A)

for (x, y, z) ∈ T_{XYZ,3ε_A}. Then we have

E_{XZ}[ρ_n(X^n, f_n(φ^{−1}(φ(X^n), Z^n), Z^n))/n]
  ≤ E_{XYZ}[ρ(X, f(Y, Z))] + |X||Y||Z| ρ_max √(6ε_A) + δ ρ_max
  ≤ E_{XYZ}[ρ(X, f(Y, Z))] + 3 |X||Y||Z| ρ_max √(ε_A)

for all sufficiently large n by letting δ ≤ [3 − √6] |X||Y||Z| √(ε_A).

G.
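The final distortion estimate above only uses the per-letter bound |ν_{xyz}(x, y, z) − μ_{XYZ}(x, y, z)| ≤ √(6ε_A): replacing the empirical type by the true distribution changes an expected distortion by at most |X||Y||Z| ρ_max √(6ε_A). A short numeric check of this counting argument, with toy distributions of our own choosing:

```python
mu  = {('a', 0): 0.30, ('a', 1): 0.20, ('b', 0): 0.10, ('b', 1): 0.40}
nu  = {('a', 0): 0.32, ('a', 1): 0.18, ('b', 0): 0.12, ('b', 1): 0.38}
rho = {('a', 0): 0.0,  ('a', 1): 1.0,  ('b', 0): 2.0,  ('b', 1): 0.5}

d       = max(abs(nu[k] - mu[k]) for k in mu)   # per-letter deviation
rho_max = max(rho.values())
e_mu    = sum(mu[k] * rho[k] for k in mu)       # distortion under mu
e_nu    = sum(nu[k] * rho[k] for k in mu)       # distortion under nu

# swapping distributions moves the expectation by at most |support| * rho_max * d
assert abs(e_nu - e_mu) <= len(mu) * rho_max * d + 1e-12
```

The same triangle-inequality bookkeeping, with d = √(6ε_A), yields the |X||Y||Z| ρ_max √(6ε_A) slack in the theorem.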
Proof of Theorem 8

We define the conditions

(OHO1) (x, y) ∈ T_{XY,γ}
(OHO2) g_A(c|y) ∈ T_{Z|XY,2ε_A}(x, y)
(OHO3) g_{AB}(c, B g_A(c|y)) = g_A(c|y)
(OHO4) g_B̂(B̂x | g_{AB}(c, B g_A(c|y))) = x

and assume that γ > 0 satisfies

γ + √(2γ) log(|X||Y|) < ε_A.   (72)

We prove the following lemma.

Lemma 21: For any (x, y) satisfying (OHO1),

p_{ABB̂C}({(A, B, B̂, c) : (OHO2), (OHO3), and not (OHO4)}) ≤ 2^{−n[ε_B̂ − 2ζ_{XZ}(3ε_A)]} |X|^{l_B̂} α_B̂ / |Im B̂| + β_B̂.

Proof: We define

x_{A,B,B̂,c} ≡ g_B̂(B̂x | g_{AB}(c, B g_A(c|y)))
z_{A,B,c} ≡ g_{AB}(c, B g_A(c|y)).

Assume that the conditions (OHO1)–(OHO3) are satisfied but (OHO4) is not. From Lemma 23 and (72), we have (x, y, g_A(c|y)) ∈ T_{XYZ,3ε_A}, and there is x′ ∈ C_B̂(B̂x) such that x′ ≠ g_B̂(B̂x | g_{AB}(c, B g_A(c|y))). From Lemma 26, we have

μ_{X|Z}(x′ | z_{A,B,c}) ≥ μ_{X|Z}(x_{A,B,B̂,c} | z_{A,B,c}) = μ_{XZ}(x_{A,B,B̂,c}, z_{A,B,c}) / μ_Z(z_{A,B,c})
  ≥ 2^{−n[H(XZ)+ζ_{XZ}(3ε_A)]} / 2^{−n[H(Z)−ζ_Z(3ε_A)]}
  ≥ 2^{−n[H(X|Z)+2ζ_{XZ}(3ε_A)]}.

This implies that [G(z_{A,B,c}) \ {x_{A,B,B̂,c}}] ∩ C_B̂(B̂x) ≠ ∅, where

G(z) ≡ {x′ : μ_{X|Z}(x′|z) ≥ 2^{−n[H(X|Z)+2ζ_{XZ}(3ε_A)]}}.
From Lemma 1, we have

  p_{AB\hat{B}C}(\{(A, B, \hat{B}, c) : (OHO2), (OHO3), not (OHO4)\})
    \le p_{AB\hat{B}C}(\{(A, B, \hat{B}, c) : [G(z_{A,B,c}) \setminus \{x\}] \cap C_{\hat{B}}(\hat{B}x) \ne \emptyset\})
    = \sum_{A,B,c} p_{ABC}(A, B, c)\, p_{\hat{B}}(\{\hat{B} : [G(z_{A,B,c}) \setminus \{x\}] \cap C_{\hat{B}}(\hat{B}x) \ne \emptyset\})
    \le \sum_{A,B,c} p_{ABC}(A, B, c) \left[ \frac{|G(z_{A,B,c}) \setminus \{x\}|\,\alpha_{\hat{B}}}{|\mathrm{Im}\,\hat{B}|} + \beta_{\hat{B}} \right]
    \le \frac{2^{n[H(X|Z) + 2\zeta_{XZ}(3\varepsilon_A)]}\,\alpha_{\hat{B}}}{|\mathrm{Im}\,\hat{B}|} + \beta_{\hat{B}}
    = \frac{2^{-n[\varepsilon_{\hat{B}} - 2\zeta_{XZ}(3\varepsilon_A)]}\,|X|^{l_{\hat{B}}}\,\alpha_{\hat{B}}}{|\mathrm{Im}\,\hat{B}|} + \beta_{\hat{B}},

where the last inequality comes from the fact that |G(z_{A,B,c})| \le 2^{n[H(X|Z) + 2\zeta_{XZ}(3\varepsilon_A)]} for all A, B, and c.

Proof of Theorem 8: Under the conditions (OHO1)–(OHO4), we have

  \varphi^{-1}(\varphi_X(x), \varphi_Y(y)) = g_{\hat{B}}(\hat{B}x \mid g_{AB}(c, B g_A(c|y))) = x.

Then the decoding error probability is upper bounded by

  \mathrm{Error}_{XY}(A, B, \hat{B}, c) \le \mu_{XY}(S_1^c) + \mu_{XY}(S_2^c) + \mu_{XY}(S_1 \cap S_2 \cap S_3^c) + \mu_{XY}(S_1 \cap S_2 \cap S_3 \cap S_4^c),   (73)

where we define S_i \equiv \{(x, y) : (OHOi)\}.

From (22), we have

  \mu_{XYZ}(x, y, z) = \mu_{XY}(x, y)\,\mu_{Z|Y}(z|y) = \mu_{XY}(x, y)\,\frac{\mu_{YZ}(y, z)}{\mu_Y(y)} = \mu_{X|Y}(x|y)\,\mu_{YZ}(y, z)

and

  \arg\max_{z'\in C_A(c)} \mu_{Z|Y}(z'|y)
    = \arg\max_{z'\in C_A(c)} \mu_{YZ}(y, z')
    = \arg\max_{z'\in C_A(c)} \mu_{YZ}(y, z')\,\mu_{X|Y}(x|y)
    = \arg\max_{z'\in C_A(c)} \mu_{XYZ}(x, y, z')
    = \arg\max_{z'\in C_A(c)} \mu_{Z|XY}(z'|x, y).

This implies that ML coding using \mu_{Z|Y} is equivalent to that using \mu_{Z|XY}.
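This equivalence is easy to check numerically. The following sketch is only a toy illustration (the alphabet sizes, the randomly generated distribution, and the stand-in codeword list C are all invented for the demo): it builds a joint distribution that factors as \mu_{XYZ}(x,y,z) = \mu_{X|Y}(x|y)\,\mu_{YZ}(y,z), as in (22), and confirms that maximizing \mu_{Z|Y}(\cdot|y) over the candidate set picks the same element as maximizing \mu_{Z|XY}(\cdot|x,y).

```python
import random

random.seed(0)
X, Y, Z = range(2), range(3), range(3)

# Random mu_{YZ} and mu_{X|Y}; impose the factorization of (22):
# mu_{XYZ}(x, y, z) = mu_{X|Y}(x|y) * mu_{YZ}(y, z).
w = {(y, z): random.random() for y in Y for z in Z}
total = sum(w.values())
mu_yz = {yz: p / total for yz, p in w.items()}

mu_x_given_y = {}
for y in Y:
    u = [random.random() for _ in X]
    for x in X:
        mu_x_given_y[(x, y)] = u[x] / sum(u)

mu_xyz = {(x, y, z): mu_x_given_y[(x, y)] * mu_yz[(y, z)]
          for x in X for y in Y for z in Z}

def mu_z_given_y(z, y):
    # mu_{Z|Y}(z|y) = mu_{YZ}(y, z) / mu_Y(y)
    return mu_yz[(y, z)] / sum(mu_yz[(y, zz)] for zz in Z)

def mu_z_given_xy(z, x, y):
    # mu_{Z|XY}(z|x, y) = mu_{XYZ}(x, y, z) / mu_{XY}(x, y)
    return mu_xyz[(x, y, z)] / sum(mu_xyz[(x, y, zz)] for zz in Z)

C = [0, 2]  # stand-in for the codeword list C_A(c)
for x in X:
    for y in Y:
        assert (max(C, key=lambda z: mu_z_given_y(z, y))
                == max(C, key=lambda z: mu_z_given_xy(z, x, y)))
print("ML over mu_{Z|Y} and over mu_{Z|XY} pick the same codeword")
```

Under this factorization \mu_{Z|XY}(z|x,y) = \mu_{YZ}(y,z)/\mu_Y(y) = \mu_{Z|Y}(z|y), so the two rules agree not merely in their argmax but value by value; the decoder therefore needs no knowledge of x.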
By applying a similar argument to that in the proof of Theorem 7, we have

  E_{AB\hat{B}C}[\mu_{XY}(S_1^c)] \le \delta/4   (74)

  E_{AB\hat{B}C}[\mu_{XY}(S_2^c)] \le \alpha_A - 1 + 2^{-n\gamma}[\beta_A + 1] + \frac{2^{-n\varepsilon_A}\,|Z|^{l_A}}{|\mathrm{Im}\,A|} \le \delta/4   (75)

  E_{AB\hat{B}C}[\mu_{XY}(S_1 \cap S_2 \cap S_3^c)] \le \frac{2^{-n[\varepsilon_B - \varepsilon_A - \zeta_Z(3\varepsilon_A)]}\,|Z|^{l_A + l_B}\,\alpha_{AB}}{|\mathrm{Im}\,A|\,|\mathrm{Im}\,B|} + \beta_{AB} \le \delta/4   (76)

for all sufficiently large n by assuming (23) and (72). Furthermore, from Lemma 21, we have

  E_{AB\hat{B}C}[\mu_{XY}(S_1 \cap S_2 \cap S_3 \cap S_4^c)]
    \le \sum_{(x,y)\in T_{XY,\gamma}} \mu_{XY}(x, y)\, p_{AB\hat{B}C}(\{(A, B, \hat{B}, c) : (OHO2), (OHO3), not (OHO4)\})
    \le \sum_{(x,y)\in T_{XY,\gamma}} \mu_{XY}(x, y) \left[ \frac{2^{-n[\varepsilon_{\hat{B}} - 2\zeta_{XZ}(3\varepsilon_A)]}\,|X|^{l_{\hat{B}}}\,\alpha_{\hat{B}}}{|\mathrm{Im}\,\hat{B}|} + \beta_{\hat{B}} \right]
    \le \frac{2^{-n[\varepsilon_{\hat{B}} - 2\zeta_{XZ}(3\varepsilon_A)]}\,|X|^{l_{\hat{B}}}\,\alpha_{\hat{B}}}{|\mathrm{Im}\,\hat{B}|} + \beta_{\hat{B}}
    \le \delta/4   (77)

for all sufficiently large n, where the last inequality comes from (24) and the properties of (\alpha_{\hat{B}}, \beta_{\hat{B}}) and |X|^{l_{\hat{B}}}/|\mathrm{Im}\,\hat{B}|.

From (73)–(77), we have the fact that for any \delta > 0 and all sufficiently large n there are A, B, \hat{B}, and c such that

  \mathrm{Error}_{XY}(A, B, \hat{B}, c) \le \delta.

VII. CONCLUSION

In this paper we introduced the notion of the hash property of an ensemble of functions and proved that an ensemble of q-ary sparse matrices satisfies the hash property. Based on this property, we proved the achievability of coding theorems for the Slepian-Wolf problem, the Gel'fand-Pinsker problem, the Wyner-Ziv problem, and the One-helps-one problem. This implies that the rate of codes using sparse matrices combined with ML coding can achieve the optimal rate. We believe that the hash property is essential for coding problems and that our theory can also be applied to other ensembles of functions suitable for efficient coding algorithms. In other words, it is enough to prove the hash property of a new ensemble to obtain several coding theorems.
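The collision bound that drives these proofs can be made concrete with the simplest ensemble related to the hash property: uniformly random binary matrices, the classical universal hash family of Carter and Wegman [7], for which the pairwise collision probability is exactly 1/|\mathrm{Im}\,A| = 2^{-l}. The sketch below is only an illustration of that benchmark case (the matrix size, trial count, and test vectors are made up; the sparse ensembles treated in this paper satisfy the weaker (\alpha_A, \beta_A) form of the bound rather than this exact equality).

```python
import random

random.seed(1)
n, l, trials = 12, 5, 20000  # block length, syndrome length, Monte Carlo trials

def rand_matrix():
    # a uniformly random l x n matrix over GF(2)
    return [[random.randrange(2) for _ in range(n)] for _ in range(l)]

def syndrome(A, x):
    # the product A x over GF(2)
    return tuple(sum(a * b for a, b in zip(row, x)) % 2 for row in A)

x = [random.randrange(2) for _ in range(n)]
x2 = list(x)
x2[0] ^= 1  # a distinct vector x' != x

# Estimate p_A({A : A x = A x'}); for this ensemble it equals 2^-l exactly,
# because each row collides independently with probability 1/2.
hits = sum(syndrome(A, x) == syndrome(A, x2)
           for A in (rand_matrix() for _ in range(trials)))
print(hits / trials, "vs", 2 ** -l)
```

With these (made-up) parameters the empirical frequency settles near 2^{-5} = 0.03125. For sparse ensembles the collision probability is only bounded, roughly by \alpha_A/|\mathrm{Im}\,A| plus a \beta_A correction with \alpha_A \to 1 and \beta_A \to 0, which is what the hash property packages for the coding theorems above.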
It is a future challenge to derive the performance of codes when ML coding is replaced by one of the efficient algorithms given in [1][18][11]. It is also a future challenge to apply the hash property to other coding problems. For example, there are studies of fixed-rate universal source coding and fixed-rate universal channel coding [31], and of wiretap channel coding and secret key agreement [32].

APPENDIX
Method of Types

We use the following lemmas for a set of typical sequences. It should be noted that our definition of a set of typical sequences is introduced in [15][41] and differs from that defined in [6][4][16][48].

Lemma 22 ([6, Lemma 2.6]):

  \frac{1}{n}\log\frac{1}{\mu_U(u)} = H(\nu_u) + D(\nu_u \| \mu_U)
  \frac{1}{n}\log\frac{1}{\mu_{U|V}(u|v)} = H(\nu_{u|v} \mid \nu_v) + D(\nu_{u|v} \| \mu_{U|V} \mid \nu_v).

Lemma 23 ([41, Theorem 2.5]): If v \in T_{V,\gamma} and u \in T_{U|V,\gamma'}(v), then (u, v) \in T_{UV,\gamma+\gamma'}. If (u, v) \in T_{UV,\gamma}, then v \in T_{V,\gamma}.

Proof: The first statement can be proved from the fact that

  D(\nu_{uv} \| \mu_{UV}) = D(\nu_v \| \mu_V) + D(\nu_{u|v} \| \mu_{U|V} \mid \nu_v).   (78)

The second statement can be proved from the fact that D(\nu_v \| \mu_V) \le D(\nu_{uv} \| \mu_{UV}), which is derived from (78) and the non-negativity of the divergence.

Lemma 24 ([41, Theorem 2.6]): If u \in T_{U,\gamma}, then

  |\nu_u(u) - \mu_U(u)| \le \sqrt{2\gamma} for all u \in U, and
  \nu_u(u) = 0 if \mu_U(u) = 0.

Proof: The lemma can be proved directly from the fact that

  \sum_{u\in U} |\nu(u) - \mu_U(u)| \le \sqrt{\frac{2 D(\nu \| \mu_U)}{\log_2 e}},

where e is the base of the natural logarithm (see [4, Lemma 12.6.1]).

Lemma 25 ([41, Theorem 2.7]): Let 0 < \gamma \le 1/8.
Then

  \left|\frac{1}{n}\log_2\frac{1}{\mu_U(u)} - H(U)\right| \le \zeta_U(\gamma)   (79)

for all u \in T_{U,\gamma}, and

  \left|\frac{1}{n}\log_2\frac{1}{\mu_{U|V}(u|v)} - H(U|V)\right| \le \zeta_{U|V}(\gamma'|\gamma)   (80)

for v \in T_{V,\gamma} and u \in T_{U|V,\gamma'}(v), where \zeta_U(\gamma) and \zeta_{U|V}(\gamma'|\gamma) are defined in (2) and (3), respectively.

Proof: From Lemma 22, we have

  \left|\frac{1}{n}\log_2\frac{1}{\mu_U(u)} - H(U)\right| \le D(\nu_u \| \mu_U) + |H(\nu_u) - H(U)|.

We have (79) from [6, Lemma 2.7]. From Lemmas 22 and 24, we have

  \left|\frac{1}{n}\log_2\frac{1}{\mu_{U|V}(u|v)} - H(U|V)\right| \le D(\nu_{u|v} \| \mu_{U|V} \mid \nu_v) + |H(\nu_{u|v} \mid \nu_v) - H(\mu_{U|V} \mid \nu_v)| + |H(\mu_{U|V} \mid \nu_v) - H(U|V)|

and

  |H(\mu_{U|V} \mid \nu_v) - H(U|V)| \le \sqrt{2\gamma}\,\log_2|U|,

respectively. We have (80) from the above inequalities and [6, Lemma 2.7].

Lemma 26 ([41, Theorem 2.8]): For any \gamma > 0 and v \in V^n,

  \mu_U([T_{U,\gamma}]^c) \le 2^{-n[\gamma - \lambda_U]}
  \mu_{U|V}([T_{U|V,\gamma}(v)]^c \mid v) \le 2^{-n[\gamma - \lambda_{UV}]},

where \lambda_U and \lambda_{UV} are defined in (1).

Proof: The lemma can be proved from [6, Lemma 2.2] and [6, Lemma 2.6].

Lemma 27 ([41, Theorem 2.9]): For any \gamma > 0, \gamma' > 0, and v \in T_{V,\gamma},

  \left|\frac{1}{n}\log_2|T_{U,\gamma}| - H(U)\right| \le \eta_U(\gamma)
  \left|\frac{1}{n}\log_2|T_{U|V,\gamma'}(v)| - H(U|V)\right| \le \eta_{U|V}(\gamma'|\gamma),

where \eta_U(\gamma) and \eta_{U|V}(\gamma'|\gamma) are defined in (4) and (5), respectively.

Proof: The lemma can be proved in the same way as the proof of [6, Lemma 2.13].

ACKNOWLEDGEMENTS

This paper was written while one of the authors, J. M., was a visiting researcher at ETH, Zürich. He wishes to thank Prof. Maurer for arranging his stay. Constructive comments, suggestions, and references by anonymous reviewers of the IEEE Transactions on Information Theory have significantly improved the presentation of our results.

REFERENCES

[1] S. M. Aji and R. J. McEliece, "The generalized distributive law," IEEE Trans. Inform. Theory, vol. 46, no. 2, pp. 325–343, Mar. 2000.
[2] A.
Bennatan and D. Burshtein, "On the application of LDPC codes to arbitrary discrete-memoryless channels," IEEE Trans. Inform. Theory, vol. IT-50, no. 3, pp. 417–438, Mar. 2004.
[3] T. M. Cover, "A proof of the data compression theorem of Slepian and Wolf for ergodic sources," IEEE Trans. Inform. Theory, vol. IT-21, no. 2, pp. 226–228, Mar. 1975.
[4] T. M. Cover and J. A. Thomas, Elements of Information Theory, 2nd ed., John Wiley & Sons, Inc., 2006.
[5] I. Csiszár, "Linear codes for sources and source networks: Error exponents, universal coding," IEEE Trans. Inform. Theory, vol. IT-28, no. 4, pp. 585–592, Jul. 1982.
[6] I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memoryless Systems, Academic Press, 1981.
[7] J. L. Carter and M. N. Wegman, "Universal classes of hash functions," J. Comput. Syst. Sci., vol. 18, pp. 143–154, 1979.
[8] A. G. Dimakis, M. J. Wainwright, and K. Ramchandran, "Lower bounds on the rate-distortion function of LDGM codes," Proc. IEEE Information Theory Workshop 2007, Lake Tahoe, USA, Sept. 2–6, 2007, pp. 650–655.
[9] P. Elias, "Coding for noisy channels," IRE Convention Record, Part 4, pp. 37–46, 1955.
[10] U. Erez and G. Miller, "The ML decoding performance of LDPC ensembles over Z_q," IEEE Trans. Inform. Theory, vol. IT-51, no. 5, pp. 1871–1879, May 2005.
[11] J. Feldman, M. J. Wainwright, and D. R. Karger, "Using linear programming to decode binary linear codes," IEEE Trans. Inform. Theory, vol. IT-51, no. 3, pp. 954–972, Mar. 2005.
[12] R. G. Gallager, Information Theory and Reliable Communication, John Wiley & Sons, Inc., 1968.
[13] S. I. Gel'fand and M. S. Pinsker, "Coding for channel with random parameters," Probl. Pered. Inform. (Probl. Inf. Transmission), vol. 9(1), pp. 19–31, 1983.
[14] A. Gupta and S. Verdú, "Nonlinear sparse-graph codes for lossy compression of discrete nonredundant sources," Proc. IEEE Information Theory Workshop 2007, Lake Tahoe, USA, Sept. 2–6, 2007, pp. 541–546.
[15] T. S. Han and K. Kobayashi, "Exponential-type error probabilities for multiterminal hypothesis testing," IEEE Trans. Inform. Theory, vol. IT-35, no. 1, pp. 2–14, Jan. 1989.
[16] T. S. Han and K. Kobayashi, Mathematics of Information and Coding, American Mathematical Society, 2002.
[17] H. Koga, "Source coding using families of universal hash functions," IEEE Trans. Inform. Theory, vol. IT-53, no. 9, pp. 3226–3233, Sept. 2007.
[18] F. R. Kschischang, B. J. Frey, and H. A. Loeliger, "Factor graphs and the sum-product algorithm," IEEE Trans. Inform. Theory, vol. 47, no. 2, pp. 498–519, Feb. 2001.
[19] S. Kudekar and R. Urbanke, "Lower bounds on the rate-distortion function of individual LDGM codes," available at arXiv:0804.1697[cs.IT], 2008.
[20] A. D. Liveris, Z. Xiong, and C. N. Georghiades, "Compression of binary sources with side information at the decoder using LDPC codes," IEEE Comm. Letters, vol. 6, no. 10, pp. 440–442, Oct. 2002.
[21] D. J. C. MacKay, "Good error-correcting codes based on very sparse matrices," IEEE Trans. Inform. Theory, vol. IT-45, no. 2, pp. 399–431, Mar. 1999.
[22] D. J. C. MacKay, Information Theory, Inference, and Learning Algorithms, Cambridge University Press, 2003.
[23] E. Martinian and M. Wainwright, "Low density codes achieve the rate-distortion bound," Proc. IEEE Data Compression Conference, Mar. 28–30, 2006, pp. 153–162.
[24] E. Martinian and M. Wainwright, "Low-density constructions can achieve the Wyner-Ziv and Gelfand-Pinsker bounds," Proc. 2006 IEEE Int. Symp. Inform. Theory, Seattle, USA, Jul. 9–14, 2006, pp. 484–488.
[25] Y. Matsunaga and H. Yamamoto, "A coding theorem for lossy data compression by LDPC codes," IEEE Trans. Inform. Theory, vol. IT-49, no. 9, pp. 2225–2229, 2003.
[26] G. Miller and D. Burshtein, "Bounds on the maximum-likelihood decoding error probability of low-density parity-check codes," IEEE Trans. Inform. Theory, vol. IT-47, no. 7, pp. 2696–2710, Nov. 2001.
[27] S. Miyake, "Lossy data compression over Z_q by LDPC code," Proc. 2006 IEEE Int. Symp. Inform. Theory, Seattle, USA, Jul. 9–14, 2006, pp. 813–816.
[28] S. Miyake and J. Muramatsu, "Constructions of a lossy source code using LDPC matrices," IEICE Trans. Fundamentals, vol. E91-A, no. 6, pp. 1488–1501, Jun. 2008.
[29] S. Miyake and J. Muramatsu, "A construction of channel code, joint source-channel code, and universal code for arbitrary stationary memoryless channels using sparse matrices," Proc. 2008 IEEE Int. Symp. Inform. Theory, Toronto, Canada, Jul. 6–11, 2008, pp. 1193–1197.
[30] J. Muramatsu and S. Miyake, "Lossy source coding algorithm using lossless multi-terminal source codes," Technical Report of IEICE, vol. IT2006-50, pp. 1–6, Jan. 2007.
[31] J. Muramatsu and S. Miyake, "Hash property and fixed-rate universal coding theorems," submitted to IEEE Trans. Inform. Theory, available at arXiv:0804.1183[cs.IT], 2008.
[32] J. Muramatsu and S. Miyake, "Construction of codes for wiretap channel and secret key agreement from correlated source outputs by using sparse matrices," in preparation for submission, 2007.
[33] J. Muramatsu, T. Uyematsu, and T. Wadayama, "Low density parity check matrices for coding of correlated sources," IEEE Trans. Inform. Theory, vol. IT-51, no. 10, pp. 3645–3653, Oct. 2005.
[34] J. Muramatsu, "Secret key agreement from correlated source outputs using low density parity check matrices," IEICE Trans. Fundamentals, vol. E89-A, no. 7, pp. 2036–2046, Jul. 2006.
[35] T. Murayama, "Statistical mechanics of data compression theorem," J. Phys. A: Math. Gen., vol. 35, pp. L95–L100, 2002.
[36] T. Murayama, "Thouless-Anderson-Palmer approach for lossy compression," Phys. Rev. E, vol. 69, no. 035105(R), 2004.
[37] C. E. Shannon, "A mathematical theory of communication," Bell System Technical Journal, vol. 27, pp. 379–423, 623–656, 1948.
[38] C. E. Shannon, "Coding theorems for a discrete source with a fidelity criterion," IRE National Convention Record, Part 4, pp. 142–163, 1959.
[39] D. Slepian and J. K. Wolf, "Noiseless coding of correlated information sources," IEEE Trans. Inform. Theory, vol. IT-19, no. 4, pp. 471–480, Jul. 1973.
[40] D. Schonberg, S. S. Pradhan, and K. Ramchandran, "LDPC codes can approach the Slepian Wolf bound for general binary sources," 40th Annual Allerton Conference on Communication, Control, and Computing, Allerton House, Monticello, Illinois, Oct. 2002.
[41] T. Uyematsu, Gendai Shannon Riron, Baifukan, 1998 (in Japanese).
[42] T. Wadayama, "A lossy compression algorithm for discrete memoryless sources based on LDPC codes," Proc. 3rd Asian-European Workshop on Inform. Theory, Kamogawa, Japan, Jun. 25–28, 2003, pp. 98–105.
[43] T. Wadayama, "A lossy compression algorithm for binary memoryless sources based on LDPC codes," Technical Report of IEICE, vol. IT2003-50, pp. 53–56, 2003.
[44] A. D. Wyner, "A theorem on the entropy of certain binary sequences and applications II," IEEE Trans. Inform. Theory, vol. IT-19, no. 6, pp. 772–777, Nov. 1973.
[45] A. D. Wyner, "Recent results in the Shannon theory," IEEE Trans. Inform. Theory, vol. IT-20, pp. 2–10, Jan. 1974.
[46] A. D. Wyner and J. Ziv, "A theorem on the entropy of certain binary sequences and applications I," IEEE Trans. Inform. Theory, vol. IT-19, no. 6, pp. 769–771, Nov. 1973.
[47] A. D. Wyner and J. Ziv, "The rate-distortion function for source coding with side information at the decoder," IEEE Trans. Inform. Theory, vol. IT-22, no. 1, pp. 1–10, Jan. 1976.
[48] R. W. Yeung, A First Course in Information Theory, Springer, 2006.
[49] R. Zamir, S. Shamai (Shitz), and U. Erez, "Nested linear/lattice codes for structured multiterminal binning," IEEE Trans. Inform. Theory, vol. IT-48, no. 6, pp. 1250–1276, Jun. 2002.