Coding for Network Coding

Andrea Montanari∗ and Rüdiger Urbanke†

November 19, 2007

Abstract

We consider communication over a noisy network under randomized linear network coding. Possible error mechanisms include node or link failures, Byzantine behavior of nodes, or an over-estimate of the network min-cut. Building on the work of Kötter and Kschischang, we introduce a probabilistic model for errors. We compute the capacity of this channel and we define an error-correction scheme based on random sparse graphs and a low-complexity decoding algorithm. By optimizing over the code degree profile, we show that this construction achieves the channel capacity with complexity that is jointly quadratic in the number of coded information bits and sublogarithmic in the error probability.

1 Introduction

Consider a wire-line communication network modeled as a directed acyclic (multi-)graph with edges of unit capacity. A source wants to communicate information to a set of receivers. If we allow processing of information at nodes in the network, then the achievable throughput is in general higher than what can be achieved by schemes that only allow routing [1, 9]. Schemes that employ processing are referred to as network coding schemes. The standard assumption in the network coding literature is that no errors are introduced within the network or, equivalently, that sufficiently powerful error-correcting codes are employed on the links at the physical layer. However, a number of error sources (e.g., malicious or malfunctioning nodes) cannot be neglected. We consider a probabilistic model for transmission errors that builds upon the work of Kschischang and Kötter [7, 14].
We compute the information-theoretic limit on point-to-point communication for this model (the channel capacity) and define a coding scheme based on a sparse-graph construction and a low-complexity iterative decoding algorithm. We show that the parameters of the construction can be optimized analytically and, remarkably, the optimized scheme achieves the channel capacity. This is the second channel model for which iterative schemes can be shown to achieve capacity (the first one being the binary erasure channel; this was shown in the seminal work of Luby, Mitzenmacher, Shokrollahi, Spielman, and Stemann [10]).

2 Network Coding: Background and Related Work

Assume that an information source generates h symbols per unit time. The integer h is referred to as the source rate. Information is encoded at the sender in packets of length N with entries from a finite field F_q. The network is assumed to be synchronous and without delay. As a consequence, packets are aligned at the destination at regular time intervals.

∗Departments of Electrical Engineering and Statistics, Stanford University, Stanford CA 94305, USA
†School of Computer and Communication Sciences, EPFL, 1015 Lausanne, CH
Keywords: sparse graph codes, probabilistic channel models, Shannon channel capacity, network coding

The most common scenario studied in this context is a multicast one, in which the source aims at communicating the same information to a set of receivers (distinct nodes in the same network). The fundamental theorem of network coding states that this is possible using network coding (i.e., processing at the nodes) if the values of the min-cuts from the source to any of the receivers are at least h [1]. Moreover, linear network coding suffices [9].
This means that processing at the nodes can be limited to forwarding packets which are linear (over F_q) combinations of incoming packets. Finally, it is not necessary to choose the local encoding functions at the nodes carefully. Random linear combinations are sufficient with probability close to one, provided the cardinality of the field is large enough [6, 8]. For a general introduction to network coding we refer the reader to [4, 16, 17].

The preferred method to implement random linear network codes is to include "headers" in the packets of length N [2]. The role of the headers is to "record" the coefficients used in the local encoding functions so that the receiver can be oblivious to the network topology and to the specific local encoding functions used. In more detail, assume that we send ℓ packets. The header of each packet is then an element of (F_q)^ℓ, where the header of the i-th source packet, i ∈ [ℓ], is the all-zero tuple, except for an identity element at position i. Recall that nodes forward packets which are linear combinations of the incoming packets. Therefore, if the header of a packet somewhere in the network reads (β_1, …, β_ℓ), β_i ∈ F_q, then we know that this packet is the linear combination of the ℓ original source packets, where the i-th original source packet has "weight" β_i. The significant advantage of such a scheme is that the receivers can be oblivious to the topology and the local encoding functions. Of course, we pay some price; if we use headers, then only m = N − ℓ of the N symbols of each packet are available for information transmission. Our subsequent discussion assumes this "oblivious" model.

So far we assumed that errors occur neither during transmission nor during processing. If the channel or the processing are noisy, one can use coding to combat the noise.
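As a toy illustration of this header mechanism (our own sketch over the small prime field F_7, with made-up sizes; the paper's field and parameters are generic):

```python
import numpy as np

# Toy illustration of headers recording the local encoding coefficients.
# The field size p, packet count l, and packet length N are made up.
p, l, N = 7, 3, 8          # F_7, three source packets of length 8
rng = np.random.default_rng(0)

# Source packets in "normal form" [1 | x]: identity headers plus payload x.
x = rng.integers(0, p, size=(l, N - l))
M = np.hstack([np.eye(l, dtype=np.int64), x]) % p

# A node somewhere in the network forwards a random linear combination
# of its incoming packets (here: all l source packets at once).
beta = rng.integers(0, p, size=l)
packet = beta @ M % p

# The first l symbols (the header) reveal the combination coefficients,
# so the receiver needs no knowledge of the network topology...
assert np.array_equal(packet[:l], beta)
# ...and the payload is the same combination applied to the x-rows.
assert np.array_equal(packet[l:], beta @ x % p)
```

The same bookkeeping survives any number of forwarding hops, since a linear combination of linear combinations is again linear.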
Note that if we stack the ℓ source packets of length N on top of each other, then we get an ℓ × N matrix over F_q whose, let's say, left ℓ × ℓ submatrix (the collection of headers) is the identity matrix. Formally, a code C is a collection of ℓ × N matrices with elements in F_q, such that each M ∈ C takes the form M = [1 | x]. Here, 1 is the ℓ × ℓ identity matrix and x is an ℓ × m matrix (m = N − ℓ). We say that M is in normal form. The code C is thus equivalently described by a collection of ℓ × m matrices {x}. The rate of the code is defined as the ratio of the number of information q-bits that can be conveyed by the choice of codeword (log_q |C|) to the number of transmitted symbols (Nℓ):

    R(C) = log_q |C| / (Nℓ).    (1)

Before the source packets are transmitted, we multiply M from the left by an ℓ × ℓ random invertible matrix with components in F_q. This "mixes" the rows of M and ensures that, regardless of the network topology and the location where the errors are introduced, the effect of the errors on the normal form is uniform. We then transmit each resulting row as one packet. Upon transmission of M, a "corrupted" version Q of the codeword is received. Without loss of generality, we assume that Q is brought back into normal form Q = [1 | y] by Gaussian elimination.¹ Following Kötter and Kschischang [7], we model the net effect of the transmission and processing "noise" as a low-rank perturbation of x. More precisely, we assume that

    y = x + z,    (2)

where z is an ℓ × m matrix over F_q of rank(z) = ℓω, ω ∈ [0, 1]. We call ℓω the weight of the error, and ω the normalized weight. Define the distance of two codewords x and x′ as d(x, x′) = rank(x − x′) and the minimum distance d(C) of the code C as the minimum of the distances between all distinct pairs of codewords.
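To make the rank metric concrete, here is a small example over F_2 (our own sketch; the paper works over general F_q, where subtraction replaces the XOR below):

```python
import numpy as np

def rank_gf2(A):
    """Rank of a 0/1 matrix over F_2, via Gaussian elimination mod 2."""
    A = np.array(A, dtype=np.uint8) % 2
    rows, cols = A.shape
    rank = 0
    for c in range(cols):
        pivot = next((r for r in range(rank, rows) if A[r, c]), None)
        if pivot is None:
            continue
        A[[rank, pivot]] = A[[pivot, rank]]      # swap pivot row into place
        for r in range(rows):
            if r != rank and A[r, c]:
                A[r] ^= A[rank]                  # eliminate column c
        rank += 1
    return rank

def rank_distance(x1, x2):
    """d(x, x') = rank(x - x'); over F_2, subtraction is bitwise XOR."""
    return rank_gf2(np.bitwise_xor(x1, x2))

# Two toy 2 x 3 payload matrices x and x' over F_2.
x  = np.array([[1, 0, 1], [0, 1, 1]], dtype=np.uint8)
xp = np.array([[1, 1, 0], [0, 0, 0]], dtype=np.uint8)
print(rank_distance(x, xp))   # -> 1: the difference has two equal rows
```

Note that the two matrices differ in four entries, yet their rank distance is only 1, illustrating how the rank metric counts the dimension of the perturbation rather than the number of corrupted symbols.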
The normalized minimum distance is δ(C) = d(C)/ℓ. It is shown in [7] that d(·, ·) is a true distance metric; in particular, it fulfills the triangle inequality. Therefore, given a code C of minimum distance d(C), a simple bounded-distance decoder can correct all errors of weight s = (d(C) − 1)/2 or less. A bounded-distance decoder is an algorithm that, given a received word y, decodes y to the unique word within distance s if such a word exists and declares an error otherwise. Bounded-distance decoders are popular since a suitable algebraic structure on the code often ensures that bounded-distance decoding can be accomplished with low complexity. The bounded-distance error-correcting capability of a code is defined as ω(C) = d(C)/(2ℓ) = δ(C)/2. Kötter and Kschischang showed that the optimal trade-off between R(C) and ω(C) is given by an appropriate generalization of the "Singleton bound." In the limit N → ∞, with ℓ = λN, the maximal achievable rate for the parameters ω, λ ∈ [0, 1/2], call it C_Singleton(λ, ω), is given by

    C_Singleton(λ, ω) = (1 − λ)(1 − 2ω).    (3)

Note that C_Singleton(λ, ω) is the maximum achievable rate for guaranteed error correction in an adversarial channel model. It is also the maximal achievable rate in a probabilistic setting if we are limited to bounded-distance decoding. Remarkably, Kötter and Kschischang found a generalization of Reed–Solomon codes that achieves this bound.

¹In principle, it might be that the received matrix cannot be brought into this form because its first ℓ columns have rank smaller than ℓ. However, within the probabilistic model which we discuss in the following, the rank deficiency is small with high probability and can be eliminated by a small perturbation.
3 Main Results

We are interested in a probabilistic (as opposed to adversarial) channel model. More precisely, we assume that in (2) the perturbation z is chosen uniformly at random from all matrices in (F_q)^{ℓ×m} of rank ℓω. We assume that the parameters λ and ω are fixed and consider the behavior of the channel as we increase N. We refer to our channel model as the symmetric network coding channel with parameters λ and ω, denoted by SNC(λ, ω).

Proposition 3.1 (Channel Capacity). The capacity of SNC(λ, ω) is

    C(λ, ω) = 1 − λ − ω + λω².    (4)

Discussion: In the definition of capacity we implicitly assume that the error probability ω is not a function of N. Depending on the underlying physical error mechanism, this may or may not be the case. Note that for small ω, C(λ, ω) ≈ 1 − λ − ω, whereas C_Singleton(λ, ω) ≈ 1 − λ − 2(1 − λ)ω. Fig. 1 compares C(λ, ω) with C_Singleton(λ, ω) and shows the points that are achievable according to Theorem 3.2.

Theorem 3.2 (Capacity-Achieving Iterative Code Construction). For any λ, ω ∈ (0, 1) such that (1 − λ)/λ is an integer multiple of ω, any R < C(λ, ω), and any π > 0, there exists an error-correcting code and a decoding algorithm that achieves symbol error probability smaller than π, with O(N⁴ log log(1/π)) decoding complexity.

[Figure 1: Comparison of C(λ, ω) (solid line) with C_Singleton(λ, ω) (dotted line) for λ = 1/6. The points on the curve C(λ, ω) that are achievable by the low-complexity iterative scheme are shown as dots.]

Discussion: The complexity of the scheme is given as O(N⁴). But note that the number of transmitted information symbols is N²λR. Therefore, if we measure the complexity per transmitted information symbol, then it is only quadratic.
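The two rate expressions (3) and (4) are easy to compare numerically; the choice λ = 1/6 below matches the setting of Fig. 1, and the error weights are arbitrary sample points:

```python
# Compare the channel capacity (4) with the Singleton-type bound (3).
def capacity(lam, w):
    """C(lam, w) = 1 - lam - w + lam * w^2, Proposition 3.1."""
    return 1 - lam - w + lam * w**2

def singleton(lam, w):
    """C_Singleton(lam, w) = (1 - lam)(1 - 2w), Eq. (3)."""
    return (1 - lam) * (1 - 2 * w)

lam = 1 / 6
for w in (0.05, 0.1, 0.2, 0.4):
    print(f"w={w:.2f}  C={capacity(lam, w):.4f}  "
          f"C_Singleton={singleton(lam, w):.4f}")
```

The gap C − C_Singleton = ω(1 + λω − 2λ) is non-negative whenever λ ≤ 1/2, quantifying the advantage of the probabilistic model over guaranteed (adversarial) correction.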
Note also that the complexity scales much better with the target error probability than for usual sparse graph codes (where it is at least linear in log(1/π)).

3.1 Code Construction

Fig. 2 shows our coding scheme. Each row corresponds to a packet of length N.

[Figure 2: Coding scheme. The ℓ × ℓ identity matrix 1 (ℓ = λN) is shown on the left-hand side, whereas the ℓ × m matrix on the right-hand side represents x (m = (1 − λ)N). The last ωℓ rows of x are zero. The first (1 − ω)ℓ rows obey a set of linear constraints represented by a bipartite graph with (1 − ω)ℓ · 2/P′(1) check nodes.]

Each x corresponds to a codeword of C. Not all x are allowed. Here are the constraints that x must fulfill to be a codeword. The bottom ωℓ rows are identically zero. The top (1 − ω)ℓ rows are constrained by a linear system of equations. These are indicated by the bipartite graph on the right-hand side of Fig. 2, according to the standard graphical representation used for low-density parity-check codes [5, 13]. More precisely, stacking the transposed rows x₁ᵀ, …, x_{(1−ω)ℓ}ᵀ into a single column vector X, we have

    Ĥ X = 0.    (5)

The matrix Ĥ has the following structure. Start with a "sparse" ((1 − ω)ℓr) × ((1 − ω)ℓ) {0, 1}-valued matrix H. The matrix H has exactly 2 non-zero entries in each column. Further, the fraction of rows that contain exactly i non-zero entries is equal to P_i, where P(x) = Σ_i P_i x^i is a given degree distribution (in particular, it fulfills P_i ≥ 0 and P(1) = 1). In the following, we shall say that P has bounded support if P_i = 0 for i larger than some n_max < ∞ or, equivalently, if P(x) is a polynomial. The matrix H is represented by the graph. Circles (on the right-hand side in Fig. 2) correspond to the columns of H and squares (on the left-hand side) correspond to the rows of H. There is an edge between a circle and a square iff there is a non-zero entry at the corresponding row and column of H. Following the iterative coding literature, we refer to the circles as the variable nodes, to the squares as the check nodes, and we call this graph a Tanner graph.

To get the matrix Ĥ we "lift" H by replacing each of its non-zero elements by an m × m invertible matrix with elements in F_q. We can visualize this by attaching these invertible matrices as labels to the corresponding edges. We claim that for any choice of the matrix Ĥ compatible with the degree distribution P(x), the rate of the code is at least

    R(ω, λ, P) = (1 − λ)(1 − ω)(1 − 2/P′(1)).    (6)

To see this, note that the matrix x is of dimension ℓ × m and has entries in F_q. Since the last ωℓ rows have to be zero, this reduces the degrees of freedom by mωℓ. Further, there are m(1 − ω)ℓ · 2/P′(1) linear constraints, taking away at most that many further degrees of freedom (and possibly less because of linear dependencies). We get the claim by dividing the remaining degrees of freedom by Nℓ, in accordance with (1).

So far we have explained how to construct a code. We define an ensemble of codes by (i) picking a matrix H uniformly from all matrices that have degree profile P(x) according to the configuration model and (ii) picking the labels (the m × m invertible matrices) for all edges uniformly and independently for each edge. We denote the resulting ensemble by C(N, λ, ω, P(x)).

Discussion: In Fig. 2 all linear constraints are on the rows of x. An entirely equivalent formulation is to apply the linear constraints to the columns of x instead; i.e., set the last ωm columns of x to zero and apply a set of linear constraints on the first (1 − ω)m columns of x.
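As a quick sanity check on (6), the design rate can be evaluated for any degree profile; the profile below is a made-up example, not an optimized one from the paper:

```python
# Design rate (6): R = (1 - lam)(1 - w)(1 - 2/P'(1)), where P'(1) is the
# average check-node degree. The degree profile P below is illustrative only.
def design_rate(lam, w, P):
    """P maps a degree i to the fraction P_i of check rows with i non-zeros."""
    assert abs(sum(P.values()) - 1.0) < 1e-12       # P(1) = 1
    P_prime_1 = sum(i * Pi for i, Pi in P.items())  # P'(1)
    return (1 - lam) * (1 - w) * (1 - 2 / P_prime_1)

P = {3: 0.5, 4: 0.5}                  # average check degree P'(1) = 3.5
print(design_rate(lam=1/6, w=0.1, P=P))
```

Since every variable node has degree 2, the rate is positive only when the average check degree exceeds 2, which is visible directly in the factor 1 − 2/P′(1).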
All subsequent statements apply also to this case and yield identical results if we let N tend to infinity. For the sake of simplicity, we limit our discussion to the scheme of Fig. 2. In a practical implementation, however, there can be reasons to prefer one scheme over the other. For instance, the iterative decoder discussed in the next section might be more effective on a larger Tanner graph. This suggests using the construction in Fig. 2 if ℓ > m and the 'transposed' one otherwise.

3.2 Encoding and Decoding Algorithm

Assume that the parameters of the model (N, λ, ω, and P(x)) are fixed and that we have chosen one particular code from the ensemble C(N, λ, ω, P(x)). At the source we are given RNℓ symbols over F_q (the information we want to transmit). We need to map each of these q^{RNℓ} possible information vectors to a distinct codeword x. This is the encoding task. In principle, this can be done by solving a linear system of equations, starting with (5). A brute-force approach, however, has complexity O(N⁶). Fortunately, one can exploit the sparseness of the matrix H to reduce the encoding complexity to O(N³). The basic idea is to bring H into upper-triangular form by using only row and column permutations but no algebraic operations. As proved in [12], this can be done with high probability if P′′(1)/P′(1) > 1. We will see in Section 4, cf. Lemma 4.5, that this condition is always fulfilled. Further details on the efficient implementation of the encoder will be discussed in a forthcoming publication. We are currently mainly concerned with the decoding problem.

The receiver sees the perturbed matrix y. An equivalent description of our channel model is the following. Each row of y is the result of adding to the corresponding row of x a uniformly random element of a subspace W of (F_q)^m.
The subspace W is itself uniformly random under the condition dim(W) = ℓω.² Recall that by assumption the last ℓω rows of x are zero. In fact, in order to achieve reliable transmission we need to modify the scheme described so far and set the last ℓω′ rows of x to 0, where ω′ > ω is arbitrarily close to ω. This modification reduces the rate by a quantity that can be made arbitrarily small. Since the perturbation has dimension ℓω, the last ℓω′ rows of y will span W with high probability as N → ∞. A basis of W is then obtained by reducing these rows via Gaussian elimination. We therefore assume hereafter that W is known and, to avoid cumbersome notation, we set ω′ = ω.

The decoding task consists in finding the perturbations for the first (1 − ω)ℓ rows of y. If we subtract these perturbations from y, we have found x. Throughout the description, given two sets of vectors U₁ and U₂, we let U₁ + U₂ ≡ {u₁ + u₂ : u₁ ∈ U₁, u₂ ∈ U₂} and, for a given vector x, x + U ≡ {x} + U. Finally, given a matrix h ∈ (F_q)^{m×m}, Uh ≡ {uh : u ∈ U} (vectors are always thought of as row vectors).

We proceed in an iterative fashion. The basic principle is easily understood. We know that x_i ∈ y_i + W. In words, we know that x_i lies in a given affine subspace. Consider a check node a and, without loss of generality, let its neighbors be 1, …, d. Let h_{i,a}, i = 1, …, d, denote the corresponding edge labels. As we discussed earlier, each such edge label is an m × m invertible matrix with entries in F_q. By the definition of the code, Σ_{i=1}^d x_i h_{i,a} = 0. In particular, this means that x₁ = (Σ_{i=2}^d x_i h_{i,a}) h_{1,a}^{−1}, up to a sign which can be absorbed into the label h_{1,a} and which we omit in what follows. Since we know that x_i ∈ y_i + W, this implies that x₁ ∈ [(y₂ + W)h_{2,a} + ··· + (y_d + W)h_{d,a}] h_{1,a}^{−1}.
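The set operations appearing in this derivation, sums of subspaces and intersections with an affine space y_i + W, all reduce to Gaussian elimination on basis matrices. A minimal sketch over F_2 of the two dimension computations (helper names and toy bases are ours, not the paper's):

```python
import numpy as np

def rank2(A):
    """Rank of a 0/1 matrix over F_2, by Gaussian elimination mod 2."""
    A = np.array(A, dtype=np.uint8) % 2
    r = 0
    for c in range(A.shape[1]):
        piv = next((i for i in range(r, A.shape[0]) if A[i, c]), None)
        if piv is None:
            continue
        A[[r, piv]] = A[[piv, r]]
        for i in range(A.shape[0]):
            if i != r and A[i, c]:
                A[i] ^= A[r]
        r += 1
    return r

def dim_sum(B1, B2):
    """dim(V1 + V2): rank of the stacked basis rows."""
    return rank2(np.vstack([B1, B2]))

def dim_cap(B1, B2):
    """dim(V1 intersect V2) = dim V1 + dim V2 - dim(V1 + V2)."""
    return rank2(B1) + rank2(B2) - dim_sum(B1, B2)

# Two planes in F_2^4 that share a one-dimensional intersection.
B1 = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], dtype=np.uint8)
B2 = np.array([[0, 1, 0, 0], [0, 0, 1, 0]], dtype=np.uint8)
print(dim_sum(B1, B2), dim_cap(B1, B2))   # -> 3 1
```

In the decoder below, a message would carry such a basis plus an offset vector; decoding of row i succeeds once the candidate set for x_i shrinks to dimension 0, i.e., to a single point.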
²As discussed in the introduction, the underlying physical process is the following: we add the headers to the rows of x; we scramble the rows of M by multiplying it by a random invertible ℓ × ℓ matrix over F_q; we send the resulting packets; the channel perturbs these packets; the receiver collects the perturbed packets, stacks them up into a matrix Q, brings the matrix back into normal form, and "strips off" the headers.

Since we also know that x₁ ∈ y₁ + W, this implies that

    x₁ ∈ (y₁ + W) ∩ {[(y₂ + W)h_{2,a} + ··· + (y_d + W)h_{d,a}] h_{1,a}^{−1}}.    (7)

The actual decoder is most conveniently described (and analyzed) as a 'message passing' algorithm, with messages being sent along the edges of the Tanner graph. Messages are affine subspaces of (F_q)^m. They are sent in rounds. First we send messages from the variable nodes to the check nodes. We process the incoming messages at the check nodes and then send messages on all edges from the check nodes to the variable nodes. This concludes one iteration of message passing.

In more detail, the message sent from variable node i to check node a in the t-th iteration is an affine subspace W_{i→a}^{(t)} of (F_q)^m. If variable node i is connected to check node a, let ā denote the second check node that is connected to i (recall that each variable node has exactly two neighbors). Variable nodes do not perform any non-trivial processing of the messages, and check-to-variable node messages coincide with variable-to-check ones: W_{i→a}^{(t)} = W_{ā→i}^{(t)}. For t = 0 we have W_{i→a}^{(0)} = y_i + W for all variable nodes i and all check nodes a. Further, let ∂a denote all neighbors of a check node a. According to the above discussion, we apply for t ≥ 0 the recursion

    W_{i→a}^{(t+1)} = (y_i + W) ∩ {[Σ_{j ∈ ∂ā\i} W_{j→ā}^{(t)} h_{j,ā}] h_{i,ā}^{−1}}.    (8)

If, after some iterations, dim(W_{i→a}^{(t)} ∩ W_{i→ā}^{(t)}) = 0, then we have determined the i-th row of x, namely W_{i→a}^{(t)} ∩ W_{i→ā}^{(t)} = {x_i}. Our (main) Theorem 4.5 affirms that, for given parameters λ and ω, the degree distribution P(x) can be chosen in such a way that the rate of the overall code approaches the capacity arbitrarily closely and that the decoder succeeds with high probability when the packet size tends to infinity.

4 Proofs

In the next section we state a few auxiliary lemmas on the behavior of the message-passing decoder and prove Theorem 3.2. The lemmas are then proved in Section 4.2. Finally, the capacity of the network coding channel is computed in Section 4.3.

4.1 Auxiliary Results and Proof of the Main Theorem

To start, we can simplify our proof in two manners. First, by symmetry of the channel and the message-passing rules, we can assume that the all-zero matrix x was transmitted, and we need only analyze the behavior of the decoder for this case. Notice that, under this assumption, the messages W_{i→a}^{(t)} are linear subspaces (as they must contain the transmitted vectors x_i = 0). Second, as we discussed in Section 3.2, the first step of the decoding procedure consists of learning the perturbing subspace W. Because of the special structure of the matrix x (the last ωℓ rows are zero), this is accomplished by a simple inspection. We therefore assume in all that follows that W is known and that the all-zero matrix was transmitted.

Throughout this section we let P be a distribution over the integers and let G be a random multigraph over ℓ(1 − r) nodes with degree distribution P. The graph G is drawn according to the configuration model, and the code is constructed from G as described in the previous section.
Since variable nodes have degree 2, we can think of G either as a multigraph over the check nodes, or as a bipartite graph over check and variable nodes. It is also useful to define the 'edge perspective' degree distribution

    ρ_n = n P_n / Σ_{n′ ≥ 0} n′ P_{n′}.    (9)

For a uniformly random edge in G, let W^{(t)} be the associated message (which, we recall, is an affine subspace of (F_q)^m). The key step in the analysis is to notice that the dimension of W^{(t)} satisfies a simple recursion.

First consider n − 1 independent and uniformly random linear subspaces V₁, …, V_{n−1} ⊆ (F_q)^m of dimensions d₁, …, d_{n−1}, respectively. Let V be a fixed subspace of dim(V) = D, and define

    K^{(n)}_{m,D}(d | d₁, …, d_{n−1}) ≡ P{dim(V ∩ (V₁ + ··· + V_{n−1})) = d}.    (10)

The probability kernel K^{(n)}_{m,D} admits an explicit albeit cumbersome expression in terms of Gauss polynomials. Fortunately, we do not need its exact description in the following.

We define a sequence of integer-valued random variables {D^{(t)}}_{t≥0} recursively as follows. For t = 0 we let D^{(0)} = ℓω identically. For t ≥ 0, choose n with distribution ρ_n, and draw D₁^{(t)}, …, D_{n−1}^{(t)} iid copies of D^{(t)}. Then the probability of D^{(t+1)} = d, conditioned on the values D₁^{(t)} = d₁, …, D_{n−1}^{(t)} = d_{n−1}, coincides with Eq. (10), where D = ℓω. In formulae,

    P{D^{(t+1)} = d} = Σ_{n≥1} ρ_n Σ_{d₁…d_{n−1}} K^{(n)}_{m,D}(d | d₁, …, d_{n−1}) P{D^{(t)} = d₁} ··· P{D^{(t)} = d_{n−1}}.    (11)

The sequence {D^{(t)}} accurately tracks the dimension of W^{(t)}, as stated below.

Lemma 4.1 (Density Evolution on a Graph versus Density Evolution on a Tree). For any degree distribution P with bounded support (i.e., such that P_n = 0 for all n large enough) and any t ∈ ℕ, there exists a sequence ǫ(ℓ, t) with ǫ(ℓ, t) ↓ 0 as ℓ → ∞, such that, for any m and ℓ,

    ||P{dim(W^{(t)}) ∈ ·} − P{D^{(t)} ∈ ·}||_TV ≤ ǫ(ℓ, t),    (12)

where we recall that ||P_X − P_Y||_TV = sup_A |P(X ∈ A) − P(Y ∈ A)|.

Controlling the sequence of random variables {D^{(t)}}_{t≥0} is quite difficult. Luckily, its behavior simplifies considerably if we let m → ∞ and consider the scaled dimensions D^{(t)}/(ℓω). More precisely, we define the sequence of random variables {ξ^{(t)}}_{t≥0} with values in [0, 1] recursively as follows. We let ξ^{(0)} = 1 identically. For any t ≥ 0, let n be drawn with distribution ρ_n, and let ξ₁^{(t)}, …, ξ_{n−1}^{(t)} be iid copies of ξ^{(t)}. Further, for a, b, x ∈ ℝ with a ≤ b, define [x]_a^b = min(max(x, a), b). Then the distribution of ξ^{(t+1)} is given by

    ξ^{(t+1)} =_d [Σ_{i=1}^{n−1} ξ_i^{(t)} + 1 − (1 − λ)/(λω)]₀¹.    (13)

We will prove that the rescaled dimensions D^{(t)}/(ℓω) are accurately tracked by ξ^{(t)}.

Lemma 4.2 (Density Evolution versus Rescaled Density Evolution). For any n_max, ω, and λ there exists ε > 0 such that, for any degree distribution P with support in [0, n_max]:

    lim_{m→∞} P{D^{(t+1)} > 0} ≤ n_max P{ξ^{(t)} ≥ ε}.    (14)

The previous lemma shows that it suffices to consider the behavior of ξ^{(t)}, for which we have the explicit simple recursion (13). Even so, finding a degree distribution ρ which results in codes of large rates and such that ξ^{(t)} converges to 0 as t → ∞ seems challenging. The key to our analysis is the observation that the recursion (13) simplifies significantly if (1 − λ)/λ is an integer multiple of ω. In this case the distribution of ξ^{(t)} trivializes: ξ^{(t)} only takes on the values 0 or 1, regardless of the degree distribution ρ.
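The scalar recursion (13) is easy to simulate by Monte Carlo over a population of samples. The parameters and degree distribution ρ below are illustrative only (they are not the optimized profiles of Lemma 4.3); they satisfy the integer-multiple condition, since (1 − λ)/λ = 2 is an integer multiple of ω = 1/2, giving threshold (1 − λ)/(λω) = 4:

```python
import random

def de_step(samples, rho, lam, w):
    """One step of the rescaled density evolution (13), by Monte Carlo.
    samples: iid draws of xi(t); rho: dict degree n -> edge-prob rho_n."""
    degrees, probs = zip(*rho.items())
    thresh = (1 - lam) / (lam * w)
    out = []
    for _ in range(len(samples)):
        n = random.choices(degrees, weights=probs)[0]   # check degree ~ rho
        s = sum(random.choice(samples) for _ in range(n - 1))
        out.append(min(max(s + 1 - thresh, 0.0), 1.0))  # clamp to [0, 1]
    return out

random.seed(0)
lam, w = 1/3, 0.5
rho = {3: 0.4, 5: 0.6}       # illustrative edge-perspective degree distribution
xi = [1.0] * 2000            # xi(0) = 1 identically
for t in range(20):
    xi = de_step(xi, rho, lam, w)
print(sum(xi) / len(xi))     # fraction of mass remaining at 1 after 20 rounds
```

With these numbers the samples indeed stay in {0, 1}: degree-3 checks output 0 immediately, while a degree-5 check outputs 1 only if all four incoming samples equal 1, so the fraction of ones decays doubly exponentially, in line with Lemma 4.3.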
Density evolution therefore collapses to a scalar recursion, making it possible to find the optimum degree distribution ρ.

Lemma 4.3 (Capacity-Achieving Degree Distributions for Rescaled Density Evolution). Let λ, ω ∈ (0, 1) be such that (1 − λ)/λ is an integer multiple of ω and let r < C(λ, ω)/((1 − λ)(1 − ω)). Then there exists ρ with bounded support and 1 − 2∫₀¹ ρ(x)dx ≥ r, and two constants A > 0, γ > 1, such that, for any t, ε > 0,

    P{ξ^{(t)} ≥ ε} ≤ exp{−Aγ^t}.    (15)

Proof of the Main Theorem 3.2. Let λ, ω, R be as in the statement of the theorem and r ∈ (R/((1 − λ)(1 − ω)), C(λ, ω)/((1 − λ)(1 − ω))). We claim that there exists a degree distribution P with support in [0, n_max], with 1 − 2/P′(1) ≥ r (equivalently, from the edge perspective, 1 − 2∫₀¹ ρ(x)dx ≥ r), such that the iterative decoder achieves error probability smaller than π in O(log log(1/π)) iterations. Let us check that this indeed proves the theorem. As mentioned above, the perturbation subspace W (i.e., the linear subspace of (F_q)^m spanned by the rows of z) can be inferred with high probability from the last ω′ℓ rows of the output y. This requires Gaussian elimination of an m × (ℓω′) matrix with elements in F_q, which can be accomplished at a cost of O(N³) operations. The rest of the codeword x is decoded by message passing. Each iteration requires updating O(N) messages (because P has bounded support). Each update, cf. Eq. (8), requires finding a basis for a space spanned by at most (ℓω)n_max vectors in (F_q)^m. This can be done, again via Gaussian elimination, in O(N³) operations. We thus get O(N⁴) operations per iteration. Since running O(log log(1/π)) iterations achieves error probability smaller than π, this implies the thesis.

Let us now prove this claim.
First, we fix the degree distribution in such a way that Lemma 4.3 holds for some A > 0, γ > 1. We let t*(π) = O(log log(1/π)) be such that P{ξ^{(t)} ≥ ε} ≤ exp{−Aγ^t} ≤ π/(3n_max) for any t ≥ t*(π). Then, for any fixed t ≥ t*(π), the decoded error probability is upper bounded by π if N is large enough. Indeed, ε can be chosen in such a way that Lemma 4.2 holds and therefore, for m large enough, P{D^{(t+1)} > 0} ≤ n_max π/(3n_max) + π/3 ≤ 2π/3. The i-th row of codeword x is decoded correctly if either of the two messages W_{i→a}^{(t+1)} or W_{i→ā}^{(t+1)} has dimension 0. Therefore, the symbol error probability is upper bounded by P{dim(W^{(t+1)}) > 0}. By Lemma 4.1, for ℓ large enough, this is at most P{D^{(t+1)} > 0} + π/3 ≤ π, which proves the theorem.

4.2 Proofs of Lemmas

Proof of Lemma 4.1. The proof is based on the 'density evolution' technique [13], and on some remarks that allow us to simplify the resulting distributional recursion. A similar result appeared already in the context of erasure decoding for non-binary codes [11]; in order to be self-contained, we nevertheless sketch the proof here.

Let ē be a uniformly random directed edge in G and let W^{(t)} be the associated message after t iterations of the message-passing algorithm. Denote by B(ē, t) the 'directed neighborhood' of ē with radius t, i.e., the induced subgraph containing all non-reversing walks in G of length at most t that terminate in ē. We regard this as a labeled graph with variable node labels given by the received vectors and edge labels given by the m × m matrices that define the code. It is well known that such a neighborhood converges to a (labeled) Galton-Watson tree T(t). More precisely, T(t) is a t-generations tree rooted in a directed edge ē_T and with offspring distribution ρ_n.
We have

    ||P{B(ē, t) ∈ ·} − P{T(t) ∈ ·}||_TV ≤ ǫ(ℓ, t),    (16)

for some ǫ(ℓ, t) as in the statement of Lemma 4.1. Note that the message W^{(t)} is a function only of the neighborhood B(ē, t). Suppose that we apply the message-passing algorithm to T(t) and let W_T^{(t)} be the message passed through the root edge after t iterations. It follows from the definition of total variation distance that

    ||P{dim(W^{(t)}) ∈ ·} − P{dim(W_T^{(t)}) ∈ ·}||_TV ≤ ǫ(ℓ, t).    (17)

The proof is completed by showing that dim(W_T^{(t)}) is distributed as the random variable D^{(t)} defined recursively by Eq. (11). First, note that W_T^{(t)} is a uniformly random subspace, conditional on its dimension dim(W_T^{(t)}). This follows from the message-passing update rule (8) together with the remark that, given any fixed subspace W* and a uniformly random full-rank m × m matrix L, LW* is a uniformly random subspace with the same dimension as W*.

We prove that dim(W_T^{(t)}) is distributed as D^{(t)} by recursion. The statement is true for t = 0 by definition of our channel model. Consider the tree T(t + 1) and condition on the offspring number n − 1 at the root. Denote by W_{T,1}^{(t)}, …, W_{T,n−1}^{(t)} the corresponding messages towards the root and condition on dim(W_{T,1}^{(t)}) = d₁, …, dim(W_{T,n−1}^{(t)}) = d_{n−1}. Then the distribution of dim(W_T^{(t+1)}) is given by the kernel (10) with D = ℓω, by uniformity of the subspaces. The claim follows from the fact that W_{T,1}^{(t)}, …, W_{T,n−1}^{(t)} are iid because of the tree structure.

In the proof of Lemma 4.2 we require an estimate of the probability that true density evolution deviates significantly from the rescaled density evolution.

Proposition 4.4 (Deviations from Asymptotic Density Evolution).
Let $V_1$ be a subspace of dimension $d_1$ in $\mathbb{F}_q^m$, and $V_2$ a uniformly random subspace of dimension $d_2$. Define $d_1 \odot d_2 \equiv \max(0, d_1 + d_2 - m)$ and $d_1 \boxplus d_2 \equiv \min(m, d_1 + d_2)$. Then
\[ \mathbb{P}\{ d_1 \odot d_2 \le \dim(V_1 \cap V_2) < d_1 \odot d_2 + k \} \ge 1 - q^{-k - \max(0,\, m - d_1 - d_2)}, \qquad (18) \]
\[ \mathbb{P}\{ d_1 \boxplus d_2 - k < \dim(V_1 + V_2) \le d_1 \boxplus d_2 \} \ge 1 - q^{-k - \max(0,\, m - d_1 - d_2)}. \qquad (19) \]
Further, let $V$ be a subspace of dimension $d$, let $V_1, \dots, V_{n-1}$ be uniformly random subspaces of dimensions (respectively) $d_1, \dots, d_{n-1}$, and let $\hat d \equiv [\,d_1 + \cdots + d_{n-1} + d - m\,]_0^d$ (the sum clipped to the interval $[0, d]$). Then
\[ \mathbb{P}\{\, |\dim((V_1 + \cdots + V_{n-1}) \cap V) - \hat d\,| \ge k \,\} \le n\, q^{-k/n}. \qquad (20) \]

Proof. Notice that Eq. (19) follows from Eq. (18) together with the identity $\dim(V_1 + V_2) = d_1 + d_2 - \dim(V_1 \cap V_2)$. Further, $\dim(V_1 \cap V_2) \ge d_1 \odot d_2$ for any two subspaces $V_1, V_2$ of the given dimensions. We are left with the task of bounding the probability that $\dim(V_1 \cap V_2) \ge d_1 \odot d_2 + k$. Notice that this event is identical to $|V_1 \cap V_2| \ge q^{d_1 \odot d_2 + k}$ (we denote by $|S|$ the cardinality of the set $S$). By the Markov inequality we have
\[ \mathbb{P}\{ \dim(V_1 \cap V_2) \ge d_1 \odot d_2 + k \} \le q^{-k - d_1 \odot d_2}\, \mathbb{E}|V_1 \cap V_2| = q^{-k - d_1 \odot d_2}\, q^{d_1 + d_2 - m}, \qquad (21) \]
where the equality on the right-hand side follows by multiplying the number of vectors in $V_1$ (that is, $q^{d_1}$) by the probability that any one of them belongs to $V_2$ (by uniformity this is $q^{-m + d_2}$).

Eq. (20) follows by applying the previous bounds recursively. Explicitly, we define $W_1 = V_1$, $W_2 = W_1 + V_2$, ..., $W_{n-1} = W_{n-2} + V_{n-1}$, and $W_n = W_{n-1} \cap V$. The corresponding (typical) dimensions are $c_1 = d_1$, $c_2 = c_1 \boxplus d_2$, ..., $c_{n-1} = c_{n-2} \boxplus d_{n-1}$, $c_n = c_{n-1} \odot d = \hat d$.
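The typical dimension $d_1 \odot d_2$ and the concentration bound (18) can be sanity-checked numerically over $\mathbb{F}_2$. The Python sketch below is our own helper code (not from the paper): it draws uniformly random subspaces as row bases, computes intersection dimensions by rank, and checks that exceedances of $d_1 \odot d_2$ are as rare as Eq. (18) predicts.

```python
import random

def gf2_rank(rows):
    """Rank over F_2 of a list of integer-bitmask row vectors
    (Gaussian elimination keyed by leading bit)."""
    pivots = {}
    for r in rows:
        while r:
            h = r.bit_length() - 1
            if h not in pivots:
                pivots[h] = r
                break
            r ^= pivots[h]
    return len(pivots)

def random_subspace(d, m, rng):
    """Uniformly random d-dimensional subspace of F_2^m, as a row basis."""
    while True:
        rows = [rng.randrange(1, 1 << m) for _ in range(d)]
        if gf2_rank(rows) == d:
            return rows

def intersection_dim(rows1, rows2):
    # dim(V1 ∩ V2) = d1 + d2 - dim(V1 + V2)
    return len(rows1) + len(rows2) - gf2_rank(rows1 + rows2)

rng = random.Random(1)
m, d1, d2 = 20, 8, 8          # d1 + d2 < m, hence d1 ⊙ d2 = 0
exact = 0
for _ in range(200):
    V1 = random_subspace(d1, m, rng)
    V2 = random_subspace(d2, m, rng)
    dim = intersection_dim(V1, V2)
    assert dim >= max(0, d1 + d2 - m)      # deterministic lower bound
    exact += (dim == max(0, d1 + d2 - m))
# Eq. (18) with k = 1 gives P{dim > 0} <= 2^{-1-(m-d1-d2)} = 1/32,
# so the vast majority of trials should sit exactly at d1 ⊙ d2
assert exact >= 160
```

Sampling $d$ uniform vectors conditioned on linear independence does yield a uniformly random $d$-dimensional subspace, since every subspace has the same number of ordered bases.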
By the union bound, with probability at least $1 - n\, q^{-k/n}$ we have $|\dim(W_n) - (d \odot \dim(W_{n-1}))| \le k/n$ and $|\dim(W_i) - (d_i \boxplus \dim(W_{i-1}))| \le k/n$ for $i \in \{2, \dots, n-1\}$. The thesis follows by the triangle inequality.

Proof of Lemma 4.2. We will first prove that there exists a coupling between $D^{(t)}$ and $\xi^{(t)}$ such that $|D^{(t)} - (\ell\omega)\xi^{(t)}| \le \ell\varepsilon$ with high probability as $\ell, m \to \infty$ (with $\lambda, \omega$ fixed). Subsequently, we shall prove that this claim implies the thesis.

The coupling is constructed recursively. For $t = 0$ we have $D^{(0)} = (\ell\omega)\xi^{(0)} = \ell\omega$ deterministically. This defines the coupling of $D^{(t)}$ and $\xi^{(t)}$ for $t = 0$. Assume we have shown how to construct a coupling of $D^{(t)}$ and $\xi^{(t)}$ for some $t \in \mathbb{N}$. To define the coupling for $t + 1$, we draw an integer $n$ with distribution $\rho_n$. We then generate $n - 1$ coupled pairs $(D^{(t)}_i, \xi^{(t)}_i)$. From those we generate a coupled pair $(D^{(t+1)}, \xi^{(t+1)})$ via the recursions (11) and (13), respectively.

In order to prove the claim it is sufficient to show the following. If $V_1, \dots, V_{n-1}$ are uniformly random subspaces of dimensions $(\ell\omega)\xi_1, \dots, (\ell\omega)\xi_{n-1}$ in $\mathbb{F}_q^m$, and if $V$ has dimension $\ell\omega$, then, with high probability, $|\dim((V_1 + \cdots + V_{n-1}) \cap V) - (\ell\omega)\xi| \le \ell\varepsilon$ for any $\varepsilon > 0$. This in turn follows from Proposition 4.4 (Eq. (20)) together with the observation that the degree $n$ is bounded.

Let us now consider the thesis of the lemma, Eq. (14). We can assume without loss of generality that $n_{\max} \ge 1$ and $m > \ell\omega$, whence $1 - \lambda > \lambda\omega$ follows. Let $n_{\max} \ge 2$ be the largest integer in the support of $\rho_n$ and take $\varepsilon > 0$ small enough so that $2(n_{\max} - 1)\varepsilon \le (1-\lambda)/(\lambda\omega) - 1 - \gamma$ for some $\gamma > 0$. Draw $n_{\max}$ iid copies of $D^{(t)}$, denoted $D^{(t)}_1, \dots, D^{(t)}_{n_{\max}}$.
Since under the coupling $|D^{(t)} - (\ell\omega)\xi^{(t)}| \le \ell\varepsilon$ with high probability,
\[ \mathbb{P}\Big\{ \max\big\{ D^{(t)}_1, \dots, D^{(t)}_{n_{\max}} \big\} \ge 2\varepsilon(\ell\omega) \Big\} \le n_{\max}\, \mathbb{P}\{ \xi^{(t)} \ge \varepsilon \} + o_m(1). \qquad (22) \]
Now draw $n$ with distribution $\rho_n$ and $D^{(t+1)}$ conditional on $D^{(t)}_1, \dots, D^{(t)}_{n-1}$ according to the kernel (10). Namely, $D^{(t+1)}$ is the dimension of $V \cap (V_1 + \cdots + V_{n-1})$ when $\dim(V) = \ell\omega$ and $V_1, \dots, V_{n-1}$ are uniformly random subspaces of $\mathbb{F}_q^m$ with dimensions $D^{(t)}_1, \dots, D^{(t)}_{n-1}$. Let $W \equiv V_1 + \cdots + V_{n-1}$. Then $W$ is uniformly random conditioned on its dimension, and $\dim(W) \le D^{(t)}_1 + \cdots + D^{(t)}_{n-1} \le \ell(1 - \lambda - \lambda\omega)/\lambda - \ell\omega\gamma$ with probability lower bounded as in Eq. (22). Assume this to be the case. By Proposition 4.4, Eq. (18), and recalling that $m = \ell(1-\lambda)/\lambda$, the probability that $D^{(t+1)} = \dim(V \cap W) > 0$ is at most $q^{-\ell\gamma\omega}$. This proves the thesis.

In order to prove our last auxiliary result, Lemma 4.3, we need some algebraic properties of the edge-perspective capacity-achieving degree distribution and of the corresponding generating function:
\[ \rho^*_k(x) = \sum_{i=k+1}^{\infty} \frac{k-1}{(i-1)(i-2)}\, x^{i-1} \equiv \sum_{i} \rho^*_{k,i}\, x^{i-1}. \qquad (23) \]

Lemma 4.5 (Basic Properties of Capacity-Achieving Degree Distribution). Let $k \in \mathbb{N}$ and define $f_{k,i}(\alpha) = \sum_{j=k}^{i-1} \binom{i-1}{j} \alpha^j (1-\alpha)^{i-1-j}$. Then $\rho^*_k(1) = 1$, $\mathrm{d}\rho^*_k(x)/\mathrm{d}x\,|_{x=1} \ge k$, $\int_0^1 \rho^*_k(x)\,\mathrm{d}x = 1/(2k)$, and $\sum_i \rho^*_{k,i} f_{k,i}(\alpha) = \alpha$.

Proof. By a reordering of the terms in the sum,
\[ \rho^*_k(1) = \lim_{j\to\infty} \sum_{i=k+1}^{k+j} \frac{k-1}{(i-1)(i-2)} = \lim_{j\to\infty} \sum_{i=k+1}^{k+j} \Big( \frac{k-1}{i-2} - \frac{k-1}{i-1} \Big) = \lim_{j\to\infty} \Big( 1 - \frac{k-1}{k+j-1} \Big) = 1. \]
In a similar manner, we have
\[ \int_0^1 \rho^*_k(x)\,\mathrm{d}x = \lim_{j\to\infty} \sum_{i=k+1}^{k+j} \frac{k-1}{i(i-1)(i-2)} = \lim_{j\to\infty} \sum_{i=k+1}^{k+j} \Big( \frac{k-1}{2(i-1)(i-2)} - \frac{k-1}{2i(i-1)} \Big) = \frac{1}{2k}. \]
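These series manipulations can be checked numerically by truncation. The Python sketch below (the truncation point $M$ is an arbitrary illustrative choice) verifies $\rho^*_k(1) = 1$ and $\int_0^1 \rho^*_k = 1/(2k)$, and also the identity $\sum_i \rho^*_{k,i} f_{k,i}(\alpha) = \alpha$, each up to $O(1/M)$ truncation error.

```python
from math import comb

k, M = 5, 400                      # M is an illustrative truncation point

def rho_star(i):                   # rho*_{k,i} for i >= k+1, cf. Eq. (23)
    return (k - 1) / ((i - 1) * (i - 2))

def f(i, a):                       # f_{k,i}(alpha) = P{Bin(i-1, alpha) >= k}
    return sum(comb(i - 1, j) * a**j * (1 - a)**(i - 1 - j)
               for j in range(k, i))

total    = sum(rho_star(i) for i in range(k + 1, M + 1))
integral = sum(rho_star(i) / i for i in range(k + 1, M + 1))
ident    = sum(rho_star(i) * f(i, 0.3) for i in range(k + 1, M + 1))

# the neglected tails are O(1/M), so the sums sit just below their limits
assert abs(total - 1.0) < 0.02               # rho*_k(1) = 1
assert abs(integral - 1 / (2 * k)) < 1e-3    # integral = 1/(2k)
assert abs(ident - 0.3) < 0.02               # sum_i rho*_{k,i} f_{k,i}(0.3)
```

Note that $f_{k,i}(\alpha)$ is exactly the upper tail of a Binomial$(i-1, \alpha)$ distribution, which is how the code evaluates it.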
The claim $\mathrm{d}\rho^*_k(x)/\mathrm{d}x\,|_{x=1} \ge k$ follows since $\rho^*_k(1) = 1$, $\rho^*_{k,i} \ge 0$, and since $\rho^*_k(x)$ only contains powers of $x$ of degree at least $k$. In order to prove the last assertion we recall the identity [15]
\[ \sum_{n=i}^{\infty} \binom{n}{i} x^n = \frac{x^i}{(1-x)^{i+1}}. \qquad (24) \]
We then obtain (here $\bar\alpha \equiv 1 - \alpha$):
\[ \sum_i \rho^*_{k,i} f_{k,i}(\alpha) = \sum_{i=k+1}^{\infty} \frac{k-1}{(i-1)(i-2)} \sum_{j=k}^{i-1} \binom{i-1}{j} \alpha^j \bar\alpha^{i-1-j} = (k-1) \sum_{j=k}^{\infty} \alpha^j \sum_{i=j+1}^{\infty} \binom{i-1}{j} \frac{\bar\alpha^{i-1-j}}{(i-1)(i-2)} = (k-1) \sum_{j=k}^{\infty} \frac{\alpha}{j(j-1)} = \alpha, \]
where we applied the identity obtained by integrating Eq. (24) twice with respect to $x$.

Proof of Lemma 4.3. Let $k = (1-\lambda)/(\lambda\omega)$, $k \in \mathbb{N}$. Then $C(\lambda, \omega)/((1-\omega)(1-\lambda)) = 1 - 1/k$. It is clear from the recursive definition (13) together with the initial condition $\xi^{(0)} = 1$ that, for any $t \ge 0$, $\xi^{(t)}$ only takes values 0 and 1. Let $\alpha_t \equiv \mathbb{P}\{\xi^{(t)} = 1\}$. Then $\alpha_0 = 1$, and Eq. (13) implies that
\[ \alpha_{t+1} = \sum_{n=k+1}^{\infty} \rho_n f_{k,n}(\alpha_t) \equiv F_{k,\rho}(\alpha_t), \qquad (25) \]
where $f_{k,n}(\alpha)$ is defined as in the statement of Lemma 4.5 (note that $f_{k,k}(\alpha) \equiv 0$). We claim that for any $r < 1 - 1/k$ there exists an edge-perspective degree distribution $\rho$ of bounded support such that: (i) $1 - 2\int_0^1 \rho(x)\,\mathrm{d}x \ge r$; (ii) $F_{k,\rho}(\alpha) < \alpha$ for any $\alpha \in (0, 1]$; (iii) $F_{k,\rho}(\alpha) = O(\alpha^k)$ as $\alpha \downarrow 0$. The lemma then follows by standard calculus, with $\gamma \in (1, k)$ and $A$ sufficiently small.

In order to exhibit such a degree distribution, fix $b \in \mathbb{N}$, $b \ge k$, and define $\rho(x) = \sum_{i=k}^{b} \rho_i x^{i-1}$, where $\rho_i = 0$ except for $\rho_i = \rho^*_{k,i}$, $i = k+1, \dots, b$, and $\rho_k = 1 - \sum_{i=k+1}^{b} \rho^*_{k,i}$. Then
\[ \int_0^1 \rho(x)\,\mathrm{d}x = \sum_{i=k}^{b} \rho_i / i = \sum_{i=k}^{b} \rho^*_{k,i}/i + \sum_{i=b+1}^{\infty} \rho^*_{k,i}/k. \qquad (26) \]
By Lemma 4.5 the right-hand side converges to $1/(2k)$ as $b \to \infty$. Therefore we can choose $b$ large enough so that claim (i) above is fulfilled.
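The truncated distribution and the recursion (25) are easy to simulate. The sketch below (Python; $k = 5$ and $b = 30$ are illustrative parameters, not optimized values) builds $\rho$ as above, checks claim (ii) on a grid, and iterates the density evolution, which collapses to zero after a handful of iterations.

```python
from math import comb

k, b = 5, 30                      # illustrative parameters with b >= k

# truncated edge-perspective distribution: rho_i = rho*_{k,i} for k+1..b,
# with the remaining mass moved down to degree k (where f_{k,k} = 0)
rho = {i: (k - 1) / ((i - 1) * (i - 2)) for i in range(k + 1, b + 1)}
rho[k] = 1.0 - sum(rho.values())

def f(i, a):                      # f_{k,i}(alpha) = P{Bin(i-1, alpha) >= k}
    return sum(comb(i - 1, j) * a**j * (1 - a)**(i - 1 - j)
               for j in range(k, i))

def F(a):                         # F_{k,rho}(alpha), Eq. (25)
    return sum(p * f(i, a) for i, p in rho.items())

# claim (ii): F(alpha) < alpha on a grid of (0, 1]
for j in range(1, 11):
    a = j / 10
    assert F(a) < a

# the density-evolution iterates collapse to zero very quickly
alpha = 1.0
for _ in range(30):
    alpha = F(alpha)
assert alpha < 1e-12
```

Once $\alpha_t$ is small the map behaves like $F_{k,\rho}(\alpha) \approx c\,\alpha^k$ (claim (iii)), so the iterates decay doubly exponentially, which is the mechanism behind the $t_*(\pi) = O(\log\log(1/\pi))$ iteration count.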
Consider now claim (ii). We write
\[ F_{k,\rho}(\alpha) = \sum_{i=k+1}^{b} \rho_i f_{k,i}(\alpha) = \sum_{i=k+1}^{\infty} \rho^*_{k,i} f_{k,i}(\alpha) - \sum_{i=b+1}^{\infty} \rho^*_{k,i} f_{k,i}(\alpha) = \alpha - \sum_{i=b+1}^{\infty} \rho^*_{k,i} f_{k,i}(\alpha), \]
where the last identity follows from Lemma 4.5. The claim is implied by the remark that $f_{k,i}(\alpha) > 0$ for $i \ge k+1$ and $\alpha \in (0, 1]$. Finally, claim (iii) is a consequence of the fact that $f_{k,i}(\alpha) = \binom{i-1}{k} \alpha^k + O(\alpha^{k+1})$ together with $i \le b$.

4.3 Capacity

Proof of Proposition 3.1. By standard information-theoretic arguments [3], the channel information capacity is given by
\[ C(\omega, \lambda) = \lim_{N\to\infty,\, \ell = N\lambda} \frac{1}{N\ell} \sup_{P_X} I(X; Y). \qquad (27) \]
Here $I(X;Y) = \sum_{x,y} P_{X,Y}(x,y) \log\{ P_{X,Y}(x,y)/(P_X(x) P_Y(y)) \}$ is the mutual information between $X$ and $Y$, and the supremum is taken over all possible input distributions. Writing the mutual information in terms of entropy and conditional entropy, and using our channel model (2), we have $I(X;Y) = H(Y) - H(Y|X) = H(Y) - H(Z)$. Since $H(Z)$ does not depend on the input distribution, the mutual information is maximized when the latter is uniform. This implies that the output is uniform as well, and we get $H(Y) = \log(q^{m\ell})$.

Finally, $H(Z)$ is the logarithm of the number $A(s, \ell, m)$ of $\ell \times m$ matrices of rank $\mathrm{rank}(Z) = \ell\omega \equiv s$. We have $A(s, \ell, m) = q^{m\ell}\, \mathbb{P}_0\{\mathrm{rank}(Z) = s\}$, where $\mathbb{P}_0$ denotes probability with respect to a uniformly random matrix $Z$. Assume without loss of generality that $\ell, m \ge s$. If $z_1, \dots, z_\ell$ are the rows of $Z$, then the first $s$ rows are linearly independent with probability $(1 - q^{-\ell})(1 - q^{-\ell+1}) \cdots (1 - q^{-\ell+s}) \ge 1 - s q^{-\ell+s}$. Therefore
\[ A(s, \ell, m) \ge q^{m\ell}\, \mathbb{P}\{ z_{s+1}, \dots, z_\ell \in \langle z_1, \dots, z_s \rangle,\ \mathrm{rank}(z_1, \dots, z_s) = s \} \ge q^{m\ell}\, q^{-(\ell-s)(m-s)} (1 - s q^{-\ell+s}). \qquad (28) \]
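For small parameters, $A(s, \ell, m)$ can be computed by exhaustive enumeration and compared against this lower bound. The Python sketch below enumerates all $4 \times 4$ matrices over $\mathbb{F}_2$ (illustrative sizes of our choosing); the exact value $7350$ is the standard Gaussian-binomial count, which the brute force reproduces.

```python
from itertools import product

def gf2_rank(rows):
    """Rank over F_2 of integer-bitmask row vectors."""
    pivots = {}
    for r in rows:
        while r:
            h = r.bit_length() - 1
            if h not in pivots:
                pivots[h] = r
                break
            r ^= pivots[h]
    return len(pivots)

q, ell, m, s = 2, 4, 4, 2
# exhaustive census of all q^(ell*m) matrices, rows stored as m-bit masks
count = 0
for rows in product(range(1 << m), repeat=ell):
    if gf2_rank(list(rows)) == s:
        count += 1

# exact count: [4 choose 2]_2^2 * |GL_2(F_2)| = 35 * 35 * 6 = 7350
assert count == 7350
# lower bound q^{m*ell - (ell-s)(m-s)} (1 - s q^{-ell+s}) = 2048
lower = q**(m*ell - (ell - s)*(m - s)) * (1 - s * q**(-ell + s))
assert count >= lower
```

The brute-force count sits well above the bound, as expected: the bound only accounts for matrices whose first $s$ rows already span the row space.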
On the other hand, $\mathbb{P}_0\{\mathrm{rank}(Z) = s\}$ is upper bounded by summing, over all subsets of $s$ rows (there are $\binom{\ell}{s} \le 2^\ell$ such subsets), the probability that those rows are independent and that the other rows lie in the span they generate. Such an upper bound is at most $2^\ell$ larger than the above lower bound. By taking $N \to \infty$ with $\ell = N\lambda = N - m$, $\lambda \in (0, 1)$ and $\omega \in (0, \min(1, (1-\lambda)/\lambda))$, we get
\[ H(Z) = \log A(s, \ell, m) = N\ell(\omega - \omega^2\lambda) + O(N). \qquad (29) \]
Therefore $I(X;Y) = H(Y) - H(Z) = N\ell(1 - \lambda - \omega + \omega^2\lambda) + O(N)$, whence the thesis follows.

References

[1] R. Ahlswede, N. Cai, S.-Y. R. Li, and R. W. Yeung, Network information flow, IEEE Trans. Inform. Theory, 46 (2000), pp. 1204-1216.

[2] P. A. Chou, Y. Wu, and K. Jain, Practical network coding, in Proc. of the Allerton Conf. on Commun., Control, and Computing, Monticello, IL, USA, 2003.

[3] T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley, New York, NY, USA, 1991.

[4] C. Fragouli and E. Soljanin, Network Coding Fundamentals, vol. 2 of Foundations and Trends in Networking, NOW, Delft, Holland, 2007.

[5] R. G. Gallager, Low-Density Parity-Check Codes, MIT Press, Cambridge, MA, USA, 1963.

[6] T. Ho, R. Kötter, M. Medard, D. R. Karger, and M. Effros, The benefits of coding over routing in a randomized setting, in Proc. of the IEEE Int. Symposium on Inform. Theory, Yokohama, Japan, 2003, p. 442.

[7] R. Kötter and F. R. Kschischang, Coding for errors and erasures in random network coding. Submitted, Nov. 2007.

[8] R. Kötter and M. Medard, An algebraic approach to network coding. Submitted, Feb. 2004.

[9] S.-Y. R. Li, R. W. Yeung, and N. Cai, Linear network coding, IEEE Trans. Inform. Theory, 49 (2003), pp. 371-381.

[10] M. Luby, M. Mitzenmacher, A. Shokrollahi, D. A. Spielman, and V. Stemann, Practical loss-resilient codes, in Proc. of the 29th Annual ACM Symposium on Theory of Computing, 1997, pp. 150-159.

[11] V. Rathi and R. Urbanke, Density evolution, threshold and the stability condition for non-binary LDPC codes, IEE Proc. Commun., 152 (2005), pp. 1069-1074.

[12] T. Richardson and R. Urbanke, Efficient encoding of low-density parity-check codes, IEEE Trans. Inform. Theory, 47 (2001), pp. 638-656.

[13] T. Richardson and R. Urbanke, Modern Coding Theory, Cambridge University Press, 2007. In preparation.

[14] D. Silva, R. Kötter, and F. R. Kschischang, A rank-metric approach to error control in random network coding. Submitted, Nov. 2007.

[15] H. S. Wilf, Generatingfunctionology, Academic Press, 2nd ed., 1994.

[16] R. W. Yeung, S.-Y. R. Li, N. Cai, and Z. Zhang, Network Coding Theory: Multiple Sources, vol. 2 of Foundations and Trends in Communications and Information Theory, NOW, Delft, Holland, 2005.

[17] R. W. Yeung, S.-Y. R. Li, N. Cai, and Z. Zhang, Network Coding Theory: Single Sources, vol. 2 of Foundations and Trends in Communications and Information Theory, NOW, Delft, Holland, 2005.