Exchange of Limits: Why Iterative Decoding Works


Authors: **Satish Babu Korada, Rüdiger Urbanke**

Abstract: We consider communication over binary-input memoryless output-symmetric channels using low-density parity-check codes and message-passing decoding. The asymptotic (in the length) performance of such a combination for a fixed number of iterations is given by density evolution. Letting the number of iterations tend to infinity we get the density evolution threshold, the largest channel parameter so that the bit error probability tends to zero as a function of the iterations. In practice we often work with short codes and perform a large number of iterations. It is therefore interesting to consider what happens if in the standard analysis we exchange the order in which the blocklength and the number of iterations diverge to infinity. In particular, we can ask whether both limits give the same threshold. Although empirical observations strongly suggest that the exchange of limits is valid for all channel parameters, we limit our discussion to channel parameters below the density evolution threshold. Specifically, we show that under some suitable technical conditions the bit error probability vanishes below the density evolution threshold regardless of how the limit is taken.

Index Terms: LDPC, sparse graph code, density evolution

I. INTRODUCTION

A. Motivation

Consider transmission over a binary-input memoryless output-symmetric (BMS) channel using a low-density parity-check (LDPC) code and decoding via a message-passing (MP) algorithm. We refer the reader to [1] for an introduction to the standard notation and an overview of the known results. It is well known that, for good choices of the degree distribution and the MP decoder, one can achieve rates close to the capacity of the channel with low decoding complexity [2].
The standard analysis of iterative decoding systems assumes that the blocklength is large (tending to infinity) and that a fixed number of iterations is performed. As a consequence, when decoding a given bit, the output of the decoder only depends on a fixed-size local neighborhood of this bit, and this local neighborhood is tree-like. This local tree property implies that the messages arriving at nodes are conditionally independent, significantly simplifying the analysis. To determine the performance in this setting, we track the evolution of the message densities as a function of the iteration. This process is called density evolution (DE). Denote the bit probability of error of a code G after ℓ iterations by P_b(G, ε, ℓ), where ε is the channel parameter. Then DE computes

  lim_{n→∞} E[P_b(G, ε, ℓ)].   (1)

If we now perform more and more iterations then we get a limiting performance corresponding to

  lim_{ℓ→∞} lim_{n→∞} E[P_b(G, ε, ℓ)].   (2)

(Authors' affiliation: EPFL, School of Computer and Communication Sciences, Lausanne, CH-1015, Switzerland, {satish.korada, ruediger.urbanke}@epfl.ch.)

In order for the computation graphs of depth ℓ to form a tree, the number of iterations cannot exceed c log(n), where c is a constant that only depends on the degree distribution. (For an (l, r)-regular degree distribution pair a valid choice of c is c(l, r) = 1/(2 log((l−1)(r−1))), [3].) In practice, this condition is rarely fulfilled: standard blocklengths measure only in the hundreds or thousands, but the number of iterations that have been observed to be useful in practice can easily exceed one hundred.

Consider therefore the situation where we fix the blocklength but let the number of iterations tend to infinity. This means, we consider the limit

  lim_{ℓ→∞} E[P_b(G, ε, ℓ)].   (3)

Now take the blocklength to infinity, i.e., consider

  lim_{n→∞} lim_{ℓ→∞} E[P_b(G, ε, ℓ)].
  (4)

What can we say about (4) and its relationship to (2)? Consider the belief propagation (BP) algorithm. It was shown by McEliece, Rodemich, and Cheng [4] that one can construct specific graphs and noise realizations so that the messages on a specific edge either show a chaotic behavior (as a function of the iteration) or converge to limit cycles. In particular, this means that the messages do not converge as a function of the iteration. For a fixed length and a discrete channel, the number of graphs and noise realizations is finite. Therefore, if for a single graph and noise realization the messages do not converge as a function of ℓ, then it is likely that lim_{ℓ→∞} E[P_b(G, ε, ℓ)] also does not converge as a function of n (unless by some miracle the various non-converging parts cancel). Let us therefore consider limsup_{ℓ→∞} E[P_b(G, ε, ℓ)] and liminf_{ℓ→∞} E[P_b(G, ε, ℓ)]. What happens if we increase the blocklength and consider lim_{n→∞} limsup_{ℓ→∞} E[P_b(G, ε, ℓ)] and lim_{n→∞} liminf_{ℓ→∞} E[P_b(G, ε, ℓ)]?

We restrict our present study to the exchange of limits below the density evolution threshold. I.e., suppose that the given combination (of the channel family and the MP decoder) has a threshold in the following sense: for the given channel family characterized by the real-valued parameter ε there exists a threshold ε^MP so that for all 0 ≤ ε < ε^MP the DE limit (2) is 0, whereas for all ε > ε^MP it is strictly positive. We will show that under suitable technical conditions the bit error probability also tends to zero if we exchange the limits. This implies that the DE threshold is a meaningful and robust design parameter.

B. Summary of Main Result

Consider transmission over a BMS channel parametrized by ε, using an LDPC(n, l, r) ensemble and decoding via an MP algorithm. Assume that the algorithm is symmetric in the sense of [1, Definition 4.81, p.
209]. Moreover, assume that this combination has a threshold and let ε^MP denote this threshold. If ε < ε^MP then under the conditions stated in Sections II and III,

  lim_{n→∞} limsup_{ℓ→∞} E[P_b^MP(G, ε, ℓ)] = 0.

Instead of considering just an exchange of limits one can consider joint limits where the iteration is an arbitrary but increasing function of the blocklength, i.e., one can consider lim_{n→∞} E[P_b^MP(G, ε, ℓ(n))]. Our arguments extend to this case and one can show that limsup_{n→∞} E[P_b^MP(G, ε, ℓ(n))] = 0. But for the sake of simplicity we restrict ourselves to the standard exchange of limits discussed above. In the same spirit, although some of the techniques and statements we discuss extend directly to the irregular case, in order to keep the exposition simple we restrict our discussion to the standard regular ensemble LDPC(n, l, r).

C. Outline

We introduce two techniques that are useful in our context. First, we consider expanders. More precisely, in Section II we show that for codes with sufficient expansion the exchange of limits is valid below the DE threshold. The advantage of using expansion is that the argument applies to a wide variety of decoders. On the negative side, the argument can only be applied to ensembles with large variable-node degrees.

Why does expansion help in proving the desired result, and why do we need large variable-node degrees? Assume that a sufficient number of iterations has been performed so that the number of still erroneous messages is relatively small. Consider further iterations. There are two reasons why a message emitted by a variable node can be bad: this can be due to the received value, or it can be due to a large number of bad incoming messages.
If the degree of the variable node is large then the received value becomes less and less important (think of a node of degree 1000 and a decoder with a finite number of messages; in this case the received value has only a limited influence on the outgoing message, and this message is mostly determined by the 999 incoming messages). If we therefore ignore the received message, then we see that expansion helps since it can guarantee that only few nodes have many bad incoming messages; otherwise the set of nodes that has bad outgoing messages has too few neighbors for the graph to be an expander. If the variable nodes have small degree, then the received values play a significant role and can no longer be ignored. Therefore, for small degrees expansion arguments do not suffice by themselves.

In Section III we concentrate on the case l = 3. This is the smallest degree that is meaningful for all the decoders that we consider, and so one can think of it as the most difficult general case. Except for the BEC, this case is not covered by a simple expansion argument and the techniques are more involved.

II. SUFFICIENT CONDITIONS BASED ON EXPANSION ARGUMENTS

Burshtein and Miller were the first to realize that expansion arguments can be applied not only to the flipping algorithm but also to show that certain MP algorithms have a fixed error-correcting radius [5]. Although their results can be applied directly to our problem, we get stronger statements by using the expansion in a slightly different manner.

A. Definitions and Review

Definition 1 (Expansion): Let G be an element of LDPC(n, l, r).

1) Left Expander: The graph G is an (l, r, α, γ) left expander if for every subset V of at most αn variable nodes, the set of check nodes that are connected to V has size at least γ|V|l.

2) Right Expander: Let m = n l/r.
The graph G is an (l, r, α, γ) right expander if for every subset C of at most αm check nodes, the set of variable nodes that are connected to C has size at least γ|C|r. ♦

Why are we using expansion arguments in the context of standard LDPC ensembles? It is well known that such codes are good expanders with high probability [5].

Theorem 2 (Expansion of Random Graphs [5]): Let G be chosen uniformly at random from LDPC(n, l, r). Let α_max be the positive solution of the equation

  ((l−1)/l) h₂(α) − (l/r) h₂(αγr) − αγr h₂(1/(γr)) = 0.

Let X(l, r, α, γ) denote the set of graphs {G ∈ LDPC(n, l, r) : G is an (l, r, α, γ) left expander}. If γ < 1 − 1/l then α_max is strictly positive, and for α < α_max

  P{G ∈ X(l, r, α, γ)} ≥ 1 − O(n^{−(l(1−γ)−1)}).   (5)

Let m = n l/r. We get the equivalent result for right expanders by exchanging the roles of l and r as well as n and m.

As explained before, the idea is to show that the error probability goes to zero once the number of bad messages becomes smaller than a certain threshold. To make this more concrete we need a proper definition of "good" message subsets.

Definition 3 (Good Message Subsets): For a fixed (l, r)-regular ensemble and a fixed MP decoder with message alphabet M, let β, 0 < β ≤ 1, be such that β(l−1) ∈ N. A "good" pair of subsets of M of "strength" β is a pair of subsets (G_v, G_c) so that

• if at least β(l−1) of the (l−1) incoming messages at a variable node belong to G_v then the outgoing message on the remaining edge is in G_c;

• if all the (r−1) incoming messages at a check node belong to G_c then the outgoing message on the remaining edge is in G_v;

• if at least β(l−1) + 1 of all l incoming messages belong to G_v, then the variable is decoded correctly.

We denote the probability of the bad message set M \ G_v after ℓ iterations of DE by p_bad^(ℓ).
♦

As we will see shortly, for many MP decoders of interest the sets G_v and G_c can be chosen to be equal. This is true for all those MP decoders where the outgoing reliability at a check node is equal to the least reliability of all the incoming messages (we call them min-sum-type decoders). Therefore, if all incoming messages are good (meaning they are correct and have sufficiently large reliability) then the outgoing message is correct and also has sufficiently large reliability. The BP decoder is an interesting case where G_v ≠ G_c. For this decoder the reliability of the outgoing message at a check node is strictly smaller than the smallest reliability of all incoming messages. Therefore, we need to define the set G_c to consist of messages of strictly higher reliability than the set of messages in G_v.

Definition 4 (Good Nodes): We call a variable or check node "good" if all of its outgoing messages are good. All other nodes are called "bad." ♦

Example 5 (BEC and BP): If at least 1 of the (l−1) messages entering a variable node is known then the outgoing message is known, and if at least 1 of the l messages entering a variable node is known then the variable itself is known. Further, if all of the (r−1) incoming messages entering a check node are known then the outgoing message is known. We conclude that good is equivalent to known and that β = 1/(l−1). ♦

As a second standard example we consider transmission over the BSC(ε) and decoding via the so-called Gallager Algorithm B (GalB).

Definition 6 (Gallager Algorithm B): Messages are elements of {±1}. The initial messages from the variable nodes to the check nodes are the values received via the channel.
The decoding process proceeds in iterations with the following processing rules:

Check-Node Processing: At a check node the outgoing message along a particular edge is the product of the incoming messages along all the remaining edges.

Variable-Node Processing: At a variable node the outgoing message along a particular edge is equal to the majority vote on the set of other incoming messages and the received value. Ties are resolved randomly. ♦

Example 7 (BSC and GalB): Assume that the received value (via the channel) is incorrect. In this case at least ⌈(l−1)/2⌉ + 1 of the (l−1) incoming messages should be correct to ensure that the outgoing message is correct. If at least ⌈(l−1)/2⌉ + 2 of the l incoming messages are correct then the variable is decoded correctly. (In fact, it is sufficient to have ⌊(l−1)/2⌋ + 2 correct incoming messages to be able to decode correctly.) Therefore, good is equivalent to correct and β = (⌈(l−1)/2⌉ + 1)/(l−1). ♦

B. Expansion and Bit Error Probability

Theorem 8 (Expansion and Bit Error Probability): Consider an LDPC(n, l, r) ensemble, transmission over a BMS(ε) channel, and a symmetric MP decoder. Let β be the strength of the good message subset. If β < 1 and if for some ε, p_bad^(∞) = 0, then

  lim_{n→∞} limsup_{ℓ→∞} E_{LDPC(n, l, r)}[P_b^MP(G, ε, ℓ)] = 0.   (6)

Proof: Here is the idea of the proof: we first run the MP algorithm for a fixed number of iterations such that the bit error probability is sufficiently small, say p. If the length n is sufficiently large then we can use DE to gauge the number of required iterations. Then, using the expansion properties of the graph, we show that the probability of error stays close to p for any number of further iterations. In particular, we show that the error probability never exceeds cp, where c is a constant which only depends on the degree distribution and β.
Since p can be chosen arbitrarily small, the claim follows. Here is the fine print. Define

  γ = (1 − 1/l) · (1 + β)/2  <  1 − 1/l,   (7)

where the inequality holds since β < 1. Let 0 < α < α_max(γ), where α_max(γ) is the function defined in Theorem 2. Let p = α(1−β)(l−1)/4 and let ℓ(p) be the number of iterations such that p_bad^(ℓ) ≤ p. Since p_bad^(∞) = 0 and p > 0 this is possible. Let P_e(G, E, ℓ) denote the fraction of messages belonging to the bad set after ℓ iterations. Let Ω denote the space of code and noise realizations. Let A ⊆ Ω denote the subset

  A = {(G, E) ∈ Ω | P_e(G, E, ℓ(p)) ≤ 2p}.   (8)

From the Concentration Theorem 39 we know that

  P{(G, E) ∉ A} ≤ 2 e^{−Knp²}   (9)

for some strictly positive constant K = K(l, r, p). In words, for most (sufficiently large) graphs and noise realizations the error probability after a fixed number of iterations behaves close to the asymptotic ensemble average. We now show that once the error probability is sufficiently small it never increases substantially thereafter if the graph is an expander, regardless of how many iterations we still perform.

Let V_0 ⊆ [n] be the initial set of bad variable nodes. More precisely, V_0 is the set of all variable nodes that are bad in the ℓ(p)-th iteration. We claim that |V_0| ≤ 2p/(l − β(l−1)) · n. (This is because for a variable to send a bad message it must have at least l − β(l−1) incoming bad messages.) As we just discussed, for most graphs and noise realizations this is the case. As a worst case we assume that all its outgoing edges are bad. Let the set of check nodes connected to V_0 be C_0. These are the only check nodes that can potentially send bad messages in the next iteration. Therefore, we call C_0 the initial set of bad check nodes. Clearly,

  |C_0| ≤ l|V_0|.
  (10)

Consider a variable node and a fixed edge e connected to it: the outgoing message along e is determined by the received value as well as by the (l−1) incoming messages along the other (l−1) edges. Recall that if β(l−1) of those messages are good then the outgoing message along edge e is good. Therefore, if a variable node has β(l−1) + 1 good incoming messages, then all outgoing messages are good. We conclude that for a variable node to be bad, at least l − β(l−1) incoming messages must be bad. Therefore, it must connect to at least l − β(l−1) bad check nodes. This leaves at most β(l−1) edges that are connected to new check nodes.

We want to count the number of bad variables that are created in any of the future iterations. For convenience, once a variable becomes bad we will consider it to be bad for all future iterations. This implies that the set of bad variables is non-decreasing. Let us now bound the number of bad variable nodes by the following process. The process proceeds in discrete steps. At each step t, consider the set of variables that are not contained in V_t but that are connected to at least l − β(l−1) check nodes in C_t (the set of "bad" check nodes). If at time t no such variable exists, stop the process. Otherwise, choose one such variable at random and add it to V_t. This gives us the set V_{t+1}. We also add all neighbors of this variable to C_t. This gives us the set C_{t+1}. By this we are adding the variable nodes that can potentially become bad and the check nodes that can potentially send bad messages to V_t and C_t, respectively. As discussed above, for a good variable to become bad it must be connected to at least l − β(l−1) check nodes that are connected to bad variable nodes. Therefore, at most β(l−1) new check nodes are added in each step.
Hence, if the process continues then

  |V_{t+1}| = |V_t| + 1,   (11)
  |C_{t+1}| ≤ |C_t| + β(l−1).   (12)

By assumption, the graph is an element of X(l, r, α, γ). Initially we have

  |V_0| ≤ 2p/(l − β(l−1)) · n = α(l−1)(1−β)/(2(l − β(l−1))) · n ≤ αn.

Therefore, as long as |V_t| ≤ αn,

  γl|V_t| ≤ |C_t|,   (13)

since C_t contains all neighbors of V_t. Let T denote the stopping time of the process, i.e., the smallest time at which no new variable can be added to V_t. We will now show that the stopping time is finite. We have

  γl(|V_0| + t) = γl|V_t|  (by (11))
    ≤ |C_t|  (by (13))
    ≤ |C_0| + tβ(l−1)  (by (12))
    ≤ l|V_0| + tβ(l−1)  (by (10)).

Solving for t this gives us

  T ≤ |V_0| · l(1−γ)/(γl − β(l−1)).

Therefore,

  |V_T| ≤ |V_0| · l(1−γ)/(γl − β(l−1)) + |V_0| ≤ 2p/(γl − β(l−1)) · n = αn,   (14)

where in the second-to-last step we used the fact that |V_0| ≤ 2p/(l − β(l−1)) · n. The whole derivation so far was based on the assumption that |V_t| ≤ αn for 0 ≤ t ≤ T. But as we can see from the above equation, this condition is indeed verified (|V_t| is non-decreasing and |V_T| ≤ αn).

Putting all these things together, we get

  E[P_b^MP(G, ε, ℓ)] = E[P_b^MP(G, E, ℓ)(1{(G, E) ∈ A} + 1{(G, E) ∉ A})]
    ≤ E[P_b^MP(G, E, ℓ) 1{(G, E) ∈ A}] + P{(G, E) ∉ A}
    ≤ E[P_b^MP(G, E, ℓ) 1{(G, E) ∈ A} 1{G ∈ X(l, r, α, γ)}] + P{G ∉ X(l, r, α, γ)} + P{(G, E) ∉ A}.

Apply limsup_{ℓ→∞} on both sides of the inequality. According to (14) the first term is bounded by α. For the second term, since γ < 1 − 1/l, we know from Theorem 2 that it is upper bounded by O(n^{−(l(1−γ)−1)}). For the third term we know from (9) that it is bounded by 2e^{−Knp²} for some strictly positive constant K = K(l, r, p).
Therefore, if we subsequently apply the limit lim_{n→∞} then we get

  lim_{n→∞} limsup_{ℓ→∞} E[P_b^MP(G, ε, ℓ)] ≤ α.

Since this conclusion is valid for any 0 < α ≤ α_max, it follows that

  lim_{n→∞} limsup_{ℓ→∞} E[P_b^MP(G, ε, ℓ)] = 0.

Example 9 (BEC and BP): We know from Example 5 that β(l−1) = 1. If we apply the conditions of Theorem 8, we see that we require 1/(l−1) < 1. Hence, the exchange of the limits is valid for l ≥ 3. Of course, for the BEC the exchange of limits in this regime follows directly from the monotonicity of the algorithm. ♦

Example 10 (BSC and GalB): We know from Example 7 that β(l−1) = ⌈(l−1)/2⌉ + 1. From Theorem 8, if ε < ε^GalB, the limits can be exchanged if l − 1 > 1 + ⌈(l−1)/2⌉, i.e., for l ≥ 5. ♦

The key to applying expansion arguments to decoders with a continuous alphabet is to ensure that the received values are no longer dominant once DE has reached small error probabilities. This can be achieved by ensuring that the input alphabet is smaller than the message alphabet.

Definition 11 (Bounded MP Decoders): Given an MP decoder whose message-passing alphabet is unbounded, i.e., it is equal to R, we associate to it a bounded version. The bounded MP decoder with parameter M ∈ R₊, denoted MP(M), is identical to the standard MP decoder except that the reliability of the messages emitted by the check nodes is bounded to M before the messages are forwarded to the variable nodes. ♦

Note that the outgoing messages from the check nodes lie in [−M, M] while the outgoing messages from the variable nodes can lie outside this range.
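To make the quantity p_bad^(ℓ) of Example 9 concrete, the BEC/BP density evolution recursion can be iterated directly. The following is a minimal sketch, not part of the paper's argument: for the (l, r)-regular ensemble the standard recursion is x' = ε(1 − (1 − x)^{r−1})^{l−1}, and the well-known BP threshold of the (3, 6) ensemble is ≈ 0.4294; the function name and the test values are our choices.

```python
# Density evolution for BP over the BEC: x is the erasure probability of a
# variable-to-check message; one iteration applies the check-node and
# variable-node updates of the (l, r)-regular ensemble.

def bec_de(eps, l, r, iters=2000):
    x = eps  # iteration 0: messages are the received (erased or known) values
    for _ in range(iters):
        x = eps * (1 - (1 - x) ** (r - 1)) ** (l - 1)
    return x

# For the (3, 6) ensemble (BP threshold ~0.4294): below the threshold the
# erasure probability tends to 0, above it DE settles at a nonzero fixed point.
print(bec_de(0.40, 3, 6) < 1e-9, bec_de(0.45, 3, 6) > 0.1)
```

Below the threshold the convergence is eventually quadratic (x' ≈ ε((r−1)x)^{l−1} for small x), which is why a few thousand iterations suffice even fairly close to the threshold.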
Example 12 (MS(M), BP(M) Decoders): The MS(M) decoder and the BP(M) decoder are identical to the standard min-sum (MS) and belief propagation (BP) decoder, except that the reliability of the messages emitted by the check nodes is bounded to M before the messages are forwarded to the variable nodes. ♦

Example 13 (MS(5) Decoder): Consider an (l ≥ 5, r) ensemble and fix M = 5. Let the channel log-likelihoods belong to [−1, 1]. It is easy to check that in this case we can choose G_v = G_c = [4, 5] and that it has strength β ≤ 3/4. Therefore, if the probability of outgoing messages from check nodes being in [4, 5] goes to 1 under DE, then according to Theorem 8 the limits can be exchanged. For example, consider the BSC(ε) and the LDPC(5, 6) ensemble. It is known that for this channel and the MS decoder the messages are of the form k log((1−ε)/ε), for k ∈ Z. Therefore we can restrict the message space to Z with the channel values mapped to {±1}. Now, if we consider the MS(5) decoder, the messages belong to {−5, ..., 5}. For this decoder, we can show that the limits can be exchanged up to the DE threshold of 0.067. ♦

Example 14 (BP(10) Decoder): Let l = 5 and r = 6 and fix M = 10. Let the channel log-likelihoods belong to [−3, 3]. We claim that in this case the message subset pair G_v = [9, 10], G_c = [14, 43] is good with strength β = 3/4. This can be seen as follows: If all the incoming messages to a check node belong to G_c, then the outgoing message is at least 12.39, which is mapped down to 10. Suppose that at a variable node at least 3 (= β(l−1)) out of the 4 incoming messages belong to G_v. In this case the reliability of the outgoing message is at least 14 = 3 × 9 − 10 − 3. The maximum reliability is 43. Moreover, if all the incoming messages belong to G_v then the variable is decoded correctly.
Therefore, if the probability of outgoing messages from check nodes being in [9, 10] goes to 1 in the DE limit then, by Theorem 8, the limits can be exchanged. For example, consider the BSC(ε) with channel log-likelihoods restricted to [−3, 3]. For ε < 1/(1 + e³), the log-likelihoods lie outside [−3, 3] and hence they are mapped to {±3}. In this case the limits can be exchanged up to the DE threshold of 0.136. Note that this is what is done in practice, since one has to work with bounded likelihoods. ♦

C. Expansion and Block Error Probability

In the previous section we considered the bit error probability. We will now derive sufficient conditions for the block error probability. Again we use expansion arguments, but we proceed in a slightly different way.

Theorem 15 (Expansion and Block Error Probability): Consider an LDPC(n, l, r) ensemble, transmission over a BMS(ε) channel, and a symmetric MP decoder. Let β be the strength of the good message subset. If β < (l−2)/(l−1) and if for some ε, p_bad^(∞) = 0, then

  lim_{n→∞} limsup_{ℓ→∞} E_{LDPC(n, l, r)}[P_B^MP(G, ε, ℓ)] = 0.   (15)

Proof: As in Theorem 8 we first perform a fixed number of iterations to bring down the bit error probability below a desired level. We then use Theorem 36 to show that for a graph with sufficient expansion the MP algorithm decodes the whole block correctly once the bit error probability is sufficiently small. This is very much in the spirit of Burshtein and Miller [5]. Define

  γ = (1 − 1/l) · (3 + β)/4.

Let 0 < α < α_max(γ), where α_max(γ) is the function defined in Theorem 2. Let p = α(l − β(l−1))/(2lr) and let ℓ(p) be the number of iterations such that p_bad^(ℓ) ≤ p. Let Ω denote the space of code and noise realizations. Let P_e(G, E, ℓ) denote the fraction of messages belonging to the bad set after ℓ iterations.
Let A ⊆ Ω denote the subset

  A = {(G, E) ∈ Ω | P_e(G, E, ℓ(p)) ≤ 2p}.

From the Concentration Theorem 39 we know that

  P{(G, E) ∉ A} ≤ 2 e^{−Knp²}   (16)

for some strictly positive constant K = K(l, r, p). Since β(l−1)/l ≤ 2γ − 1 we can apply Theorem 36: if G ∈ X(l, r, α, γ) and if the initial number of bad messages is less than αlr, then all the messages will become good after a sufficient number of iterations. Putting all these things together, we get

  E[P_B^MP(G, ε, ℓ)] = E[P_B^MP(G, E, ℓ)(1{(G, E) ∈ A} + 1{(G, E) ∉ A})]
    ≤ E[P_B^MP(G, E, ℓ) 1{(G, E) ∈ A}] + P{(G, E) ∉ A}
    ≤ E[P_B^MP(G, E, ℓ) 1{(G, E) ∈ A} 1{G ∈ X(l, r, α, γ)}] + P{G ∉ X(l, r, α, γ)} + P{(G, E) ∉ A}.

Apply limsup_{ℓ→∞} on both sides of the inequality. According to Theorem 36 the first term is 0. For the second term, since γ < 1 − 1/l, we know from Theorem 2 that it is upper bounded by O(n^{−(l(1−γ)−1)}). For the third term we know from (16) that it is bounded by 2e^{−Knp²} for some strictly positive constant K = K(l, r, p). Therefore, if we subsequently apply the limit lim_{n→∞} then we get

  lim_{n→∞} limsup_{ℓ→∞} E[P_B^MP(G, ε, ℓ)] = 0.

Example 16 (BEC and BP): According to Theorem 15 we require l ≥ 4. Hence, if l ≥ 4 then the block error probability tends to zero below the BP threshold. ♦

Example 17 (BSC and GalB): As explained in Example 7, for the Gallager B algorithm over the BSC, β(l−1) = 1 + ⌈(l−1)/2⌉. The above condition implies that if l − 2 > 1 + ⌈(l−1)/2⌉, i.e., for l ≥ 7, the block error probability goes to zero below ε^GalB. ♦

Example 18 (MS(5) Decoder): Consider an (l ≥ 7, r) ensemble and fix M = 5. Let the channel log-likelihoods belong to [−1, 1]. It is easy to check that in this case we can choose G_v = G_c = [4, 5] and that it has strength β ≤ 2/3.
Therefore, if the probability of outgoing messages from check nodes being in [4, 5] goes to 1 under DE, then according to Theorem 15 the block error probability tends to 0. ♦

Example 19 (BP(10) Decoder): Let l = 7 and r = 8 and fix M = 10. Let the channel log-likelihoods belong to [−1, 1]. We claim that in this case the message subset pair G_v = [9, 10], G_c = [15, 59] is good with strength β = 2/3. Therefore, if the probability of outgoing messages from check nodes being in [9, 10] goes to 1 in the DE limit then, by Theorem 15, the block error probability goes to zero. ♦

Theorem 15 has a stronger implication than Theorem 8 since it concerns the block error probability. Unfortunately, the required conditions are considerably more restrictive. We conjecture that in fact the conditions of Theorem 15 can be weakened by considering several stages of the algorithm jointly, and that the required conditions are identical to the ones in Theorem 8.

Conjecture 20 (Expansion and Block Error Probability): Consider an LDPC(n, l, r) ensemble, transmission over a BMS(ε) channel, and a symmetric MP decoder. Let β be the strength of the good message subset. If β < 1 and if for some ε, p_bad^(∞) = 0, then

  lim_{n→∞} limsup_{ℓ→∞} E_{LDPC(n, l, r)}[P_B^MP(G, ε, ℓ)] = 0.   (17)

III. SUFFICIENT CONDITION BASED ON BIRTH-DEATH PROCESS

In the previous section we relied solely on the expansion of the graph to prove the validity of the limit exchange. As can be seen from the examples, for the decoders of interest the theorems are only valid for higher degrees, let's say l ≥ 5. Practical codes, however, typically have small degrees. In these cases expansion itself is not sufficient. In more detail, the proofs in the previous section have two phases.
In the first phase we run the MP algorithm for some fixed number of iterations to get the error probability down to a small constant. In the second phase we prove that the error probability stays close to 0 regardless of how many further iterations we perform, assuming pessimistically that all variable nodes have bad received values. This is too pessimistic an assumption for small degrees, where the received value plays an important role. In this section, we develop a method which takes the actual channel realization into account.

Consider an MP decoder operating on a message alphabet M ⊆ R. Further, for µ ∈ M, define |µ| to be the reliability of the message. This means that we define the reliability of −µ to be the same as the reliability of µ. Most of the MP algorithms used in practice, like GalB, BP, and MS, fall in the following category of monotone decoders.

Definition 21 (Monotone MP Decoders): We say that a symmetric MP decoder is monotone if the following conditions are fulfilled. At variable nodes the processing rules are monotone with respect to the natural order on M; for a fixed received value, the outgoing message is a non-decreasing function of the incoming messages. At check nodes the processing rules are monotone with respect to the natural order on the reliabilities; the reliability of the outgoing message is a non-decreasing function of the reliabilities of the incoming messages. ♦

Monotonicity is a useful property and it is also quite natural. A remaining difficulty in analyzing these decoders is that at check nodes the monotonicity is with respect to the reliability and not the message itself. We will see shortly how to get around this problem. In what follows we mainly discuss the case of the GalB algorithm and l = 3. The generalization to degree l ≥ 4 is straightforward and is discussed in Section III-H.
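For the GalB algorithm with l = 3, the monotonicity of Definition 21 can be checked exhaustively, since messages live in {±1} and the three-way majority at a variable node admits no ties. The sketch below is our own illustration (the helper names are not from the paper): it enumerates all inputs and verifies that raising one incoming message never lowers the outgoing one.

```python
# Exhaustive check of Definition 21 for GalB with l = 3. Messages are in
# {-1, +1}; reliability is constant (|m| = 1), so the check-node rule is
# trivially monotone in the reliabilities and we only test the variable node.

def var_out(received, incoming):
    # Majority vote over the other incoming messages and the received value.
    s = received + sum(incoming)
    return 1 if s > 0 else -1  # l = 3: three odd votes, so no tie occurs

def chk_out(incoming):
    # Product of the incoming messages along the remaining edges.
    p = 1
    for m in incoming:
        p *= m
    return p

# Raising one incoming message from -1 to +1 never decreases the output.
for received in (-1, 1):
    for other in (-1, 1):
        assert var_out(received, (-1, other)) <= var_out(received, (1, other))

# Sanity check of the check-node rule: output stays in the alphabet.
assert chk_out((-1, 1, -1, 1, 1)) in (-1, 1)
print("GalB (l = 3) variable-node rule is monotone")
```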
In that section we also give some examples of other monotone decoders to which the method can be extended.

A. Main Result and Outline

Lemma 22 (Exchange of Limits): Consider transmission over the BSC(ǫ) using random elements from the (l, r)-regular ensemble and decoding by the GalB algorithm. If ǫ < ǫ^LGalB then

  lim_{n→∞} limsup_{ℓ→∞} E[P^GalB_b(G, ǫ, ℓ)] = 0,

where ǫ^LGalB is the smallest parameter ǫ for which a solution to the following fixed point equation exists in (0, ǫ]:

  x = ǫ Σ_{k=0}^{⌊(l−1)/2⌋} (l−1 choose k) y^k (1−y)^{l−1−k}
    + ǭ Σ_{k=⌊l/2⌋+1}^{l−1} (l−1 choose k) (1−y)^k y^{l−1−k}
    + (1_{{l/2 ∈ ℕ}}/2) (l−1 choose l/2) [ǫ y^{l/2} (1−y)^{l/2−1} + ǭ (1−y)^{l/2} y^{l/2−1}],   (18)

where y = (1−x)^{r−1}. For the case of the (l = 3, r)-regular ensemble this equation simplifies to

  x = ǭ (1 − (1−x)^{r−1})^2 + ǫ (1 − (1−x)^{2(r−1)}).

Discussion: Note that the threshold ǫ^LGalB introduced in the preceding lemma is in general slightly smaller than the DE threshold ǫ^GalB. We pose the extension of the result to channel values up to the DE threshold as an interesting open problem. It is likely to be difficult.

  r    rate     ǫ^Sha       ǫ^GalB      ǫ^LGalB
  3    0.0      ≈ 0.5       ≈ 0.222     ≈ 0.1705
  4    0.25     ≈ 0.2145    ≈ 0.1068    ≈ 0.0847
  5    0.4      ≈ 0.1461    ≈ 0.06119   ≈ 0.0506
  6    0.5      ≈ 0.11002   ≈ 0.0394    ≈ 0.0336
  7    0.5714   ≈ 0.08766   ≈ 0.02751   ≈ 0.02398
  8    0.625    ≈ 0.07245   ≈ 0.02027   ≈ 0.01795
  9    0.667    ≈ 0.06141   ≈ 0.01554   ≈ 0.01395
  10   0.7      ≈ 0.05324   ≈ 0.01229   ≈ 0.01115

TABLE I: Threshold values for some degree distributions with l = 3.

  r    rate     ǫ^Sha       ǫ^GalB      ǫ^LGalB
  4    0.0      ≈ 0.5       ≈ 0.0840    ≈ 0.0697
  5    0.2      ≈ 0.1461    ≈ 0.0464    ≈ 0.0399
  6    0.333    ≈ 0.11002   ≈ 0.0292    ≈ 0.0258
  7    0.4286   ≈ 0.08766   ≈ 0.0200    ≈ 0.018
  8    0.5      ≈ 0.07245   ≈ 0.0146    ≈ 0.0133
  9    0.556    ≈ 0.06141   ≈ 0.0111    ≈ 0.0102
  10   0.6      ≈ 0.05324   ≈ 0.0087    ≈ 0.0081

TABLE II: Threshold values for some degree distributions with l = 4.

Example 23: Table I shows thresholds for l = 3, r = 3, …, 10. For the (l = 3, r = 6) degree distribution we have ǫ^LGalB ≈ 0.0336. This is slightly smaller than, but comparable to, ǫ^GalB ≈ 0.0394. ♦

We proceed by a sequence of simplifications, ensuring in each step that the modified algorithm is an upper bound on the original process. In Section III-B we simplify the decoder by "linearizing" the processing rules at the check nodes. In Section III-C we further upper bound the process by considering the marking process associated with the decoding algorithm. In Section III-D we construct a witness for the marking process and derive bounds on the size of such a witness. In Section III-E we then show that, conditioned on the witness, we can consider the channel realizations outside the witness to be random and independent of the witness. In Section III-F we use an expansion argument to bound the stopping time of the birth and death process associated with the marking process. Finally, in Section III-G we combine all previous statements to arrive at our conclusion.

B. Linearized Gallager Algorithm B

We proceed as in Section II: Fix 0 ≤ ǫ < ǫ^LGalB. We prove that for every α > 0 there exists an n(α, ǫ) so that limsup_{ℓ→∞} E[P^GalB_b(G, ǫ, ℓ)] < α for n ≥ n(α, ǫ). Without loss of generality we can assume that the all-one codeword was sent. We will make this assumption throughout the remainder of this section. Therefore, in the sequel the message 1 signifies a correct message, whereas −1 signifies that the message is incorrect. For this setting, we define the following linearized version of the decoder.
Definition 24 (Linearized GalB): The linearized GalB decoder, denoted by LGalB, is defined as follows: at the variable nodes the computation rule is the same as that of the GalB decoder. At the check nodes the outgoing message is the minimum of the incoming messages.

Discussion: LGalB is not a practical decoding algorithm but rather a convenient device for analysis; it is understood that we assume that the all-one codeword was transmitted and that quantities like the error probability refer to the variables decoded as −1. By some abuse of notation, we nevertheless refer to it as a decoder.

The LGalB decoder is monotone also with respect to the incoming messages at check nodes. Moreover, it satisfies the following property.

Lemma 25 (LGalB is an Upper Bound on GalB): For any graph G, any noise realization E, any starting set of "bad" edges, and any ℓ, we have

  P^GalB_e(G, E, ℓ) ≤ P^LGalB_e(G, E, ℓ),

where P_e(G, E, ℓ) denotes the fraction of erroneous messages after ℓ iterations of decoding.

Proof: Consider one iteration, i.e., a check-node step followed by a variable-node step. Let B^GalB_ℓ and B^LGalB_ℓ denote the set of bad edges (edges with message −1) after the ℓ-th iteration of GalB and LGalB, respectively. Let ψ^GalB_E(B) and ψ^LGalB_E(B) denote the set of bad edges after one iteration assuming that the initial such set is B. We use the following two facts: (i) The outgoing messages for the LGalB decoder at variable/check nodes are monotone; if we decrease (with respect to the natural order on M) the input at a variable/check node then the output is either decreased or stays the same. I.e., if B ⊆ B′, meaning that the messages in B′ can be obtained by decreasing some of the +1 messages in B to −1, then ψ^LGalB_E(B) ⊆ ψ^LGalB_E(B′).
(ii) For any set of input messages, the outgoing message of LGalB is less than or equal to the message of the GalB decoder, i.e., ψ^GalB_E(B) ⊆ ψ^LGalB_E(B).

For the proof, we proceed by induction. Let B_0 be the initial set of bad edges. After the first iteration, from (ii) we get B^GalB_1 = ψ^GalB_E(B_0) ⊆ ψ^LGalB_E(B_0) = B^LGalB_1. To complete the proof it is sufficient to show that B^GalB_ℓ ⊆ B^LGalB_ℓ implies B^GalB_{ℓ+1} ⊆ B^LGalB_{ℓ+1}. Using (i) and (ii) we have

  B^LGalB_{ℓ+1} = ψ^LGalB_E(B^LGalB_ℓ) ⊇ ψ^LGalB_E(B^GalB_ℓ) ⊇ ψ^GalB_E(B^GalB_ℓ) = B^GalB_{ℓ+1},

and hence the lemma.

From the above lemma it suffices to prove the exchange of limits for the linearized algorithm. Note that ǫ^LGalB as defined in Lemma 22 is the threshold of the LGalB algorithm. We will prove that for every 0 ≤ ǫ < ǫ^LGalB and every α > 0 there exists an n(α, ǫ) so that limsup_{ℓ→∞} E[P^LGalB_b(G, ǫ, ℓ)] < α for n ≥ n(α, ǫ). As we will see later, the monotonicity property of LGalB considerably simplifies the analysis. But the price paid for the simplification is that the technique works only for ǫ < ǫ^LGalB, which is slightly smaller than the DE threshold.

C. Marking Process

Rather than analyzing the LGalB algorithm directly, we analyze the associated marking process. This process is monotone as a function of the iterations. More precisely, we split the process into two phases: we start with LGalB for ℓ(p) iterations to get the error probability below p; we then continue with the marking process associated with an infinite number of further iterations of LGalB. This means that we mark any variable that is bad in at least one iteration ℓ ≥ ℓ(p). Clearly, the number of variables that are bad at at least one point in time ℓ ≥ ℓ(p) is an upper bound on the maximum number of variables that are bad at any specific instance in time.
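Before continuing, note that the threshold ǫ^LGalB of Lemma 22 is easy to evaluate numerically for the (l = 3, r) case: below ǫ^LGalB the recursion x ← ǭ(1−(1−x)^{r−1})² + ǫ(1−(1−x)^{2(r−1)}), started at x = ǫ, is driven to 0, while above it a fixed point in (0, ǫ] appears. A minimal sketch (the function names and the bisection are ours, not the paper's):

```python
def lgalb_recursion_converges(eps, r, iters=3000, tol=1e-10):
    """Iterate the (l = 3, r) fixed-point equation of Lemma 22,
    x = (1-eps)*(1-(1-x)^(r-1))^2 + eps*(1-(1-x)^(2(r-1))),
    starting from x = eps; report whether x is driven to 0."""
    x = eps
    for _ in range(iters):
        y = (1 - x) ** (r - 1)
        x = (1 - eps) * (1 - y) ** 2 + eps * (1 - y * y)
        if x < tol:
            return True
    return False

def eps_lgalb(r, lo=0.0, hi=0.5, steps=50):
    """Bisect for the largest eps below which the recursion converges."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if lgalb_recursion_converges(mid, r):
            lo = mid
        else:
            hi = mid
    return lo

print(round(eps_lgalb(6), 4))  # close to the tabulated value 0.0336 for (3, 6)
```

The same sweep reproduces the ǫ^LGalB column of Table I for the other values of r.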
The standard schedule of LGalB is parallel, i.e., all incoming messages (at either variable or check nodes) are processed at the same time. This is the natural schedule for an actual implementation. For the purpose of analysis it is convenient to consider an asynchronous schedule.

Here is how the general asynchronous marking process proceeds. We are given a graph G and a noise realization E. We are also given a set of marked edges. These marked edges are directed, from variable node to check node. At the start of the process, mark the variable nodes that are connected to the marked edges. Declare all other variables and edges unmarked. Unmarked edges do not have a direction. The process proceeds in discrete steps. At each step we pick a marked edge and perform the processing described below. We continue until no more marked edges are left. Here are the processing rules:

If the marked edge e goes from variable to check:
• Let c be the check node connected to e. Declare e unmarked but mark all other edges connected to c; orient these marked edges from check to variable.

If the marked edge e goes from check to variable:
• Let v be the connected variable node. If v has a good associated channel realization and v is unmarked, then mark v and declare e unmarked.
• Let v be the connected variable node. If v has an associated bad channel realization, or if v has an associated good channel realization but is marked: (i) mark v and all its outgoing edges; (ii) orient these edges from variable to check; (iii) unmark e.

Let M(G, E, S) denote the set of marked variables assuming that we start with the set of marked edges S and run the asynchronous marking process. Let M(G, E, S) = |M(G, E, S)|.
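The asynchronous marking process is easy to prototype. The sketch below uses our own data structures (not the paper's notation) and exploits monotonicity: since marking only grows, the order of processing does not matter, so we compute the fixed point directly. A check delivers a bad message to a neighbor v once for every other bad edge it has seen; a variable "floods" all its edges once it is bad and hit at least once, or good and hit at least twice (an initial mark counts as the first hit).

```python
def marking_process(check_nbrs, noise, initial_edges):
    """Fixed point of the asynchronous marking process.

    check_nbrs    : dict, check node -> list of neighboring variable nodes
    noise         : dict, variable node -> True if its received value is bad
    initial_edges : set of (v, c) pairs, the initially marked edges
                    (oriented from variable to check)
    Returns the set of marked variable nodes.
    """
    var_nbrs = {}
    for c, vs in check_nbrs.items():
        for v in vs:
            var_nbrs.setdefault(v, []).append(c)
    badin = {c: set() for c in check_nbrs}   # check -> variables heard bad from
    for (v, c) in initial_edges:
        badin[c].add(v)
    init_marked = {v for (v, _) in initial_edges}
    flooded = set()
    while True:
        # deliveries[v]: number of bad check-to-variable messages reaching v
        deliveries = {v: 0 for v in var_nbrs}
        for c, bad in badin.items():
            for v in check_nbrs[c]:
                deliveries[v] += len(bad - {v})
        new_flooded = set()
        for v in var_nbrs:
            if noise[v] and deliveries[v] >= 1:
                new_flooded.add(v)
            elif deliveries[v] >= 2 or (v in init_marked and deliveries[v] >= 1):
                new_flooded.add(v)
        if new_flooded == flooded:
            break
        flooded = new_flooded
        for v in flooded:                    # flooding marks all edges of v
            for c in var_nbrs[v]:
                badin[c].add(v)
    return init_marked | {v for v in var_nbrs if deliveries[v] >= 1}
```

On a single check c0 joining v0, v1, v2, with v1 in error and the single initial bad edge (v0, c0), all three variables end up marked.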
As a special case, let M(G, E, ℓ) denote the set of marked variables at the end of the process assuming that the initial set of marked edges is the set of bad edges after ℓ rounds of LGalB. As before, M(G, E, ℓ) = |M(G, E, ℓ)|. It is not hard to see that for any ℓ′ ≥ ℓ, P^LGalB_b(G, ǫ, ℓ′) ≤ M(G, E, ℓ)/n: for ℓ′ = ℓ both processes start with the same set of bad edges and both operate on the same graph and noise realization. At the check-node side the processing rules are identical. At the variable-node side both processes also behave in the same way if they encounter a variable node with a bad channel realization. The difference lies in the behavior when they encounter a variable node with a good channel realization. In such a case the outgoing message for LGalB is bad only if there are two bad messages entering at the same time instance. The asynchronous marking process declares the outgoing message bad if there are two incoming bad messages, even if the two messages correspond to different time instances as measured by the parallel schedule. We conclude that for ℓ′ ∈ ℕ

  limsup_{ℓ→∞} E[P^LGalB_b(G, ǫ, ℓ)] ≤ (1/n) E[M(G, E, ℓ′)].   (19)

D. Witness

It remains to bound E[M(G, E, ℓ)]. Assume at first that we take a random graph G and a random noise realization E and that we start the marking process with a sufficiently small random set of marked edges (and not the set of bad edges after ℓ iterations of LGalB). In this case one can show that the number of marked nodes at the end of the process is with high probability not more than a constant multiple of the size of the starting set. To prove this statement, we use the fact that the graph, the noise, and the starting set of edges are all independent.
Therefore, the marking process behaves essentially like a birth and death process: we pick an edge and explore its neighborhood; with a certain probability the edge dies (if it enters a variable node with a correctly received value) and with a certain probability the edge spawns some children. As long as the expected number of new children is less than 1, the process eventually dies with probability 1.

Unfortunately, our situation is more involved. After ℓ iterations the starting set of marked edges is correlated, both with the graph as well as with the noise realization. Our aim therefore is to reduce this correlated case to the uncorrelated case by a sequence of transformations. As a first step we show how to get rid of the correlation with respect to the noise realization.

Consider a fixed graph G. Assume that we have performed ℓ iterations of LGalB. For each edge e that is bad in the ℓ-th iteration we construct a "witness." A witness for e is a subset of the computation tree of e of height ℓ (where height is counted as the number of variable node levels) consisting of paths that carried bad messages in the past iterations. We construct the witness recursively, starting with e. Orient e from check node to variable node. At any point in time while constructing the witness associated with e we have a partial witness that is a tree with oriented edges. The initial such partial witness is e. One step of the construction consists of taking a leaf edge of the partial witness and "growing it out" according to the following rules. If an edge enters a variable node that has an incorrect received value, then add the smallest (according to some fixed but arbitrary order on the set of edges) edge that carries an incorrect incoming message to the witness and continue the process along this edge. The added edge is directed from variable node to check node.
If an edge enters a variable node that has a correct received value, then add both incoming edges to the witness and follow the process along both edges. (Note that in this case both of these edges must have carried bad messages.) Again, both of these edges are directed from variable to check node. If an edge enters a check node, then choose the smallest incoming edge that carries an incorrect message and add it to the witness. Continue the process along this edge. The added edge is directed from check to variable node. Continue the process until depth ℓ. Fig. 1 shows an example for l = 3, r = 4, and ℓ = 3.

[Fig. 1. Construction of the witness for a bad edge e. The dark variables represent channel errors. The part of the tree with dark edges represents the witness; the thick edges, including both dark and gray, represent the bad messages in the past iterations. The number h on the left indicates the height of the tree.]

Denote the union of all witnesses for all edges that are bad in the ℓ-th iteration by W(G, E, ℓ). We simply call it the witness. The witness is a part of the graph that on its own explains why the set of bad edges after ℓ iterations is bad.

How large is W? The larger ℓ, the fewer bad edges we expect to see in iteration ℓ. On the other hand, the size of the witness for each bad edge grows as a function of ℓ. The next lemma, whose proof can be found in Appendix B, asserts that the first effect dominates and that the expected size of W converges to zero as the number of iterations increases.

Lemma 26 (Size of Witness): Consider the (3, r)-regular ensemble. For 0 ≤ ǫ < ǫ^LGalB,

  lim_{n→∞} (1/n) E[|W(G, E, ℓ)|] = o_ℓ(1).

Why do we construct a witness?
It is intuitive that if we keep the witness fixed but randomize the structure as well as the received values on the remainder of the graph, then the situation should only get worse: already the witness itself explains all the bad messages, and hence any further bad channel values can only create more bad messages. In the next two sections we show that under some suitable technical conditions this intuition is indeed correct.

E. Randomization

A witness W consists of two parts: (i) the graph structure of W and (ii) the channel realizations of the variables in W. We will often need to refer to either of these parts on its own. By some abuse of notation we write W also if we refer only to the graph structure or only to the channel realizations. The usage should be clear from the context. As an example, we write W ⊆ G to indicate that G contains W as a subgraph, and we write W ⊆ E to indicate that the received values of all variables in W agree with the values that these variables take on in E.

Fix a graph G and a witness W, W ⊆ G. Let E_{G,W} denote the set of all error realizations E that give rise to W, i.e., W(G, E, ℓ) = W. Clearly, for all E ∈ E_{G,W} we must have W ⊆ E. In words, on the set of variables fixed by the witness the errors are fixed by the witness itself. Therefore, the various E that create this witness differ only on G\W. As a convention, we define E_{G,W} = ∅ if W ⊄ G.

Let E′_{G,W} denote the set of projections of E_{G,W} onto the variables in G\W. Let E′ ∈ E′_{G,W}. Think of E′ as an element of {0, 1}^{|G\W|}, where 0 denotes a correct received value and 1 denotes an incorrect received value. In this way, E′_{G,W} is a subset of {0, 1}^{|G\W|}. This is important: E′_{G,W} has structure. We claim that if E′ ∈ E′_{G,W}, then E′_{G,W} also contains every E′′ ≤ E′ (with the order ≤ as defined in Appendix D).
More precisely, if the noise realization E′ ∈ E′_{G,W} gives rise to the witness W, then converting any incorrect received value in E′ to a correct one will also give rise to W. This is true since the LGalB algorithm is monotone, so that taking away some incorrectly received values cannot increase the number of bad edges observed in the ℓ-th iteration. But, on the other hand, W itself ensures that the set of bad edges after ℓ iterations includes all the bad edges we saw originally. The proof of the following lemma relies heavily on this property.

Lemma 27 (Channel Randomization): Fix G and let W ⊆ G. Let E_{E′}[·] denote the expectation with respect to the channel realizations E′ in G\W. Then

  E_{E′}[M(G, (W, E′), W) 1_{{E′ ∈ E′_{G,W}}}] ≤ E_{E′}[M(G, (W, E′), W)] E_{E′}[1_{{E′ ∈ E′_{G,W}}}].   (20)

Discussion: Lemma 27 has the following important operational significance. If we divide both sides by E_{E′}[1_{{E′ ∈ E′_{G,W}}}], the left-hand side is the expected number of marked variables, where the expectation is computed over all those channel realizations that give rise to the given witness W, whereas the right-hand side gives the expectation over all channel realizations (outside the witness), regardless of whether they give rise to W or not. Clearly, the right-hand side is much easier to compute, since the channel is now independent of W. The lemma states that if we assume that the channel outside W is chosen independently, then we get an upper bound on the expected number of marked variables.

Proof: Let n′ = |G\W|. Let P{·} be the probability measure associated with E_{E′}[·], i.e., P{E′} = ǫ^{n_1} ǭ^{n′−n_1}, where n_1 denotes the number of ones in E′. Let f(E′) denote the function 1_{{E′ ∈ E′_{G,W}}}, and let g(E′) denote the function M(G, (W, E′), W).
Note that f is a decreasing function on {0, 1}^{n′}, because if f(E′) = 1 then for all E′′ ≤ E′, f(E′′) = 1. Further, g is increasing on {0, 1}^{n′}, since LGalB is monotone in the number of channel errors. Since g(E′) ≤ n, n − g is non-negative and it is a decreasing function. For s, t ∈ {0, 1}^{n′}, let |s| denote the number of 1s in s, and let s ∨ t and s ∧ t be as defined in Appendix D. Then

  P{s} P{t} = ǫ^{|s|+|t|} ǭ^{2n′−|s|−|t|},
  P{s ∨ t} = ǫ^{|s|+|t|−|s∧t|} ǭ^{n′−(|s|+|t|−|s∧t|)},
  P{s ∧ t} = ǫ^{|s∧t|} ǭ^{n′−|s∧t|}.

Therefore, P{s} P{t} = P{s ∨ t} P{s ∧ t}. Applying the FKG inequality in the form of Lemma 37 to f and n − g, we get E[f (n − g)] ≥ E[f] E[n − g]. This implies E[f g] ≤ E[f] E[g].

We can now upper bound the right-hand side of (19). The proof of the next lemma can be found in Appendix C.

Lemma 28 (Markov Inequality): Consider the (l = 3, r)-regular ensemble and transmission over the BSC(ǫ). Let (G, E) be chosen uniformly at random. Let ℓ ∈ ℕ and θ > 0 be such that E[|W(G, E, ℓ)|] ≤ θ² n. Then

  E[M(G, E, ℓ)] ≤ Σ_{W: |W| ≤ θn} Σ_G P{G} P{E_{G,W}} E_{E′}[M(G, (W, E′), W)] + θn.

F. Back to Expansion

In the previous section we have shown that for a fixed graph G and a given witness W, we can ignore the correlations between the witness and the channel values in G\W and consider those channel values to be chosen independently. But the graph structure of G\W is still correlated with W. Let us now deal with this correlation and get a bound on the marking process for those G that have an expansion close to the typical one of the ensemble.

Consider the following random process, which we call the R-process. The process proceeds in discrete steps and has state (C_t, S_t, B_t, I_t) at time t, where each component is an integer.
We initialize the process with (C_0, S_0, B_0, I_0) = (0, S_0, 0, 0), where S_0 ∈ ℕ. At each step we have two choices: we can either perform a regular step or a boundary step. The effect of each step type on the state (C_t, S_t, B_t, I_t) is shown in Table III. If we choose a regular step then, with probability ǫ, an extension step is executed and, with probability ǭ, a pruning step is performed. The choices of extension step versus pruning step are iid. In our choice of step type we are restricted by the following: at any time during the process the state has to satisfy

  γ r C_t ≤ S_t + B_t + I_t,   (21)

where γ = 1 − (1+δ)/r for some strictly positive number δ. Let T be the smallest time t so that S_t = 0. It is convenient to formally define the process for all t by setting U_t = U_T for t ≥ T.

  step type        ΔC_t   ΔS_t     ΔB_t   ΔI_t
  regular extend   2      2r − 3   0      1     (dominant)
                   1      r − 3    0      1
                   0      −3       0      1
  regular prune    0      −1       1      0
  boundary         1      r − 2    −1     1     (dominant)
                   0      −2       −1     1

TABLE III: Possible state transitions. Note that there are several possible transitions corresponding to a "regular extend" step as well as to a "boundary" step. As explained below, the transitions marked as dominant dominate the other transitions in the sense of Definition 29.

Discussion: Here is the interpretation of the above process. We are given a fixed graph G and a witness W. The channel realizations in G\W are generated independently with probability of error ǫ. We are interested in computing the expected number of marked variables E_{E′}[M(G, (W, E′), W)]. The components of the state vector have the following interpretation. By some further abuse of notation, let W refer now also to the variables contained in W. Let N(W) denote all the check nodes that neighbor W.
We start our process with those edges connected to N(W) that do not connect to W. The cardinality of this set is denoted by S_0 (where the "S" stands for surviving). In each step we take a single edge from this set of surviving edges and "grow it out." Let us discuss this process in more detail.

When we "grow out" an edge we first visit the connected variable node. Suppose that this is the first time that the process visits this variable node. We call this a regular step. If the received value of this variable node is good, then we stop the process along this edge. We add the variable to the boundary set to make a mental note that we have seen this node exactly once. The boundary set has cardinality B_t. We further subtract 1 from S_t to take into account that we finished processing one of the "surviving" edges. If the received value is bad, then we add this variable node to the internal variable nodes. The cardinality of this set is I_t. This means that in this step we increase I_t by 1. Further, we expand the graph along the two outgoing edges, add the (at most) two connected check nodes to the set of internal check nodes (whose cardinality is denoted by C_t), and add all the remaining edges that emanate from these check nodes to the set of surviving edges. This adds (at most) 2(r − 1) new survivors, but we have to subtract the edge we started from. Therefore, S_t is increased by at most 2r − 3.

So far we have assumed that we have not seen the variable node (the one connected to the edge which we grow out) before. Suppose now that, on the contrary, the variable is an element of the boundary. We know that in this case the received value is good, but we also know that the variable has received another bad incoming message. Therefore, the variable will send a bad outgoing message along its remaining edge.
Hence, we move this variable node from the boundary to the internal set (this decreases B_t by 1 and increases I_t by 1). Further, we grow out the graph along the only remaining outgoing edge. This adds at most one new check node and at most r − 1 outgoing edges to the set of surviving edges. Discounting again the edge we started with, we add in total at most r − 2 to S_t.

Suppose that the graph G is a right expander, i.e., G ∈ X(l, r, α, γ), where γ ≥ 1 − (1+δ)/r for some strictly positive δ. This means that every collection C of check nodes of size at most αm has at least γ|C|r connected variable nodes. Consider the state of the system at some time t. At this point in time we have C_t check nodes. All these check nodes are "internal," i.e., all their neighboring variable nodes are either counted in B_t or I_t, or they are yet to be encountered by the process, and the latter cannot number more than the survivor set S_t. We know that G is an expander; suppose for now that C_t ≤ αm. Then we know that the number of connected variable neighbors must be at least γrC_t, i.e., at any time during the process the state must satisfy

  γ r C_t ≤ S_t + B_t + I_t.   (22)

We claim that

  γ r C_t ≤ S_t + B_t + I_t − (1 − δ)   (23)

is a necessary condition to be able to perform a boundary step at time t. To see this, suppose we take a boundary step. If you look at Table III you will see that there are two possible transitions. One can check that the dominant transition indicated in Table III gives the less restrictive condition. Let us therefore focus only on this case. The state after applying the boundary step must still fulfill (22). This means that we must have

  γ r (C_t + 1) ≤ (S_t + r − 2) + (B_t − 1) + (I_t + 1).

The claim is proved by rewriting this inequality.
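The rewriting is a two-line computation; using γr = r − 1 − δ, which follows from γ = 1 − (1+δ)/r:

```latex
\gamma r (C_t + 1) \le (S_t + r - 2) + (B_t - 1) + (I_t + 1)
\iff \gamma r C_t \le S_t + B_t + I_t + (r - 2) - \gamma r
\iff \gamma r C_t \le S_t + B_t + I_t + (r - 2) - (r - 1 - \delta)
\iff \gamma r C_t \le S_t + B_t + I_t - (1 - \delta),
```

which is exactly condition (23).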
From the above discussion we conclude that, for a given W and G, where G ∈ X(l, r, α, γ), as long as C_t ≤ αm the marking process can be modeled as the R-process. The random variable I_∞ is equal to the random variable M(G, (W, E′), W) − |W| of the marking process (we subtract the size of the witness because we do not include it in the internal variables).

For the actual marking process the decision of whether a regular step or a boundary step is taken is forced by the structure of the graph and our choice of which edge to grow out. For the R-process the role of the graph is taken by a strategy. A strategy is any (randomized) decision function F that, based on the initial state and past decisions and outcomes, decides whether a regular step or a boundary step is taken at any point in time.

Here is the connection between the actual physical process and the R-process in more detail. Assume we are given a graph G and a witness W. We know the graph and therefore we also know which edges of the graph are elements of the surviving set. Therefore, when we pick a survivor, we know in advance whether the step is a regular step or a boundary step. The noise realization, which is not known to us a priori, determines whether a regular step is a regular extend or a prune step. We see that each graph gives rise to a strategy. As long as the set of all revealed nodes is sufficiently small, this strategy will be admissible, since the expansion will be valid up to this point. Since we are only interested in an upper bound on the number of marked variables, we allow the R-process to use an arbitrary strategy, limited only by condition (22). We call a strategy which obeys (22) an admissible strategy.
Since the actual physical process is also limited by (22) (under the condition that the graph is an expander and the process has not grown beyond the size where the expansion is valid), it suffices to derive an upper bound on E[I_∞] that is valid for all choices of the strategy.

We relax one further restriction imposed by the actual physical process in order to simplify our task. Again, this only increases E[I_∞]. In the marking process, we can only perform a boundary step if the boundary set is strictly positive, i.e., we require B_t > 0 for a boundary step to be performed. We lift this restriction for the R-process.

Definition 29 (Ordering of States): The state U ≡ (C, S, B, I) dominates the state U′ ≡ (C′, S′, B′, I′), denoted by U ≥ U′, if (i) S ≥ S′, (ii) I ≥ I′, (iii) S + B + I − γrC ≥ S′ + B′ + I′ − γrC′. ♦

Lemma 30 (Monotonicity of I_∞ with State): Consider the R-process with admissible strategy F′ and initial state U′ ≡ (C′, S′, B′, I′). Let U ≡ (C, S, B, I) be an initial state which dominates U′, i.e., U ≥ U′. Then there exists an admissible strategy F so that

  E[I_∞(U, F)] ≥ E[I_∞(U′, F′)],

where I_∞(U, F) denotes I_∞ assuming that the R-process is initialized with U and uses the strategy F.

Proof: Given U′ and the admissible strategy F′, we construct the admissible strategy F in the following way. The process with initial state U uses strategy F′ but applies it to the pseudo state U′. Further, it updates its pseudo state according to the realization of the process and bases its future decisions on strategy F′ applied to this evolving pseudo state. Call the phase of the process until the pseudo state has reached S′ = 0 the "initial" phase of the process. At that point the (U, F) process switches to any admissible strategy based on its real state.
To be concrete, assume that it uses a greedy strategy at this point. This means that the process performs a boundary step any time it is admissible.

In order to show the desired inequality on the expected values, we couple the processes (U′, F′) and (U, F). We imagine that we run both processes in parallel and that they experience exactly the same randomness (this refers to the randomness contained in the choice of the transitions as well as any randomness which might be used by the strategy). Assume for the moment that strategy F is admissible. In the initial phase of the algorithm (until the (U′, F′) process stops because S′_t = 0), the (U, F) process proceeds in lock-step with the (U′, F′) process. Since S_0 ≥ S′_0 and since S_t − S_0 = S′_t − S′_0, it follows that S_t ≥ S′_t in this initial phase. This means that the process (U, F) never stops before the process (U′, F′). Further, I_0 ≥ I′_0, I_t − I_0 = I′_t − I′_0, and I_t is a non-decreasing function. It follows that for every realization I_∞(U, F) ≥ I_∞(U′, F′). This implies, a fortiori, the claimed inequality on the expected values.

Let us now show that the strategy F is admissible. We claim that for all t ∈ ℕ

  S_t + B_t + I_t − γrC_t ≥ S′_t + B′_t + I′_t − γrC′_t.   (24)

By definition this is true for t = 0. But by construction of the coupling, S_t − S_0 = S′_t − S′_0, I_t − I_0 = I′_t − I′_0, B_t − B_0 = B′_t − B′_0, and C_t − C_0 = C′_t − C′_0. It follows that the left-hand side in (24) is always at least as large as the right-hand side. Therefore, if F′ is admissible then so is F.

From Table III we see that for regular extend and boundary steps there are several possible outcomes. For each of these two step types, there is a single outcome (highlighted in the table) whose resulting state dominates those of the other outcomes.
Since we are interested in an upper bound on I_∞, thanks to the above lemma we can restrict our attention to these dominating steps. Consider the greedy strategy, call it F_g. Under this greedy strategy, whenever (23) is true we perform a boundary step.

Lemma 31 (Domination of the Greedy Process): For a given initial state U = (C_0, S_0, B_0, I_0) and any admissible strategy F, we have E[I_∞(U, F_g)] ≥ E[I_∞(U, F)].

Proof: Again we construct a coupling between the processes (U, F) and (U, F_g). As remarked above, for both processes we can assume that the state transitions are the ones indicated in bold in Table III. The only randomness therefore resides in whether for a regular step the process extends or prunes and, possibly, in the randomness used for the strategy F. There is no randomness involved in any boundary steps. The coupling consists in coupling, for each regular step i, i ∈ ℕ, the outcomes of these regular steps. In more detail, if for the process (U, F) the i-th regular step results in a pruning then the same occurs for the i-th regular step of the process (U, F_g). By construction, for all regular steps the change of S, I, B, and C is the same for both processes.

Assume we measure "time" not in the absolute number of steps taken but by the number of regular steps taken. Consider a process (U, F) and assume that this process is still "alive" at time t. Then its state U_t only depends on the realization of the random variables during the regular steps and on the total number of boundary steps taken, but it does not depend on the order of the steps taken.
Since the process (U, F_g) has by definition done at least as many boundary steps as the process (U, F), it further follows that if we compare the two processes at "time" i, corresponding to i regular steps, then the number of survivors (and also the number of internal nodes) for (U, F_g) is at least as large as the number of survivors for (U, F). Therefore, if at this time the process (U, F) is still alive then so is the process (U, F_g), and the latter has at least as many accumulated internal variable nodes as the former. This proves our claim.

Since we are interested in upper bounding E[I_∞], it is sufficient to bound E[I_∞(U, F_g)], which is done in the next lemma. We use large-deviation properties of the sub-critical Galton–Watson process. For the convenience of the reader we provide this estimate in Appendix E.

Lemma 32 (Birth-Death Process): Let the initial state be U = (0, S_0, 0, 0). Fix a strictly positive δ, 0 < δ < 1/(2(r−1)), so that (1−δ)/(2δ) ∈ ℕ, and let γ = 1 − (1+δ)/r. For all ǫ < 1/(2(r−1)) there exist constants c = c(l, r, ǫ, δ), c > 1, and c′ = c′(l, r, ǫ, δ) > 0 so that

  P{I_∞(U, F_g) ≥ cS_0} ≤ e^{−c′S_0}.

Proof: Since condition (23) is satisfied in the beginning, the greedy R-process starts with some boundary steps. We claim that after exactly ⌊S_0/(1−δ)⌋ such boundary steps the condition (23) is for the first time no longer fulfilled. To see this, ignore the integer constraint for a moment. At the beginning of the process the condition (23) reads 0 ≤ S_0 − (1−δ). After S_0/(1−δ) boundary steps this condition is transformed to

  γr·S_0/(1−δ) ≤ S_0 + (S_0/(1−δ))(r−2) − (1−δ),

which is equivalent to 0 ≤ −(1−δ). We see that the inequality is no longer fulfilled, and it is easy to check that this is the first time that it is no longer fulfilled.
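The count ⌊S_0/(1−δ)⌋ can be checked numerically. The sketch below is our own illustration; it assumes the reading of condition (23) implied by the displays above, namely γrC_t ≤ S_t + B_t + I_t − (1−δ), with each boundary step adding 1 to C and r−2 to S + B (as in Table III):

```python
import math

def first_failing_boundary_step(S0, r, delta):
    """Return the first k for which condition (23) fails after k boundary steps.

    Assumed form of (23): gamma*r*C <= S + B + I - (1 - delta), with
    gamma = 1 - (1 + delta)/r, so gamma*r = r - 1 - delta. Each boundary
    step adds 1 to C and r - 2 to S + B.
    """
    gamma_r = r - 1 - delta
    k = 0
    while gamma_r * k <= S0 + k * (r - 2) - (1 - delta):
        k += 1
    return k

# Agrees with the claimed count floor(S0 / (1 - delta)).
assert first_failing_boundary_step(37, r=6, delta=0.05) == math.floor(37 / (1 - 0.05))
```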
After the initial boundary steps, the greedy strategy performs regular steps until exactly (1−δ)/(2δ) regular extend steps are performed, followed by exactly one boundary step. This sequence is then repeated. (Note that by our assumption (1−δ)/(2δ) ∈ ℕ.) To see this, note that each regular extend step increases the right-hand side of (22) by 2(r−1) and the left-hand side by 2(r−1−δ). Further, each boundary step increases the left-hand side by r−1−δ and the right-hand side by r−2. Since

  ((1−δ)/(2δ))·2(r−1−δ) + (r−1−δ) = ((1−δ)/(2δ))·2(r−1) + (r−2),

we see that after one such sequence of first (1−δ)/(2δ) regular extend steps followed by a boundary step the inequality is unchanged (up to an added constant). (A regular prune step does not change the condition (22).)

Since the randomness is contained only in the regular steps, we can model the process as consisting of only regular steps. To include the effect of boundary steps, we alter the outcome of the regular extend step as follows. From Table III note that for each regular extend step we increase S by 2r−3 and I by 1. We include the effect of a boundary step by changing this to an increment of 2r−3 + (r−2)·2δ/(1−δ) for S and 1 + 2δ/(1−δ) for I, respectively. Now this process is a standard birth and death process. Recall that we have ǫ < 1/(2(r−1)) and δ < 1/(2(r−1)). Hence, the expected increase in S at each step is ǫ(2r−3 + (r−2)·2δ/(1−δ)). This is strictly less than 1. As discussed in more detail in Appendix E, this shows that, except for an exponentially small probability, this process stops for t ≤ cS_0 for some appropriate constant c > 1. This proves our lemma, since in each step we create at most 1 + 2δ/(1−δ) internal variables.

Using Lemma 32 we bound the number of variables marked by the marking process as follows.
Lemma 33 (Upper Bound): Let γ = 1 − (1+δ)/r for some 0 < δ < 1/(2(r−1)). Fix G and W such that W ⊆ G and G ∈ X(l, r, α, γ). Let c = c(l, r, ǫ, δ) be the constant appearing in Lemma 32. If |W| ≤ lαn/(6c(r−1)r) then

  lim_{n→∞} (1/n) E_{E′}[M(G, (W, E′), W)] ≤ αl/r.

Proof: Let m = (l/r)n. The maximum number of surviving edges coming out of the witness W is 3(r−1)|W|. Let this be S_0. Consider the R-process with initial state U = (0, S_0, 0, 0) and the greedy strategy F_g. From Lemma 32 there exists a strictly positive constant c′ such that P{I_∞(U, F_g) ≥ cS_0} ≤ e^{−c′S_0}. The bound on |W| in the hypothesis implies that cS_0 = 3c(r−1)|W| ≤ (α/2)m. From Table III we see that any time the number of internal variable nodes is increased by 1, the number of check nodes increases by at most 2. Therefore, I_∞(U, F_g) ≤ cS_0 implies that C_∞ ≤ 2cS_0 ≤ αm. This shows that the expansion property is satisfied for the whole duration of the process. Hence, I_∞(U, F_g) is a valid upper bound for M(G, (W, E′), W). Let M(E′) denote M(G, (W, E′), W). Since M(E′) counts the initial |W| < (α/2)m variables present in W along with the internal variables created,

  P{M(E′) ≥ αm} ≤ P{I_∞(U, F_g) ≥ (α/2)m} ≤ e^{−c′S_0}.

Therefore,

  E_{E′}[M(G, (W, E′), W)] ≤ P{M(E′) ≤ αm}·αm + P{M(E′) ≥ αm}·n ≤ (αl/r)·n + e^{−c′αln/(2r)}·n.

The lemma is proved by taking the limit n → ∞.

G. Putting It All Together

In this section we prove Lemma 22 using the results developed in the previous sections.

Proof of Lemma 22: Recall that we consider an (l = 3, r)-regular ensemble and that 0 ≤ ǫ < ǫ^{LGalB}. Fix 0 < δ < 1/(2(r−1)) and define γ = 1 − (1+δ)/r. Let α_max(γ) be the constant defined in Theorem 2. Note that α_max(γ) is strictly positive since δ is strictly positive. Choose 0 < α < α_max(γ).
Let X(l, r, α, γ) denote the set of graphs {G ∈ LDPC(n, l, r) : G is an (l, r, α, γ) right expander}. From Theorem 2 we know that

  P{G ∉ X} = o_n(1).   (25)

Let c = c(l, r, ǫ, δ) be the coefficient appearing in Lemma 32 and define θ = lα/(6c(r−1)r). From Lemma 26 we know that there exists an iteration ℓ such that

  lim_{n→∞} (1/n) E[|W(G, E, ℓ)|] ≤ θ²/2.   (26)

Let n(θ) be such that for n ≥ n(θ), E[|W(G, E, ℓ)|] ≤ θ²n. Using Lemma 28, and splitting the expectation over X and its complement, we get

  E[M(G, E, ℓ)] ≤ Σ_{W: |W|≤θn} Σ_{G: G∈X} P{G} P{E_{G,W}} E_{E′}[M(G, (W, E′), W)]
               + Σ_{W: |W|≤θn} Σ_{G: G∉X} P{G} P{E_{G,W}} E_{E′}[M(G, (W, E′), W)] + θn.

Consider the first term. From Lemma 33 we know that

  E_{E′}[M(G, (W, E′), W)] ≤ (αl/r)·n + o(n).   (27)

Consider the second term. Bound the expectation by n and remove the restriction on the size of the witness. This gives the bound

  Σ_W Σ_{G: G∉X} P{G} P{E_{G,W}}·n.

Switch the two summations and use the fact that, for a given G, each E realization maps to only one W. We get

  Σ_{G: G∉X} P{G} Σ_{W: W⊆G} P{E_{G,W}} = Σ_{G: G∉X} P{G} = P{G ∉ X} = o_n(1),   (28)

where the last step uses (25). From (27) and (28) we conclude that for n ≥ n(θ),

  (1/n) E[M(G, E, ℓ)] ≤ Σ_{W: |W|≤θn} Σ_{G: G∈X} P{G} P{E_{G,W}} (αl/r + o_n(1)) + lα/(6c(r−1)r)
                      ≤ (l/r + l/(6c(r−1)r))·α + o_n(1).

If we now let n tend to infinity then we get

  lim_{n→∞} lim sup_{ℓ→∞} E[P_b^{LGalB}(G, ǫ, ℓ)] ≤ lim_{n→∞} (1/n) E[M(G, E, ℓ)] ≤ (l/r + l/(6c(r−1)r))·α.

Since this conclusion is valid for any 0 < α ≤ α_max(γ), it follows that lim_{n→∞} lim sup_{ℓ→∞} E[P_b^{LGalB}(G, ǫ, ℓ)] = 0. □

H. Extensions

1) GalB and l ≥ 4: Note that for l ≥ 5 the result is already implied by Theorem ??. For l = 4 the proof is easily adapted from the one for l = 3.
The only difference lies in the way the size of the witness is computed (Section III-D) and the analysis of the birth-death process (Section III-F).

2) MS and BSC: The proofs can also be extended to other decoders. For a given MP decoder, the idea is to define an appropriate linearized version of the decoder (LMP) and go through the whole machinery as done for GalB. For example, consider the MS(M) decoder and transmission over the BSC(ǫ). The channel realizations are mapped to {±1}. Let M ∈ ℕ; the message alphabet is M = {−M, . . . , M}. For transmission of the all-one codeword, the linearized version of the decoder (LMS(M)) is defined as in Definition 24: i.e., at the check node the outgoing message is the minimum of the incoming messages, and the variable node rule is unchanged. One can check that the LMS algorithm defined above is monotonic with respect to the input log-likelihoods at both the variable and check nodes, and that the number of errors of the MS decoder can be upper bounded by the errors of the LMS decoder.

Lemma 34 (MS(M) Decoder, BSC and l ≥ 3): Consider the (l, r) ensemble and transmission over the BSC(ǫ). Let ǫ^{LMS} be the channel parameter below which p^{(∞)}_{M} = 1. If ǫ < ǫ^{LMS}, then

  lim_{n→∞} lim sup_{ℓ→∞} E[P_b^{MS}(G, ǫ, ℓ)] = 0.

Example 35 (LMS(2) and BSC): Consider communication using the LDPC(3, 6) ensemble over the BSC(ǫ) and decoding using the MS(2) algorithm. For this setup, the DE threshold is 0.063. The linearized version of this decoder has p^{(∞)}_{2} = 1 for ǫ < 0.031. Therefore, from Lemma 34, the limits can be exchanged for such ǫ.

The proof follows by showing results similar to Lemmas 26 and 33. Here we give a brief explanation of how to adapt the proof to the case M = 2 and l = 3. For a given p > 0, we first perform ℓ(p) iterations such that p^{(ℓ)}_{{−M,...,M−1}} ≤ p.
We start the marking process from all the edges with messages in {−M, . . . , M−1} and their witness. In this case the witness consists of edges which send messages in {−M, . . . , M−1}. To show that the size of the witness goes to zero, consider DE equations similar to those in Appendix B. Let p^{μ}_ℓ(x) denote a polynomial with non-negative coefficients where the coefficient in front of x^i denotes the probability that the message emitted by a variable node at iteration ℓ is μ and that the witness (of depth ℓ) for this edge has size i. Let q^{μ}_ℓ(x) denote the equivalent quantity for messages emitted at check nodes. Then the DE equations for this augmented system are given by:

  p^{−1}_1(x) = ǫx,  p^{+1}_1(x) = ǭx,

  p^{+1}_ℓ(x) = ǫx((q^{+1}_{ℓ−1}(x))² + 2q^{+2}_{ℓ−1}(1)q^{0}_{ℓ−1}(x))
              + ǭx(2q^{+2}_{ℓ−1}(1)q^{−2}_{ℓ−1}(x) + 2q^{+1}_{ℓ−1}(x)q^{−1}_{ℓ−1}(x) + (q^{0}_{ℓ−1}(x))²),

  p^{0}_ℓ(x) = ǫx(2q^{+2}_{ℓ−1}(1)q^{−1}_{ℓ−1}(x) + 2q^{+1}_{ℓ−1}(x)q^{0}_{ℓ−1}(x))
             + ǭx(2q^{+1}_{ℓ−1}(x)q^{−2}_{ℓ−1}(x) + 2q^{0}_{ℓ−1}(x)q^{−1}_{ℓ−1}(x)),

  p^{−1}_ℓ(x) = ǭx((q^{−1}_{ℓ−1}(x))² + 2q^{−2}_{ℓ−1}(x)q^{0}_{ℓ−1}(x))
              + ǫx(2q^{+2}_{ℓ−1}(1)q^{−2}_{ℓ−1}(x) + 2q^{+1}_{ℓ−1}(x)q^{−1}_{ℓ−1}(x) + (q^{0}_{ℓ−1}(x))²),

  p^{−2}_ℓ(x) = ǫx·2(q^{−2}_{ℓ−1}(x)(q^{+1}_{ℓ−1}(x) + q^{0}_{ℓ−1}(x) + q^{−1}_{ℓ−1}(x)))
              + ǫx(2q^{0}_{ℓ−1}(x)q^{−1}_{ℓ−1}(x) + (q^{−1}_{ℓ−1}(x))² + (q^{−2}_{ℓ−1}(x))²)
              + ǭx(2q^{−1}_{ℓ−1}(x)q^{−2}_{ℓ−1}(x) + (q^{−2}_{ℓ−1}(x))²),

  q^{μ}_ℓ(x) = (p^{μ}_{ℓ−1}(x)/p^{μ}_{ℓ−1}(1))·((1 − Σ_{i=−M}^{μ−1} p^{i}_{ℓ−1}(1))^{r−1} − (1 − Σ_{i=−M}^{μ} p^{i}_{ℓ−1}(1))^{r−1}).

Using the hypothesis p^{(∞)}_{M} = 1 and doing an analysis similar to that in Appendix B, we can show that the size of the witness behaves as o_ℓ(1). In the corresponding birth-death process we have to keep track of the size of the set of edges with messages in {−M, . . . , M−1}. Similar results can be obtained for the BP(M) decoder and for channels with continuous outputs. But the analysis of these decoders is more complicated because we have to deal with densities of messages.

3) MS(M) and continuous channel: Consider transmission through BMS channels with bounded output log-likelihoods and decoding using the MS(M) decoder. For this setup it is tempting to conjecture that the proofs can be extended using FKG inequalities for continuous lattices [6].

IV. CONCLUSION

We have shown two approaches for solving the problem of limit exchange below the DE threshold. The first one, based solely on the expansion property of the graph, helps in proving the result for a large class of MP decoders, but only if the degree is relatively large. To prove the result for smaller degrees one has to include the role of the channel realizations. The second approach accomplishes this in some cases.

In this paper we only considered channel parameters below the DE threshold. But the regime above this threshold is equally interesting. One important application of proving the exchange of limits in this regime is the finite-length analysis via a scaling approach [7], since the computation of the scaling parameters heavily depends on the fact that this exchange is permissible.

ACKNOWLEDGMENT

We would like to thank A. Montanari for suggesting to directly apply the FKG inequalities in the proof of Lemma 27 instead of the original more elaborate construction. The work presented in this paper is partially supported by the National Competence Center in Research on Mobile Information and Communication Systems (NCCR-MICS), a center supported by the Swiss National Science Foundation under grant number 5005-67322.

APPENDIX A.
Expansion Argument for Block Error Probability

The following theorem is a modified version of a theorem by Burshtein and Miller [5].

Theorem 36 (Expansion): Consider an (l, r, α, γ) left expander. Assume that 0 ≤ β ≤ 1 is such that β(l−1) ∈ ℕ and that β(l−1)/l ≤ 2γ − 1. If at some iteration ℓ the number of bad variable nodes is less than αn/(lr) then the MP algorithm will decode successfully.

Proof: Let B_ℓ denote the bad set in iteration ℓ. We claim that

  γl|B_ℓ ∪ B_{ℓ+1}| ≤^{(i)} |N(B_ℓ ∪ B_{ℓ+1})| ≤^{(ii)} |N(B_ℓ)| + β(l−1)|B_{ℓ+1}\B_ℓ|.   (29)

Step (ii) follows from the fact that each variable in B_{ℓ+1}\B_ℓ must be connected to at least l − β(l−1) checks in the set N(B_ℓ), since otherwise this variable would be good and would not be in B_{ℓ+1}. Therefore the number of edges coming out of B_{ℓ+1}\B_ℓ that do not connect to N(B_ℓ) is at most β(l−1)|B_{ℓ+1}\B_ℓ|. Thus the number of neighbors of B_{ℓ+1}\B_ℓ that are not already neighbors of B_ℓ is at most β(l−1)|B_{ℓ+1}\B_ℓ|. Consider now step (i). This step follows in a straightforward fashion from the expansion property, since by assumption |B_ℓ| ≤ αn/(lr), so that |B_ℓ ∪ B_{ℓ+1}| < αn.

Let T be the set of check nodes that are connected to B_ℓ ∩ B_{ℓ+1} but not connected to B_ℓ\B_{ℓ+1}. Suppose an edge from a check node in T is carrying a bad message. Then this check must be connected to one more variable in B_ℓ ∩ B_{ℓ+1}, because it is not connected to B_ℓ\B_{ℓ+1} and thus cannot get a bad message from B_ℓ\B_{ℓ+1}. For each variable in B_ℓ ∩ B_{ℓ+1}, at least l − β(l−1) edges must carry bad messages, and hence it can connect to at most (l − β(l−1))/2 + β(l−1) = l/2 + β(l−1)/2 check nodes. Therefore we have

  |N(B_ℓ)| ≤ l|B_ℓ\B_{ℓ+1}| + |T|,
  |N(B_ℓ)| ≤ l|B_ℓ\B_{ℓ+1}| + ((1 + β(l−1)/l)/2)·l|B_ℓ ∩ B_{ℓ+1}|.   (30)

Using equations (29) and (30), we get

  γl|B_{ℓ+1} ∪ B_ℓ| ≤ l|B_ℓ\B_{ℓ+1}| + ((1 + β(l−1)/l)/2)·l|B_{ℓ+1} ∩ B_ℓ| + β(l−1)|B_{ℓ+1}\B_ℓ|,

  γ|B_{ℓ+1} ∩ B_ℓ| + γ|B_ℓ\B_{ℓ+1}| + γ|B_{ℓ+1}\B_ℓ| ≤ |B_ℓ\B_{ℓ+1}| + ((1 + β(l−1)/l)/2)|B_{ℓ+1} ∩ B_ℓ| + β((l−1)/l)|B_{ℓ+1}\B_ℓ|,

  |B_{ℓ+1}\B_ℓ| ≤ ((1 − γ)/(γ − β(l−1)/l))|B_ℓ\B_{ℓ+1}| + ((1 + β(l−1)/l − 2γ)/(2(γ − β(l−1)/l)))|B_ℓ ∩ B_{ℓ+1}|.

The coefficient of the first term on the right-hand side is less than 1 and the coefficient of the second term is negative, and hence

  |B_{ℓ+1}\B_ℓ| < |B_ℓ\B_{ℓ+1}|.

B. Size of Witness

Proof of Lemma 26: Let G be a graph and let E be the noise realization. Assume that we perform ℓ iterations. Let W_{e}(G, E, ℓ) denote the witness of edge e. Then

  E[|W(G, E, ℓ)|] ≤ Σ_{i=1}^{ln} E[|W_{e_i}(G, E, ℓ)|] = nl·E[|W_{e_1}(G, E, ℓ)|].

It remains to compute the expected size of the witness in the limit of n tending to infinity for a fixed ℓ. This can be accomplished by DE. Let x_ℓ denote the probability of an edge being in error according to DE. Let p_ℓ(x) denote a polynomial with non-negative coefficients where the coefficient in front of x^i denotes the probability that the message emitted by a variable node at iteration ℓ is bad and that the witness (of depth ℓ) for this edge has size i (i variable nodes). Let q_ℓ(x) denote the equivalent quantity for messages emitted at check nodes. The DE equations for this augmented system are:

  p_1(x) = ǫx,
  p_ℓ(x) = ǫ(2 − q_ℓ(1))q_ℓ(x)x + ǭq_ℓ(x)²x,
  q_ℓ(x) = (p_{ℓ−1}(x)/p_{ℓ−1}(1))·(1 − (1 − p_{ℓ−1}(1))^{r−1}).

The initialization p_1(x) = ǫx reflects the fact that with probability ǫ a variable-to-check message is in error in iteration 1 and that its associated witness of depth 1 consists only of the attached variable (hence the x). The recursion for q_ℓ(x) is also straightforward.
With probability 1 − (1 − p_{ℓ−1}(1))^{r−1} at least one of the r−1 incoming messages at a check node is bad, and in this case the distribution of the size of the attached witness is p_{ℓ−1}(x)/p_{ℓ−1}(1).

Let us now look at the recursion for p_ℓ(x). There are three contributions: (i) Suppose that the variable has a bad received value and that exactly one of the incoming edges is bad; this happens with probability ǫ·2(1 − q_ℓ(1))q_ℓ(1), and in this case the distribution of the size of the witness attached to this edge is q_ℓ(x)x/q_ℓ(1), where the extra x accounts for the attached variable node. (ii) Suppose that the variable has a bad received value and that both incoming edges are bad; this happens with probability ǫq_ℓ(1)², and in this case the distribution of the size of the witness attached to this edge is q_ℓ(x)x/q_ℓ(1). (iii) Finally, suppose that the variable has a good received value and that both incoming edges are bad; this happens with probability ǭq_ℓ(1)², and in this case the distribution of the size of the witness attached to this edge is q_ℓ(x)²x/q_ℓ(1)².

Note that we get standard DE by setting x = 1, i.e., we have x_ℓ = p_ℓ(1). We want to show that p′_ℓ(1) (this is the expected size of the witness in the limit of infinite blocklength) converges to zero as a function of ℓ. The augmented DE equation is difficult to handle, so let us first write down a scalar version that tracks the expected value. Define

  β_ℓ = (1 − (1 − p_ℓ(1))^{r−1})/p_ℓ(1).

Then we get

  p_ℓ(x) = ǫ(2 − q_ℓ(1))β_{ℓ−1}p_{ℓ−1}(x)x + ǭβ²_{ℓ−1}p_{ℓ−1}(x)²x.

Differentiating both sides with respect to x gives

  p′_ℓ(x) = ǫβ_{ℓ−1}(2 − q_ℓ(1))(p′_{ℓ−1}(x)x + p_{ℓ−1}(x)) + ǭβ²_{ℓ−1}p_{ℓ−1}(x)² + ǭβ²_{ℓ−1}·2p_{ℓ−1}(x)p′_{ℓ−1}(x)x.

Now substitute x = 1. Recall that x_ℓ = p_ℓ(1) and define p_ℓ = p′_ℓ(1).
Further, bound 2 − q_ℓ(1) by 2 and β_ℓ by r−1. This gives the inequality

  p_ℓ ≤ 2ǫ(r−1)p_{ℓ−1} + 2ǫ(r−1)x_{ℓ−1} + ǭ(r−1)²x²_{ℓ−1} + 2ǭ(r−1)²x_{ℓ−1}p_{ℓ−1}.

We claim that ℓx_ℓ ≤ p_ℓ. This is true since x_ℓ is the probability of a bad message, whereas p_ℓ is the expected size of the witness, and the witness size is always at least ℓ if the message is bad. Therefore,

  p_ℓ/p_{ℓ−1} ≤ 2ǫ(r−1) + 2ǫ(r−1)x_{ℓ−1}/p_{ℓ−1} + ǭ(r−1)²x²_{ℓ−1}/p_{ℓ−1} + 2ǭ(r−1)²x_{ℓ−1}
              ≤ 2ǫ(r−1) + 2ǫ(r−1)/(ℓ−1) + 3ǭ(r−1)²x_{ℓ−1}.

Now note that x_ℓ tends to zero since ǫ < ǫ^{LGalB}. Therefore, if 2ǫ(r−1) < 1 then p_ℓ/p_{ℓ−1} < 1 for ℓ sufficiently large. The stability condition implies ǫ^{LGalB} < 1/(2(r−1)). Therefore, for ǫ < ǫ^{LGalB}, p_ℓ tends to zero exponentially fast with increasing ℓ. □

C. Randomization

Proof of Lemma 28: We have

  E[M(G, E, ℓ)] = Σ_W E[M(G, E, W)·1{W(G,E,ℓ)=W}]
               = Σ_{W,G} P{G} E_E[M(G, E, W)·1{W(G,E,ℓ)=W}]
               = Σ_{W,G} P{G} E_E[M(G, E, W)·1{E ∈ E_{G,W}}].

For all E ∈ E_{G,W}, the channel values on W are fixed to those appearing in the witness, which is also denoted by W. Recall that E′_{G,W} is the projection of E_{G,W} on G\W and E′ ∈ E′_{G,W}. The above expectation is equivalent to

  E_E[M(G, (W, E′), W)·1{(W, E′) ∈ E_{G,W}}] = P(W)·E_{E′}[M(G, (W, E′), W)·1{E′ ∈ E′_{G,W}}],

where P(W) is the probability of the channel values on W. This implies P(W)P(E′_{G,W}) = P(E_{G,W}). Using (20) we bound

  E_{E′}[M(G, (W, E′), W)·1{E′ ∈ E′_{G,W}}] ≤ P(E′_{G,W})·E_{E′}[M(G, (W, E′), W)].
Therefore,

  E[M(G, E, ℓ)] ≤ Σ_{W,G} P{G} P{E_{G,W}} E_{E′}[M(G, (W, E′), W)]
               ≤ Σ_{W: |W|≤θn, G} P{G} P{E_{G,W}} E_{E′}[M(G, (W, E′), W)]
               + Σ_{W: |W|≥θn, G} P{G} P{E_{G,W}} E_{E′}[M(G, (W, E′), W)].

Consider the second term in the last line. Bound the expectation by n. This yields

  Σ_{W: |W|≥θn, G} P{G} P{E_{G,W}}·n.

If W ⊄ G, then E_{G,W} is empty. Therefore the above bound is equivalent to

  n Σ_{W: |W|≥θn} E[1{W⊆G}·1{E ∈ E_{G,W}}] = n Σ_{W: |W|≥θn} E[1{W(G,E,ℓ)=W}] = n·P{|W(G, E, ℓ)| ≥ θn}.

By assumption, E[|W(G, E, ℓ)|] ≤ θ²n. The Markov inequality therefore shows that P{|W(G, E, ℓ)| ≥ θn} ≤ θ. □

D. FKG Inequality

Consider the Hamming space {0,1}^n. For x, y ∈ {0,1}^n define the following partial order: x ≤ y iff x_i ≤ y_i for all i. Define x_≤ as

  x_≤ = {y : y ∈ {0,1}^n, y ≤ x},   (31)

and x ∨ y and x ∧ y as

  (x ∨ y)_i = 0 if x_i = y_i = 0, and 1 otherwise,   (32)
  (x ∧ y)_i = 1 if x_i = y_i = 1, and 0 otherwise.   (33)

We say that a function f : {0,1}^n → ℝ is monotonically increasing (decreasing) if f(x) ≥ f(y) whenever x ≥ y (x ≤ y).

Lemma 37 (FKG Inequality [8]): Let P{·} be a probability measure on {0,1}^n such that P{x}P{y} ≤ P{x ∨ y}P{x ∧ y}. Let f and g be real-valued non-negative functions on {0,1}^n. If f and g are either both monotonically increasing or both monotonically decreasing then

  E[f(x)g(x)] ≥ E[f(x)]E[g(x)].

E. Birth and Death Process

Consider the following birth and death process. We start with X_0 = a > 0. At step t, t ∈ ℕ, if X_{t−1} < 1 then we stop the process and define X_{t′} = X_{t′−1} for t′ > t. Otherwise we decrease X_{t−1} by 1 and add Y_t, where the sequence {Y_t}_{t≥1} is iid. In this way, as long as X_{t−1} ≥ 1,

  X_t = X_{t−1} − 1 + Y_t.
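The sub-critical behavior that Lemma 38 quantifies is easy to observe empirically. The following Monte Carlo sketch, with illustrative parameter values of our own choosing (not from the paper), estimates the tail probability of the stopping time:

```python
import random

def stopping_time(a, p, mu, max_steps=10**6):
    """Simulate X_t = X_{t-1} - 1 + Y_t until X_t < 1; return the step count.

    Y_t = mu/p with probability p, else 0, so E[Y_t] = mu (as in Lemma 38).
    """
    X, t = float(a), 0
    while X >= 1 and t < max_steps:
        X += -1.0 + (mu / p if random.random() < p else 0.0)
        t += 1
    return t

random.seed(0)
a, p, mu = 20, 0.5, 0.8              # drift per step is mu - 1 = -0.2
beta = 2.0 / (1 - mu)                # comfortably above 1/(1 - mu)
runs = 2000
tail = sum(stopping_time(a, p, mu) > beta * a for _ in range(runs)) / runs
# `tail` estimates P{T > beta*a}; empirically it is small, in line with
# the exponential bound stated in Lemma 38.
```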
This process is equivalent to the standard birth and death process if Y_t takes non-negative integer values. In this case, the step described above corresponds to choosing a member of the population which then creates Y_t offspring and dies. Let T denote the stopping time, i.e., T = min{t : X_t < 1}.

Lemma 38 (Birth-Death): Fix p ∈ (0, 1] and 0 < μ < 1. Consider a birth and death process with X_0 = a ∈ ℕ and

  Y_i = μ/p with probability p, and Y_i = 0 with probability 1−p,

so that E[Y_i] = μ. Then, for βa ∈ ℕ,

  P{T > βa} ≤ e^{−a·c(p,μ,β)},

where c(p, μ, β) > 0 for β > 1/(1−μ).

Proof: Let b = βa. Note that P{T > b} ≤ P{X_b ≥ 1} ≤ P{X_b ≥ 0}. Let Ỹ_t = Y_t − 1. We have

  X_b = X_{b−1} + Ỹ_b = X_{b−2} + Ỹ_{b−1} + Ỹ_b = a + Σ_{i=1}^{b} Ỹ_i.

Therefore, for any s > 0,

  P{T > b} ≤ P{Σ_{i=1}^{b} Ỹ_i ≥ −a} = P{e^{s Σ_{i=1}^{b} Ỹ_i} ≥ e^{−as}} ≤ e^{as}·E[e^{sỸ}]^b = e^{as}·((1−p)e^{−s} + p·e^{(μ/p − 1)s})^b,

where the last inequality is Markov's. First consider the case μ ≥ p. Set

  s = (p/μ)·ln( (β−1)(1−p) / (p + β(μ−p)) ),

which is strictly positive since μ ≥ p and β > 1/(1−μ). Set β = 1/(1−μ−ξ), where ξ > 0. With this choice we get

  P{T > b} ≤ [ (μ(1−p)/(μ(1−p) − ξp)) · ((μ(1−p) − ξp)/(μ(1−p) + ξ(1−p)))^{p(μ+ξ)/μ} ]^b.

For ξ = 0 the term inside the square brackets is 1. If we take the derivative of the expression inside the square brackets with respect to ξ we get

  −(p/(μ+ξ)) · ((μ+ξ)(1−p)/(μ(1−p) − pξ))^{1 − p(μ+ξ)/μ} · log( (μ+ξ)(1−p)/(μ(1−p) − ξp) ).

For ξ > 0 and μ > p this is strictly negative, which proves our claim. Now consider the case μ < p. For 1/(1−μ) < β < p/(p−μ) the above still applies. For β ≥ p/(p−μ), the probability is 0. This is because in each step we can add at most μ/p − 1. Therefore, for t ≥ (p/(p−μ))a + 1,

  X_t ≤ a + ((p/(p−μ))a + 1)(μ/p − 1) < 0.

F. Concentration

Theorem 39 (Concentration Theorem [1, p. 222]): Let G, chosen uniformly at random from LDPC(n, λ, ρ), be used for transmission over a BMS(ǫ) channel. Assume that the decoder performs ℓ rounds of message-passing decoding and let P_b^{MP}(G, ǫ, ℓ) denote the resulting bit error probability. Then, for any given δ > 0, there exists an α > 0, α = α(λ, ρ, δ), such that

  P{|P_b^{MP}(G, ǫ, ℓ) − E_{LDPC(n,λ,ρ)}[P_b^{MP}(G, ǫ, ℓ)]| > δ} ≤ e^{−αn}.

REFERENCES

[1] T. Richardson and R. Urbanke, Modern Coding Theory. Cambridge University Press, 2008.
[2] S.-Y. Chung, G. D. Forney, Jr., T. Richardson, and R. Urbanke, "On the design of low-density parity-check codes within 0.0045 dB of the Shannon limit," IEEE Communications Letters, vol. 5, no. 2, pp. 58–60, Feb. 2001.
[3] R. G. Gallager, "Low-density parity-check codes," IRE Transactions on Information Theory, vol. 8, pp. 21–28, Jan. 1962.
[4] R. J. McEliece, E. Rodemich, and J.-F. Cheng, "The turbo decision algorithm," in Proc. of the Allerton Conf. on Commun., Control, and Computing, Monticello, IL, USA, 1995.
[5] D. Burshtein and G. Miller, "Expander graph arguments for message-passing algorithms," IEEE Trans. Inform. Theory, vol. 47, no. 2, pp. 782–790, Feb. 2001.
[6] C. J. Preston, "A generalization of the FKG inequalities," Commun. Math. Phys., vol. 36, pp. 233–241, 1974.
[7] A. Amraoui, A. Montanari, T. Richardson, and R. Urbanke, "Finite-length scaling for iteratively decoded LDPC ensembles," in Proc. of the Allerton Conf. on Commun., Control, and Computing, Monticello, IL, USA, Oct. 2003.
[8] C. M. Fortuin, P. W. Kasteleyn, and J. Ginibre, "Correlation inequalities on some partially ordered sets," Commun. Math. Phys., vol. 22, pp. 89–103, 1971.
