Bethe Free Energy Approach to LDPC Decoding on Memory Channels

SUBMITTED TO IEEE TRANSACTIONS ON COMMUNICATIONS

Jaime A. Anguita, Member, IEEE, Michael Chertkov, Member, IEEE, Mark A. Neifeld, Member, IEEE, and Bane Vasic, Senior Member, IEEE

J. A. Anguita was with the Department of Electrical and Computer Engineering, University of Arizona, AZ, USA. He is now with the School of Engineering, Universidad de los Andes, Santiago, Chile. E-mail: janguita@miuandes.cl. M. Chertkov is with the Complex Systems Group, Los Alamos National Laboratory, NM. M. Neifeld and B. Vasic are with the Department of Electrical and Computer Engineering, University of Arizona, AZ.

Abstract: We address the problem of joint sequence detection in partial-response (PR) channels and decoding of low-density parity-check (LDPC) codes. We model the PR channel and the LDPC code as a combined inference problem. We present for the first time the derivation of the belief propagation (BP) equations that allow the simultaneous detection and decoding of an LDPC codeword in a PR channel. To accomplish this we follow an approach from statistical mechanics, in which the Bethe free energy is minimized with respect to the beliefs on the nodes of the PR-LDPC graph. The equations obtained are explicit and are optimal for decoding LDPC codes on PR channels with polynomial $h(D) = 1 - \alpha D^n$ ($\alpha$ real, $n$ positive integer) in the sense that they provide the exact inference of the marginal probabilities on the nodes in a graph free of loops. A simple algorithmic solution to the set of BP equations is proposed and evaluated using numerical simulations, yielding bit-error rate performances that surpass those of turbo equalization.

Index Terms: Belief propagation, Bethe free energy, inter-symbol interference, partial-response channel, low-density parity-check (LDPC) code, message passing, sum-product algorithm, joint decoding, turbo equalization.

I. INTRODUCTION

Most transmission media used in current digital communication systems exhibit a non-uniform frequency response. This non-uniform response, which may manifest in amplitude and/or in phase, introduces distortion to the transmitted signal. This distortion induces a spreading of a symbol waveform beyond its allocated time slot. As a result, a sequence of consecutive symbols transmitted through the channel experiences overlaps of their waveforms, such that the individual symbols are no longer identifiable at the receiver. This symbol overlap is commonly referred to as inter-symbol interference (ISI). Channel ISI is an undesirable feature as it usually increases the complexity of the receiver and may increase the probability of symbol error in the detection process. ISI is observed, for instance, in fiber-optic channels, in the read-back process of magnetic and optical recording channels, and in radio-frequency wireless channels [1]–[3].

In many cases, channel ISI can be represented by a linear filter. In a discrete-time system, this filter gives rise to a state-dependent response, where a channel output is a linear combination of the past transmitted symbols. The number of past symbols affecting the output symbol is denoted as ISI length or memory length. The memory length is one of the dominant factors determining the ability of a receiver to correctly and efficiently detect symbols in the presence of ISI and receiver noise, and a common way to reduce it is to equalize the channel to a target partial-response (PR) channel [4]–[6]. PR channels are designed with short ISI length, to simplify symbol decoding. A discrete-time PR-equalized channel is usually characterized by its PR polynomial or target $h(D)$, where $D$ denotes the symbol delay.
A PR target must be chosen carefully in order not to boost noise to intolerable levels at frequencies for which the channel response is weak. Symbols transmitted over a PR channel are typically decoded using a maximum likelihood (ML) sequence detector [7], like the Viterbi algorithm, or using a maximum a posteriori (MAP) symbol detector [8], like the Bahl, Cocke, Jelinek, and Raviv (BCJR) algorithm. Errors made in the detector may be controlled by an error correction code, typically a linear block code.

A class of linear block codes that has received much recognition in the last decade is that of low-density parity-check (LDPC) codes. LDPC codes have been shown to give outstanding bit-error rate (BER) performance while featuring a simple encoding and decoding algorithm based on belief propagation [9]–[13]. The belief-propagation algorithm, which operates on a graphical representation of the parity-check matrix of a code, involves passing messages (received bit probabilities or likelihoods) from variable nodes to check nodes, and vice versa. This message-passing scheme operates locally on bits and checks and can, consequently, lend itself to low-complexity hardware implementations [10].

Even though knowledge of the channel PR polynomial could be exploited by the error correction decoder, in state-of-the-art receivers symbol detection and error correction are performed separately due to speed and complexity constraints. Roughly, the idea of such sub-optimal algorithms is to provide ISI-free channel symbols as well as their likelihoods to an error-correction decoder. The independence among input symbol likelihoods (assuming that no other a-priori information is available to the decoder) is a necessary condition for successful decoding of LDPC codes by means of the message passing algorithms mentioned above, an example of which is the sum-product algorithm.
For a detailed discussion on the sum-product algorithm (or its variants) the reader is referred to [14].

Most of the proposed alternatives for decoding LDPC codes over ISI channels involve the use of the BCJR algorithm (or its variants) followed by the sum-product algorithm. A significantly more efficient approach is to use the output symbol likelihoods of the sum-product algorithm as extrinsic information to improve the performance of the sequence detector in an iterative feedback scheme. This is known as turbo equalization, and is currently the most effective known algorithm to decode LDPC codes on PR channels [15]–[18]. Turbo equalization entails, however, high complexity and, because of the sequential nature of the BCJR algorithm, may lead to a significant delay.

Simultaneous channel ISI removal and error-correction decoding is preferred if high bit rates and/or short decoding delays are sought. In this direction, noteworthy contributions have been made by Kurkoski, Siegel, and Wolf [2], and by Pakzad and Anantharam [19]. In the former, a PR channel detector performs parallel symbol message-passing between the PR channel state nodes and the variable nodes for a prescribed number of iterations (graphical models for LDPC codes and ISI channels will be introduced in Section II). The resulting symbol likelihoods are later passed to the LDPC decoder. After a number of iterations of the sum-product algorithm, the LDPC decoder feeds its output likelihoods back to the channel detector, thus completing a turbo iteration. The approach in [2] is, therefore, very similar to turbo equalization, but with the advantage of a significantly shorter delay thanks to the parallel channel detector. Nevertheless, the method can only attain the performance of the BCJR-based turbo equalization algorithm if the number of iterations in the PR channel detector is equal to the LDPC codeword length.
It is well known that the inference of marginal probabilities in a graph can only be uniquely accomplished if the graph is a tree. Since the decoding of LDPC codes corresponds to an inference problem for which the graph has loops, the convergence of the message-passing algorithm is not guaranteed. In this regard, the decoding strategy proposed in [19] seeks to redefine the underlying graphical model (comprising LDPC nodes/checks and PR nodes) in terms of a set of super-nodes or regions. The belief propagation algorithm is applied to the new, region-based graphical model, thus implementing the Generalized Belief Propagation (GBP) strategy introduced in [20], [21]. It was shown in [19] that the GBP-based approach can outperform the decoding approach of [2]. However, one should also be aware of two important caveats of the GBP-based algorithm. First, the selection of regions is not an unambiguous process, but rather a heuristic strategy, lacking a rigorous justification. Therefore, the quality of the algorithm is not universal, but case- (e.g., code-) specific, and it can only be judged upon simulations. Second, increasing the size of the regions improves GBP but it is also computationally expensive, as the overhead grows exponentially with the region size.

This paper addresses the problem of joint detection and error correction in PR channels. We present, for the first time, a derivation of the belief propagation (BP) equations in an LDPC-coded PR channel. These are obtained by minimizing the Bethe free energy, which is equivalent to performing the exact inference [21] if the graph is loop-free. The derived equations give an explicit solution to the decoding of LDPC codes on general PR channels with pair-wise ISI (i.e., those in which each observed symbol depends on two transmitted symbols) that are corrupted by additive white Gaussian noise.
This solution is exact if the LDPC part of the graph does not contain loops and allows a fully parallel implementation on the symbols. The equations reduce to the well-known BP equation for the memoryless channel [14] in the absence of ISI.

We also present a simple yet powerful algorithmic solution to the PR-BP equations. The algorithm features a fully parallel implementation, in the sense that channel detection and LDPC decoding are simultaneously performed on each symbol (after a complete codeword has been received). We evaluate the performance of this algorithm on some LDPC codes over the Dicode channel, for which $h(D) = 1 - D$, and find that it outperforms the turbo equalization algorithm. We illustrate the smooth convergence of the PR-BP algorithm towards the ISI-free channel case by evaluating its performance over a channel with $h(D) = 1 + 0.5D$.

The remainder of the paper is organized as follows. In Section II we briefly introduce LDPC codes and their graphical representation. We also introduce a graphical model for LDPC codes on a linear ISI channel. In Section III the derivation of the BP equations from the Bethe free energy is given, and an iterative decoding algorithm to solve the equations is proposed. In Section IV we present a numerical evaluation of the bit-error rate (BER) versus signal-to-noise ratio (SNR) of several LDPC codes over the Dicode channel. A comparison against the BER performance of a turbo equalizer is also offered to display the excellent convergence of the PR-BP algorithm, particularly with medium- to low-rate codes. A brief comparison of complexity between our approach and that of turbo equalization is offered at the end of Section IV. Finally, Section V summarizes our results.

II. PRELIMINARIES ON GRAPHICAL MODELS

A. Graphical representation of an LDPC code

Let $x = \{x_1, x_2, \ldots, x_N\}$ denote an ordered set of variables, each of which can take values from a finite alphabet $B$. Let $g$ indicate a function of these variables. A configuration of $x$ denotes a particular realization of $x$ from the domain $S = B^N$, referred to as the configuration space. A marginal function $g_i(x_i)$ is a function such that for each $\gamma \in B$, $g_i(\gamma)$ is found by summing $g(x)$ over all those configurations for which $x_i = \gamma$. Namely, the marginal function $g_i(x_i)$ is expressed by [14]

$$g_i(x_i) = \sum_{x \setminus x_i} g(x_1, x_2, \ldots, x_N),$$

where $x \setminus x_i$ denotes that the summation is over all variables in $x$ except $x_i$.

Let us assume that $g(x)$ can be expressed as a product of functions $f_\alpha$ whose arguments $x_\alpha$ are subsets of $x$, and $\alpha$ is an element of the index set $A$. We write $g(x)$ as

$$g(x_1, x_2, \ldots, x_N) = \prod_{\alpha \in A} f_\alpha(x_\alpha), \tag{1}$$

and the marginal $g_i(x_i)$ can be written as

$$g_i(x_i) = \sum_{x \setminus x_i} \prod_{\alpha \in A} f_\alpha(x_\alpha). \tag{2}$$

A factor graph is a bipartite graph whose configuration is determined via (1) [22], [23]. In a factor graph, variables $x_i$ are symbolized by variable nodes; factor functions $f_\alpha$ are symbolized by factor nodes; and the dependence of a function on a variable is symbolized by an edge joining the two. A factor graph is not necessarily a tree: it may contain loops, as the graphs of LDPC codes generally do. Figure 1 depicts an example of a factor graph, in which the variable nodes are represented by circles and the factor nodes are represented by squares. This graph has five variable nodes and four factor nodes, and it happens to be a tree. Its structure corresponds to the functional expression

$$g(x_1, x_2, x_3, x_4, x_5) = f_A(x_1, x_2, x_3)\, f_B(x_3, x_4)\, f_C(x_3, x_5)\, f_D(x_4). \tag{3}$$

Fig. 1. Factor graph or bipartite graph. Circles correspond to the variable (bit) nodes and the squares correspond to the factor (check) nodes.

An LDPC code is a linear block code specified by its parity-check matrix $H$.
This matrix has elements from the set $\{0, 1\}$ and is sparse, i.e., the number of elements 1 is much smaller than the number of elements 0. $H$ is said to be regular if it features uniform column and row weight; otherwise, it is called irregular. A parity-check matrix $H$ of $M$ rows and $N$ columns and rank $\mathrm{rank}(H)$ defines a code $C$ with block length $N$ and rate $(N - \mathrm{rank}(H))/N$. Each row of $H$ defines a parity check equation. A 1 in row $j$ and column $i$ indicates that variable $x_i$ is an argument of the $j$th parity check equation. A codeword of the code $C$ is a configuration of the ordered set of variables $x$ for which all the parity check equations are satisfied. If the alphabet $B$ is binary with elements from GF(2), then a codeword of $C$ (in vector representation) satisfies $H x^T = 0$ over GF(2), where $0$ is an all-zero vector.

A bipartite graph in which the parity check equations are represented by factor nodes and the variables $x_i$ by variable nodes is referred to as a Tanner graph. A value 1 at row $j$ and column $i$ of $H$ is represented by an edge between variable node $i$ and factor node $j$. We define $q_i$ as the node degree (i.e., the number of connected edges) of variable node $i$ and $p_j$ as the node degree of factor node $j$ [10]. A regular LDPC code has $q_i = q$, $p_j = p$, $\forall i, j$.

B. The discrete ISI channel

Consider the transmission of a sequence of symbols $x_i$ in discrete time intervals indexed by $i$. A linear discrete ISI channel relates the output signal $y_i$ with the transmitted signal $x_i$ as

$$y_i = \sum_{j=0}^{L} h_j x_{i-j} + \xi_i, \tag{4}$$

where $L$ is the ISI length, $h_0, h_1, \ldots, h_L$ are real-valued channel coefficients, and $\xi_i$ is an additive discrete noise process, which we assume to be white and Gaussian with zero mean and variance $\sigma^2$.

Fig. 2. Graphical representation of a linear ISI channel. The output nodes $y_i$ are a linear combination of the uncorrelated nodes $x_i$ plus additive receiver noise.
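The channel relation in (4) can be made concrete with a short simulation. The following is a minimal Python sketch, not the authors' simulator: it assumes NumPy, a hypothetical helper name `isi_channel`, and symbols preceding the block equal to zero (the paper only states that they are in a known state).

```python
import numpy as np

def isi_channel(x, h, sigma, rng):
    """Return y_i = sum_j h_j * x_{i-j} + xi_i for one block of symbols, as in (4)."""
    x = np.asarray(x, dtype=float)
    h = np.asarray(h, dtype=float)
    # np.convolve(x, h)[i] equals sum_j h[j] * x[i - j]; truncating to len(x)
    # keeps one output per transmitted symbol. Symbols before the block are
    # implicitly zero here (an assumption of this sketch).
    clean = np.convolve(x, h)[: len(x)]
    # White Gaussian noise with zero mean and variance sigma^2.
    noise = sigma * rng.standard_normal(len(x))
    return clean + noise

rng = np.random.default_rng(0)
x = np.array([1, -1, -1, 1, 1])        # symbols in {-1, +1}
y = isi_channel(x, h=[1.0, -1.0], sigma=0.3, rng=rng)   # taps of 1 - D
```

Setting `sigma=0.0` returns the noiseless channel output, which is convenient for checking the tap convention before adding noise.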
The relation in (4) is commonly referred to as the PR channel. The PR channel is usually represented by the polynomial expression

$$h(D) = \sum_{j=0}^{L} h_j D^j, \tag{5}$$

where $D$ is the delay operator, such that $D^j x_i = x_{i-j}$. We assume that the polynomial $h(D)$ is normalized so that $h_0 = 1$.

Figure 2 shows a graphical representation of (5). Each variable $x_i$ is represented by a hidden variable node, that is, a variable node that cannot be observed. An output variable $y_i$ is designated by a triangle in the graph, which we denote as an ISI node. ISI nodes include the contribution of additive noise, but to simplify the graph we choose to leave this contribution implicit.

We denote the SNR by $s^2$ and, following [24], define it as

$$s^2 = \frac{\sum_{j=0}^{L} h_j^2}{\sigma^2}. \tag{6}$$

This definition of SNR accounts for the energy contribution induced by the ISI, so that the energy per symbol is maintained, regardless of the channel coefficients and the memory length $L$.

C. Graphical representation of an LDPC code on a discrete ISI channel

An LDPC code operating on an ISI channel can have a graphical representation that combines the graphs in Figs. 1 and 2. An example of this is the tripartite graph in Fig. 3, which corresponds to a linear code on a discrete ISI channel of length $L$. Provided that the factor nodes in Fig. 3 correspond to the parity check equations of the LDPC code, we may interchangeably refer to them as check nodes. We have previously denoted the parity check equations by $f_\alpha$. In addition, we denote the functional representation of the ISI node by $f_\aleph$ ($\aleph$ is the first letter of the Hebrew alphabet).

Fig. 3. Factor graph of an LDPC code on an ISI channel with length $L = 1$. Squares, circles, and triangles represent factor nodes, variable nodes, and ISI nodes, respectively.

The tripartite graph in Fig. 3 serves as the basis for the derivation of the BP equations over a discrete ISI channel.
Note that the left-most ISI node in the graph is assumed to be the first output signal observed. The next section presents the derivation of the belief propagation equations.

III. BETHE FREE ENERGY AND BELIEF PROPAGATION EQUATIONS

We begin this section by briefly introducing the concept of beliefs and the Bethe free energy. The reader may find a thorough description of the BP algorithm and its relation to the Bethe free energy in [21].

A belief $b_i(x_i)$ at the variable node $i$ is an approximation to the exact marginal function $g_i(x_i)$ [21]. We can extend this definition to the other types of nodes in the graph shown in Fig. 3. The joint belief $b_\alpha(x_\alpha)$ at the set of variables $x_\alpha$ (which in turn corresponds to the belief of the factor node $f_\alpha$) is an approximation to the exact marginal function $g_\alpha(x_\alpha)$. Similarly, the joint belief $b_\aleph(x_\aleph)$ of the set of variables $x_\aleph$, corresponding to the variable nodes connected to the ISI node $\aleph$, is an approximation to the exact marginal function $g_\aleph(x_\aleph)$.

It is of interest to compute the marginal functions mentioned above, because they represent the probabilities of the transmitted symbols. However, even with the knowledge of the global function $g(x)$, this may be a very difficult computational task. We may use the BP equations to approximate the marginal functions by means of the beliefs.

A. Bethe free energy approach to the decoding of LDPC codes in a PR channel

The BP equations correspond to the stationary points of a function of the beliefs called the Bethe free energy [21].
The Bethe free energy, expressed as a function of the beliefs $b_\alpha(x_\alpha)$ and $b_\aleph(x_\aleph)$ on the check and the ISI nodes, respectively, is

$$F[b_\alpha(x_\alpha), b_\aleph(x_\aleph)] = U[b_\alpha(x_\alpha), b_\aleph(x_\aleph)] - H[b_\alpha(x_\alpha), b_\aleph(x_\aleph)], \tag{7}$$

where the Bethe self energy $U$ is

$$U[b_\alpha(x_\alpha), b_\aleph(x_\aleph)] = -\sum_{\alpha} \sum_{x_\alpha} b_\alpha(x_\alpha) \ln f_\alpha(x_\alpha) - \sum_{\aleph} \sum_{x_\aleph} b_\aleph(x_\aleph) \ln f_\aleph(x_\aleph) \tag{8}$$

and the Bethe entropy $H$ is

$$H[b_i(x_i), b_\alpha(x_\alpha), b_\aleph(x_\aleph)] = -\sum_{\alpha} \sum_{x_\alpha} b_\alpha(x_\alpha) \ln b_\alpha(x_\alpha) - \sum_{\aleph} \sum_{x_\aleph} b_\aleph(x_\aleph) \ln b_\aleph(x_\aleph) + \sum_{i} (q_i - 1) \sum_{x_i} b_i(x_i) \ln b_i(x_i). \tag{9}$$

In a tree-like graph, the Bethe free energy is a convex function of the beliefs such that at its minimum, $b_\alpha(x_\alpha) = g_\alpha(x_\alpha)$ and $b_\aleph(x_\aleph) = g_\aleph(x_\aleph)$, the desired marginal functions, and $F = F_{\mathrm{free}}$, the free energy. Since it is of interest to interpret the beliefs as probability mass functions, the normalization constraints

$$\sum_{x_i} b_i(x_i) = \sum_{x_\alpha} b_\alpha(x_\alpha) = \sum_{x_\aleph} b_\aleph(x_\aleph) = 1 \tag{10}$$

and the consistency constraints

$$b_i(x_i) = \sum_{x_\alpha \setminus x_i} b_\alpha(x_\alpha) = \sum_{x_\aleph \setminus x_i} b_\aleph(x_\aleph) \tag{11}$$

must be satisfied.

The probability of observing the channel output vector $y$, given the binary input vector $x \in \{-1, 1\}^N$ and the SNR $s^2$, is

$$P(y|x) \propto \exp\left\{ -\frac{s^2}{2} \sum_{i=1}^{N} \left( y_i - \sum_{j=0}^{L} h_j x_{i-j} \right)^2 \right\}, \tag{12}$$

provided that the discrete additive noise process is white and Gaussian. After expanding (12) and discarding the constant terms, we obtain

$$P(y|x) \propto \exp\left( \sum_{i=1}^{N} u_i x_i \right) \exp\left( -\sum_{(i,j):\, 1 \le |i-j| \le L} Q_{j-i}\, x_j x_i \right), \tag{13}$$

where the summation in the right-most exponent is over all pairs $(i, j)$ of distinct bits separated by a distance $|i - j|$, $1 \le |i - j| \le L$, and we have defined

$$u_i = s^2 \sum_{j=0}^{L} h_j y_{i+j}, \qquad Q_p = s^2 \sum_{k=0}^{|p-L|} h_k h_{k+p}. \tag{14}$$

The term $u_i$ collects the outputs $y_i, \ldots, y_{i+L}$ to which the symbol $x_i$ contributes in (4). With $u_i$ we symbolize the likelihood of the variable node $i$. For $L = 0$, $u_i$ takes the form of the likelihood of a Gaussian memoryless channel.
$Q_p$ accounts for the pair-wise memory.

Following the approach in [21], [25], we would like to write the joint probability distribution function of the random vector $X$ representing the binary symbols of a codeword as a product of functions, such that the probability of the configuration $x = \{x_1, x_2, \ldots, x_N\}$, $p(x)$, is given by

$$p(x) = \frac{1}{Z} \prod_{\alpha \in A} f_\alpha(x_\alpha) \prod_{\aleph \in \beth} f_\aleph(x_\aleph), \tag{15}$$

where $\{f_\alpha(x_\alpha)\}$ is a set of $M$ non-negative functions as defined in Section II. Similarly, $\{f_\aleph(x_\aleph)\}$ is a set, denoted $\beth$, of $N$ non-negative functions indexed by $\aleph$ whose arguments $x_\aleph$ are subsets of $x$. In (15), $Z$ is a normalization constant given by

$$Z = \sum_{x} \prod_{\alpha \in A} f_\alpha(x_\alpha) \prod_{\aleph \in \beth} f_\aleph(x_\aleph), \tag{16}$$

such that $\sum_x p(x) = 1$, i.e., $p(x)$ is a probability mass function. The purpose of expressing the joint probability distribution in this fashion is to conveniently represent the factor graph in Fig. 3, in which $\{f_\alpha(x_\alpha)\}$ describes the check nodes of the LDPC parity-check matrix, and $\{f_\aleph(x_\aleph)\}$ describes the observed ISI nodes. By writing

$$f_\alpha(x_\alpha) \equiv \delta\left( \prod_{x_j \in x_\alpha} x_j,\, 1 \right) \exp\left( \sum_{x_i \in x_\alpha} q_i^{-1} x_i u_i \right), \tag{17}$$

$$f_\aleph(x_\aleph) \equiv \exp\left( -Q_{|j-i|}\, x_i x_j \right), \tag{18}$$

where $\delta(v, 1) = 1$ for $v = 1$ (the parity-check equation is satisfied) and 0 otherwise, we obtain

$$p(x) = \frac{1}{Z} \prod_{\alpha=1}^{M} \delta\left( \prod_{x_j \in x_\alpha} x_j,\, 1 \right) \prod_{i=1}^{N} \exp(u_i x_i) \times \prod_{(i,j):\, 1 \le |i-j| \le L} \exp\left( -Q_{j-i}\, x_j x_i \right). \tag{19}$$

The Bethe free energy is then minimized with respect to the beliefs $b_i$, $b_\alpha$, and $b_\aleph$ subject to the normalization and consistency constraints in (10) and (11). Namely, we minimize the Lagrangian function

$$\begin{aligned} \mathcal{L} = {} & U - H + \sum_{\alpha} \gamma_\alpha \left( \sum_{x_\alpha} b_\alpha(x_\alpha) - 1 \right) + \sum_{\aleph} \gamma_\aleph \left( \sum_{x_\aleph} b_\aleph(x_\aleph) - 1 \right) + \sum_{i} \gamma_i \left( \sum_{x_i} b_i(x_i) - 1 \right) \\ & + \sum_{i} \sum_{\alpha \ni i} \sum_{x_i} \lambda_{i\alpha}(x_i) \left( b_i(x_i) - \sum_{x_\alpha \setminus x_i} b_\alpha(x_\alpha) \right) + \sum_{i} \sum_{\aleph \ni i} \sum_{x_i} \lambda_{i\aleph}(x_i) \left( b_i(x_i) - \sum_{x_\aleph \setminus x_i} b_\aleph(x_\aleph) \right), \end{aligned} \tag{20}$$

where $\alpha \ni i$ and $\aleph \ni i$ indicate all indices of the checks and of the ISI nodes connected to bit $i$, respectively, and $\gamma_i$, $\gamma_\alpha$, $\gamma_\aleph$, $\lambda_{i\alpha}(x_i)$, $\lambda_{i\aleph}(x_i)$ are Lagrange coefficients that multiply the normalization constraints (10) and the consistency constraints (11). The minimization of (20) with respect to the beliefs leads to

$$b_\alpha(x_\alpha) = f_\alpha(x_\alpha) \exp\left[ -\gamma_\alpha - 1 + \sum_{i \in \alpha} \lambda_{i\alpha}(x_i) \right], \tag{21}$$

$$b_\aleph(x_\aleph) = f_\aleph(x_\aleph) \exp\left[ -\gamma_\aleph - 1 + \sum_{i \in \aleph} \lambda_{i\aleph}(x_i) \right], \tag{22}$$

$$b_i(x_i) = \exp\left[ \frac{1}{q_i + L - 1} \left( \gamma_i + \sum_{\alpha \ni i} \lambda_{i\alpha}(x_i) + \sum_{\aleph \ni i} \lambda_{i\aleph}(x_i) \right) - 1 \right]. \tag{23}$$

Equations (21)-(23), complemented by the normalization and the consistency constraints, form a closed system of BP equations for the $\lambda_{i\alpha}(x_i)$ and $\lambda_{i\aleph}(x_i)$ coefficients. Following the traditional notation of BP equations in terms of the fields $\eta$ defined on the edges of the factor graph, we have the relations

$$\eta_{i\alpha} \equiv \frac{\lambda_{i\alpha}(+1) - \lambda_{i\alpha}(-1)}{2} + \frac{u_i}{q_i}, \qquad \eta_{i\aleph} \equiv \frac{\lambda_{i\aleph}(+1) - \lambda_{i\aleph}(-1)}{2}, \tag{24}$$

where the term $u_i / q_i$ comes from the exponential factor in (17), $\eta_{i\alpha}$ indicates the field going from variable node $i$ to factor node $\alpha$, and $\eta_{i\aleph}$ indicates the field going from variable node $i$ to ISI node $\aleph$.
Substituting (24) in (21)-(23) yields the expressions

$$\sum_{x_\alpha} x_i\, b_\alpha(x_\alpha) = \tanh\left[ \eta_{i\alpha} + \tanh^{-1}\left( \prod_{j \in \alpha,\, j \ne i} \tanh \eta_{j\alpha} \right) \right], \tag{25}$$

$$\sum_{x_\aleph} x_i\, b_\aleph(x_\aleph) = \tanh\left[ \eta_{i\aleph} - \tanh^{-1}\left( \tanh \eta_{j\aleph} \tanh Q_\aleph \right) \right], \quad \text{such that } (i, j) \in \aleph, \tag{26}$$

$$\sum_{x_i} x_i\, b_i(x_i) = \tanh\left[ \frac{1}{q_i + L - 1} \left( \sum_{\alpha \ni i} \eta_{i\alpha} + \sum_{\aleph \ni i} \eta_{i\aleph} \right) \right]. \tag{27}$$

In (26), $\eta_{j\aleph}$ is the field sent to the ISI node by the other bit $j$ of the pair, since the factor (18) couples $x_i$ only to $x_j$. By equating the right-hand sides of (25), (26), and (27), and using (24), after some algebraic manipulations we find the BP equations for the ISI channel:

$$\eta_{i\alpha} = u_i + \sum_{\beta \ni i,\, \beta \ne \alpha} \tanh^{-1}\left( \prod_{j \in \beta,\, j \ne i} \tanh \eta_{j\beta} \right) - \sum_{\aleph \ni i} \tanh^{-1}\left( \tanh \eta_{j\aleph} \tanh Q_\aleph \right), \tag{28}$$

$$\eta_{i\aleph} = u_i - \sum_{\beth \ni i,\, \beth \ne \aleph} \tanh^{-1}\left( \tanh \eta_{j\beth} \tanh Q_\beth \right) + \sum_{\alpha \ni i} \tanh^{-1}\left( \prod_{j \in \alpha,\, j \ne i} \tanh \eta_{j\alpha} \right), \tag{29}$$

where the Hebrew letter $\beth$ is used to indicate those ISI nodes that are connected to the variable node $i$ but are not $\aleph$, and, in the ISI terms, $j$ denotes the bit joined to bit $i$ by the corresponding ISI node. It can be observed in (28)-(29) that in the absence of ISI, $Q_\aleph = 0$ and the equations reduce to the well-known BP equations for memoryless channels.

Equations (28)-(29) form the exact BP solution to the PR channel with pair-wise ISI, regardless of the distance between the two variable nodes joined by an ISI node. In a factor graph whose PR nodes do not form loops (in the absence of the check nodes), the solution is exact in the sense that the fields determined correspond to the stationary points of the Bethe free energy. Of course, the check nodes of the LDPC code add loops to the graph, and therefore, the convergence of the fields is not guaranteed. The BP equations above can be used to decode the message symbols of an LDPC code over a discrete PR channel with polynomial

$$h(D) = 1 - \alpha D^n, \tag{30}$$

where $-1 \le \alpha \le 1$ and $n$ is a positive integer. For ISI involving more than two variable nodes, the solution is suboptimal even in the absence of the check nodes because the ISI nodes and the variable nodes will form loops.
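As a worked illustration of why the target in (30) gives pair-wise ISI, the coupling $Q_p$ of (14) can be evaluated for its two nonzero coefficients. This is a sketch under the assumption that effects at the two ends of the block are ignored.

```latex
\begin{align*}
h_0 &= 1, \qquad h_n = -\alpha, \qquad h_k = 0 \ \text{otherwise}, \qquad L = n, \\
Q_p &= s^2 \sum_{k=0}^{L-p} h_k h_{k+p}
     = \begin{cases} -\alpha\, s^2, & p = n, \\ 0, & 1 \le p < n, \end{cases} \\
\alpha = 1,\ n = 1 \ \text{(Dicode)} &: \quad Q_1 = -s^2, \\
\alpha = -0.5,\ n = 1 &: \quad Q_1 = 0.5\, s^2 .
\end{align*}
```

Only bits at distance $n$ are coupled, so the ISI nodes link the bits into $n$ separate chains $i, i+n, i+2n, \ldots$, none of which closes on itself. This is the loop-free ISI structure that the exactness statement above relies on.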
The BP equations are nonlinear and their roots can be evaluated with any preferred root-finding method. In the next subsection we describe a simple iterative procedure to solve the BP equations, analogous to the message-passing algorithm applied to memoryless channels.

B. Algorithmic solution to the BP equations

The nonlinear PR-BP equations described above can be solved using any nonlinear minimization algorithm. However, in the interest of finding a simple and fully parallel decoding algorithm, we follow an iterative approach similar to that of message-passing for LDPC codes on memoryless channels. Equations (28)-(29) can be iteratively solved using the following algorithm:

$$\eta_{i\alpha}^{(n+1)} = u_i + \sum_{\beta \ni i,\, \beta \ne \alpha} \mu_{i\beta} - \sum_{\aleph \ni i} \zeta_{i\aleph}, \tag{31}$$

$$\eta_{i\aleph}^{(n+1)} = u_i - \sum_{\beth \ni i,\, \beth \ne \aleph} \zeta_{i\beth} + \sum_{\alpha \ni i} \mu_{i\alpha}, \tag{32}$$

with

$$\mu_{i\beta} = \tanh^{-1}\left( \prod_{j \in \beta,\, j \ne i} \tanh \eta_{j\beta}^{(n)} \right), \tag{33}$$

$$\zeta_{i\aleph} = \tanh^{-1}\left( \tanh \eta_{j\aleph}^{(n)} \tanh Q_\aleph \right), \quad (i, j) \in \aleph. \tag{34}$$

The superscript $(n+1)$ refers to the value of the field $\eta$ at iteration step $n+1$, and in (34) $j$ is the bit joined to bit $i$ by the ISI node $\aleph$. Decoding starts when all ISI symbols associated with an LDPC codeword have been received. The algorithm is initialized by setting all $\mu_{i\beta}$, $\zeta_{i\aleph}$ to zero. It is assumed that the symbols transmitted before and after the codeword are in a known state. Using terminating nodes with known states is not strictly necessary when the channel has memory length $L = 1$, as convergence is enforced by the check equations. However, it helps achieve faster convergence and it is also common practice in the evaluation of decoders on PR channels. The fields in (31) and (32) are evaluated using the values $\mu_{i\beta}$, $\zeta_{i\aleph}$ computed in the previous iteration, to replicate a fully parallel architecture at each iteration. After every iteration, the likelihood $\Lambda_i$ on each variable node $i$ is computed as

$$\Lambda_i = u_i + \sum_{\alpha \ni i} \mu_{i\alpha} \tag{35}$$

and the codeword is checked. Note that the summation is in this case over all the check nodes $\alpha$ connected to variable node $i$.
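The iteration in (31)-(35) can be sketched in a few lines of Python for a target with memory $L = 1$. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `pr_bp_decode` and its arguments are hypothetical, the target is $h(D) = 1 + h_1 D$, the likelihood is read from (14) as $u_i = s^2 (y_i + h_1 y_{i+1})$ with the unobserved output after the block taken as zero, the message from an ISI node to one bit is computed from the field sent by the other bit of the pair, and the clipping before `arctanh` is a numerical safeguard that is not part of the equations.

```python
import numpy as np

def pr_bp_decode(H, y, h1, s2, max_iter=20):
    """PR-BP sketch for h(D) = 1 + h1*D; returns hard decisions in {-1, +1}."""
    H = np.asarray(H, dtype=bool)
    y = np.asarray(y, dtype=float)
    M, N = H.shape
    # Likelihoods as read from (14): bit i enters y_i and y_{i+1}.
    # Assumption: the unobserved output after the block is taken as 0.
    u = s2 * (y + h1 * np.append(y[1:], 0.0))
    tq = np.tanh(s2 * h1)            # tanh(Q_1), with Q_1 = s2 * h0 * h1
    lim = 1.0 - 1e-12                # keeps arctanh finite (numerical guard)
    mu = np.zeros((M, N))            # mu[a, i]: message from check a to bit i
    # ISI node k joins bits k and k+1; column 0 goes to bit k, column 1 to bit k+1.
    zeta = np.zeros((N - 1, 2))
    x_hat = np.where(u >= 0, 1, -1)
    for _ in range(max_iter):
        z_right = np.append(zeta[:, 0], 0.0)     # to bit i from node i
        z_left = np.insert(zeta[:, 1], 0, 0.0)   # to bit i from node i-1
        mu_sum = mu.sum(axis=0)
        # (31): the field from bit i to check a excludes that check's message.
        eta_chk = u + mu_sum - mu - (z_left + z_right)
        # (32): the field from bit i to an ISI node excludes that node's message.
        eta_to_right = u + mu_sum - z_left
        eta_to_left = u + mu_sum - z_right
        # (33): all messages use the fields computed above (parallel update).
        for a in range(M):
            idx = np.flatnonzero(H[a])
            t = np.tanh(eta_chk[a, idx])
            for n, i in enumerate(idx):
                prod = np.prod(np.delete(t, n))
                mu[a, i] = np.arctanh(np.clip(prod, -lim, lim))
        # (34): the message to one bit uses the field sent by the other bit.
        to_k = np.arctanh(np.clip(np.tanh(eta_to_left[1:]) * tq, -lim, lim))
        to_k1 = np.arctanh(np.clip(np.tanh(eta_to_right[:-1]) * tq, -lim, lim))
        zeta = np.stack([to_k, to_k1], axis=1)
        # (35): likelihood and hard decision, followed by the parity check.
        x_hat = np.where(u + mu.sum(axis=0) >= 0, 1, -1)
        bits = (x_hat < 0).astype(int)
        if not np.any((H.astype(int) @ bits) % 2):
            break
    return x_hat

# Toy usage: one check x_0 * x_1 = 1 and the noiseless Dicode output of x = [1, 1].
decoded = pr_bp_decode([[1, 1]], y=[1.0, 0.0], h1=-1.0, s2=1.0)
```

The explicit double loop over checks is chosen for readability and to avoid dividing by a zero `tanh` value; a production decoder would vectorize it and store messages only on the edges of the sparse graph.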
In the next section we present a BER evaluation of some LDPC codes using this iterative algorithm on a Dicode channel.

IV. NUMERICAL SIMULATIONS

As we have already mentioned, the exact solution given above does not guarantee convergence to a valid codeword because linear block codes generate graphs with loops. It is well known, however, that very good convergence is generally observed using the sum-product algorithm on graphs with loops such as those generated by LDPC codes [12]. We expect this to be also true for the BP equations presented here. To illustrate this, we have numerically performed the transmission and decoding of random LDPC codewords on a Dicode channel, given by the polynomial $h(D) = 1 - D$, by means of Monte Carlo simulations. We have done this for various LDPC codes using the algorithm in (31)-(34) with the initialization given in (14). Clearly, for the Dicode channel $h_0 = 1$, $h_1 = -1$, and $Q_1 = -s^2$. Each channel observation $y_i$ is computed using (4), in which $x_i$ is an LDPC-coded binary symbol that takes values from $\{-1, +1\}$, and $\xi_i$ is an AWGN sample from the normal distribution $N(0, \sigma^2)$.

In the following subsections we present the BER vs. SNR curves obtained from the above-mentioned simulations. Because the BER performance of a code on an ISI channel may depend on the transmitted codeword, we determine the generator matrix of the code by means of Gaussian elimination and encode randomly generated sequences of equiprobable bits. For each LDPC code considered, we have included the BER performance obtained from the turbo equalization scheme [15] for comparison purposes. Decoding using turbo equalization was performed in the following way. Each turbo iteration consisted of one pass of the BCJR algorithm followed by $S$ iterations of the sum-product algorithm.
If $T$ is the number of turbo iterations, we selected $T$ such that $T(S + 1)$ would equal (or be near) the number of iterations $J$ of the PR-BP algorithm. It is worth mentioning that $T$ and $S$ were chosen to achieve the best performance of the turbo equalization algorithm, and were in fact different for different LDPC codes.

A. Length-495, rate 0.875 (quasi) regular LDPC code

We first consider the MacKay (495,433) code, with rate 0.875, column weight 3 and row weight 24 [26]. This is a slightly irregular code, as it features three parity-check equations with weight 23. This code has been previously considered in the joint-decoding works published in [2] and [19].

Figure 4 depicts the BER versus SNR in decibels. We use the definition of SNR given in (6). In this and in the next examples, a power penalty is applied to the SNR to account for the code redundancy, and is given by $10 \log_{10} R$, where $R$ is the code rate. In this example, the penalty equals 0.58 dB.

Fig. 4. BER versus SNR in dB of the MacKay (495,433) rate 0.875 (quasi) regular code performing on both a memoryless and a Dicode channel.

In Fig. 4, the curve with black square markers shows the BER performance of the code on a memoryless channel obtained with 20 iterations of the sum-product algorithm. This curve may serve as a lower bound to the code performance in an ISI channel. The curve with open circle markers was obtained with the PR-BP algorithm using 40 iterations, while the curve with filled circle markers corresponds to 80 iterations of the same algorithm. The curves with open and filled diamond markers were obtained using turbo equalization with $8 \times (5 + 1)$ iterations and $16 \times (5 + 1)$ iterations, respectively. Note that at high SNR these curves show an error floor, whereas those from the PR-BP algorithm do not. We observe a decoding gain of about 0.5 dB at BER $= 10^{-8}$.

B. Length-4095, rate 0.82, (quasi) regular LDPC code

Our second example considers a code with block length 4095 and rate 0.82 [26]. This code features column weight of 4 and row weights 22 and 23. We simulate this code on both a memoryless channel and a Dicode channel using randomly generated codewords. Figure 5 shows the BER curves of this code. The memoryless channel is decoded using 20 iterations of the sum-product algorithm, and its BER curve is shown with square markers. The results using the PR-BP algorithm with 20 iterations are depicted by the curve with circle markers. The curve with diamond markers represents the performance of the turbo equalization algorithm using $4 \times (4 + 1)$ iterations. Note the low BER values achieved with the PR-BP algorithm ($2 \times 10^{-10}$) at SNR $= 5$ dB.

As with the code in the previous subsection, the BER curves on the Dicode channel exhibit a cross-over, with a steeper slope in the case of the PR-BP algorithm. We observe again that with high-rate codes the improvement over turbo equalization can only be seen at low BER. At BER $= 10^{-8}$, the decoding gain is 0.15 dB. This gain, however, increases for lower rate codes, as we show next.

Fig. 5. BER versus SNR in dB of the MacKay (4095,3358) rate 0.82 (quasi) regular code performing on both a memoryless and a Dicode channel.

Fig. 6. BER versus SNR in dB of the Margulis (2640,1320) rate 0.50 regular code over both a memoryless and a Dicode channel.

C. Length-2640, rate 0.50, regular LDPC code

The next LDPC code we evaluate is a regular Margulis (2640,1320) code with rate 0.5, column weight 3 and row weight 6 [26]. As with the previous codes we have determined the BER performance on the memoryless and Dicode channels. We use 20 iterations for PR-BP. The best decoding performance using turbo equalization is attained using $3 \times (6 + 1)$ iterations, different from the optimal combination found for the previous code.
We observe that the performance of turbo equalization can be seriously degraded by a careless selection of the iteration parameters $T$ and $S$, as described at the beginning of this Section, at least when the target number of iterations is small.

In the evaluation of this code there is a substantial decoding improvement with the PR-BP algorithm over turbo equalization, as observed in the BER plot of Fig. 6. The BER curves are marked as in the last example. The decoding gain reaches approximately 0.5 dB at BER $= 10^{-7}$ and appears to increase at higher SNR. At BER as low as $10^{-9}$ the PR-BP algorithm shows a steadily increasing slope, with no sign of an error floor. We observe a difference of only about 0.8 dB with the memoryless case.

D. Length-4000, rate 0.50 regular LDPC code

The last code considered is a MacKay code of block length 4000 and rate 0.5 [26]. This code is regular with column and row weights 3 and 6, respectively. Of the codes considered in this numerical BER evaluation, this is the strongest, as can be seen in Fig. 7 from the curve with square markers. We have used 20 iterations of the PR-BP algorithm and $3 \times (6+1)$ iterations of the turbo equalization algorithm. The decoding gain achieved by the PR-BP algorithm is about 0.5 dB at BER $= 10^{-7}$. It is interesting to see that the turbo equalization algorithm shows an error floor below BER $= 10^{-7}$. However, no indication of this is seen with PR-BP, even at BER $= 3 \times 10^{-10}$. Because of the error floor, the decoding gain increases to approximately 0.75 dB at BER $= 10^{-8}$.

Fig. 7. BER versus SNR in dB of the MacKay (4000,2000) rate 0.50 regular code over both a memoryless and a Dicode channel.

E. Length-2640, rate 0.5 LDPC code on PR channel $h(D) = 1 + 0.5D$

As mentioned at the end of Section 6.3, the PR-BP equations are optimal on PR channels with pair-wise ISI.
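To make "pair-wise ISI" concrete, the following sketch computes the noiseless output of a channel with polynomial $h(D) = 1 - \alpha D^n$, in which each output depends on the current symbol and exactly one past symbol. The function name, the $\pm 1$ symbol alphabet, and the zero initial channel state are assumptions made for illustration; noise is omitted because the SNR definition in (6) is not reproduced here.

```python
def pairwise_isi(symbols, alpha, n):
    """Noiseless output y_k = x_k - alpha * x_{k-n} of h(D) = 1 - alpha * D^n.

    Symbols before the start of the block are taken as 0 (an assumption).
    """
    out = []
    for k, x in enumerate(symbols):
        past = symbols[k - n] if k >= n else 0.0
        out.append(x - alpha * past)
    return out


if __name__ == "__main__":
    x = [1, 1, -1, -1]                   # hypothetical +/-1 symbol block
    print(pairwise_isi(x, 1.0, 1))       # Dicode channel, h(D) = 1 - D
    print(pairwise_isi(x, 1.0, 2))       # h(D) = 1 - D^2
    print(pairwise_isi(x, -0.5, 1))      # h(D) = 1 + 0.5 D
```

In this parametrization the Dicode channel is $\alpha = 1$, $n = 1$, and the channel $h(D) = 1 + 0.5D$ corresponds to $\alpha = -0.5$, $n = 1$.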
Although not shown here, we have evaluated the Margulis code on a PR2 channel, with $h(D) = 1 - D^2$, and as expected, the BER performance coincides with that of the Dicode channel. It is worthwhile to assess the PR-BP algorithm on a channel with smaller ISI to observe how it approaches the performance of the sum-product algorithm on a memoryless channel. We have chosen a channel with an arbitrary impulse response $h(D) = 1 + 0.5D$. Figure 8 shows the BER versus SNR of the Margulis code. As one would expect, the BER curve corresponding to PR-BP closely approaches the curve of the memoryless channel, with a distance of only 0.4 dB. Again, the turbo equalization algorithm required a change of iteration parameters ($T = 2$ and $S = 9$), and its BER performance did not approach that of the memoryless case as fast as the PR-BP scheme.

Fig. 8. BER versus SNR in dB of the Margulis (2640,1320) rate 0.50 regular code over both a memoryless channel and a channel with $h(D) = 1 + 0.5D$.

F. Complexity and delay

To finish our analysis we briefly present an account of the operations required to complete one iteration of the PR-BP algorithm on a pair-wise ISI channel, of which the Dicode channel is an example. Let $q$ and $p$ be the (constant) out-degrees of the variable nodes and the check nodes of the LDPC code, respectively. In order to simplify the analysis, we do not consider the complexity of the $\tanh$ and $\tanh^{-1}$ functions (as if they were look-up-table operations) and only account for multiplications and additions. We also disregard the computation of the initial likelihoods $u_i$.

To compute $\mu_{i\beta}$ in (33), $(p-2)$ multiplications are required, and only 1 multiplication is needed for $\zeta_{i\aleph}$ in (34). The field $\eta_{i\alpha}$ in (31) requires $(q-2)$ additions of $\mu_{i\beta}$ and 1 addition of $\zeta_{i\aleph}$, besides the 2 explicit additions of the equation.
Since there are $(q-1)$ different $\mu_{i\beta}$ and two different $\zeta_{i\aleph}$, the total number of multiplications for $\eta_{i\alpha}$ is $(q-1)(p-2) + 2$. For a pair-wise ISI channel there are only two edges on each ISI node. A careful look at (32) reveals that all the terms in $\eta_{i\aleph}$ have been computed for the field $\eta_{i\alpha}$, or for another field on an edge connected to check node $\alpha$. Only $(q-1)$ additions have to be counted. We also count the 2 additions of the terms in (32). In summary, considering all edges incident on each variable node $i$ (both from the LDPC checks and from ISI nodes), the numbers of multiplications $N_m$ and additions $N_a$ per symbol per iteration for the PR-BP algorithm on a pair-wise ISI channel are

$$N_m = q(q-1)(p-2) + 2, \qquad N_a = q(q+1) + 6. \tag{36}$$

The number of operations per symbol in the BCJR algorithm for a two-state trellis is 18 multiplications and 9 additions. We add to this the cost of the sum-product algorithm, which consists of $q(q-1)(p-2)$ multiplications and $q(q-1)$ additions per symbol. Because the number of sum-product and BCJR iterations within a turbo iteration differs from case to case, the overall complexity varies. Using one of the simulations reported above we compare the complexity and the latency of PR-BP and turbo equalization.

Example: In subsection 6.4.4 we simulate the performance of a (4000,2000) LDPC code with weights $q = 3$ and $p = 6$. The total number of operations per symbol performed by the PR-BP algorithm over 20 iterations is 520 multiplications and 360 additions. In the turbo equalization algorithm we have used 3 turbo iterations, each with 6 sum-product iterations and 1 BCJR iteration (the $3 \times (6+1)$ schedule). This corresponds to 486 multiplications and 135 additions per symbol.

With respect to delay, the PR-BP algorithm outperforms the turbo equalization algorithm.
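The arithmetic behind this example, and behind the delay figures quoted next, can be checked with a few lines of code. The sketch below is illustrative only and its function names are hypothetical. It uses the per-symbol counts of (36) and the BCJR and sum-product costs stated above. The quoted turbo totals are reproduced when each turbo iteration contains six sum-product iterations and one BCJR pass, that is, the $3 \times (6+1)$ schedule. The delay model (one time step per PR-BP iteration in a parallel architecture, about $N$ time steps per turbo iteration) follows the text.

```python
BCJR_MULTS, BCJR_ADDS = 18, 9  # per symbol, two-state trellis (from the text)


def prbp_ops_per_iteration(q, p):
    """(multiplications, additions) per symbol per PR-BP iteration, Eq. (36)."""
    return q * (q - 1) * (p - 2) + 2, q * (q + 1) + 6


def sum_product_ops_per_iteration(q, p):
    """(multiplications, additions) per symbol per sum-product iteration."""
    return q * (q - 1) * (p - 2), q * (q - 1)


def prbp_total_ops(q, p, iterations):
    """Total per-symbol operations for a given number of PR-BP iterations."""
    m, a = prbp_ops_per_iteration(q, p)
    return iterations * m, iterations * a


def turbo_total_ops(q, p, T, S):
    """Total per-symbol operations for T turbo iterations, each with
    S sum-product iterations and one BCJR pass."""
    m, a = sum_product_ops_per_iteration(q, p)
    return T * (S * m + BCJR_MULTS), T * (S * a + BCJR_ADDS)


def prbp_time_steps(iterations):
    """Delay in time steps, assuming a fully parallel architecture."""
    return iterations


def turbo_time_steps(T, block_length):
    """Approximate delay in time steps: one trellis sweep per turbo iteration."""
    return T * block_length


if __name__ == "__main__":
    print(prbp_total_ops(3, 6, 20))       # PR-BP, (4000,2000) code
    print(turbo_total_ops(3, 6, 3, 6))    # turbo equalization, 3 x (6 + 1)
    print(prbp_time_steps(20), turbo_time_steps(3, 4000))
```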
To decode a codeword it takes 20 time steps of the PR-BP algorithm, assuming that a parallel architecture is used. In contrast, the same decoding operation takes approximately $3 \times N = 12{,}000$ time steps of the turbo equalization algorithm. Note that the complexity within each time step of the BCJR algorithm (18 multiplications and 9 additions) is smaller than, but comparable to, that of each symbol in the PR-BP algorithm (26 multiplications and 18 additions).

V. CONCLUSIONS

We have considered the problem of joint channel detection and error correction of LDPC codes over PR channels. We treat the joint PR-LDPC system as an inference problem defined on a graph on which we attempt to determine the marginal probabilities. Finding the marginal probabilities on the combined factor graph is equivalent to decoding a codeword on a PR channel. We have presented a derivation of the belief propagation equations for such a combined system [equations (28),(29)]. The equations originate from the minimization of the Bethe free energy, a well-known technique in statistical mechanics, and provide an optimal solution (limited by the effect of loops in the LDPC part of the graph) for PR channels with polynomial $h(D) = 1 - \alpha D^n$ (where $-1 \le \alpha \le 1$ and $n$ is a non-negative integer).

The BP equations for PR channels are explicit and can be solved by any algorithm capable of solving nonlinear equations. We propose a simple iterative algorithm that permits fully parallel implementation on the symbols. Numerical simulations show that the algorithm delivers excellent BER performance on all of the LDPC codes evaluated, surpassing the performance of turbo equalization and showing no error floor above BER $= 10^{-9}$. The complexity of the PR-BP scheme in terms of number of operations is comparable to that of turbo equalization.
Particularly good characteristics of the PR-BP scheme are its simplicity (it requires no customization for different LDPC codes) and its very low latency. In fact the algorithm exhibits a delay that only depends on the number of iterations, and not on the codeword length. This feature makes it an excellent choice for sequence detection and decoding at high bit rates.

Further work is being pursued (a) to determine the optimal BP equations for PR channels with longer memory length, and (b) to analyze, in the spirit of [25], [27], the effects of LDPC-related loops on the performance of the PR-BP scheme in the error-floor domain.

REFERENCES

[1] O. E. Agazzi, M. R. Hueda, H. S. Carrer, and D. E. Crivelli, "Maximum-likelihood sequence estimation in dispersive optical channels," J. Lightw. Technol. 23, 749–763 (2005).
[2] B. Kurkoski, P. Siegel, and J. Wolf, "Joint message-passing decoding of LDPC codes and partial-response channels," IEEE Trans. Inf. Theory 48, 1410–1422 (2002).
[3] S. L. Ariyavisitakul and Ye Li, "Joint coding and decision feedback equalization for broadband wireless channels," IEEE J. Sel. Areas Commun. 16, 1670–1678 (1998).
[4] H. Kobayashi, "Correlative level coding and maximum-likelihood decoding," IEEE Trans. Inf. Theory IT-17, 586–594 (1971).
[5] P. Kabal and S. Pasupathy, "Partial-response signaling," IEEE Trans. Commun. COM-23, 921–934 (1975).
[6] H. Thapar and A. Patel, "A class of partial-response systems for increasing storage density in magnetic recording," IEEE Trans. Magn. MAG-23, 3666–3668 (1987).
[7] A. Acampora, "Maximum-likelihood decoding of binary convolutional codes on band-limited satellite channels," IEEE Trans. Commun. 26, 766–776 (1978).
[8] L. R. Bahl, J. Cocke, F. Jelinek, and J. Raviv, "Optimal decoding of linear codes for minimizing symbol error rate," IEEE Trans. Inf. Theory 20, 284–287 (1974).
[9] R. G. Gallager, Low-Density Parity-Check Codes (Cambridge, MA, MIT Press, 1963).
[10] R. M. Tanner, "A recursive approach to low complexity codes," IEEE Trans. Inf. Theory IT-27, 533–547 (1981).
[11] D. J. C. MacKay and R. M. Neal, "Near Shannon limit performance of low density parity check codes," IEEE Electron. Lett. 32, 1645–1646 (1996).
[12] D. J. C. MacKay, "Good error-correcting codes based on very sparse matrices," IEEE Trans. Inf. Theory 45, 399–431 (1999).
[13] T. Mittelholzer, A. Dholakia, and E. Eleftheriou, "Reduced-complexity decoding of low density parity check codes for generalized partial response channels," IEEE Trans. Magn. 37, 721–728 (2001).
[14] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, "Factor graphs and the sum-product algorithm," IEEE Trans. Inf. Theory 47, 498–519 (2001).
[15] M. Tuchler, R. Koetter, and A. C. Singer, "Turbo equalization: principles and new results," IEEE Trans. Commun. 50, 754–767 (2002).
[16] T. Souvignier, M. Oberg, P. H. Siegel, R. E. Swanson, and J. Wolf, "Turbo decoding for partial response channels," IEEE Trans. Commun. 48, 1297–1308 (2000).
[17] W. Ryan, "Concatenated codes for class IV partial response channels," IEEE Trans. Commun. 49, 445–454 (2001).
[18] C. Douillard, M. Jezequel, C. Berrou, A. Picart, P. Didier, and A. Glavieux, "Iterative correction of intersymbol interference: Turbo equalization," Eur. Trans. Telecomm. 6, 507–511 (1995).
[19] P. Pakzad and V. Anantharam, "Kikuchi approximation method for joint decoding of LDPC codes and partial response channels," IEEE Trans. Commun. 54, 1149–1153 (2006).
[20] O. Shental, A. J. Weiss, N. Shental, and Y. Weiss, "Generalized belief propagation receiver for near-optimal detection of two-dimensional channels with memory," Inform. Theory Workshop, 2004. IEEE, 50, 225–229, Oct. 2004.
[21] J. S. Yedidia, W. T. Freeman, and Y. Weiss, "Constructing free-energy approximations and generalized belief propagation algorithms," IEEE Trans. Inf. Theory 51, 2282–2312 (2005).
[22] H.-A. Loeliger, "An Introduction to Factor Graphs," IEEE Signal Process. Mag., Jan 2001, 28–41.
[23] G. D. Forney, "Codes on graphs: normal realizations," IEEE Trans. Inf. Theory 47, 520–548 (2001).
[24] A. Kavcic, X. Ma, and N. Varnica, "Matched information rate codes for partial response channels," IEEE Trans. Inf. Theory 51, 973–989 (2005).
[25] M. Chertkov and V. Chernyak, "Loop series for discrete statistical models on graphs," J. Stat. Mech. (2006) P06009, cond-mat/0603189.
[26] D. J. C. MacKay, Encyclopedia of sparse graph codes. [Online]. Available: http://www.inference.phy.cam.ac.uk/mackay/codes/data.html
[27] M. Chertkov and V. Chernyak, "Loop calculus helps to improve belief propagation and linear programming decodings of low-density-parity-check codes," presented at 44th Allerton Conference (Sept. 2006, Allerton, IL), arXiv:cs.IT/0609154.
