Asymptotics of Entropy Rate in Special Families of Hidden Markov Chains
We derive an asymptotic formula for the entropy rate of a hidden Markov chain around a "weak Black Hole". We also discuss applications of the asymptotic formula to the asymptotic behavior of certain channels.
**Authors:** Guangyue Han (University of Hong Kong), Brian Marcus (University of British Columbia)
Email: ghan@maths.hku.hk, marcus@math.ubc.ca

November 4, 2018

Index Terms: entropy, entropy rate, hidden Markov chain, hidden Markov model, hidden Markov process

1 Introduction

Consider a discrete finite-valued stationary stochastic process $Y = Y_{-\infty}^{\infty} := \{Y_n : n \in \mathbb{Z}\}$. The entropy rate of $Y$ is defined to be $H(Y) = \lim_{n \to \infty} H(Y_{-n}^0)/(n+1)$; here $H(Y_{-n}^0)$ denotes the joint entropy of $Y_{-n}^0 := \{Y_{-n}, Y_{-n+1}, \cdots, Y_0\}$, and $\log$ is taken to mean the natural logarithm. If $Y$ is a Markov chain with alphabet $\{1, 2, \cdots, B\}$ and transition probability matrix $\Delta$, it is well known that $H(Y)$ can be expressed explicitly in terms of the stationary vector of $Y$ and $\Delta$. A function $Z = Z_{-\infty}^{\infty}$ of the Markov chain $Y$ of the form $Z = \Phi(Y)$ is called a hidden Markov chain; here $\Phi$ is a function defined on $\{1, 2, \cdots, B\}$, taking values in $\mathcal{A} := \{1, 2, \cdots, A\}$ (alternatively, a hidden Markov chain is defined as a Markov chain observed in noise). For a hidden Markov chain, $H(Z)$ turns out (see Equation (1)) to be the integral of a certain function defined on a simplex with respect to a measure due to Blackwell [4]. However, Blackwell's measure is somewhat complicated, and the integral formula appears to be difficult to evaluate in most cases. In general it is very difficult to compute $H(Z)$; so far there is no simple and explicit formula for $H(Z)$.
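As a concrete illustration of the explicit Markov-chain formula mentioned above, the following sketch computes $H(Y)$ from the transition matrix and its stationary vector. The transition matrix values are assumed examples, not taken from the paper.

```python
import numpy as np

def stationary_vector(delta):
    """Left eigenvector of delta for eigenvalue 1, normalized to sum to 1."""
    vals, vecs = np.linalg.eig(delta.T)
    pi = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    return pi / pi.sum()

def markov_entropy_rate(delta):
    """H(Y) = -sum_i pi_i sum_j delta(i,j) log delta(i,j), in nats
    (the paper takes log to mean the natural logarithm)."""
    pi = stationary_vector(delta)
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(delta > 0, delta * np.log(delta), 0.0)
    return -float(pi @ terms.sum(axis=1))

# Assumed two-state example.
delta = np.array([[0.9, 0.1],
                  [0.5, 0.5]])
print(markov_entropy_rate(delta))
```

No such closed form is available once $Y$ is observed through $\Phi$, which is exactly the difficulty the paper addresses.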
Recently, the problem of computing the entropy rate of a hidden Markov chain $Z$ has drawn much interest, and many approaches have been adopted to tackle this problem. For instance, Blackwell's measure has been used to bound the entropy rate [15], and a variation on the Birch bound [3] was introduced in [5]. An efficient Monte Carlo method for computing the entropy rate of a hidden Markov chain was proposed independently by Arnold and Loeliger [1], Pfister et al. [17], and Sharma and Singh [19]. The connection between the entropy rate of a hidden Markov chain and the top Lyapunov exponent of a random matrix product has been observed [10, 11, 12, 6]. In [7], it is shown that under mild positivity assumptions the entropy rate of a hidden Markov chain varies analytically as a function of the underlying Markov chain parameters.

Another recent approach is based on computing the coefficients of an asymptotic expansion of the entropy rate around certain values of the Markov and channel parameters. The first result along these lines was presented in [12], where for a binary symmetric channel with crossover probability $\varepsilon$ (denoted by BSC($\varepsilon$)), the Taylor expansion of $H(Z)$ around $\varepsilon = 0$ is studied for a binary hidden Markov chain of order one. In particular, the first derivative of $H(Z)$ at $\varepsilon = 0$ is expressed very compactly as a Kullback-Leibler divergence between two distributions on binary triplets, derived from the marginal of the input process $X$. Further improvements and new methods for the asymptotic expansion approach were obtained in [16], [20], [21] and [8]. In [16] the authors express the entropy rate for a binary hidden Markov chain where one of the transition probabilities is equal to zero as an asymptotic expansion including an $O(\varepsilon \log \varepsilon)$ term.

This paper is organized as follows.
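The Monte Carlo idea cited above (Arnold-Loeliger, Pfister et al., Sharma-Singh) can be sketched as follows: simulate a long sample path of the hidden Markov chain and average $-\log p(z_n \mid z_1^{n-1})$, computed with the normalized forward recursion. The chain below (a hidden state observed through a deterministic symbol map) is purely illustrative; this is a sketch of the idea, not any paper's exact implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_and_estimate(delta, phi, n_steps):
    """Estimate H(Z) in nats; phi[i] is the observed symbol of hidden state i."""
    B = delta.shape[0]
    symbols = np.asarray(phi)
    # Zero out the columns of delta whose symbol differs from a (the matrices
    # Delta_a of Section 2 of the paper).
    delta_a = {a: delta * (symbols[None, :] == a) for a in set(phi)}
    y = 0
    x = np.full(B, 1.0 / B)          # forward belief p(y_n = . | z_1^n)
    log_prob = 0.0
    for _ in range(n_steps):
        y = rng.choice(B, p=delta[y])    # advance the hidden chain
        z = symbols[y]                   # observe a symbol
        u = x @ delta_a[z]
        r = u.sum()                      # p(z_n | z_1^{n-1}) under current belief
        log_prob += np.log(r)
        x = u / r                        # normalized forward update
    return -log_prob / n_steps
```

By the Shannon-McMillan-Breiman theorem the average converges to $H(Z)$ as the path length grows.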
In Section 2 we give an asymptotic formula (Theorem 2.8) for the entropy rate of a hidden Markov chain around a weak Black Hole. The coefficients in the formula can be computed in principle (although explicit computations may be quite complicated in general). The formula can be viewed as a generalization of the Black Hole condition considered in [8]. The weak Black Hole case is important for hidden Markov chains obtained as output processes of noisy channels, corresponding to input processes for which certain sequences have probability zero. Examples are given in Section 3. Example 3.1 was already treated in [9], but only for the first few coefficients; in that case, however, these coefficients were computed quite explicitly.

2 Asymptotic Formula for Entropy Rate

Let $W$ be the simplex comprising the vectors $\{w = (w_1, w_2, \cdots, w_B) \in \mathbb{R}^B : w_i \geq 0, \sum_i w_i = 1\}$, and let $W_a$ be the set of all $w \in W$ with $w_i = 0$ for $\Phi(i) \neq a$. For $a \in \mathcal{A}$, let $\Delta_a$ denote the $B \times B$ matrix such that $\Delta_a(i,j) = \Delta(i,j)$ for $j$ with $\Phi(j) = a$, and $\Delta_a(i,j) = 0$ otherwise. For $a \in \mathcal{A}$, define the scalar-valued and vector-valued functions $r_a$ and $f_a$ on $W$ by
$$r_a(w) = w \Delta_a \mathbf{1}, \qquad f_a(w) = w \Delta_a / r_a(w).$$
Note that $f_a$ defines the action of the matrix $\Delta_a$ on the simplex $W$.

If $Y$ is irreducible, it turns out that
$$H(Z) = -\int \sum_a r_a(w) \log r_a(w) \, dQ(w), \qquad (1)$$
where $Q$ is Blackwell's measure [4] on $W$. This measure, which satisfies an integral equation dependent on the parameters of the process, is however very hard to extract from the equation in any explicit way.

Definition 2.1 (see [8]). Suppose that for every $a \in \mathcal{A}$, $\Delta_a$ is a rank one matrix, and every column of $\Delta_a$ is either strictly positive or all zeros. We call this the Black Hole case.
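The matrices $\Delta_a$ and the Black Hole condition of Definition 2.1 (together with the weaker rank-one-or-zero condition of Definition 2.2 below) are mechanical to check numerically. A sketch, using the $\Delta(0)$ of Example 3.1 with assumed values $\pi_{00} = 0.8$, $\pi_{01} = 0.2$, $\pi_{10} = 0.3$, $\pi_{11} = 0.7$:

```python
import numpy as np

def split_by_symbol(delta, phi):
    """Delta_a(i, j) = Delta(i, j) if Phi(j) == a, else 0."""
    phi = np.asarray(phi)
    return {a: delta * (phi[None, :] == a) for a in np.unique(phi)}

def is_weak_black_hole(delta, phi):
    """Definition 2.2: every Delta_a is all zero or rank one."""
    return all(np.linalg.matrix_rank(d) <= 1
               for d in split_by_symbol(delta, phi).values())

def is_black_hole(delta, phi):
    """Definition 2.1: every Delta_a has rank one, and each of its
    columns is strictly positive or all zero."""
    for d in split_by_symbol(delta, phi).values():
        if np.linalg.matrix_rank(d) != 1:
            return False
        if not all((c > 0).all() or (c == 0).all() for c in d.T):
            return False
    return True

# Delta(0) for a binary Markov chain through BSC, states (0,0),(0,1),(1,0),(1,1).
delta0 = np.array([[0.8, 0.0, 0.2, 0.0],
                   [0.8, 0.0, 0.2, 0.0],
                   [0.3, 0.0, 0.7, 0.0],
                   [0.3, 0.0, 0.7, 0.0]])
phi = [0, 1, 1, 0]   # Phi(x, e) = x XOR e
```

With all $\pi_{ij} > 0$ this passes the Black Hole test; setting $\pi_{00} = 0$ would leave only the weak condition, as discussed in Example 3.1.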
It was shown in [8] that $H(Z)$ is analytic around a Black Hole and that the derivatives of $H(Z)$ can be exactly computed around a Black Hole. In the sequel, we consider weakened assumptions and prove an asymptotic formula for the entropy rate of a hidden Markov chain around a "weak Black Hole", generalizing the corresponding result in [8].

Definition 2.2. Suppose that for every $a \in \mathcal{A}$, $\Delta_a$ is either an all zero matrix or a rank one matrix. We call this the weak Black Hole case.

We use the standard notation: by $\alpha = \Theta(\beta)$, we mean there exist positive constants $C_1, C_2$ such that $C_1 |\beta| \leq |\alpha| \leq C_2 |\beta|$, while by $\alpha = O(\beta)$, we mean there exists a positive constant $C$ such that $|\alpha| \leq C |\beta|$. For a given analytic function $f(\varepsilon)$ around $\varepsilon = 0$, let $\mathrm{ord}(f(\varepsilon))$ denote its order, i.e., the degree of the first non-zero term of its Taylor series expansion around $\varepsilon = 0$. Note that for an analytic function $f(\varepsilon)$ around $\varepsilon = 0$, $f(\varepsilon) = \Theta(\varepsilon^k) \iff \mathrm{ord}(f(\varepsilon)) = k$.

We say $\Delta(\varepsilon)$ is normally parameterized by $\varepsilon$ ($\varepsilon \geq 0$) if

1. each entry of $\Delta(\varepsilon)$ is an analytic function at $\varepsilon = 0$,
2. when $\varepsilon > 0$, $\Delta(\varepsilon)$ is (non-negative and) irreducible,
3. $\Delta(0)$ is a weak Black Hole.

In the following, expressions like $p_X(x)$ will be used to mean $P(X = x)$, and we drop the subscripts if the context is clear: $p(x), p(z)$ mean $P(X = x), P(Z = z)$, respectively, and further $p(y|x), p(z_0 | z_{-n}^{-1})$ mean $P(Y = y | X = x), P(Z_0 = z_0 | Z_{-n}^{-1} = z_{-n}^{-1})$, respectively.

Proposition 2.3. Suppose that $\Delta(\varepsilon)$ is analytically parameterized by $\varepsilon \geq 0$ and when $\varepsilon > 0$, $\Delta(\varepsilon)$ is non-negative and irreducible. Then for any fixed hidden Markov sequence $z_{-n}^0 \in \mathcal{A}^{n+1}$,

1. $p(z_{-n}^{-1})$ is analytic around $\varepsilon = 0$;
2. $p(y_i = \cdot \,|\, z_{-n}^i) := (p(y_i = b \,|\, z_{-n}^i) : b = 1, 2, \cdots, B)$ is analytic around $\varepsilon = 0$, where $\cdot$ ranges over the $B$ possible states of the Markov chain $Y$;

3. $p(z_0 | z_{-n}^{-1})$ is analytic around $\varepsilon = 0$.

Proof. 1. When $\varepsilon > 0$, $\Delta(\varepsilon)$ is non-negative and irreducible. By Perron-Frobenius theory [18], $\Delta(\varepsilon)$ has a unique positive stationary vector, say $\pi(\varepsilon)$. Since $\mathrm{adj}(I - \Delta(\varepsilon))(I - \Delta(\varepsilon)) = \det(I - \Delta(\varepsilon)) I = 0$ (here $\mathrm{adj}(\cdot)$ denotes the adjugate operator on matrices), one can choose $\pi(\varepsilon)$ to be any normalized row vector of $\mathrm{adj}(I - \Delta(\varepsilon))$. So $\pi(\varepsilon)$ can be written as
$$\frac{(\pi_1(\varepsilon), \pi_2(\varepsilon), \cdots, \pi_B(\varepsilon))}{\pi_1(\varepsilon) + \pi_2(\varepsilon) + \cdots + \pi_B(\varepsilon)},$$
where the $\pi_i(\varepsilon)$'s are non-negative analytic functions of $\varepsilon$ and the first non-zero term of every $\pi_i(\varepsilon)$'s Taylor series expansion has a positive coefficient. Then we conclude that for each $i$,
$$\mathrm{ord}(\pi_i(\varepsilon)) \geq \mathrm{ord}(\pi_1(\varepsilon) + \cdots + \pi_B(\varepsilon)),$$
and thus $\pi(\varepsilon)$, which is uniquely defined for $\varepsilon > 0$, can be continuously extended to $\varepsilon = 0$ by setting $\pi(0) = \lim_{\varepsilon \to 0} \pi(\varepsilon)$. Now
$$p(z_{-n}^{-1}) = \pi(\varepsilon) \Delta_{z_{-n}} \cdots \Delta_{z_{-1}} \mathbf{1} = \frac{(\pi_1(\varepsilon), \pi_2(\varepsilon), \cdots, \pi_B(\varepsilon)) \Delta_{z_{-n}} \cdots \Delta_{z_{-1}} \mathbf{1}}{\pi_1(\varepsilon) + \pi_2(\varepsilon) + \cdots + \pi_B(\varepsilon)} =: \frac{f(\varepsilon)}{g(\varepsilon)}, \qquad (2)$$
where $\mathrm{ord}(f(\varepsilon)) \geq \mathrm{ord}(g(\varepsilon))$. It then follows that $p(z_{-n}^{-1})$ is analytic around $\varepsilon = 0$.

2. Let $x_{i,-n} = x_{i,-n}(z_{-n}^i)$ denote $p(y_i = \cdot \,|\, z_{-n}^i)$. Then one checks that $x_{i,-n}$ satisfies the following iteration:
$$x_{i,-n} = \frac{x_{i-1,-n} \Delta_{z_i}}{x_{i-1,-n} \Delta_{z_i} \mathbf{1}}, \qquad -n \leq i \leq -1, \qquad (3)$$
starting with $x_{-n-1,-n} = p(y_{-n-1} = \cdot)$. Because $\Delta$ is analytically parameterized by $\varepsilon$ ($\varepsilon \geq 0$) and $\Delta(\varepsilon)$ is non-negative and irreducible when $\varepsilon > 0$, inductively we can prove (the proof is similar to the proof of part 1)
that for any $i$, $x_{i,-n}$ can be written as
$$x_{i,-n} = \frac{(f_1(\varepsilon), f_2(\varepsilon), \cdots, f_B(\varepsilon))}{f_1(\varepsilon) + f_2(\varepsilon) + \cdots + f_B(\varepsilon)},$$
where the $f_i(\varepsilon)$'s are analytic functions around $\varepsilon = 0$. Note that for each $i$,
$$\mathrm{ord}(f_i(\varepsilon)) \geq \mathrm{ord}(f_1(\varepsilon) + f_2(\varepsilon) + \cdots + f_B(\varepsilon)).$$
The existence of the Taylor series expansion of $x_{i,-n}$ around $\varepsilon = 0$ (for any $i$) then follows.

3. One checks that
$$p(z_0 | z_{-n}^{-1}) = x_{-1,-n} \Delta_{z_0} \mathbf{1}. \qquad (4)$$
Analyticity of $p(z_0 | z_{-n}^{-1})$ immediately follows from (4) and the analyticity of $x_{-1,-n}$ around $\varepsilon = 0$, which has been shown in part 2.

Lemma 2.4. Consider two formal series expansions $f(x), g(x) \in \mathbb{R}[[x]]$ such that $f(x) = \sum_{i=0}^{\infty} f_i x^i$ and $g(x) = \sum_{i=0}^{\infty} g_i x^i$, where $g_0 \neq 0$. Let $h(x) \in \mathbb{R}[[x]]$ be the quotient of $f(x)$ and $g(x)$ with $h(x) = \sum_{i=0}^{\infty} h_i x^i$. Then $h_i$ is a function dependent only on $f_0, \cdots, f_i$ and $g_0, \cdots, g_i$.

Proof. Comparing the coefficients of all the terms in the following identity:
$$\left( \sum_{i=0}^{\infty} h_i x^i \right) \left( \sum_{i=0}^{\infty} g_i x^i \right) = \sum_{i=0}^{\infty} f_i x^i,$$
we obtain that for any $i$,
$$h_0 g_i + h_1 g_{i-1} + \cdots + h_i g_0 = f_i.$$
The lemma then follows from an induction (on $i$) argument.

By Proposition 2.3, for any hidden Markov string $z_{-m}^0$, the Taylor series expansion of $p(z_0 | z_{-m}^{-1})$ around $\varepsilon = 0$ exists. We use $b_j(z_{-m}^0)$ to represent the coefficient of $\varepsilon^j$ in the expansion, namely
$$p(z_0 | z_{-m}^{-1}) = b_0(z_{-m}^0) + b_1(z_{-m}^0) \varepsilon + b_2(z_{-m}^0) \varepsilon^2 + \cdots. \qquad (5)$$
The following lemma shows that under certain conditions, some coefficients $b_j(z_{-m}^0)$ "stabilize". More precisely, we have:

Lemma 2.5. Consider a hidden Markov chain $Z$ with normally parameterized $\Delta(\varepsilon)$.
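The recursion extracted in the proof of Lemma 2.4, $h_i = (f_i - \sum_{l=1}^{i} g_l h_{i-l})/g_0$, is easy to implement directly. A minimal sketch (the coefficient lists in the test are assumed examples):

```python
def series_quotient(f, g, k):
    """First k+1 Taylor coefficients of h = f/g, given coefficient lists
    f = [f_0, f_1, ...], g = [g_0, g_1, ...] with g[0] != 0.
    Missing coefficients are treated as zero."""
    h = []
    for i in range(k + 1):
        fi = f[i] if i < len(f) else 0.0
        # acc = g_1 h_{i-1} + ... + g_i h_0
        acc = sum((g[l] if l < len(g) else 0.0) * h[i - l] for l in range(1, i + 1))
        h.append((fi - acc) / g[0])
    return h
```

As the lemma states, $h_i$ only ever touches $f_0, \cdots, f_i$ and $g_0, \cdots, g_i$, which is what makes the truncated expansions used below well defined.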
For two fixed hidden Markov chain sequences $z_{-m}^0, \hat{z}_{-\hat{m}}^0$ such that $z_{-n}^0 = \hat{z}_{-n}^0$ and
$$\mathrm{ord}(p(z_{-n}^{-1} | z_{-m}^{-n-1})), \ \mathrm{ord}(p(\hat{z}_{-n}^{-1} | \hat{z}_{-\hat{m}}^{-n-1})) \leq k$$
for some $n \leq m, \hat{m}$ and some $k$, we have for $j$ with $0 \leq j \leq n - 4k - 1$,
$$b_j(z_{-m}^0) = b_j(\hat{z}_{-\hat{m}}^0).$$

Proof. Recall that $x_{i,-m} = x_{i,-m}(z_{-m}^i) = p(y_i = \cdot \,|\, z_{-m}^i)$ and $\hat{x}_{i,-\hat{m}} = \hat{x}_{i,-\hat{m}}(\hat{z}_{-\hat{m}}^i) = p(y_i = \cdot \,|\, \hat{z}_{-\hat{m}}^i)$, where $\cdot$ ranges over the possible states of the Markov chain $Y$. Consider the Taylor series expansions of $x_{i,-m}, \hat{x}_{i,-\hat{m}}$ around $\varepsilon = 0$:
$$x_{i,-m} = a_0(z_{-m}^i) + a_1(z_{-m}^i)\varepsilon + a_2(z_{-m}^i)\varepsilon^2 + \cdots \qquad (6)$$
$$\hat{x}_{i,-\hat{m}} = a_0(\hat{z}_{-\hat{m}}^i) + a_1(\hat{z}_{-\hat{m}}^i)\varepsilon + a_2(\hat{z}_{-\hat{m}}^i)\varepsilon^2 + \cdots \qquad (7)$$
We shall show that $a_j(z_{-m}^i) = a_j(\hat{z}_{-\hat{m}}^i)$ for $j$ with
$$0 \leq j \leq n + i - \sum_{l=-n}^{i} \max\{J(z_{-m}^l), J(\hat{z}_{-\hat{m}}^l)\},$$
where for any hidden Markov sequence $z_{-m}^i$,
$$J(z_{-m}^i) = \begin{cases} 1 + \mathrm{ord}(p(z_i | z_{-m}^{i-1})) & \text{if } \mathrm{ord}(p(z_i | z_{-m}^{i-1})) > 0, \\ 0 & \text{if } \mathrm{ord}(p(z_i | z_{-m}^{i-1})) = 0. \end{cases}$$

Recall that
$$x_{i+1,-m} = \frac{x_{i,-m} \Delta_{z_{i+1}}(\varepsilon)}{x_{i,-m} \Delta_{z_{i+1}}(\varepsilon) \mathbf{1}}. \qquad (8)$$
Now with (6) and (7), we have
$$x_{i,-m} \Delta_{z_{i+1}}(\varepsilon) = \left( \sum_{j=0}^{\infty} a_j(z_{-m}^i) \varepsilon^j \right) \left( \sum_{k=0}^{\infty} \frac{\Delta_{z_{i+1}}^{(k)}(0)}{k!} \varepsilon^k \right) = \sum_{l=0}^{\infty} c_l(z_{-m}^{i+1}) \varepsilon^l, \qquad (9)$$
where the superscript $(k)$ denotes the $k$-th order derivative with respect to $\varepsilon$.

We proceed by induction on $i$ (from $-n$ to $-1$). First consider the case $i = -n$. When $\max\{J(z_{-m}^{-n}), J(\hat{z}_{-\hat{m}}^{-n})\} > 0$, the statement is vacuously true; when $J(z_{-m}^{-n}) = J(\hat{z}_{-\hat{m}}^{-n}) = 0$, necessarily $\Delta_{z_{-n}}(0)$ is a rank one matrix, $a_0(z_{-m}^{-n-1}) \Delta_{z_{-n}}(0) \mathbf{1} > 0$ and $a_0(\hat{z}_{-\hat{m}}^{-n-1}) \Delta_{z_{-n}}(0) \mathbf{1} > 0$.
Then we have
$$a_0(z_{-m}^{-n}) = \frac{a_0(z_{-m}^{-n-1}) \Delta_{z_{-n}}(0)}{a_0(z_{-m}^{-n-1}) \Delta_{z_{-n}}(0) \mathbf{1}} \stackrel{(*)}{=} \frac{a_0(\hat{z}_{-\hat{m}}^{-n-1}) \Delta_{z_{-n}}(0)}{a_0(\hat{z}_{-\hat{m}}^{-n-1}) \Delta_{z_{-n}}(0) \mathbf{1}} = a_0(\hat{z}_{-\hat{m}}^{-n}),$$
where $(*)$ follows from the fact that $\Delta_{z_{-n}}(0)$ is a rank one matrix.

Now suppose $i \geq -n$ and that $a_j(z_{-m}^i) = a_j(\hat{z}_{-\hat{m}}^i)$ for $j$ with $0 \leq j \leq n + i - \sum_{l=-n}^{i} \max\{J(z_{-m}^l), J(\hat{z}_{-\hat{m}}^l)\}$. If $\mathrm{ord}(p(z_{i+1} | z_{-m}^i)) > 0$, then since the leading coefficient vector of the Taylor series expansion in (9) is non-negative, $c_j(z_{-m}^{i+1}) \equiv 0$ for all $j$ with $0 \leq j \leq J(z_{-m}^{i+1}) - 2$ and $c_{J(z_{-m}^{i+1})-1}(z_{-m}^{i+1}) \not\equiv 0$. So applying Lemma 2.4 to the expression
$$x_{i+1,-m} = \frac{c_0(z_{-m}^{i+1}) + c_1(z_{-m}^{i+1})\varepsilon + \cdots + c_l(z_{-m}^{i+1})\varepsilon^l + \cdots}{c_0(z_{-m}^{i+1})\mathbf{1} + c_1(z_{-m}^{i+1})\mathbf{1}\,\varepsilon + \cdots + c_l(z_{-m}^{i+1})\mathbf{1}\,\varepsilon^l + \cdots} = \frac{\sum_{l=0}^{\infty} c_{l+J(z_{-m}^{i+1})-1}(z_{-m}^{i+1})\, \varepsilon^l}{\sum_{l=0}^{\infty} c_{l+J(z_{-m}^{i+1})-1}(z_{-m}^{i+1})\, \mathbf{1}\, \varepsilon^l}, \qquad (10)$$
we conclude that for all $j$, $a_j(z_{-m}^{i+1})$ depends only on
$$c_l(z_{-m}^{i+1}), \qquad J(z_{-m}^{i+1}) - 1 \leq l \leq J(z_{-m}^{i+1}) - 1 + j,$$
implying that $a_j(z_{-m}^{i+1})$ depends only on (some of)
$$a_l(z_{-m}^i), \ \Delta_{z_{i+1}}^{(l)}(0), \qquad 0 \leq l \leq J(z_{-m}^{i+1}) - 1 + j.$$
A completely parallel argument applies to the case $\mathrm{ord}(p(\hat{z}_{i+1} | \hat{z}_{-\hat{m}}^i)) > 0$. More specifically, the statements above for the case $\mathrm{ord}(p(z_{i+1} | z_{-m}^i)) > 0$ are still true if we replace $z, x, m$ with $\hat{z}, \hat{x}, \hat{m}$, which implies that $a_j(\hat{z}_{-\hat{m}}^{i+1})$ depends only on (some of)
$$a_l(\hat{z}_{-\hat{m}}^i), \ \Delta_{\hat{z}_{i+1}}^{(l)}(0), \qquad 0 \leq l \leq J(\hat{z}_{-\hat{m}}^{i+1}) - 1 + j.$$
Thus when $\max\{J(z_{-m}^{i+1}), J(\hat{z}_{-\hat{m}}^{i+1})\} > 0$, we have $a_j(z_{-m}^{i+1}) = a_j(\hat{z}_{-\hat{m}}^{i+1})$ for $j$ with
$$0 \leq j \leq n + i - \sum_{l=-n}^{i} \max\{J(z_{-m}^l), J(\hat{z}_{-\hat{m}}^l)\} - \max\{J(z_{-m}^{i+1}) - 1, J(\hat{z}_{-\hat{m}}^{i+1}) - 1\} = n + (i+1) - \sum_{l=-n}^{i+1} \max\{J(z_{-m}^l), J(\hat{z}_{-\hat{m}}^l)\}.$$

If $\mathrm{ord}(p(z_{i+1} | z_{-m}^i)) = 0$, by (4) necessarily we have $a_0(z_{-m}^i) \Delta_{z_{i+1}}(0) \mathbf{1} \neq 0$. Again by Lemma 2.4 applied to expression (10), for any $j$, $a_j(z_{-m}^{i+1})$ depends only on
$$a_l(z_{-m}^i), \ \Delta_{z_{i+1}}^{(l)}(0), \qquad 0 \leq l \leq j.$$
Similarly, if $\mathrm{ord}(p(\hat{z}_{i+1} | \hat{z}_{-\hat{m}}^i)) = 0$, we deduce that for any $j$, $a_j(\hat{z}_{-\hat{m}}^{i+1})$ depends only on
$$a_l(\hat{z}_{-\hat{m}}^i), \ \Delta_{\hat{z}_{i+1}}^{(l)}(0), \qquad 0 \leq l \leq j.$$
Thus if $\max\{J(z_{-m}^{i+1}), J(\hat{z}_{-\hat{m}}^{i+1})\} = 0$, for any $j$ with
$$0 \leq j \leq n + i - \sum_{l=-n}^{i} \max\{J(z_{-m}^l), J(\hat{z}_{-\hat{m}}^l)\} = n + i - \sum_{l=-n}^{i+1} \max\{J(z_{-m}^l), J(\hat{z}_{-\hat{m}}^l)\},$$
we have $a_j(z_{-m}^{i+1}) = a_j(\hat{z}_{-\hat{m}}^{i+1})$.

Now let $t = n + (i+1) - \sum_{l=-n}^{i+1} \max\{J(z_{-m}^l), J(\hat{z}_{-\hat{m}}^l)\}$. Then one can show that
$$a_t(z_{-m}^{i+1}) = \frac{a_t(z_{-m}^i) \Delta_{z_{i+1}}(0)\, a_0(z_{-m}^i) \Delta_{z_{i+1}}(0) \mathbf{1} - a_0(z_{-m}^i) \Delta_{z_{i+1}}(0)\, a_t(z_{-m}^i) \Delta_{z_{i+1}}(0) \mathbf{1}}{\left( a_0(z_{-m}^i) \Delta_{z_{i+1}}(0) \mathbf{1} \right)^2} + \text{other terms},$$
where the first term in the expression above is equal to $0$ (since $\Delta_{z_{i+1}}(0)$ is a rank one matrix), and the "other terms" are functions of
$$a_0(z_{-m}^i), \cdots, a_{t-1}(z_{-m}^i), \ \Delta_{z_{i+1}}^{(0)}(0), \cdots, \Delta_{z_{i+1}}^{(t)}(0). \qquad (11)$$
It follows that $a_t(z_{-m}^{i+1})$ is a function of the quantities in (11). By a completely parallel argument, $a_t(\hat{z}_{-\hat{m}}^{i+1})$ is the same function of the corresponding quantities in (11). So we have $a_j(z_{-m}^{i+1}) = a_j(\hat{z}_{-\hat{m}}^{i+1})$ for $j$ with
$$0 \leq j \leq n + (i+1) - \sum_{l=-n}^{i+1} \max\{J(z_{-m}^l), J(\hat{z}_{-\hat{m}}^l)\}.$$
Notice that
$$\sum_{l=-n}^{-1} \max\{J(z_{-m}^l), J(\hat{z}_{-\hat{m}}^l)\} \leq \sum_{l=-n}^{-1} \left( J(z_{-m}^l) + J(\hat{z}_{-\hat{m}}^l) \right) \leq 4k.$$
The lemma then immediately follows from (4) and the proven fact that $a_j(z_{-m}^{-1}) = a_j(\hat{z}_{-\hat{m}}^{-1})$ for $j$ with
$$0 \leq j \leq n - 1 - \sum_{l=-n}^{-1} \max\{J(z_{-m}^l), J(\hat{z}_{-\hat{m}}^l)\}.$$

For a mapping $v = v(\varepsilon) : [0, \infty) \to W$ analytic at $\varepsilon = 0$ and a hidden Markov sequence $z_{-n}^0$, define
$$p_v(z_{-n}^{-1}) = v \Delta_{z_{-n}} \cdots \Delta_{z_{-1}} \mathbf{1}, \qquad p_v(z_0 | z_{-n}^{-1}) = \frac{p_v(z_{-n}^0)}{p_v(z_{-n}^{-1})}.$$
Let $b_{v,j}(z_{-n}^0)$ denote the coefficient of $\varepsilon^j$ in the Taylor series expansion of $p_v(z_0 | z_{-n}^{-1})$ (note that $b_{v,j}(z_{-n}^0)$ does not depend on $\varepsilon$):
$$p_v(z_0 | z_{-n}^{-1}) = \sum_{j=0}^{\infty} b_{v,j}(z_{-n}^0)\, \varepsilon^j.$$
Using the same inductive approach as in Lemma 2.5, we can prove:

Lemma 2.6. For two mappings $v = v(\varepsilon), \hat{v} = \hat{v}(\varepsilon) : [0, \infty) \to W$ analytic at $\varepsilon = 0$, if $\mathrm{ord}(p_v(z_{-n}^{-1})), \mathrm{ord}(p_{\hat{v}}(z_{-n}^{-1})) \leq k$, then we have
$$b_{v,j}(z_{-n}^0) = b_{\hat{v},j}(z_{-n}^0), \qquad 0 \leq j \leq n - 4k - 1.$$

Note that for $n \leq m, \hat{m}$, if $v(\varepsilon)$ (or $\hat{v}(\varepsilon)$) is equal to $p(y_{-n-1} = \cdot \,|\, z_{-m}^{-n-1})$ (or $p(y_{-n-1} = \cdot \,|\, \hat{z}_{-\hat{m}}^{-n-1})$), then $p_v(z_{-n}^0)$ (or $p_{\hat{v}}(z_{-n}^0)$) will be equal to $p(z_{-n}^0 | z_{-m}^{-n-1})$ (or $p(z_{-n}^0 | \hat{z}_{-\hat{m}}^{-n-1})$); and if, for a Markov state $y$, $v(\varepsilon)$ (or $\hat{v}(\varepsilon)$) is equal to $p(y_{-n-1} = \cdot \,|\, z_{-m}^{-n-1} y)$ (or $p(y_{-n-1} = \cdot \,|\, \hat{z}_{-\hat{m}}^{-n-1} y)$), then $p_v(z_{-n}^0)$ (or $p_{\hat{v}}(z_{-n}^0)$) will be equal to $p(z_{-n}^0 | z_{-m}^{-n-1} y)$ (or $p(z_{-n}^0 | \hat{z}_{-\hat{m}}^{-n-1} y)$). It then immediately follows that:

Corollary 2.7.
Given fixed sequences $z_{-m}^0, \hat{z}_{-\hat{m}}^0, z_{-m}^0 y_{-m-1}, \hat{z}_{-\hat{m}}^0 y_{-\hat{m}-1}$ with $z_{-n}^0 = \hat{z}_{-n}^0$ such that
$$\mathrm{ord}(p(z_{-n}^{-1} | z_{-m}^{-n-1})), \ \mathrm{ord}(p(\hat{z}_{-n}^{-1} | \hat{z}_{-\hat{m}}^{-n-1})), \ \mathrm{ord}(p(z_{-n}^{-1} | z_{-m}^{-n-1} y_{-m-1})), \ \mathrm{ord}(p(\hat{z}_{-n}^{-1} | \hat{z}_{-\hat{m}}^{-n-1} y_{-\hat{m}-1})) \leq k$$
for $n \leq m, \hat{m}$ and some $k$, we have for $j$ with $0 \leq j \leq n - 4k - 1$,
$$b_j(z_{-m}^0 y_{-m-1}) = b_j(\hat{z}_{-\hat{m}}^0 y_{-\hat{m}-1}) = b_j(z_{-m}^0) = b_j(\hat{z}_{-\hat{m}}^0), \qquad (12)$$
where, slightly abusing the notation, we define $b_j(z_{-m}^0 y_{-m-1}), b_j(\hat{z}_{-\hat{m}}^0 y_{-\hat{m}-1})$ as the coefficients of the Taylor series expansions of $p(z_0 | z_{-m}^{-1} y_{-m-1}), p(\hat{z}_0 | \hat{z}_{-\hat{m}}^{-1} y_{-\hat{m}-1})$, respectively.

Consider expression (5). In the following, we use $p^{\langle l \rangle}(z_0 | z_{-n}^{-1})$ to denote the truncated (up to the $(l+1)$-st term) Taylor series expansion of $p(z_0 | z_{-n}^{-1})$, i.e.,
$$p^{\langle l \rangle}(z_0 | z_{-n}^{-1}) = b_0(z_{-n}^0) + b_1(z_{-n}^0)\varepsilon + b_2(z_{-n}^0)\varepsilon^2 + \cdots + b_l(z_{-n}^0)\varepsilon^l.$$

Theorem 2.8. For a hidden Markov chain $Z$ with normally parameterized $\Delta(\varepsilon)$, we have for any $k \geq 0$,
$$H(Z) = H(Z)|_{\varepsilon=0} + \sum_{j=1}^{k+1} f_j \varepsilon^j \log \varepsilon + \sum_{j=1}^{k} g_j \varepsilon^j + O(\varepsilon^{k+1}), \qquad (13)$$
where the $f_j$'s and $g_j$'s for $j = 1, 2, \cdots, k+1$ are functions (more specifically, elementary functions built from $\log$ and polynomials) of $\Delta^{(i)}(0)$ for $0 \leq i \leq 6k + 6$ and can be computed from $H_{6k+6}(Z(\varepsilon))$.

Proof. First fix $n$ such that $n \geq n_0 = 6k + 6$. Consider the Birch upper bound on $H(Z)$:
$$H_n(Z) := H(Z_0 | Z_{-n}^{-1}) = -\sum_{z_{-n}^0} p(z_{-n}^0) \log p(z_0 | z_{-n}^{-1}).$$
Note that for $j \geq k + 2$,
$$\sum_{\mathrm{ord}(p(z_{-n}^0)) = j} p(z_{-n}^0) \log p(z_0 | z_{-n}^{-1}) = O(\varepsilon^{k+1}). \qquad (14)$$
So, in the following we only consider the sequences $z_{-n}^0$ with $\mathrm{ord}(p(z_{-n}^0)) \leq k + 1$.
For such sequences, since $\mathrm{ord}(p(z_0 | z_{-n}^{-1})) \leq \mathrm{ord}(p(z_{-n}^0)) \leq k + 1$, we have
$$\left| \log p(z_0 | z_{-n}^{-1}) - \log p^{\langle 2k+1 \rangle}(z_0 | z_{-n}^{-1}) \right| = O(\varepsilon^{k+1}); \qquad (15)$$
and by Lemma 2.5, we have
$$p^{\langle 2k+1 \rangle}(z_0 | z_{-n}^{-1}) = p^{\langle 2k+1 \rangle}(z_0 | z_{-n_0}^{-1}). \qquad (16)$$
Now for any fixed $n \geq n_0$,
$$H_n(Z) = \sum_{z_{-n}^0} -p(z_{-n}^0) \log p(z_0 | z_{-n}^{-1})$$
$$\stackrel{(a)}{=} \sum_{\mathrm{ord}(p(z_{-n}^0)) \leq k+1} -p(z_{-n}^0) \log p(z_0 | z_{-n}^{-1}) + O(\varepsilon^{k+1})$$
$$\stackrel{(b)}{=} \sum_{\mathrm{ord}(p(z_{-n}^0)) \leq k+1} -p(z_{-n}^0) \log p^{\langle 2k+1 \rangle}(z_0 | z_{-n}^{-1}) + O(\varepsilon^{k+1})$$
$$\stackrel{(c)}{=} \sum_{\mathrm{ord}(p(z_{-n_0}^0)) \leq k+1} -p(z_{-n}^0) \log p^{\langle 2k+1 \rangle}(z_0 | z_{-n_0}^{-1}) + O(\varepsilon^{k+1})$$
$$= \sum_{\mathrm{ord}(p(z_{-n_0}^0)) \leq k+1} -p(z_{-n_0}^0) \log p^{\langle 2k+1 \rangle}(z_0 | z_{-n_0}^{-1}) + O(\varepsilon^{k+1}), \qquad (17)$$
where (a) follows from (14); (b) follows from (15); (c) follows from (16), (14) and the fact that
$$\{z_{-n}^0 : \mathrm{ord}(p(z_{-n_0}^0)) \leq k+1\} = \{z_{-n}^0 : \mathrm{ord}(p(z_{-n}^0)) \leq k+1\} \cup \{z_{-n}^0 : \mathrm{ord}(p(z_{-n_0}^0)) \leq k+1, \ \mathrm{ord}(p(z_{-n}^0)) \geq k+2\}.$$
Expanding (17), we obtain
$$H_n(Z) = H(Z)|_{\varepsilon=0} + \sum_{j=1}^{k+1} f_j \varepsilon^j \log \varepsilon + \sum_{j=1}^{k} g_j \varepsilon^j + O(\varepsilon^{k+1}),$$
where the $f_j$'s and $g_j$'s for $j = 1, 2, \cdots, k+1$ are functions dependent only on $\Delta^{(i)}(0)$ for $0 \leq i \leq n_0$ and can be computed from $H_{n_0}(Z)$ (in fact, for fixed $j$, $f_j$ and $g_j$ are functions dependent only on $\Delta^{(i)}(0)$ for $0 \leq i \leq 6j + 6$ and can be computed from $H_{6j+6}(Z)$). In particular,
$$\sum_{\mathrm{ord}(p(z_{-n}^0)) \leq k+1} \ \sum_{\mathrm{ord}(p(z_0 | z_{-n_0}^{-1})) = 0} -p(z_{-n_0}^0) \log p^{\langle 2k+1 \rangle}(z_0 | z_{-n_0}^{-1}) \qquad (18)$$
will contribute to $H(Z)|_{\varepsilon=0}$ and the $\varepsilon^j$ terms, and
$$\sum_{\mathrm{ord}(p(z_{-n}^0)) \leq k+1} \ \sum_{\mathrm{ord}(p(z_0 | z_{-n_0}^{-1})) > 0} -p(z_{-n_0}^0) \log p^{\langle 2k+1 \rangle}(z_0 | z_{-n_0}^{-1}) \qquad (19)$$
will contribute to the $\varepsilon^j \log \varepsilon$ terms and the $\varepsilon^j$ terms.
Using Corollary 2.7, one can apply a similar argument as above to the Birch lower bound
$$\tilde{H}_n(Z) := H(Z_0 | Z_{-n}^{-1} Y_{-n-1}) = \sum_{z_{-n}^0, y_{-n-1}} -p(z_{-n}^0 y_{-n-1}) \log p(z_0 | z_{-n}^{-1} y_{-n-1}).$$
For the same $n_0$, one can show that $\tilde{H}_n(Z)$ takes the same form (17) as $H_n(Z)$, which implies that $H_n(Z)$ and $\tilde{H}_n(Z)$ have exactly the same coefficients of $\varepsilon^j$ for $j \leq k$ and of $\varepsilon^j \log \varepsilon$ for $j \leq k + 1$ when $n \geq n_0$. We thus prove the theorem.

Remark 2.9. Theorem 2.8 still holds if we assume each entry of $\Delta(\varepsilon)$ is merely a $C^{k+1}$ function of $\varepsilon$ in a neighborhood of $\varepsilon = 0$: the proof still works if "analytic" is replaced by "$C^{k+1}$", and the Taylor series expansions are replaced by Taylor polynomials with remainder. We assumed analyticity of the parametrization only for simplicity.

Remark 2.10. Note that at a Black Hole, we have $\mathrm{ord}(p(z_0 | z_{-n}^{-1})) = 0$ for any hidden Markov symbol sequence $z_{-n}^0$. Thus, from the discussion surrounding expressions (18) and (19) above, we see that $f_j = 0$ for all $j$. By the proof of Theorem 2.8, Formula (13) is then a Taylor polynomial with remainder; this is consistent with the Taylor series formula for a Black Hole in [8].

Remark 2.11. The proof of Theorem 2.8 shows that for $n \geq n_0$, $H_n(Z)$ and $\tilde{H}_n(Z)$ take the same form as in (13) with the same coefficients.

3 Applications to Finite-State Memoryless Channels at High Signal-to-Noise Ratio

Consider a finite-state memoryless channel with a stationary input process. Here, $C = \{C_n\}$ is an i.i.d. channel state process over a finite alphabet $\mathcal{C}$ with $p_C(c) = q_c$ for $c \in \mathcal{C}$, $X = \{X_n\}$ is a stationary input process, independent of $C$, over a finite alphabet $\mathcal{X}$, and $Z = \{Z_n\}$ is the resulting (stationary) output process over a finite alphabet $\mathcal{Z}$.
Let $p(z_n | x_n, c_n) = P(Z_n = z_n | X_n = x_n, C_n = c_n)$ denote the probability that at time $n$ the channel output symbol is $z_n$, given that the channel state is $c_n$ and the channel input is $x_n$. The mutual information for such a channel is
$$I(X, Z) := H(Z) - H(Z | X) \stackrel{(*)}{=} H(Z) - \sum_{x \in \mathcal{X},\, z \in \mathcal{Z}} p(x, z) \log p(z | x),$$
where $(*)$ follows from the memoryless property of the channel, and for $x \in \mathcal{X}, z \in \mathcal{Z}$,
$$p(x, z) = \sum_{c \in \mathcal{C}} p(z | x, c)\, p(x)\, p(c), \qquad p(z | x) = \sum_{c \in \mathcal{C}} p(z | x, c)\, p(c).$$

Now we introduce an alternative framework, using the concept of channel noise. As above, let $C$ be an i.i.d. channel state process and let $X$ be a stationary input process, independent of $C$, over finite alphabets $\mathcal{C}, \mathcal{X}$. Let $\mathcal{E}$ (resp., $\mathcal{Z}$) be a finite alphabet of abstract error events (resp., output symbols) and let $\Phi : \mathcal{X} \times \mathcal{C} \times \mathcal{E} \to \mathcal{Z}$ be a function. For each $x \in \mathcal{X}$ and $c \in \mathcal{C}$, let $p(\cdot \,|\, x, c)$ be a conditional probability distribution on $\mathcal{E}$. This defines a jointly distributed stationary process $(X, C, E)$ over $\mathcal{X} \times \mathcal{C} \times \mathcal{E}$. If $X$ is a first order Markov chain with transition probability matrix $\Pi$, then $(X, C, E)$ is a Markov chain with transition probability matrix $\Delta$ defined by
$$\Delta_{(x,c,e),(y,d,f)} = \Pi_{xy} \cdot q_d \cdot p(f | y, d),$$
and $\Phi, \Delta$ define a hidden Markov chain, denoted $Z(\Delta, \Phi)$.

We claim that the output process $Z$ described in the first paragraph of this section fits into this alternative framework (when $X$ is a first order Markov chain). To see this, let $\mathcal{E} = \mathcal{X} \times \mathcal{C} \times \mathcal{Z}$, and define $p(e = (x, c, z) | x', c') = p(z | x, c)$ if $x = x'$ and $c = c'$, and $0$ otherwise. Define $\Phi(x', c', (x, c, z)) = z$. Then $Z = Z(\Delta, \Phi)$ is a hidden Markov chain. So, from here on we adopt the alternative framework.
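The transition matrix $\Delta_{(x,c,e),(y,d,f)} = \Pi_{xy} \cdot q_d \cdot p(f|y,d)$ of the alternative framework can be assembled directly from its three factors. A sketch, with $\Pi$, $q$ and the error distribution chosen as assumed toy values:

```python
import numpy as np

def channel_delta(Pi, q, p_err):
    """Transition matrix of the chain (X, C, E).

    Pi    : input-chain transition matrix, shape (nx, nx)
    q     : i.i.d. channel-state distribution, length nc
    p_err : p_err[y, d, f] = P(error event f | input y, state d), shape (nx, nc, ne)
    States (x, c, e) are ordered lexicographically.
    """
    nx, nc, ne = p_err.shape
    states = [(x, c, e) for x in range(nx) for c in range(nc) for e in range(ne)]
    delta = np.zeros((len(states), len(states)))
    for i, (x, c, e) in enumerate(states):
        for j, (y, d, f) in enumerate(states):
            delta[i, j] = Pi[x, y] * q[d] * p_err[y, d, f]
    return delta

# Assumed example: binary input, two channel states, two error events.
Pi = np.array([[0.9, 0.1], [0.4, 0.6]])
q = np.array([0.7, 0.3])
p_err = np.array([[[0.95, 0.05], [0.8, 0.2]],
                  [[0.9, 0.1], [0.85, 0.15]]])
d = channel_delta(Pi, q, p_err)
```

Since $\Pi$, $q$ and each $p(\cdot|y,d)$ are stochastic, each row of $\Delta$ sums to one, as a transition matrix must.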
Now we assume that $X$ is an irreducible first order Markov chain and that the channel is parameterized by $\varepsilon$ such that for each $x, c$, and $e$, $p(e | x, c)(\varepsilon)$ is an analytic function of $\varepsilon \geq 0$. For each $\varepsilon \geq 0$, let $\Delta(\varepsilon)$ denote the corresponding transition probability matrix on the state set $\mathcal{X} \times \mathcal{C} \times \mathcal{E}$, and let $\{Z(\varepsilon)\}$ denote the family of resulting output hidden Markov chains. We also assume that there is a one-to-one function from $\mathcal{X}$ into $\mathcal{Z}$, $z = z(x)$, such that for all $c$, $p(z(x) | x, c)(0) = 1$. In other words, $\varepsilon$ behaves like a "composite index" indicating how good the channel is, and small $\varepsilon$ corresponds to high signal-to-noise ratio. Then one can verify that $\Delta(0)$ is a weak Black Hole and $\Delta(\varepsilon)$ is normally parameterized. Thus, by Theorem 2.8, we obtain an asymptotic formula for $H(Z(\varepsilon))$ around $\varepsilon = 0$. We remark that the above naturally generalizes to the case where $X$ is a higher order irreducible Markov chain (through appropriately grouping matrices into blocks). In the remainder of this section, we give three examples to illustrate the idea.

Example 3.1 (Binary Markov Chains Corrupted by BSC($\varepsilon$)). Consider a binary symmetric channel with crossover probability $\varepsilon$. At time $n$ the channel can be characterized by the equation
$$Z_n = X_n \oplus E_n,$$
where $\{X_n\}$ denotes the input process, $\oplus$ denotes binary addition, $\{E_n\}$ denotes the i.i.d. binary noise with $p_E(0) = 1 - \varepsilon$ and $p_E(1) = \varepsilon$, and $\{Z_n\}$ denotes the corrupted output. Note that this channel has only one channel state, and at $\varepsilon = 0$, $p_{Z|X}(1|1) = 1$, $p_{Z|X}(0|0) = 1$, so it fits the alternative framework described at the beginning of Section 3. Indeed, suppose $X$ is a first order irreducible Markov chain with transition probability matrix
$$\Pi = \begin{pmatrix} \pi_{00} & \pi_{01} \\ \pi_{10} & \pi_{11} \end{pmatrix}.$$
Then $Y = \{Y_n\} = \{(X_n, E_n)\}$ is jointly Markov with transition probability matrix (the column and row indices of the following matrix are ordered alphabetically)
$$\Delta = \begin{pmatrix} \pi_{00}(1-\varepsilon) & \pi_{00}\varepsilon & \pi_{01}(1-\varepsilon) & \pi_{01}\varepsilon \\ \pi_{00}(1-\varepsilon) & \pi_{00}\varepsilon & \pi_{01}(1-\varepsilon) & \pi_{01}\varepsilon \\ \pi_{10}(1-\varepsilon) & \pi_{10}\varepsilon & \pi_{11}(1-\varepsilon) & \pi_{11}\varepsilon \\ \pi_{10}(1-\varepsilon) & \pi_{10}\varepsilon & \pi_{11}(1-\varepsilon) & \pi_{11}\varepsilon \end{pmatrix},$$
and $Z = \Phi(Y)$ is a hidden Markov chain with $\Phi(0,0) = \Phi(1,1) = 0$, $\Phi(0,1) = \Phi(1,0) = 1$. When $\varepsilon = 0$,
$$\Delta = \begin{pmatrix} \pi_{00} & 0 & \pi_{01} & 0 \\ \pi_{00} & 0 & \pi_{01} & 0 \\ \pi_{10} & 0 & \pi_{11} & 0 \\ \pi_{10} & 0 & \pi_{11} & 0 \end{pmatrix}, \quad \Delta_0 = \begin{pmatrix} \pi_{00} & 0 & 0 & 0 \\ \pi_{00} & 0 & 0 & 0 \\ \pi_{10} & 0 & 0 & 0 \\ \pi_{10} & 0 & 0 & 0 \end{pmatrix}, \quad \Delta_1 = \begin{pmatrix} 0 & 0 & \pi_{01} & 0 \\ 0 & 0 & \pi_{01} & 0 \\ 0 & 0 & \pi_{11} & 0 \\ 0 & 0 & \pi_{11} & 0 \end{pmatrix},$$
thus both $\Delta_0$ and $\Delta_1$ have rank one. If the $\pi_{ij}$'s are all positive, then we have a Black Hole case, for which one can derive the Taylor series expansion of $H(Z)$ around $\varepsilon = 0$ [20, 8]; if $\pi_{00}$ or $\pi_{11}$ is zero, then this is a weak Black Hole case with a normal parameterization (by $\varepsilon$), for which Theorem 2.8 can be applied and an asymptotic formula for $H(Z)$ around $\varepsilon = 0$ can be derived.

For a first order Markov chain $X$ with transition probability matrix
$$\begin{pmatrix} 1-p & p \\ 1 & 0 \end{pmatrix},$$
where $0 \leq p \leq 1$, it has been shown [16] that
$$H(Z) = H(X) - \frac{p(2-p)}{1+p}\, \varepsilon \log \varepsilon + O(\varepsilon)$$
as $\varepsilon \to 0$. This result has been further generalized [9, 13] to the following formula:
$$H(Z) = H(X) + f(X)\, \varepsilon \log(1/\varepsilon) + g(X)\, \varepsilon + O(\varepsilon^2 \log \varepsilon), \qquad (20)$$
where $X$ is an input Markov chain of any order $m$ with transition probabilities $P(X_t = a_0 | X_{t-m}^{t-1} = a_{-m}^{-1})$, $a_{-m}^0 \in \mathcal{X}^{m+1}$, where $\mathcal{X} = \{0, 1\}$, $Z$ is the output process obtained by passing $X$ through a BSC($\varepsilon$), and $f(X)$ and $g(X)$ can be explicitly computed. Theorem 2.8 can be used to generalize (20) to a formula with higher asymptotic terms.
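The leading asymptotics from [16] can be checked numerically against the Birch upper bound $H_n(Z) = H(Z_{-n}^0) - H(Z_{-n}^{-1})$, computed by brute-force enumeration. This sketch is not from the paper; the values of $p$, $\varepsilon$ and $n$ are illustrative, and the agreement is only up to the $O(\varepsilon)$ term the asymptotic formula omits.

```python
import numpy as np
from itertools import product

def bsc_delta(pi_mat, eps):
    """4x4 transition matrix of Y_n = (X_n, E_n), states (0,0),(0,1),(1,0),(1,1)."""
    noise = np.array([1 - eps, eps])
    rows = [np.kron(pi_mat[x], noise) for x in (0, 1)]
    return np.array([rows[0], rows[0], rows[1], rows[1]])

def block_entropy(delta, phi, pi, k):
    """H(Z_1^k) in nats by enumerating all 2^k output strings (small k only)."""
    d = {a: delta * (np.asarray(phi)[None, :] == a) for a in (0, 1)}
    h = 0.0
    for z in product((0, 1), repeat=k):
        prob = pi.copy()
        for a in z:
            prob = prob @ d[a]          # p(z) = pi Delta_{z_1} ... Delta_{z_k} 1
        p = prob.sum()
        if p > 0:
            h -= p * np.log(p)
    return h

p_, eps, n = 0.5, 1e-3, 9
pi_mat = np.array([[1 - p_, p_], [1.0, 0.0]])
phi = [0, 1, 1, 0]                               # Phi(x, e) = x XOR e
pi_x = np.array([2 / 3, 1 / 3])                  # stationary vector of pi_mat for p = 0.5
pi = np.kron(pi_x, np.array([1 - eps, eps]))     # stationary vector of bsc_delta
delta = bsc_delta(pi_mat, eps)
h_n = block_entropy(delta, phi, pi, n + 1) - block_entropy(delta, phi, pi, n)
h_x = (2 / 3) * np.log(2)                        # entropy rate of X
approx = h_x - p_ * (2 - p_) / (1 + p_) * eps * np.log(eps)
```

For $\varepsilon = 10^{-3}$, $h_n$ and the two-term asymptotic agree to within a few multiples of $\varepsilon$, as the $O(\varepsilon)$ remainder predicts.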
In particular, when $P(X_t = a_0 | X_{t-m}^{t-1} = a_{-m}^{-1}) > 0$ for all $a_{-m}^0 \in \mathcal{X}^{m+1}$, we have a Black Hole, in which case the Taylor series expansion of $H(Z)$ around $\varepsilon = 0$ can be explicitly computed (in principle); when $P(X_t = a_0 | X_{t-m}^{t-1} = a_{-m}^{-1}) = 0$ for some $a_{-m}^0 \in \mathcal{X}^{m+1}$, we have a weak Black Hole, in which case an asymptotic formula for $H(Z)$ around $\varepsilon = 0$ can be obtained.

Example 3.2 (Binary Markov Chains Corrupted by BEC($\varepsilon$)). Consider a binary erasure channel with fixed erasure rate $\varepsilon$ (denoted by BEC($\varepsilon$)). At time $n$ the channel can be characterized by the equation
$$Z_n = \begin{cases} X_n & \text{if } E_n = 0, \\ e & \text{if } E_n = 1, \end{cases}$$
where $\{X_n\}$ denotes the input process, $e$ denotes the erasure, $\{E_n\}$ denotes the i.i.d. binary noise with $p_E(0) = 1 - \varepsilon$ and $p_E(1) = \varepsilon$, and $\{Z_n\}$ denotes the corrupted output. Again this channel has only one channel state, and at $\varepsilon = 0$, $p_{Z|X}(1|1) = 1$, $p_{Z|X}(0|0) = 1$, so it fits the alternative framework described at the beginning of Section 3. Suppose the input $X$ is a first order irreducible Markov chain with transition probability matrix
$$\Pi = \begin{pmatrix} \pi_{00} & \pi_{01} \\ \pi_{10} & \pi_{11} \end{pmatrix},$$
and let $Z$ denote the output process. Then $Y = (X, E)$ is jointly Markov with (the column and row indices of the following matrix are ordered alphabetically)
$$\Delta = \begin{pmatrix} \pi_{00}(1-\varepsilon) & \pi_{00}\varepsilon & \pi_{01}(1-\varepsilon) & \pi_{01}\varepsilon \\ \pi_{00}(1-\varepsilon) & \pi_{00}\varepsilon & \pi_{01}(1-\varepsilon) & \pi_{01}\varepsilon \\ \pi_{10}(1-\varepsilon) & \pi_{10}\varepsilon & \pi_{11}(1-\varepsilon) & \pi_{11}\varepsilon \\ \pi_{10}(1-\varepsilon) & \pi_{10}\varepsilon & \pi_{11}(1-\varepsilon) & \pi_{11}\varepsilon \end{pmatrix},$$
and $Z = \Phi(Y)$ is hidden Markov with $\Phi(0,1) = \Phi(1,1) = e$, $\Phi(0,0) = 0$ and $\Phi(1,0) = 1$. Now one checks that
$$\Delta_0 = \begin{pmatrix} \pi_{00}(1-\varepsilon) & 0 & 0 & 0 \\ \pi_{00}(1-\varepsilon) & 0 & 0 & 0 \\ \pi_{10}(1-\varepsilon) & 0 & 0 & 0 \\ \pi_{10}(1-\varepsilon) & 0 & 0 & 0 \end{pmatrix}, \quad \Delta_1 = \begin{pmatrix} 0 & 0 & \pi_{01}(1-\varepsilon) & 0 \\ 0 & 0 & \pi_{01}(1-\varepsilon) & 0 \\ 0 & 0 & \pi_{11}(1-\varepsilon) & 0 \\ 0 & 0 & \pi_{11}(1-\varepsilon) & 0 \end{pmatrix}, \quad \Delta_e = \begin{pmatrix} 0 & \pi_{00}\varepsilon & 0 & \pi_{01}\varepsilon \\ 0 & \pi_{00}\varepsilon & 0 & \pi_{01}\varepsilon \\ 0 & \pi_{10}\varepsilon & 0 & \pi_{11}\varepsilon \\ 0 & \pi_{10}\varepsilon & 0 & \pi_{11}\varepsilon \end{pmatrix}.$$
One checks that $\Delta(\varepsilon)$ is normally parameterized by $\varepsilon$, and thus Theorem 2.8 can be applied.
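The structure of Example 3.2 can be verified numerically: at $\varepsilon = 0$ the matrix $\Delta_e$ vanishes while $\Delta_0$ and $\Delta_1$ are rank one (a weak Black Hole, not a Black Hole), whereas for $\varepsilon > 0$ the matrix $\Delta_e$ generally has rank two. The matrix $\Pi$ below is an assumed example.

```python
import numpy as np

def bec_delta(pi_mat, eps):
    """Transition matrix of (X, E) for BEC(eps); states (x, e) ordered
    (0,0),(0,1),(1,0),(1,1), with e = 1 meaning erasure."""
    noise = np.array([1 - eps, eps])
    rows = [np.kron(pi_mat[x], noise) for x in (0, 1)]
    return np.array([rows[0], rows[0], rows[1], rows[1]])

PHI = np.array([0, 2, 1, 2])   # Phi: (0,0)->0, (1,0)->1, (x,1)->erasure (coded as 2)

def parts(eps, pi_mat):
    """The matrices Delta_0, Delta_1, Delta_e (keys 0, 1, 2)."""
    d = bec_delta(pi_mat, eps)
    return {a: d * (PHI[None, :] == a) for a in (0, 1, 2)}

pi_mat = np.array([[0.8, 0.2], [0.3, 0.7]])   # assumed Pi
at0 = parts(0.0, pi_mat)
```

This illustrates why the weak Black Hole of Definition 2.2 (allowing all-zero $\Delta_a$) is needed for the erasure channel.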
Furthermore, Theorem 2.8 can be applied to the case when the input is an $m$-th order irreducible Markov chain $X$ to obtain an asymptotic formula for $H(Z)$ around $\varepsilon = 0$.

Example 3.3. [Binary Markov Chains Corrupted by a Special Gilbert-Elliott Channel] Consider a binary Gilbert-Elliott channel, whose channel state (denoted by $C = \{C_n\}$) varies as an i.i.d. binary stochastic process with $p_C(0) = q_0$, $p_C(1) = q_1$ (here the channel state varies as an i.i.d. process, rather than a generic Markov process). At time $n$ the channel can be characterized by the following equation:
$$
Z_n = X_n \oplus E_n,
$$
where $\{X_n\}$ denotes the input process, $\oplus$ denotes binary addition, $\{E_n\}$ denotes the binary noise with $p_{E|C}(0|0) = 1-\varepsilon_0$, $p_{E|C}(0|1) = 1-\varepsilon_1$, $p_{E|C}(1|0) = \varepsilon_0$, $p_{E|C}(1|1) = \varepsilon_1$, and $\{Z_n\}$ denotes the corrupted output. For such a channel, $p_{Z|(X,C)}(1|1,c) = 1$ and $p_{Z|(X,C)}(0|0,c) = 1$ at $\varepsilon_0 = \varepsilon_1 = 0$ for any channel state $c$, so it fits in the alternative framework described at the beginning of Section 3. To see this in more detail, we consider the special case when the input $X$ is a first-order irreducible Markov chain with transition probability matrix
$$
\Pi = \begin{bmatrix}
\pi_{00} & \pi_{01} \\
\pi_{10} & \pi_{11}
\end{bmatrix},
$$
and let $Z$ denote the output process.
Then $Y = (X, C, E)$ is jointly Markov with (the column and row indices of the following matrix are ordered alphabetically)
$$
\Delta = \begin{bmatrix}
\pi_{00}q_0(1-\varepsilon_0) & \pi_{00}q_0\varepsilon_0 & \pi_{00}q_1(1-\varepsilon_1) & \pi_{00}q_1\varepsilon_1 & \pi_{01}q_0(1-\varepsilon_0) & \pi_{01}q_0\varepsilon_0 & \pi_{01}q_1(1-\varepsilon_1) & \pi_{01}q_1\varepsilon_1 \\
\pi_{00}q_0(1-\varepsilon_0) & \pi_{00}q_0\varepsilon_0 & \pi_{00}q_1(1-\varepsilon_1) & \pi_{00}q_1\varepsilon_1 & \pi_{01}q_0(1-\varepsilon_0) & \pi_{01}q_0\varepsilon_0 & \pi_{01}q_1(1-\varepsilon_1) & \pi_{01}q_1\varepsilon_1 \\
\pi_{00}q_0(1-\varepsilon_0) & \pi_{00}q_0\varepsilon_0 & \pi_{00}q_1(1-\varepsilon_1) & \pi_{00}q_1\varepsilon_1 & \pi_{01}q_0(1-\varepsilon_0) & \pi_{01}q_0\varepsilon_0 & \pi_{01}q_1(1-\varepsilon_1) & \pi_{01}q_1\varepsilon_1 \\
\pi_{00}q_0(1-\varepsilon_0) & \pi_{00}q_0\varepsilon_0 & \pi_{00}q_1(1-\varepsilon_1) & \pi_{00}q_1\varepsilon_1 & \pi_{01}q_0(1-\varepsilon_0) & \pi_{01}q_0\varepsilon_0 & \pi_{01}q_1(1-\varepsilon_1) & \pi_{01}q_1\varepsilon_1 \\
\pi_{10}q_0(1-\varepsilon_0) & \pi_{10}q_0\varepsilon_0 & \pi_{10}q_1(1-\varepsilon_1) & \pi_{10}q_1\varepsilon_1 & \pi_{11}q_0(1-\varepsilon_0) & \pi_{11}q_0\varepsilon_0 & \pi_{11}q_1(1-\varepsilon_1) & \pi_{11}q_1\varepsilon_1 \\
\pi_{10}q_0(1-\varepsilon_0) & \pi_{10}q_0\varepsilon_0 & \pi_{10}q_1(1-\varepsilon_1) & \pi_{10}q_1\varepsilon_1 & \pi_{11}q_0(1-\varepsilon_0) & \pi_{11}q_0\varepsilon_0 & \pi_{11}q_1(1-\varepsilon_1) & \pi_{11}q_1\varepsilon_1 \\
\pi_{10}q_0(1-\varepsilon_0) & \pi_{10}q_0\varepsilon_0 & \pi_{10}q_1(1-\varepsilon_1) & \pi_{10}q_1\varepsilon_1 & \pi_{11}q_0(1-\varepsilon_0) & \pi_{11}q_0\varepsilon_0 & \pi_{11}q_1(1-\varepsilon_1) & \pi_{11}q_1\varepsilon_1 \\
\pi_{10}q_0(1-\varepsilon_0) & \pi_{10}q_0\varepsilon_0 & \pi_{10}q_1(1-\varepsilon_1) & \pi_{10}q_1\varepsilon_1 & \pi_{11}q_0(1-\varepsilon_0) & \pi_{11}q_0\varepsilon_0 & \pi_{11}q_1(1-\varepsilon_1) & \pi_{11}q_1\varepsilon_1
\end{bmatrix},
$$
and $Z = \Phi(X, C, E)$ is hidden Markov with
$$
\Phi(0,0,0) = \Phi(0,1,0) = \Phi(1,0,1) = \Phi(1,1,1) = 0, \qquad
\Phi(0,0,1) = \Phi(0,1,1) = \Phi(1,0,0) = \Phi(1,1,0) = 1.
$$
For some positive $k$, let $\varepsilon_0 = \varepsilon$, $\varepsilon_1 = k\varepsilon$. At $\varepsilon = 0$, one checks that
$$
\Delta_0 = \begin{bmatrix}
\pi_{00}q_0 & 0 & \pi_{00}q_1 & 0 & 0 & 0 & 0 & 0 \\
\pi_{00}q_0 & 0 & \pi_{00}q_1 & 0 & 0 & 0 & 0 & 0 \\
\pi_{00}q_0 & 0 & \pi_{00}q_1 & 0 & 0 & 0 & 0 & 0 \\
\pi_{00}q_0 & 0 & \pi_{00}q_1 & 0 & 0 & 0 & 0 & 0 \\
\pi_{10}q_0 & 0 & \pi_{10}q_1 & 0 & 0 & 0 & 0 & 0 \\
\pi_{10}q_0 & 0 & \pi_{10}q_1 & 0 & 0 & 0 & 0 & 0 \\
\pi_{10}q_0 & 0 & \pi_{10}q_1 & 0 & 0 & 0 & 0 & 0 \\
\pi_{10}q_0 & 0 & \pi_{10}q_1 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}, \qquad
\Delta_1 = \begin{bmatrix}
0 & 0 & 0 & 0 & \pi_{01}q_0 & 0 & \pi_{01}q_1 & 0 \\
0 & 0 & 0 & 0 & \pi_{01}q_0 & 0 & \pi_{01}q_1 & 0 \\
0 & 0 & 0 & 0 & \pi_{01}q_0 & 0 & \pi_{01}q_1 & 0 \\
0 & 0 & 0 & 0 & \pi_{01}q_0 & 0 & \pi_{01}q_1 & 0 \\
0 & 0 & 0 & 0 & \pi_{11}q_0 & 0 & \pi_{11}q_1 & 0 \\
0 & 0 & 0 & 0 & \pi_{11}q_0 & 0 & \pi_{11}q_1 & 0 \\
0 & 0 & 0 & 0 & \pi_{11}q_0 & 0 & \pi_{11}q_1 & 0 \\
0 & 0 & 0 & 0 & \pi_{11}q_0 & 0 & \pi_{11}q_1 & 0
\end{bmatrix}.
$$
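The same rank check can be sketched numerically for the Gilbert-Elliott setting (the values of $\Pi$, $q_0$, $q_1$, and $k$ below are arbitrary illustrative choices): at $\varepsilon = 0$ the two nonzero columns of $\Delta_0$ are proportional (they differ by the scalar factor $q_1/q_0$), and likewise for $\Delta_1$, so both matrices have rank one.

```python
import numpy as np

# Joint chain Y = (X, C, E), 8 states ordered alphabetically in (x, c, e);
# the row of Delta for state (x, c, e) depends only on x:
#   Delta[(x,c,e),(x',c',e')] = Pi[x,x'] * q_{c'} * p_{E|C}(e' | c').
def ge_hmm_matrices(Pi, q0, q1, eps0, eps1):
    row = np.array([q0 * (1 - eps0), q0 * eps0, q1 * (1 - eps1), q1 * eps1])
    Delta = np.kron(Pi, np.tile(row, (4, 1)))
    labels = np.array([0, 1, 0, 1, 1, 0, 1, 0])  # Phi(x, c, e) = x xor e
    return Delta, {z: Delta * (labels == z) for z in (0, 1)}

# arbitrary illustrative parameters
Pi = np.array([[0.7, 0.3], [0.2, 0.8]])
q0, q1, k = 0.6, 0.4, 2.0

eps = 0.0  # eps0 = eps and eps1 = k * eps both vanish here
Delta, D = ge_hmm_matrices(Pi, q0, q1, eps, k * eps)
print(int(np.linalg.matrix_rank(D[0])), int(np.linalg.matrix_rank(D[1])))
```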
So both $\Delta_0$ and $\Delta_1$ are rank-one matrices, and one can check that $\Delta(\varepsilon)$ is normally parameterized by $\varepsilon$. Again, Theorem 2.8 can be applied to the case when the input is an $m$-th order irreducible Markov chain $X$ to obtain an asymptotic formula for $H(Z)$ around $\varepsilon = 0$.

References

[1] D. Arnold and H.-A. Loeliger. The information rate of binary-input channels with memory. Proc. 2001 IEEE Int. Conf. on Communications, Helsinki, Finland, pp. 2692-2695, June 2001.

[2] D. M. Arnold, H.-A. Loeliger, P. O. Vontobel, A. Kavcic, and W. Zeng. Simulation-based computation of information rates for channels with memory. IEEE Trans. Information Theory, 52:3498-3508, 2006.

[3] J. Birch. Approximations for the entropy for functions of Markov chains. Ann. Math. Statist., 33:930-938, 1962.

[4] D. Blackwell. The entropy of functions of finite-state Markov chains. Trans. First Prague Conf. Information Theory, Statistical Decision Functions, Random Processes, pages 13-20, 1957.

[5] S. Egner, V. Balakirsky, L. Tolhuizen, S. Baggen, and H. Hollmann. On the entropy rate of a hidden Markov model. Proceedings of the 2004 IEEE International Symposium on Information Theory, page 12, Chicago, 2004.

[6] R. Gharavi and V. Anantharam. An upper bound for the largest Lyapunov exponent of a Markovian product of nonnegative matrices. Theoretical Computer Science, 332(1-3):543-557, February 2005.

[7] G. Han and B. Marcus. Analyticity of entropy rate of hidden Markov chains. IEEE Transactions on Information Theory, 52(12):5251-5266, December 2006.

[8] G. Han and B. Marcus. Derivatives of entropy rate in special families of hidden Markov chains. IEEE Transactions on Information Theory, 53(7):2642-2652, July 2007.

[9] G. Han and B. Marcus. Asymptotics of noisy constrained capacity. Proc. ISIT 2007, Nice, pages 991-995, June 2007. A journal version of this paper (with a slightly different title) is also available.

[10] T. Holliday, A. Goldsmith, and P. Glynn. Capacity of finite state Markov channels with general inputs. Proceedings of the 2003 IEEE International Symposium on Information Theory, page 289, 2003.

[11] T. Holliday, A. Goldsmith, and P. Glynn. Capacity of finite state channels based on Lyapunov exponents of random matrices. IEEE Transactions on Information Theory, 52(8):3509-3532, August 2006.

[12] P. Jacquet, G. Seroussi, and W. Szpankowski. On the entropy of a hidden Markov process (extended abstract). Data Compression Conference, pages 362-371, Snowbird, 2004.

[13] P. Jacquet, G. Seroussi, and W. Szpankowski. Noisy constrained capacity. International Symposium on Information Theory, pages 986-990, Nice, 2007.

[14] D. Lind and B. Marcus. An Introduction to Symbolic Dynamics and Coding. Cambridge University Press, 1995.

[15] E. Ordentlich and T. Weissman. On the optimality of symbol-by-symbol filtering and denoising. IEEE Transactions on Information Theory, 52(1):19-40, January 2006.

[16] E. Ordentlich and T. Weissman. New bounds on the entropy rate of hidden Markov processes. IEEE Information Theory Workshop, San Antonio, Texas, pages 117-122, October 2004.

[17] H. Pfister, J. Soriaga, and P. Siegel. The achievable information rates of finite-state ISI channels. Proc. IEEE GLOBECOM, San Antonio, TX, pp. 2992-2996, November 2001.

[18] E. Seneta. Non-negative Matrices and Markov Chains. Springer Series in Statistics, Springer-Verlag, New York, 1980.

[19] V. Sharma and S. Singh. Entropy and channel capacity in the regenerative setup with applications to Markov channels. Proc. IEEE Intern. Symp. on Inform. Theory, Washington, D.C., page 283, June 2001.

[20] O. Zuk, I. Kanter, and E. Domany. The entropy of a binary hidden Markov process. J. Stat. Phys., 121(3-4):343-360, 2005.

[21] O. Zuk, E. Domany, I. Kanter, and M. Aizenman. From finite-system entropy to entropy rate for a hidden Markov process. IEEE Signal Processing Letters, 13(9):517-520, September 2006.