Joint Source-Channel Coding Revisited: Information-Spectrum Approach



Te Sun HAN†‡

October 28, 2018

† This paper is an extended refinement of a part of Chapter 3 in the book Han [11].

‡ Te Sun Han was with the Graduate School of Information Systems, University of Electro-Communications, Chofugaoka 1-5-1, Chofu, Tokyo 182-8585, Japan. He is now visiting the Department of Computer Science, Faculty of Science and Engineering, Waseda University, Room 902, Bldg. 201 (Shinjuku Lamdax Building), Ohkubo 2-4-12, Shinjuku-ku, Tokyo 169-0072, Japan. E-mail: han@is.uec.ac.jp, han@aoni.waseda.jp

Abstract: Given a general source $V = \{V^n\}_{n=1}^{\infty}$ with countably infinite source alphabet and a general channel $W = \{W^n\}_{n=1}^{\infty}$ with arbitrary abstract channel input/output alphabets, we study the joint source-channel coding problem from the information-spectrum point of view. First, we generalize Feinstein's lemma (direct part) and Verdú-Han's lemma (converse part) so as to be applicable to the general joint source-channel coding problem. Based on these lemmas, we establish a sufficient condition as well as a necessary condition for the source $V$ to be reliably transmissible over the channel $W$ with asymptotically vanishing probability of error. It is shown that our sufficient condition is equivalent to the sufficient condition derived by Vembu, Verdú and Steinberg [9], whereas our necessary condition is shown to be stronger than or equivalent to the necessary condition derived by them. It turns out, as a direct consequence, that the "separation principle" in a relevantly generalized sense holds for a wide class of sources and channels, as was shown in a quite different manner by Vembu, Verdú and Steinberg [9].
It should also be remarked that a nice duality is found between our necessary and sufficient conditions, whereas we cannot fully enjoy such a duality between the necessary condition and the sufficient condition by Vembu, Verdú and Steinberg [9]. In addition, we demonstrate a sufficient condition as well as a necessary condition for the $\varepsilon$-transmissibility ($0 \le \varepsilon < 1$). Finally, the separation theorem of the traditional standard form is shown to hold for the class of sources and channels that satisfy the semi-strong converse property.

Index terms: general source, general channel, joint source-channel coding, separation theorem, information-spectrum, transmissibility, generalized Feinstein's lemma, generalized Verdú-Han's lemma

1 Introduction

Given a source $V = \{V^n\}_{n=1}^{\infty}$ and a channel $W = \{W^n\}_{n=1}^{\infty}$, joint source-channel coding means that the encoder maps the output from the source directly to the channel input (one-step encoding), where the probability of decoding error is required to vanish as the block length $n$ tends to $\infty$. In usual situations, however, joint source-channel coding can be decomposed into separate source coding and channel coding (two-step encoding). This two-step encoding causes no disadvantage from the standpoint of asymptotically vanishing error probabilities, provided that the so-called Separation Theorem holds.

Typically, the traditional separation theorem, which we call the separation theorem in the narrow sense, states that if the infimum $R_f(V)$ of all achievable fixed-length coding rates for the source $V$ is smaller than the capacity $C(W)$ of the channel $W$, then the source $V$ is reliably transmissible by two-step encoding over the channel $W$; whereas if $R_f(V)$ is larger than $C(W)$, then reliable transmission is impossible.
While the former statement is always true for any general source $V$ and any general channel $W$, the latter statement is not always true. A very natural question, then, is for what class of sources and channels, and in what sense, the separation theorem holds in general.

Shannon [1] first showed that the separation theorem holds for the class of stationary memoryless sources and channels. Since then, this theorem has received extensive attention from a number of researchers who have attempted to prove versions that apply to more and more general classes of sources and channels. Among others, Dobrushin [4], Pinsker [5], and Hu [6] studied the separation theorem problem in the framework of information-stable sources and channels. Recently, on the other hand, Vembu, Verdú and Steinberg [9] have put forth this problem in a much more general information-spectrum context with a general source $V$ and a general channel $W$. From the viewpoint of information spectra, they generalized the notion of the separation theorem and showed that, in many cases, it is possible to reliably transmit the output of the source $V$ over the channel $W$ even with $R_f(V) > C(W)$. Furthermore, in terms of information spectra, they established a sufficient condition as well as a necessary condition for transmissibility.

It should be noticed here that, in this general joint source-channel coding situation, what indeed matters is not the validity problem of the traditional type of separation theorems but the derivation problem of necessary and/or sufficient conditions for transmissibility from the information-spectrum point of view. However, while their sufficient condition looks simple and significantly tight, their necessary condition does not look quite close to tight.
The present paper was mainly motivated by the reasonable question of why the forms of these two conditions look rather different from one another. First, in Section 3, the basic tools to answer this question are established, i.e., two fundamental lemmas: a generalization of Feinstein's lemma [2] and a generalization of Verdú-Han's lemma [8], which provide the very basis for the key results to be stated in the subsequent sections. These lemmas are of dualistic information-spectrum forms, which is in nice accordance with the general joint source-channel coding framework.

In Section 4, given a general source $V$ and a general channel $W$, we establish, in terms of information spectra, a sufficient condition (Direct theorem) for the transmissibility as well as a necessary condition (Converse theorem). The forms of these two conditions are very close to each other, and "fairly" coincide with one another, provided that we dare disregard some relevant asymptotically vanishing term.

Next, we equivalently rewrite these conditions in forms useful for seeing their relation to the separation theorem. As a consequence, it turns out that a separation-theorem-like equivalent of our sufficient condition coincides with the sufficient condition given by Vembu, Verdú and Steinberg [9], whereas a separation-theorem-like equivalent of our necessary condition is shown to be strictly stronger than or equivalent to the necessary condition given by them. Here it is pleasing to observe that a nice duality is found between our necessary and sufficient conditions, whereas we cannot fully enjoy such a duality between the necessary condition and the sufficient condition by Vembu, Verdú and Steinberg [9].
On the other hand, in Section 5, we demonstrate a sufficient condition as well as a necessary condition for the $\varepsilon$-transmissibility, which generalizes the sufficient and necessary conditions shown in Section 4. Finally, in Section 6, we restrict the class of sources and channels to those that satisfy the strong converse property (or, more generally, the semi-strong converse property) to show that the separation theorem in the traditional sense holds for this class.

2 Basic Notation and Definitions

In this preliminary section, we prepare the basic notation and definitions which will be used in the subsequent sections.

2.1 General Sources

Let us first give the formal definition of a general source. A general source is defined as an infinite sequence $V = \{V^n = (V_1^{(n)}, \cdots, V_n^{(n)})\}_{n=1}^{\infty}$ of $n$-dimensional random variables $V^n$, where each component random variable $V_i^{(n)}$ ($1 \le i \le n$) takes values in a countably infinite set $\mathcal{V}$ that we call the source alphabet. It should be noted here that each component of $V^n$ may change depending on the block length $n$. This implies that the sequence $V$ is quite general in the sense that it may not even satisfy the consistency condition of usual processes, where the consistency condition means that for any integers $m, n$ such that $m < n$ it holds that $V_i^{(m)} \equiv V_i^{(n)}$ for all $i = 1, 2, \cdots, m$. The class of sources thus defined covers a very wide range of sources including all nonstationary and/or nonergodic sources (cf. Han and Verdú [7]).

2.2 General Channels

The formal definition of a general channel is as follows. Let $\mathcal{X}$, $\mathcal{Y}$ be arbitrary abstract (not necessarily countable) sets, which we call the input alphabet and the output alphabet, respectively.
A general channel is defined as an infinite sequence $W = \{W^n : \mathcal{X}^n \to \mathcal{Y}^n\}_{n=1}^{\infty}$ of $n$-dimensional probability transition matrices $W^n$, where $W^n(y|x)$ ($x \in \mathcal{X}^n$, $y \in \mathcal{Y}^n$) denotes the conditional probability of $y$ given $x$.* The class of channels thus defined covers a very wide range of channels including all nonstationary and/or nonergodic channels with arbitrary memory structures (cf. Han and Verdú [7]).

* In the case where the output alphabet $\mathcal{Y}$ is abstract, $W^n(y|x)$ is understood to be the (conditional) probability measure element $W^n(dy|x)$ that is measurable in $x$.

Remark 2.1 A more reasonable definition of a general source is the following. Let $\{\mathcal{V}_n\}_{n=1}^{\infty}$ be any sequence of arbitrary source alphabets $\mathcal{V}_n$ (countably infinite or abstract sets) and let $V_n$ be any random variable taking values in $\mathcal{V}_n$ ($n = 1, 2, \cdots$). Then, the sequence $V = \{V_n\}_{n=1}^{\infty}$ of random variables $V_n$ is called a general source (cf. Verdú and Han [10]). The above definition is a special case of this general source with $\mathcal{V}_n = \mathcal{V}^n$ ($n = 1, 2, \cdots$). On the other hand, a more reasonable definition of the general channel is the following. Let $\{W_n : \mathcal{X}_n \to \mathcal{Y}_n\}_{n=1}^{\infty}$ be any sequence of arbitrary probability transition matrices, where $\mathcal{X}_n$, $\mathcal{Y}_n$ are arbitrary abstract sets. Then, the sequence $W = \{W_n\}_{n=1}^{\infty}$ of probability transition matrices $W_n$ is called a general channel (cf. Han [11]). The above definition is a special case of this general channel with $\mathcal{X}_n = \mathcal{X}^n$, $\mathcal{Y}_n = \mathcal{Y}^n$ ($n = 1, 2, \cdots$). The results in this paper (Lemma 3.1, Lemma 3.2, Theorem 4.1, Theorem 4.2, Theorem 4.3, Theorem 4.4, Theorem 5.1, Theorem 5.2 and Theorems 6.1-6.7) continue to be valid in this more general setting with $\mathcal{V}^n, V^n, V$ and $\mathcal{X}^n, \mathcal{Y}^n, W^n, W$ replaced by $\mathcal{V}_n, V_n, V$ and $\mathcal{X}_n, \mathcal{Y}_n, W_n, W$, respectively.
In the sequel we use the convention that $P_Z(\cdot)$ denotes the probability distribution of a random variable $Z$, whereas $P_{Z|U}(\cdot|\cdot)$ denotes the conditional probability distribution of a random variable $Z$ given a random variable $U$. ✷

2.3 Joint Source-Channel Coding

Let $V = \{V^n = (V_1^{(n)}, \cdots, V_n^{(n)})\}_{n=1}^{\infty}$ be any general source, and let $W = \{W^n(\cdot|\cdot) : \mathcal{X}^n \to \mathcal{Y}^n\}_{n=1}^{\infty}$ be any general channel. We consider an encoder $\varphi_n : \mathcal{V}^n \to \mathcal{X}^n$ and a decoder $\psi_n : \mathcal{Y}^n \to \mathcal{V}^n$, and put $X^n = \varphi_n(V^n)$. Then, denoting by $Y^n$ the output from the channel $W^n$ due to the input $X^n$, we have the obvious relation:

$$V^n \to X^n \to Y^n \quad \text{(a Markov chain)}. \tag{2.1}$$

The error probability $\varepsilon_n$ with code $(\varphi_n, \psi_n)$ is defined by

$$\varepsilon_n \equiv \Pr\{V^n \neq \psi_n(Y^n)\} = \sum_{v \in \mathcal{V}^n} P_{V^n}(v) W^n(D^c(v) \,|\, \varphi_n(v)), \tag{2.2}$$

where $D(v) \equiv \{y \in \mathcal{Y}^n \mid \psi_n(y) = v\}$ ($\forall v \in \mathcal{V}^n$) ($D(v)$ is called the decoding set for $v$) and "$c$" denotes the complement of a set. A pair $(\varphi_n, \psi_n)$ with error probability $\varepsilon_n$ is simply called a joint source-channel code $(n, \varepsilon_n)$. We now define the transmissibility in terms of joint source-channel codes $(n, \varepsilon_n)$ as follows.

Definition 2.1 Source $V$ is said to be transmissible over channel $W$ if there exists an $(n, \varepsilon_n)$ code such that $\lim_{n \to \infty} \varepsilon_n = 0$.

With this definition of transmissibility, in the following sections we shall establish a sufficient condition as well as a necessary condition for the transmissibility when we are given a general source $V$ and a general channel $W$. These two conditions are very close to each other and could actually be seen as giving "almost the same condition," provided that we dare disregard an asymptotically negligible term $\gamma_n \to 0$ appearing in those conditions (cf. Section 4).
Remark 2.2 The quantity $\varepsilon_n$ defined by (2.2) is more specifically called the average error probability, because it is averaged with respect to $P_{V^n}(v)$ over all source outputs $v \in \mathcal{V}^n$. On the other hand, we may define another kind of error probability by

$$\varepsilon_n \equiv \sup_{v : P_{V^n}(v) > 0} W^n(D^c(v) \,|\, \varphi_n(v)), \tag{2.3}$$

which we call the maximum error probability. It is evident that transmissibility in the maximum sense implies transmissibility in the average sense. However, the converse is not necessarily true. To see this, it suffices to consider the following simple example. Let the source, channel input, and channel output alphabets be $\mathcal{V}^n = \{0, 1, 2\}$, $\mathcal{X}^n = \{1, 2\}$, $\mathcal{Y}^n = \{1, 2\}$, respectively; and let the (deterministic) channel $W^n : \mathcal{X}^n \to \mathcal{Y}^n$ be defined by $W^n(j|i) = 1$ for $i = j$. Moreover, let the source $V^n$ have probability distribution $P_{V^n}(0) = \alpha_n$, $P_{V^n}(1) = P_{V^n}(2) = \frac{1 - \alpha_n}{2}$ ($\alpha_n \to 0$ as $n \to \infty$). One of the best choices of encoder-decoder pairs ($\varphi_n : \mathcal{V}^n \to \mathcal{X}^n$, $\psi_n : \mathcal{Y}^n \to \mathcal{V}^n$), either in the average sense or in the maximum sense, is such that $\varphi_n(i) = i$ for $i = 1, 2$; $\varphi_n(0) = 1$; $\psi_n(i) = i$ for $i = 1, 2$. Then, the average error probability is $\varepsilon_n^a = \alpha_n \to 0$, while the maximum error probability is $\varepsilon_n^m = 1$. Thus, in this case, the source $V^n$ is transmissible in the average sense over the channel $W^n$, while it is not transmissible in the maximum sense. Hereafter, the probability $\varepsilon_n$ is understood to denote the "average" error probability, unless otherwise stated. ✷

3 Fundamental Lemmas

In this section, we prepare two fundamental lemmas that are needed in the next section in order to establish the main theorems (Direct part and Converse part).
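Before turning to the lemmas, the average-versus-maximum error example in Remark 2.2 is small enough to check numerically. The sketch below (the concrete choice $\alpha_n = 1/n$ is an illustrative assumption) evaluates (2.2) and (2.3) for the encoder-decoder pair stated there:

```python
# Numerical check of Remark 2.2: average vs. maximum error probability
# for the three-symbol source over the noiseless binary channel.
# The choice alpha_n = 1/n below is an illustrative assumption.

def error_probabilities(alpha_n):
    p_v = {0: alpha_n, 1: (1 - alpha_n) / 2, 2: (1 - alpha_n) / 2}
    encode = {0: 1, 1: 1, 2: 2}   # phi_n: source symbol -> channel input
    decode = {1: 1, 2: 2}         # psi_n: channel output -> source symbol
    # Deterministic identity channel: W^n(j|i) = 1 iff j == i, so the
    # error indicator for v is 1 exactly when decode(encode(v)) != v.
    err = {v: 1.0 if decode[encode[v]] != v else 0.0 for v in p_v}
    avg = sum(p_v[v] * err[v] for v in p_v)          # average error, (2.2)
    mx = max(err[v] for v in p_v if p_v[v] > 0)      # maximum error, (2.3)
    return avg, mx

for n in (10, 100, 1000):
    avg, mx = error_probabilities(1.0 / n)
    print(n, avg, mx)   # average error = alpha_n -> 0, maximum error stays 1
```

As in the remark, the average error equals $\alpha_n$ and vanishes, while the maximum error is pinned at 1 by the source symbol 0.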
Lemma 3.1 (Generalization of Feinstein's lemma) Given a general source $V = \{V^n\}_{n=1}^{\infty}$ and a general channel $W = \{W^n\}_{n=1}^{\infty}$, let $X^n$ be any input random variable taking values in $\mathcal{X}^n$ and $Y^n$ be the channel output via $W^n$ due to the channel input $X^n$, where $V^n \to X^n \to Y^n$. Then, for every $n = 1, 2, \cdots$, there exists an $(n, \varepsilon_n)$ code such that

$$\varepsilon_n \le \Pr\left\{ \frac{1}{n} \log \frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)} \le \frac{1}{n} \log \frac{1}{P_{V^n}(V^n)} + \gamma \right\} + e^{-n\gamma}, \tag{3.1}$$

where† $\gamma > 0$ is an arbitrary positive number.

† In the case where the input and output alphabets $\mathcal{X}$, $\mathcal{Y}$ are abstract (not necessarily countable), $\frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)}$ in (3.1) is understood to be $g(Y^n|X^n)$, where $g(y|x) \equiv \frac{W^n(dy|x)}{P_{Y^n}(dy)} = \frac{W^n(dy|x)\,P_{X^n}(dx)}{P_{Y^n}(dy)\,P_{X^n}(dx)} = \frac{P_{X^n Y^n}(dx, dy)}{P_{X^n}(dx)\,P_{Y^n}(dy)}$ is the Radon-Nikodym derivative that is measurable in $(x, y)$.

Remark 3.1 In the special case where the source $V = \{V^n\}_{n=1}^{\infty}$ is uniformly distributed on the message set $\mathcal{M}_n = \{1, 2, \cdots, M_n\}$, it follows that

$$\frac{1}{n} \log \frac{1}{P_{V^n}(V^n)} = \frac{1}{n} \log M_n,$$

which implies that the entropy spectrum‡ of the source $V = \{V^n\}_{n=1}^{\infty}$ is exactly a one-point spectrum concentrated on $\frac{1}{n} \log M_n$. Therefore, in this special case, Lemma 3.1 reduces to Feinstein's lemma [2]. ✷

‡ The probability distribution of $\frac{1}{n} \log \frac{1}{P_{V^n}(V^n)}$ is called the entropy spectrum of the source $V = \{V^n\}_{n=1}^{\infty}$, whereas the probability distribution of $\frac{1}{n} \log \frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)}$ is called the mutual information spectrum of the channel $W = \{W^n\}_{n=1}^{\infty}$ given the input $X = \{X^n\}_{n=1}^{\infty}$ (cf. Han and Verdú [7]).

Proof of Lemma 3.1: For each $v \in \mathcal{V}^n$, generate $x(v) \in \mathcal{X}^n$ at random according to the conditional distribution $P_{X^n|V^n}(\cdot|v)$ and let $x(v)$ be the codeword for $v$. In other words, we define the encoder $\varphi_n : \mathcal{V}^n \to \mathcal{X}^n$ as $\varphi_n(v) = x(v)$, where
the codewords $\{x(v) \mid v \in \mathcal{V}^n\}$ are all independently generated. We define the decoder $\psi_n : \mathcal{Y}^n \to \mathcal{V}^n$ as follows. Set

$$S_n = \left\{ (v, x, y) \in \mathcal{Z}_n \,\middle|\, \frac{1}{n} \log \frac{W^n(y|x)}{P_{Y^n}(y)} > \frac{1}{n} \log \frac{1}{P_{V^n}(v)} + \gamma \right\}, \tag{3.2}$$

$$S_n(v) = \{(x, y) \in \mathcal{X}^n \times \mathcal{Y}^n \mid (v, x, y) \in S_n\}, \tag{3.3}$$

where for simplicity we have put $\mathcal{Z}_n \equiv \mathcal{V}^n \times \mathcal{X}^n \times \mathcal{Y}^n$. Suppose that the decoder $\psi_n$ receives a channel output $y \in \mathcal{Y}^n$. If there exists one and only one $v \in \mathcal{V}^n$ such that $(x(v), y) \in S_n(v)$, define the decoder output as $\psi_n(y) = v$; otherwise, let the output of the decoder $\psi_n(y) \in \mathcal{V}^n$ be arbitrary. Then, the probability $\varepsilon_n$ of error for this pair $(\varphi_n, \psi_n)$ (averaged over all realizations of the random code) is given by

$$\varepsilon_n = \sum_{v \in \mathcal{V}^n} P_{V^n}(v)\, \varepsilon_n(v), \tag{3.4}$$

where $\varepsilon_n(v)$ is the probability of error (averaged over all realizations of the random code) when $v \in \mathcal{V}^n$ is the source output. We can evaluate $\varepsilon_n(v)$ as

$$\varepsilon_n(v) \le \Pr\{(x(v), Y^n) \notin S_n(v)\} + \Pr\left\{ \bigcup_{v' : v' \neq v} \{(x(v'), Y^n) \in S_n(v')\} \right\} \le \Pr\{(x(v), Y^n) \notin S_n(v)\} + \sum_{v' : v' \neq v} \Pr\{(x(v'), Y^n) \in S_n(v')\}, \tag{3.5}$$

where $Y^n$ is the channel output via $W^n$ due to the channel input $x(v)$. The first term on the right-hand side of (3.5) is written as

$$A_n(v) \equiv \Pr\{(x(v), Y^n) \notin S_n(v)\} = \sum_{(x, y) \notin S_n(v)} P_{X^n Y^n | V^n}(x, y \,|\, v).$$

Hence,

$$\sum_{v \in \mathcal{V}^n} P_{V^n}(v) A_n(v) = \sum_{v \in \mathcal{V}^n} P_{V^n}(v) \sum_{(x, y) \notin S_n(v)} P_{X^n Y^n | V^n}(x, y \,|\, v) = \sum_{(v, x, y) \notin S_n} P_{V^n X^n Y^n}(v, x, y) = \Pr\{V^n X^n Y^n \notin S_n\}.$$
(3.6)

On the other hand, noting that $x(v')$, $x(v)$ ($v' \neq v$) are independent and hence $x(v')$, $Y^n$ are also independent, the second term on the right-hand side of (3.5) is evaluated as

$$B_n(v) \equiv \sum_{v' : v' \neq v} \Pr\{(x(v'), Y^n) \in S_n(v')\} = \sum_{v' : v' \neq v} \sum_{(x, y) \in S_n(v')} P_{Y^n | V^n}(y|v)\, P_{X^n | V^n}(x|v') \le \sum_{v' \in \mathcal{V}^n} \sum_{(x, y) \in S_n(v')} P_{Y^n | V^n}(y|v)\, P_{X^n | V^n}(x|v').$$

Hence,

$$\sum_{v \in \mathcal{V}^n} P_{V^n}(v) B_n(v) \le \sum_{v \in \mathcal{V}^n} \sum_{v' \in \mathcal{V}^n} \sum_{(x, y) \in S_n(v')} P_{V^n}(v)\, P_{Y^n | V^n}(y|v)\, P_{X^n | V^n}(x|v') = \sum_{v' \in \mathcal{V}^n} \sum_{(x, y) \in S_n(v')} P_{Y^n}(y)\, P_{X^n | V^n}(x|v'). \tag{3.7}$$

On the other hand, in view of (3.2), (3.3), $(x, y) \in S_n(v')$ implies

$$P_{Y^n}(y) \le P_{V^n}(v')\, W^n(y|x)\, e^{-n\gamma}.$$

Therefore, (3.7) is further transformed to

$$\sum_{v \in \mathcal{V}^n} P_{V^n}(v) B_n(v) \le e^{-n\gamma} \sum_{v' \in \mathcal{V}^n} \sum_{(x, y) \in S_n(v')} P_{V^n}(v')\, P_{X^n | V^n}(x|v')\, W^n(y|x) \le e^{-n\gamma} \sum_{(v', x, y) \in \mathcal{Z}_n} P_{V^n}(v')\, P_{X^n | V^n}(x|v')\, W^n(y|x) = e^{-n\gamma}. \tag{3.8}$$

Then, from (3.4), (3.6) and (3.8) it follows that

$$\varepsilon_n = \sum_{v \in \mathcal{V}^n} P_{V^n}(v)\, \varepsilon_n(v) \le \sum_{v \in \mathcal{V}^n} P_{V^n}(v) A_n(v) + \sum_{v \in \mathcal{V}^n} P_{V^n}(v) B_n(v) \le \Pr\{V^n X^n Y^n \notin S_n\} + e^{-n\gamma}.$$

Thus, there must exist a deterministic $(n, \varepsilon_n)$ code such that

$$\varepsilon_n \le \Pr\{V^n X^n Y^n \notin S_n\} + e^{-n\gamma},$$

thereby proving Lemma 3.1. ✷

Lemma 3.2 (Generalization of Verdú-Han's lemma) Let $V = \{V^n\}_{n=1}^{\infty}$ and $W = \{W^n\}_{n=1}^{\infty}$ be a general source and a general channel, respectively, and let $\varphi_n : \mathcal{V}^n \to \mathcal{X}^n$ be the encoder of an $(n, \varepsilon_n)$ code for $(V^n, W^n)$. Put $X^n = \varphi_n(V^n)$ and let $Y^n$ be the channel output via $W^n$ due to the channel input $X^n$, where $V^n \to X^n \to Y^n$.
Then, for every $n = 1, 2, \cdots$, it holds that

$$\varepsilon_n \ge \Pr\left\{ \frac{1}{n} \log \frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)} \le \frac{1}{n} \log \frac{1}{P_{V^n}(V^n)} - \gamma \right\} - e^{-n\gamma}, \tag{3.9}$$

where $\gamma > 0$ is an arbitrary positive number.

Remark 3.2 In the special case where the source $V = \{V^n\}_{n=1}^{\infty}$ is uniformly distributed on the message set $\mathcal{M}_n = \{1, 2, \cdots, M_n\}$, it follows that $\frac{1}{n} \log \frac{1}{P_{V^n}(V^n)} = \frac{1}{n} \log M_n$, which implies that the entropy spectrum of the source $V = \{V^n\}_{n=1}^{\infty}$ is exactly a one-point spectrum concentrated on $\frac{1}{n} \log M_n$. Therefore, in this special case, Lemma 3.2 reduces to Verdú-Han's lemma [8]. ✷

Proof of Lemma 3.2: Define

$$L_n = \left\{ (v, x, y) \in \mathcal{Z}_n \,\middle|\, \frac{1}{n} \log \frac{W^n(y|x)}{P_{Y^n}(y)} \le \frac{1}{n} \log \frac{1}{P_{V^n}(v)} - \gamma \right\}, \tag{3.10}$$

and, for each $v \in \mathcal{V}^n$, set $D(v) = \{y \in \mathcal{Y}^n \mid \psi_n(y) = v\}$; that is, $D(v)$ is the decoding set for $v$. Moreover, for each $(v, x) \in \mathcal{V}^n \times \mathcal{X}^n$, set

$$B(v, x) = \{y \in \mathcal{Y}^n \mid (v, x, y) \in L_n\}. \tag{3.11}$$

Then, noting the Markov chain property (2.1), we have

$$\Pr\{V^n X^n Y^n \in L_n\} = \sum_{(v, x, y) \in L_n} P_{V^n X^n Y^n}(v, x, y) = \sum_{(v, x)} P_{V^n X^n}(v, x)\, W^n(B(v, x)|x) = \sum_{(v, x)} P_{V^n X^n}(v, x)\, W^n(B(v, x) \cap D^c(v)|x) + \sum_{(v, x)} P_{V^n X^n}(v, x)\, W^n(B(v, x) \cap D(v)|x) \le \sum_{(v, x)} P_{V^n X^n}(v, x)\, W^n(D^c(v)|x) + \sum_{(v, x)} P_{V^n X^n}(v, x)\, W^n(B(v, x) \cap D(v)|x) = \varepsilon_n + \sum_{(v, x)} P_{V^n X^n}(v, x) \sum_{y \in B(v, x) \cap D(v)} W^n(y|x), \tag{3.12}$$

where all sums over $(v, x)$ range over $\mathcal{V}^n \times \mathcal{X}^n$, and where we have used the relation

$$\varepsilon_n = \sum_{(v, x) \in \mathcal{V}^n \times \mathcal{X}^n} P_{V^n X^n}(v, x)\, W^n(D^c(v)|x).$$
Now, it follows from (3.10) and (3.11) that $y \in B(v, x)$ implies

$$W^n(y|x) \le e^{-n\gamma}\, \frac{P_{Y^n}(y)}{P_{V^n}(v)},$$

which is substituted into the right-hand side of (3.12) to yield

$$\Pr\{V^n X^n Y^n \in L_n\} \le \varepsilon_n + e^{-n\gamma} \sum_{(v, x)} P_{X^n | V^n}(x|v) \sum_{y \in B(v, x) \cap D(v)} P_{Y^n}(y) \le \varepsilon_n + e^{-n\gamma} \sum_{(v, x)} P_{X^n | V^n}(x|v)\, P_{Y^n}(D(v)) = \varepsilon_n + e^{-n\gamma} \sum_{v \in \mathcal{V}^n} P_{Y^n}(D(v)) = \varepsilon_n + e^{-n\gamma},$$

thereby proving the claim of the lemma. ✷

4 Theorems on Transmissibility

In this section we give both a sufficient condition and a necessary condition for the transmissibility of a given general source $V = \{V^n\}_{n=1}^{\infty}$ over a given general channel $W = \{W^n\}_{n=1}^{\infty}$. First, Lemma 3.1 immediately leads us to the following direct theorem:

Theorem 4.1 (Direct theorem) Let $V = \{V^n\}_{n=1}^{\infty}$, $W = \{W^n\}_{n=1}^{\infty}$ be a general source and a general channel, respectively. If there exist some channel input $X = \{X^n\}_{n=1}^{\infty}$ and some sequence $\{\gamma_n\}_{n=1}^{\infty}$ satisfying

$$\gamma_n > 0, \quad \gamma_n \to 0, \quad n\gamma_n \to \infty \quad (n \to \infty) \tag{4.1}$$

for which it holds that

$$\lim_{n \to \infty} \Pr\left\{ \frac{1}{n} \log \frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)} \le \frac{1}{n} \log \frac{1}{P_{V^n}(V^n)} + \gamma_n \right\} = 0, \tag{4.2}$$

then the source $V = \{V^n\}_{n=1}^{\infty}$ is transmissible over the channel $W = \{W^n\}_{n=1}^{\infty}$, where $Y^n$ is the channel output via $W^n$ due to the channel input $X^n$ and $V^n \to X^n \to Y^n$.

Proof: Since in Lemma 3.1 we can choose the constant $\gamma > 0$ so as to depend on $n$, let us take, instead of $\gamma$, an arbitrary $\{\gamma_n\}_{n=1}^{\infty}$ satisfying condition (4.1). Then, the second term on the right-hand side of (3.1) vanishes as $n$ tends to $\infty$, and hence it follows from (4.2) that the right-hand side of (3.1) vanishes as $n$ tends to $\infty$. Therefore, the $(n, \varepsilon_n)$ code specified in Lemma 3.1 satisfies $\lim_{n \to \infty} \varepsilon_n = 0$. ✷
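To make condition (4.2) concrete, the following Monte Carlo sketch estimates the probability in (4.2) for a toy memoryless pair (an illustrative assumption, not the general setting of the theorem): an i.i.d. Bernoulli(0.11) source, a binary symmetric channel with crossover probability 0.02 and i.i.d. uniform input chosen independently of the source, all logarithms in nats. Since $H(V) \approx 0.347$ nats lies below $I(X; Y) \approx 0.595$ nats, the estimate should vanish as $n$ grows:

```python
import math
import random

# Monte Carlo estimate of the probability in condition (4.2) for a toy
# memoryless pair (illustrative assumptions: i.i.d. Bernoulli(p) source,
# BSC(delta) with i.i.d. uniform input independent of the source; nats).

def prob_42(n, gamma_n, trials=2000, p=0.11, delta=0.02, seed=0):
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # entropy-spectrum sample: (1/n) log 1/P_{V^n}(V^n)
        h = sum(-math.log(p) if rng.random() < p else -math.log(1 - p)
                for _ in range(n)) / n
        # information-spectrum sample: (1/n) log W^n(Y^n|X^n)/P_{Y^n}(Y^n);
        # with uniform input on a BSC the output is uniform, so the
        # per-symbol density is log(2(1-delta)) or log(2*delta).
        i = sum(math.log(2 * (1 - delta)) if rng.random() >= delta
                else math.log(2 * delta) for _ in range(n)) / n
        hits += (i <= h + gamma_n)
    return hits / trials

# gamma_n = 1/sqrt(n) satisfies (4.1): gamma_n -> 0 and n*gamma_n -> infinity.
for n in (50, 200, 1000):
    print(n, prob_42(n, gamma_n=n ** -0.5))
```

The estimated probabilities decrease toward 0 with $n$, in line with Theorem 4.1 applied to this particular (information-stable) pair.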
Next, Lemma 3.2 immediately leads us to the following converse theorem:

Theorem 4.2 (Converse theorem) Suppose that a general source $V = \{V^n\}_{n=1}^{\infty}$ is transmissible over a general channel $W = \{W^n\}_{n=1}^{\infty}$. Let the channel input be $X = \{X^n \equiv \varphi_n(V^n)\}_{n=1}^{\infty}$, where $\varphi_n : \mathcal{V}^n \to \mathcal{X}^n$ is the channel encoder. Then, for any sequence $\{\gamma_n\}_{n=1}^{\infty}$ satisfying condition (4.1), it holds that

$$\lim_{n \to \infty} \Pr\left\{ \frac{1}{n} \log \frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)} \le \frac{1}{n} \log \frac{1}{P_{V^n}(V^n)} - \gamma_n \right\} = 0, \tag{4.3}$$

where $Y^n$ is the channel output via $W^n$ due to the channel input $X^n$ and $V^n \to X^n \to Y^n$.

Proof: If $V$ is transmissible over $W$, then by Definition 2.1 there exists an $(n, \varepsilon_n)$ code such that $\lim_{n \to \infty} \varepsilon_n = 0$. Hence, the claim of the theorem immediately follows from (3.9) in Lemma 3.2 with $\gamma_n$ in place of $\gamma$. ✷

Remark 4.1 Comparing (4.3) in Theorem 4.2 with (4.2) in Theorem 4.1, we observe that the only difference is that the sign of $\gamma_n$ is changed from $+$ to $-$. Since $\gamma_n$ vanishes as $n$ tends to $\infty$, this difference is asymptotically negligible. ✷

Now, let us consider the implications of conditions (4.2) and (4.3). First, consider (4.2). Putting

$$A_n = \frac{1}{n} \log \frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)}, \quad B_n = \frac{1}{n} \log \frac{1}{P_{V^n}(V^n)}$$

for simplicity, (4.2) is written as

$$\alpha_n \equiv \Pr\{A_n \le B_n + \gamma_n\} \to 0 \quad (n \to \infty), \tag{4.4}$$

which can be transformed to

$$\Pr\{A_n \le B_n + \gamma_n\} = \sum_u \Pr\{B_n = u\} \Pr\{A_n \le B_n + \gamma_n \mid B_n = u\} = \sum_u \Pr\{B_n = u\} \Pr\{A_n \le u + \gamma_n \mid B_n = u\}.$$

Set

$$T_n = \{u \mid \Pr\{A_n \le u + \gamma_n \mid B_n = u\} \le \sqrt{\alpha_n}\}; \tag{4.5}$$

then by virtue of (4.4) and the Markov inequality, we have

$$\Pr\{B_n \in T_n\} \ge 1 - \sqrt{\alpha_n}.$$
(4.6)

Let us now define the upper cumulative probabilities for $A_n$, $B_n$ by

$$P_n(t) = \Pr\{A_n \ge t\}, \quad Q_n(t) = \Pr\{B_n \ge t\};$$

then it follows that

$$P_n(t) = \sum_u \Pr\{B_n = u\} \Pr\{A_n \ge t \mid B_n = u\} \ge \sum_{u \in T_n : u \ge t - \gamma_n} \Pr\{B_n = u\} \Pr\{A_n \ge t \mid B_n = u\} \ge \sum_{u \in T_n : u \ge t - \gamma_n} \Pr\{B_n = u\} \Pr\{A_n \ge u + \gamma_n \mid B_n = u\}. \tag{4.7}$$

On the other hand, by means of (4.5), $u \in T_n$ implies that

$$\Pr\{A_n \ge u + \gamma_n \mid B_n = u\} \ge 1 - \sqrt{\alpha_n}.$$

Therefore, by (4.6) and (4.7) it is concluded that

$$P_n(t) \ge (1 - \sqrt{\alpha_n}) \sum_{u \in T_n : u \ge t - \gamma_n} \Pr\{B_n = u\} \ge (1 - \sqrt{\alpha_n})\left(Q_n(t - \gamma_n) - \Pr\{B_n \notin T_n\}\right) \ge (1 - \sqrt{\alpha_n})\left(Q_n(t - \gamma_n) - \sqrt{\alpha_n}\right) \ge Q_n(t - \gamma_n) - 2\sqrt{\alpha_n}.$$

That is,

$$P_n(t) \ge Q_n(t - \gamma_n) - 2\sqrt{\alpha_n}.$$

This means that, for all $t$, the upper cumulative probability $P_n(t)$ of $A_n$ is larger than or equal to the upper cumulative probability $Q_n(t - \gamma_n)$ of $B_n$, except for the asymptotically vanishing difference $2\sqrt{\alpha_n}$. This in turn implies that, as a whole, the mutual information spectrum of the channel is shifted to the right in comparison with the entropy spectrum of the source. With $-\gamma_n$ instead of $\gamma_n$, the same implication follows also from (4.3). It is such an allocation relation between the mutual information spectrum and the entropy spectrum that makes transmissible joint source-channel coding possible.

However, it is not easy in general to check whether conditions (4.2), (4.3) in these forms are satisfied or not. Therefore, we consider equivalently rewriting conditions (4.2), (4.3) into alternative information-spectrum forms that hopefully make it easier to depict an intuitive picture. This can actually be done by re-choosing the input and output variables $X^n$, $Y^n$ as below. These forms are useful in order to see the relation of conditions (4.2), (4.3) to the so-called separation theorem.
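The shift relation $P_n(t) \ge Q_n(t - \gamma_n) - 2\sqrt{\alpha_n}$ can be checked empirically. The sketch below adopts illustrative memoryless assumptions (i.i.d. Bernoulli(0.11) source; BSC(0.02) with i.i.d. uniform input independent of the source; all values in nats), samples the two spectra, and tests the inequality on a grid of $t$:

```python
import math
import random

# Empirical check of the spectrum-shift inequality
# P_n(t) >= Q_n(t - gamma_n) - 2*sqrt(alpha_n)
# under illustrative memoryless assumptions (i.i.d. Bernoulli(0.11)
# source; BSC(0.02) with i.i.d. uniform input independent of it; nats).

def sample_spectra(n, trials=4000, p=0.11, delta=0.02, seed=1):
    rng = random.Random(seed)
    a, b = [], []  # a: information-spectrum A_n, b: entropy-spectrum B_n
    for _ in range(trials):
        b.append(sum(-math.log(p) if rng.random() < p else -math.log(1 - p)
                     for _ in range(n)) / n)
        a.append(sum(math.log(2 * (1 - delta)) if rng.random() >= delta
                     else math.log(2 * delta) for _ in range(n)) / n)
    return a, b

n, gamma_n = 500, 0.05
a, b = sample_spectra(n)
trials = len(a)
alpha_n = sum(ai <= bi + gamma_n for ai, bi in zip(a, b)) / trials
P = lambda t: sum(ai >= t for ai in a) / trials   # upper cumulative of A_n
Q = lambda t: sum(bi >= t for bi in b) / trials   # upper cumulative of B_n
shift_ok = all(P(t) >= Q(t - gamma_n) - 2 * math.sqrt(alpha_n)
               for t in [k / 100 for k in range(0, 101)])
print(alpha_n, shift_ok)
```

Because the channel's information spectrum here sits well to the right of the source's entropy spectrum, $\alpha_n$ is essentially 0 and the inequality holds with room to spare across the grid.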
First, we show another information-spectrum form equivalent to the sufficient condition (4.2) in Theorem 4.1.

Theorem 4.3 (Equivalence of sufficient conditions) The following two conditions are equivalent:

1) For some channel input $X = \{X^n\}_{n=1}^{\infty}$ and some sequence $\{\gamma_n\}_{n=1}^{\infty}$ satisfying condition (4.1), it holds that

$$\lim_{n \to \infty} \Pr\left\{ \frac{1}{n} \log \frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)} \le \frac{1}{n} \log \frac{1}{P_{V^n}(V^n)} + \gamma_n \right\} = 0, \tag{4.8}$$

where $Y^n$ is the channel output via $W^n$ due to the channel input $X^n$ and $V^n \to X^n \to Y^n$.

2) (Strict domination: Vembu, Verdú and Steinberg [9]) For some channel input $X = \{X^n\}_{n=1}^{\infty}$, some sequence $\{c_n\}_{n=1}^{\infty}$ and some sequence $\{\gamma_n\}_{n=1}^{\infty}$ satisfying condition (4.1), it holds that

$$\lim_{n \to \infty} \left( \Pr\left\{ \frac{1}{n} \log \frac{1}{P_{V^n}(V^n)} \ge c_n \right\} + \Pr\left\{ \frac{1}{n} \log \frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)} \le c_n + \gamma_n \right\} \right) = 0, \tag{4.9}$$

where $Y^n$ is the channel output via $W^n$ due to the channel input $X^n$.

Remark 4.2 (Separation in general) The sufficient condition 2) in Theorem 4.3 means that the entropy spectrum of the source and the mutual information spectrum of the channel are asymptotically completely split, with a vacant boundary of asymptotically vanishing width $\gamma_n$, the former placed to the left of the latter, where these two spectra may oscillate "synchronously" with $n$. In the case where such a separation condition 2) is satisfied, we can split reliable joint source-channel coding into two steps as follows (separation of source coding and channel coding): we first encode the source output $V^n$ at the fixed-length coding rate $c_n = \frac{1}{n} \log M_n$ ($M_n$ is the size of the message set $\mathcal{M}_n$), and then encode the output of the source encoder into the channel. The error probability $\varepsilon_n$ for this two-step coding is upper bounded by the sum of the error probability of the fixed-length source coding (cf.
Vembu, Verdú and Steinberg [9]; Han [11, Lemma 1.3.1]):

$$\Pr\left\{ \frac{1}{n} \log \frac{1}{P_{V^n}(V^n)} \ge c_n \right\}$$

and the "maximum" error probability of the channel coding (cf. Feinstein [2], Ash [3], Han [11, Lemma 3.4.1]):

$$\Pr\left\{ \frac{1}{n} \log \frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)} \le c_n + \gamma_n \right\} + e^{-n\gamma_n}.$$

It then follows from (4.9) that both of these two error probabilities vanish as $n$ tends to $\infty$, where it should be noted that $e^{-n\gamma_n} \to 0$ as $n \to \infty$. Thus, we have $\lim_{n \to \infty} \varepsilon_n = 0$, and conclude that the source $V = \{V^n\}_{n=1}^{\infty}$ is transmissible over the channel $W = \{W^n\}_{n=1}^{\infty}$. This can be regarded as providing another proof of Theorem 4.1. ✷

Proof of Theorem 4.3:

2) ⇒ 1): For any joint probability distribution $P_{V^n X^n}$ for $V^n$ and $X^n$, we have

$$\Pr\left\{ \frac{1}{n} \log \frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)} \le \frac{1}{n} \log \frac{1}{P_{V^n}(V^n)} + \gamma_n \right\} \le \Pr\left\{ \frac{1}{n} \log \frac{1}{P_{V^n}(V^n)} \ge c_n \right\} + \Pr\left\{ \frac{1}{n} \log \frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)} \le c_n + \gamma_n \right\},$$

which together with (4.9) implies (4.8).

1) ⇒ 2): Supposing that condition 1) holds, put

$$\alpha_n \equiv \Pr\left\{ \frac{1}{n} \log \frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)} \le \frac{1}{n} \log \frac{1}{P_{V^n}(V^n)} + \gamma_n \right\}, \tag{4.10}$$

and moreover, with $\gamma_n' = \frac{\gamma_n}{4}$, $\delta_n = \max(\sqrt{\alpha_n}, e^{-n\gamma_n'})$, define

$$d_n = \sup\left\{ R \,\middle|\, \Pr\left\{ \frac{1}{n} \log \frac{1}{P_{V^n}(V^n)} \ge R \right\} > \delta_n \right\} - \gamma_n'. \tag{4.11}$$

Furthermore, define

$$S_n = \left\{ v \in \mathcal{V}^n \,\middle|\, \frac{1}{n} \log \frac{1}{P_{V^n}(v)} \ge d_n \right\}, \tag{4.12}$$

$$\lambda_n^{(1)} = \Pr\{V^n \in S_n\}, \quad \lambda_n^{(2)} = \Pr\{V^n \notin S_n\}; \tag{4.13}$$

then the joint probability distribution $P_{V^n X^n Y^n}$ can be written as a mixture:

$$P_{V^n X^n Y^n}(v, x, y) = \lambda_n^{(1)} P_{\tilde{V}^n \tilde{X}^n \tilde{Y}^n}(v, x, y) + \lambda_n^{(2)} P_{\overline{V}^n \overline{X}^n \overline{Y}^n}(v, x, y), \tag{4.14}$$

where $P_{\tilde{V}^n \tilde{X}^n \tilde{Y}^n}$, $P_{\overline{V}^n \overline{X}^n \overline{Y}^n}$ are the conditional probability distributions of $V^n X^n Y^n$ conditioned on $V^n \in S_n$, $V^n \notin S_n$, respectively.
We notice here that the Markov chain property $V^n \to X^n \to Y^n$ implies $P_{\tilde Y^n|\tilde X^n} = P_{\bar Y^n|\bar X^n} = W^n$ and the Markov chain properties $\tilde V^n \to \tilde X^n \to \tilde Y^n$, $\bar V^n \to \bar X^n \to \bar Y^n$. We now rewrite (4.10) as
\[
\alpha_n = \lambda_n^{(1)} \Pr\left\{ \frac{1}{n}\log\frac{W^n(\tilde Y^n|\tilde X^n)}{P_{Y^n}(\tilde Y^n)} \le \frac{1}{n}\log\frac{1}{P_{V^n}(\tilde V^n)} + \gamma_n \right\}
+ \lambda_n^{(2)} \Pr\left\{ \frac{1}{n}\log\frac{W^n(\bar Y^n|\bar X^n)}{P_{Y^n}(\bar Y^n)} \le \frac{1}{n}\log\frac{1}{P_{V^n}(\bar V^n)} + \gamma_n \right\}. \tag{4.15}
\]
On the other hand, since (4.11), (4.12) lead to $\lambda_n^{(1)} > \delta_n \ge \sqrt{\alpha_n}$, it follows from (4.15) that
\[
\Pr\left\{ \frac{1}{n}\log\frac{W^n(\tilde Y^n|\tilde X^n)}{P_{Y^n}(\tilde Y^n)} \le \frac{1}{n}\log\frac{1}{P_{V^n}(\tilde V^n)} + \gamma_n \right\} \le \sqrt{\alpha_n}. \tag{4.16}
\]
Then, by the definition of $\tilde V^n$,
\[
\frac{1}{n}\log\frac{1}{P_{V^n}(\tilde V^n)} \ge d_n,
\]
and so from (4.16) we obtain
\[
\Pr\left\{ \frac{1}{n}\log\frac{W^n(\tilde Y^n|\tilde X^n)}{P_{Y^n}(\tilde Y^n)} \le d_n + \gamma_n \right\} \le \sqrt{\alpha_n}. \tag{4.17}
\]
Next, since it follows from (4.14) that
\[
P_{Y^n}(\mathbf{y}) = \lambda_n^{(1)} P_{\tilde Y^n}(\mathbf{y}) + \lambda_n^{(2)} P_{\bar Y^n}(\mathbf{y}) \ge \lambda_n^{(1)} P_{\tilde Y^n}(\mathbf{y}) \ge \delta_n P_{\tilde Y^n}(\mathbf{y}) \ge e^{-n\gamma'_n} P_{\tilde Y^n}(\mathbf{y}),
\]
we have
\[
\frac{1}{n}\log\frac{1}{P_{Y^n}(\tilde Y^n)} \le \frac{1}{n}\log\frac{1}{P_{\tilde Y^n}(\tilde Y^n)} + \gamma'_n,
\]
which is substituted into (4.17) to get
\[
\Pr\left\{ \frac{1}{n}\log\frac{W^n(\tilde Y^n|\tilde X^n)}{P_{\tilde Y^n}(\tilde Y^n)} \le d_n + \gamma_n - \gamma'_n \right\} \le \sqrt{\alpha_n}. \tag{4.18}
\]
On the other hand, by the definition (4.11) of $d_n$,
\[
\Pr\left\{ \frac{1}{n}\log\frac{1}{P_{V^n}(V^n)} \ge d_n + 2\gamma'_n \right\} \le \delta_n. \tag{4.19}
\]
Set $c_n = d_n + 2\gamma'_n$ and note that $\alpha_n \to 0$, $\delta_n \to 0$ ($n\to\infty$) and $\gamma'_n = \frac{\gamma_n}{4}$; then by (4.18), (4.19) we have
\[
\lim_{n\to\infty}\left( \Pr\left\{ \frac{1}{n}\log\frac{1}{P_{V^n}(V^n)} \ge c_n \right\} + \Pr\left\{ \frac{1}{n}\log\frac{W^n(\tilde Y^n|\tilde X^n)}{P_{\tilde Y^n}(\tilde Y^n)} \le c_n + \frac{1}{4}\gamma_n \right\} \right) = 0.
\]
Finally, resetting $\tilde X^n \tilde Y^n$, $\frac{1}{4}\gamma_n$ as $X^n Y^n$ and $\gamma_n$, respectively, we conclude that condition 2), i.e., (4.9), holds. ✷

Having established an information-spectrum separation-like form of the sufficient condition (4.2) in Theorem 4.1, let us now turn to demonstrating several information-spectrum versions derived from the necessary condition (4.3) in Theorem 4.2.
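As a concrete numerical illustration of the spectrum-splitting picture in Remark 4.2 (a Monte Carlo sketch with assumed toy parameters, not part of the paper's argument), take an i.i.d. Bernoulli(0.11) source and a binary symmetric channel with crossover probability 0.05 driven by an i.i.d. uniform input; writing $h(\cdot)$ for the binary entropy function, the entropy spectrum concentrates near $h(0.11) \approx 0.50$ bits and the information spectrum near $1 - h(0.05) \approx 0.71$ bits, so any constant $c_n = c$ strictly between them makes both probabilities in (4.9) vanish:

```python
import numpy as np

rng = np.random.default_rng(0)

n, trials = 2000, 2000    # block length and number of Monte Carlo trials
p, q = 0.11, 0.05         # assumed source bias and BSC crossover probability
gamma = 0.01              # plays the role of gamma_n
c = 0.60                  # rate threshold: h(0.11) ~ 0.50 < c < 1 - h(0.05) ~ 0.71

# Entropy spectrum: samples of (1/n) log2 1/P_{V^n}(V^n), i.i.d. Bernoulli(p).
v = rng.random((trials, n)) < p
self_info = np.where(v, -np.log2(p), -np.log2(1 - p)).mean(axis=1)

# Information spectrum: samples of (1/n) log2 W^n(Y^n|X^n)/P_{Y^n}(Y^n) for a
# BSC(q) with i.i.d. uniform input; P_{Y^n} is then uniform, so the per-symbol
# information density is 1 + log2(1-q) on clean symbols and 1 + log2(q) on flips.
flips = rng.random((trials, n)) < q
info_dens = np.where(flips, 1 + np.log2(q), 1 + np.log2(1 - q)).mean(axis=1)

a = np.mean(self_info >= c)          # Pr{(1/n) log 1/P(V^n) >= c_n}
b = np.mean(info_dens <= c + gamma)  # Pr{(1/n) i(X^n; Y^n) <= c_n + gamma_n}
print(a, b)  # both terms of (4.9) are essentially zero at this block length
```

In this toy case both spectra are single spikes and strict domination reduces to $h(p) < 1 - h(q)$; the interest of condition (4.9) lies in the general case, where the two spectra may oscillate with $n$.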
Proposition 4.1 (Necessary conditions) The following two are necessary conditions for the transmissibility.

1) For some channel input $\mathbf{X} = \{X^n\}_{n=1}^{\infty}$ and for any sequence $\{\gamma_n\}_{n=1}^{\infty}$ satisfying condition (4.1), it holds that
\[
\lim_{n\to\infty} \Pr\left\{ \frac{1}{n}\log\frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)} \le \frac{1}{n}\log\frac{1}{P_{V^n}(V^n)} - \gamma_n \right\} = 0, \tag{4.20}
\]
where $Y^n$ is the channel output via $W^n$ due to the channel input $X^n$ and $V^n \to X^n \to Y^n$.

2) For any sequence $\{\gamma_n\}_{n=1}^{\infty}$ satisfying condition (4.1) and for some channel input $\mathbf{X} = \{X^n\}_{n=1}^{\infty}$, it holds that
\[
\lim_{n\to\infty} \Pr\left\{ \frac{1}{n}\log\frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)} \le \frac{1}{n}\log\frac{1}{P_{V^n}(V^n)} - \gamma_n \right\} = 0, \tag{4.21}
\]
where $Y^n$ is the channel output via $W^n$ due to the channel input $X^n$ and $V^n \to X^n \to Y^n$.

Proof: The necessity of condition 1) immediately follows from the necessary condition (4.3) in Theorem 4.2. Moreover, it is also trivial to see that condition 1) implies condition 2) as an immediate logical consequence, and hence condition 2) is also a necessary condition. ✷

The necessary condition 1) in Theorem 4.4 below is the same as condition 2) in Proposition 4.1. It is written here again in order to emphasize a pleasing duality between Theorem 4.3 and Theorem 4.4, which reflects the duality between the two fundamental Lemmas 3.1 and 3.2.

Theorem 4.4 (Equivalence of necessary conditions) The following two conditions are equivalent:

1) For any sequence $\{\gamma_n\}_{n=1}^{\infty}$ satisfying condition (4.1) and for some channel input $\mathbf{X} = \{X^n\}_{n=1}^{\infty}$, it holds that
\[
\lim_{n\to\infty} \Pr\left\{ \frac{1}{n}\log\frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)} \le \frac{1}{n}\log\frac{1}{P_{V^n}(V^n)} - \gamma_n \right\} = 0, \tag{4.22}
\]
where $Y^n$ is the channel output via $W^n$ due to the channel input $X^n$ and $V^n \to X^n \to Y^n$.
2) (Domination) For any sequence $\{\gamma_n\}_{n=1}^{\infty}$ satisfying condition (4.1) and for some channel input $\mathbf{X} = \{X^n\}_{n=1}^{\infty}$ and some sequence $\{c_n\}_{n=1}^{\infty}$, it holds that
\[
\lim_{n\to\infty}\left( \Pr\left\{ \frac{1}{n}\log\frac{1}{P_{V^n}(V^n)} \ge c_n \right\} + \Pr\left\{ \frac{1}{n}\log\frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)} \le c_n - \gamma_n \right\} \right) = 0, \tag{4.23}
\]
where $Y^n$ is the channel output via $W^n$ due to the channel input $X^n$.

Proof: This theorem can be proved in entirely the same manner as in the proof of Theorem 4.3, with $\gamma_n$ replaced by $-\gamma_n$. ✷

Remark 4.3 Originally, the definition of domination given by Vembu, Verdú and Steinberg [9] is not condition 2) in Theorem 4.4 but the following:

2′) (Domination) For any sequence $\{d_n\}_{n=1}^{\infty}$ and any sequence $\{\gamma_n\}_{n=1}^{\infty}$ satisfying condition (4.1), there exists some channel input $\mathbf{X} = \{X^n\}_{n=1}^{\infty}$ such that
\[
\lim_{n\to\infty}\left( \Pr\left\{ \frac{1}{n}\log\frac{1}{P_{V^n}(V^n)} \ge d_n \right\} \times \Pr\left\{ \frac{1}{n}\log\frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)} \le d_n - \gamma_n \right\} \right) = 0 \tag{4.24}
\]
holds, where $Y^n$ is the channel output via $W^n$ due to the channel input $X^n$. ✷

This necessary condition 2′) is implied by necessary condition 2) in Theorem 4.4. To see this, set
\[
\alpha_n \equiv \Pr\left\{ \frac{1}{n}\log\frac{1}{P_{V^n}(V^n)} \ge c_n \right\}, \tag{4.25}
\]
\[
\beta_n \equiv \Pr\left\{ \frac{1}{n}\log\frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)} \le c_n - \gamma_n \right\}, \tag{4.26}
\]
\[
\kappa_n \equiv \Pr\left\{ \frac{1}{n}\log\frac{1}{P_{V^n}(V^n)} \ge d_n \right\}, \tag{4.27}
\]
\[
\mu_n \equiv \Pr\left\{ \frac{1}{n}\log\frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)} \le d_n - \gamma_n \right\}. \tag{4.28}
\]
Then, we observe that $\kappa_n \le \alpha_n$ if $d_n \ge c_n$, and $\mu_n \le \beta_n$ if $d_n \le c_n$; hence it follows from condition 2) that $\kappa_n\mu_n \le \alpha_n + \beta_n \to 0$ as $n$ tends to $\infty$. Thus, condition 2) implies condition 2′), which means that condition 2) is stronger than or equivalent to condition 2′) as a necessary condition for the transmissibility. It is not currently clear, however, whether both are equivalent or not.
✷

Remark 4.4 Condition 2) in Theorem 4.4 of this form is used later to directly prove Theorem 6.6 (separation theorem), while condition 2′) in Remark 4.3 of this form is irrelevant for this purpose. ✷

5 ε-Transmissibility Theorem

So far we have considered only the case where the error probability $\varepsilon_n$ satisfies the condition $\lim_{n\to\infty}\varepsilon_n = 0$. However, we can relax this condition as follows:
\[
\limsup_{n\to\infty} \varepsilon_n \le \varepsilon, \tag{5.1}
\]
where $\varepsilon$ is any constant such that $0 \le \varepsilon < 1$. (It is obvious that the special case with $\varepsilon = 0$ coincides with the case that we have considered so far.) We now say that the source $\mathbf{V}$ is ε-transmissible over the channel $\mathbf{W}$ when there exists an $(n, \varepsilon_n)$ code satisfying condition (5.1). Then, the same arguments as in the previous sections, with due slight modifications, lead to the following two theorems in parallel with Theorem 4.1 and Theorem 4.2, respectively:

Theorem 5.1 (ε-Direct theorem) Let $\mathbf{V} = \{V^n\}_{n=1}^{\infty}$, $\mathbf{W} = \{W^n\}_{n=1}^{\infty}$ be a general source and a general channel, respectively. If there exist some channel input $\mathbf{X} = \{X^n\}_{n=1}^{\infty}$ and some sequence $\{\gamma_n\}_{n=1}^{\infty}$ such that
\[
\gamma_n > 0, \quad \gamma_n \to 0 \quad \text{and} \quad n\gamma_n \to \infty \quad (n\to\infty) \tag{5.2}
\]
for which it holds that
\[
\limsup_{n\to\infty} \Pr\left\{ \frac{1}{n}\log\frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)} \le \frac{1}{n}\log\frac{1}{P_{V^n}(V^n)} + \gamma_n \right\} \le \varepsilon, \tag{5.3}
\]
then the source $\mathbf{V} = \{V^n\}_{n=1}^{\infty}$ is ε-transmissible over the channel $\mathbf{W} = \{W^n\}_{n=1}^{\infty}$, where $Y^n$ is the channel output via $W^n$ due to the channel input $X^n$ and $V^n \to X^n \to Y^n$. ✷

Theorem 5.2 (ε-Converse theorem) Suppose that a general source $\mathbf{V} = \{V^n\}_{n=1}^{\infty}$ is ε-transmissible over a general channel $\mathbf{W} = \{W^n\}_{n=1}^{\infty}$, and let the channel input be $\mathbf{X} = \{X^n \equiv \varphi_n(V^n)\}_{n=1}^{\infty}$, where $\varphi_n : \mathcal{V}^n \to \mathcal{X}^n$ is the channel encoder.
Then, for any sequence $\{\gamma_n\}_{n=1}^{\infty}$ satisfying condition (5.2), it holds that
\[
\limsup_{n\to\infty} \Pr\left\{ \frac{1}{n}\log\frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)} \le \frac{1}{n}\log\frac{1}{P_{V^n}(V^n)} - \gamma_n \right\} \le \varepsilon, \tag{5.4}
\]
where $Y^n$ is the channel output via $W^n$ due to the channel input $X^n$ and $V^n \to X^n \to Y^n$. ✷

Remark 5.1 It should be noted here that such a sufficient condition (5.3), as well as such a necessary condition (5.4), for the ε-transmissibility cannot actually be derived by generalizing the strict domination in (4.9) and the domination in (4.23). It should be noted also that, under the ε-transmissibility criterion, joint source-channel coding is beyond the separation principle. ✷

6 Separation Theorems of the Traditional Type

Thus far we have investigated the joint source-channel coding problem from the viewpoint of information spectra and established the fundamental theorems (Theorems 4.1 ∼ 4.4). These results are of seemingly different forms from separation theorems of the traditional type. It would then be natural to ask how the separation principle of the information-spectrum type is related to separation theorems of the traditional type. In this section we address this question. To do so, we first need some preparation. We denote by $R_f(\mathbf{V})$ the infimum of all achievable fixed-length coding rates for a general source $\mathbf{V} = \{V^n\}_{n=1}^{\infty}$ (as for the formal definition, see Han and Verdú [7], Han [11, Definitions 1.1.1, 1.1.2]), and denote by $C(\mathbf{W})$ the capacity of a general channel $\mathbf{W} = \{W^n : \mathcal{X}^n \to \mathcal{Y}^n\}_{n=1}^{\infty}$ (as for the formal definition, see Han and Verdú [7], Han [11, Definitions 3.1.1, 3.1.2]). First, $R_f(\mathbf{V})$ is characterized as follows.

Theorem 6.1 (Han and Verdú [7], Han [11])
\[
R_f(\mathbf{V}) = \overline{H}(\mathbf{V}), \tag{6.1}
\]
where§
\[
\overline{H}(\mathbf{V}) = \text{p-}\limsup_{n\to\infty} \frac{1}{n}\log\frac{1}{P_{V^n}(V^n)}. \tag{6.2}
\]
Next, let us consider the characterization of $C(\mathbf{W})$.
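Before doing so, the quantity in (6.2) admits a simple numerical sanity check (a Monte Carlo sketch under an assumed i.i.d. Bernoulli source, not taken from the paper): for an i.i.d. source the normalized self-information concentrates at the single-letter entropy $H(V_1)$ by the law of large numbers, so $\overline{H}(\mathbf{V})$, and hence $R_f(\mathbf{V})$, equals $H(V_1)$. An upper quantile of the empirical spectrum makes the p-limsup visible:

```python
import numpy as np

rng = np.random.default_rng(1)
p = 0.2                                         # assumed Bernoulli source parameter
H = -p * np.log2(p) - (1 - p) * np.log2(1 - p)  # single-letter entropy, ~0.722 bits

# Samples of Z_n = (1/n) log2 1/P_{V^n}(V^n); an upper quantile of Z_n serves as
# a finite-n proxy for p-limsup Z_n = inf{alpha : Pr{Z_n > alpha} -> 0}.
for n in (100, 1000, 10000):
    v = rng.random((4000, n)) < p
    z = np.where(v, -np.log2(p), -np.log2(1 - p)).mean(axis=1)
    print(n, round(float(np.quantile(z, 0.99)), 3))  # shrinks toward H ~ 0.722
```

For sources that are not information-stable (e.g., the mixed sources of Example 6.2 below) the spectrum does not collapse to a single point, and $\overline{H}(\mathbf{V})$ strictly exceeds the inf-entropy rate.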
Given a general channel $\mathbf{W} = \{W^n\}_{n=1}^{\infty}$ and its input $\mathbf{X} = \{X^n\}_{n=1}^{\infty}$, let $\mathbf{Y} = \{Y^n\}_{n=1}^{\infty}$ be the output due to the input $\mathbf{X}$ via the channel $\mathbf{W}$, and define:

Definition 6.1
\[
\underline{I}(\mathbf{X};\mathbf{Y}) = \text{p-}\liminf_{n\to\infty} \frac{1}{n}\log\frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)}. \tag{6.3}
\]
Then, the capacity $C(\mathbf{W})$ is characterized as follows.

Theorem 6.2 (Verdú and Han [8], Han [11])
\[
C(\mathbf{W}) = \sup_{\mathbf{X}} \underline{I}(\mathbf{X};\mathbf{Y}), \tag{6.4}
\]
where $\sup_{\mathbf{X}}$ means the supremum over all possible inputs $\mathbf{X}$. ✷

With these preparations, let us turn to the separation theorem problem of the traditional type. A general source $\mathbf{V} = \{V^n\}_{n=1}^{\infty}$ is said to be information-stable (cf. Dobrushin [4], Pinsker [5]) if
\[
\frac{\frac{1}{n}\log\frac{1}{P_{V^n}(V^n)}}{H_n(V^n)} \to 1 \quad \text{in prob.}, \tag{6.5}
\]
where $H_n(V^n) = \frac{1}{n}H(V^n)$ and $H(V^n)$ stands for the entropy of $V^n$ (cf. Cover and Thomas [13]). Moreover, a general channel $\mathbf{W} = \{W^n\}_{n=1}^{\infty}$ is said to be information-stable (cf. Dobrushin [4], Pinsker [5], Hu [6]) if there exists a channel input $\mathbf{X} = \{X^n\}_{n=1}^{\infty}$ such that
\[
\frac{\frac{1}{n}\log\frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)}}{C_n(W^n)} \to 1 \quad \text{in prob.}, \tag{6.6}
\]
where $C_n(W^n) = \sup_{X^n} \frac{1}{n} I(X^n;Y^n)$, $Y^n$ is the channel output via $W^n$ due to the channel input $X^n$, and $I(X^n;Y^n)$ is the mutual information between $X^n$ and $Y^n$ (cf. Cover and Thomas [13]). Then, we can summarize a typical separation theorem of the traditional type as follows.

§ For an arbitrary sequence of real-valued random variables $\{Z_n\}_{n=1}^{\infty}$, we define the following notions (cf. Han and Verdú [7], Han [11]): $\text{p-}\limsup_{n\to\infty} Z_n \equiv \inf\{\alpha \mid \lim_{n\to\infty}\Pr\{Z_n > \alpha\} = 0\}$ (the limit superior in probability), and $\text{p-}\liminf_{n\to\infty} Z_n \equiv \sup\{\beta \mid \lim_{n\to\infty}\Pr\{Z_n < \beta\} = 0\}$ (the limit inferior in probability).
Theorem 6.3 (Dobrushin [4], Pinsker [5]) Let the channel $\mathbf{W} = \{W^n\}_{n=1}^{\infty}$ be information-stable and suppose that the limit $\lim_{n\to\infty} C_n(W^n)$ exists; or, let the source $\mathbf{V} = \{V^n\}_{n=1}^{\infty}$ be information-stable and suppose that the limit $\lim_{n\to\infty} H_n(V^n)$ exists. Then, the following two statements hold:

1) If $R_f(\mathbf{V}) < C(\mathbf{W})$, then the source $\mathbf{V}$ is transmissible over the channel $\mathbf{W}$. In this case, we can separate the source coding and the channel coding.

2) If the source $\mathbf{V}$ is transmissible over the channel $\mathbf{W}$, then it must hold that $R_f(\mathbf{V}) \le C(\mathbf{W})$. ✷

In order to generalize Theorem 6.3, we need to introduce the concept of optimistic coding. The "optimistic" standpoint means that we evaluate the coding reliability with error probability $\liminf_{n\to\infty}\varepsilon_n = 0$ (that is, for every $\varepsilon > 0$, $\varepsilon_n < \varepsilon$ for infinitely many $n$). In contrast with this, the standpoint that we have taken so far is called pessimistic, with error probability $\lim_{n\to\infty}\varepsilon_n = 0$ (that is, for every $\varepsilon > 0$, $\varepsilon_n < \varepsilon$ for all sufficiently large $n$). The following definition concerns optimistic source coding for any general source $\mathbf{V}$.

Definition 6.2 (Optimistic achievability for source coding) Rate $R$ is optimistically achievable $\stackrel{\text{def}}{\Longleftrightarrow}$ there exists an $(n, M_n, \varepsilon_n)$-source code satisfying
\[
\liminf_{n\to\infty} \varepsilon_n = 0 \quad \text{and} \quad \limsup_{n\to\infty} \frac{1}{n}\log M_n \le R,
\]
where $\frac{1}{n}\log M_n$ is the coding rate per source letter (see, e.g., Han [11, Section 1.1]).

Definition 6.3 (Optimistic achievable fixed-length coding rate)
\[
R_f^*(\mathbf{V}) = \inf\{R \mid R \text{ is optimistically achievable}\}.
\]
Then, for any general source $\mathbf{V} = \{V^n\}_{n=1}^{\infty}$ we have:

Theorem 6.4 (Chen and Alajaji [14])
\[
R_f^*(\mathbf{V}) = \inf\left\{ R \,\middle|\, \liminf_{n\to\infty} \Pr\left\{ \frac{1}{n}\log\frac{1}{P_{V^n}(V^n)} \ge R \right\} = 0 \right\}. \tag{6.7}
\]
On the other hand, the next one concerns the optimistic channel capacity.
Definition 6.4 (Optimistic achievability for channel coding) Rate $R$ is optimistically achievable $\stackrel{\text{def}}{\Longleftrightarrow}$ there exists an $(n, M_n, \varepsilon_n)$-channel code satisfying
\[
\liminf_{n\to\infty} \varepsilon_n = 0 \quad \text{and} \quad \liminf_{n\to\infty} \frac{1}{n}\log M_n \ge R,
\]
where $\frac{1}{n}\log M_n$ is the coding rate per channel use (see, e.g., Han [11, Section 3.1]).

Definition 6.5 (Optimistic channel capacity)
\[
C^*(\mathbf{W}) = \sup\{R \mid R \text{ is optimistically achievable}\}.
\]
Then, with a general channel $\mathbf{W} = \{W^n\}_{n=1}^{\infty}$ we have

Theorem 6.5 (Chen and Alajaji [14])
\[
C^*(\mathbf{W}) = \sup_{\mathbf{X}} \sup\left\{ R \,\middle|\, \liminf_{n\to\infty} \Pr\left\{ \frac{1}{n}\log\frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)} \le R \right\} = 0 \right\}, \tag{6.8}
\]
where $Y^n$ is the output due to the input $\mathbf{X} = \{X^n\}_{n=1}^{\infty}$. ✷

Remark 6.1 It is not difficult to check that, in parallel with Theorem 6.4 and Theorem 6.5, Theorem 6.1 and Theorem 6.2 can be rewritten as
\[
R_f(\mathbf{V}) = \inf\left\{ R \,\middle|\, \lim_{n\to\infty} \Pr\left\{ \frac{1}{n}\log\frac{1}{P_{V^n}(V^n)} \ge R \right\} = 0 \right\}, \tag{6.9}
\]
\[
C(\mathbf{W}) = \sup_{\mathbf{X}} \sup\left\{ R \,\middle|\, \lim_{n\to\infty} \Pr\left\{ \frac{1}{n}\log\frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)} \le R \right\} = 0 \right\}, \tag{6.10}
\]
from which, together with Theorem 6.4 and Theorem 6.5, it immediately follows that
\[
C(\mathbf{W}) \le C^*(\mathbf{W}), \tag{6.11}
\]
\[
R_f^*(\mathbf{V}) \le R_f(\mathbf{V}). \tag{6.12}
\]
Now, we have:

Theorem 6.6 Let $\mathbf{W} = \{W^n\}_{n=1}^{\infty}$ be a general channel and $\mathbf{V} = \{V^n\}_{n=1}^{\infty}$ be a general source. Then, the following two statements hold:

1) If $R_f(\mathbf{V}) < C(\mathbf{W})$, then the source $\mathbf{V}$ is transmissible over the channel $\mathbf{W}$. In this case, we can separate the source coding and the channel coding.

2) If the source $\mathbf{V}$ is transmissible over the channel $\mathbf{W}$, then it must hold that
\[
R_f^*(\mathbf{V}) \le C(\mathbf{W}), \tag{6.13}
\]
\[
R_f(\mathbf{V}) \le C^*(\mathbf{W}). \tag{6.14}
\]
Remark 6.2 As was mentioned in Remark 4.4, we use Theorem 4.4 in order to prove (6.13) and (6.14), where inequality (6.14) was shown in a rather roundabout manner by Vembu, Verdú and Steinberg [9] (invoking Domination 2′) in Remark 4.3 instead of Domination 2) in Theorem 4.4). ✷

Proof of Theorem 6.6.
1): Since $R_f(\mathbf{V}) = \overline{H}(\mathbf{V})$ and $C(\mathbf{W}) = \sup_{\mathbf{X}} \underline{I}(\mathbf{X};\mathbf{Y})$ by Theorem 6.1 and Theorem 6.2, the inequality $R_f(\mathbf{V}) < C(\mathbf{W})$ implies that condition 2) in Theorem 4.3 holds for $\mathbf{X} = \{X^n\}_{n=1}^{\infty}$ attaining the supremum $\sup_{\mathbf{X}} \underline{I}(\mathbf{X};\mathbf{Y})$, with, for example, $c_n = \frac{1}{2}(R_f(\mathbf{V}) + C(\mathbf{W}))$. Therefore, the source $\mathbf{V}$ is transmissible over the channel $\mathbf{W}$.

2): If the source $\mathbf{V}$ is transmissible over the channel $\mathbf{W}$, then condition 2) in Theorem 4.4 holds with some $\{c_n\}_{n=1}^{\infty}$, i.e.,
\[
\lim_{n\to\infty} \Pr\left\{ \frac{1}{n}\log\frac{1}{P_{V^n}(V^n)} \ge c_n \right\} = 0, \tag{6.15}
\]
\[
\lim_{n\to\infty} \Pr\left\{ \frac{1}{n}\log\frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)} \le c_n - \gamma_n \right\} = 0. \tag{6.16}
\]
Since $\lim_{n\to\infty}\gamma_n = 0$, these two conditions with any small constant $\delta > 0$ lead us to the following formulas:
\[
\liminf_{n\to\infty} \Pr\left\{ \frac{1}{n}\log\frac{1}{P_{V^n}(V^n)} \ge \liminf_{n\to\infty} c_n + \delta \right\} = 0, \tag{6.17}
\]
\[
\lim_{n\to\infty} \Pr\left\{ \frac{1}{n}\log\frac{1}{P_{V^n}(V^n)} \ge \limsup_{n\to\infty} c_n + \delta \right\} = 0, \tag{6.18}
\]
\[
\lim_{n\to\infty} \Pr\left\{ \frac{1}{n}\log\frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)} \le \liminf_{n\to\infty} c_n - \delta \right\} = 0, \tag{6.19}
\]
\[
\liminf_{n\to\infty} \Pr\left\{ \frac{1}{n}\log\frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)} \le \limsup_{n\to\infty} c_n - \delta \right\} = 0. \tag{6.20}
\]
Then, Theorem 6.4 and (6.17) imply that $R_f^*(\mathbf{V}) \le \liminf_{n\to\infty} c_n$, whereas (6.19) implies that $\underline{I}(\mathbf{X};\mathbf{Y}) \ge \liminf_{n\to\infty} c_n$. Therefore, by Theorem 6.2 we have
\[
R_f^*(\mathbf{V}) \le \liminf_{n\to\infty} c_n \le \underline{I}(\mathbf{X};\mathbf{Y}) \le \sup_{\mathbf{X}} \underline{I}(\mathbf{X};\mathbf{Y}) = C(\mathbf{W}).
\]
On the other hand, (6.18) implies that $\overline{H}(\mathbf{V}) \le \limsup_{n\to\infty} c_n$. Furthermore, (6.20) together with Theorem 6.5 gives us
\[
\overline{H}(\mathbf{V}) \le \limsup_{n\to\infty} c_n \le C^*(\mathbf{W}).
\]
Finally, note that $R_f(\mathbf{V}) = \overline{H}(\mathbf{V})$ by Theorem 6.1. ✷

We are now interested in the problem of what conditions are needed to attain the equalities $R_f^*(\mathbf{V}) = R_f(\mathbf{V})$ and/or $C(\mathbf{W}) = C^*(\mathbf{W})$ in Theorem 6.6 and so on.
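To see how the pessimistic and optimistic rates can actually differ (a hypothetical toy construction for illustration only, not an example from the paper; we write $R_f^*$ for the optimistic rate of Definition 6.3), let the source statistics themselves oscillate with the block length: say $V^n$ is i.i.d. Bernoulli(0.5) for even $n$ and i.i.d. Bernoulli(0.11) for odd $n$. The entropy spectrum then sits near 1 bit along even $n$ and near 0.5 bit along odd $n$, so for a rate $R$ between the two levels the probability appearing in (6.9) oscillates, its lim failing while its liminf in (6.7) vanishes:

```python
import numpy as np

rng = np.random.default_rng(2)

def spectrum(n, trials=3000):
    """Samples of (1/n) log2 1/P_{V^n}(V^n) for a hypothetical source that is
    i.i.d. Bernoulli(0.5) when n is even and i.i.d. Bernoulli(0.11) when n is odd."""
    p = 0.5 if n % 2 == 0 else 0.11
    v = rng.random((trials, n)) < p
    return np.where(v, -np.log2(p), -np.log2(1 - p)).mean(axis=1)

R = 0.75  # a rate strictly between the two entropy levels 0.5 and 1
pr_even = np.mean(spectrum(2000) >= R)  # ~1 along even block lengths
pr_odd = np.mean(spectrum(2001) >= R)   # ~0 along odd block lengths
print(pr_even, pr_odd)
# Pr{(1/n) log 1/P(V^n) >= R} oscillates between ~1 and ~0, so the lim in (6.9)
# fails at R = 0.75 while the liminf in (6.7) is 0: here R_f^* <= 0.75 < R_f = 1.
```

Such a source violates the semi-strong converse property introduced next, which is exactly the condition closing the gap in (6.12).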
To see this, we need the following four definitions:

Definition 6.6 A general source $\mathbf{V} = \{V^n\}_{n=1}^{\infty}$ is said to satisfy the strong converse property if $\overline{H}(\mathbf{V}) = \underline{H}(\mathbf{V})$ holds (as for the operational meaning, refer to Han [11]), where
\[
\underline{H}(\mathbf{V}) = \text{p-}\liminf_{n\to\infty} \frac{1}{n}\log\frac{1}{P_{V^n}(V^n)}.
\]
Definition 6.7 A general channel $\mathbf{W} = \{W^n\}_{n=1}^{\infty}$ is said to satisfy the strong converse property if
\[
\sup_{\mathbf{X}} \underline{I}(\mathbf{X};\mathbf{Y}) = \sup_{\mathbf{X}} \overline{I}(\mathbf{X};\mathbf{Y}) \tag{6.21}
\]
holds (as for the operational meaning, refer to Han [11], Verdú and Han [8]), where
\[
\overline{I}(\mathbf{X};\mathbf{Y}) = \text{p-}\limsup_{n\to\infty} \frac{1}{n}\log\frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)}.
\]
Definition 6.8 A general source $\mathbf{V} = \{V^n\}_{n=1}^{\infty}$ is said to satisfy the semi-strong converse property if for all divergent subsequences $\{n_i\}_{i=1}^{\infty}$ of positive integers such that $n_1 < n_2 < \cdots \to \infty$ it holds that
\[
\text{p-}\limsup_{i\to\infty} \frac{1}{n_i}\log\frac{1}{P_{V^{n_i}}(V^{n_i})} = \overline{H}(\mathbf{V}). \tag{6.22}
\]
Definition 6.9 A general channel $\mathbf{W} = \{W^n\}_{n=1}^{\infty}$ is said to satisfy the semi-strong converse property if for all divergent subsequences $\{n_i\}_{i=1}^{\infty}$ of positive integers such that $n_1 < n_2 < \cdots \to \infty$ it holds that
\[
\text{p-}\liminf_{i\to\infty} \frac{1}{n_i}\log\frac{W^{n_i}(Y^{n_i}|X^{n_i})}{P_{Y^{n_i}}(Y^{n_i})} \le \sup_{\mathbf{X}} \underline{I}(\mathbf{X};\mathbf{Y}), \tag{6.23}
\]
where $Y^n$ is the channel output via $W^n$ due to the channel input $X^n$. ✷

With these definitions, we have the following lemmas:

Lemma 6.1
1) The information-stability of a source $\mathbf{V}$ (resp. a channel $\mathbf{W}$) with the limit implies the strong converse property of $\mathbf{V}$ (resp. $\mathbf{W}$).
2) The strong converse property of a source $\mathbf{V}$ (resp. a channel $\mathbf{W}$) implies the semi-strong converse property of $\mathbf{V}$ (resp. $\mathbf{W}$). ✷

Lemma 6.2
1) A general source $\mathbf{V}$ satisfies the semi-strong converse property if and only if
\[
R_f^*(\mathbf{V}) = R_f(\mathbf{V}). \tag{6.24}
\]
2) A general channel $\mathbf{W}$ satisfies the semi-strong converse property if and only if
\[
C(\mathbf{W}) = C^*(\mathbf{W}). \tag{6.25}
\]
Proof: It is obvious in view of Theorem 6.4, Theorem 6.5 and Remark 6.1.
✷

Remark 6.3 An operational equivalent of the notion of semi-strong converse property is found in Vembu, Verdú and Steinberg [9]. Originally, Csiszár and Körner [12] posed two operational standpoints in source coding and channel coding, i.e., the pessimistic standpoint and the optimistic standpoint. In their terminology, Lemma 6.2 states that, for source coding, the semi-strong converse property is equivalent to the statement that both the pessimistic standpoint and the optimistic standpoint result in the same infimum of all achievable fixed-length source coding rates; similarly, for channel coding, the semi-strong converse property is equivalent to the claim that both the pessimistic standpoint and the optimistic standpoint result in the same supremum of all achievable channel coding rates. ✷

Thus, Theorem 6.6 together with Lemma 6.2 immediately yields the following stronger separation theorem of the traditional type:

Theorem 6.7 Let either a general source $\mathbf{V} = \{V^n\}_{n=1}^{\infty}$ or a general channel $\mathbf{W} = \{W^n\}_{n=1}^{\infty}$ satisfy the semi-strong converse property. Then, the following two statements hold:

1) If $R_f(\mathbf{V}) < C(\mathbf{W})$, then the source $\mathbf{V}$ is transmissible over the channel $\mathbf{W}$. In this case, we can separate the source coding and the channel coding.

2) If the source $\mathbf{V}$ is transmissible over the channel $\mathbf{W}$, then it must hold that $R_f(\mathbf{V}) \le C(\mathbf{W})$. ✷

Example 6.1 Theorem 6.3 is an immediate consequence of Theorem 6.7 together with Lemma 6.1. ✷

Example 6.2 Let us consider two different stationary memoryless sources $\mathbf{V}_1 = \{V_1^n\}_{n=1}^{\infty}$, $\mathbf{V}_2 = \{V_2^n\}_{n=1}^{\infty}$ with countably infinite source alphabet $\mathcal{V}$, and define the mixed source $\mathbf{V} = \{V^n\}_{n=1}^{\infty}$ by
\[
P_{V^n}(\mathbf{v}) = \alpha_1 P_{V_1^n}(\mathbf{v}) + \alpha_2 P_{V_2^n}(\mathbf{v}) \quad (\mathbf{v} \in \mathcal{V}^n),
\]
where $\alpha_1, \alpha_2$ are positive constants such that $\alpha_1 + \alpha_2 = 1$.
Then, this mixed source $\mathbf{V} = \{V^n\}_{n=1}^{\infty}$ satisfies the semi-strong converse property but neither the strong converse property nor the information-stability. Similarly, let us consider two different stationary memoryless channels $\mathbf{W}_1 = \{W_1^n\}_{n=1}^{\infty}$, $\mathbf{W}_2 = \{W_2^n\}_{n=1}^{\infty}$ with arbitrary abstract input and output alphabets $\mathcal{X}$, $\mathcal{Y}$, and define the mixed channel $\mathbf{W} = \{W^n\}_{n=1}^{\infty}$ by
\[
W^n(\mathbf{y}|\mathbf{x}) = \alpha_1 W_1^n(\mathbf{y}|\mathbf{x}) + \alpha_2 W_2^n(\mathbf{y}|\mathbf{x}) \quad (\mathbf{x} \in \mathcal{X}^n,\ \mathbf{y} \in \mathcal{Y}^n).
\]
Then, this mixed channel $\mathbf{W} = \{W^n\}_{n=1}^{\infty}$ satisfies the semi-strong converse property but neither the strong converse property nor the information-stability. Thus, in these mixed cases the separation theorem holds. ✷

References

[1] C. E. Shannon, "A mathematical theory of communication," Bell System Technical Journal, vol. 27, pp. 379-423, 623-656, 1948.
[2] A. Feinstein, "A new basic theorem of information theory," IRE Trans. PGIT, vol. 4, pp. 2-22, 1954.
[3] R. B. Ash, Information Theory, Interscience Publishers, New York, 1965.
[4] R. L. Dobrushin, "A general formulation of the fundamental Shannon theorem in information theory," Uspehi Mat. Acad. Nauk SSSR, vol. 40, pp. 3-104, 1959; translation in Transactions of the American Mathematical Society, Series 2, vol. 33, pp. 323-438, 1963.
[5] M. S. Pinsker, Information and Information Stability of Random Variables and Processes, Holden-Day, San Francisco, 1964.
[6] G. D. Hu, "On Shannon theorem and its converse for sequence of communication schemes in the case of abstract random variables," in Trans. 3rd Prague Conference on Information Theory, Statistical Decision Functions, Random Processes, Czechoslovak Academy of Sciences, Prague, pp. 285-333, 1964.
[7] T. S. Han and S. Verdú, "Approximation theory of output statistics," IEEE Transactions on Information Theory, vol. IT-39, no. 3, pp. 752-772, 1993.
[8] S.
Verdú and T. S. Han, "A general formula for channel capacity," IEEE Transactions on Information Theory, vol. IT-40, no. 4, pp. 1147-1157, 1994.
[9] S. Vembu, S. Verdú and Y. Steinberg, "The source-channel separation theorem revisited," IEEE Transactions on Information Theory, vol. IT-41, no. 1, pp. 44-54, 1995.
[10] S. Verdú and T. S. Han, "The role of the asymptotic equipartition property in noiseless source coding," IEEE Transactions on Information Theory, vol. IT-43, no. 3, pp. 847-857, 1997.
[11] T. S. Han, Information-Spectrum Methods in Information Theory, Springer-Verlag, New York, 2003.
[12] I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memoryless Systems, Academic Press, New York, 1981.
[13] T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley, New York, 1991.
[14] P. N. Chen and F. Alajaji, "Optimistic Shannon coding theorems for arbitrary single-user systems," IEEE Transactions on Information Theory, vol. IT-45, pp. 2623-2629, 1999.
