On Directed Information and Gambling
Authors: Haim H. Permuter, Young-Han Kim, Tsachy Weissman
Haim H. Permuter (Stanford University, Stanford, CA, USA; haim1@stanford.edu)
Young-Han Kim (University of California, San Diego, La Jolla, CA, USA; yhk@ucsd.edu)
Tsachy Weissman (Stanford University / Technion, Stanford, CA, USA / Haifa, Israel; tsachy@stanford.edu)

Abstract—We study the problem of gambling in horse races with causal side information and show that Massey's directed information characterizes the increment in the maximum achievable capital growth rate due to the availability of side information. This result gives a natural interpretation of directed information I(Y^n → X^n) as the amount of information that Y^n causally provides about X^n. Extensions to stock market portfolio strategies and data compression with causal side information are also discussed.

I. INTRODUCTION

Mutual information arises as the canonical answer to a variety of problems. Most notably, Shannon [1] showed that the capacity C, the maximum data rate for reliable communication over a discrete memoryless channel p(y|x) with input X and output Y, is given by

  C = \max_{p(x)} I(X; Y),    (1)

which leads naturally to the operational interpretation of mutual information I(X;Y) = H(X) − H(X|Y) as the amount of uncertainty about X that can be reduced by the observation Y, or equivalently, the amount of information Y can provide about X. Indeed, mutual information I(X;Y) plays the central role in Shannon's random coding argument, because the probability that independently drawn X^n and Y^n sequences "look" as if they were drawn jointly decays exponentially with exponent I(X;Y). Shannon also proved a dual result [2] showing that the minimum compression rate R to satisfy a certain fidelity criterion D between the source X and its reconstruction \hat{X} is given by

  R(D) = \min_{p(\hat{x}|x)} I(X; \hat{X}).
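As a small illustration of the capacity formula (1) (ours, not part of the paper), the maximization over input distributions can be carried out numerically for a binary symmetric channel with crossover probability p; the result matches the closed form C = 1 − h(p), attained by the uniform input:

```python
import math

def h(p):
    """Binary entropy in bits; h(0) = h(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def mutual_information_bsc(a, p):
    """I(X;Y) = H(Y) - H(Y|X) for input X ~ Bernoulli(a) through a BSC(p)."""
    py1 = a * (1 - p) + (1 - a) * p   # Pr(Y = 1)
    return h(py1) - h(p)

p = 0.1
# Grid search over input distributions approximates C = max_{p(x)} I(X;Y).
capacity = max(mutual_information_bsc(a / 1000, p) for a in range(1001))
closed_form = 1 - h(p)
print(capacity, closed_form)   # both ≈ 0.531 bits per channel use
```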
In another duality result (Lagrange duality this time) to (1), Gallager [3] proved the minimax redundancy theorem, connecting the redundancy of the universal lossless source code to the capacity of the channel whose conditional distribution is described by the set of possible source distributions.

Later on, it was shown that mutual information also plays an important role in problems that are not necessarily related to describing sources or transferring information through channels. Perhaps the most lucrative example is the use of mutual information in gambling. Kelly showed in [4] that if each horse race outcome can be represented as an independent and identically distributed (i.i.d.) copy of a random variable X, and the gambler has some side information Y relevant to the outcome of the race, then under some conditions on the odds, the mutual information I(X;Y) captures the difference between the growth rates of the optimal gambler's wealth with and without side information Y. Thus, Kelly's result gives an interpretation of mutual information I(X;Y) as the value of side information Y for the horse race X.

In order to tackle problems arising in information systems with causally dependent components, Massey [5] introduced the notion of directed information as

  I(X^n \to Y^n) \triangleq \sum_{i=1}^n I(X^i; Y_i | Y^{i-1}),

and showed that the maximum directed information upper bounds the capacity of channels with feedback. Subsequently, it was shown that Massey's directed information and its variants indeed characterize the capacity of feedback and two-way channels [6]–[13] and the rate distortion function with feedforward [14].

The main contribution of this paper is showing that directed information I(Y^n → X^n) has a natural interpretation in gambling as the difference in growth rates due to causal side information.
As a special case, if the horse race outcome and the corresponding side information sequences are i.i.d., then the (normalized) directed information becomes a single-letter mutual information I(X;Y), and the result coincides with Kelly's.

The paper is organized as follows. We describe the notation of directed information and causal conditioning in Section II. In Section III, we formulate the horse-race gambling problem, in which side information is revealed causally to the gambler. We present the main result in Section IV and an analytically solved example in Section V. Finally, Section VI concludes the paper and states two possible extensions of this work, to the stock market and to data compression with causal side information.

II. DIRECTED INFORMATION AND CAUSAL CONDITIONING

Throughout this paper, we use the causal conditioning notation (·||·) developed by Kramer [6]. We denote by p(x^n || y^{n-d}) the probability mass function (pmf) of X^n = (X_1, ..., X_n) causally conditioned on Y^{n-d}, for some integer d ≥ 0, which is defined as

  p(x^n || y^{n-d}) \triangleq \prod_{i=1}^n p(x_i | x^{i-1}, y^{i-d}).

(By convention, if i − d ≤ 0, then y^{i-d} is set to null.) In particular, we use extensively the cases d = 0, 1:

  p(x^n || y^n) \triangleq \prod_{i=1}^n p(x_i | x^{i-1}, y^i),
  p(x^n || y^{n-1}) \triangleq \prod_{i=1}^n p(x_i | x^{i-1}, y^{i-1}).

Using the chain rule, we can easily verify that

  p(x^n, y^n) = p(x^n || y^n) \, p(y^n || x^{n-1}).

The causally conditional entropy H(X^n || Y^n) is defined as

  H(X^n || Y^n) \triangleq E[-\log p(X^n || Y^n)] = \sum_{i=1}^n H(X_i | X^{i-1}, Y^i).
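The identities above can be checked numerically. The following sketch (ours, not from the paper) draws a random joint pmf on binary (X_1, X_2, Y_1, Y_2), forms the causal conditioning pmfs for n = 2, and verifies both the factorization p(x^n, y^n) = p(x^n||y^n) p(y^n||x^{n-1}) and the term-by-term decomposition of H(X^n||Y^n):

```python
import itertools, math, random

random.seed(0)
atoms = list(itertools.product([0, 1], repeat=4))   # (x1, x2, y1, y2)
weights = [random.random() for _ in atoms]
total = sum(weights)
joint = {a: w / total for a, w in zip(atoms, weights)}

def marg(idx):
    """Marginal pmf of the coordinates in idx (0: x1, 1: x2, 2: y1, 3: y2)."""
    out = {}
    for a, pr in joint.items():
        key = tuple(a[i] for i in idx)
        out[key] = out.get(key, 0.0) + pr
    return out

def cond(target, given, a):
    """p(a[target] | a[given]) under the joint pmf."""
    num = marg(target + given)[tuple(a[i] for i in target + given)]
    return num / marg(given)[tuple(a[i] for i in given)]

# n = 2: p(x^2 || y^2) = p(x1|y1) p(x2|x1,y1,y2) and p(y^2 || x^1) = p(y1) p(y2|y1,x1).
for a in atoms:
    p_x_causal = cond((0,), (2,), a) * cond((1,), (0, 2, 3), a)
    p_y_causal = cond((2,), (), a) * cond((3,), (2, 0), a)
    assert abs(joint[a] - p_x_causal * p_y_causal) < 1e-12

# H(X^2 || Y^2) = E[-log p(X^2 || Y^2)] = H(X1 | Y1) + H(X2 | X1, Y1, Y2).
H_causal = -sum(pr * math.log2(cond((0,), (2,), a) * cond((1,), (0, 2, 3), a))
                for a, pr in joint.items())

def centropy(target, given):
    """Conditional entropy in bits, from the same conditionals."""
    return -sum(pr * math.log2(cond(target, given, a)) for a, pr in joint.items())

assert abs(H_causal - (centropy((0,), (2,)) + centropy((1,), (0, 2, 3)))) < 1e-12
print(round(H_causal, 4))
```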
Under this notation, directed information can be written as

  I(Y^n \to X^n) = \sum_{i=1}^n I(Y^i; X_i | X^{i-1}) = H(X^n) - H(X^n || Y^n),

which hints at, in a rough analogy to mutual information, a possible interpretation of directed information I(Y^n → X^n) as the amount of information that causally available side information Y^n can provide about X^n. Note that the channel capacity results involve the term I(X^n → Y^n), which measures the information in the forward link X^n → Y^n. In contrast, in gambling the gain in growth rate is due to the side information (the backward link), and therefore the expression I(Y^n → X^n) appears.

III. GAMBLING IN HORSE RACES WITH CAUSAL SIDE INFORMATION

Suppose that there are m racing horses in an infinite sequence of horse races, and let X_i ∈ \mathcal{X} \triangleq {1, 2, ..., m}, i = 1, 2, ..., denote the horse that wins at time i. Before betting in the i-th horse race, the gambler knows some side information Y_i ∈ \mathcal{Y}. We assume that the gambler invests all his capital in the horse race as a function of the information that he knows at time i, i.e., the previous horse race outcomes X^{i-1} and the side information Y^i up to time i. Let b(x_i | x^{i-1}, y^i) be the proportion of wealth that the gambler bets on horse x_i given X^{i-1} = x^{i-1} and Y^i = y^i. The betting scheme should satisfy b(x_i | x^{i-1}, y^i) ≥ 0 (no short selling) and \sum_{x_i} b(x_i | x^{i-1}, y^i) = 1 for any history x^{i-1}, y^i. Let o(x_i | x^{i-1}) denote the odds of a horse x_i given the previous outcomes x^{i-1}, which is the amount of capital that the gambler receives for each unit of capital invested in the horse. We denote by S(x^n || y^n) the gambler's wealth after n races, where the race outcomes were x^n and the side information that was causally available was y^n.
The growth, denoted by W(X^n || Y^n), is defined as the expected logarithm (base 2) of the gambler's wealth, i.e.,

  W(X^n || Y^n) \triangleq E[\log S(X^n || Y^n)].    (2)

Finally, the growth rate (1/n) W(X^n || Y^n) is defined as the normalized growth. Here is a summary of the notation:

• X_i is the outcome of the horse race at time i.
• Y_i is the side information at time i.
• o(X_i | X^{i-1}) is the payoff at time i for horse X_i given that the horses X^{i-1} won the previous races.
• b(X_i | X^{i-1}, Y^i) is the fraction of the gambler's wealth invested in horse X_i at time i given that the outcomes of the previous races are X^{i-1} and the side information available at time i is Y^i.
• S(X^n || Y^n) is the gambler's wealth after n races when the outcomes of the races are X^n and the side information Y^n is causally available.
• (1/n) W(X^n || Y^n) is the growth rate.

Without loss of generality, we assume that the gambler's initial capital is 1; therefore S_0 = 1.

IV. MAIN RESULTS

In Subsection IV-A, we assume that the gambler invests all his money in the horse race, while in Subsection IV-B, we allow the gambler to invest only part of the money. Using Kelly's result, it is shown in Subsection IV-B that if the odds are fair with respect to some distribution, then the gambler should invest all his money in the race.

A. Investing all the money in the horse race

We assume that at any time n the gambler invests all his capital, and therefore

  S(X^n || Y^n) = b(X_n | X^{n-1}, Y^n) \, o(X_n | X^{n-1}) \, S(X^{n-1} || Y^{n-1}).

This also implies that

  S(X^n || Y^n) = \prod_{i=1}^n b(X_i | X^{i-1}, Y^i) \, o(X_i | X^{i-1}).

The following theorem characterizes the optimal betting strategy and the corresponding growth of wealth.
Theorem 1: For any finite horizon n, the maximum growth rate is achieved when the gambler invests money proportionally to the causal conditioning distribution, i.e.,

  b^*(x_i | x^{i-1}, y^i) = p(x_i | x^{i-1}, y^i), \quad \forall x^i, y^i, \; i \le n,    (3)

and the growth is

  W^*(X^n || Y^n) = E[\log o(X^n)] - H(X^n || Y^n),

where o(X^n) \triangleq \prod_{i=1}^n o(X_i | X^{i-1}).

Note that the sequence {p(x_i | x^{i-1}, y^i)}_{i=1}^n uniquely determines p(x^n || y^n). Also, for all pairs (x^n, y^n) such that p(x^n || y^n) > 0, the sequence {p(x_i | x^{i-1}, y^i)}_{i=1}^n is determined uniquely by p(x^n || y^n), simply by the identity

  p(x_i | x^{i-1}, y^i) = \frac{p(x^i || y^i)}{p(x^{i-1} || y^{i-1})}.

A similar argument applies to {b^*(x_i | x^{i-1}, y^i)}_{i=1}^n and b^*(x^n || y^n), and therefore (3) is equivalent to

  b^*(x^n || y^n) = p(x^n || y^n), \quad \forall x^n \in \mathcal{X}^n, \; y^n \in \mathcal{Y}^n.

Proof of Theorem 1: We have

  W^*(X^n || Y^n) = \max_{b(x^n || y^n)} E[\log b(X^n || Y^n) \, o(X^n)]
                  = \max_{b(x^n || y^n)} E[\log b(X^n || Y^n)] + E[\log o(X^n)]
                  = -H(X^n || Y^n) + E[\log o(X^n)],

where the last equality is achieved by choosing b(x^n || y^n) = p(x^n || y^n), and is justified by the following upper bound:

  E[\log b(X^n || Y^n)]
    = \sum_{x^n, y^n} p(x^n, y^n) \left[ \log p(x^n || y^n) + \log \frac{b(x^n || y^n)}{p(x^n || y^n)} \right]
    = -H(X^n || Y^n) + \sum_{x^n, y^n} p(x^n, y^n) \log \frac{b(x^n || y^n)}{p(x^n || y^n)}
    \overset{(a)}{\le} -H(X^n || Y^n) + \log \sum_{x^n, y^n} p(x^n, y^n) \frac{b(x^n || y^n)}{p(x^n || y^n)}
    \overset{(b)}{\le} -H(X^n || Y^n) + \log \sum_{x^n, y^n} p(y^n || x^{n-1}) \, b(x^n || y^n)
    = -H(X^n || Y^n),    (4)

where (a) follows from Jensen's inequality, and (b) from the chain rule p(x^n, y^n) = p(x^n || y^n) p(y^n || x^{n-1}) and from extending the sum to all pairs (x^n, y^n); the final equality uses \sum_{x^n, y^n} p(y^n || x^{n-1}) b(x^n || y^n) = 1. All summations in (4) are over the arguments (x^n, y^n) for which p(x^n, y^n) > 0.
This ensures that p(x^n || y^n) > 0, and therefore we can multiply and divide by p(x^n || y^n) in the first step of (4).

In the case that the odds are fair and uniform, i.e., o(x_i | x^{i-1}) = |\mathcal{X}| (a unit bet on the winning horse returns |\mathcal{X}| units), we have

  \frac{1}{n} W^*(X^n || Y^n) = \log |\mathcal{X}| - \frac{1}{n} H(X^n || Y^n).

Thus the sum of the growth rate (1/n) W(X^n || Y^n) and the entropy rate (1/n) H(X^n || Y^n) of the horse race process conditioned causally on the side information is constant, and one can see a duality between H(X^n || Y^n) and W^*(X^n || Y^n); cf. [15, Th. 6.1.3].

Let us denote by ΔW the increase in the growth rate due to causal side information, i.e.,

  \Delta W = \frac{1}{n} W^*(X^n || Y^n) - \frac{1}{n} W^*(X^n).    (5)

Thus ΔW characterizes the value of the side information Y^n. Theorem 1 leads to the following corollary, which gives a new operational meaning to Massey's directed information.

Corollary 1: The increase in growth rate due to causal side information Y^n for horse races X^n is

  \Delta W = \frac{1}{n} I(Y^n \to X^n).    (6)

Proof: From Theorem 1, we have

  W^*(X^n || Y^n) - W^*(X^n) = -H(X^n || Y^n) + H(X^n) = I(Y^n \to X^n).

B. Investing only part of the money

In this subsection we consider the case where the gambler does not necessarily invest all his money. Let b_0(x^{i-1}, y^i) be the portion of money that the gambler does not invest at time i, given that the previous race results were x^{i-1} and the side information is y^i. In this setting, the wealth is given by

  S(X^n || Y^n) = \prod_{i=1}^n \left[ b_0(X^{i-1}, Y^i) + b(X_i | X^{i-1}, Y^i) \, o(X_i | X^{i-1}) \right],

and the growth W(X^n || Y^n) is defined as before in (2).
The term W(X^n || Y^n) obeys a chain rule similar to that in the definition of the causally conditional entropy H(X^n || Y^n), i.e.,

  W(X^n || Y^n) = \sum_{i=1}^n W(X_i | X^{i-1}, Y^i),

where

  W(X_i | X^{i-1}, Y^i) \triangleq E\left[ \log\left( b_0(X^{i-1}, Y^i) + b(X_i | X^{i-1}, Y^i) \, o(X_i | X^{i-1}) \right) \right].

Note that for any given history (x^{i-1}, y^i) ∈ \mathcal{X}^{i-1} × \mathcal{Y}^i, the betting scheme {b_0(x^{i-1}, y^i), b(x_i | x^{i-1}, y^i)} influences only W(X_i | X^{i-1}, Y^i), so that we have

  \max_{\{b_0(x^{i-1}, y^i), \, b(x_i | x^{i-1}, y^i)\}_{i=1}^n} W(X^n || Y^n)
    = \sum_{i=1}^n \max_{b_0(x^{i-1}, y^i), \, b(x_i | x^{i-1}, y^i)} W(X_i | X^{i-1}, Y^i)
    = \sum_{i=1}^n \sum_{x^{i-1}, y^i} p(x^{i-1}, y^i) \max_{b_0(x^{i-1}, y^i), \, b(x_i | x^{i-1}, y^i)} W(X_i | x^{i-1}, y^i).

The optimization problem in the last equation is equivalent to the problem of finding the optimal betting strategy in the memoryless case, where the winning-horse distribution p(x) is p(x) = Pr(X_i = x | x^{i-1}, y^i), the odds o(x) are o(x) = o(X_i = x | x^{i-1}), and the betting strategy (b_0, b(x)) is (b_0(x^{i-1}, y^i), b(X_i = x | x^{i-1}, y^i)), respectively. Hence, the optimization max W(X_i | x^{i-1}, y^i) is equivalent to the following convex problem:

  maximize    \sum_x p(x) \log(b_0 + b(x) o(x))
  subject to  b_0 + \sum_x b(x) = 1,
              b_0 \ge 0, \; b(x) \ge 0, \; \forall x \in \mathcal{X}.

The solution to this optimization problem was given by Kelly [4]. If the odds are super-fair, namely \sum_x 1/o(x) \le 1, then the gambler will invest all his wealth in the race rather than leave some as cash, since by betting b(x) = c/o(x), where c = 1/\sum_x (1/o(x)), the gambler's money is multiplied by c ≥ 1 regardless of the race outcome. Therefore, for this case, the solution is given by Theorem 1, where the gambler invests proportionally to the causal conditioning distribution p(x^n || y^n).
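Both regimes of this convex program can be checked numerically. The sketch below (ours; a brute-force grid search over the simplex, not Kelly's algorithm, and the distributions and odds are illustrative) solves the two-horse problem: with super-fair odds the optimum keeps no cash and bets b(x) = p(x), while with sub-fair odds it leaves part of the wealth as cash (b_0 > 0):

```python
import math

def best_bet(p, o, steps=400):
    """Grid-search maximizer of sum_x p[x] * log2(b0 + b[x]*o[x]) over the simplex.

    Brute-force stand-in for Kelly's algorithm; two horses only."""
    best_val, best_pt = float("-inf"), None
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            b0, b1, b2 = i / steps, j / steps, (steps - i - j) / steps
            w = (b0 + b1 * o[0], b0 + b2 * o[1])   # wealth multiplier per outcome
            if min(w) <= 0:
                continue
            val = p[0] * math.log2(w[0]) + p[1] * math.log2(w[1])
            if val > best_val:
                best_val, best_pt = val, (b0, b1, b2)
    return best_pt

p = (0.6, 0.4)
# Sub-fair odds (1/1.8 + 1/2.0 > 1): keep cash; the analytic optimum is (b0, b1, b2) = (0.9, 0.1, 0).
sub = best_bet(p, (1.8, 2.0))
# Super-fair odds (1/2.5 + 1/3.0 < 1): bet everything, proportionally to p (Theorem 1).
sup = best_bet(p, (2.5, 3.0))
print(sub, sup)   # → (0.9, 0.1, 0.0) and (0.0, 0.6, 0.4)
```

The sub-fair optimum can be confirmed by the KKT conditions of the stated program: at (0.9, 0.1, 0) the marginal payoffs of cash and of horse 1 are both exactly 1, while horse 2's is below 1, so it receives no bet.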
If the odds are sub-fair, i.e., \sum_x 1/o(x) > 1, then it can be optimal to bet only some of the money, namely b_0 > 0. The solution to this problem is given in terms of an algorithm in [4, p. 925].

V. AN EXAMPLE

Here we consider betting in a horse race where the winning horse can be represented as a Markov process and causal side information is available.

Example 1: Consider the horse race process depicted in Figure 1, where two horses are racing and the winning horse X_i behaves as a Markov process. A horse that won will win again with probability 1 − p and lose with probability p. At time zero, we assume that both horses have probability 1/2 of winning. The side information Y_i at time i is a noisy observation of the horse race outcome X_i: it equals X_i with probability 1 − q and differs from X_i with probability q.

For this example, the increase in growth rate due to side information as n goes to infinity is

  \Delta W = h(p * q) - h(q),

where h(·) denotes the binary entropy function, i.e., h(x) = −x \log x − (1 − x) \log(1 − x), and p * q denotes the parameter of the Bernoulli distribution that results from convolving two Bernoulli distributions with parameters p and q, i.e., p * q = (1 − p)q + (1 − q)p.

The increase in growth rate ΔW for this example can be obtained from first principles as follows:

  \Delta W = \lim_{n \to \infty} \frac{1}{n} I(Y^n \to X^n)
    = \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^n \left[ H(Y^i | X^{i-1}) - H(Y^i | X^i) \right]
    = \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^n \left[ H(Y^i | X^{i-1}) - H(Y_2^i | X_2^i) - H(Y_1 | X_1) \right]
    \overset{(a)}{=} \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^n \left[ H(Y^i | X^{i-1}) - H(Y^{i-1} | X^{i-1}) - H(Y_1 | X_1) \right]
    = \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^n \left[ H(Y_i | Y^{i-1}, X^{i-1}) - H(Y_1 | X_1) \right]
    \overset{(b)}{=} H(Y_1 | X_0) - H(Y_1 | X_1)
    = h(p * q) - h(q),    (7)

where steps (a) and (b) are due to the stationarity of the process (X_i, Y_i).
Alternatively, the sequence of equalities up to step (b) in (7) can be derived directly using

  \lim_{n \to \infty} \frac{1}{n} I(Y^n \to X^n) \overset{(a)}{=} \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^n I(Y_i; X_i^n | X^{i-1}, Y^{i-1}) \overset{(b)}{=} H(Y_1 | X_0) - H(Y_1 | X_1),    (8)

where (a) is the identity given in [11, eq. (9)] and (b) is due to the stationarity of the process.

Fig. 1. The setting of Example 1. The winning horse X_i is represented as a Markov process with two states: in state 1, horse 1 wins, and in state 2, horse 2 wins. The side information Y_i is a noisy observation of the winning horse X_i.

If the side information is known with some lookahead k ∈ {0, 1, ...}, that is, if the gambler knows Y^{i+k} at time i, then the increase in growth rate is given by

  \Delta W = \lim_{n \to \infty} \frac{1}{n} I(Y^{n+k} \to X^n) = H(Y_{k+1} | Y^k, X_0) - H(Y_1 | X_1),    (9)

where the last equality is due to the same arguments as in (8). Figure 2 shows the increase in growth rate ΔW due to side information as a function of the side information parameters (q, k). The left plot shows ΔW as a function of q, where p = 0.2 and there is no lookahead (k = 0). The right plot shows ΔW as a function of k, where p = 0.2 and q = 0.25.

If the entire side information sequence Y_1, Y_2, ... is known to the gambler ahead of time, then we should have mutual information rather than directed information, i.e.,

  \Delta W = \lim_{n \to \infty} \frac{1}{n} I(Y^n; X^n) = \lim_{n \to \infty} \frac{H(Y^n)}{n} - H(Y_1 | X_1),    (10)

and this coincides with the fact that for a stationary hidden Markov process {Y_1, Y_2, ...}, the sequence H(Y_{k+1} | Y^k, X_0) converges to the entropy rate of the process.
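The closed forms of Example 1 can be evaluated exactly by enumerating the chain. The sketch below (ours; function names are illustrative) uses p = 0.2 and q = 0.25 as in Figure 2 and computes ΔW = H(Y_{k+1}|Y^k, X_0) − H(Y_1|X_1) for small lookaheads k, recovering h(p∗q) − h(q) ≈ 0.123 bits at k = 0, with values nondecreasing in k as in the right plot:

```python
import itertools, math

p, q = 0.2, 0.25   # Markov flip probability and observation noise of Example 1

def h(z):
    """Binary entropy in bits."""
    return -z * math.log2(z) - (1 - z) * math.log2(1 - z)

def ent(pmf):
    return -sum(v * math.log2(v) for v in pmf.values() if v > 0)

def x0_and_side_info(k):
    """Exact joint pmf of (X0, Y1, ..., Y_{k+1}) for the two-horse Markov race."""
    pmf = {}
    for x in itertools.product([0, 1], repeat=k + 2):      # (x0, ..., x_{k+1})
        pr_x = 0.5
        for a, b in zip(x, x[1:]):
            pr_x *= (1 - p) if b == a else p               # Markov transition
        for y in itertools.product([0, 1], repeat=k + 1):  # (y1, ..., y_{k+1})
            pr = pr_x
            for xi, yi in zip(x[1:], y):
                pr *= (1 - q) if yi == xi else q           # noisy observation
            key = (x[0],) + y
            pmf[key] = pmf.get(key, 0.0) + pr
    return pmf

deltas = []
for k in range(4):
    pmf = x0_and_side_info(k)
    prefix = {}
    for key, v in pmf.items():                             # marginalize out Y_{k+1}
        prefix[key[:-1]] = prefix.get(key[:-1], 0.0) + v
    # Delta W = H(Y_{k+1} | Y^k, X0) - H(Y1 | X1), with H(Y1 | X1) = h(q).
    deltas.append(ent(pmf) - ent(prefix) - h(q))
print([round(d, 4) for d in deltas])

# k = 0 recovers the closed form h(p * q) - h(q).
conv = (1 - p) * q + (1 - q) * p
assert abs(deltas[0] - (h(conv) - h(q))) < 1e-12
```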
VI. CONCLUSION AND FURTHER EXTENSIONS

We have shown that directed information arises naturally in gambling as the gain in the maximum achievable capital growth due to the availability of causal side information. We now outline two extensions: stock market portfolio strategies and data compression in the presence of causal side information. Details are given in [16].

Fig. 2. Increase in the growth rate in Example 1 as a function of the side information parameters (q, k). The left plot shows the increase in growth rate ΔW as a function of q = Pr(X_i ≠ Y_i) with no lookahead. The right plot shows the increase in growth rate as a function of the lookahead k, where q = 0.25. The horse race outcome is assumed to be a first-order binary symmetric Markov process with parameter p = 0.2.

A. Stock market

Using notation similar to that in [15, Ch. 16], a stock market at time i is represented as a vector of stocks X_i = (X_{i1}, X_{i2}, ..., X_{im}), where m is the number of stocks and the price relative X_{ik} is the ratio of the price of stock k at the end of day i to its price at the beginning of day i. We assume that at time i there is side information Y_i that is known to the investor. A portfolio is an allocation of wealth across the stocks. A nonanticipating or causal portfolio strategy with causal side information at time i is denoted by b(x^{i-1}, y^i), and it satisfies \sum_{k=1}^m b_k(x^{i-1}, y^i) = 1 and b_k(x^{i-1}, y^i) ≥ 0 for all possible (x^{i-1}, y^i). We define S(x^n || y^n) as the wealth at the end of day n for a stock sequence x^n and causal side information y^n.
We can write

  S(x^n || y^n) = b^t(x^{n-1}, y^n) \, x_n \, S(x^{n-1} || y^{n-1}),

where (·)^t denotes the transpose of a vector. The goal is to maximize the growth W(X^n || Y^n) = E[\log S(X^n || Y^n)]. We also define

  W(X_n | X^{n-1}, Y^n) = E[\log(b^t(X^{n-1}, Y^n) X_n)].

From this definition, we can write the chain rule

  W(X^n || Y^n) = \sum_{i=1}^n W(X_i | X^{i-1}, Y^i).

The gambling in horse races with m horses studied in the previous section is a special case of investing in a stock market with m + 1 stocks. The first m stocks correspond to the m horses: at the end of the day, one of the stocks, say k ∈ {1, ..., m}, takes the value o(k) with probability p(k), and all other stocks become zero. The (m+1)-st stock is always one, and it allows the gambler to invest only part of the wealth in the horse race.

The developments in the previous section can be extended to characterize the increase in growth rate due to side information, where again directed information emerges as the key quantity, upper-bounding the value of causal side information; cf. [17]. Details will be given in [16].

B. Instantaneous compression with causal side information

Let X_1, X_2, ... be a source and Y_1, Y_2, ... its side information sequence. The source is to be losslessly encoded instantaneously, with causally available side information. More precisely, an instantaneous lossless source encoder with causal side information consists of a sequence of mappings {M_i}_{i≥1} such that each M_i : \mathcal{X}^i × \mathcal{Y}^i → {0, 1}^* has the property that for every x^{i-1} and y^i, M_i(x^{i-1} ·, y^i) is an instantaneous (prefix) code for X_i. An instantaneous lossless source encoder with causal side information operates sequentially, emitting the concatenated bit stream M_1(X_1, Y_1) M_2(X^2, Y^2) ···.
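As a quick numerical illustration of the rate savings in the simplest i.i.d. setting (our sketch; it uses ideal codelengths −log2 p(·) in place of an actual instantaneous code, which costs at most one extra bit per symbol), let X_i be fair bits and Y_i a noisy observation: the empirical per-symbol saving from using the side information concentrates around I(X;Y) = 1 − h(q), which is also (1/n) I(Y^n → X^n) here:

```python
import math, random

random.seed(0)
q = 0.2          # Y is X observed through a BSC(q); X ~ Bernoulli(1/2)
n = 200000

saving = 0.0
for _ in range(n):
    x = random.randint(0, 1)
    y = x if random.random() > q else 1 - x
    # Ideal codelengths: -log2 p(x) without side info, -log2 p(x|y) with it.
    l_plain = -math.log2(0.5)
    p_x_given_y = (1 - q) if x == y else q   # posterior under the uniform prior
    l_side = -math.log2(p_x_given_y)
    saving += l_plain - l_side
saving /= n

h_q = -q * math.log2(q) - (1 - q) * math.log2(1 - q)
print(round(saving, 3), round(1 - h_q, 3))   # both close to 0.278 bits/symbol
```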
The defining property that M_i(x^{i-1} ·, y^i) is an instantaneous code for every x^{i-1} and y^i is a necessary and sufficient condition for the existence of a decoder that can losslessly recover x_i based on y^i and the bit stream M_1(x_1, y_1) M_2(x^2, y^2) ··· as soon as it sees M_1(x_1, y_1) M_2(x^2, y^2) ··· M_i(x^i, y^i), for all sequence pairs (x_1, y_1), (x_2, y_2), ... and all i ≥ 1. Using natural extensions of standard arguments, we show in [16] that I(Y^n → X^n) is essentially (up to terms that are sublinear in n) the rate savings in optimal sequential lossless compression of X^n due to the causal availability of the side information.

REFERENCES

[1] C. E. Shannon, "A mathematical theory of communication," Bell Syst. Tech. J., vol. 27, pp. 379–423 and 623–656, 1948.
[2] C. E. Shannon, "Coding theorems for a discrete source with fidelity criterion," in Information and Decision Processes, R. E. Machol, Ed. McGraw-Hill, 1960, pp. 93–126.
[3] R. G. Gallager, "Source coding with side information and universal coding," Sept. 1976, unpublished manuscript.
[4] J. L. Kelly, "A new interpretation of information rate," Bell Syst. Tech. J., vol. 35, pp. 917–926, 1956.
[5] J. Massey, "Causality, feedback and directed information," in Proc. Int. Symp. Inf. Theory Applic. (ISITA-90), pp. 303–305, 1990.
[6] G. Kramer, "Directed information for channels with feedback," Ph.D. dissertation, Swiss Federal Institute of Technology (ETH) Zurich, 1998.
[7] S. Tatikonda, "Control under communication constraints," Ph.D. dissertation, Massachusetts Institute of Technology, Cambridge, MA, 2000.
[8] G. Kramer, "Capacity results for the discrete memoryless network," IEEE Trans. Inf. Theory, vol. IT-49, pp. 4–21, 2003.
[9] H. H. Permuter, T. Weissman, and A. J. Goldsmith, "Finite state channels with time-invariant deterministic feedback," Sept. 2006, submitted to IEEE Trans. Inf. Theory. Available at arxiv.org/pdf/cs.IT/0608070.
[10] S. C. Tatikonda and S. Mitter, "The capacity of channels with feedback," Sept. 2006, submitted to IEEE Trans. Inf. Theory. Available at arxiv.org/cs.IT/0609139.
[11] Y.-H. Kim, "A coding theorem for a class of stationary channels with feedback," Jan. 2007, submitted to IEEE Trans. Inf. Theory. Available at arxiv.org/cs.IT/0701041.
[12] H. H. Permuter and T. Weissman, "Capacity region of the finite-state multiple access channel with and without feedback," Aug. 2007, submitted to IEEE Trans. Inf. Theory. Available at arxiv.org/pdf/cs.IT/0608070.
[13] B. Shrader and H. H. Permuter, "On the compound finite state channel with feedback," in Proc. Int. Symp. Inf. Theory, Nice, France, 2007.
[14] R. Venkataramanan and S. S. Pradhan, "Source coding with feedforward: Rate-distortion theorems and error exponents for a general source," IEEE Trans. Inf. Theory, vol. IT-53, pp. 2154–2179, 2007.
[15] T. M. Cover and J. A. Thomas, Elements of Information Theory, 2nd ed. New York: Wiley, 2006.
[16] Y.-H. Kim, H. H. Permuter, and T. Weissman, "An interpretation of directed information in gambling, portfolio theory and data compression," Jan. 2007, in preparation.
[17] A. R. Barron and T. M. Cover, "A bound on the financial value of information," IEEE Trans. Inf. Theory, vol. IT-34, pp. 1097–1100, 1988.