An axiomatic characterization of a two-parameter extended relative entropy

Shigeru Furuichi$^{1*}$

$^{1}$Department of Computer Science and System Analysis, College of Humanities and Sciences, Nihon University, 3-25-40, Sakurajyousui, Setagaya-ku, Tokyo, 156-8550, Japan

Abstract. The uniqueness theorem for a two-parameter extended relative entropy is proven. This result extends our previous one, the uniqueness theorem for a one-parameter extended relative entropy, to a two-parameter case. In addition, the properties of a two-parameter extended relative entropy are studied.

Keywords: Tsallis relative entropy, two-parameter extended entropy, axiomatic characterization and uniqueness theorem

2000 Mathematics Subject Classification: 94A17, 62B10 and 46N55

1 Introduction

Shannon entropy [1] is one of the fundamental quantities in classical information theory and is uniquely determined by the Shannon-Khinchin axiom or the Faddeev axiom. One-parameter extensions of Shannon entropy have been studied by many researchers [2]. The Rényi entropy [3] and the Tsallis entropy [4] are famous examples. In the paper [5], the uniqueness theorem for the Tsallis entropy was proven. See also the paper [6] and the references therein for axiomatic characterizations of one-parameter extended entropies. A two-parameter family of entropies was introduced by Borges and Roditi in [7] by the use of the generalized Jackson derivative method. Recently, a two-parameter extended entropy, which has essentially the same form as the two-parameter family of entropies in [7], has been studied by several researchers [8, 9, 10], and the uniqueness theorem for a two-parameter extended entropy was proven in [10] by generalizing the Shannon-Khinchin axiom.
In this paper, we denote a two-parameter extended entropy by

$$S_{\alpha,\beta}(x_1, x_2, \cdots, x_n) = \sum_{j=1}^{n} \frac{x_j^{\alpha} - x_j^{\beta}}{\beta - \alpha}, \quad (\alpha \neq \beta)$$

for two real numbers $\alpha$ and $\beta$ such that $0 \leq \alpha \leq 1 \leq \beta$ or $0 \leq \beta \leq 1 \leq \alpha$. If we take $\alpha = 1$ or $\beta = 1$, then it recovers the Tsallis entropy defined by

$$S_q(x_1, x_2, \cdots, x_n) \equiv \sum_{j=1}^{n} \frac{x_j - x_j^{q}}{q - 1}, \quad (1 \neq q \geq 0).$$

The Tsallis entropy recovers Shannon entropy

$$S_1(X) \equiv -\sum_{j=1}^{n} x_j \log x_j$$

in the limit $q \to 1$.

$^{*}$E-mail: furuichi@chs.nihon-u.ac.jp

In this paper, we study an information measure (entropy) defined for two probability distributions. The relative entropy (Kullback-Leibler information or divergence) is defined for two probability distributions $X = \{x_1, \cdots, x_n\}$ and $Y = \{y_1, \cdots, y_n\}$:

$$D_1(X\|Y) \equiv \sum_{j=1}^{n} x_j (\log x_j - \log y_j).$$

Since Shannon entropy is defined for one probability distribution and can be reproduced from the relative entropy as $\log n - D_1(X\|U)$ for the uniform distribution $U = \{1/n, \cdots, 1/n\}$, the relative entropy can be regarded as a generalization of Shannon entropy. We note here that there are one-parameter extended relative entropies such as the Rényi relative entropy $D^R_q(X\|Y)$, the $\alpha$-divergence $D^{(\alpha)}(X\|Y)$ and the Tsallis relative entropy $D^T_q(X\|Y)$. These are defined by

$$D^R_q(X\|Y) \equiv \frac{1}{q-1} \log \sum_{j=1}^{n} x_j^{q} y_j^{1-q},$$

$$D^{(\alpha)}(X\|Y) \equiv \frac{4}{1-\alpha^2} \left( 1 - \sum_{j=1}^{n} x_j^{\frac{1-\alpha}{2}} y_j^{\frac{1+\alpha}{2}} \right),$$

$$D^T_q(X\|Y) \equiv \sum_{j=1}^{n} \frac{x_j - x_j^{q} y_j^{1-q}}{1-q},$$

for $q \neq 1$ and $\alpha \neq \pm 1$. These quantities recover the relative entropy in the limit $q \to 1$ or $\alpha \to \pm 1$. They are also essentially the same quantity in the sense that

$$D^{(q)}(X\|Y) = \frac{1}{q} D^T_q(X\|Y), \quad (q \neq 0, 1),$$

$$D^R_q(X\|Y) = \frac{\log\left\{ 1 + (q-1) D^T_q(X\|Y) \right\}}{q-1}, \quad (q \neq 1),$$

where we set $q = \frac{1-\alpha}{2}$ in $D^{(q)}(X\|Y)$.
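The two relations above can be checked numerically. The following is a minimal sketch in Python, not part of the original paper; the function names and the test distributions `X`, `Y` are illustrative assumptions, and all probabilities are taken strictly positive so that the logarithms are defined.

```python
import math

def tsallis_rel(x, y, q):
    """Tsallis relative entropy D^T_q(X||Y) for q != 1."""
    return sum(xj - xj**q * yj**(1 - q) for xj, yj in zip(x, y)) / (1 - q)

def renyi_rel(x, y, q):
    """Renyi relative entropy D^R_q(X||Y) for q != 1."""
    return math.log(sum(xj**q * yj**(1 - q) for xj, yj in zip(x, y))) / (q - 1)

def alpha_div(x, y, alpha):
    """alpha-divergence D^(alpha)(X||Y) for alpha != +1, -1."""
    s = sum(xj**((1 - alpha) / 2) * yj**((1 + alpha) / 2) for xj, yj in zip(x, y))
    return 4.0 / (1 - alpha**2) * (1 - s)

def kl(x, y):
    """Relative entropy D_1(X||Y)."""
    return sum(xj * (math.log(xj) - math.log(yj)) for xj, yj in zip(x, y))

# Illustrative distributions (assumed, not taken from the paper).
X = [0.5, 0.3, 0.2]
Y = [0.2, 0.5, 0.3]
alpha = 0.4
q = (1 - alpha) / 2  # the substitution q = (1 - alpha)/2 used in the text

# alpha-divergence versus (1/q) * Tsallis relative entropy
lhs_alpha = alpha_div(X, Y, alpha)
rhs_alpha = tsallis_rel(X, Y, q) / q

# Renyi relative entropy versus log{1 + (q-1) D^T_q} / (q-1)
lhs_renyi = renyi_rel(X, Y, q)
rhs_renyi = math.log(1 + (q - 1) * tsallis_rel(X, Y, q)) / (q - 1)

# Tsallis relative entropy close to q = 1 versus the relative entropy
near_kl = tsallis_rel(X, Y, 1 - 1e-6)

print(lhs_alpha, rhs_alpha)
print(lhs_renyi, rhs_renyi)
print(near_kl, kl(X, Y))
```

Each printed pair should agree up to floating-point error (the last pair up to a term of order $|1-q|$), which reflects that all three quantities are functions of the single sum $\sum_j x_j^q y_j^{1-q}$.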
Here, we note that the form $\sum_{j=1}^{n} x_j^{q} y_j^{1-q}$ appears in all of the one-parameter extended relative entropies. Therefore it suffices to study one of these quantities in order to study a one-parameter extension of the relative entropy. It is also notable that the Tsallis entropy can be rewritten in terms of the Tsallis relative entropy as a special case:

$$S_q(X) = \ln_q n - n^{1-q} D_q(X\|U)$$

for the uniform distribution $U = \{1/n, \cdots, 1/n\}$, where the $q$-logarithmic function is defined by

$$\ln_q(x) \equiv \frac{x^{1-q} - 1}{1-q}, \quad q > 0,\ q \neq 1,\ x > 0.$$

Thus the uniqueness theorem for the Tsallis relative entropy was proven in our previous paper [12]. In the present paper, as a further extension of our previous result, we give a two-parameter extended axiom for a function defined for any pair of probability distributions and prove the uniqueness theorem for a two-parameter extended relative entropy.

This paper is organized as follows. In Section 2, we review the uniqueness theorem for the relative entropy proven by A. Hobson, and the uniqueness theorem for a one-parameter extended relative entropy. In Section 3, we show our main theorem. In Section 4, we characterize the constant that appears in Section 3. In Section 5, we give properties of a two-parameter extended relative entropy.

2 Review of the uniqueness theorem for one-parameter extended relative entropy

The uniqueness theorem for the relative entropy was shown by A. Hobson as follows [11]:

Theorem 2.1 ([11]) The function $D_1(A\|B)$ is assumed to be defined for any two probability distributions $A = \{a_j\}$ and $B = \{b_j\}$ for $j = 1, \cdots, n$. If $D_1(A\|B)$ satisfies the following conditions (R1)-(R5), then it is given by the form $k \sum_{j=1}^{n} a_j \log \frac{a_j}{b_j}$ with a positive constant $k$.

(R1) Continuity: $D_1(A\|B)$ is a continuous function of $2n$ variables.
(R2) Symmetry:

$$D_1(a_1, \cdots, a_j, \cdots, a_k, \cdots, a_n \| b_1, \cdots, b_j, \cdots, b_k, \cdots, b_n) = D_1(a_1, \cdots, a_k, \cdots, a_j, \cdots, a_n \| b_1, \cdots, b_k, \cdots, b_j, \cdots, b_n).$$

(R3) Additivity:

$$\begin{aligned}
& D_1(a_{11}, \cdots, a_{1m}, a_{21}, \cdots, a_{2m} \| b_{11}, \cdots, b_{1m}, b_{21}, \cdots, b_{2m}) \\
&\quad = D_1(c_1, c_2 \| d_1, d_2) + c_1 D_1\left( \frac{a_{11}}{c_1}, \cdots, \frac{a_{1m}}{c_1} \,\middle\|\, \frac{b_{11}}{d_1}, \cdots, \frac{b_{1m}}{d_1} \right) + c_2 D_1\left( \frac{a_{21}}{c_2}, \cdots, \frac{a_{2m}}{c_2} \,\middle\|\, \frac{b_{21}}{d_2}, \cdots, \frac{b_{2m}}{d_2} \right),
\end{aligned}$$

where $c_i = \sum_{j=1}^{m} a_{ij}$ and $d_i = \sum_{j=1}^{m} b_{ij}$.

(R4) $D_1(A\|B) = 0$ if $a_j = b_j$ for all $j$.

(R5) $D_1\left( \frac{1}{n}, \cdots, \frac{1}{n}, 0, \cdots, 0 \,\middle\|\, \frac{1}{n_0}, \cdots, \frac{1}{n_0} \right)$ is an increasing function of $n_0$ and a decreasing function of $n$, for any integers $n$ and $n_0$ such that $n_0 \geq n$.

As a one-parameter extension, we gave the uniqueness theorem for the Tsallis relative entropy as follows. The function $D_q$ is defined for the probability distributions $A = \{a_j\}$ and $B = \{b_j\}$ on a finite probability space with one parameter $q \geq 0$. The one-parameter extended relative entropy (Tsallis relative entropy) was characterized by means of the following triplet of generalized conditions (OR1), (OR2) and (OR3).

Axiom 2.2 ([12])

(OR1) Continuity: $D_q(a_1, \cdots, a_n \| b_1, \cdots, b_n)$ is a continuous function of $2n$ variables.

(OR2) Symmetry:

$$D_q(a_1, \cdots, a_j, \cdots, a_k, \cdots, a_n \| b_1, \cdots, b_j, \cdots, b_k, \cdots, b_n) = D_q(a_1, \cdots, a_k, \cdots, a_j, \cdots, a_n \| b_1, \cdots, b_k, \cdots, b_j, \cdots, b_n).$$

(OR3) Additivity:

$$\begin{aligned}
& D_q(a_{11}, \cdots, a_{1m}, \cdots, a_{n1}, \cdots, a_{nm} \| b_{11}, \cdots, b_{1m}, \cdots, b_{n1}, \cdots, b_{nm}) \\
&\quad = D_q(c_1, \cdots, c_n \| d_1, \cdots, d_n) + \sum_{i=1}^{n} c_i^{q} d_i^{1-q} D_q\left( \frac{a_{i1}}{c_i}, \ldots, \frac{a_{im}}{c_i} \,\middle\|\, \frac{b_{i1}}{d_i}, \ldots, \frac{b_{im}}{d_i} \right), \qquad (1)
\end{aligned}$$

where $c_i = \sum_{j=1}^{m} a_{ij}$ and $d_i = \sum_{j=1}^{m} b_{ij}$.
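As a sanity check on the additivity condition (OR3), one can verify numerically that the Tsallis relative entropy $D^T_q$ defined in Section 1 satisfies Eq. (1). The sketch below is an illustration in Python, not part of the original paper; the function names and the $2 \times 3$ joint tables `A`, `B` are assumptions chosen only for the example.

```python
def tsallis_rel(a, b, q):
    """Tsallis relative entropy for q != 1 (all entries assumed > 0)."""
    return sum(aj - aj**q * bj**(1 - q) for aj, bj in zip(a, b)) / (1 - q)

def or3_sides(A, B, q):
    """Return (left, right) sides of Eq. (1) for n x m tables A = (a_ij), B = (b_ij)."""
    c = [sum(row) for row in A]  # marginals c_i
    d = [sum(row) for row in B]  # marginals d_i
    flat_a = [a for row in A for a in row]
    flat_b = [b for row in B for b in row]
    lhs = tsallis_rel(flat_a, flat_b, q)
    rhs = tsallis_rel(c, d, q)
    for ci, di, row_a, row_b in zip(c, d, A, B):
        cond_a = [a / ci for a in row_a]  # conditional distribution a_ij / c_i
        cond_b = [b / di for b in row_b]  # conditional distribution b_ij / d_i
        rhs += ci**q * di**(1 - q) * tsallis_rel(cond_a, cond_b, q)
    return lhs, rhs

# Illustrative joint tables; each sums to 1 (assumed, not from the paper).
A = [[0.10, 0.20, 0.05], [0.25, 0.15, 0.25]]
B = [[0.30, 0.05, 0.15], [0.10, 0.20, 0.20]]

lhs_a, rhs_a = or3_sides(A, B, q=0.7)
lhs_b, rhs_b = or3_sides(A, B, q=1.8)
print(lhs_a, rhs_a)
print(lhs_b, rhs_b)
```

The two sides should coincide up to rounding for any $q \neq 1$; the weight $c_i^{q} d_i^{1-q}$ in front of the conditional terms is what distinguishes (OR3) from the ordinary additivity (R3).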
Then, we have the following theorem.

Theorem 2.3 ([12]) If conditions (OR1), (OR2) and (OR3) hold, then $D_q(A\|B)$ is given in the following form:

$$D_q(A\|B) = \sum_{j=1}^{n} \frac{a_j - a_j^{q} b_j^{1-q}}{\phi(q)} \qquad (2)$$

with a certain constant $\phi(q)$ depending on the parameter $q$.

As for properties and applications of the Tsallis relative entropy, see our previous papers [13, 14, 15].

3 Uniqueness theorem for two-parameter extended relative entropy

In our previous paper [12], we gave Axiom 2.2 in order to characterize the Tsallis relative entropy (one-parameter extended relative entropy). In this section, we prove the uniqueness theorem for a two-parameter extended relative entropy.

Theorem 3.1 If the function $D_{\alpha,\beta}(X\|Y)$, defined for any pair of probability distributions $X = \{x_1, \cdots, x_n\}$ and $Y = \{y_1, \cdots, y_n\}$ on a finite probability space, satisfies the conditions (TR1)-(TR3) below, then $D_{\alpha,\beta}(X\|Y)$ is uniquely given by the form

$$D_{\alpha,\beta}(X\|Y) = \sum_{j=1}^{n} \frac{x_j^{\alpha} y_j^{1-\alpha} - x_j^{\beta} y_j^{1-\beta}}{\phi(\alpha,\beta)} \qquad (3)$$

with a certain constant $\phi(\alpha,\beta)$ depending on the two parameters $\alpha$ and $\beta$.

(TR1) Continuity: $D_{\alpha,\beta}(x_1, \cdots, x_n \| y_1, \cdots, y_n)$ is a continuous function of $2n$ variables.

(TR2) Symmetry:

$$D_{\alpha,\beta}(x_1, \cdots, x_j, \cdots, x_k, \cdots, x_n \| y_1, \cdots, y_j, \cdots, y_k, \cdots, y_n) = D_{\alpha,\beta}(x_1, \cdots, x_k, \cdots, x_j, \cdots, x_n \| y_1, \cdots, y_k, \cdots, y_j, \cdots, y_n).$$

(TR3) Additivity:

$$\begin{aligned}
& D_{\alpha,\beta}(x_{11}, \cdots, x_{1m}, \cdots, x_{n1}, \cdots, x_{nm} \| y_{11}, \cdots, y_{1m}, \cdots, y_{n1}, \cdots, y_{nm}) \\
&\quad = D_{\alpha,\beta}(z_1, \cdots, z_n \| w_1, \cdots, w_n) \sum_{j=1}^{m} \left( \frac{x_{ij}}{z_i} \right)^{\beta} \left( \frac{y_{ij}}{w_i} \right)^{1-\beta} + \sum_{i=1}^{n} z_i^{\alpha} w_i^{1-\alpha} D_{\alpha,\beta}\left( \frac{x_{i1}}{z_i}, \ldots, \frac{x_{im}}{z_i} \,\middle\|\, \frac{y_{i1}}{w_i}, \ldots, \frac{y_{im}}{w_i} \right), \qquad (4)
\end{aligned}$$

where $z_i = \sum_{j=1}^{m} x_{ij}$ and $w_i = \sum_{j=1}^{m} y_{ij}$.
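Proof: From (TR2), we have

$$\begin{aligned}
& D_{\alpha,\beta}\left( \frac{1}{su}, \cdots, \frac{1}{su}, 0, \cdots, 0, \cdots, \frac{1}{su}, \cdots, \frac{1}{su}, 0, \cdots, 0 \,\middle\|\, \frac{1}{tv}, \cdots, \frac{1}{tv} \right) \\
&\quad = D_{\alpha,\beta}\left( \frac{1}{su}, \cdots, \frac{1}{su}, \cdots, \frac{1}{su}, \cdots, \frac{1}{su}, 0, \cdots, 0, \cdots, 0, \cdots, 0 \,\middle\|\, \frac{1}{tv}, \cdots, \frac{1}{tv} \right).
\end{aligned}$$

From (TR3), we also have

$$\begin{aligned}
& D_{\alpha,\beta}\left( \frac{1}{su}, \cdots, \frac{1}{su}, 0, \cdots, 0, \cdots, \frac{1}{su}, \cdots, \frac{1}{su}, 0, \cdots, 0 \,\middle\|\, \frac{1}{tv}, \cdots, \frac{1}{tv} \right) \\
&\quad = s \left( \frac{1}{s} \right)^{\beta} \left( \frac{1}{t} \right)^{1-\beta} D_{\alpha,\beta}\left( \frac{1}{u}, \cdots, \frac{1}{u}, 0, \cdots, 0 \,\middle\|\, \frac{1}{v}, \cdots, \frac{1}{v} \right) + u \left( \frac{1}{u} \right)^{\alpha} \left( \frac{1}{v} \right)^{1-\alpha} D_{\alpha,\beta}\left( \frac{1}{s}, \cdots, \frac{1}{s}, 0, \cdots, 0 \,\middle\|\, \frac{1}{t}, \cdots, \frac{1}{t} \right).
\end{aligned}$$

From the above two equations, we have

$$\begin{aligned}
& D_{\alpha,\beta}\left( \frac{1}{su}, \cdots, \frac{1}{su}, \cdots, \frac{1}{su}, \cdots, \frac{1}{su}, 0, \cdots, 0, \cdots, 0, \cdots, 0 \,\middle\|\, \frac{1}{tv}, \cdots, \frac{1}{tv} \right) \\
&\quad = \left( \frac{s}{t} \right)^{1-\beta} D_{\alpha,\beta}\left( \frac{1}{u}, \cdots, \frac{1}{u}, 0, \cdots, 0 \,\middle\|\, \frac{1}{v}, \cdots, \frac{1}{v} \right) + \left( \frac{u}{v} \right)^{1-\alpha} D_{\alpha,\beta}\left( \frac{1}{s}, \cdots, \frac{1}{s}, 0, \cdots, 0 \,\middle\|\, \frac{1}{t}, \cdots, \frac{1}{t} \right).
\end{aligned}$$

If we put

$$f_{\alpha,\beta}(s,t) \equiv D_{\alpha,\beta}\left( \frac{1}{s}, \cdots, \frac{1}{s}, 0, \cdots, 0 \,\middle\|\, \frac{1}{t}, \cdots, \frac{1}{t} \right), \quad (t \geq s),$$

then we have

$$f_{\alpha,\beta}(su, tv) = \left( \frac{s}{t} \right)^{1-\beta} f_{\alpha,\beta}(u,v) + \left( \frac{u}{v} \right)^{1-\alpha} f_{\alpha,\beta}(s,t).$$

Interchanging $s$ with $u$ and $t$ with $v$ in the above equation, we also have

$$f_{\alpha,\beta}(us, vt) = \left( \frac{u}{v} \right)^{1-\beta} f_{\alpha,\beta}(s,t) + \left( \frac{s}{t} \right)^{1-\alpha} f_{\alpha,\beta}(u,v).$$

Equating the right-hand sides of the above two equations, we have

$$\frac{\left( \frac{s}{t} \right)^{1-\alpha} - \left( \frac{s}{t} \right)^{1-\beta}}{f_{\alpha,\beta}(s,t)} = \frac{\left( \frac{u}{v} \right)^{1-\alpha} - \left( \frac{u}{v} \right)^{1-\beta}}{f_{\alpha,\beta}(u,v)} \triangleq \phi(\alpha,\beta).$$

Since the left-hand side depends only on $(s,t)$ and the right-hand side only on $(u,v)$, the common value is a constant depending only on $\alpha$ and $\beta$. Therefore we have

$$f_{\alpha,\beta}(s,t) = \frac{\left( \frac{s}{t} \right)^{1-\alpha} - \left( \frac{s}{t} \right)^{1-\beta}}{\phi(\alpha,\beta)}.$$

For natural numbers $l_i$ and $m_i$ such that $l_i \leq m_i$, we put

$$z_i \equiv \frac{l_i}{\sum_{k=1}^{n} l_k}, \quad (i = 1, \cdots, n), \qquad w_i \equiv \frac{m_i}{\sum_{k=1}^{n} m_k}, \quad (i = 1, \cdots, n)$$

and

$$x_{ij} \equiv \frac{1}{\sum_{k=1}^{n} l_k}, \quad (i = 1, \cdots, n;\ j = 1, \cdots, l_i), \qquad y_{ij} \equiv \frac{1}{\sum_{k=1}^{n} m_k}, \quad (i = 1, \cdots, n;\ j = 1, \cdots, m_i).$$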
From (TR2) and (TR3), we then have

$$\begin{aligned}
& D_{\alpha,\beta}\left( \frac{1}{\sum_{k=1}^{n} l_k}, \cdots, \frac{1}{\sum_{k=1}^{n} l_k}, 0, \cdots, 0, \cdots, \frac{1}{\sum_{k=1}^{n} l_k}, \cdots, \frac{1}{\sum_{k=1}^{n} l_k}, 0, \cdots, 0 \,\middle\|\, \frac{1}{\sum_{k=1}^{n} m_k}, \cdots, \frac{1}{\sum_{k=1}^{n} m_k} \right) \\
&\quad = D_{\alpha,\beta}(z_1, \cdots, z_n \| w_1, \cdots, w_n) \sum_{j=1}^{l_i} \left( \frac{1}{l_i} \right)^{\beta} \left( \frac{1}{m_i} \right)^{1-\beta} + \sum_{i=1}^{n} z_i^{\alpha} w_i^{1-\alpha} D_{\alpha,\beta}\left( \frac{1}{l_i}, \cdots, \frac{1}{l_i}, 0, \cdots, 0 \,\middle\|\, \frac{1}{m_i}, \cdots, \frac{1}{m_i} \right),
\end{aligned}$$

since $x_{ij} = 0$ for $j = l_i + 1, \cdots, m$. Thus we have

$$\begin{aligned}
D_{\alpha,\beta}(z_1, \cdots, z_n \| w_1, \cdots, w_n)
&= \frac{f_{\alpha,\beta}\left( \sum_{k=1}^{n} l_k, \sum_{k=1}^{n} m_k \right) - \sum_{i=1}^{n} z_i^{\alpha} w_i^{1-\alpha} f_{\alpha,\beta}(l_i, m_i)}{\sum_{j=1}^{l_i} \left( \frac{1}{l_i} \right)^{\beta} \left( \frac{1}{m_i} \right)^{1-\beta}} \\
&= \frac{\left( \frac{\sum_{k=1}^{n} l_k}{\sum_{k=1}^{n} m_k} \right)^{1-\alpha} - \left( \frac{\sum_{k=1}^{n} l_k}{\sum_{k=1}^{n} m_k} \right)^{1-\beta}}{\phi(\alpha,\beta) \left( \frac{l_i}{m_i} \right)^{1-\beta}} - \frac{\sum_{i=1}^{n} z_i^{\alpha} w_i^{1-\alpha} \left\{ \left( \frac{l_i}{m_i} \right)^{1-\alpha} - \left( \frac{l_i}{m_i} \right)^{1-\beta} \right\}}{\phi(\alpha,\beta) \left( \frac{l_i}{m_i} \right)^{1-\beta}}.
\end{aligned}$$

Here we have

$$\sum_{i=1}^{n} z_i^{r} w_i^{1-r} \left( \frac{l_i}{m_i} \right)^{1-r} = \sum_{i=1}^{n} \left( \frac{l_i}{\sum_{k=1}^{n} l_k} \right)^{r} \left( \frac{m_i}{\sum_{k=1}^{n} m_k} \right)^{1-r} \left( \frac{l_i}{m_i} \right)^{1-r} = \left( \frac{\sum_{k=1}^{n} l_k}{\sum_{k=1}^{n} m_k} \right)^{1-r}, \quad (r \in \mathbb{R})$$

for

$$z_i \equiv \frac{l_i}{\sum_{k=1}^{n} l_k}, \quad (i = 1, \cdots, n), \qquad w_i \equiv \frac{m_i}{\sum_{k=1}^{n} m_k}, \quad (i = 1, \cdots, n).$$

Thus we have

$$D_{\alpha,\beta}(z_1, \cdots, z_n \| w_1, \cdots, w_n) = \frac{\sum_{i=1}^{n} z_i^{\alpha} w_i^{1-\alpha} \left( \frac{l_i}{m_i} \right)^{1-\beta} - \sum_{i=1}^{n} z_i^{\beta} w_i^{1-\beta} \left( \frac{l_i}{m_i} \right)^{1-\beta}}{\phi(\alpha,\beta) \left( \frac{l_i}{m_i} \right)^{1-\beta}}.$$

Since we can take $l_i$ and $m_i$ arbitrarily, we may take $l_i = l$ and $m_i = m$; then we have

$$D_{\alpha,\beta}(z_1, \cdots, z_n \| w_1, \cdots, w_n) = \frac{\sum_{i=1}^{n} z_i^{\alpha} w_i^{1-\alpha} - \sum_{i=1}^{n} z_i^{\beta} w_i^{1-\beta}}{\phi(\alpha,\beta)}.$$

From (TR1) and the fact that any real number can be approximated by rational numbers, the above result is true for any positive real numbers $z_j$ and $w_j$ satisfying $\sum_{j=1}^{n} z_j = \sum_{j=1}^{n} w_j = 1$.

Putting $\beta = 1$ and $\alpha = q$ in the above theorem, we have the uniqueness theorem for a one-parameter extended relative entropy (Theorem 2.3).
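Two steps of the proof lend themselves to a quick numerical check: the functional equation satisfied by $f_{\alpha,\beta}$, and the identity for $\sum_i z_i^{r} w_i^{1-r} (l_i/m_i)^{1-r}$. The following Python sketch is illustrative only and not part of the original paper. It assumes the form (3) with the particular choice $\phi(\alpha,\beta) = \alpha - \beta$ (the example considered later in Section 5), takes $\alpha, \beta > 0$ so that entries equal to zero contribute nothing, and uses hypothetical function names.

```python
def phi(alpha, beta):
    # Assumed illustrative choice of the constant; the proof only needs phi != 0.
    return alpha - beta

def D(x, y, alpha, beta):
    """Form (3); terms with x_j = 0 vanish when alpha, beta > 0."""
    total = 0.0
    for xj, yj in zip(x, y):
        if xj > 0:
            total += xj**alpha * yj**(1 - alpha) - xj**beta * yj**(1 - beta)
    return total / phi(alpha, beta)

def f(s, t, alpha, beta):
    """f(s, t) = D(1/s, ..., 1/s, 0, ..., 0 || 1/t, ..., 1/t) with s <= t."""
    x = [1.0 / s] * s + [0.0] * (t - s)
    y = [1.0 / t] * t
    return D(x, y, alpha, beta)

def f_closed(s, t, alpha, beta):
    """Closed form derived in the proof."""
    r = s / t
    return (r**(1 - alpha) - r**(1 - beta)) / phi(alpha, beta)

alpha, beta = 0.6, 1.7
s, t, u, v = 2, 5, 3, 4

# Functional equation: f(su, tv) = (s/t)^(1-beta) f(u,v) + (u/v)^(1-alpha) f(s,t)
fe_lhs = f(s * u, t * v, alpha, beta)
fe_rhs = ((s / t)**(1 - beta) * f(u, v, alpha, beta)
          + (u / v)**(1 - alpha) * f(s, t, alpha, beta))

# Identity used in the proof, for z_i = l_i / sum(l) and w_i = m_i / sum(m)
l = [1, 2, 3]
m = [2, 5, 3]
L, M = sum(l), sum(m)
r = 0.6
id_lhs = sum((li / L)**r * (mi / M)**(1 - r) * (li / mi)**(1 - r)
             for li, mi in zip(l, m))
id_rhs = (L / M)**(1 - r)

print(fe_lhs, fe_rhs)
print(f(s, t, alpha, beta), f_closed(s, t, alpha, beta))
print(id_lhs, id_rhs)
```

Each printed pair should agree up to rounding. The check confirms only that the form (3) is consistent with these steps; it is not a substitute for the uniqueness argument itself.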
4 Characterizations of $\phi(\alpha,\beta)$

In this section, we characterize the constant $\phi(\alpha,\beta)$ depending on the two parameters $\alpha$ and $\beta$.

Proposition 4.1 The postulate that our quantity $D_{\alpha,\beta}(x_1, \cdots, x_n \| y_1, \cdots, y_n)$ defined for any pair of probability distributions,

$$D_{\alpha,\beta}(x_1, \cdots, x_n \| y_1, \cdots, y_n) = \sum_{j=1}^{n} \frac{x_j^{\alpha} y_j^{1-\alpha} - x_j^{\beta} y_j^{1-\beta}}{\phi(\alpha,\beta)} \qquad (5)$$

derived in Theorem 3.1, recovers the relative entropy when $\alpha \to 1$ and $\beta \to 1$, that is,

$$\lim_{\alpha,\beta \to 1} D_{\alpha,\beta}(x_1, \cdots, x_n \| y_1, \cdots, y_n) = k \sum_{j=1}^{n} x_j (\log x_j - \log y_j), \qquad (6)$$

implies the following conditions.

(c1) We have $\lim_{\alpha \to 1} \phi(\alpha,1) = \lim_{\beta \to 1} \phi(1,\beta) = \lim_{\beta \to \alpha} \phi(\alpha,\beta) = 0$ and $\phi(\alpha,\beta) \neq 0$ for $\alpha \neq \beta$.

(c2) There exists an interval $(a,b)$ such that $\phi(\alpha,1)$ and $\phi(1,\beta)$ are differentiable on $(a,1) \cup (1,b)$.

(c3) There exists a constant $k > 0$ such that $\lim_{\alpha \to 1} \frac{d\phi(\alpha,1)}{d\alpha} = \frac{1}{k}$ and $\lim_{\beta \to 1} \frac{d\phi(1,\beta)}{d\beta} = -\frac{1}{k}$.

Proof: (c1) We may calculate the limit of the left-hand side of Eq.(6) in the following ways.

(i) First, we may take the limit $\alpha \to 1$ in Eq.(5) and then take the limit $\beta \to 1$:

$$\lim_{\beta \to 1} D_{1,\beta}(x_1, \cdots, x_n \| y_1, \cdots, y_n) = \lim_{\beta \to 1} \sum_{j=1}^{n} \frac{x_j - x_j^{\beta} y_j^{1-\beta}}{\phi(1,\beta)}.$$

Since we have $\lim_{\beta \to 1} \sum_{j=1}^{n} \left( x_j - x_j^{\beta} y_j^{1-\beta} \right) = 0$, we need $\lim_{\beta \to 1} \phi(1,\beta) = 0$ in order for the above limit to agree with the right-hand side of Eq.(6).

(ii) In a similar way to (i), we have $\lim_{\alpha \to 1} \phi(\alpha,1) = 0$.

(iii) First, we may let $\beta \to \alpha$ and then take the limit $\alpha \to 1$. In the case $\beta \to \alpha$, the sum in the numerator of the right-hand side of Eq.(5) tends to 0:

$$\lim_{\beta \to \alpha} \sum_{j=1}^{n} \left( x_j^{\alpha} y_j^{1-\alpha} - x_j^{\beta} y_j^{1-\beta} \right) = 0.$$

Therefore we have $\lim_{\beta \to \alpha} \phi(\alpha,\beta) = 0$; otherwise $\lim_{\beta \to \alpha} D_{\alpha,\beta}(x_1, \cdots, x_n \| y_1, \cdots, y_n)$ takes the value 0, which contradicts Eq.(6).
For the left-hand side of Eq.(6) to have a limit, we also need $\phi(\alpha,\beta) \neq 0$ for $\alpha \neq \beta$, since $\sum_{j=1}^{n} \left( x_j^{\alpha} y_j^{1-\alpha} - x_j^{\beta} y_j^{1-\beta} \right) \neq 0$ for $\alpha \neq \beta$.

(c2) Since $\sum_{j=1}^{n} \left( x_j - x_j^{\beta} y_j^{1-\beta} \right)$ is differentiable with respect to $\beta$, we need an interval $(a,b)$ such that $\phi(1,\beta)$ is also differentiable with respect to $\beta$ on $(a,1) \cup (1,b)$, in order for the left-hand side of Eq.(6) to have a limit. In a similar way, there exists an interval $(a,b)$ such that $\phi(\alpha,1)$ is also differentiable with respect to $\alpha$ on $(a,1) \cup (1,b)$.

(c3) Since we have

$$\lim_{\beta \to 1} D_{1,\beta}(x_1, \cdots, x_n \| y_1, \cdots, y_n) = \lim_{\beta \to 1} \sum_{j=1}^{n} \frac{x_j - x_j^{\beta} y_j^{1-\beta}}{\phi(1,\beta)} = \lim_{\beta \to 1} \frac{-\sum_{j=1}^{n} x_j^{\beta} y_j^{1-\beta} (\log x_j - \log y_j)}{\frac{d\phi(1,\beta)}{d\beta}},$$

there exists a constant $k > 0$ such that $\lim_{\beta \to 1} \frac{d\phi(1,\beta)}{d\beta} = -\frac{1}{k}$. In a similar way, there exists a constant $k > 0$ such that $\lim_{\alpha \to 1} \frac{d\phi(\alpha,1)}{d\alpha} = \frac{1}{k}$.

Proposition 4.2 When the posterior probability distribution is fixed to be the uniform distribution $U = \left\{ \frac{1}{n}, \cdots, \frac{1}{n} \right\}$, $D_{\alpha,\beta}(X\|U)$ takes its minimum value at $X = U$:

$$D_{\alpha,\beta}\left( x_1, \cdots, x_n \,\middle\|\, \frac{1}{n}, \cdots, \frac{1}{n} \right) \geq D_{\alpha,\beta}\left( \frac{1}{n}, \cdots, \frac{1}{n} \,\middle\|\, \frac{1}{n}, \cdots, \frac{1}{n} \right),$$

when we have

(c4) the following relations (i) and (ii) for $\alpha$ and $\beta$:

(i) $\alpha \neq \beta$.

(ii) If $\phi(\alpha,\beta) > 0$, then we have $0 \leq \beta \leq 1 \leq \alpha$. If $\phi(\alpha,\beta) < 0$, then we have $0 \leq \alpha \leq 1 \leq \beta$.

Proof: The second derivative of $D_{\alpha,\beta}\left( x_1, \cdots, x_n \,\middle\|\, \frac{1}{n}, \cdots, \frac{1}{n} \right)$ with respect to $x_j$ is calculated as

$$\frac{d^2 D_{\alpha,\beta}\left( x_1, \cdots, x_n \,\middle\|\, \frac{1}{n}, \cdots, \frac{1}{n} \right)}{dx_j^2} = \frac{n^{\alpha-1} \alpha (\alpha-1) x_j^{\alpha-2} - n^{\beta-1} \beta (\beta-1) x_j^{\beta-2}}{\phi(\alpha,\beta)}.$$

This is nonnegative in the case of (c4), so that $D_{\alpha,\beta}(X\|U)$ is convex in $x_j$. Therefore $D_{\alpha,\beta}(X\|U)$ takes the minimum value.

5 Properties of a two-parameter extended relative entropy

As an example satisfying the conditions (c1)-(c4) on $\phi(\alpha,\beta)$, we simply take $\phi(\alpha,\beta) = \alpha - \beta$.
Then we may define a two-parameter extended relative entropy as follows.

Definition 5.1 For two parameters $\alpha, \beta \in \mathbb{R}$ satisfying $0 \leq \alpha \leq 1 \leq \beta$ or $0 \leq \beta \leq 1 \leq \alpha$, and two probability distributions $X = \{x_1, \cdots, x_n\}$ and $Y = \{y_1, \cdots, y_n\}$, we define a two-parameter extended relative entropy by

$$D_{\alpha,\beta}(X\|Y) \equiv \sum_{j=1}^{n} \frac{x_j^{\alpha} y_j^{1-\alpha} - x_j^{\beta} y_j^{1-\beta}}{\alpha - \beta}, \quad (\alpha \neq \beta).$$

Note that a two-parameter extended relative entropy is a generalization of the relative entropy in the sense that

$$\lim_{\alpha,\beta \to 1} D_{\alpha,\beta}(X\|Y) = D_1(X\|Y).$$

We also note that a two-parameter extended relative entropy recovers the Tsallis relative entropy (one-parameter extended relative entropy) when $\alpha = 1$ or $\beta = 1$. The Tsallis relative entropy is also a one-parameter generalization of the relative entropy:

$$\lim_{q \to 1} D^T_q(X\|Y) = D_1(X\|Y).$$

In addition, we note that a two-parameter extended relative entropy is expressed as a convex combination of Tsallis relative entropies:

$$D_{\alpha,\beta}(X\|Y) = \frac{\alpha - 1}{\alpha - \beta} D^T_{\alpha}(X\|Y) + \frac{1 - \beta}{\alpha - \beta} D^T_{\beta}(X\|Y). \qquad (7)$$

Thus we have the following properties of a two-parameter extended relative entropy, thanks to the above relation and the properties of the Tsallis relative entropy studied in [13].

Proposition 5.2 For a two-parameter extended relative entropy $D_{\alpha,\beta}(X\|Y)$, we have the following properties.

(i) (Nonnegativity) $D_{\alpha,\beta}(X\|Y) \geq 0$.

(ii) (Symmetry) $D_{\alpha,\beta}\left( x_{\pi(1)}, \cdots, x_{\pi(n)} \,\middle\|\, y_{\pi(1)}, \cdots, y_{\pi(n)} \right) = D_{\alpha,\beta}(x_1, \cdots, x_n \| y_1, \cdots, y_n)$.

(iii) (Possibility of extension) $D_{\alpha,\beta}(x_1, \cdots, x_n, 0 \| y_1, \cdots, y_n, 0) = D_{\alpha,\beta}(x_1, \cdots, x_n \| y_1, \cdots, y_n)$.
(iv) (Joint convexity) For $0 \leq \lambda \leq 1$ and the probability distributions $X^{(i)} = \left\{ x_j^{(i)} \right\}$, $Y^{(i)} = \left\{ y_j^{(i)} \right\}$, $(i = 1, 2;\ j = 1, \cdots, n)$, we have

$$D_{\alpha,\beta}\left( \lambda X^{(1)} + (1-\lambda) X^{(2)} \,\middle\|\, \lambda Y^{(1)} + (1-\lambda) Y^{(2)} \right) \leq \lambda D_{\alpha,\beta}\left( X^{(1)} \,\middle\|\, Y^{(1)} \right) + (1-\lambda) D_{\alpha,\beta}\left( X^{(2)} \,\middle\|\, Y^{(2)} \right).$$

(v) (Monotonicity) For the transition probability matrix $W$, we have $D_{\alpha,\beta}(WX\|WY) \leq D_{\alpha,\beta}(X\|Y)$.

It is also notable that we have the following expression for a two-parameter extended relative entropy:

$$D_{\alpha,\beta}(X\|U) = \frac{n^{\alpha-1} - n^{\beta-1}}{\alpha - \beta} - n^{\alpha-1} \left( \frac{\alpha - 1}{\alpha - \beta} \right) S_{\alpha}(X) - n^{\beta-1} \left( \frac{1 - \beta}{\alpha - \beta} \right) S_{\beta}(X)$$

for the uniform distribution $U = \{1/n, \cdots, 1/n\}$, while we also have the following relation between the two-parameter extended entropy and the Tsallis entropy (one-parameter extended entropy):

$$S_{\alpha,\beta}(X) = \left( \frac{\alpha - 1}{\alpha - \beta} \right) S_{\alpha}(X) + \left( \frac{1 - \beta}{\alpha - \beta} \right) S_{\beta}(X).$$

Because the two Tsallis entropies carry the different weights $n^{\alpha-1}$ and $n^{\beta-1}$ in the first expression, we may not obtain a direct relation between $D_{\alpha,\beta}(X\|U)$ and $S_{\alpha,\beta}(X)$ except for $n = 1$.

6 Conclusion

As we have seen in Section 3, the two-parameter extended relative entropy is characterized by continuity, symmetry and additivity. On the other hand, it is known that the $f$-divergence is characterized by symmetry, monotonicity and joint convexity [16]. Properties such as monotonicity and joint convexity are represented by inequalities. For the characterization of the $f$-divergence, we need the inequalities together with their equality conditions, while for the characterization of the two-parameter extended relative entropy, we need the functional equation given by additivity. Therefore the conditions in our axiom are essentially different from those of the axiom characterizing the $f$-divergence.
It is also notable that our characterization of a two-parameter extended relative entropy (Theorem 3.1) is a uniqueness theorem, in the sense that the function $D_{\alpha,\beta}(X\|Y)$ is uniquely given by Eq.(3), while the characterization of the $f$-divergence (Theorem 1 in [16]) is an existence theorem for a convex function $f$ such that the function defined for any pair of probability distributions is equal to the $f$-divergence. In other words, in the paper [16], the existence of the convex function $f$ has been shown but its uniqueness has not, so that our axiomatic characterization may have an advantage since it uniquely gives a two-parameter extended relative entropy. It is also notable that the uniqueness theorem for the $\alpha$-divergence was recently shown in [17] for the special case in which the divergence measure (functional) is written as a sum over all components.

Closing this section, we give the expression of a two-parameter extended relative entropy by means of the $f$-divergence:

$$D_f(X\|Y) \equiv \sum_{j=1}^{n} y_j f\left( \frac{x_j}{y_j} \right),$$

where $f$ is a convex function on $(0, \infty)$ and $f(1) = 0$. If we take $f(t) = t \log t$, then the $f$-divergence $D_f(X\|Y)$ recovers the relative entropy. Here, if we put

$$f_{\alpha,\beta}(t) \equiv \frac{t^{\alpha} - t^{\beta}}{\alpha - \beta}, \quad (\alpha \neq \beta), \qquad (8)$$

then $\frac{d^2 f_{\alpha,\beta}(t)}{dt^2} \geq 0$ for $0 \leq \alpha \leq 1 \leq \beta$ or $0 \leq \beta \leq 1 \leq \alpha$. We then have the following expression:

$$D_{\alpha,\beta}(X\|Y) = D_{f_{\alpha,\beta}}(X\|Y).$$

It is known that the relative entropy is connected to many important results in mathematical physics and information science. For a two-parameter extended relative entropy, such connections (for example with the H-theorem or with variational expressions related to the free energy) will be studied in the future.
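The $f$-divergence representation above, together with the convex-combination formula (7) of Section 5, can be illustrated numerically. The Python sketch below is not part of the original paper: the function names, the test distributions and the grid used to sample the second derivative of $f_{\alpha,\beta}$ are assumptions made for the example, with $\phi(\alpha,\beta) = \alpha - \beta$ as in Definition 5.1.

```python
def two_param_rel(x, y, alpha, beta):
    """D_{alpha,beta}(X||Y) as in Definition 5.1 (alpha != beta, entries > 0)."""
    num = sum(xj**alpha * yj**(1 - alpha) - xj**beta * yj**(1 - beta)
              for xj, yj in zip(x, y))
    return num / (alpha - beta)

def tsallis_rel(x, y, q):
    """Tsallis relative entropy D^T_q(X||Y) for q != 1."""
    return sum(xj - xj**q * yj**(1 - q) for xj, yj in zip(x, y)) / (1 - q)

def f_ab(t, alpha, beta):
    """The generating function of Eq. (8)."""
    return (t**alpha - t**beta) / (alpha - beta)

def f_ab_second(t, alpha, beta):
    """Second derivative of f_ab with respect to t."""
    return (alpha * (alpha - 1) * t**(alpha - 2)
            - beta * (beta - 1) * t**(beta - 2)) / (alpha - beta)

def f_divergence(x, y, f):
    """D_f(X||Y) = sum_j y_j f(x_j / y_j)."""
    return sum(yj * f(xj / yj) for xj, yj in zip(x, y))

# Illustrative data with 0 <= beta <= 1 <= alpha (assumed, not from the paper).
X = [0.5, 0.3, 0.2]
Y = [0.2, 0.5, 0.3]
alpha, beta = 1.6, 0.4

direct = two_param_rel(X, Y, alpha, beta)
via_f = f_divergence(X, Y, lambda t: f_ab(t, alpha, beta))
via_tsallis = ((alpha - 1) / (alpha - beta) * tsallis_rel(X, Y, alpha)
               + (1 - beta) / (alpha - beta) * tsallis_rel(X, Y, beta))

# Sample the second derivative on a grid in (0, 5] as a rough convexity check.
min_second = min(f_ab_second(0.05 * k, alpha, beta) for k in range(1, 101))

print(direct, via_f, via_tsallis)
print(min_second)
```

The three values on the first printed line should coincide up to rounding, and the sampled second derivative should be nonnegative. Sampling on a grid is only a plausibility check for convexity; the sign argument in the text is what establishes it.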
Acknowledgements

I would like to thank Professor H.Suyari and Professor T.Wada for giving me an opportunity to read their interesting paper [10] in the workshop at Chiba University. The author was partially supported by the Japanese Ministry of Education, Science, Sports and Culture, Grant-in-Aid for Encouragement of Young Scientists (B) 20740067.

References

[1] C.E.Shannon, A mathematical theory of communication, Bell Syst. Tech. J., Vol.27(1948), pp.379-423 and pp.623-656.

[2] J.Aczél and Z.Daróczy, On measures of information and their characterizations, Academic Press, 1975.

[3] A.Rényi, On measures of entropy and information, in Proc. 4th Berkeley Symp., Mathematical and Statistical Probability, Berkeley, CA: Univ. Calif. Press, Vol.1(1961), pp.547-561.

[4] C.Tsallis, Possible generalization of Boltzmann-Gibbs statistics, J. Stat. Phys., Vol.52(1988), pp.479-487.

[5] H.Suyari, Generalization of Shannon-Khinchin axioms to nonextensive systems and the uniqueness theorem for the nonextensive entropy, IEEE Trans. Information Theory, Vol.50(2004), pp.1783-1787.

[6] I.Csiszár, Axiomatic characterizations of information measures, Entropy, Vol.10(2008), pp.261-273.

[7] E.P.Borges and I.Roditi, A family of nonextensive entropies, Phys. Lett. A, Vol.246(1998), pp.399-402.

[8] G.Kaniadakis, M.Lissia and A.M.Scarfone, Deformed logarithms and entropies, Physica A, Vol.340(2004), pp.41-49.

[9] G.Kaniadakis, M.Lissia and A.M.Scarfone, Two-parameter deformations of logarithm, exponential, and entropy: A consistent framework for generalized statistical mechanics, Phys. Rev. E, Vol.71(2005), 046128.

[10] T.Wada and H.Suyari, A two-parameter generalization of Shannon-Khinchin axioms and the uniqueness theorem, Phys. Lett. A, Vol.368(2007), pp.199-205.
[11] A.Hobson, A new theorem of information theory, J. Stat. Phys., Vol.1(1969), pp.383-391.

[12] S.Furuichi, On uniqueness theorems for Tsallis entropy and Tsallis relative entropy, IEEE Trans. Information Theory, Vol.51(2005), pp.3638-3645.

[13] S.Furuichi, K.Yanagi and K.Kuriyama, Fundamental properties of Tsallis relative entropy, J. Math. Phys., Vol.45(2004), pp.4868-4877.

[14] S.Furuichi, Information theoretical properties of Tsallis entropies, J. Math. Phys., Vol.47(2006), 023302.

[15] S.Furuichi, On the maximum entropy principle and the minimization of the Fisher information in Tsallis statistics, J. Math. Phys., Vol.50(2009), 013303.

[16] I.Csiszár, Information measures: A critical survey, Transactions of the Seventh Prague Conference on Information Theory, Statistical Decision Functions, Random Processes and of the 1974 European Meeting of Statisticians, held at Prague, from August 18 to 23, 1974, Volume B, pp.73-86, ACADEMIA, Publishing House of the Czechoslovak Academy of Sciences, Prague, 1978. (D. Reidel Publishing Company, DORDRECHT: HOLLAND / BOSTON: U.S.A.)

[17] S.Amari, $\alpha$-Divergence Is Unique, Belonging to Both $f$-Divergence and Bregman Divergence Classes, IEEE Trans. Information Theory, Vol.55(2009), pp.4925-4931.
