Forecasting for stationary binary time series
The forecasting problem for a stationary and ergodic binary time series $\{X_n\}_{n=0}^{\infty}$ is to estimate the probability that $X_{n+1}=1$ based on the observations $X_i$, $0\le i\le n$ without prior knowledge of the distribution of the process…
Authors: ** Gusztáv Morvai, Benjamin Weiss **
Gusztáv Morvai and Benjamin Weiss: Forecasting for stationary binary time series. Acta Appl. Math. 79 (2003), no. 1-2, 25–34.

Abstract

The forecasting problem for a stationary and ergodic binary time series $\{X_n\}_{n=0}^{\infty}$ is to estimate the probability that $X_{n+1}=1$ based on the observations $X_i$, $0\le i\le n$, without prior knowledge of the distribution of the process $\{X_n\}$. It is known that this is not possible if one estimates at all values of $n$. We present a simple procedure which will attempt to make such a prediction infinitely often at carefully selected stopping times chosen by the algorithm. We show that the proposed procedure is consistent under certain conditions, and we estimate the growth rate of the stopping times.

1 Introduction

T. Cover [3] posed two fundamental problems concerning estimation for stationary and ergodic binary time series $\{X_n\}_{n=-\infty}^{\infty}$. (Note that a stationary time series $\{X_n\}_{n=0}^{\infty}$ can be extended to be a two-sided stationary time series $\{X_n\}_{n=-\infty}^{\infty}$.) Cover's first problem was on backward estimation.

Problem 1 Is there an estimation scheme $f_{n+1}$ for the value $P(X_1=1|X_{-n},\dots,X_0)$ such that $f_{n+1}$ depends solely on the observed data segment $(X_{-n},\dots,X_0)$ and

$$\lim_{n\to\infty}\left|f_{n+1}(X_{-n},\dots,X_0)-P(X_1=1|X_{-n},\dots,X_0)\right|=0$$

almost surely for all stationary and ergodic binary time series $\{X_n\}_{n=-\infty}^{\infty}$?

This problem was solved by Ornstein [13] by constructing such a scheme. (See also Bailey [2].) Ornstein's scheme is not a simple one and the proof of consistency is rather sophisticated. A much simpler scheme and proof of consistency were provided by Morvai, Yakowitz, Györfi [12]. (See also Weiss [18].) Cover's second problem was on forward estimation (forecasting).

Problem 2 Is there an estimation scheme $f_{n+1}$ for the value $P(X_{n+1}=1|X_0,\dots,X_n)$ such that $f_{n+1}$ depends solely on the data segment $(X_0,\dots,X_n)$ and

$$\lim_{n\to\infty}\left|f_{n+1}(X_0,\dots,X_n)-P(X_{n+1}=1|X_0,\dots,X_n)\right|=0$$

almost surely for all stationary and ergodic binary time series $\{X_n\}_{n=-\infty}^{\infty}$?

This problem was answered by Bailey [2] in a negative way, that is, he showed that there is no such scheme. (Also see Ryabko [16], Györfi, Morvai, Yakowitz [7] and Weiss [18].) Bailey used the technique of cutting and stacking developed by Ornstein [14] (see also Shields [17]). Ryabko's construction was based on a function of an infinite state Markov chain. This negative result can be interpreted as follows. Consider a market analyst whose task it is to predict the probability of the event 'the price of a certain share will go up tomorrow' given the observations up to the present day. Bailey's result says that the difference between the estimate and the true conditional probability cannot eventually be small for all stationary and ergodic market processes. The difference will be big infinitely often. These results show that there is a great difference between Problems 1 and 2. Problem 1 was addressed by Morvai, Yakowitz, Algoet [11] and a very simple estimation scheme was given which satisfies the statement in Problem 1 in probability instead of almost surely.

However, for the class of all stationary and ergodic binary Markov chains of some finite order Problem 2 can be solved. Indeed, if the time series is a Markov chain of some finite (but unknown) order, we can estimate the order (e.g. as in Csiszár, Shields [5]) and count frequencies of blocks with length equal to the order.

Let $\mathcal{X}^{*-}$ be the set of all one-sided binary sequences, that is,

$$\mathcal{X}^{*-}=\{(\dots,x_{-1},x_0): x_i\in\{0,1\}\ \text{for all}\ -\infty<i\le 0\}.$$
Let $d(\cdot,\cdot)$ be the Hamming distance (that is, for $x,y\in\{0,1\}$, $d(x,y)=0$ if and only if $x=y$ and $d(x,y)=1$ otherwise), and define the distance on sequences $(\dots,x_{-1},x_0)$ and $(\dots,y_{-1},y_0)$ as follows. Let

$$d^*((\dots,x_{-1},x_0),(\dots,y_{-1},y_0))=\sum_{i=0}^{\infty}2^{-i-1}d(x_{-i},y_{-i}). \tag{1}$$

(For details see Gray [6] p. 51.)

Definition 1 The conditional probability $P(X_1=1|\dots,X_{-1},X_0)$ is almost surely continuous if for some set $C\subseteq\mathcal{X}^{*-}$ which has probability one the conditional probability $P(X_1=1|\dots,X_{-1},X_0)$ restricted to this set $C$ is continuous with respect to the metric $d^*(\cdot,\cdot)$ in (1).

We note that from the proof of Ryabko [16] and Györfi, Morvai, Yakowitz [7] it is clear that even for the class of all stationary and ergodic binary time series with almost surely continuous conditional probability $P(X_1=1|\dots,X_{-1},X_0)$ one cannot solve Problem 2.

For $n\ge 1$, let the function $p_n(\cdot)$ be defined as

$$p_n(x_{-n+1},\dots,x_0)=P(X_{-n+1}=x_{-n+1},\dots,X_0=x_0) \tag{2}$$

where $x_{-i}\in\{0,1\}$ for $0\le i\le n-1$. The entropy rate $H$ associated with a stationary binary time series $\{X_n\}_{-\infty}^{\infty}$ is defined as

$$H=\lim_{n\to\infty}-\frac{1}{n}E\log_2 p_n(X_{-n+1},\dots,X_{-1},X_0).$$

We note that the entropy rate of a stationary binary time series always exists. For details cf. Cover, Thomas [4], pp. 63-64. Now we may pose our problem.

Problem 3 Is there a sequence of strictly increasing stopping times $\{\lambda_n\}$ with $\lambda_n\le 2^{n(H+\epsilon)}$ and an estimation scheme $f_n(X_0,\dots,X_{\lambda_n})$ which depends on the observed data segment $(X_0,\dots,X_{\lambda_n})$ such that

$$\lim_{n\to\infty}\left|f_n(X_0,\dots,X_{\lambda_n})-P(X_{\lambda_n+1}=1|X_0,\dots,X_{\lambda_n})\right|=0$$

almost surely for all stationary and ergodic binary time series $\{X_n\}_{n=-\infty}^{\infty}$ with almost surely continuous conditional probability $P(X_1=1|\dots,X_{-1},X_0)$?

It turns out that the answer is affirmative and such a scheme will be exhibited below. This result can be interpreted as if the market analyst can refrain from predicting, that is, he may say that he does not want to predict today, but will predict at infinitely many time instances, and not too rarely, since $\lambda_n\le 2^{n(H+\epsilon)}$, and the difference between the prediction and the true conditional probability will vanish almost surely at these stopping times. We note that the stationary processes with almost surely continuous conditional distribution generalize the processes for which the conditional distribution is actually continuous; these are essentially the Random Markov Processes of Kalikow [8], or the continuous g-measures studied by Mike Keane in [9]. Morvai [10] proposed a different estimator which is consistent on a certain stopping time sequence, but those stopping times grow like an exponential tower, which is unrealistic and much faster growth than the mere exponential one in Problem 3.

2 The Proposed Estimator

Let $\{X_n\}_{n=-\infty}^{\infty}$ be a stationary time series taking values from a binary alphabet $\mathcal{X}=\{0,1\}$. (Note that all stationary time series $\{X_n\}_{n=0}^{\infty}$ can be thought to be a two-sided time series, that is, $\{X_n\}_{n=-\infty}^{\infty}$.) Now we exhibit an estimator which is consistent on a certain stopping time sequence for a restricted class of stationary time series. For notational convenience, let $X_m^n=(X_m,\dots,X_n)$, where $m\le n$.

Define the stopping times as follows. Set $\zeta_0=0$. For $k=1,2,\dots$, define sequences $\eta_k$ and $\zeta_k$ recursively. Let

$$\eta_k=\min\left\{t>0: X_{\zeta_{k-1}-(k-1)+t}^{\zeta_{k-1}+t}=X_{\zeta_{k-1}-(k-1)}^{\zeta_{k-1}}\right\}\quad\text{and}\quad \zeta_k=\zeta_{k-1}+\eta_k.$$
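The recursion above can be sketched in Python for a finite observed sample. This is only an illustrative sketch: the function names `stopping_times` and `forecast` are hypothetical, the sample is assumed to be a list of 0/1 values starting at time 0, and the search simply stops when a block does not recur inside the sample. The second helper averages the bits that follow the stopping times $\zeta_0,\dots,\zeta_{k-1}$, which is the quantity the text denotes $g_k$ in (3).

```python
def stopping_times(x, k_max):
    """Return [zeta_0, ..., zeta_K] for the largest K <= k_max found in x.

    zeta_k is the first time after zeta_{k-1} at which the length-k block
    ending at zeta_{k-1} occurs again (as a block ending at zeta_k).
    """
    zetas = [0]  # zeta_0 = 0
    for k in range(1, k_max + 1):
        prev = zetas[-1]
        # Length-k block ending at zeta_{k-1}; prev >= k - 1 always holds.
        block = x[prev - (k - 1): prev + 1]
        for end in range(prev + 1, len(x)):
            if x[end - (k - 1): end + 1] == block:
                zetas.append(end)
                break
        else:
            break  # no recurrence inside the finite sample: stop early
    return zetas


def forecast(x, k):
    """Return (zeta_k, g_k), or None if zeta_k is not reached within x.

    g_k is the average of the bits observed right after zeta_0..zeta_{k-1}.
    """
    if k < 1:
        return None
    zetas = stopping_times(x, k)
    if len(zetas) <= k:
        return None
    g_k = sum(x[z + 1] for z in zetas[:k]) / k
    return zetas[k], g_k


if __name__ == "__main__":
    sample = [0, 1] * 8  # deterministic alternating path, for illustration
    print(stopping_times(sample, 3))  # [0, 2, 4, 6]
    print(forecast(sample, 3))        # (6, 1.0)
```

On the alternating sample every stopping time falls on a 0, the next bit is always 1, and the sketch returns the estimate 1.0. Returning `None` when the block does not recur mirrors the idea that the procedure refrains from predicting until the next stopping time is reached.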
One denotes the $k$th estimate of $P(X_{\zeta_k+1}=1|X_0^{\zeta_k})$ by $g_k$, and defines it to be

$$g_k=\frac{1}{k}\sum_{j=0}^{k-1}X_{\zeta_j+1}. \tag{3}$$

It will be useful to define other processes $\{\tilde X_n\}_{n=-\infty}^{0}$ and $\{\hat X_n^{(k)}\}_{n=-\infty}^{\infty}$ for $k\ge 0$ as follows. Let

$$\tilde X_{-n}=X_{\zeta_n-n}\ \text{for}\ n\ge 0,\quad\text{and}\quad \hat X_n^{(k)}=X_{\zeta_k+n}\ \text{for}\ -\infty<n<\infty. \tag{4}$$

For an arbitrary stationary binary time series $\{Y_n\}$, and for all $k\ge 1$ and $1\le i\le k$, define $\hat\zeta_0^k(Y_{-\infty}^0)=0$ and

$$\hat\eta_i^k(Y_{-\infty}^0)=\min\left\{t>0: Y_{\hat\zeta_{i-1}^k-(k-i)-t}^{\hat\zeta_{i-1}^k-t}=Y_{\hat\zeta_{i-1}^k-(k-i)}^{\hat\zeta_{i-1}^k}\right\}$$

and

$$\hat\zeta_i^k(Y_{-\infty}^0)=\hat\zeta_{i-1}^k(Y_{-\infty}^0)-\hat\eta_i^k(Y_{-\infty}^0).$$

When it is obvious on which time series $\hat\eta_i^k(Y_{-\infty}^0)$ and $\hat\zeta_i^k(Y_{-\infty}^0)$ are evaluated, we will use the notation $\hat\eta_i^k$ and $\hat\zeta_i^k$. Let $T$ denote the left shift operator, that is, $(Tx_{-\infty}^{\infty})_i=x_{i+1}$. It is easy to see that if $\zeta_k(x_{-\infty}^{\infty})=l$ then $\hat\zeta_k^k(T^l x_{-\infty}^{\infty})=-l$. We will need the next lemma for later use.

Lemma 1 Let $\{X_n\}_{n=-\infty}^{\infty}$ be a stationary binary process. Then the time series $\{\hat X_n^{(k)}\}_{n=-\infty}^{\infty}$, $\{\tilde X_n\}_{n=-\infty}^{0}$ and $\{X_n\}_{n=-\infty}^{\infty}$ have identical distribution. Thus all these time series are stationary, and $\{\tilde X_n\}_{n=-\infty}^{0}$ can be thought to be a two-sided stationary time series $\{\tilde X_n\}_{n=-\infty}^{\infty}$.

Let $k\ge 0$, $n\ge 0$, $m\ge 0$, $x_{m-n}^m\in\mathcal{X}^{n+1}$ be arbitrary. It is immediate that for $l\ge 0$,

$$T^l\left\{X_{\zeta_k+m-n}^{\zeta_k+m}=x_{m-n}^m,\ \zeta_k=l\right\}=\left\{X_{m-n}^m=x_{m-n}^m,\ \hat\zeta_k^k(X_{-\infty}^0)=-l\right\}. \tag{5}$$

First we prove that for $k\ge 0$,

$$P\left((\hat X_{m-n}^{(k)},\dots,\hat X_m^{(k)})=(x_{m-n},\dots,x_m)\right)=P(X_{m-n}^m=x_{m-n}^m).$$

By the construction in (4), the stationarity of the time series $\{X_n\}$, and (5) we have

$$\begin{aligned}
P\left((\hat X_{m-n}^{(k)},\dots,\hat X_m^{(k)})=(x_{m-n},\dots,x_m)\right)
&=P\left(X_{\zeta_k+m-n}^{\zeta_k+m}=x_{m-n}^m\right)\\
&=\sum_{l=0}^{\infty}P\left(X_{\zeta_k+m-n}^{\zeta_k+m}=x_{m-n}^m,\ \zeta_k=l\right)\\
&=\sum_{l=0}^{\infty}P\left(X_{m-n}^m=x_{m-n}^m,\ \hat\zeta_k^k(X_{-\infty}^0)=-l\right)\\
&=P(X_{m-n}^m=x_{m-n}^m).
\end{aligned}$$

Now we prove that $P(\tilde X_{-n}^0=x_{-n}^0)=P(X_{-n}^0=x_{-n}^0)$. By the construction in (4), the stationarity of the time series $\{X_n\}$, and (5) (with $m=0$) we have

$$\begin{aligned}
P(\tilde X_{-n}^0=x_{-n}^0)
&=P\left(X_{\zeta_n-n}^{\zeta_n}=x_{-n}^0\right)\\
&=\sum_{l=0}^{\infty}P\left(X_{\zeta_n-n}^{\zeta_n}=x_{-n}^0,\ \zeta_n=l\right)\\
&=\sum_{l=0}^{\infty}P\left(X_{-n}^0=x_{-n}^0,\ \hat\zeta_n^n(X_{-\infty}^0)=-l\right)\\
&=P(X_{-n}^0=x_{-n}^0).
\end{aligned}$$

The proof of the Lemma is complete.

Now we show the consistency of our estimate $g_k$ defined in (3).

Theorem 1 Let $\{X_n\}$ be a stationary binary time series. For the estimator defined in (3),

$$\lim_{k\to\infty}\left|g_k-P(X_{\zeta_k+1}=1|X_0^{\zeta_k})\right|=0$$

almost surely provided that the conditional probability $P(X_1=1|X_{-\infty}^0)$ is almost surely continuous. Moreover, under the same conditions,

$$\lim_{k\to\infty}g_k=\lim_{k\to\infty}P(X_{\zeta_k+1}=1|X_0^{\zeta_k})=P(\tilde X_1=1|\tilde X_{-\infty}^0)$$

almost surely.

Recalling (3) we can write

$$\begin{aligned}
g_k&=\frac{1}{k}\sum_{j=0}^{k-1}\left[X_{\zeta_j+1}-P(X_{\zeta_j+1}=1|X_{-\infty}^{\zeta_j})\right]+\frac{1}{k}\sum_{j=0}^{k-1}P(X_{\zeta_j+1}=1|X_{-\infty}^{\zeta_j})\\
&=\frac{1}{k}\sum_{j=0}^{k-1}\Gamma_j+\frac{1}{k}\sum_{j=0}^{k-1}P(X_{\zeta_j+1}=1|X_{-\infty}^{\zeta_j}).
\end{aligned} \tag{6}$$

Observe that $\{\Gamma_j,\sigma(X_{-\infty}^{\zeta_j+1})\}$ is a bounded martingale difference sequence for $0\le j<\infty$. To see this notice that $\sigma(X_{-\infty}^{\zeta_j+1})$ is monotone increasing, and $\Gamma_j$ is measurable with respect to $\sigma(X_{-\infty}^{\zeta_j+1})$, and $E(\Gamma_j|X_{-\infty}^{\zeta_{j-1}+1})=0$ for $0\le j<\infty$ (where you may define $\zeta_{-1}=-1$). Now apply Azuma's exponential bound for bounded martingale differences in Azuma [1] to get that for any $\epsilon>0$,

$$P\left(\left|\frac{1}{k}\sum_{j=0}^{k-1}\Gamma_j\right|>\epsilon\right)\le 2\exp(-\epsilon^2k/2).$$

After summing the right-hand side over $k$, and appealing to the Borel-Cantelli lemma for a sequence of $\epsilon$'s tending to zero, we get $\frac{1}{k}\sum_{j=0}^{k-1}\Gamma_j\to 0$ almost surely.
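To spell out the summability used in this step: for fixed $\epsilon>0$ the Azuma bound is a geometric sequence in $k$, so

$$\sum_{k=1}^{\infty}2\exp(-\epsilon^2k/2)=\frac{2e^{-\epsilon^2/2}}{1-e^{-\epsilon^2/2}}=\frac{2}{e^{\epsilon^2/2}-1}<\infty.$$

By the Borel-Cantelli lemma the event $\left|\frac{1}{k}\sum_{j=0}^{k-1}\Gamma_j\right|>\epsilon$ then occurs for only finitely many $k$ almost surely. Running this along a countable sequence such as $\epsilon=1/m$, $m=1,2,\dots$ (one possible choice, assumed here for illustration) gives the almost sure convergence to zero.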
Define the function $p:\mathcal{X}^{*-}\to[0,1]$ as $p(x_{-\infty}^0)=P(X_1=1|X_{-\infty}^0=x_{-\infty}^0)$. For arbitrary $j\ge 0$, by the construction in (4),

$$X_{\zeta_j-j}^{\zeta_j}=(\hat X_{-j}^{(j)},\dots,\hat X_0^{(j)})=\tilde X_{-j}^0$$

and

$$\lim_{j\to\infty}d^*\left(\tilde X_{-\infty}^0,(\dots,\hat X_{-1}^{(j)},\hat X_0^{(j)})\right)=0 \tag{7}$$

almost surely. By assumption, the function $p(\cdot)$ is continuous on a set $C\subseteq\mathcal{X}^{*-}$ with $P(X_{-\infty}^0\in C)=1$, and by the Lemma, $P(\tilde X_{-\infty}^0\in C)=1$, and for each $j\ge 0$, $P((\dots,\hat X_{-1}^{(j)},\hat X_0^{(j)})\in C)=1$, and finally,

$$P\left(\tilde X_{-\infty}^0\in C,\ (\dots,\hat X_{-1}^{(j)},\hat X_0^{(j)})\in C\ \text{for all}\ j\ge 0\right)=1.$$

By the Lemma, the construction in (4), the continuity of $p(\cdot)$ on the set $C$, and by (7),

$$P(X_{\zeta_j+1}=1|X_{-\infty}^{\zeta_j})=p(\dots,\hat X_{-1}^{(j)},\hat X_0^{(j)})\to p(\tilde X_{-\infty}^0)=P(\tilde X_1=1|\tilde X_{-\infty}^0)$$

and

$$\frac{1}{k}\sum_{j=0}^{k-1}P(X_{\zeta_j+1}=1|X_{-\infty}^{\zeta_j})\to P(\tilde X_1=1|\tilde X_{-\infty}^0)$$

almost surely. We have proved that $g_k\to P(\tilde X_1=1|\tilde X_{-\infty}^0)$ almost surely.

Now observe that by (1) and the continuity of $p(\cdot)$ on the set $C$, almost surely, for all $\epsilon>0$, there is a $J(\epsilon,\tilde X_{-\infty}^0)$ such that for all $z_{-\infty}^0\in C$, if $z_{-J}^0=\tilde X_{-J}^0$ then $|p(z_{-\infty}^0)-p(\tilde X_{-\infty}^0)|<\epsilon$. By (7), and since $\epsilon>0$ was arbitrary, almost surely,

$$\begin{aligned}
\lim_{j\to\infty}P(X_{\zeta_j+1}=1|X_0^{\zeta_j})
&=\lim_{j\to\infty}E\left\{P(X_{\zeta_j+1}=1|X_{-\infty}^{\zeta_j})\,\middle|\,X_0^{\zeta_j}\right\}\\
&=\lim_{j\to\infty}E\left\{p(X_{-\infty}^{\zeta_j})\,\middle|\,X_0^{\zeta_j}\right\}\\
&=p(\tilde X_{-\infty}^0)=P(\tilde X_1=1|\tilde X_{-\infty}^0).
\end{aligned}$$

The proof of Theorem 1 is complete.

Remark. We note that for all stationary binary time series, the estimation scheme described above is consistent in probability. This may be seen as follows:

$$\begin{aligned}
E\left|g_k-P(X_{\zeta_k+1}=1|X_0^{\zeta_k})\right|
&\le E\left|\frac{1}{k}\sum_{j=0}^{k-1}\left[X_{\zeta_j+1}-P(X_{\zeta_j+1}=1|X_{-\infty}^{\zeta_j})\right]\right|\\
&\quad+\frac{1}{k}\sum_{j=0}^{k-1}E\left|P(\hat X_1^{(j)}=1|\dots,\hat X_{-1}^{(j)},\hat X_0^{(j)})-P(\hat X_1^{(j)}=1|\hat X_{-j}^{(j)},\dots,\hat X_0^{(j)})\right|\\
&\quad+E\left|\frac{1}{k}\sum_{j=0}^{k-1}P(\hat X_1^{(k)}=1|\hat X_{-j}^{(k)},\dots,\hat X_0^{(k)})-P(\hat X_1^{(k)}=1|\hat X_{\hat\zeta_k^k}^{(k)},\dots,\hat X_0^{(k)})\right|,
\end{aligned}$$

where we used (7) and the Lemma. The first term converges to zero since $X_{\zeta_j+1}-P(X_{\zeta_j+1}=1|X_{-\infty}^{\zeta_j})$ is a martingale difference sequence with respect to $\sigma(X_{-\infty}^{\zeta_j+1})$ and an average of bounded martingale differences converges to zero almost surely, cf. Azuma [1]. Applying (4), (7) and the Lemma, the sum of the last two terms can be estimated by the sum

$$\frac{1}{k}\sum_{j=0}^{k-1}E\left|P(X_1=1|X_{-\infty}^0)-P(X_1=1|X_{-j}^0)\right|+E\left|\frac{1}{k}\sum_{j=0}^{k-1}P(X_1=1|X_{-j}^0)-P(X_1=1|X_{\hat\zeta_k^k}^0)\right|$$

and both terms converge to zero since by the martingale convergence theorem $\lim_{j\to\infty}P(X_1=1|X_{-j}^0)=P(X_1=1|X_{-\infty}^0)$ almost surely, and thus the limit in fact exists and equals zero.

Next we will give some universal estimates for the growth rate of the stopping times $\zeta_k$ in terms of the entropy rate of the process. This is natural since the $\zeta_k$ are defined by recurrence times for blocks of length $k$, and these are known to grow exponentially with the entropy rate. (Cf. Ornstein and Weiss [15].)

Theorem 2 Let $\{X_n\}$ be a stationary and ergodic binary time series. Then for arbitrary $\epsilon>0$,

$$\zeta_k<2^{k(H+\epsilon)}$$

eventually almost surely, where $H$ denotes the entropy rate associated with the time series $\{X_n\}$.

Let $\mathcal{X}^*$ be the set of all two-sided binary sequences, that is,

$$\mathcal{X}^*=\{(\dots,x_{-1},x_0,x_1,\dots): x_i\in\{0,1\}\ \text{for all}\ -\infty<i<\infty\}.$$

Define $B_k\subseteq\{0,1\}^k$ as

$$B_k=\left\{x_{-k+1}^0\in\{0,1\}^k: 2^{-k(H+0.5\epsilon)}<p_k(x_{-k+1}^0)\right\},$$

where $p_k(\cdot)$ is as in (2). Note that there is a trivial bound on the cardinality of the set $B_k$, namely,

$$|B_k|\le 2^{k(H+0.5\epsilon)}. \tag{8}$$

By the Lemma, the distribution of the time series $\{\tilde X_n\}$ is the same as the distribution of $\{X_n\}$ and by the Shannon-McMillan-Breiman Theorem (cf. Cover, Thomas [4], p. 475),

$$P\left(\bigcup_{k=1}^{\infty}\bigcap_{i\ge k}\{\tilde X_{-i+1}^0\in B_i\}\right)=1. \tag{9}$$

Define the set $Q_k(y_{-k+1}^0)$ as follows:

$$Q_k(y_{-k+1}^0)=\left\{z_{-\infty}^{\infty}\in\mathcal{X}^*: -\hat\zeta_k^k(z_{-\infty}^0)\ge 2^{k(H+\epsilon)},\ z_{-k+1}^0=y_{-k+1}^0\right\}.$$

We will estimate the probability of $Q_k(y_{-k+1}^0)$ by means of the ergodic theorem. Let $x_{-\infty}^{\infty}\in\mathcal{X}^*$ be a typical sequence of the time series $\{X_n\}$. Define $\alpha_0(y_{-k+1}^0)=0$ and for $i\ge 1$ let

$$\alpha_i(y_{-k+1}^0)=\min\left\{l>\alpha_{i-1}(y_{-k+1}^0): T^{-l}x_{-\infty}^{\infty}\in Q_k(y_{-k+1}^0)\right\}.$$

Define also $\beta_0(y_{-k+1}^0)=0$ and for $i\ge 1$ let

$$\beta_i(y_{-k+1}^0)=\min\left\{l>\beta_{i-1}(y_{-k+1}^0)+2^{k(H+\epsilon)}: T^{-l}x_{-\infty}^{\infty}\in Q_k(y_{-k+1}^0)\right\}.$$

Observe that for arbitrary $l>0$,

$$\sum_{j=1}^{\infty}1_{\{\beta_{l-1}(y_{-k+1}^0)<\alpha_j(y_{-k+1}^0)\le\beta_l(y_{-k+1}^0)\}}\le k+1.$$

By the Lemma and the ergodicity of the time series $\{X_n\}$,

$$\begin{aligned}
P\left((\dots,\hat X_{-1}^{(k)},\hat X_0^{(k)},\hat X_1^{(k)},\dots)\in Q_k(y_{-k+1}^0)\right)
&=P\left(X_{-\infty}^{\infty}\in Q_k(y_{-k+1}^0)\right)\\
&=\lim_{t\to\infty}\frac{1}{\beta_t(y_{-k+1}^0)}\sum_{j=1}^{\infty}1_{\{\alpha_j(y_{-k+1}^0)\le\beta_t(y_{-k+1}^0)\}}\\
&=\lim_{t\to\infty}\frac{1}{\beta_t(y_{-k+1}^0)}\sum_{l=1}^{t}\sum_{j=1}^{\infty}1_{\{\beta_{l-1}(y_{-k+1}^0)<\alpha_j(y_{-k+1}^0)\le\beta_l(y_{-k+1}^0)\}}\\
&\le\lim_{t\to\infty}\frac{t(k+1)}{t\,2^{k(H+\epsilon)}}=\frac{k+1}{2^{k(H+\epsilon)}}.
\end{aligned} \tag{10}$$

By the construction in (4), $-\hat\zeta_k^k(\dots,\hat X_{-1}^{(k)},\hat X_0^{(k)})=\zeta_k(X_0^{\infty})$ and $(\hat X_{-k+1}^{(k)},\dots,\hat X_0^{(k)})=\tilde X_{-k+1}^0$, and by the upper bound on the cardinality of the set $B_k$ in (8) and by (10), we get

$$\begin{aligned}
P\left(\zeta_k(X_0^{\infty})\ge 2^{k(H+\epsilon)},\ \tilde X_{-k+1}^0\in B_k\right)
&=P\left(-\hat\zeta_k^k(\dots,\hat X_{-1}^{(k)},\hat X_0^{(k)})\ge 2^{k(H+\epsilon)},\ \tilde X_{-k+1}^0\in B_k\right)\\
&=P\left(-\hat\zeta_k^k(\dots,\hat X_{-1}^{(k)},\hat X_0^{(k)})\ge 2^{k(H+\epsilon)},\ (\hat X_{-k+1}^{(k)},\dots,\hat X_0^{(k)})\in B_k\right)\\
&=\sum_{y_{-k+1}^0\in B_k}P\left((\dots,\hat X_{-1}^{(k)},\hat X_0^{(k)},\hat X_1^{(k)},\dots)\in Q_k(y_{-k+1}^0)\right)\\
&\le(k+1)2^{-k\,0.5\epsilon}.
\end{aligned}$$
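As a short check that this last bound is summable in $k$, write $q=2^{-0.5\epsilon}\in(0,1)$ (a symbol introduced here only for this computation). Using $\sum_{k=0}^{\infty}(k+1)q^k=(1-q)^{-2}$,

$$\sum_{k=1}^{\infty}(k+1)2^{-k\,0.5\epsilon}=\sum_{k=1}^{\infty}(k+1)q^k=\frac{1}{(1-q)^2}-1<\infty,$$

so the Borel-Cantelli lemma applies to the events $\{\zeta_k(X_0^{\infty})\ge 2^{k(H+\epsilon)},\ \tilde X_{-k+1}^0\in B_k\}$.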
The right-hand side sums, the Borel-Cantelli Lemma and the Shannon-McMillan-Breiman Theorem in (9) together yield that $\zeta_k<2^{k(H+\epsilon)}$ eventually almost surely and Theorem 2 is proved.

References

[1] K. Azuma, "Weighted sums of certain dependent random variables," Tohoku Mathematical Journal, vol. 37, pp. 357–367, 1967.
[2] D. H. Bailey, Sequential Schemes for Classifying and Predicting Ergodic Processes. Ph.D. thesis, Stanford University, 1976.
[3] T. M. Cover, "Open problems in information theory," in 1975 IEEE Joint Workshop on Information Theory, pp. 35–36. New York: IEEE Press, 1975.
[4] T. M. Cover and J. Thomas, Elements of Information Theory, Wiley, 1991.
[5] I. Csiszár and P. Shields, "The consistency of the BIC Markov order estimator," Annals of Statistics, vol. 28, pp. 1601–1619, 2000.
[6] R. M. Gray, Probability, Random Processes, and Ergodic Properties. Springer-Verlag, New York, 1988.
[7] L. Györfi, G. Morvai, and S. Yakowitz, "Limits to consistent on-line forecasting for ergodic time series," IEEE Transactions on Information Theory, vol. 44, pp. 886–892, 1998.
[8] S. Kalikow, "Random Markov processes and uniform martingales," Israel Journal of Mathematics, vol. 71, pp. 33–54, 1990.
[9] M. Keane, "Strongly mixing g-measures," Invent. Math., vol. 16, pp. 309–324, 1972.
[10] G. Morvai, "Guessing the output of a stationary binary time series," in Foundations of Statistical Inference (Shoresh), pp. 207–215, Contrib. Statist., Physica, Heidelberg, 2003.
[11] G. Morvai, S. Yakowitz, and P. Algoet, "Weakly convergent nonparametric forecasting of stationary time series," IEEE Transactions on Information Theory, vol. 43, pp. 483–498, 1997.
[12] G. Morvai, S. Yakowitz, and L. Györfi, "Nonparametric inferences for ergodic, stationary time series," Annals of Statistics, vol. 24, pp. 370–379, 1996.
[13] D. S. Ornstein, "Guessing the next output of a stationary process," Israel Journal of Mathematics, vol. 30, pp. 292–296, 1978.
[14] D. S. Ornstein, Ergodic Theory, Randomness, and Dynamical Systems. Yale University Press, 1974.
[15] D. S. Ornstein and B. Weiss, "Entropy and data compression schemes," IEEE Transactions on Information Theory, vol. 39, pp. 78–83, 1993.
[16] B. Ya. Ryabko, "Prediction of random sequences and universal coding," Problems of Inform. Trans., vol. 24, pp. 87–96, Apr.–June 1988.
[17] P. C. Shields, "Cutting and stacking: a method for constructing stationary processes," IEEE Transactions on Information Theory, vol. 37, pp. 1605–1614, 1991.
[18] B. Weiss, Single Orbit Dynamics, American Mathematical Society, 2000.