Covariance estimation for multivariate conditionally Gaussian dynamic linear models
In multivariate time series, the estimation of the covariance matrix of the observation innovations plays an important role in forecasting as it enables the computation of the standardized forecast error vectors as well as it enables the computation …
Authors: ** K. Triantafyllopoulos (University of Sheffield, 영국) **
Co v ariance Estimatio n for Multiv ariate Conditionally Gauss ian Dynamic Linear Mo dels K. T rian ta fyllop oulos ∗ 1 Marc h 2006 Abstract In m ultiv ariate time series , the estimation of the cov ar iance matrix of the observ ation innov ations plays an impo rtant role in fore c asting as it enables the computation o f the standardized foreca st err or vectors as w ell as it enables the computation of c o nfidence bo unds of the for ecasts. W e develop an o n-line, non-itera tive Bay esian algo rithm for estimation and forecasting. It is empirica lly found that, for a range of sim ulated time series, the prop osed cov ar iance es timator ha s go o d p er formance converging to the true v a lues of the unkno wn observ ation c ov ariance matrix. Over a simulated time s eries, the new metho d approximates the co rrect estima tes , pro duced by a non-sequential Monte Carlo simulation pro cedure, which is used here a s the g old standard. The sp ecia l, but impo rtant, v ector autoregres sive (V AR) and time-v a rying V AR mo dels are illustrated b y considering London metal exchange data consisting of sp ot prices of aluminium, copp er, lead and zinc. Some key wor ds: Multiv ariate time series, dynamic linear mo del, Kalman filter, vector autoregr essive mo del, London metal exchange. In tro duction Multiv ariate time series receiv e considerable atten tion b ecause a great deal of time series data arriv e in v ector form. Whittle (1984) and L ¨ utk ep ohl (1993) discuss V ARMA mo dels for vec tor resp onses, whilst Harvey (1989, Ch apter 8), W est and Harrison (1997, C hapter 16) and Durbin and Ko opman (2001, Chapter 3) extend this work to state space mo dels for observ ation vecto rs. In econometrics most stu d ies of s tate s p ace m o dels fo cus on trend estimation, signal extraction and v olatilit y . A review of recen t dev elopment s of state space mo dels in econometrics can b e found in Pol lo ck (2003). Barassi et al. (2005) and Gra vel le and Morley (2005) giv e applications of the Kalman filter to inte rest r ates data and Harv ey et al. (1994 ) u se Kalman filter tec hniques to estimate the v olatilit y of foreign exc hange rates usin g m ultiv ariate sto c hastic v olatilit y (MSV) mod els. With the exception of multiv ariate GAR CH and MS V m o dels, wh ic h fo cus on the prediction of the v olatilit y , it is usually desirable to use a structur al state sp ace mo del to forecast time series v ectors (e.g. foreign exc hange rates, monthly s ales, in terest rates, etc) and to estimate the observ atio n innov ation co v ariance matrix of the und erlying time series. F or such applications and for short term forecasting the ∗ Department of Probability and St atistics, Universit y of S heffield, UK, email: k.triantaf yllopoulos@shef field.ac.uk 1 ab o v e co v ariance matrix can b e assumed time-in v arian t, but unkn own, and its estimation is the m ain aim of this pap er. The estimation of the observ ation co v ariance matrix pla ys an imp ortant role in forecasting. Firstly we note that, un der the general multiv ariate dynamic linear mo del (see equ ation (1) b elo w), the multi-step forecast mean of th e r esp onse time series v ector is a non-linear fu n ction of the observ ation co v ariance matrix (W est and Harrison, 1997, Chapter 16). Secondly , th e computation of the standardized forecast error ve ctors r equires a pr ecise estimation of the observ ation co v ariance matrix and thus a miss-sp ecification of the observ ation cov ariance matrix can lead to false results regarding the ev aluation and judgement of the mo d el. Thirdly , the m ulti-step forecast co v ariance matrix is a linear fun ction of the obs er v ation co v ariance matrix and the former is of particular interest; the f orecast co v ariance matrix can explain th e v ariabilit y of th e forecasts and hence it can enable the computation of confid ence b ound s for the forecasts. Finally , the pr ecise estimation of the observ ation co v ariance matrix give s an accurate estimation of the cross-correlation str u cture of the s everal comp onent time series, whic h is particularly us efu l, esp ecially for fi nancial time series. F or all the ab o v e reasons th e study of the estimation of the observ ation co v ariance matrix is w orthwhile and its con tribu tion to forecasting for m ultiv ariate time s er ies is paramount. The problem of the estimation of the observ ation inno v ation v ariance for univ ariate state space mo dels has b een well rep orted (W est and Harr ison, 1997, § 4.5; Durb in and Ko opm an, 2001, § 2.10), how ever, for v ector time series this pr oblem b ecomes considerably more com- plex and the a v ailable metho d ology consists of sp ecial cases, approxi mations and iterativ e pro cedur es. Let y t b e a p -dimensional observ ation vecto r follo w ing the Gaussian dynamic linear mo del (DLM): y t = F ′ θ t + ǫ t and θ t = Gθ t − 1 + ω t , (1) where θ t is a d -dimens ional Marko vian state v ector, F is a known d × p design matrix and G a known d × d transition matrix. The notation F ′ is used for the transp ose matrix of F . The distributions u sually adopted for { ǫ t } , { ω t } and θ 0 are th e multiv ariate Gaussian, i.e. ǫ t ∼ N p (0 , Σ), ω t ∼ N d (0 , Ω) and θ 0 ∼ N d ( m 0 , P 0 ), for some kno wn priors m 0 and P 0 . The innov ation ve ctors { ǫ t } and { ω t } are assum ed individually and mutually uncorrelated and they are also assumed uncorrelated with the initial state ve ctor θ 0 , i.e. for all t 6 = s : E ( ǫ t ǫ ′ s ) = 0, E ( ω t ω ′ s ) = 0, and f or all t, s > 0: E ( ǫ t ω ′ s ) = 0, E ( ǫ t θ ′ 0 ) = 0 and E ( ω t θ ′ 0 ) = 0, where E ( · ) denotes exp ectation. The cov ariance matrices Σ and Ω are t ypically unknown and th eir estimation or sp ecificatio n is a we ll k n o wn problem. The inte rest is cen tered on the estimation of Σ, wh ile Ω can b e sp ecified a priori (W est and Harrison, 1997, Chapter 6; Durbin and Ko opm an, 2001, § 3.2.2) . Sev eral metho ds h a v e b een prop osed, for the estimation of Σ. Harv ey (1986) and Q uin- tana and W est (1987) indep enden tly in tro duce matrix-v ariate DLMs, whic h are m atrix-v ariate linear state space m o dels allo wing for cov ariance estimation. Harv ey (1986) pr op oses a like - liho o d estimator, while Quinta na and W est (1987) prop ose a Ba ye sian estimation mo delling Σ with an inv erted Wishart distribution. Harvey (1986)’s mo d el is rep orted and furth er de- v elop ed in Harv ey (1989 ), F ern ´ andez and Harv ey (1990), Harve y and K o opman (1997) and Moauro and Sa vio (2005 ), w hile Quintana and W est (1987)’s mo del is rep orted and fu rther dev elop ed in Qu in tana and W est (1988), Queen and Smith (1992), W est and Harrison (1997) , Salv ador et al. (2003), Salv ador and Gargalo (2004) and Salv ador et al. (2004). Ho w ev er, b oth suggestions (Harv ey (1989)’s and Quinta na an d W est (1987)’s) are criticize d in Barb osa 2 and Harrison (199 2) wh ere it is sho wn that the ab o v e mo d els are restrictiv e in the sense th at one can decomp ose the r esp onse vecto r y t in to sev eral scalar time series and mo del eac h of these time series individually , us in g univ ariate DLMs. Barb osa and Harrison (1992) prop ose an appr o ximate algorithm for th e general DLM (1), bu t their main assu m ption seems rather unjus tifi ed , since it su ggests that for any p × p matrix C it is Σ 1 / 2 C Σ − 1 / 2 = b Σ 1 / 2 C b Σ − 1 / 2 , where b Σ is a p oin t estimate of Σ and the notation Σ 1 / 2 stands for the symmetric square ro ot of Σ (Gupta and Nagar, 1999, p . 7). T his assum ption holds clearly w h en b Σ 1 / 2 , C comm ute and when Σ 1 / 2 = Σ ∗ , C comm ute, wh ere (Σ ∗ ) 2 is any particular realizati on of Σ. How ever, in general the ab ov e assum ption is difficult to c hec k since Σ is the u nknown co v ariance ma- trix s u b ject to estimation. In addition, that assumption seems to b e prob ab ilistically quite inappropriate, since it translates that the non-sto c hastic qu antit y b Σ 1 / 2 C b Σ − 1 / 2 equals the sto c hastic q u an tit y Σ 1 / 2 C Σ − 1 / 2 with probabilit y 1. A p ossible analysis can b e obtained in sp ecial cases where Σ is diagonal or when th e off-diagonal elemen ts of Σ are all common. T ri- an tafyllop oulos and Piko ulas (2002) and T rian tafyllop oulos (2006) adopt the mo d el of Harve y (1986 ) and they p r o vide an improv ed on-line estimator for Σ based on a standard maxim um lik eliho o d tec hnique. T he problem is again that th e mo dels discussed lac k the general for- m ulation of the state space m o del (1); e.g. one can easily s h o w th at all ab o v e mo dels are sp ecial cases of mo d el (1). Iterativ e pro cedu res via maxim um lik eliho o d and Marko v chain Mon te Carlo (MCMC) tec hniqu es are a v ailable, b ut they tend to b e slo w, esp ecially as the dimension of th e observ ation ve ctor p in cr eases. Kitaga w a and Gersc h (1996), Shum w a y and Stoffer (2000, Chapter 4), Durbin and Ko opm an (2001, C hapter 7) an d Doucet et al. (2001) discuss univ ariate mo delling w ith iterativ e m etho ds, b ut their efficiency in m ultiv ariate time series is not ye t explored. Barb osa and Harrison (1992) and W est and Harrison (1997, § 16.2.3 ) discuss the problem of inefficiency of iterativ e metho ds and they p oin t out that the num b er of parameters to b e estimated in Σ is p ( p + 1) / 2 , which r apidly increases with the d imension p of the resp ons e v ector, e.g. for p = 10 there are 55 distinct parameters in Σ to b e estimated. In add ition to th is Dic key et al. (198 6) discu ss relev an t issu es on sp ecifying and assessing the prior distribution of Σ p oin ting out d ifficulties in the implementat ion of iterativ e pro cedures. In this pap er we prop ose a new non-iterativ e Ba y esian pro cedur e for estimating Σ and for forecasting y t . This pro cedure offers a n ov el estimator of Σ for the general DLM (1). The prop osed estimator is empirically found to conv erge to the true v alue of Σ and th is estimator appro ximates we ll the resp ectiv e estimators in the sp ecial cases of the conjugate univ ariate and matrix-v ariate DLMs. A comparison with a n on-sequen tial Mon te Carlo sim ulation sho ws that the n ew metho d pro du ces estimates close to the MCMC. The fo cus and the b enefit emplo ying the new metho d is on on-line estimation and therefore n o attempt has b een made to compare the prop osed algorithms with sequentia l iterativ e pro cedur es. The reason for this is jus tifi ed by the ab o v e discussion and the in terested reader sh ould refer to Dic k ey et al. (1986 ) and W est and Harrison (1997, § 16.2 .3). The p rop osed forecasting pro cedure for mo d el (1) is applied to the imp ortan t mo del su b classes of vecto r autoregressiv e (V AR) and V AR with time-dep endent parameters. T hese mo dels are illustrated b y considering Lond on metal exc hange data, consisting of sp ot prices of aluminiu m, copp er, lead and zinc (W atkins and McAleer, 2004). W e b egin by d evelo ping the main idea of the pap er and giving the prop osed algorithm. The p erformance of this algorithm is illus tr ated in the f ollo wing section by considering sim ulated time series data; a comparison with a Mon te Carlo simula tion is p erformed. The p ro ceeding section give s an application to v ector autoregressiv e mo delling, which is used to analyze London m etal exchange data, in the follo wing section. The app endix d etails a pro of of a 3 theorem in the pap er and it describ es the MCMC s imulation pro cedu re. Main Results Denote w ith y t = ( y 1 , y 2 , . . . , y t ) the inf orm ation set comprising data up to time t , for some p ositiv e in teger t > 0. Let m t and P t b e the p osterior mean and co v ariance matrix of θ t | y t and S t b e th e p osterior exp ectatio n of Σ, i.e. E (Σ | y t ) = S t . Let y t (1) = E ( y t +1 | y t ) = F ′ Gm t b e the one-step forecast mean at time t and Q t +1 = V ar ( y t +1 | y t ) = F ′ R t +1 F + S t b e the one-step forecast co v ariance matrix at t , where R t +1 = GP t G ′ + Ω. Up on observing y t +1 , we define the one-step forecast error v ector as e t +1 = y t +1 − y t (1). The next result (pro v ed in the ap p endix) giv es an approximat e prop ert y of S t . Theorem 1. Consider the dynamic line ar mo del (1). L et Σ b e the c ovarianc e matrix of the observation innovation ǫ t and assume that lim t →∞ S t = Σ , wher e E (Σ | y t ) = S t is the true p osterior me an of Σ given y t . L e t n 0 b e a p ositive sc alar and S 0 = E (Σ) b e the prior exp e ctation of Σ . If Σ is b ounde d, then for lar ge t the fol lowing holds appr oximately S t = 1 n 0 + t n 0 S 0 + t X i =1 S 1 / 2 i − 1 Q − 1 / 2 i e i e ′ i Q − 1 / 2 i S 1 / 2 i − 1 ! , (2) wher e e i , Q i ar e define d ab ove and S 1 / 2 i − 1 , Q − 1 / 2 i denote r esp e c tively the symmetric squar e r o ots of the matric es S i − 1 , Q − 1 i b ase d on the sp e ctr al de c omp osition factorization of symmetric p ositive definite matric es ( i = 1 , 2 , . . . , t ) . Conditionally n o w on Σ = S , for a particular v alue S , we can apply the Kalman filter to the DLM (1) and obtain the p osterior and predictiv e distributions of θ t | Σ = S, y t and y t + h | Σ = S, y t , for a p ositive in teger h > 0, kn o wn as the forecast horizon. Theorem 1 motiv ates approximat ing the true p osterior mean S t b y S = e S t , whic h is pr o duced from application of equation (2), give n a particular data set y t = ( y 1 , y 2 , . . . , y t ). T hus w e obtain the f ollo wing algorithm: Algorithm 1. (a) Prior distribution at time t = 0 : θ 0 | Σ = e S 0 ∼ N d ( e m 0 , e P 0 ) , f or some e m 0 , e P 0 and e S 0 . (b) Posterior distribution at time t : θ t | Σ = e S t , y t ∼ N d ( e m t , e P t ) , wher e e e t = y t − e y t − 1 (1) and e m t = G e m t − 1 + A t e e t , e P t = G e P t − 1 G ′ + Ω − A t e Q t A ′ t , A t = ( G e P t − 1 G ′ + Ω) F e Q − 1 t , e S t = 1 n 0 + t n 0 e S 0 + t X i =1 e S 1 / 2 i − 1 e Q − 1 / 2 i e e i e e ′ i e Q − 1 / 2 i e S 1 / 2 i − 1 ! . (c) h -step for e c ast distribution at t : y t + h | Σ = e S t , y t ∼ N p { e y t ( h ) , e Q t ( h ) } , wher e e y t ( h ) = F ′ G h e m t and e Q t ( h ) = F ′ G h e P t ( G h ) ′ F + h − 1 X i =0 F ′ G i Ω( G i ) ′ F + e S t . 4 In the sp ecial case of matrix-v ariate DLMs (Harvey , 1986; W est and Harrison, 1997, § 16.4) the estimator S t appro ximates the true p osterior mean of Σ pro du ced by an application of Ba y es’ theorem, assuming a prior inv erted Wishart d istribution f or Σ. T o see this, n ote that in the matrix-v ariate DLM (this mo del is b r iefly in page 15, s ee equ ation (8)), F is a d -dimensional d esign v ector and Q t = U t S t − 1 with U t = F ′ R t F + 1 an d so equation (2) can b e w ritten recurs ively as S t = n − 1 t ( n t − 1 S t − 1 + e t e ′ t /U t ) and n t = n t − 1 + 1 = n 0 + t. (3) It is easy to v erify that th e assumption lim t →∞ S t = Σ is satisfied, since lim t →∞ S t = lim t →∞ E (Σ | y t ) and lim t →∞ V ar { v ec h (Σ) | y t } = 0, where v ec h ( · ) den otes the column stac king op erator of a lo wer p ortion of a symmetric matrix. F or p = 1 the matrix-v ariate DLM is reduced to th e conjugate Gaussian/gamma DLM (W est and Harrison, 1997, § 4.5). It turn s out that the estimator S t of equation (2) appro ximates the analogous estimators of all existing conjugate Gaussian dyn amic linear mo dels. It is w orth noting that Theorem 1 and Algorithm 1 h a v e b een presen ted for the state space mo del (1) having time-in v ariant comp onents F , G and Ω. How ever, these results apply if some or all of the ab o v e comp onents c hange with time. In addition, if the ev olution co v ariance matrix Ω t is time-dep endent, it can b e sp ecified via d iscoun t f actors (W est and Harrison, 1997, C hapter 6). This is a u s eful consideration, b ecause in pr actice th e signal θ t is un lik ely to hav e the same v ariabilit y o v er time. F or the applicatio n of Algorithm 1 the initial v alues e m 0 , e P 0 , n 0 and e S 0 m ust b e sp ecified. e m 0 can b e sp ecified fr om h istorical information from th e u nderlying exp erimen t and e P 0 can b e set as a typica lly large diagonal matrix, e.g. e P 0 = 1000 I p , reflecting a lo w pr ecision (or high uncertain t y) on the s p ecification of th e moments of θ 0 . Th e scalar n 0 can b e s et to n 0 = 1 (in the sp ecial case of matrix-v ariate DLMs, n 0 is the prior degrees of freedom). e S 0 is a p rior estimate of Σ and requires at least a rough sp ecification. As information is deflated in time series, a miss-sp ecification of e S 0 ma y not affect muc h the p osterior estimate e S t , esp ecially in the pr esence of large data sets. Ho w ev er, in man y cases and esp ecially in financial time series, a miss -sp ecification of e S 0 can lead to p o or estimates of Σ . Here we suggest that a diagonal co v ariance matrix can b e used , where th e d iagonal elemen ts of e S 0 reflect th e empirical exp ecta tion of the diagonal elemen ts of Σ. This exp ectat ion can b e obtained b y stud ying historical data and other qu alitativ e pieces of information, whic h are usually av ailable to p racticing exp er ts of the exp erimen t or of the application of in terest. Sim ulation Studies Empirical Conv ergence of e S t W e hav e generated 1000 b iv ariate time ser ies { y it } t =1 , 2 ,..., 500 i =1 , 2 ,..., 1000 from several state space mo dels and then w e ha v e a v eraged the 1000 estimates e S i,t (pro du ced by eac h of the 1000 time series) and compared the a v erage e S t = 1000 − 1 P 1000 i =1 e S i,t with the true v alue of Σ . Since in p ractice complicated mo d els are decomp osed into simple mo dels comprisin g local lev el, p olynomial trend and seasonal comp onents (Go dolphin and T r ian tafyllop oulos, 2006), w e consider estimation separately in suc h different comp onent mo d els. W e h a v e three m o d- elling s ituations of in terest: situation 1 (biv ariate lo cal lev el mo d els); situation 2 (biv ariate linear tr end mo d els); and situation 3 (biv ariate seasonal mo dels). F or eac h of the ab ov e 5 three situations we ha v e generated 1000 biv ariate time series, eac h of length 500, us in g three differen t co v ariance m atrices Σ, i.e. Σ 1 = 2 3 3 5 , Σ 2 = 100 8 5 85 80 and Σ 3 = 1 7 7 50 . Throughout the simulat ions we ha ve chosen h igh correlations for eac h Σ i ( i = 1 , 2 , 3), since uncorrelated or approximate ly uncorrelated state space mo dels can b e hand led easily by emplo ying several u niv ariate state space m o dels. Th e priors of Σ i are c hosen as e S (1) i, 0 = I 2 , e S (2) i, 0 = 150 I 2 and e S (3) i, 0 = diag { 3 , 40 } ( i = 1 , 2 , . . . , 1000). The d iagonal choic e for th e p r iors e S ( j ) i, 0 has b een don e for: (a) op erational simplicit y (the user is likely to exp ect rough v alues for the diagonal elements of Σ i , r ather than for the asso ciated correlations) and (b) jud ging ho w the estimation of Σ i is affected by imp rop er priors in the s ense of setting the off-diag onal elemen ts of e S ( j ) i, 0 to zero, while the true v alues of Σ i p osses high correlations. Thr ou gh ou t the mo dels the remaining settings are n 0 = 1, Ω = I 2 , m 0 = [0 0] ′ and P 0 = 1000 I 2 , for all mo dels. T able 1 sh o ws the r esu lts. There are three blo c ks of columns, eac h sh o wing resu lts of the state space mo del considered, namely lo cal lev el mo del (LL or b lo c k 1), linear trend mo del (L T or b lo c k 2) and seasonal mo del (SE or b lo c k 3). In eac h blo c k the fir st column sho ws th e mean of the a v erage S = 500 − 1 P 500 t =1 e S t of all e S t . T he second column sho ws the a ve rage e S 100 at time p oin t t = 100. Lik ewise the third column s h o ws the r esp ectiv e e S 500 a v eraged ov er all 1000 series. Th e r ows in T able 1 sho w the picture of e S t o v er the three different v alues of Σ, e.g. Σ 1 , Σ 2 and Σ 3 . The a ve rage estimate of the correlations is also shown and it is marked in the table by ρ . The resu lts suggest that, generally , the LL m o del has the b est p erformance as opp osed to the L T and the SE mo d el, although w e n ote that σ 12 = 3 (co v ariance in Σ 1 ) is estimated b etter from the L T mo del. It app ears th at the estimator e S t for all mo d els conv erges to the true v alues of Σ , but the rate of conv ergence dep en d s on the underlying state space mo del (here LL p erforms faster conv ergence) and on the prior e S 0 . T able 2 sh o ws the a v eraged (o v er all 1000 simulat ed time series) mean v ector of squared standardized one-step forecast errors (MSSE (1) ), for eac h of the three mo dels (LL, L T, S E) and for eac h of Σ (Σ 1 , Σ 2 , Σ 3 ). F or comparison pu rp oses, T able 2 also sho ws th e resp ectiv e v alues of the MSSE (2) when Σ i is the tr ue v alue. The target v alue of the MSSE ( i ) is [1 1]. W e see that the MSSE (1) approac hes the resp ectiv e MSSE (2) and this demonstrates the accuracy of th e estimator e S t . W e observe that un der Σ 3 , th e MSSE (1) has v alues significan tly sm aller than 1 as compared to the MSSE (2) using the tru e v alue of Σ 3 . Comparison of the Lo cal Level Mo del with MCMC W e ha v e sim ulated a sin gle lo cal lev el mo del un der the observ ation co v ariance matrix Σ = Σ 1 and the relev an t m o del comp onen ts of th e lo cal level m o del of the previous sub -section. W e apply Algorithm 1 and w e compare it w ith a state of the art MCMC estimation p ro cedure based on a blo c k ed Gibbs samp ler su itable for state sp ace mo dels (Gamerman, 1997, p. 149); the MCMC p ro cedure we use is describ ed in the app endix. Th e MCMC estimation pro cedur e is an iterativ e non-sequentia l MCMC pr o cedure and its r ole in this section is to pro vide a means of comparison with the non-iterativ e pro cedur e of Algorithm 1. MCMC is the gold standar d , since it pr o duces (giv en enough computation) exact computation of S t . But MCMC is impr actical ; the new p rop osed metho d is a quick, pr actical and easily implemente d 6 T able 1: P erformance of the estimator e S t (of Algorithm 1 ) for 1000 simulated biv ariate dynamic mo dels generated from a lo cal lev el mo del (LL), a linear trend mo d el (L T) and a seasonal mo del (SE ) und er three observ ation co v ariance m atrices Σ 1 , Σ 2 and Σ 3 . Mo del LL L T SE Σ = Σ i S e S 100 e S 500 S e S 100 e S 500 S e S 100 e S 500 σ 11 = 2 1.945 1.938 1.9 97 2.572 2.721 2.392 2.171 2.207 2.165 σ 12 = 3 2.798 2.770 2.9 20 2.988 3.029 3.026 2.314 2.186 2.589 σ 22 = 5 4.722 4.685 4.8 99 4.547 4.489 4.777 4.399 4.283 4.694 ρ = 0 . 948 0.923 0.919 0.9 33 0.874 0.867 0.895 0.748 0.711 0.812 σ 11 = 100 100.03 9 99.9 31 100.30 3 98.271 98.157 98.368 98.7 31 98.806 99.602 σ 12 = 85 83.133 8 3.028 84 .757 79.471 78.969 82.660 79.6 27 79.427 83.254 σ 22 = 80 80.430 8 0.277 80 .353 78.917 78.865 79.822 79.7 55 79.886 79.896 ρ = 0 . 950 0.927 0.927 0.9 44 0.902 0.897 0.933 0.897 0.894 0.933 σ 11 = 1 1.124 1.135 1.1 01 1.200 1.234 1.126 1.184 1.202 1.151 σ 12 = 7 6.506 6.457 6.7 35 5.388 5.177 5.904 5.764 5.623 6.234 σ 22 = 50 49.305 4 9.375 49 .840 48.518 48.579 49.392 48.8 16 49.038 49.540 ρ = 0 . 989 0.784 0.862 0.9 09 0.706 0.668 0.791 0.758 0.732 0.825 T able 2: Mean v ector of squared standardized one-step forecast errors (MSSE ( i ) ) of the mul- tiv ariate d ynamic mo d el of Algorithm 1. The ind ex i = 1 , 2 r efers to when Σ is estimated by the data ( i = 1) and when Σ is assumed kno wn (according to the sim ulations) for comparison purp oses ( i = 2). The notation LL, L T, SE and Σ 1 , Σ 2 , Σ 3 is the same as in T able 1. MSSE (1) MSSE (2) LL (Σ 1 ) 0.994 1.0 71 0 .999 0.99 5 LL (Σ 2 ) 0.939 0.9 14 0 .999 0.99 9 LL (Σ 3 ) 0.773 1.0 26 0 .998 0.99 7 L T (Σ 1 ) 0.875 1.1 41 0 .992 0.99 6 L T (Σ 2 ) 0.900 0.8 95 1 .002 0.99 7 L T (Σ 3 ) 0.774 1.0 31 1 .000 0.99 6 SE (Σ 1 ) 0.930 1.092 0.997 0 .999 SE (Σ 2 ) 0.903 0.864 0.998 0 .996 SE (Σ 3 ) 0.805 1.026 0.998 0 .996 7 T able 3: Biv ariate sim ulated lo cal level dynamic linear m o del. Show ed are: r eal ve rsu s esti- mated v alues of Σ = { σ ij } i,j =1 , 2 and the correlation co efficien t ρ . Th e first column indicates ho w m an y observ ations we re us ed in eac h estimation. σ 11 σ 12 σ 22 ρ N Real MCMC New Real MCMC New Real MCMC New Rea l MCMC New 100 2.0 0 1.68 1.24 3.00 2.35 1.82 5.00 4.12 3.69 0.95 0.90 0.85 150 2.0 0 1.78 1.34 3.00 2.44 1.91 5.00 3.97 3.72 0.95 0.92 0.86 200 2.0 0 1.76 1.46 3.00 2.48 2.04 5.00 4.02 3.77 0.95 0.94 0.87 250 2.0 0 2.04 1.64 3.00 2.89 2.33 5.00 4.50 4.17 0.95 0.95 0.89 300 2.0 0 1.92 1.65 3.00 2.78 2.39 5.00 4.44 4.31 0.95 0.95 0.90 350 2.0 0 1.90 1.69 3.00 2.76 2.46 5.00 4.46 4.40 0.95 0.95 0.90 400 2.0 0 2.01 1.76 3.00 2.90 2.56 5.00 4.59 4.50 0.95 0.96 0.91 450 2.0 0 2.05 1.82 3.00 2.97 2.66 5.00 4.71 4.65 0.95 0.96 0.92 500 2.0 0 2.11 1.85 3.00 3.09 2.74 5.00 4.90 4.79 0.95 0.96 0.92 appro ximation. In this section we compare the new m etho d with the gold stand ard in ord er to sho w ho w goo d is the approxima tion. T ables 3 and 4 give th e results; the former s ho ws the estimates of Σ with b oth metho ds (MCMC and Algorithm 1) and the latter shows the p erformance of the one-step forecast errors for b oth m etho ds. In T able 4 the one-step forecast error vect or e t = [ e 1 t e 2 t ] ′ and the mean vect or of squared one-step forecast errors are sho wn for sev eral v alues of t und er b oth estimation metho ds. W e observ e that the new m etho d (of Algorithm 1) approxima tes well the MCMC estimates, esp ecial ly for large v alues of time t = N . W e n ote th at MCMC should not b e considered as a b etter metho d as compared to the prop osal of Algorithm 1, since MCMC is an iterativ e and in particular in this pap er it is a non- sequen tial estimation pro cedur e. T h e application of sequ en tial MCMC estimation (Doucet et. al. , 2001) often exp erience several c hallenges as for example time-constrain ts, a v ailabilit y for general pu r p ose algorithms, pr ior-sp ecification, prior-sensitivit y , fast monitoring and exp ert in terv enti on features. Th e prop osal of this p ap er pro vides a strong mo delling approac h allo w- ing for v ariance estimation in a wide class of cond itionally Gaussian dyn amic linear mo d els and th is section sho ws that for large time p erio ds its p erform an ce is close to Mon te Carlo estimation. Application to V AR and TVV AR Time Series Mo dels The dynamic mo d el (1) is very general and an im p ortan t su b class of (1) is the p opular v ector ARMA mo del. In r ecent y ears v ector autoregressiv e (V AR) mo dels ha v e b een extensively dev elop ed and u sed, esp ecially for economic time series, as in Doan et al. (1984 ), Litterman (1986 ), Kadiya la and Karlsson (1993, 1997) , Ooms (199 4), Johansen (199 5), Uhlig (1997), Ni and S un (2003), Sun and Ni (2004) and Huerta and Prado (2006). Our discussion in this s ection includes tw o imp ortan t sub classes of mo del (1), wh ich can b e u sed f or a wide-class of s tationary and non-stationary time series forecasting. The first is 8 T able 4: Biv ariate simulat ed lo cal leve l dyn amic linear mo d el. S ho we d are: one-step forecast errors at time t = N and the squared su m s of the forecasting errors up to time N . e 1 N e 2 N N − 1 P N t e 2 1 t N − 1 P N t e 2 2 t N MCMC Ne w MCMC New MCMC New MCMC New 100 − 2.14 − 2.22 − 2.46 − 2.49 4.8 5 4.84 8.54 8.59 150 − 0.14 − 0.28 − 3.88 − 3.92 4.8 3 4.86 8.21 8.26 200 − 1.02 − 1.09 − 0.12 0.05 5.02 5.04 8.10 8.13 250 − 0.24 − 0.20 − 1.45 − 1.47 5.2 5 5.30 8.72 8.76 300 − 1.72 − 1.79 − 0.61 − 0.71 5.0 1 5.12 8.91 8.95 350 − 0.42 − 0.46 − 1.91 − 1.91 5.1 2 5.12 9.01 9.04 400 − 0.42 − 0.54 − 2.26 − 2.29 5.2 1 5.22 9.09 9.12 450 − 3.91 − 4.02 − 5.42 − 5.41 5.2 8 5.29 9.32 9.33 500 1.48 1.68 − 0.93 − 0.98 5.24 5.24 9.54 9.54 the V AR mo del of kn o wn order ℓ ≥ 1, d efined by y t = Φ 1 y t − 1 + Φ 2 y t − 2 + · · · + Φ ℓ y t − ℓ + ǫ t , ǫ t ∼ N p (0 , Σ) , (4) where Φ 1 , Φ 2 , . . . , Φ ℓ are p × p matrices of p arameters. In th e usual estimatio n of V AR, stationarit y h as to b e assumed and so the r o ots of the p olynomial (in z ) | I p − Φ 1 z − Φ 2 z 2 − · · · − Φ ℓ z ℓ | = 0 should lie outside the unit circle. In standard theory (4) ma y not assume a Gaussian d istri- bution for ǫ t , although in pr actice this is u sed for op erational s implicit y . It is also kn o wn that for a high ord er ℓ m o del (4) appro ximates multiv ariate m o ving a ve rage mo dels, whic h are t ypically difficu lt to estimate and this mak es the V AR eve n more attractiv e in app lications. It is also kno wn th at for general on-line estimation and forecasting, the cov ariance m atrix Σ either has to b e assumed kno wn or it has to b e diagonal. This is a ma jor limitation, b ecause it means that either the mo deller kno ws a priori the cross-correlation b etw een the series { y 1 t } , { y 2 t } , . . . , { y pt } , where y t = [ y 1 t y 2 t · · · y pt ] ′ , or that the p scalar time series are all sto c hastically uncorrelated, in which case it is more sensible to u se s everal univ ariate AR mo dels instead. Recen tly , the need for estimation of Σ as a full co v ariance matrix (e.g. where Σ h as p ( p + 1) / 2 elements to b e estimated) is considered, but the existing estimation pro cedur es include n ecessarily iterativ e estimation via imp ortance samp ling (Kadiya la and Karlsson, 1997). Ni and Su n (2003) p oin t out th at from a f requent ist standp oint ordinary least squares and maximum lik elihoo d estimators of (4) are u na v ailable. These authors state that asymp totic theory estimators m ay not b e applicable for V AR (esp ecially when { y t } is a short-length time series). Ni and Su n (2003), Su n and Ni (2004) and Huerta and Prado (2006) prop ose Ba yesian estimation of the autoregressive p arameters Φ i and Σ, b ased on MCMC. It follo ws that for mo d el (1) when Σ is unkn own, only iterativ e estimation pro cedur es can b e applied. Our p rop osal f or on-line estimation of Σ give s a s tep f orw ard to the estimation and forecasting of V AR mo dels and it is outlined b elo w. W e pr op ose a generalization of th e un iv ariate state space repr esentati on considered in W est and Harrison (1997 , § 9.4.6). Other state sp ace r epresen tations of the V AR are consid ered in 9 Huerta and P r ado (2006) , but these represent ations, us ually referred to as canonical represen- tations of the V AR mo del (Sh umw a y and Stoffer, 2000) are not conv enien t for the estimation of Σ , b ecause Σ is embedd ed into the ev olution equation of th e states θ t . First note that we can rewrite (4) as y t = Φ X t + ǫ t , wh ere Φ = [Φ 1 Φ 2 · · · Φ ℓ ] and X t = [ y ′ t − 1 y ′ t − 2 · · · y ′ t − ℓ ] ′ and so w e can write y t = F ′ t θ + ǫ t = ( X ′ t ⊗ I p )v ec(Φ) + ǫ t , (5) where v ec( · ) denotes the column stac king op erator of a p ortion of a m atrix and ⊗ denotes the Kroneck er or tensor pro duct of t wo matrices. Mod el (5) can b e seen as a r egression-t yp e time series mo d el and it can b e hand led b y the general Algorithm 1 for mo del (1) if w e set G = I p , Ω = 0 and if we rep lace F b y the time-v arying F t = X t ⊗ I p . Th us we can readily apply Algorithm 1 to estimate Σ and θ or Φ 1 , Φ 2 , . . . , Φ ℓ . Mo ving to the time-v arying vec tor autoregressiv e (TVV AR) time series, in r ecen t ye ars there has b een a growing literature for TVV AR time series. Kitaga w a and Gersch (1996), Dahlhaus (1997), F rancq and Gautier (2004) and And erson and Meersc haert (2005 ) study parameter estimation based on the asymptotic b ehavi our of TVV AR and time-v arying ARMA mo dels. F rom a s tate sp ace standp oint W est et al. (1999) prop ose a state space formulati on for a univ ariate time-v aryin g AR mo d el applied to electroencephalographic data. In th is section we extend this state space formulation to a v ector of observ ations and hence w e can prop ose the application of Algorithm 1 in order to estimate the co v ariance matrix of th e err or drifts of the TVV AR mo d el. Consider th at the p -vect or time series { y t } follo ws the TVV AR mo del of k n o wn ord er ℓ defined by y t = Φ 1 t y t − 1 + Φ 2 t y t − 2 + · · · + Φ ℓt y t − ℓ + ǫ t , ǫ t ∼ N p (0 , Σ) , (6) where Φ 1 t , Φ 2 t , . . . , Φ ℓt are the time-v aryin g autoregressive parameter matrices. T h e mo del can b e stationary , lo cally-stati onary or non-stationary dep end ing on the ro ots of the t p oly- nomials (in z ) | I p − Φ 1 t z − Φ 2 t z 2 − · · · − Φ ℓt z ℓ | = 0 . T ypical considerations include th e lo cal stationarit y w h ere there are several r egimes for whic h, lo cally , { y t } is stationary , b ut globally { y t } is non-stationary . Also the time-dep endent p aram- eter matrices Φ it can allo w for an imp r o v ed d ynamic fit as opp osed to the s tatic parameters of th e V AR. In our dev elopmen t w e adopt a r andom walk f or the evolutio n of the parameters Φ it ( i = 1 , 2 , . . . , ℓ ), although the mo deller might suggest other Mark o vian sto chastic evo lution form ulae for Φ it . The random w alk ev olution is the natural consideration wh en { y t } is assumed lo cally stationary . Hence we can r ewrite m o del (6) in s tate-space f orm as y t = Φ t X t + ǫ t = F ′ t θ t + ǫ t and θ t = θ t − 1 + ω t , (7) where X t = [ y ′ t − 1 y ′ t − 2 · · · y ′ t − ℓ ] ′ , F t = X t ⊗ I p , Φ t = [Φ 1 t Φ 2 t · · · Φ ℓt ], θ t = vec(Φ t ) an d ω t ∼ N p 2 ℓ (0 , Ω), for s ome transition co v ariance m atrix Ω . Mo del (7) is reduced to (5 ) wh en Ω = 0, in w hic h case θ t = θ t − 1 = θ . After sp ecifying Ω, w e can d ir ectly apply Algorithm 1 to the state sp ace mo del (7) and th us we can obtain an algorithm for th e estimation of Σ, for the estimation of θ t or Φ 1 t , Φ 2 t , . . . , Φ ℓt and for forecasting the s er ies { y t } . 10 Aluminium spot price (US$ per tonne) Time Spot price 0 50 100 150 200 1700 1950 Copper spot price (US$ per tonne) Time Spot price 0 50 100 150 200 3200 4000 Lead spot price (US$ per tonne) Time Spot price 0 50 100 150 200 850 950 Zinc spot price (US$ per tonne) Time Spot price 0 50 100 150 200 1200 1500 Figure 1: LME d ata, consisting of alumin ium, copp er, lead and zinc sp ot p r ices (in US dollars p er tonne of eac h metal). 11 London Metal Exc hange Data In this s ection we analyze London metal exc hange (LME) data consisting of official sp ot pr ices (US dollars p er tonne of metal). LME is the w orld’s leading n on-ferrous metals’ mark et, trading curr en tly highly liquid con tracts for metals, such as aluminiu m, alumin ium allo y , copp er, lead, nick el, tin and zinc. According to th e LME we bsite ( http://ww w.lme.co .uk/ ) “LME is highly successful with a turnov er in excess of US$3,000 billion p er annum. It also con tributes to the UKs invisible earnings to the s um of more th an £ 250 million in o v erseas earnings eac h ye ar.” More inf orm ation ab out th e functions of the LME can b e f ound via its w ebsite (see ab o ve ); the recentl y gro wing literature on the econometrics mo delling of the LME can b e found in th e review of W atkins and McAleer (2004). W e consider forecasting for four m etals exc hanged in the L ME, namely aluminium, copp er, lead and zinc. T he data are provi ded from the LME website for the p erio d of 4 J an uary 200 5 to 31 Octob er 2005. After excluding week end s and bank h olida ys there are N = 210 trading da ys. W e store the d ata into the 4 × 1 vec tor time s er ies { y t } t =1 , 2 ,..., 210 and y t = [ y 1 t y 2 t y 3 t y 4 t ] ′ , where y 1 t denotes th e sp ot price at time t of alumin ium, y 2 t denotes th e sp ot price at time t of copp er, y 3 t denotes th e sp ot p rice at time t of lead and y 4 t denotes th e sp ot p rice at time t of zinc. The data are plotted in Figure 1. W e p rop ose th e V AR and TVV AR m o dels of the pr evious section; the m otiv ation of this b eing that from Figure 1 the evo lution of the d ata seems to follo w roughly an autoregressiv e t yp e mo del. I ndeed ther e is an apparent trend with no seasonalit y , whic h can b e mo d elled with a trend mo del or with a V AR or TVV AR mo d el of the previous s ection. Here we illustrate the prop osal of V AR and TVV AR mo dels, whic h, acco rd ing to the p r evious section, can estimate the co v ariance matrix of y t , giv en the state parameters, and th us th e correlation structure of { y t } can b e stu died. Other mo dels for th is kind of d ata hav e b een applied in T r ian tafyllop oulos (2006) an d we can en visage th at the mo dels of W est and Quintana (1987) can also b e applied to the LME data. First we apply the algorithms of th e previous section to seve ral V AR and T VV AR mo dels of differen t ord ers in order to fin d out which mo del giv es the b est p erformance. Performance here is measured via the mean v ector of squared standard ized one-step forecast errors (MSSE) and the mean vecto r of absolute p ercen tage one-step forecast errors (MAPE). The fi rst is c hosen as a general p erformance measure taking int o accoun t the estimation of the co v ariance matrix Σ and the second is chosen as a generally reliable p ercen t p erformance measure. T able 5 sho ws the results of 10 V AR( i ) and TVV AR( i ) mo dels (firs t column) of ord er i = 1 , 2 , . . . , 10. The d iscount factor δ refers to the discoun ting of the evolutio n co v ariance matrix of the state parameters θ t ; δ = 1 refers to a static θ t = θ (V AR mo del), while δ < 1 refers to a d ynamic lo cal lev el evo lution of θ t = θ t − 1 + ω t (TVV AR mo d el). T able 5 sh ows that the p er f ormance of th e TVV AR is remark able compared with th e p erformance of V AR, w h ic h pro d uces ve ry high MSS E thr oughout the range of i . Out of the V AR mo dels, the b est is the V AR(1), wh ic h still pro duces very large MSSE. Th is ind icates that a m o ving a v erage (MA) mo del is unlike ly to pr o duce go o d results at all, as the MSSE of the V AR increases with the ord er i . Also the appro ximation of a MA mo del with a high order V AR mo d el will include a large num b er of state p arameters to b e estimate d and this will introd u ce computational problems. Therefore, our atten tion is fo cused on th e TVV AR mo dels. F rom a computational stand- p oint w e note that as the ord er increases δ can not b e to o lo w, b ecause then th ere are computational d ifficulties in the calculatio n of the symmetric square r o ot of e Q t , used for the estimation of e S t (the estimate of Σ). Low er v alues of δ w ork b etter (T riantafyllo p oulos, 2006) 12 T able 5: Mean vect or of squ ared stand ardized one-step forecast errors (MSSE) and mean v ector of absolute p ercen tage one-step forecast er r ors (MAPE) of the multiv ariate LME time series { y t } . The fir st column indicates s everal V AR and TVV AR mo dels. MSSE MAPE V AR(1) 6.614 16.782 7.655 18.370 0.033 0.071 0.143 0.057 TVV AR(1): δ = 0 . 1 2.430 1.764 0.622 1.852 0.05 9 0.053 0.084 0.076 V AR(2) 19.610 15.966 11.934 10.271 0.081 0.226 0.201 0.101 TVV AR(2): δ = 0 . 35 1.296 1.743 1.228 1.822 0. 065 0.057 0.116 0.095 V AR(3) 11.777 23.715 9.906 9.058 0.58 5 0.345 0.480 0.246 TVV AR(3): δ = 0 . 65 2.254 3.149 2.222 2.180 0. 074 0.053 0.132 0.108 V AR(4) 39.979 54.892 28.407 19.169 0.159 0.103 0.235 0.161 TVV AR(4): δ = 0 . 6 1.389 1.802 1.210 1.329 0.10 1 0.072 0.179 0.147 V AR(5) 18.592 16.605 15.474 12.570 0.203 0.076 0.392 0.248 TVV AR(5): δ = 0 . 7 1.429 2.269 1.651 1.677 0.11 4 0.079 0.208 0.171 V AR(6) 24.910 19.085 14.584 17.784 0.206 0.134 0.320 0.197 TVV AR(6): δ = 0 . 75 1.828 2.705 1.683 1.757 0. 132 0.089 0.243 0.197 V AR(7) 21.722 38.054 14.597 14.180 0.330 0.092 0.422 0.490 TVV AR(7): δ = 0 . 75 1.366 2.044 1.191 1.531 0. 148 0.101 0.280 0.223 V AR(8) 28.985 35.867 11.291 16.370 0.515 0.325 0.812 0.563 TVV AR(8): δ = 0 . 8 2.130 2.900 1.637 1.910 0.16 8 0.111 0.326 0.249 V AR(9) 40.229 53.798 12.249 19.691 0.393 0.184 0.416 0.411 TVV AR(9): δ = 0 . 95 14.042 21.011 6.724 8.708 0.20 7 0.124 0.352 0.284 V AR(10) 46.791 49.869 16.240 23.974 0.611 0.306 0.751 0.694 TVV AR(10): δ = 0 . 9 4.273 7.541 3.637 5.629 0. 205 0.124 0.391 0.296 13 and here we ha v e chosen th e low est v alues of δ , wh ic h are allo w ed. Our d ecision on the b est TVV AR mo del is based on the follo wing four criteria. 1. low order mo dels are p referable as they hav e fewer state p arameters; 2. δ should not b e to o low, b ecause then th e co v ariance m atrix of θ t will b e to o large; 3. th e MSS E v ector should b e close to [1 1 1 1] ′ ; 4. th e MAPE v ector should b e as lo w as p ossible. Considering the ab ov e criteria we fa v or the TVV AR(2). Figure 2 sho ws the estimate of the obs erv ation co v ariance matrix Σ. F rom the r igh t graph we observe that the estimate of the corr elations of y 1 t and y j t , giv en θ t are very high (close to 1) and this means that in forecasting; this pro vides useful information ab out the cross-dep end ence of th e four m etal prices ov er time. Observation variance Time Variance 0 50 100 150 200 0 2 4 6 8 10 12 14 Observation correlation Time Correlation 0 50 100 150 200 0.0 0.2 0.4 0.6 0.8 Figure 2: Estimate of the observ ation co v ariance matrix Σ = { σ ij } i,j =1 , 2 , 3 , 4 . The left grap h sho ws the estimates of the v ariances σ ii ; th e solid line sh o ws the estimate of σ 11 , th e d ashed line shows the estimate of σ 22 , the d otted line shows the estimate of σ 33 , the d ashed/dotted line s h o ws th e estimate of σ 44 . Th e right graph s h o ws th e estimate of the correlations of y 1 t | θ t with y j t | θ t ( j = 2 , 3 , 4); the solid line sh ows the estimate of the correlation of y 1 t | θ t with y 2 t | θ t , the d ashed line sh o ws the estimate of the correlation of y 1 t | θ t with y 3 t | θ t and the dotted line sh o ws the estimate of the corr elation of y 1 t | θ t with y 4 t | θ t . As m en tioned b efore tw o comp etitiv e mo dels to our TVV AR m o delling for the LME data are the matrix-v ariate DLMs (MV-DLMs) of Quintana and W est (1987) and the d iscount w eigh ted regression (D WR) of T rian tafyllop oulos (2006). Next we compare the TVV AR(2) mo del discussed ab o ve w ith these tw o mo delling approac hes. W e start by br iefly describin g the MV-DLM and the D WR. 14 T able 6: Mean vect or of squ ared stand ardized one-step forecast errors (MSSE) and mean v ector of absolute p ercentag e one-step forecast errors (MAPE) for the LME data and for three m ultiv ariate mo dels: TVV AR, MV-DLM and D WR. MSSE MAPE TVV AR(2) 1.296 1.743 1.228 1.822 0.065 0.057 0.116 0.095 MV-DLM 1.306 2.436 0.984 1.887 0.019 0.022 0.02 6 0.025 D WR 2.202 1.610 1.590 1.868 0.013 0.015 0.017 0.017 The MV-DLM is defined by y ′ t = F ′ Θ t + ǫ ′ t and Θ t = G Θ t − 1 + ω t , (8) where F is a d × 1 d esign v ector, Θ t is a d × p state m atrix, G is a d × d transition matrix, ǫ t | Σ ∼ N p (0 , Σ) and vec( ω t ) | Σ , Ω ∼ N dp (0 , Σ ⊗ Ω), where v ec ( · ) denotes the column stac king op erator of a low er p ortion of a m atrix and ⊗ denotes the Kroneck er p ro duct of t w o matrices. A p rior inv erted Wishart distrib u tion is assumed for Σ and the resulting p osterior distrib utions as well as fu r ther details on the mo del can b e found in Qu in tana and W est (1987) and W est and Harrison (1997, Chapter 16) (for more references on this mo del, see also th e Int ro d uction). In th e app lication of MV-DLMs it is necessary to sp ecify F and G . F ollo wing Qu in tana and W est (1987), w ho consider international exc hange rates d ata, and by consulting the plots of Figure 1 we prop ose a linear trend mo del f or th e LME data. Thus we can set F = 1 0 and G = 1 1 0 1 . The DWR is defined by y t = y t − 1 + ψ t + ǫ t and ψ t = ψ t − 1 + ζ t , with ǫ t | Σ ∼ N p (0 , Σ) and ζ t ∼ N p (0 , Ω t ). This mo del can b e put into state space form as in y t = [ y t − 1 I p ] 1 ψ t + ǫ t = F ′ t θ t + ǫ t , θ t = 1 ψ t = 1 ψ t − 1 + 0 ζ t = θ t − 1 + ω t . The co v ariance matrix Ω t is m o delled with a d iscoun t factor δ and Σ is estimated f ollo wing T r ian tafyllop oulos and Pik oulas (2002) and T rian tafyllop ou los (2006). T able 6 sh o ws the MSSE and the MAPE of the thr ee mo dels. W e see that all mo dels pro du ce reasonable results. F or the MSSE th e b est mo del is the TVV AR(2) (w ith the excep- tion of the lead v ariable wh er e the MV-DLM pr o duces MSSE closer to 1). F or th e MAPE the b est mo del is the DW R with the T VV AR(2) pro ducing th e highest MAPE. Out of the three m o dels, the MV-DLM is limited b y its mathematical form, w h ic h is constructed to giv e conjugate analysis (see also the Int ro d uction). The DW R suffers from similar limitations as the MV-DLM, but it provides go o d results, for linear trend time series without seasonalit y . The TVV AR mo del pro vides a goo d mo delling alternativ e and considering the numerous ap- plications of V AR time series mo dels in econometrics, it is b eliev ed that th e TVV AR h as a great p oten tial. 15 In conclusion, the T VV AR mo del can pro d u ce f orecasts with go o d forecast accuracy , while the correlation of the series can b e estimated on-line with a f ast linear algorithm. A criticism of the mo del is that its efficiency d ep ends on its order and if h igh order TVV AR mo dels are required (e.g. as in app ro ximating mo ving a v erage pro cesses with time-dep endent parameters) its efficiency will b e similar of that of a v ector MA, since the discount factor will ha v e to b e close to 1. It will b e interesting to kno w h o w the order of the TVV AR m o del is related to the b ou n dness of the eigen v alues of the co v ariance estimator e S t . Concluding Commen ts This p ap er develo ps an algorithm for co v ariance estimation in multiv ariate conditionally Gaus- sian dynamic linear mo dels, assum ing that th e observ ation co v ariance matrix is fixed, but unknown. This is a general estimation pr o cedure, w hic h can b e app lied to an y Gaussian linear s tate space mo d el. The algorithm is empirically foun d to h a v e go o d p erformance p r o- viding a co v ariance estimato r w hic h conv erges to the true v alue of the observ ation co v ariance matrix. The prop osed metho dology compares w ell with a n on-sequen tial state of the art MCMC estimation pro cedure and it is found that the pr op osed estimates are close to th e estimates of the MCMC. The new algo rithm is applied (but not limited to) mo d el su b classes of V AR and V AR with time-dep enden t parameters (TVV AR), which h a v e great app lication in financial time series. Considering th e L on d on metal exc hange data, it is found that the TVV AR mo del h as ou tstand ing p erformance as opp osed to the V AR m o del. It is b eliev ed that the dev elopmen t of the T VV AR mo d el is a worth while pro ject and the prop osed fast, on-line algorithm for the estimation of the observ atio n co v ariance matrix, is a step f orward op ening s ev eral paths for p r actical forecasting. The fo cu s in this pap er is on facilitating and adv ancing non-iterativ e cov ariance estimation pro cedur es for vect or time s er ies. Suc h p ro cedures are particularly app ealing, b ecause of their simplicit y and ease in use. F or such wide class of mo dels su c h u s the conditionally Gaussian dynamic linear mo d els, the p rop osed on-line algorithm enables the computation of the mean v ector of standardized errors as we ll as it en ables th e compu tation of the multi-step f orecast co v ariance matrix. Both these computations are v aluable considerations in forecasting and they attract interest b y academics and p ractitioners alik e. Ac kno wledgemen ts I should like to th an k M. Aitkin and A. O’Hagan for helpf u l discus s ions and suggestions on earlier drafts of the pap er. Sp ecial thanks are due to G. Mon tana who help ed on the compu- tational part of the pap er, in particular regarding the MCMC d esign and implement ation. App endix Pro of of Theorem 1 Let v ec h ( · ) denote the column stac king op erator of a lo w er p ortion of a symmetric square matrix and let ⊗ denote the K ronec k er pro du ct of tw o matrices. Firs t we p ro v e th at for large t , it is approximat ely E (Σ − A t e t e ′ t A ′ t | y t ) = E (Σ − A t e t e ′ t A ′ t | y t − 1 ) , (A-1) 16 where A t = n − 1 / 2 t S 1 / 2 t − 1 Q − 1 / 2 t . Cond itional on Σ, we hav e fr om an ap p lication of the Kalman filter that Co v( e it e j t , e k t e ℓt | Σ , y t − 1 ) is b oun ded, wher e e t = [ e 1 t e 2 t · · · e pt ] ′ . Since Σ is b ound ed, S t is also b ounded (lim t →∞ S t = Σ), and so all the co v ariances of e it e j t and e k t e ℓt unconditional on Σ are also b ound ed. Th is means that all the elemen ts of V ar { ve ch( e t e ′ t ) } are b ounded and so V ar { v ec h ( e t e ′ t ) } is b ounded. No w let X 1 = E (Σ − A t e t e ′ t A ′ t | y t , y t − 1 ) = S t − A t e t e ′ t A t and X 2 = E (Σ − A t e t e ′ t A ′ t | y t − 1 ) = S t − 1 − A t Q t A ′ t . Then, s in ce lim t →∞ S t = Σ, there exists appr opriately a large intege r t ( L ) > 0 su c h that for ev ery t > t ( L ) it is E ( X 1 − X 2 | y t − 1 ) ≈ 0. Also V ar { v ec h( X 1 − X 2 ) | y t − 1 } = V ar { ( A t ⊗ A t ) D p v ec h( e t e ′ t ) | y t − 1 } = 1 n 2 t E t → 0 , with E t = h S 1 / 2 t − 1 Q − 1 / 2 t ⊗ S 1 / 2 t − 1 Q − 1 / 2 t i D p [V ar { vec h ( e t e ′ t ) | y t − 1 } ] D ′ p h Q − 1 / 2 t S 1 / 2 t − 1 ⊗ Q − 1 / 2 t S 1 / 2 t − 1 i , where D p is th e d uplication matrix and fr om th e firs t part of the pro of w e ha v e th at E t is b ound ed. It f ollo ws th at for any t > t ( L ) it is X 1 ≈ X 2 with probabilit y 1 and so w e ha v e pro ve d equation (A-1). Using E (Σ | y t ) = S t , f rom equation (A-1) w e ha ve E (Σ | y t ) − A t e t e ′ t A ′ t = E (Σ | y t − 1 ) − A t E ( e t e ′ t | y t − 1 ) A ′ t ⇒ S t = S t − 1 + 1 n t S 1 / 2 t − 1 Q − 1 / 2 t ( e t e ′ t − Q t ) Q − 1 / 2 t S 1 / 2 t − 1 ⇒ S t = S t − 1 − 1 n t S t − 1 + 1 n t S 1 / 2 t − 1 Q − 1 / 2 t e t e ′ t Q − 1 / 2 t S 1 / 2 t − 1 ⇒ n t S t = n t − 1 S t − 1 + S 1 / 2 t − 1 Q − 1 / 2 t e t e ′ t Q − 1 / 2 t S 1 / 2 t − 1 = n 0 S 0 + t X i =1 S 1 / 2 i − 1 Q − 1 / 2 i e i e ′ i Q − 1 / 2 i S 1 / 2 i − 1 and by dividing b y n t = n 0 + t = n t − 1 + 1 we obtain equ ation (2) as required. The Gibbs Sampler for Multiv ariate Conditionally Gaussian DLMs The follo wing pro cedure applies to an y conditionally Gaussian dyn amic linear m o del in the f orm of equation (1). F or the sim ulation studies considered in th is pap er, giv en data y N = ( y 1 , y 2 , . . . , y N ), we are in terested in samp lin g a s et of state vec tors, θ 1 , θ 2 . . . , θ N and the obs er v ation co v ariance matrix Σ from th e f ull, m ultiv ariate p osterior distribution of θ 1 , θ 2 , . . . , θ N , Σ | y N . Gibbs sampling in vo lve s iterativ e samp ling from the fu ll conditional p osterior of eac h θ t , | θ − t , Σ , y N , for all t = 1 , 2 , . . . , N , and Σ | θ 1 , θ 2 , . . . , θ N , y N ; in our notation, θ − t means that w e are conditioning up on all the comp onent s θ 1 , θ 2 , . . . , θ N but θ t . Given the cond itionally normal and linear stru cture of th e system, suc h fu ll conditional distribu tions are s tandard, and therefore easily sampled. Ho w ev er, such an implementat ion of the Gibbs sampler, wh ere eac h comp onent is up dated once at a time, could b e v ery inefficien t when applied to the m ultiv ariate DLMs discussed in this p ap er; in fact, the h igh-correlation of the dynamic system 17 will most lik ely br ing con v ergence problems. In order to o v ercome suc h difficulties, follo wing the early suggestions of Carter and Koh n (1994) and F r ¨ uh wirth-Schnatter (1994) , we ha ve c hosen to imp lemen t a blo c ke d Gibbs sampler Gamerman (1997, p. 149); within this con text, this sampling scheme is b etter known as the forwar d filtering, b ackwar d sampling algorithm. F ollo wing is a concise d escription of the algorithm us ed in our studies; f or m ore details, the reader should consult the references ab o v e, as well as W est and Harrison (199 7, Ch apter 15). The fir st step of the Gibb s sampler in v olv es sampling f rom the up dating distrib ution of θ N | Σ , y N , whic h is giv en b y the m ultiv ariate normal N d ( m M N , P M N ). This is done in the forward filtering phase of the sampler, as follo ws. Starting at time t = 0 w ith some given initial v alues m M 0 , P M 0 and Σ w e compute the follo wing quan tities at eac h time t , for t = 1 , 2 , . . . , N : (a) the pr ior mean v ector and co v ariance matrix of θ t | Σ , y t − 1 , a t = Gm M t − 1 and R M t = GP M t − 1 G ′ + Ω . (b) the mean v ector and co v ariance matrix of the one-step ahead f orecast of y t | y t − 1 , y M t − 1 (1) = F ′ a t and Q M t = F ′ R M t F + Σ . (c) the p osterior mean v ector and cov ariance matrix of θ t | y t , m M t = a t + A M t e M t and P M t = R M t − A M t Q M t ( A M t ) ′ , where A M t = R M t F ( Q M t ) − 1 is th e Kalman gain and e M t = y t − y M t − 1 (1) is the one-step ahead f orecast err or v ector. An up d ated v ector θ N is thus obtained, and the filtering part of the algorithm is completed. The bac kwa rd s sampling p hase inv olves sampling fr om the distrib u tion of θ t | θ t +1 , Σ , y t at all times t = N − 1 , . . . , 1 , 0. Eac h of su ch v ectors is d ra wn from a multiv ariate normal N d ( h t , H t ), where h t = m M t + P M t G ′ ( R M t +1 ) − 1 ( θ t +1 − a t +1 ) and H t = P M t { I d − G ( R M t +1 ) − 1 GP M t ) , with I d b eing the d × d identit y matrix. At eac h time t , we also compute ǫ ∗ t = y t − F ′ θ t . Once the b ac kw ards sampling phase is completed, we set b Σ = N − 1 N X t =1 ǫ ∗ t ( ǫ ∗ t ) ′ . Finally , with n M 0 b eing th e p rior degrees of fr eedom and S M 0 b eing the prior estimate of Σ, w e sample from the fu ll conditional density of Σ | Θ , y N , which is an inv erted Wishart distribution I W p ( n M 0 + N + 2 p, N b Σ N + n M 0 S M 0 ), whose simulati on is also s tand ard. This concludes an iteration of th e Gibb s samp ler. References [1] An d erson, P .L. and Meersc haert, M.M. (2005) Parameter estimation for p er io dically sta- tionary time series. Journal of Time Series Analysis. 26 , 489-518. 18 [2] Barassi, M.R., Cap orale, G.M. and Hall, S.G. (2005) In terest r ate link ages: a Kalman filter appr oac h to detecting structur al c hange. Ec onomic Mo del ling 22 , 253-284. [3] Barb osa, E. and Harrison, P .J. (1992) V ariance estimation f or m ultiv ariate dyn amic linear mo dels. Journal of F or e c asting 11 , 621-628. [4] C arter, C .K . and Kohn , R. (1994) On Gibbs sampling for state sp ace m o dels. Biometrika 81 , 541-553. [5] Dahlhaus , R. (1997) Fitting time series m o dels to nonstationary pro cesses. Annals of Statistics 25 , 1-37. [6] Dic key , J.M., Da wid, A.P . and K adane, J.B. (1986 ) Su b jectiv e probab ility assessmen t metho ds of m ultiv ariate - t and matrix- t mo d els. I n Bayesian Infer enc e and De c ision T e ch- niques: Essays in Honor of Bruno de Finetti , P .K. Go el and A. Zellner (Eds.). North- Holland, Amsterd am. [7] Doucet, A., de F r eitas, N. and Gordon, N.J. (2001 ) Se que ntial Monte Carlo Me tho ds in Pr actic e . S pringer-V erlag, New Y ork. [8] Dur bin, J . an d Ko opman, S .J. (2001) Time Series Analysis by State Sp ac e Me tho ds . Ox - ford Un iv ersit y Press, Oxford. [9] F ern´ andez, F.J. and Harve y , A.C. (1990) S eemingly un related time series equations and a test for homogenit y . Journal of Busi ne ss and Ec onomics Statistics 8 , 71-81. [10] F rancq, C. and Gautier, A. (2004) L arge sample prop erties of parameter least squ ares estimates f or time-v aryin g ARMA m o dels. Journal of Time Series Analysis 25 , 765-78 3. [11] F r ¨ u h wirth-S c hnatter, S. (1994) Data augmen tation and d ynamic lin ear mo dels. Journal of Time Series Analysis 15 , 183-202. [12] Gamerman, D. (1997) Markov Chain M onte Carlo. Chapman and Hall, Lond on . [13] Go dolphin, E.J. and T riant afyllop oulos, K . (2006) Decomp osition of time series mo dels in state-space form . Computational Statistics and D ata Analysis (in p ress). [14] Gupta, A.K. and Nagar, D.K. (1999) Matrix V ariate Di stribu tions. Ch apman and Hall, New Y ork. [15] Gra v elle, T . and Morley , J.C. (2005) A Kalman filter approac h to characte rizing the Canadian term structure of in terest rates. Applie d Financial Ec onomics 15 , 691-705 . [16] Harv ey , A.C. (1986) Analysis and generalisation of a multiv ariate exp onen tial smo othing mo del. Management Sci e nc e 32 , 374-38 0. [17] Harv ey , A.C. (1989) F or e c asting Structur al Time Series Mo dels and the K alman Filter . Cam bridge Univ ersit y Press, Cambridge. [18] Harv ey , A.C. and Ko opman , S .J. (1997) Multiv ariate str uctural time series mo dels (with commen ts). In C. Heij, J.M. S h umacher, B. Hanzon and C. Praagman (eds.) System Dy- namics in Ec onomics and Financial Mo dels. Wiley , Chic hester, 269-298. 19 [19] Harv ey , A.C., Ruiz E. and Sh ep hard, N. (1994) Multiv ariate sto c hastic v ariance m o dels. R eview Ec onomic Studies 61 , 247-264 . [20] Huerta, G. and Pr ado, R. (2006) Stru ctured p riors for multiv ariate time series. Journal of Statistic al P lanning and Infer enc e (in press). [21] Johansen, S. (1995) Likeliho o d-Base d Infer enc e in Cointe gr ate d V e ctor Autor e gr essive Mo dels. Oxford Univ ersity Press, Oxford. [22] Kadiy ala, K.R. and Karlsson, S. (1993) F orecasti ng w ith generalize d Ba y esian v ector autoregressions. Journal of F or e c asting 12 , 365-378. [23] Kadiy ala, K.R. and Karlsson, S. (199 7) Numerical metho ds for estimation and inference in Ba ye sian V AR-mo dels. Journal of Applie d E c onometrics 12 , 99-132. [24] Kitaga wa, G. an d Gersc h, W. (1996) Smo othness Priors Analys is of Time Serie s . Springer-V erlag, New Y ork. [25] Litterman, R.B. (1986) F orecasting w ith Bay esian v ector autoregressions - five y ears of exp erience. Journal of Business and Ec onomic Statistics 4 , 25-38. [26] L ¨ utk ep oh l, H. (1993) Intr o duction to Mu ltiple Time Series Analysis . Springer-V erlag, Berlin. [27] Moauro, F. and Savio, G. (2005) T emp oral disaggregation using multiv ariate structur al time series mo d els. Ec onometrics Journal 8 , 214-23 4. [28] Ni, S. and S un, D. (2003 ) Noninformative p riors and frequentist risks of Ba y esian esti- mators of v ector-autoregressiv e mo d els. Journal of Ec onometrics 115 , 159-197. [29] Ooms, M. (1994) Empiric al V e ctor Autor e gr essive Mo del ling . Sp ringer-V erlag, New Y ork. [30] P ollo c k, D.S.G. (2003) Recursive estimation in econometrics. Computational Statistics and D ata Analysis 44 , 37-75. [31] Queen, C.M. and Smith, J.Q. (1992) Dyn amic graphical mo dels. In Bayesian Statistics 4 , J.M. Bernardo, J.O. Berger, A.P . Da wid, and A.F.M. Smith (Eds). Oxford Univ ersit y Press, Oxf ord, 741-751. [32] Quinta na, J.M. and W est, M. (1987) An analysis of internatio nal exc hange rates u s ing m ultiv ariate DLMs. The Statistician 36 , 275-281. [33] Quinta na, J.M. and W est, M. (1988) Time series analysis of comp ositional data. In Bayesian Statistics 3 , J.M. Bernardo, M.H. DeGro ot, D.V. Lindley , and A.F.M. Smith (Eds). Oxford Universit y Press, O x f ord, 747-756 . [34] Salv ador, M. and Gargallo, P . (2004) . Au tomatic monitoring and interv en tion in multi - v ariate dyn amic linear mo dels. Computationa l Statistics and D ata Analysis 47 , 401-43 1. [35] Salv ador, M., Gallizo, J.L. and Gargallo, P . (200 3). A dynamic pr in cipal comp onents analysis based on m ultiv ariate matrix normal dynamic linear mo dels. Journal of F or e c asting 22 , 457-478. 20 [36] Salv ador, M., Gallizo, J.L. an d Gargallo, P . (2004) Ba y esian inference in a matrix n ormal dynamic linear mo d el with unknown co v ariance m atrices. Statistics 38 , 307-33 5. [37] Shum w a y , R.H. and Stoffer, D.S. (2000) Time Serie s Analys is and i ts Applic ations. Springer-V erlag, New Y ork. [38] Sun, D. and Ni, S. (2004) Ba y esian analysis of v ector autoregressiv e mo dels with non- informativ e pr iors. J ournal of Statistic al Planning and Infer enc e 121 , 291-30 9. [39] T r ian tafyllop oulos, K. (2006) Multiv ariate discount we ight ed regression and lo cal lev el mo dels. Computational Statistics and Data Analysis (in p ress). [40] T r ian tafyllop oulos, K. and P ikoulas, J. (2002) Multiv ariate regression applied to the problem of net work securit y . Journal of F or e c asting 21 , 579-594 . [41] W atkins C. and McAleer M. (2004). Econometric m o delling of non-ferr ous metal p r ices. Journal of Ec onomic Surveys 18 , 651-70 1. [42] W est, M. and Harrison, P .J. (1997) Bayesian F or e c asting and Dynamic Mo dels . 2nd edition. S p ringer V erlag, New Y ork. [43] W est, M., Prado, R. and Krystal, A.D. (1999) Ev aluatio n and comparison of EEG traces: laten t stru cture in n on s tationary time series. Journal of the Americ an Statistic al Asso ci a- tion 94 , 375-387 . [44] Whittle, P . (1984) Pr e diction and R e gu lation: by line ar le ast-squ ar e metho ds . 2nd edition. Blac k well, Oxf ord. 21
Original Paper
Loading high-quality paper...
Comments & Academic Discussion
Loading comments...
Leave a Comment