Covariance estimation for multivariate conditionally Gaussian dynamic linear models

Co v ariance Estimatio n for Multiv ariate Conditionally Gauss ian Dynamic Linear Mo dels K. T rian ta fyllop oulos ∗ 1 Marc h 2006 Abstract In m ultiv ariate time series , the estimation of the cov ar iance matrix of the observ ation innov ations plays an impo rtant role in fore c asting as it enables the computation o f the standardized foreca st err or vectors as w ell as it enables the computation of c o nﬁdence bo unds of the for ecasts. W e develop an o n-line, non-itera tive Bay esian algo rithm for estimation and forecasting. It is empirica lly found that, for a range of sim ulated time series, the prop osed cov ar iance es timator ha s go o d p er formance converging to the true v a lues of the unkno wn observ ation c ov ariance matrix. Over a simulated time s eries, the new metho d approximates the co rrect estima tes , pro duced by a non-sequential Monte Carlo simulation pro cedure, which is used here a s the g old standard. The sp ecia l, but impo rtant, v ector autoregres sive (V AR) and time-v a rying V AR mo dels are illustrated b y considering London metal exchange data consisting of sp ot prices of aluminium, copp er, lead and zinc. Some key wor ds: Multiv ariate time series, dynamic linear mo del, Kalman ﬁlter, vector autoregr essive mo del, London metal exchange. In tro duction Multiv ariate time series receiv e considerable atten tion b ecause a great deal of time series data arriv e in v ector form. Whittle (1984) and L ¨ utk ep ohl (1993) discuss V ARMA mo dels for vec tor resp onses, whilst Harvey (1989, Ch apter 8), W est and Harrison (1997, C hapter 16) and Durbin and Ko opman (2001, Chapter 3) extend this work to state space mo dels for observ ation vecto rs. In econometrics most stu d ies of s tate s p ace m o dels fo cus on trend estimation, signal extraction and v olatilit y . A review of recen t dev elopment s of state space mo dels in econometrics can b e found in Pol lo ck (2003). Barassi et al. (2005) and Gra vel le and Morley (2005) giv e applications of the Kalman ﬁlter to inte rest r ates data and Harv ey et al. (1994 ) u se Kalman ﬁlter tec hniques to estimate the v olatilit y of foreign exc hange rates usin g m ultiv ariate sto c hastic v olatilit y (MSV) mod els. With the exception of multiv ariate GAR CH and MS V m o dels, wh ic h fo cus on the prediction of the v olatilit y , it is usually desirable to use a structur al state sp ace mo del to forecast time series v ectors (e.g. foreign exc hange rates, monthly s ales, in terest rates, etc) and to estimate the observ atio n innov ation co v ariance matrix of the und erlying time series. F or such applications and for short term forecasting the ∗ Department of Probability and St atistics, Universit y of S heﬃeld, UK, email: k.triantaf yllopoulos@shef field.ac.uk 1 ab o v e co v ariance matrix can b e assumed time-in v arian t, but unkn own, and its estimation is the m ain aim of this pap er. The estimation of the observ ation co v ariance matrix pla ys an imp ortant role in forecasting. Firstly we note that, un der the general multiv ariate dynamic linear mo del (see equ ation (1) b elo w), the multi-step forecast mean of th e r esp onse time series v ector is a non-linear fu n ction of the observ ation co v ariance matrix (W est and Harrison, 1997, Chapter 16). Secondly , th e computation of the standardized forecast error ve ctors r equires a pr ecise estimation of the observ ation co v ariance matrix and thus a miss-sp eciﬁcation of the observ ation cov ariance matrix can lead to false results regarding the ev aluation and judgement of the mo d el. Thirdly , the m ulti-step forecast co v ariance matrix is a linear fun ction of the obs er v ation co v ariance matrix and the former is of particular interest; the f orecast co v ariance matrix can explain th e v ariabilit y of th e forecasts and hence it can enable the computation of conﬁd ence b ound s for the forecasts. Finally , the pr ecise estimation of the observ ation co v ariance matrix give s an accurate estimation of the cross-correlation str u cture of the s everal comp onent time series, whic h is particularly us efu l, esp ecially for ﬁ nancial time series. F or all the ab o v e reasons th e study of the estimation of the observ ation co v ariance matrix is w orthwhile and its con tribu tion to forecasting for m ultiv ariate time s er ies is paramount. The problem of the estimation of the observ ation inno v ation v ariance for univ ariate state space mo dels has b een well rep orted (W est and Harr ison, 1997, § 4.5; Durb in and Ko opm an, 2001, § 2.10), how ever, for v ector time series this pr oblem b ecomes considerably more com- plex and the a v ailable metho d ology consists of sp ecial cases, approxi mations and iterativ e pro cedur es. Let y t b e a p -dimensional observ ation vecto r follo w ing the Gaussian dynamic linear mo del (DLM): y t = F ′ θ t + ǫ t and θ t = Gθ t − 1 + ω t , (1) where θ t is a d -dimens ional Marko vian state v ector, F is a known d × p design matrix and G a known d × d transition matrix. The notation F ′ is used for the transp ose matrix of F . The distributions u sually adopted for { ǫ t } , { ω t } and θ 0 are th e multiv ariate Gaussian, i.e. ǫ t ∼ N p (0 , Σ), ω t ∼ N d (0 , Ω) and θ 0 ∼ N d ( m 0 , P 0 ), for some kno wn priors m 0 and P 0 . The innov ation ve ctors { ǫ t } and { ω t } are assum ed individually and mutually uncorrelated and they are also assumed uncorrelated with the initial state ve ctor θ 0 , i.e. for all t 6 = s : E ( ǫ t ǫ ′ s ) = 0, E ( ω t ω ′ s ) = 0, and f or all t, s > 0: E ( ǫ t ω ′ s ) = 0, E ( ǫ t θ ′ 0 ) = 0 and E ( ω t θ ′ 0 ) = 0, where E ( · ) denotes exp ectation. The cov ariance matrices Σ and Ω are t ypically unknown and th eir estimation or sp eciﬁcatio n is a we ll k n o wn problem. The inte rest is cen tered on the estimation of Σ, wh ile Ω can b e sp eciﬁed a priori (W est and Harrison, 1997, Chapter 6; Durbin and Ko opm an, 2001, § 3.2.2) . Sev eral metho ds h a v e b een prop osed, for the estimation of Σ. Harv ey (1986) and Q uin- tana and W est (1987) indep enden tly in tro duce matrix-v ariate DLMs, whic h are m atrix-v ariate linear state space m o dels allo wing for cov ariance estimation. Harv ey (1986) pr op oses a like - liho o d estimator, while Quinta na and W est (1987) prop ose a Ba ye sian estimation mo delling Σ with an inv erted Wishart distribution. Harvey (1986)’s mo d el is rep orted and furth er de- v elop ed in Harv ey (1989 ), F ern ´ andez and Harv ey (1990), Harve y and K o opman (1997) and Moauro and Sa vio (2005 ), w hile Quintana and W est (1987)’s mo del is rep orted and fu rther dev elop ed in Qu in tana and W est (1988), Queen and Smith (1992), W est and Harrison (1997) , Salv ador et al. (2003), Salv ador and Gargalo (2004) and Salv ador et al. (2004). Ho w ev er, b oth suggestions (Harv ey (1989)’s and Quinta na an d W est (1987)’s) are criticize d in Barb osa 2 and Harrison (199 2) wh ere it is sho wn that the ab o v e mo d els are restrictiv e in the sense th at one can decomp ose the r esp onse vecto r y t in to sev eral scalar time series and mo del eac h of these time series individually , us in g univ ariate DLMs. Barb osa and Harrison (1992) prop ose an appr o ximate algorithm for th e general DLM (1), bu t their main assu m ption seems rather unjus tiﬁ ed , since it su ggests that for any p × p matrix C it is Σ 1 / 2 C Σ − 1 / 2 = b Σ 1 / 2 C b Σ − 1 / 2 , where b Σ is a p oin t estimate of Σ and the notation Σ 1 / 2 stands for the symmetric square ro ot of Σ (Gupta and Nagar, 1999, p . 7). T his assum ption holds clearly w h en b Σ 1 / 2 , C comm ute and when Σ 1 / 2 = Σ ∗ , C comm ute, wh ere (Σ ∗ ) 2 is any particular realizati on of Σ. How ever, in general the ab ov e assum ption is diﬃcult to c hec k since Σ is the u nknown co v ariance ma- trix s u b ject to estimation. In addition, that assumption seems to b e prob ab ilistically quite inappropriate, since it translates that the non-sto c hastic qu antit y b Σ 1 / 2 C b Σ − 1 / 2 equals the sto c hastic q u an tit y Σ 1 / 2 C Σ − 1 / 2 with probabilit y 1. A p ossible analysis can b e obtained in sp ecial cases where Σ is diagonal or when th e oﬀ-diagonal elemen ts of Σ are all common. T ri- an tafyllop oulos and Piko ulas (2002) and T rian tafyllop oulos (2006) adopt the mo d el of Harve y (1986 ) and they p r o vide an improv ed on-line estimator for Σ based on a standard maxim um lik eliho o d tec hnique. T he problem is again that th e mo dels discussed lac k the general for- m ulation of the state space m o del (1); e.g. one can easily s h o w th at all ab o v e mo dels are sp ecial cases of mo d el (1). Iterativ e pro cedu res via maxim um lik eliho o d and Marko v chain Mon te Carlo (MCMC) tec hniqu es are a v ailable, b ut they tend to b e slo w, esp ecially as the dimension of th e observ ation ve ctor p in cr eases. Kitaga w a and Gersc h (1996), Shum w a y and Stoﬀer (2000, Chapter 4), Durbin and Ko opm an (2001, C hapter 7) an d Doucet et al. (2001) discuss univ ariate mo delling w ith iterativ e m etho ds, b ut their eﬃciency in m ultiv ariate time series is not ye t explored. Barb osa and Harrison (1992) and W est and Harrison (1997, § 16.2.3 ) discuss the problem of ineﬃciency of iterativ e metho ds and they p oin t out that the num b er of parameters to b e estimated in Σ is p ( p + 1) / 2 , which r apidly increases with the d imension p of the resp ons e v ector, e.g. for p = 10 there are 55 distinct parameters in Σ to b e estimated. In add ition to th is Dic key et al. (198 6) discu ss relev an t issu es on sp ecifying and assessing the prior distribution of Σ p oin ting out d iﬃculties in the implementat ion of iterativ e pro cedures. In this pap er we prop ose a new non-iterativ e Ba y esian pro cedur e for estimating Σ and for forecasting y t . This pro cedure oﬀers a n ov el estimator of Σ for the general DLM (1). The prop osed estimator is empirically found to conv erge to the true v alue of Σ and th is estimator appro ximates we ll the resp ectiv e estimators in the sp ecial cases of the conjugate univ ariate and matrix-v ariate DLMs. A comparison with a n on-sequen tial Mon te Carlo sim ulation sho ws that the n ew metho d pro du ces estimates close to the MCMC. The fo cus and the b eneﬁt emplo ying the new metho d is on on-line estimation and therefore n o attempt has b een made to compare the prop osed algorithms with sequentia l iterativ e pro cedur es. The reason for this is jus tiﬁ ed by the ab o v e discussion and the in terested reader sh ould refer to Dic k ey et al. (1986 ) and W est and Harrison (1997, § 16.2 .3). The p rop osed forecasting pro cedure for mo d el (1) is applied to the imp ortan t mo del su b classes of vecto r autoregressiv e (V AR) and V AR with time-dep endent parameters. T hese mo dels are illustrated b y considering Lond on metal exc hange data, consisting of sp ot prices of aluminiu m, copp er, lead and zinc (W atkins and McAleer, 2004). W e b egin by d evelo ping the main idea of the pap er and giving the prop osed algorithm. The p erformance of this algorithm is illus tr ated in the f ollo wing section by considering sim ulated time series data; a comparison with a Mon te Carlo simula tion is p erformed. The p ro ceeding section give s an application to v ector autoregressiv e mo delling, which is used to analyze London m etal exchange data, in the follo wing section. The app endix d etails a pro of of a 3 theorem in the pap er and it describ es the MCMC s imulation pro cedu re. Main Results Denote w ith y t = ( y 1 , y 2 , . . . , y t ) the inf orm ation set comprising data up to time t , for some p ositiv e in teger t > 0. Let m t and P t b e the p osterior mean and co v ariance matrix of θ t | y t and S t b e th e p osterior exp ectatio n of Σ, i.e. E (Σ | y t ) = S t . Let y t (1) = E ( y t +1 | y t ) = F ′ Gm t b e the one-step forecast mean at time t and Q t +1 = V ar ( y t +1 | y t ) = F ′ R t +1 F + S t b e the one-step forecast co v ariance matrix at t , where R t +1 = GP t G ′ + Ω. Up on observing y t +1 , we deﬁne the one-step forecast error v ector as e t +1 = y t +1 − y t (1). The next result (pro v ed in the ap p endix) giv es an approximat e prop ert y of S t . Theorem 1. Consider the dynamic line ar mo del (1). L et Σ b e the c ovarianc e matrix of the observation innovation ǫ t and assume that lim t →∞ S t = Σ , wher e E (Σ | y t ) = S t is the true p osterior me an of Σ given y t . L e t n 0 b e a p ositive sc alar and S 0 = E (Σ) b e the prior exp e ctation of Σ . If Σ is b ounde d, then for lar ge t the fol lowing holds appr oximately S t = 1 n 0 + t n 0 S 0 + t X i =1 S 1 / 2 i − 1 Q − 1 / 2 i e i e ′ i Q − 1 / 2 i S 1 / 2 i − 1 ! , (2) wher e e i , Q i ar e deﬁne d ab ove and S 1 / 2 i − 1 , Q − 1 / 2 i denote r esp e c tively the symmetric squar e r o ots of the matric es S i − 1 , Q − 1 i b ase d on the sp e ctr al de c omp osition factorization of symmetric p ositive deﬁnite matric es ( i = 1 , 2 , . . . , t ) . Conditionally n o w on Σ = S , for a particular v alue S , we can apply the Kalman ﬁlter to the DLM (1) and obtain the p osterior and predictiv e distributions of θ t | Σ = S, y t and y t + h | Σ = S, y t , for a p ositive in teger h > 0, kn o wn as the forecast horizon. Theorem 1 motiv ates approximat ing the true p osterior mean S t b y S = e S t , whic h is pr o duced from application of equation (2), give n a particular data set y t = ( y 1 , y 2 , . . . , y t ). T hus w e obtain the f ollo wing algorithm: Algorithm 1. (a) Prior distribution at time t = 0 : θ 0 | Σ = e S 0 ∼ N d ( e m 0 , e P 0 ) , f or some e m 0 , e P 0 and e S 0 . (b) Posterior distribution at time t : θ t | Σ = e S t , y t ∼ N d ( e m t , e P t ) , wher e e e t = y t − e y t − 1 (1) and e m t = G e m t − 1 + A t e e t , e P t = G e P t − 1 G ′ + Ω − A t e Q t A ′ t , A t = ( G e P t − 1 G ′ + Ω) F e Q − 1 t , e S t = 1 n 0 + t n 0 e S 0 + t X i =1 e S 1 / 2 i − 1 e Q − 1 / 2 i e e i e e ′ i e Q − 1 / 2 i e S 1 / 2 i − 1 ! . (c) h -step for e c ast distribution at t : y t + h | Σ = e S t , y t ∼ N p { e y t ( h ) , e Q t ( h ) } , wher e e y t ( h ) = F ′ G h e m t and e Q t ( h ) = F ′ G h e P t ( G h ) ′ F + h − 1 X i =0 F ′ G i Ω( G i ) ′ F + e S t . 4 In the sp ecial case of matrix-v ariate DLMs (Harvey , 1986; W est and Harrison, 1997, § 16.4) the estimator S t appro ximates the true p osterior mean of Σ pro du ced by an application of Ba y es’ theorem, assuming a prior inv erted Wishart d istribution f or Σ. T o see this, n ote that in the matrix-v ariate DLM (this mo del is b r ieﬂy in page 15, s ee equ ation (8)), F is a d -dimensional d esign v ector and Q t = U t S t − 1 with U t = F ′ R t F + 1 an d so equation (2) can b e w ritten recurs ively as S t = n − 1 t ( n t − 1 S t − 1 + e t e ′ t /U t ) and n t = n t − 1 + 1 = n 0 + t. (3) It is easy to v erify that th e assumption lim t →∞ S t = Σ is satisﬁed, since lim t →∞ S t = lim t →∞ E (Σ | y t ) and lim t →∞ V ar { v ec h (Σ) | y t } = 0, where v ec h ( · ) den otes the column stac king op erator of a lo wer p ortion of a symmetric matrix. F or p = 1 the matrix-v ariate DLM is reduced to th e conjugate Gaussian/gamma DLM (W est and Harrison, 1997, § 4.5). It turn s out that the estimator S t of equation (2) appro ximates the analogous estimators of all existing conjugate Gaussian dyn amic linear mo dels. It is w orth noting that Theorem 1 and Algorithm 1 h a v e b een presen ted for the state space mo del (1) having time-in v ariant comp onents F , G and Ω. How ever, these results apply if some or all of the ab o v e comp onents c hange with time. In addition, if the ev olution co v ariance matrix Ω t is time-dep endent, it can b e sp eciﬁed via d iscoun t f actors (W est and Harrison, 1997, C hapter 6). This is a u s eful consideration, b ecause in pr actice th e signal θ t is un lik ely to hav e the same v ariabilit y o v er time. F or the applicatio n of Algorithm 1 the initial v alues e m 0 , e P 0 , n 0 and e S 0 m ust b e sp eciﬁed. e m 0 can b e sp eciﬁed fr om h istorical information from th e u nderlying exp erimen t and e P 0 can b e set as a typica lly large diagonal matrix, e.g. e P 0 = 1000 I p , reﬂecting a lo w pr ecision (or high uncertain t y) on the s p eciﬁcation of th e moments of θ 0 . Th e scalar n 0 can b e s et to n 0 = 1 (in the sp ecial case of matrix-v ariate DLMs, n 0 is the prior degrees of freedom). e S 0 is a p rior estimate of Σ and requires at least a rough sp eciﬁcation. As information is deﬂated in time series, a miss-sp eciﬁcation of e S 0 ma y not aﬀect muc h the p osterior estimate e S t , esp ecially in the pr esence of large data sets. Ho w ev er, in man y cases and esp ecially in ﬁnancial time series, a miss -sp eciﬁcation of e S 0 can lead to p o or estimates of Σ . Here we suggest that a diagonal co v ariance matrix can b e used , where th e d iagonal elemen ts of e S 0 reﬂect th e empirical exp ecta tion of the diagonal elemen ts of Σ. This exp ectat ion can b e obtained b y stud ying historical data and other qu alitativ e pieces of information, whic h are usually av ailable to p racticing exp er ts of the exp erimen t or of the application of in terest. Sim ulation Studies Empirical Conv ergence of e S t W e hav e generated 1000 b iv ariate time ser ies { y it } t =1 , 2 ,..., 500 i =1 , 2 ,..., 1000 from several state space mo dels and then w e ha v e a v eraged the 1000 estimates e S i,t (pro du ced by eac h of the 1000 time series) and compared the a v erage e S t = 1000 − 1 P 1000 i =1 e S i,t with the true v alue of Σ . Since in p ractice complicated mo d els are decomp osed into simple mo dels comprisin g local lev el, p olynomial trend and seasonal comp onents (Go dolphin and T r ian tafyllop oulos, 2006), w e consider estimation separately in suc h diﬀerent comp onent mo d els. W e h a v e three m o d- elling s ituations of in terest: situation 1 (biv ariate lo cal lev el mo d els); situation 2 (biv ariate linear tr end mo d els); and situation 3 (biv ariate seasonal mo dels). F or eac h of the ab ov e 5 three situations we ha v e generated 1000 biv ariate time series, eac h of length 500, us in g three diﬀeren t co v ariance m atrices Σ, i.e. Σ 1 =  2 3 3 5  , Σ 2 =  100 8 5 85 80  and Σ 3 =  1 7 7 50  . Throughout the simulat ions we ha ve chosen h igh correlations for eac h Σ i ( i = 1 , 2 , 3), since uncorrelated or approximate ly uncorrelated state space mo dels can b e hand led easily by emplo ying several u niv ariate state space m o dels. Th e priors of Σ i are c hosen as e S (1) i, 0 = I 2 , e S (2) i, 0 = 150 I 2 and e S (3) i, 0 = diag { 3 , 40 } ( i = 1 , 2 , . . . , 1000). The d iagonal choic e for th e p r iors e S ( j ) i, 0 has b een don e for: (a) op erational simplicit y (the user is likely to exp ect rough v alues for the diagonal elements of Σ i , r ather than for the asso ciated correlations) and (b) jud ging ho w the estimation of Σ i is aﬀected by imp rop er priors in the s ense of setting the oﬀ-diag onal elemen ts of e S ( j ) i, 0 to zero, while the true v alues of Σ i p osses high correlations. Thr ou gh ou t the mo dels the remaining settings are n 0 = 1, Ω = I 2 , m 0 = [0 0] ′ and P 0 = 1000 I 2 , for all mo dels. T able 1 sh o ws the r esu lts. There are three blo c ks of columns, eac h sh o wing resu lts of the state space mo del considered, namely lo cal lev el mo del (LL or b lo c k 1), linear trend mo del (L T or b lo c k 2) and seasonal mo del (SE or b lo c k 3). In eac h blo c k the ﬁr st column sho ws th e mean of the a v erage S = 500 − 1 P 500 t =1 e S t of all e S t . T he second column sho ws the a ve rage e S 100 at time p oin t t = 100. Lik ewise the third column s h o ws the r esp ectiv e e S 500 a v eraged ov er all 1000 series. Th e r ows in T able 1 sho w the picture of e S t o v er the three diﬀerent v alues of Σ, e.g. Σ 1 , Σ 2 and Σ 3 . The a ve rage estimate of the correlations is also shown and it is marked in the table by ρ . The resu lts suggest that, generally , the LL m o del has the b est p erformance as opp osed to the L T and the SE mo d el, although w e n ote that σ 12 = 3 (co v ariance in Σ 1 ) is estimated b etter from the L T mo del. It app ears th at the estimator e S t for all mo d els conv erges to the true v alues of Σ , but the rate of conv ergence dep en d s on the underlying state space mo del (here LL p erforms faster conv ergence) and on the prior e S 0 . T able 2 sh o ws the a v eraged (o v er all 1000 simulat ed time series) mean v ector of squared standardized one-step forecast errors (MSSE (1) ), for eac h of the three mo dels (LL, L T, S E) and for eac h of Σ (Σ 1 , Σ 2 , Σ 3 ). F or comparison pu rp oses, T able 2 also sho ws th e resp ectiv e v alues of the MSSE (2) when Σ i is the tr ue v alue. The target v alue of the MSSE ( i ) is [1 1]. W e see that the MSSE (1) approac hes the resp ectiv e MSSE (2) and this demonstrates the accuracy of th e estimator e S t . W e observe that un der Σ 3 , th e MSSE (1) has v alues signiﬁcan tly sm aller than 1 as compared to the MSSE (2) using the tru e v alue of Σ 3 . Comparison of the Lo cal Level Mo del with MCMC W e ha v e sim ulated a sin gle lo cal lev el mo del un der the observ ation co v ariance matrix Σ = Σ 1 and the relev an t m o del comp onen ts of th e lo cal level m o del of the previous sub -section. W e apply Algorithm 1 and w e compare it w ith a state of the art MCMC estimation p ro cedure based on a blo c k ed Gibbs samp ler su itable for state sp ace mo dels (Gamerman, 1997, p. 149); the MCMC p ro cedure we use is describ ed in the app endix. Th e MCMC estimation pro cedur e is an iterativ e non-sequentia l MCMC pr o cedure and its r ole in this section is to pro vide a means of comparison with the non-iterativ e pro cedur e of Algorithm 1. MCMC is the gold standar d , since it pr o duces (giv en enough computation) exact computation of S t . But MCMC is impr actical ; the new p rop osed metho d is a quick, pr actical and easily implemente d 6 T able 1: P erformance of the estimator e S t (of Algorithm 1 ) for 1000 simulated biv ariate dynamic mo dels generated from a lo cal lev el mo del (LL), a linear trend mo d el (L T) and a seasonal mo del (SE ) und er three observ ation co v ariance m atrices Σ 1 , Σ 2 and Σ 3 . Mo del LL L T SE Σ = Σ i S e S 100 e S 500 S e S 100 e S 500 S e S 100 e S 500 σ 11 = 2 1.945 1.938 1.9 97 2.572 2.721 2.392 2.171 2.207 2.165 σ 12 = 3 2.798 2.770 2.9 20 2.988 3.029 3.026 2.314 2.186 2.589 σ 22 = 5 4.722 4.685 4.8 99 4.547 4.489 4.777 4.399 4.283 4.694 ρ = 0 . 948 0.923 0.919 0.9 33 0.874 0.867 0.895 0.748 0.711 0.812 σ 11 = 100 100.03 9 99.9 31 100.30 3 98.271 98.157 98.368 98.7 31 98.806 99.602 σ 12 = 85 83.133 8 3.028 84 .757 79.471 78.969 82.660 79.6 27 79.427 83.254 σ 22 = 80 80.430 8 0.277 80 .353 78.917 78.865 79.822 79.7 55 79.886 79.896 ρ = 0 . 950 0.927 0.927 0.9 44 0.902 0.897 0.933 0.897 0.894 0.933 σ 11 = 1 1.124 1.135 1.1 01 1.200 1.234 1.126 1.184 1.202 1.151 σ 12 = 7 6.506 6.457 6.7 35 5.388 5.177 5.904 5.764 5.623 6.234 σ 22 = 50 49.305 4 9.375 49 .840 48.518 48.579 49.392 48.8 16 49.038 49.540 ρ = 0 . 989 0.784 0.862 0.9 09 0.706 0.668 0.791 0.758 0.732 0.825 T able 2: Mean v ector of squared standardized one-step forecast errors (MSSE ( i ) ) of the mul- tiv ariate d ynamic mo d el of Algorithm 1. The ind ex i = 1 , 2 r efers to when Σ is estimated by the data ( i = 1) and when Σ is assumed kno wn (according to the sim ulations) for comparison purp oses ( i = 2). The notation LL, L T, SE and Σ 1 , Σ 2 , Σ 3 is the same as in T able 1. MSSE (1) MSSE (2) LL (Σ 1 ) 0.994 1.0 71 0 .999 0.99 5 LL (Σ 2 ) 0.939 0.9 14 0 .999 0.99 9 LL (Σ 3 ) 0.773 1.0 26 0 .998 0.99 7 L T (Σ 1 ) 0.875 1.1 41 0 .992 0.99 6 L T (Σ 2 ) 0.900 0.8 95 1 .002 0.99 7 L T (Σ 3 ) 0.774 1.0 31 1 .000 0.99 6 SE (Σ 1 ) 0.930 1.092 0.997 0 .999 SE (Σ 2 ) 0.903 0.864 0.998 0 .996 SE (Σ 3 ) 0.805 1.026 0.998 0 .996 7 T able 3: Biv ariate sim ulated lo cal level dynamic linear m o del. Show ed are: r eal ve rsu s esti- mated v alues of Σ = { σ ij } i,j =1 , 2 and the correlation co eﬃcien t ρ . Th e ﬁrst column indicates ho w m an y observ ations we re us ed in eac h estimation. σ 11 σ 12 σ 22 ρ N Real MCMC New Real MCMC New Real MCMC New Rea l MCMC New 100 2.0 0 1.68 1.24 3.00 2.35 1.82 5.00 4.12 3.69 0.95 0.90 0.85 150 2.0 0 1.78 1.34 3.00 2.44 1.91 5.00 3.97 3.72 0.95 0.92 0.86 200 2.0 0 1.76 1.46 3.00 2.48 2.04 5.00 4.02 3.77 0.95 0.94 0.87 250 2.0 0 2.04 1.64 3.00 2.89 2.33 5.00 4.50 4.17 0.95 0.95 0.89 300 2.0 0 1.92 1.65 3.00 2.78 2.39 5.00 4.44 4.31 0.95 0.95 0.90 350 2.0 0 1.90 1.69 3.00 2.76 2.46 5.00 4.46 4.40 0.95 0.95 0.90 400 2.0 0 2.01 1.76 3.00 2.90 2.56 5.00 4.59 4.50 0.95 0.96 0.91 450 2.0 0 2.05 1.82 3.00 2.97 2.66 5.00 4.71 4.65 0.95 0.96 0.92 500 2.0 0 2.11 1.85 3.00 3.09 2.74 5.00 4.90 4.79 0.95 0.96 0.92 appro ximation. In this section we compare the new m etho d with the gold stand ard in ord er to sho w ho w goo d is the approxima tion. T ables 3 and 4 give th e results; the former s ho ws the estimates of Σ with b oth metho ds (MCMC and Algorithm 1) and the latter shows the p erformance of the one-step forecast errors for b oth m etho ds. In T able 4 the one-step forecast error vect or e t = [ e 1 t e 2 t ] ′ and the mean vect or of squared one-step forecast errors are sho wn for sev eral v alues of t und er b oth estimation metho ds. W e observ e that the new m etho d (of Algorithm 1) approxima tes well the MCMC estimates, esp ecial ly for large v alues of time t = N . W e n ote th at MCMC should not b e considered as a b etter metho d as compared to the prop osal of Algorithm 1, since MCMC is an iterativ e and in particular in this pap er it is a non- sequen tial estimation pro cedur e. T h e application of sequ en tial MCMC estimation (Doucet et. al. , 2001) often exp erience several c hallenges as for example time-constrain ts, a v ailabilit y for general pu r p ose algorithms, pr ior-sp eciﬁcation, prior-sensitivit y , fast monitoring and exp ert in terv enti on features. Th e prop osal of this p ap er pro vides a strong mo delling approac h allo w- ing for v ariance estimation in a wide class of cond itionally Gaussian dyn amic linear mo d els and th is section sho ws that for large time p erio ds its p erform an ce is close to Mon te Carlo estimation. Application to V AR and TVV AR Time Series Mo dels The dynamic mo d el (1) is very general and an im p ortan t su b class of (1) is the p opular v ector ARMA mo del. In r ecent y ears v ector autoregressiv e (V AR) mo dels ha v e b een extensively dev elop ed and u sed, esp ecially for economic time series, as in Doan et al. (1984 ), Litterman (1986 ), Kadiya la and Karlsson (1993, 1997) , Ooms (199 4), Johansen (199 5), Uhlig (1997), Ni and S un (2003), Sun and Ni (2004) and Huerta and Prado (2006). Our discussion in this s ection includes tw o imp ortan t sub classes of mo del (1), wh ich can b e u sed f or a wide-class of s tationary and non-stationary time series forecasting. The ﬁrst is 8 T able 4: Biv ariate simulat ed lo cal leve l dyn amic linear mo d el. S ho we d are: one-step forecast errors at time t = N and the squared su m s of the forecasting errors up to time N . e 1 N e 2 N N − 1 P N t e 2 1 t N − 1 P N t e 2 2 t N MCMC Ne w MCMC New MCMC New MCMC New 100 − 2.14 − 2.22 − 2.46 − 2.49 4.8 5 4.84 8.54 8.59 150 − 0.14 − 0.28 − 3.88 − 3.92 4.8 3 4.86 8.21 8.26 200 − 1.02 − 1.09 − 0.12 0.05 5.02 5.04 8.10 8.13 250 − 0.24 − 0.20 − 1.45 − 1.47 5.2 5 5.30 8.72 8.76 300 − 1.72 − 1.79 − 0.61 − 0.71 5.0 1 5.12 8.91 8.95 350 − 0.42 − 0.46 − 1.91 − 1.91 5.1 2 5.12 9.01 9.04 400 − 0.42 − 0.54 − 2.26 − 2.29 5.2 1 5.22 9.09 9.12 450 − 3.91 − 4.02 − 5.42 − 5.41 5.2 8 5.29 9.32 9.33 500 1.48 1.68 − 0.93 − 0.98 5.24 5.24 9.54 9.54 the V AR mo del of kn o wn order ℓ ≥ 1, d eﬁned by y t = Φ 1 y t − 1 + Φ 2 y t − 2 + · · · + Φ ℓ y t − ℓ + ǫ t , ǫ t ∼ N p (0 , Σ) , (4) where Φ 1 , Φ 2 , . . . , Φ ℓ are p × p matrices of p arameters. In th e usual estimatio n of V AR, stationarit y h as to b e assumed and so the r o ots of the p olynomial (in z ) | I p − Φ 1 z − Φ 2 z 2 − · · · − Φ ℓ z ℓ | = 0 should lie outside the unit circle. In standard theory (4) ma y not assume a Gaussian d istri- bution for ǫ t , although in pr actice this is u sed for op erational s implicit y . It is also kn o wn that for a high ord er ℓ m o del (4) appro ximates multiv ariate m o ving a ve rage mo dels, whic h are t ypically diﬃcu lt to estimate and this mak es the V AR eve n more attractiv e in app lications. It is also kno wn th at for general on-line estimation and forecasting, the cov ariance m atrix Σ either has to b e assumed kno wn or it has to b e diagonal. This is a ma jor limitation, b ecause it means that either the mo deller kno ws a priori the cross-correlation b etw een the series { y 1 t } , { y 2 t } , . . . , { y pt } , where y t = [ y 1 t y 2 t · · · y pt ] ′ , or that the p scalar time series are all sto c hastically uncorrelated, in which case it is more sensible to u se s everal univ ariate AR mo dels instead. Recen tly , the need for estimation of Σ as a full co v ariance matrix (e.g. where Σ h as p ( p + 1) / 2 elements to b e estimated) is considered, but the existing estimation pro cedur es include n ecessarily iterativ e estimation via imp ortance samp ling (Kadiya la and Karlsson, 1997). Ni and Su n (2003) p oin t out th at from a f requent ist standp oint ordinary least squares and maximum lik elihoo d estimators of (4) are u na v ailable. These authors state that asymp totic theory estimators m ay not b e applicable for V AR (esp ecially when { y t } is a short-length time series). Ni and Su n (2003), Su n and Ni (2004) and Huerta and Prado (2006) prop ose Ba yesian estimation of the autoregressive p arameters Φ i and Σ, b ased on MCMC. It follo ws that for mo d el (1) when Σ is unkn own, only iterativ e estimation pro cedur es can b e applied. Our p rop osal f or on-line estimation of Σ give s a s tep f orw ard to the estimation and forecasting of V AR mo dels and it is outlined b elo w. W e pr op ose a generalization of th e un iv ariate state space repr esentati on considered in W est and Harrison (1997 , § 9.4.6). Other state sp ace r epresen tations of the V AR are consid ered in 9 Huerta and P r ado (2006) , but these represent ations, us ually referred to as canonical represen- tations of the V AR mo del (Sh umw a y and Stoﬀer, 2000) are not conv enien t for the estimation of Σ , b ecause Σ is embedd ed into the ev olution equation of th e states θ t . First note that we can rewrite (4) as y t = Φ X t + ǫ t , wh ere Φ = [Φ 1 Φ 2 · · · Φ ℓ ] and X t = [ y ′ t − 1 y ′ t − 2 · · · y ′ t − ℓ ] ′ and so w e can write y t = F ′ t θ + ǫ t = ( X ′ t ⊗ I p )v ec(Φ) + ǫ t , (5) where v ec( · ) denotes the column stac king op erator of a p ortion of a m atrix and ⊗ denotes the Kroneck er or tensor pro duct of t wo matrices. Mod el (5) can b e seen as a r egression-t yp e time series mo d el and it can b e hand led b y the general Algorithm 1 for mo del (1) if w e set G = I p , Ω = 0 and if we rep lace F b y the time-v arying F t = X t ⊗ I p . Th us we can readily apply Algorithm 1 to estimate Σ and θ or Φ 1 , Φ 2 , . . . , Φ ℓ . Mo ving to the time-v arying vec tor autoregressiv e (TVV AR) time series, in r ecen t ye ars there has b een a growing literature for TVV AR time series. Kitaga w a and Gersch (1996), Dahlhaus (1997), F rancq and Gautier (2004) and And erson and Meersc haert (2005 ) study parameter estimation based on the asymptotic b ehavi our of TVV AR and time-v arying ARMA mo dels. F rom a s tate sp ace standp oint W est et al. (1999) prop ose a state space formulati on for a univ ariate time-v aryin g AR mo d el applied to electroencephalographic data. In th is section we extend this state space formulation to a v ector of observ ations and hence w e can prop ose the application of Algorithm 1 in order to estimate the co v ariance matrix of th e err or drifts of the TVV AR mo d el. Consider th at the p -vect or time series { y t } follo ws the TVV AR mo del of k n o wn ord er ℓ deﬁned by y t = Φ 1 t y t − 1 + Φ 2 t y t − 2 + · · · + Φ ℓt y t − ℓ + ǫ t , ǫ t ∼ N p (0 , Σ) , (6) where Φ 1 t , Φ 2 t , . . . , Φ ℓt are the time-v aryin g autoregressive parameter matrices. T h e mo del can b e stationary , lo cally-stati onary or non-stationary dep end ing on the ro ots of the t p oly- nomials (in z ) | I p − Φ 1 t z − Φ 2 t z 2 − · · · − Φ ℓt z ℓ | = 0 . T ypical considerations include th e lo cal stationarit y w h ere there are several r egimes for whic h, lo cally , { y t } is stationary , b ut globally { y t } is non-stationary . Also the time-dep endent p aram- eter matrices Φ it can allo w for an imp r o v ed d ynamic ﬁt as opp osed to the s tatic parameters of th e V AR. In our dev elopmen t w e adopt a r andom walk f or the evolutio n of the parameters Φ it ( i = 1 , 2 , . . . , ℓ ), although the mo deller might suggest other Mark o vian sto chastic evo lution form ulae for Φ it . The random w alk ev olution is the natural consideration wh en { y t } is assumed lo cally stationary . Hence we can r ewrite m o del (6) in s tate-space f orm as y t = Φ t X t + ǫ t = F ′ t θ t + ǫ t and θ t = θ t − 1 + ω t , (7) where X t = [ y ′ t − 1 y ′ t − 2 · · · y ′ t − ℓ ] ′ , F t = X t ⊗ I p , Φ t = [Φ 1 t Φ 2 t · · · Φ ℓt ], θ t = vec(Φ t ) an d ω t ∼ N p 2 ℓ (0 , Ω), for s ome transition co v ariance m atrix Ω . Mo del (7) is reduced to (5 ) wh en Ω = 0, in w hic h case θ t = θ t − 1 = θ . After sp ecifying Ω, w e can d ir ectly apply Algorithm 1 to the state sp ace mo del (7) and th us we can obtain an algorithm for th e estimation of Σ, for the estimation of θ t or Φ 1 t , Φ 2 t , . . . , Φ ℓt and for forecasting the s er ies { y t } . 10 Aluminium spot price (US$ per tonne) Time Spot price 0 50 100 150 200 1700 1950 Copper spot price (US$ per tonne) Time Spot price 0 50 100 150 200 3200 4000 Lead spot price (US$ per tonne) Time Spot price 0 50 100 150 200 850 950 Zinc spot price (US$ per tonne) Time Spot price 0 50 100 150 200 1200 1500 Figure 1: LME d ata, consisting of alumin ium, copp er, lead and zinc sp ot p r ices (in US dollars p er tonne of eac h metal). 11 London Metal Exc hange Data In this s ection we analyze London metal exc hange (LME) data consisting of oﬃcial sp ot pr ices (US dollars p er tonne of metal). LME is the w orld’s leading n on-ferrous metals’ mark et, trading curr en tly highly liquid con tracts for metals, such as aluminiu m, alumin ium allo y , copp er, lead, nick el, tin and zinc. According to th e LME we bsite ( http://ww w.lme.co .uk/ ) “LME is highly successful with a turnov er in excess of US$3,000 billion p er annum. It also con tributes to the UKs invisible earnings to the s um of more th an £ 250 million in o v erseas earnings eac h ye ar.” More inf orm ation ab out th e functions of the LME can b e f ound via its w ebsite (see ab o ve ); the recentl y gro wing literature on the econometrics mo delling of the LME can b e found in th e review of W atkins and McAleer (2004). W e consider forecasting for four m etals exc hanged in the L ME, namely aluminium, copp er, lead and zinc. T he data are provi ded from the LME website for the p erio d of 4 J an uary 200 5 to 31 Octob er 2005. After excluding week end s and bank h olida ys there are N = 210 trading da ys. W e store the d ata into the 4 × 1 vec tor time s er ies { y t } t =1 , 2 ,..., 210 and y t = [ y 1 t y 2 t y 3 t y 4 t ] ′ , where y 1 t denotes th e sp ot price at time t of alumin ium, y 2 t denotes th e sp ot price at time t of copp er, y 3 t denotes th e sp ot p rice at time t of lead and y 4 t denotes th e sp ot p rice at time t of zinc. The data are plotted in Figure 1. W e p rop ose th e V AR and TVV AR m o dels of the pr evious section; the m otiv ation of this b eing that from Figure 1 the evo lution of the d ata seems to follo w roughly an autoregressiv e t yp e mo del. I ndeed ther e is an apparent trend with no seasonalit y , whic h can b e mo d elled with a trend mo del or with a V AR or TVV AR mo d el of the previous s ection. Here we illustrate the prop osal of V AR and TVV AR mo dels, whic h, acco rd ing to the p r evious section, can estimate the co v ariance matrix of y t , giv en the state parameters, and th us th e correlation structure of { y t } can b e stu died. Other mo dels for th is kind of d ata hav e b een applied in T r ian tafyllop oulos (2006) an d we can en visage th at the mo dels of W est and Quintana (1987) can also b e applied to the LME data. First we apply the algorithms of th e previous section to seve ral V AR and T VV AR mo dels of diﬀeren t ord ers in order to ﬁn d out which mo del giv es the b est p erformance. Performance here is measured via the mean v ector of squared standard ized one-step forecast errors (MSSE) and the mean vecto r of absolute p ercen tage one-step forecast errors (MAPE). The ﬁ rst is c hosen as a general p erformance measure taking int o accoun t the estimation of the co v ariance matrix Σ and the second is chosen as a generally reliable p ercen t p erformance measure. T able 5 sho ws the results of 10 V AR( i ) and TVV AR( i ) mo dels (ﬁrs t column) of ord er i = 1 , 2 , . . . , 10. The d iscount factor δ refers to the discoun ting of the evolutio n co v ariance matrix of the state parameters θ t ; δ = 1 refers to a static θ t = θ (V AR mo del), while δ < 1 refers to a d ynamic lo cal lev el evo lution of θ t = θ t − 1 + ω t (TVV AR mo d el). T able 5 sh ows that the p er f ormance of th e TVV AR is remark able compared with th e p erformance of V AR, w h ic h pro d uces ve ry high MSS E thr oughout the range of i . Out of the V AR mo dels, the b est is the V AR(1), wh ic h still pro duces very large MSSE. Th is ind icates that a m o ving a v erage (MA) mo del is unlike ly to pr o duce go o d results at all, as the MSSE of the V AR increases with the ord er i . Also the appro ximation of a MA mo del with a high order V AR mo d el will include a large num b er of state p arameters to b e estimate d and this will introd u ce computational problems. Therefore, our atten tion is fo cused on th e TVV AR mo dels. F rom a computational stand- p oint w e note that as the ord er increases δ can not b e to o lo w, b ecause then th ere are computational d iﬃculties in the calculatio n of the symmetric square r o ot of e Q t , used for the estimation of e S t (the estimate of Σ). Low er v alues of δ w ork b etter (T riantafyllo p oulos, 2006) 12 T able 5: Mean vect or of squ ared stand ardized one-step forecast errors (MSSE) and mean v ector of absolute p ercen tage one-step forecast er r ors (MAPE) of the multiv ariate LME time series { y t } . The ﬁr st column indicates s everal V AR and TVV AR mo dels. MSSE MAPE V AR(1) 6.614 16.782 7.655 18.370 0.033 0.071 0.143 0.057 TVV AR(1): δ = 0 . 1 2.430 1.764 0.622 1.852 0.05 9 0.053 0.084 0.076 V AR(2) 19.610 15.966 11.934 10.271 0.081 0.226 0.201 0.101 TVV AR(2): δ = 0 . 35 1.296 1.743 1.228 1.822 0. 065 0.057 0.116 0.095 V AR(3) 11.777 23.715 9.906 9.058 0.58 5 0.345 0.480 0.246 TVV AR(3): δ = 0 . 65 2.254 3.149 2.222 2.180 0. 074 0.053 0.132 0.108 V AR(4) 39.979 54.892 28.407 19.169 0.159 0.103 0.235 0.161 TVV AR(4): δ = 0 . 6 1.389 1.802 1.210 1.329 0.10 1 0.072 0.179 0.147 V AR(5) 18.592 16.605 15.474 12.570 0.203 0.076 0.392 0.248 TVV AR(5): δ = 0 . 7 1.429 2.269 1.651 1.677 0.11 4 0.079 0.208 0.171 V AR(6) 24.910 19.085 14.584 17.784 0.206 0.134 0.320 0.197 TVV AR(6): δ = 0 . 75 1.828 2.705 1.683 1.757 0. 132 0.089 0.243 0.197 V AR(7) 21.722 38.054 14.597 14.180 0.330 0.092 0.422 0.490 TVV AR(7): δ = 0 . 75 1.366 2.044 1.191 1.531 0. 148 0.101 0.280 0.223 V AR(8) 28.985 35.867 11.291 16.370 0.515 0.325 0.812 0.563 TVV AR(8): δ = 0 . 8 2.130 2.900 1.637 1.910 0.16 8 0.111 0.326 0.249 V AR(9) 40.229 53.798 12.249 19.691 0.393 0.184 0.416 0.411 TVV AR(9): δ = 0 . 95 14.042 21.011 6.724 8.708 0.20 7 0.124 0.352 0.284 V AR(10) 46.791 49.869 16.240 23.974 0.611 0.306 0.751 0.694 TVV AR(10): δ = 0 . 9 4.273 7.541 3.637 5.629 0. 205 0.124 0.391 0.296 13 and here we ha v e chosen th e low est v alues of δ , wh ic h are allo w ed. Our d ecision on the b est TVV AR mo del is based on the follo wing four criteria. 1. low order mo dels are p referable as they hav e fewer state p arameters; 2. δ should not b e to o low, b ecause then th e co v ariance m atrix of θ t will b e to o large; 3. th e MSS E v ector should b e close to [1 1 1 1] ′ ; 4. th e MAPE v ector should b e as lo w as p ossible. Considering the ab ov e criteria we fa v or the TVV AR(2). Figure 2 sho ws the estimate of the obs erv ation co v ariance matrix Σ. F rom the r igh t graph we observe that the estimate of the corr elations of y 1 t and y j t , giv en θ t are very high (close to 1) and this means that in forecasting; this pro vides useful information ab out the cross-dep end ence of th e four m etal prices ov er time. Observation variance Time Variance 0 50 100 150 200 0 2 4 6 8 10 12 14 Observation correlation Time Correlation 0 50 100 150 200 0.0 0.2 0.4 0.6 0.8 Figure 2: Estimate of the observ ation co v ariance matrix Σ = { σ ij } i,j =1 , 2 , 3 , 4 . The left grap h sho ws the estimates of the v ariances σ ii ; th e solid line sh o ws the estimate of σ 11 , th e d ashed line shows the estimate of σ 22 , the d otted line shows the estimate of σ 33 , the d ashed/dotted line s h o ws th e estimate of σ 44 . Th e right graph s h o ws th e estimate of the correlations of y 1 t | θ t with y j t | θ t ( j = 2 , 3 , 4); the solid line sh ows the estimate of the correlation of y 1 t | θ t with y 2 t | θ t , the d ashed line sh o ws the estimate of the correlation of y 1 t | θ t with y 3 t | θ t and the dotted line sh o ws the estimate of the corr elation of y 1 t | θ t with y 4 t | θ t . As m en tioned b efore tw o comp etitiv e mo dels to our TVV AR m o delling for the LME data are the matrix-v ariate DLMs (MV-DLMs) of Quintana and W est (1987) and the d iscount w eigh ted regression (D WR) of T rian tafyllop oulos (2006). Next we compare the TVV AR(2) mo del discussed ab o ve w ith these tw o mo delling approac hes. W e start by br ieﬂy describin g the MV-DLM and the D WR. 14 T able 6: Mean vect or of squ ared stand ardized one-step forecast errors (MSSE) and mean v ector of absolute p ercentag e one-step forecast errors (MAPE) for the LME data and for three m ultiv ariate mo dels: TVV AR, MV-DLM and D WR. MSSE MAPE TVV AR(2) 1.296 1.743 1.228 1.822 0.065 0.057 0.116 0.095 MV-DLM 1.306 2.436 0.984 1.887 0.019 0.022 0.02 6 0.025 D WR 2.202 1.610 1.590 1.868 0.013 0.015 0.017 0.017 The MV-DLM is deﬁned by y ′ t = F ′ Θ t + ǫ ′ t and Θ t = G Θ t − 1 + ω t , (8) where F is a d × 1 d esign v ector, Θ t is a d × p state m atrix, G is a d × d transition matrix, ǫ t | Σ ∼ N p (0 , Σ) and vec( ω t ) | Σ , Ω ∼ N dp (0 , Σ ⊗ Ω), where v ec ( · ) denotes the column stac king op erator of a low er p ortion of a m atrix and ⊗ denotes the Kroneck er p ro duct of t w o matrices. A p rior inv erted Wishart distrib u tion is assumed for Σ and the resulting p osterior distrib utions as well as fu r ther details on the mo del can b e found in Qu in tana and W est (1987) and W est and Harrison (1997, Chapter 16) (for more references on this mo del, see also th e Int ro d uction). In th e app lication of MV-DLMs it is necessary to sp ecify F and G . F ollo wing Qu in tana and W est (1987), w ho consider international exc hange rates d ata, and by consulting the plots of Figure 1 we prop ose a linear trend mo del f or th e LME data. Thus we can set F =  1 0  and G =  1 1 0 1  . The DWR is deﬁned by y t = y t − 1 + ψ t + ǫ t and ψ t = ψ t − 1 + ζ t , with ǫ t | Σ ∼ N p (0 , Σ) and ζ t ∼ N p (0 , Ω t ). This mo del can b e put into state space form as in y t = [ y t − 1 I p ]  1 ψ t  + ǫ t = F ′ t θ t + ǫ t , θ t =  1 ψ t  =  1 ψ t − 1  +  0 ζ t  = θ t − 1 + ω t . The co v ariance matrix Ω t is m o delled with a d iscoun t factor δ and Σ is estimated f ollo wing T r ian tafyllop oulos and Pik oulas (2002) and T rian tafyllop ou los (2006). T able 6 sh o ws the MSSE and the MAPE of the thr ee mo dels. W e see that all mo dels pro du ce reasonable results. F or the MSSE th e b est mo del is the TVV AR(2) (w ith the excep- tion of the lead v ariable wh er e the MV-DLM pr o duces MSSE closer to 1). F or th e MAPE the b est mo del is the DW R with the T VV AR(2) pro ducing th e highest MAPE. Out of the three m o dels, the MV-DLM is limited b y its mathematical form, w h ic h is constructed to giv e conjugate analysis (see also the Int ro d uction). The DW R suﬀers from similar limitations as the MV-DLM, but it provides go o d results, for linear trend time series without seasonalit y . The TVV AR mo del pro vides a goo d mo delling alternativ e and considering the numerous ap- plications of V AR time series mo dels in econometrics, it is b eliev ed that th e TVV AR h as a great p oten tial. 15 In conclusion, the T VV AR mo del can pro d u ce f orecasts with go o d forecast accuracy , while the correlation of the series can b e estimated on-line with a f ast linear algorithm. A criticism of the mo del is that its eﬃciency d ep ends on its order and if h igh order TVV AR mo dels are required (e.g. as in app ro ximating mo ving a v erage pro cesses with time-dep endent parameters) its eﬃciency will b e similar of that of a v ector MA, since the discount factor will ha v e to b e close to 1. It will b e interesting to kno w h o w the order of the TVV AR m o del is related to the b ou n dness of the eigen v alues of the co v ariance estimator e S t . Concluding Commen ts This p ap er develo ps an algorithm for co v ariance estimation in multiv ariate conditionally Gaus- sian dynamic linear mo dels, assum ing that th e observ ation co v ariance matrix is ﬁxed, but unknown. This is a general estimation pr o cedure, w hic h can b e app lied to an y Gaussian linear s tate space mo d el. The algorithm is empirically foun d to h a v e go o d p erformance p r o- viding a co v ariance estimato r w hic h conv erges to the true v alue of the observ ation co v ariance matrix. The prop osed metho dology compares w ell with a n on-sequen tial state of the art MCMC estimation pro cedure and it is found that the pr op osed estimates are close to th e estimates of the MCMC. The new algo rithm is applied (but not limited to) mo d el su b classes of V AR and V AR with time-dep enden t parameters (TVV AR), which h a v e great app lication in ﬁnancial time series. Considering th e L on d on metal exc hange data, it is found that the TVV AR mo del h as ou tstand ing p erformance as opp osed to the V AR m o del. It is b eliev ed that the dev elopmen t of the T VV AR mo d el is a worth while pro ject and the prop osed fast, on-line algorithm for the estimation of the observ atio n co v ariance matrix, is a step f orward op ening s ev eral paths for p r actical forecasting. The fo cu s in this pap er is on facilitating and adv ancing non-iterativ e cov ariance estimation pro cedur es for vect or time s er ies. Suc h p ro cedures are particularly app ealing, b ecause of their simplicit y and ease in use. F or such wide class of mo dels su c h u s the conditionally Gaussian dynamic linear mo d els, the p rop osed on-line algorithm enables the computation of the mean v ector of standardized errors as we ll as it en ables th e compu tation of the multi-step f orecast co v ariance matrix. Both these computations are v aluable considerations in forecasting and they attract interest b y academics and p ractitioners alik e. Ac kno wledgemen ts I should like to th an k M. Aitkin and A. O’Hagan for helpf u l discus s ions and suggestions on earlier drafts of the pap er. Sp ecial thanks are due to G. Mon tana who help ed on the compu- tational part of the pap er, in particular regarding the MCMC d esign and implement ation. App endix Pro of of Theorem 1 Let v ec h ( · ) denote the column stac king op erator of a lo w er p ortion of a symmetric square matrix and let ⊗ denote the K ronec k er pro du ct of tw o matrices. Firs t we p ro v e th at for large t , it is approximat ely E (Σ − A t e t e ′ t A ′ t | y t ) = E (Σ − A t e t e ′ t A ′ t | y t − 1 ) , (A-1) 16 where A t = n − 1 / 2 t S 1 / 2 t − 1 Q − 1 / 2 t . Cond itional on Σ, we hav e fr om an ap p lication of the Kalman ﬁlter that Co v( e it e j t , e k t e ℓt | Σ , y t − 1 ) is b oun ded, wher e e t = [ e 1 t e 2 t · · · e pt ] ′ . Since Σ is b ound ed, S t is also b ounded (lim t →∞ S t = Σ), and so all the co v ariances of e it e j t and e k t e ℓt unconditional on Σ are also b ound ed. Th is means that all the elemen ts of V ar { ve ch( e t e ′ t ) } are b ounded and so V ar { v ec h ( e t e ′ t ) } is b ounded. No w let X 1 = E (Σ − A t e t e ′ t A ′ t | y t , y t − 1 ) = S t − A t e t e ′ t A t and X 2 = E (Σ − A t e t e ′ t A ′ t | y t − 1 ) = S t − 1 − A t Q t A ′ t . Then, s in ce lim t →∞ S t = Σ, there exists appr opriately a large intege r t ( L ) > 0 su c h that for ev ery t > t ( L ) it is E ( X 1 − X 2 | y t − 1 ) ≈ 0. Also V ar { v ec h( X 1 − X 2 ) | y t − 1 } = V ar { ( A t ⊗ A t ) D p v ec h( e t e ′ t ) | y t − 1 } = 1 n 2 t E t → 0 , with E t = h S 1 / 2 t − 1 Q − 1 / 2 t ⊗ S 1 / 2 t − 1 Q − 1 / 2 t i D p [V ar { vec h ( e t e ′ t ) | y t − 1 } ] D ′ p h Q − 1 / 2 t S 1 / 2 t − 1 ⊗ Q − 1 / 2 t S 1 / 2 t − 1 i , where D p is th e d uplication matrix and fr om th e ﬁrs t part of the pro of w e ha v e th at E t is b ound ed. It f ollo ws th at for any t > t ( L ) it is X 1 ≈ X 2 with probabilit y 1 and so w e ha v e pro ve d equation (A-1). Using E (Σ | y t ) = S t , f rom equation (A-1) w e ha ve E (Σ | y t ) − A t e t e ′ t A ′ t = E (Σ | y t − 1 ) − A t E ( e t e ′ t | y t − 1 ) A ′ t ⇒ S t = S t − 1 + 1 n t S 1 / 2 t − 1 Q − 1 / 2 t ( e t e ′ t − Q t ) Q − 1 / 2 t S 1 / 2 t − 1 ⇒ S t = S t − 1 − 1 n t S t − 1 + 1 n t S 1 / 2 t − 1 Q − 1 / 2 t e t e ′ t Q − 1 / 2 t S 1 / 2 t − 1 ⇒ n t S t = n t − 1 S t − 1 + S 1 / 2 t − 1 Q − 1 / 2 t e t e ′ t Q − 1 / 2 t S 1 / 2 t − 1 = n 0 S 0 + t X i =1 S 1 / 2 i − 1 Q − 1 / 2 i e i e ′ i Q − 1 / 2 i S 1 / 2 i − 1 and by dividing b y n t = n 0 + t = n t − 1 + 1 we obtain equ ation (2) as required. The Gibbs Sampler for Multiv ariate Conditionally Gaussian DLMs The follo wing pro cedure applies to an y conditionally Gaussian dyn amic linear m o del in the f orm of equation (1). F or the sim ulation studies considered in th is pap er, giv en data y N = ( y 1 , y 2 , . . . , y N ), we are in terested in samp lin g a s et of state vec tors, θ 1 , θ 2 . . . , θ N and the obs er v ation co v ariance matrix Σ from th e f ull, m ultiv ariate p osterior distribution of θ 1 , θ 2 , . . . , θ N , Σ | y N . Gibbs sampling in vo lve s iterativ e samp ling from the fu ll conditional p osterior of eac h θ t , | θ − t , Σ , y N , for all t = 1 , 2 , . . . , N , and Σ | θ 1 , θ 2 , . . . , θ N , y N ; in our notation, θ − t means that w e are conditioning up on all the comp onent s θ 1 , θ 2 , . . . , θ N but θ t . Given the cond itionally normal and linear stru cture of th e system, suc h fu ll conditional distribu tions are s tandard, and therefore easily sampled. Ho w ev er, such an implementat ion of the Gibbs sampler, wh ere eac h comp onent is up dated once at a time, could b e v ery ineﬃcien t when applied to the m ultiv ariate DLMs discussed in this p ap er; in fact, the h igh-correlation of the dynamic system 17 will most lik ely br ing con v ergence problems. In order to o v ercome suc h diﬃculties, follo wing the early suggestions of Carter and Koh n (1994) and F r ¨ uh wirth-Schnatter (1994) , we ha ve c hosen to imp lemen t a blo c ke d Gibbs sampler Gamerman (1997, p. 149); within this con text, this sampling scheme is b etter known as the forwar d ﬁltering, b ackwar d sampling algorithm. F ollo wing is a concise d escription of the algorithm us ed in our studies; f or m ore details, the reader should consult the references ab o v e, as well as W est and Harrison (199 7, Ch apter 15). The ﬁr st step of the Gibb s sampler in v olv es sampling f rom the up dating distrib ution of θ N | Σ , y N , whic h is giv en b y the m ultiv ariate normal N d ( m M N , P M N ). This is done in the forward ﬁltering phase of the sampler, as follo ws. Starting at time t = 0 w ith some given initial v alues m M 0 , P M 0 and Σ w e compute the follo wing quan tities at eac h time t , for t = 1 , 2 , . . . , N : (a) the pr ior mean v ector and co v ariance matrix of θ t | Σ , y t − 1 , a t = Gm M t − 1 and R M t = GP M t − 1 G ′ + Ω . (b) the mean v ector and co v ariance matrix of the one-step ahead f orecast of y t | y t − 1 , y M t − 1 (1) = F ′ a t and Q M t = F ′ R M t F + Σ . (c) the p osterior mean v ector and cov ariance matrix of θ t | y t , m M t = a t + A M t e M t and P M t = R M t − A M t Q M t ( A M t ) ′ , where A M t = R M t F ( Q M t ) − 1 is th e Kalman gain and e M t = y t − y M t − 1 (1) is the one-step ahead f orecast err or v ector. An up d ated v ector θ N is thus obtained, and the ﬁltering part of the algorithm is completed. The bac kwa rd s sampling p hase inv olves sampling fr om the distrib u tion of θ t | θ t +1 , Σ , y t at all times t = N − 1 , . . . , 1 , 0. Eac h of su ch v ectors is d ra wn from a multiv ariate normal N d ( h t , H t ), where h t = m M t + P M t G ′ ( R M t +1 ) − 1 ( θ t +1 − a t +1 ) and H t = P M t { I d − G ( R M t +1 ) − 1 GP M t ) , with I d b eing the d × d identit y matrix. At eac h time t , we also compute ǫ ∗ t = y t − F ′ θ t . Once the b ac kw ards sampling phase is completed, we set b Σ = N − 1 N X t =1 ǫ ∗ t ( ǫ ∗ t ) ′ . Finally , with n M 0 b eing th e p rior degrees of fr eedom and S M 0 b eing the prior estimate of Σ, w e sample from the fu ll conditional density of Σ | Θ , y N , which is an inv erted Wishart distribution I W p ( n M 0 + N + 2 p, N b Σ N + n M 0 S M 0 ), whose simulati on is also s tand ard. This concludes an iteration of th e Gibb s samp ler. References [1] An d erson, P .L. and Meersc haert, M.M. (2005) Parameter estimation for p er io dically sta- tionary time series. Journal of Time Series Analysis. 26 , 489-518. 18 [2] Barassi, M.R., Cap orale, G.M. and Hall, S.G. (2005) In terest r ate link ages: a Kalman ﬁlter appr oac h to detecting structur al c hange. Ec onomic Mo del ling 22 , 253-284. [3] Barb osa, E. and Harrison, P .J. (1992) V ariance estimation f or m ultiv ariate dyn amic linear mo dels. Journal of F or e c asting 11 , 621-628. [4] C arter, C .K . and Kohn , R. (1994) On Gibbs sampling for state sp ace m o dels. Biometrika 81 , 541-553. [5] Dahlhaus , R. (1997) Fitting time series m o dels to nonstationary pro cesses. Annals of Statistics 25 , 1-37. [6] Dic key , J.M., Da wid, A.P . and K adane, J.B. (1986 ) Su b jectiv e probab ility assessmen t metho ds of m ultiv ariate - t and matrix- t mo d els. I n Bayesian Infer enc e and De c ision T e ch- niques: Essays in Honor of Bruno de Finetti , P .K. Go el and A. Zellner (Eds.). North- Holland, Amsterd am. [7] Doucet, A., de F r eitas, N. and Gordon, N.J. (2001 ) Se que ntial Monte Carlo Me tho ds in Pr actic e . S pringer-V erlag, New Y ork. [8] Dur bin, J . an d Ko opman, S .J. (2001) Time Series Analysis by State Sp ac e Me tho ds . Ox - ford Un iv ersit y Press, Oxford. [9] F ern´ andez, F.J. and Harve y , A.C. (1990) S eemingly un related time series equations and a test for homogenit y . Journal of Busi ne ss and Ec onomics Statistics 8 , 71-81. [10] F rancq, C. and Gautier, A. (2004) L arge sample prop erties of parameter least squ ares estimates f or time-v aryin g ARMA m o dels. Journal of Time Series Analysis 25 , 765-78 3. [11] F r ¨ u h wirth-S c hnatter, S. (1994) Data augmen tation and d ynamic lin ear mo dels. Journal of Time Series Analysis 15 , 183-202. [12] Gamerman, D. (1997) Markov Chain M onte Carlo. Chapman and Hall, Lond on . [13] Go dolphin, E.J. and T riant afyllop oulos, K . (2006) Decomp osition of time series mo dels in state-space form . Computational Statistics and D ata Analysis (in p ress). [14] Gupta, A.K. and Nagar, D.K. (1999) Matrix V ariate Di stribu tions. Ch apman and Hall, New Y ork. [15] Gra v elle, T . and Morley , J.C. (2005) A Kalman ﬁlter approac h to characte rizing the Canadian term structure of in terest rates. Applie d Financial Ec onomics 15 , 691-705 . [16] Harv ey , A.C. (1986) Analysis and generalisation of a multiv ariate exp onen tial smo othing mo del. Management Sci e nc e 32 , 374-38 0. [17] Harv ey , A.C. (1989) F or e c asting Structur al Time Series Mo dels and the K alman Filter . Cam bridge Univ ersit y Press, Cambridge. [18] Harv ey , A.C. and Ko opman , S .J. (1997) Multiv ariate str uctural time series mo dels (with commen ts). In C. Heij, J.M. S h umacher, B. Hanzon and C. Praagman (eds.) System Dy- namics in Ec onomics and Financial Mo dels. Wiley , Chic hester, 269-298. 19 [19] Harv ey , A.C., Ruiz E. and Sh ep hard, N. (1994) Multiv ariate sto c hastic v ariance m o dels. R eview Ec onomic Studies 61 , 247-264 . [20] Huerta, G. and Pr ado, R. (2006) Stru ctured p riors for multiv ariate time series. Journal of Statistic al P lanning and Infer enc e (in press). [21] Johansen, S. (1995) Likeliho o d-Base d Infer enc e in Cointe gr ate d V e ctor Autor e gr essive Mo dels. Oxford Univ ersity Press, Oxford. [22] Kadiy ala, K.R. and Karlsson, S. (1993) F orecasti ng w ith generalize d Ba y esian v ector autoregressions. Journal of F or e c asting 12 , 365-378. [23] Kadiy ala, K.R. and Karlsson, S. (199 7) Numerical metho ds for estimation and inference in Ba ye sian V AR-mo dels. Journal of Applie d E c onometrics 12 , 99-132. [24] Kitaga wa, G. an d Gersc h, W. (1996) Smo othness Priors Analys is of Time Serie s . Springer-V erlag, New Y ork. [25] Litterman, R.B. (1986) F orecasting w ith Bay esian v ector autoregressions - ﬁve y ears of exp erience. Journal of Business and Ec onomic Statistics 4 , 25-38. [26] L ¨ utk ep oh l, H. (1993) Intr o duction to Mu ltiple Time Series Analysis . Springer-V erlag, Berlin. [27] Moauro, F. and Savio, G. (2005) T emp oral disaggregation using multiv ariate structur al time series mo d els. Ec onometrics Journal 8 , 214-23 4. [28] Ni, S. and S un, D. (2003 ) Noninformative p riors and frequentist risks of Ba y esian esti- mators of v ector-autoregressiv e mo d els. Journal of Ec onometrics 115 , 159-197. [29] Ooms, M. (1994) Empiric al V e ctor Autor e gr essive Mo del ling . Sp ringer-V erlag, New Y ork. [30] P ollo c k, D.S.G. (2003) Recursive estimation in econometrics. Computational Statistics and D ata Analysis 44 , 37-75. [31] Queen, C.M. and Smith, J.Q. (1992) Dyn amic graphical mo dels. In Bayesian Statistics 4 , J.M. Bernardo, J.O. Berger, A.P . Da wid, and A.F.M. Smith (Eds). Oxford Univ ersit y Press, Oxf ord, 741-751. [32] Quinta na, J.M. and W est, M. (1987) An analysis of internatio nal exc hange rates u s ing m ultiv ariate DLMs. The Statistician 36 , 275-281. [33] Quinta na, J.M. and W est, M. (1988) Time series analysis of comp ositional data. In Bayesian Statistics 3 , J.M. Bernardo, M.H. DeGro ot, D.V. Lindley , and A.F.M. Smith (Eds). Oxford Universit y Press, O x f ord, 747-756 . [34] Salv ador, M. and Gargallo, P . (2004) . Au tomatic monitoring and interv en tion in multi - v ariate dyn amic linear mo dels. Computationa l Statistics and D ata Analysis 47 , 401-43 1. [35] Salv ador, M., Gallizo, J.L. and Gargallo, P . (200 3). A dynamic pr in cipal comp onents analysis based on m ultiv ariate matrix normal dynamic linear mo dels. Journal of F or e c asting 22 , 457-478. 20 [36] Salv ador, M., Gallizo, J.L. an d Gargallo, P . (2004) Ba y esian inference in a matrix n ormal dynamic linear mo d el with unknown co v ariance m atrices. Statistics 38 , 307-33 5. [37] Shum w a y , R.H. and Stoﬀer, D.S. (2000) Time Serie s Analys is and i ts Applic ations. Springer-V erlag, New Y ork. [38] Sun, D. and Ni, S. (2004) Ba y esian analysis of v ector autoregressiv e mo dels with non- informativ e pr iors. J ournal of Statistic al Planning and Infer enc e 121 , 291-30 9. [39] T r ian tafyllop oulos, K. (2006) Multiv ariate discount we ight ed regression and lo cal lev el mo dels. Computational Statistics and Data Analysis (in p ress). [40] T r ian tafyllop oulos, K. and P ikoulas, J. (2002) Multiv ariate regression applied to the problem of net work securit y . Journal of F or e c asting 21 , 579-594 . [41] W atkins C. and McAleer M. (2004). Econometric m o delling of non-ferr ous metal p r ices. Journal of Ec onomic Surveys 18 , 651-70 1. [42] W est, M. and Harrison, P .J. (1997) Bayesian F or e c asting and Dynamic Mo dels . 2nd edition. S p ringer V erlag, New Y ork. [43] W est, M., Prado, R. and Krystal, A.D. (1999) Ev aluatio n and comparison of EEG traces: laten t stru cture in n on s tationary time series. Journal of the Americ an Statistic al Asso ci a- tion 94 , 375-387 . [44] Whittle, P . (1984) Pr e diction and R e gu lation: by line ar le ast-squ ar e metho ds . 2nd edition. Blac k well, Oxf ord. 21

Covariance estimation for multivariate conditionally Gaussian dynamic linear models

Original Paper

Comments & Academic Discussion

Leave a Comment

Original Paper

Related Papers

Comments & Academic Discussion

Leave a Comment