Panel Cointegration with Global Stochastic Trends

Jushan Bai*    Chihwa Kao†    Serena Ng‡

This version: October 29, 2018

Abstract

This paper studies estimation of panel cointegration models with cross-sectional dependence generated by unobserved global stochastic trends. The standard least squares estimator is, in general, inconsistent owing to the spuriousness induced by the unobservable I(1) trends. We propose two iterative procedures that jointly estimate the slope parameters and the stochastic trends. The resulting estimators are referred to respectively as the CupBC (continuously-updated and bias-corrected) and the CupFM (continuously-updated and fully-modified) estimators. We establish their consistency and derive their limiting distributions. Both are asymptotically unbiased and asymptotically normal and permit inference to be conducted using standard test statistics. The estimators are also valid when there are mixed stationary and non-stationary factors, as well as when the factors are all stationary.

JEL Classification: C13; C33

Keywords: Panel data; Common shocks; Co-movements; Cross-sectional dependence; Factor analysis; Fully modified estimator.

* Department of Economics, New York University, New York, NY 10003, USA, and School of Economics and Management, Tsinghua University. Email: Jushan.Bai@nyu.edu.
† Center for Policy Research and Department of Economics, Syracuse University, Syracuse, NY 13244-1020, USA. Email: cdkao@maxwell.syr.edu.
‡ Department of Economics, Columbia University, 440 W. 118 St., New York, NY 10027, USA. Email: serena.Ng@columbia.edu.

We thank Yixiao Sun, Joon Park, Yoosoon Chang, and Joakim Westerlund for helpful discussions.
We would also like to thank seminar and conference participants at NYU, Texas A&M, CESG 2006, Academia Sinica, National Taiwan University, Greater New York Area Econometrics Colloquium at Yale, and CIREQ Time Series Conference in Montreal for helpful comments. Bai and Ng gratefully acknowledge financial support from the NSF (grants SES-0551275 and SES-0549978).

1 Introduction

This paper is concerned with estimating panel cointegration models using a large panel of data. Our focus is on estimating the slope parameters of the non-stationary regressors when the cross sections share common sources of non-stationary variation in the form of global stochastic trends. The standard least squares estimator is either inconsistent or has a slower convergence rate. We provide a framework for estimation and inference. We propose two iterative procedures that estimate the latent common trends (hereafter factors) and the slope parameters jointly. The estimators are $\sqrt{n}T$ consistent and asymptotically mixed normal. As such, inference can be made using standard $t$ and Wald tests. The estimators are also valid when some or all of the common factors are stationary, and when some of the observed regressors are stationary.

Panel data have long been used to study and test economic hypotheses. Panel data bring in information from two dimensions to permit analysis that would otherwise be inefficient, if not impossible, with time series or cross-sectional data alone. A new development in recent years is the use of 'large dimensional panels', meaning that the sample sizes in the time series ($T$) and the cross-section ($n$) dimensions are both large. This is in contrast to traditional panels in which we have data of many units over a short time span, or of a few variables over a long horizon.
Many researchers have come up with new ideas to exploit the rich information in large panels.[1] However, large panels also raise econometric issues of their own. In this analysis, we tackle two of these issues: the data $(y_{it}, x_{it})$ are non-stationary, and the structural errors $e_{it} = y_{it} - x_{it}'\beta$ are neither iid across $i$ nor over $t$. Instead, they are cross-sectionally dependent, strongly persistent, and possibly non-stationary. In addition, $e_{it}$ are also correlated with the explanatory variables $x_{it}$. These problems are dealt with by putting a factor structure on $e_{it}$ and modelling the factor process explicitly.

[1] See, for example, Baltagi (2005), Hsiao (2003), Pesaran and Smith (1995), Kao (1999), and Moon and Phillips (2000, 2004) in the context of testing the unit root hypothesis using panel data. Stock and Watson (2002) suggest diffusion-index forecasting, while Bernanke and Boivin (2003) suggest new formulations of vector autoregressions to exploit the information in large panels.

The presence of common sources of non-stationarity leads naturally to the concept of cointegration. In a small panel made up of individually I(1) (or unit root) processes $y_t$ and $x_t$, where small means that the dimension of $y_t$ plus the dimension of $x_t$ is treated as fixed in asymptotic analysis, cointegration as defined in Engle and Granger (1986) means that there exists a cointegrating vector, $(1, -\beta')$, such that the linear combinations $y_t - x_t'\beta$ are stationary, or are I(0) processes. In a panel data model specified by $y_{it} = x_{it}'\beta + e_{it}$, where $y_{it}$ and $x_{it}$ are I(1) processes and $e_{it}$ are iid across $i$, cointegration is said to hold if $e_{it}$ are 'jointly' I(0), or in other words, $(1, -\beta')$ is the common cointegrating vector between $y_{it}$ and $x_{it}$ for all $n$ units. A large literature on panel cointegration already exists[2] for modelling panel cointegration when $e_{it}$ is cross-sectionally independent.

A serious drawback of panel cointegration models with cross-section independence is that there is no role for common shocks, which, in theory, should be the underlying source of comovement in the cross-section units. Failure to account for common shocks can potentially invalidate estimation and inference of $\beta$.[3] In view of this, more recent work has allowed for cross-sectional dependence of $e_{it}$ when testing for the null hypothesis of panel cointegration.[4] There is also a growing literature on panel unit root tests with cross-sectional dependence.[5]

In this paper, we consider estimation and inference of parameters in a panel model with cross-sectional dependence in the form of common stochastic trends. The framework we adopt is that $e_{it}$ has a common component and a stationary idiosyncratic component. That is,

$$e_{it} = \lambda_i' F_t + u_{it},$$

so that panel cointegration holds when $u_{it} = y_{it} - x_{it}'\beta - \lambda_i' F_t$ is jointly stationary. A regression of $y_{it}$ on $x_{it}$ will give a consistent estimator for $\beta$ when $F_t$ is I(0). We focus on estimation and inference about $\beta$ when $F_t$ is non-stationary.

Empirical studies suggest the relevance of such a setup. Holly, Pesaran and Yamagata (2006) analyzed the relationship between real housing prices and real income at the state level, allowing for unobservable common factors. They found evidence of cointegration after controlling for common factors and additional spatial correlations. Some economic models lead naturally to this setup. Consider a panel of industry data on output and factor inputs such as capital and labor. The neoclassical production function suggests that log output $y_{it}$ is linear in log factor inputs $x_{it}$ and log productivity $e_{it}$.
Decomposing the latent $e_{it}$ into the industry-wide component $F_t$ and an industry-specific component $u_{it}$, and assuming that $F_t$ is the source of non-stationarity, leads to a model with latent common trends. In such a case, a regression of $y_{it}$ on $x_{it}$ is spurious since $e_{it}$ is not only cross-sectionally correlated, but also non-stationary.

[2] See, for example, Phillips and Moon (1999) and Kao (1999). Recent surveys can be found in Baltagi and Kao (2000) and Breitung and Pesaran (2005).
[3] Andrews (2005) showed that cross-section dependence induced by common shocks can yield inconsistent estimates. Andrews' argument is made in the context of a single cross section and for stationary regressors and errors. For a single cross section, not much can be done about common shocks. But for panel data, we can explore the common shocks to yield consistent procedures.
[4] See, for example, Phillips and Sul (2003), Gengenbach et al. (2005b), and Westerlund (2006).
[5] For example, Chang (2002, 2004), Choi (2006), Moon and Perron (2004), Breitung and Das (2005), Gengenbach et al. (2005a), and Westerlund and Edgerton (2006). Breitung and Pesaran (2005) provide additional references in their survey.

We deal with the problem by treating the common I(1) variables as parameters. These are estimated jointly with $\beta$ using an iterated procedure. The procedure is shown to yield a consistent estimator of $\beta$, but the estimator is asymptotically biased. We then construct two estimators to account for the bias arising from endogeneity and serial correlation so as to re-center the limiting distribution around zero. The first, denoted CupBC, estimates the asymptotic bias directly. The second, denoted CupFM, modifies the data so that the limiting distribution does not depend on nuisance parameters. Both are 'continuously updated' (Cup) procedures and require iteration until convergence.
The estimators are $\sqrt{n}T$ consistent for the common slope coefficient vector, $\beta$. The estimators enable use of standard test statistics such as $t$, $F$, and $\chi^2$ for inference. The estimators are robust to mixed I(1)/I(0) factors, as well as mixed I(1)/I(0) regressors. Thus, our approach is an alternative to the solution proposed in Bai and Kao (2006) for stationary factors. As we argue below, the Cup estimators have some advantages that make an analysis of their properties interesting in its own right.

The rest of the paper is organized as follows. Section 2 describes the basic model of panel cointegration with unobservable common stochastic trends. Section 3 develops the asymptotic theory for the continuously-updated and fully-modified estimators. Section 4 examines issues related to incidental trends, mixed I(0)/I(1) regressors and mixed I(0)/I(1) common shocks, and issues of testing cross-sectional independence. Section 5 presents Monte Carlo results to illustrate the finite sample properties of the proposed estimators. Section 6 provides a brief conclusion. The appendix contains the technical materials.

2 The Model

Consider the model

$$y_{it} = x_{it}'\beta + e_{it}$$

where, for $i = 1, \ldots, n$ and $t = 1, \ldots, T$, $y_{it}$ is a scalar,

$$x_{it} = x_{it-1} + \varepsilon_{it} \tag{1}$$

is a set of $k$ non-stationary regressors, $\beta$ is a $k \times 1$ vector of the common slope parameters, and $e_{it}$ is the regression error. Suppose $e_{it}$ is stationary and iid across $i$. Then it is easy to show that the pooled least squares estimator of $\beta$ defined by

$$\hat\beta_{LS} = \left(\sum_{i=1}^{n}\sum_{t=1}^{T} x_{it}x_{it}'\right)^{-1}\sum_{i=1}^{n}\sum_{t=1}^{T} x_{it}y_{it} \tag{2}$$

is, in general, $T$ consistent.[6] Similar to the case of time series regression considered by Phillips and Hansen (1990), the limiting distribution is shifted away from zero due to an asymptotic bias induced by the long-run correlation between $e_{it}$ and $\varepsilon_{it}$.
The exception is when $x_{it}$ is strictly exogenous, in which case the estimator is $\sqrt{n}T$ consistent. The asymptotic bias can be estimated, and a panel fully-modified estimator can be developed along the lines of Phillips and Hansen (1990) to achieve $\sqrt{n}T$ consistency and asymptotic normality.

[6] The estimator can be regarded as $\sqrt{n}T$ consistent but with a bias of order $O(\sqrt{n})$. Up to the bias, the estimator is also asymptotically mixed normal.

The cross-section independence assumption is restrictive and difficult to justify when the data under investigation are economic time series. In view of comovements of economic variables and common shocks, we model the cross-section dependence by imposing a factor structure on $e_{it}$. That is,

$$e_{it} = \lambda_i' F_t + u_{it}$$

where $F_t$ is an $r \times 1$ vector of latent common factors, $\lambda_i$ is an $r \times 1$ vector of factor loadings and $u_{it}$ is the idiosyncratic error. If $F_t$ and $u_{it}$ are both stationary, then $e_{it}$ is also stationary. In this case, a consistent estimator of the regression coefficients can still be obtained even when the cross-section dependence is ignored, just as simultaneity bias is of second order in the fixed-$n$ cointegration framework. Using this property, Bai and Kao (2006) considered a two-step fully modified estimator (2sFM). In the first step, pooled OLS is used to obtain a consistent estimate of $\beta$. The residuals are then used to construct a fully-modified (FM) estimator along the lines of Phillips and Hansen (1990). Essentially, nuisance parameters induced by cross-section correlation are dealt with just like serial correlation, by suitable estimation of the long-run covariance matrices.

The 2sFM treats the I(0) common shocks as part of the error processes. However, an alternative estimator can be developed by rewriting the regression model as

$$y_{it} = x_{it}'\beta + \lambda_i' F_t + u_{it}. \tag{3}$$

Moving $F_t$ from the error term to the regression function (treated as parameters) is desirable for the following reason. If some components of $x_{it}$ are actually I(0), treating $F_t$ as part of the error process will yield an inconsistent estimate for $\beta$ when $F_t$ and $x_{it}$ are correlated. The simultaneity bias is now of the same order as the convergence rate of the coefficient estimates on the I(0) regressors. Estimating $\beta$ from (3) with $F$ being I(0) was suggested in Bai and Kao (2006), but its theory was not explored.

When $F_t$ is I(1), which is the primary focus of this paper, there is an important difference between estimating $\beta$ from (3) versus pooled OLS in (2) because the latter is no longer valid. More precisely, if

$$F_t = F_{t-1} + \eta_t$$

then $e_{it}$ is I(1) and pooled OLS in (2) is, in general, not consistent. To see this, consider the following data generating process for $x_{it}$:

$$x_{it} = \tau_i' F_t + \xi_{it} \tag{4}$$

with $\xi_{it}$ being I(1) such that $\xi_{it} = \xi_{it-1} + \zeta_{it}$. For simplicity, assume there is a single factor. It follows that $x_{it}$ is I(1) and can be written as (1) with $\varepsilon_{it} = \tau_i'\eta_t + \zeta_{it}$. The pooled OLS estimator can be written as

$$\hat\beta_{LS} - \beta = \frac{\left(\frac{1}{n}\sum_{i=1}^{n}\tau_i\lambda_i\right)\left(\frac{1}{T^2}\sum_{t=1}^{T}F_t^2\right)}{\frac{1}{nT^2}\sum_{i=1}^{n}\sum_{t=1}^{T}x_{it}^2} + O_p(n^{-1/2}) + O_p(T^{-1}).$$

If $\tau_i$ and $\lambda_i$ are correlated, or when they have non-zero means, the first term on the right hand side is $O_p(1)$, implying inconsistency of pooled OLS. The best convergence rate is $\sqrt{n}$, attained when $x_{it}$ and $F_t$ are independent random walks. The problem arises because, as seen from (3), we now have a panel model with non-stationary regressors $x_{it}$ and $F_t$, and in which $u_{it}$ is stationary by assumption. This means that $y_{it}$ cointegrates with $x_{it}$ and $F_t$ with cointegrating vector $(1, -\beta', -\lambda_i')$. Omitting $F_t$ creates a spurious regression problem.
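The inconsistency of pooled OLS under a common I(1) factor can be illustrated numerically. The following Python sketch simulates the single-factor data generating process in (3) and (4) and compares the average value of $\hat\beta_{LS} - \beta$ with and without the factor in the error. The function names, sample sizes, unit innovation variances and the loading distributions (mean 1, hence non-zero) are assumptions made for illustration only; they are not taken from the paper.

```python
import numpy as np

def pooled_ols_bias(n, T, beta, factor_in_error, rng):
    """Simulate one panel from (3)-(4) with a single I(1) factor.

    Returns beta_hat_LS - beta. All distributional choices are illustrative.
    """
    F = np.cumsum(rng.standard_normal(T))                # F_t = F_{t-1} + eta_t
    tau = rng.normal(1.0, 0.5, size=n)                   # loadings of x_it on F_t
    lam = rng.normal(1.0, 0.5, size=n) if factor_in_error else np.zeros(n)
    xi = np.cumsum(rng.standard_normal((T, n)), axis=0)  # xi_it = xi_{it-1} + zeta_it
    u = rng.standard_normal((T, n))                      # stationary idiosyncratic error
    x = F[:, None] * tau[None, :] + xi                   # equation (4)
    y = beta * x + F[:, None] * lam[None, :] + u         # equation (3)
    beta_hat = np.sum(x * y) / np.sum(x * x)             # pooled OLS (2) with k = 1
    return beta_hat - beta

def mean_bias(n, T, factor_in_error, reps=20, seed=0):
    """Average of beta_hat_LS - beta over independent replications."""
    rng = np.random.default_rng(seed)
    draws = [pooled_ols_bias(n, T, 1.0, factor_in_error, rng) for _ in range(reps)]
    return float(np.mean(draws))

for T in (50, 200):
    # With the factor in the error the average deviation should not shrink with T;
    # without it, pooled OLS should be close to the true value.
    print(T, mean_bias(50, T, True), mean_bias(50, T, False))
```

Under these assumptions the deviation with the factor present stays away from zero as $T$ grows, which is the spurious regression effect described above.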
It is worth noting that the cointegrating vector varies with $i$ because the factor loading is unit specific. Estimation of the parameter of interest $\beta$ involves a new methodology because $F$ is unobservable.

In the rest of the paper, we will show how to obtain $\sqrt{n}T$ consistent and asymptotically normal estimates of $\beta$ when the data generating process is characterized by (3), assuming that $x_{it}$ and $F_t$ are both I(1), and that $x_{it}$, $F_t$ and $u_{it}$ are potentially correlated. We will refer to $F_t$ as the global stochastic trends since they are shared by each cross-sectional unit.

Hereafter, we write the integral $\int_0^1 W(s)\,ds$ as $\int W$ when there is no ambiguity. We define $\Omega^{1/2}$ to be any matrix such that $\Omega = (\Omega^{1/2})(\Omega^{1/2})'$, and $BM(\Omega)$ to denote Brownian motion with covariance matrix $\Omega$. We use $\|A\|$ to denote $(\mathrm{tr}(A'A))^{1/2}$, $\xrightarrow{d}$ to denote convergence in distribution, $\xrightarrow{p}$ to denote convergence in probability, and $[x]$ to denote the largest integer less than or equal to $x$. We let $M < \infty$ be a generic positive number, not depending on $T$ or $n$. We also define the matrix that projects onto the orthogonal space of $z$ as $M_z = I_T - z(z'z)^{-1}z'$. We will use $\beta^0$, $F_t^0$, and $\lambda_i^0$ to denote the true common slope parameters, true common trends, and the true factor loading coefficients. Denote $(n, T) \to \infty$ as the joint limit. Denote $(n, T)_{\mathrm{seq}} \to \infty$ as the sequential limit, i.e., $T \to \infty$ first and $n \to \infty$ later. We use $MN(0, V)$ to denote a mixed normal distribution with variance $V$.

Our analysis is based on the following assumptions.

Assumption 1 Factors and Loadings:
(a) $E\|\lambda_i^0\|^4 \le M$. As $n \to \infty$, $\frac{1}{n}\sum_{i=1}^{n}\lambda_i^0\lambda_i^{0\prime} \xrightarrow{p} \Sigma_\lambda$, an $r \times r$ diagonal matrix.
(b) $E\|\eta_t\|^{4+\delta} \le M$ for some $\delta > 0$ and for all $t$.
As $T \to \infty$, $\frac{1}{T^2}\sum_{t=1}^{T}F_t^0F_t^{0\prime} \xrightarrow{d} \int B_\eta B_\eta'$, an $r \times r$ random matrix, where $B_\eta$ is a vector of Brownian motions with covariance matrix $\Omega_\eta$, which is a positive definite matrix.

Assumption 2 Let $w_{it} = (u_{it}, \varepsilon_{it}', \eta_t')'$. For each $i$, $w_{it} = \Pi_i(L)v_{it} = \sum_{j=0}^{\infty}\Pi_{ij}v_{it-j}$, where $v_{it}$ is i.i.d. over $t$, $\sum_{j=0}^{\infty} j^a\|\Pi_{ij}\| \le M$ for some $a > 1$, and $|\Pi_i(1)| > c > 0$ for all $i$. In addition, $Ev_{it} = 0$, $E(v_{it}v_{it}') = \Sigma_v > 0$, and $E\|v_{it}\|^8 \le M < \infty$.

Assumption 3 Weak cross-sectional correlation and heteroskedasticity:
(a) $E(u_{it}u_{js}) = \sigma_{ij,ts}$, $|\sigma_{ij,ts}| \le \bar\sigma_{ij}$ for all $(t, s)$ and $|\sigma_{ij,ts}| \le \tau_{ts}$ for all $(i, j)$ such that (i) $\frac{1}{n}\sum_{i,j=1}^{n}\bar\sigma_{ij} \le M$, (ii) $\frac{1}{T}\sum_{t,s=1}^{T}\tau_{ts} \le M$, and (iii) $\frac{1}{nT}\sum_{i,j,t,s}|\sigma_{ij,ts}| \le M$.
(b) For every $(t, s)$, $E\left|\frac{1}{\sqrt{n}}\sum_{i=1}^{n}[u_{is}u_{it} - E(u_{is}u_{it})]\right|^4 \le M$.
(c) $\frac{1}{nT^2}\sum_{t,s,u,v}\sum_{i,j}|\mathrm{cov}(u_{it}u_{is}, u_{ju}u_{jv})| \le M$ and $\frac{1}{nT^2}\sum_{t,s}\sum_{i,j,k,l}|\mathrm{cov}(u_{it}u_{js}, u_{ku}u_{ls})| \le M$.

Assumption 4 $\{x_{it}, F_t^0\}$ are not cointegrated.

Assumption 1 is standard in the panel factor literature. Assumption 3 allows for limited time series and cross-sectional dependence in the error term, $u_{it}$. Heteroskedasticity in both the time series and cross-sectional dimensions for $u_{it}$ is allowed as well. The assumption that $\Omega_\eta$ is positive definite rules out cointegration among the components of $F_t^0$. Assumption 4 also rules out cointegration between $x_{it}$ and $F_t^0$.

Assumption 2 implies that a multivariate invariance principle for $w_{it}$ holds, i.e., the partial sum process $\frac{1}{\sqrt{T}}\sum_{t=1}^{[T\cdot]}w_{it}$ satisfies

$$\frac{1}{\sqrt{T}}\sum_{t=1}^{[T\cdot]}w_{it} \xrightarrow{d} B_i(\cdot) = BM(\Omega_i) \quad \text{as } T \to \infty \text{ for all } i,$$

where $B_i = \left(B_{ui}, B_{\varepsilon i}', B_\eta'\right)'$.
The long-run covariance matrix of $\{w_{it}\}$ is given by

$$\Omega_i = \sum_{j=-\infty}^{\infty}E\left(w_{i0}w_{ij}'\right) = \begin{pmatrix}\Omega_{ui} & \Omega_{u\varepsilon i} & \Omega_{u\eta i}\\ \Omega_{\varepsilon ui} & \Omega_{\varepsilon i} & \Omega_{\varepsilon\eta i}\\ \Omega_{\eta ui} & \Omega_{\eta\varepsilon i} & \Omega_\eta\end{pmatrix} \tag{5}$$

and is partitioned conformably with $w_{it}$. Define the one-sided long-run covariance

$$\Delta_i = \sum_{j=0}^{\infty}E\left(w_{i0}w_{ij}'\right) = \begin{pmatrix}\Delta_{ui} & \Delta_{u\varepsilon i} & \Delta_{u\eta i}\\ \Delta_{\varepsilon ui} & \Delta_{\varepsilon i} & \Delta_{\varepsilon\eta i}\\ \Delta_{\eta ui} & \Delta_{\eta\varepsilon i} & \Delta_\eta\end{pmatrix}. \tag{6}$$

For future reference, it will be convenient to group the elements corresponding to $\varepsilon_{it}$ and $\eta_t$ together. Let

$$B_{bi} = \left(B_{\varepsilon i}', B_\eta'\right)', \qquad \Omega_{bi} = \begin{pmatrix}\Omega_{\varepsilon i} & \Omega_{\varepsilon\eta i}\\ \Omega_{\eta\varepsilon i} & \Omega_\eta\end{pmatrix}.$$

Then $B_i$ can be rewritten as

$$B_i = \begin{pmatrix}B_{ui}\\ B_{bi}\end{pmatrix} = \begin{pmatrix}\Omega_{u.bi}^{1/2} & \Omega_{ubi}\Omega_{bi}^{-1/2}\\ 0 & \Omega_{bi}^{1/2}\end{pmatrix}\begin{pmatrix}V_i\\ W_i\end{pmatrix}$$

where $\left(V_i, W_i'\right)' = BM(I)$ is a standardized Brownian motion and

$$\Omega_{u.bi} = \Omega_{ui} - \Omega_{ubi}\Omega_{bi}^{-1}\Omega_{bui}$$

is the long-run conditional variance of $u_{it}$ given $(\Delta x_{it}', \Delta F_t^{0\prime})'$. Note that $\Omega_{bi} > 0$ since we assume in Assumption 4 that there is no cointegration relationship in $(x_{it}', F_t^{0\prime})'$.

Finally, we state an additional assumption, which is needed when deriving the limiting distribution but is not needed for consistency of the proposed estimators.

Assumption 5 The idiosyncratic errors $u_{it}$ are cross-sectionally independent.

3 Estimation

In this section, we first consider the problem of estimating $\beta$ when $F$ is observed. We then consider two iterative procedures that jointly estimate $\beta$ and $F$. The procedures yield two estimators that are $\sqrt{n}T$ consistent and asymptotically normal. These estimators, denoted CupBC and CupFM, are presented in subsections 3.2 and 3.3.

3.1 Estimation when F is observed

The true model (3), in vector form, is

$$y_i = x_i\beta^0 + F^0\lambda_i^0 + u_i$$

where

$$y_i = \begin{pmatrix}y_{i1}\\ y_{i2}\\ \vdots\\ y_{iT}\end{pmatrix}, \quad x_i = \begin{pmatrix}x_{i1}'\\ x_{i2}'\\ \vdots\\ x_{iT}'\end{pmatrix}, \quad F = \begin{pmatrix}F_1'\\ F_2'\\ \vdots\\ F_T'\end{pmatrix}, \quad u_i = \begin{pmatrix}u_{i1}\\ u_{i2}\\ \vdots\\ u_{iT}\end{pmatrix}.$$

Define $\Lambda = (\lambda_1, \ldots, \lambda_n)'$ to be an $n \times r$ matrix. In matrix notation,

$$y = X\beta^0 + F^0\Lambda^{0\prime} + u.$$
Given data $y$, $x$, and $F^0$, the least squares objective function is

$$S_{nT}^0(\beta, \Lambda) = \sum_{i=1}^{n}\left(y_i - x_i\beta - F^0\lambda_i\right)'\left(y_i - x_i\beta - F^0\lambda_i\right).$$

After concentrating out $\lambda$, the least squares estimator for $\beta$ is then

$$\tilde\beta_{LS} = \left(\sum_{i=1}^{n}x_i'M_{F^0}x_i\right)^{-1}\sum_{i=1}^{n}x_i'M_{F^0}y_i.$$

The least squares estimator has the following properties.[7]

Proposition 1 Under Assumptions 1-5, as $(n, T)_{\mathrm{seq}} \to \infty$,

$$\sqrt{n}T\left(\tilde\beta_{LS} - \beta^0\right) - \sqrt{n}\phi_{nT}^0 \xrightarrow{d} MN\left(0, \Sigma^0\right)$$

where

$$\phi_{nT}^0 = \left[\frac{1}{nT^2}\sum_{i=1}^{n}x_i'M_{F^0}x_i\right]^{-1}\left[\frac{1}{n}\sum_{i=1}^{n}\theta_i^0\right] \tag{7}$$

$$\Sigma^0 = D^{-1}\left[\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\Omega_{u.bi}E\left(\int Q_iQ_i' \,\Big|\, C\right)\right]D^{-1}, \tag{8}$$

and, with $C$ being the $\sigma$-field generated by $\{F_t\}$,

$$D = \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}E\left(\int Q_iQ_i' \,\Big|\, C\right),$$

$$Q_i = B_{\varepsilon i} - \left(\int B_{\varepsilon i}B_\eta'\right)\left(\int B_\eta B_\eta'\right)^{-1}B_\eta,$$

$$\theta_i^0 = \frac{1}{T}x_i'M_{F^0}\Delta b_i\,\Omega_{bi}^{-1}\Omega_{bui} + \left(\Delta_{\varepsilon ui}^+ - \delta_i^{0\prime}\Delta_{\eta u}^+\right),$$

$$\delta_i^0 = (F^{0\prime}F^0)^{-1}F^{0\prime}x_i, \qquad \Delta b_i = \left(\Delta x_i \;\; \Delta F^0\right),$$

$$\Delta_{bui}^+ = \begin{pmatrix}\Delta_{\varepsilon ui}^+\\ \Delta_{\eta u}^+\end{pmatrix} = \left(\Delta_{bui} \;\; \Delta_{bi}\right)\begin{pmatrix}I_k\\ -\Omega_{bi}^{-1}\Omega_{bui}\end{pmatrix} = \Delta_{bui} - \Delta_{bi}\Omega_{bi}^{-1}\Omega_{bui}.$$

The estimator is $\sqrt{n}T$ consistent if $\phi_{nT}^0 = 0$, which occurs when $x_{it}$ is strictly exogenous. Otherwise, the estimator is $T$ consistent as there is an asymptotic bias given by the term $\sqrt{n}\phi_{nT}^0$. This is an average of individual biases that are data specific, as seen from the definition of $\theta_i^0$. The individual biases arise from the contemporaneous and low frequency correlations between the regression error and the innovations of the I(1) regressors, as given by terms such as $\Omega_{bui}$ and $\Delta_{bui}$.

[7] The limiting distribution for $F$ being I(0) can also be obtained. Park and Phillips (1988) provide the limiting theory with mixed I(1) and I(0) regressors in a single equation framework.

To estimate the bias, we need to consistently estimate the nuisance parameters. We use a kernel estimator.
Let

$$\hat\Omega_i = \sum_{j=-T+1}^{T-1}\omega\left(\frac{j}{K}\right)\hat\Gamma_i(j), \qquad \hat\Delta_i = \sum_{j=0}^{T-1}\omega\left(\frac{j}{K}\right)\hat\Gamma_i(j),$$

$$\hat\Gamma_i(j) = \frac{1}{T}\sum_{t=1}^{T-j}\hat w_{it+j}\hat w_{it}',$$

where $\hat w_{it} = (\hat u_{it}, \Delta x_{it}', \Delta F_t^{0\prime})'$.

To state the asymptotic theory for the bias-corrected estimator, we need the following assumption, as used in Moon and Perron (2004):

Assumption 6
(a) $\liminf_{n,T\to\infty}(\log T/\log n) > 1$.
(b) The kernel function $\omega(\cdot): \mathbb{R} \to [-1, 1]$ satisfies (i) $\omega(0) = 1$, $\omega(x) = \omega(-x)$, (ii) $\int_{-1}^{1}\omega(x)^2\,dx < \infty$ and with Parzen's exponent $q \in (0, \infty)$ such that $\lim_{x\to 0}\frac{1 - \omega(x)}{|x|^q} < \infty$.
(c) The bandwidth parameter $K$ satisfies $K \sim n^b$ and $\frac{1}{2q} < b < \liminf\frac{\log T}{\log n} - 1$.

Let

$$\hat\phi_{nT}^0 = \left[\frac{1}{nT^2}\sum_{i=1}^{n}x_i'M_{F^0}x_i\right]^{-1}\hat\theta_n$$

where $\hat\theta_n = \frac{1}{n}\sum_{i=1}^{n}\hat\theta_i$ and $\hat\theta_i$ is a consistent estimate of $\theta_i^0$. The resulting bias-corrected estimator is

$$\tilde\beta_{LSBC} = \tilde\beta_{LS} - \frac{1}{T}\hat\phi_{nT}^0. \tag{9}$$

This estimator can alternatively be written as

$$\tilde\beta_{LSFM} = \left(\sum_{i=1}^{n}x_i'M_{F^0}x_i\right)^{-1}\sum_{i=1}^{n}\left[x_i'M_{F^0}\tilde y_i^+ - T\left(\tilde\Delta_{\varepsilon ui}^+ - \delta_i^{0\prime}\tilde\Delta_{\eta u}^+\right)\right] \tag{10}$$

where $\tilde y^+$ and $\tilde\Delta^+$ are consistent estimates of $y^+$ and $\Delta^+$, etc., with

$$y_{it}^+ = y_{it} - \Omega_{ubi}\Omega_{bi}^{-1}\begin{pmatrix}\Delta x_{it}\\ \Delta F_t^0\end{pmatrix}, \qquad u_{it}^+ = u_{it} - \Omega_{ubi}\Omega_{bi}^{-1}\begin{pmatrix}\Delta x_{it}\\ \Delta F_t^0\end{pmatrix}.$$

Viewed in this light, the bias-corrected estimator is also a panel fully-modified estimator in the spirit of Phillips and Hansen (1990), which is why the estimator is also labeled $\tilde\beta_{LSFM}$. It is not difficult to verify that $\tilde\beta_{LSBC}$ and $\tilde\beta_{LSFM}$ are identical. Panel fully modified estimators were also considered by Phillips and Moon (1999) and Bai and Kao (2006). Here, we extend those analyses to allow for common stochastic trends. By construction, $u_{it}^+$ has a zero long-run covariance with $(\Delta x_{it}', \Delta F_t^{0\prime})'$ and hence the endogeneity can be removed. Furthermore, nuisance parameters arising from the low frequency correlation of the errors are summarized in $\Delta_{bui}^+$.
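The kernel estimators $\hat\Omega_i$ and $\hat\Delta_i$ defined above can be sketched in Python as follows. This is a minimal illustration for a single unit $i$, assuming the Bartlett kernel $\omega(x) = \max(1 - |x|, 0)$, which is one kernel satisfying $\omega(0) = 1$ and symmetry; the paper does not prescribe a specific kernel at this point, and the function names are hypothetical. The input `w` stands in for the $T \times m$ array whose rows are $\hat w_{it}'$.

```python
import numpy as np

def bartlett(x):
    """Bartlett kernel (an assumed choice): omega(x) = max(1 - |x|, 0)."""
    return np.maximum(1.0 - np.abs(x), 0.0)

def long_run_cov(w, K, kernel=bartlett):
    """Kernel estimates of the two-sided (Omega) and one-sided (Delta) long-run covariance.

    w: (T, m) array whose rows are w_t'; K: bandwidth parameter.
    Uses Gamma(j) = (1/T) * sum_{t=1}^{T-j} w_{t+j} w_t' and Gamma(-j) = Gamma(j)'.
    """
    w = np.asarray(w, dtype=float)
    T = w.shape[0]
    gamma0 = w.T @ w / T
    delta = kernel(0.0) * gamma0                 # j = 0 term
    for j in range(1, T):
        weight = kernel(j / K)
        if weight == 0.0:                        # Bartlett weights vanish for j >= K
            continue
        gamma_j = w[j:].T @ w[:T - j] / T        # (1/T) sum_t w_{t+j} w_t'
        delta = delta + weight * gamma_j
    # Summing j = -(T-1), ..., T-1 counts the j = 0 term once.
    omega = delta + delta.T - kernel(0.0) * gamma0
    return omega, delta

rng = np.random.default_rng(0)
omega_hat, delta_hat = long_run_cov(rng.standard_normal((200, 3)), K=5)
print(omega_hat.shape, delta_hat.shape)
```

The partitions $\hat\Omega_{ubi}$, $\hat\Omega_{bi}$ and $\hat\Delta_{bui}$ used in the bias correction would then be read off as sub-blocks of the returned matrices, following the partitioning in (5) and (6).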
Proposition 2 Let $\tilde\beta_{LSFM}$ be defined by (10). Under Assumptions 1-6, as $(n, T)_{\mathrm{seq}} \to \infty$,

$$\sqrt{n}T\left(\tilde\beta_{LSFM} - \beta^0\right) \xrightarrow{d} MN\left(0, \Sigma^0\right).$$

In small scale cointegrated systems, estimators of cointegrating vectors are $T$ consistent, and this fast rate of convergence is already accelerated relative to the case of stationary regressions, which is $\sqrt{T}$. Here, in a panel data context with observed global stochastic trends, the estimates converge to the true values at an even faster rate of $\sqrt{n}T$ and the limiting distributions are normal.

To take advantage of this fast convergence rate made possible by large panels, we need to deal with the fact that $F^0$ is not observed. This problem is considered in the next two subsections.

3.2 Unobserved $F^0$ and the Cup Estimator

The LSFM considered above is a linear estimator and can be obtained if $F^0$ is observed. When $F^0$ is not observed, the previous estimator is infeasible. Recall that the least squares estimator that ignores $F$ is, in general, inconsistent. In this section, we consider estimating $F$ along with $\beta$ and $\Lambda$ by minimizing the objective function

$$S_{nT}(\beta, F, \Lambda) = \sum_{i=1}^{n}\left(y_i - x_i\beta - F\lambda_i\right)'\left(y_i - x_i\beta - F\lambda_i\right) \tag{11}$$

subject to the constraint that $T^{-2}F'F = I_r$ and $\Lambda'\Lambda$ is positive definite. The least squares estimator for $\beta$ for a given $F$ is

$$\hat\beta = \left(\sum_{i=1}^{n}x_i'M_Fx_i\right)^{-1}\sum_{i=1}^{n}x_i'M_Fy_i.$$

Define

$$w_i = y_i - x_i\beta = F\lambda_i + u_i.$$

Notice that given $\beta$, $w_i$ has a pure factor structure. Let $W = (w_1, \ldots, w_n)$ be a $T \times n$ matrix. We can rewrite the objective function (11) as $\mathrm{tr}[(W - F\Lambda')(W - F\Lambda')']$. If we concentrate out $\Lambda = W'F(F'F)^{-1} = T^{-2}W'F$, we have the concentrated objective function:

$$\mathrm{tr}\left(W'M_FW\right) = \mathrm{tr}\left(W'W\right) - \mathrm{tr}\left(F'WW'F/T^2\right). \tag{12}$$

Since the first term does not depend on $F$, minimizing (12) with respect to $F$ is equivalent to maximizing $\mathrm{tr}(T^{-2}F'WW'F)$ subject to the constraint $T^{-2}F'F = I_r$. The solution, denoted $\hat F$, is a matrix of the first $r$ eigenvectors (multiplied by $T$) of the matrix $\frac{1}{nT^2}\sum_{i=1}^{n}(y_i - x_i\beta)(y_i - x_i\beta)'$.

Although $F$ is not observed when estimating $\beta$, and similarly, $\beta$ is not observed when estimating $F$, we can replace the unobserved quantities by initial estimates and iterate until convergence. Such a solution is more easily seen if we rewrite the left hand side of (12) with $y_i - x_i\beta$ substituted for $w_i$. Define

$$S_{nT}(\beta, F) = \frac{1}{nT^2}\sum_{i=1}^{n}\left(y_i - x_i\beta\right)'M_F\left(y_i - x_i\beta\right).$$

The continuously-updated estimator (Cup) for $(\beta, F)$ is defined as

$$\left(\hat\beta_{Cup}, \hat F_{Cup}\right) = \operatorname*{argmin}_{\beta, F} S_{nT}(\beta, F).$$

More precisely, $(\hat\beta_{Cup}, \hat F_{Cup})$ is the solution to the following two nonlinear equations:

$$\hat\beta = \left(\sum_{i=1}^{n}x_i'M_{\hat F}x_i\right)^{-1}\sum_{i=1}^{n}x_i'M_{\hat F}y_i \tag{13}$$

$$\hat F V_{nT} = \left[\frac{1}{nT^2}\sum_{i=1}^{n}\left(y_i - x_i\hat\beta\right)\left(y_i - x_i\hat\beta\right)'\right]\hat F \tag{14}$$

where $M_{\hat F} = I_T - T^{-2}\hat F\hat F'$ since $\hat F'\hat F/T^2 = I_r$, and $V_{nT}$ is a diagonal matrix consisting of the $r$ largest eigenvalues of the matrix inside the brackets, arranged in decreasing order. Note that the estimator is obtained by iteratively solving for $\hat\beta$ and $\hat F$ using (13) and (14). It is a non-linear estimator even though linear least squares estimation is involved at each iteration. An estimate of $\Lambda$ can be obtained as

$$\hat\Lambda' = T^{-2}\hat F'\left(Y - X\hat\beta\right).$$

The triplet $(\hat\beta, \hat F, \hat\Lambda)$ jointly minimizes the objective function (11).

The estimator $\hat\beta_{Cup}$ is consistent for $\beta$. We state this result in the following proposition.

Proposition 3 Under Assumptions 1-4 and as $(n, T) \to \infty$,

$$\hat\beta_{Cup} \xrightarrow{p} \beta^0.$$

We now turn to the asymptotic representation of $\hat\beta_{Cup}$.
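Before stating that representation, the iteration between (13) and (14) can be sketched in Python. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function name `cup_estimate`, the pooled OLS starting value, the stopping rule on successive values of $\hat\beta$, and the simulated data in the usage lines are all choices made here for illustration. Each step weakly lowers $S_{nT}(\beta, F)$, but convergence to the global minimum from an arbitrary starting value is not guaranteed by this sketch.

```python
import numpy as np

def cup_estimate(y, x, r, beta_init=None, max_iter=500, tol=1e-8):
    """Iterate between equations (13) and (14).

    y: (T, n) array; x: (T, n, k) array; r: assumed number of factors.
    Returns beta_hat (k,), F_hat (T, r) with F'F / T^2 = I_r, Lambda_hat (n, r).
    """
    T, n, k = x.shape
    if beta_init is None:
        # Starting value (an assumption of this sketch): pooled OLS ignoring the factors.
        xx = np.einsum('tik,til->kl', x, x)
        xy = np.einsum('tik,ti->k', x, y)
        beta = np.linalg.solve(xx, xy)
    else:
        beta = np.asarray(beta_init, dtype=float)
    for _ in range(max_iter):
        # (14): F_hat is T times the r leading eigenvectors of (1/(n T^2)) sum_i w_i w_i'.
        W = y - x @ beta
        _, vecs = np.linalg.eigh(W @ W.T / (n * T**2))   # eigenvalues in ascending order
        F = T * vecs[:, ::-1][:, :r]
        # (13): least squares for beta given F_hat, using M_F = I_T - F F' / T^2.
        M = np.eye(T) - F @ F.T / T**2
        Mx = np.einsum('ts,sik->tik', M, x)
        xx = np.einsum('tik,til->kl', x, Mx)
        xy = np.einsum('tik,ti->k', Mx, y)
        beta_new = np.linalg.solve(xx, xy)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    Lam = (y - x @ beta).T @ F / T**2                    # Lambda_hat = T^{-2} W' F_hat
    return beta, F, Lam

# Usage on simulated data with one I(1) factor and true beta = 2 (illustrative values).
rng = np.random.default_rng(0)
n, T = 30, 80
F0 = np.cumsum(rng.standard_normal(T))
x = 0.3 * F0[:, None] + np.cumsum(rng.standard_normal((T, n)), axis=0)
y = 2.0 * x + F0[:, None] * rng.normal(1.0, 0.3, n) + rng.standard_normal((T, n))
beta_hat, F_hat, Lam_hat = cup_estimate(y, x[:, :, None], r=1)
print(beta_hat)
```

The eigenvectors returned by `numpy.linalg.eigh` have unit length, so scaling them by $T$ imposes the normalization $T^{-2}\hat F'\hat F = I_r$ used in (11).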
Proposition 4 Suppose Assumptions 1-4 hold and $(n, T) \to \infty$. Then

$$\sqrt{n}T\left(\hat\beta_{Cup} - \beta^0\right) = D\left(F^0\right)^{-1}\left[\frac{1}{\sqrt{n}T}\sum_{i=1}^{n}\left(x_i'M_{F^0} - \frac{1}{n}\sum_{k=1}^{n}a_{ik}x_k'M_{F^0}\right)u_i\right] + o_p(1),$$

where $a_{ik} = \lambda_i'\left(\Lambda'\Lambda/n\right)^{-1}\lambda_k$, $D(F^0) = \frac{1}{nT^2}\sum_{i=1}^{n}Z_i'Z_i$ and $Z_i = M_{F^0}x_i - \frac{1}{n}\sum_{k=1}^{n}M_{F^0}x_ka_{ik}$.

In comparison with the pooled least squares estimator for the case of known $F^0$, estimation of the stochastic trends clearly affects the limiting behavior of the estimator. The term involving $a_{ik}$ is due to the estimation of $F$. This effect is carried over to the limiting distribution and to the asymptotic bias, as we now proceed to show.

Let $\bar w_{it} = (u_{it}, \Delta\bar x_{it}', \eta_t')'$ where $\bar x_i = x_i - \frac{1}{n}\sum_{k=1}^{n}x_ka_{ik}$. For the rest of the paper, we use a bar to denote those long-run covariance matrices (including one-sided and conditional covariances and so on) generated from $\bar w_{it}$ instead of $w_{it}$. Thus, $\bar\Omega_i$ is the long-run covariance matrix of $\bar w_{it}$ as in (5), and $\bar\Delta_i$ is the one-sided covariance matrix of $\bar w_{it}$. These quantities depend on $n$, but this dependence is suppressed for notational simplicity.

Because the right hand side of the representation does not depend on estimated quantities, it is not difficult to derive the limiting distribution of $\hat\beta_{Cup}$, even allowing for cross-sectional correlation in $u_{it}$. However, estimating the resulting nuisance parameters would be more difficult. Thus, although consistency of the Cup estimator does not require the cross-section independence of $u_{it}$, our asymptotic distribution for $\hat\beta_{Cup}$ is derived with Assumption 5 imposed.

Theorem 1 Suppose that Assumptions 1-5 hold. Let $\hat\beta_{Cup}$ be obtained by iteratively updating (13) and (14).
As $(n, T)_{\mathrm{seq}} \to \infty$, we have

$$\sqrt{n}T\left(\hat\beta_{Cup} - \beta^0\right) - \sqrt{n}\phi_{nT} \xrightarrow{d} MN(0, \Sigma)$$

where

$$\phi_{nT} = \left[\frac{1}{nT^2}\sum_{i=1}^{n}Z_i'Z_i\right]^{-1}\left[\frac{1}{n}\sum_{i=1}^{n}\theta_i\right],$$

$$\theta_i = \frac{1}{T}Z_i'\Delta\bar b_i\,\bar\Omega_{bi}^{-1}\bar\Omega_{bui} + \left(\bar\Delta_{\varepsilon ui}^+ - \bar\delta_i'\bar\Delta_{\eta u}^+\right),$$

$$\Sigma = D_Z^{-1}\left[\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\bar\Omega_{u.bi}E\left(\int R_{ni}R_{ni}' \,\Big|\, C\right)\right]D_Z^{-1}, \qquad D_Z = \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}E\left(\int R_{ni}R_{ni}' \,\Big|\, C\right), \tag{15}$$

$$R_{ni} = Q_i - \frac{1}{n}\sum_{k=1}^{n}Q_ka_{ik}, \qquad \Delta\bar b_i = \left(\Delta\bar x_i \;\; \Delta F^0\right),$$

$$\bar x_i = x_i - \frac{1}{n}\sum_{k=1}^{n}x_ka_{ik}, \qquad \bar\delta_i = \delta_i - \frac{1}{n}\sum_{k=1}^{n}\delta_ka_{ik}.$$

Theorem 1 establishes the large sample properties of the Cup estimator. The Cup estimator is $\sqrt{n}T$ consistent provided that $\phi_{nT} = 0$, which occurs when $x_{it}$ and $F_t$ are exogenous. Since $\phi_{nT} = O_p(1)$, the Cup estimator is at least $T$ consistent. This is in contrast with pooled OLS in Section 2, which was shown to be inconsistent in general. Nevertheless, as in the case when $F$ is observed, the Cup estimator has an asymptotic bias and thus the limiting distribution is not centered around zero. There is an extra bias term (the term involving $a_{ik}$) that arises from having to estimate $F_t$. In consequence, the bias is now a function of terms not present in Proposition 1, which is valid when $F_t$ is observed.

We now consider removing the bias by constructing a consistent estimate of $\phi_{nT}$. This can be obtained upon replacing $F^0$, $\Delta\bar b_i$, $\bar\Omega_{bi}$, $\bar\Omega_{bui}$, $\bar\Delta_{\varepsilon ui}^+$, and $\bar\Delta_{\eta u}^+$ by their consistent estimates. We consider two fully-modified estimators. The first one directly corrects the bias of $\hat\beta_{Cup}$, and is denoted by $\hat\beta_{CupBC}$. The second one will be considered in the next subsection, where the correction is made during each iteration, and will be denoted by $\hat\beta_{CupFM}$.

Consider

$$\hat{\bar\Omega}_i = \sum_{j=-T+1}^{T-1}\omega\left(\frac{j}{K}\right)\hat\Gamma_i(j), \qquad \hat{\bar\Delta}_i = \sum_{j=0}^{T-1}\omega\left(\frac{j}{K}\right)\hat\Gamma_i(j),$$

$$\hat\Gamma_i(j) = \frac{1}{T}\sum_{t=1}^{T-j}\hat{\bar w}_{it+j}\hat{\bar w}_{it}'$$
where $\hat{\bar{w}}_{it} = (\hat{u}_{it}, \Delta\hat{\bar{x}}_{it}', \Delta\hat{F}_t')'$ with

$$\Delta\hat{\bar{x}}_{it} = \Delta x_{it} - \frac{1}{n}\sum_{k=1}^{n}\Delta x_{kt}\,\hat{a}_{ik}.$$

The bias-corrected Cup estimator is defined as

$$\hat{\beta}_{CupBC} = \hat{\beta}_{Cup} - \frac{1}{T}\hat{\phi}_{nT}$$

where

$$\hat{\phi}_{nT} = \left[\frac{1}{nT^2}\sum_{i=1}^{n}\hat{Z}_i'\hat{Z}_i\right]^{-1}\left(\frac{1}{n}\sum_{i=1}^{n}\hat{\theta}_i\right), \qquad \hat{\theta}_i = \frac{1}{T}\hat{Z}_i'\,\Delta\hat{\bar{b}}_i\,\hat{\bar{\Omega}}_{bi}^{-1}\hat{\bar{\Omega}}_{bui} + \left(\hat{\bar{\Delta}}^{+}_{\varepsilon ui} - \hat{\bar{\delta}}_i'\hat{\bar{\Delta}}^{+}_{\eta u}\right),$$

$$\hat{\bar{\delta}}_i = \left(\hat{F}'\hat{F}\right)^{-1}\hat{F}'\hat{\bar{x}}_i, \quad \Delta\hat{\bar{b}}_i = \left(\Delta\hat{\bar{x}}_i \;\; \Delta\hat{F}\right), \quad \hat{\bar{x}}_i = x_i - \frac{1}{n}\sum_{k=1}^{n} x_k\,\hat{a}_{ik}, \quad \hat{a}_{ik} = \hat{\lambda}_i'\left(\hat{\Lambda}'\hat{\Lambda}/n\right)^{-1}\hat{\lambda}_k.$$

Theorem 2 Suppose Assumptions 1-6 hold. Then as $(n,T)_{\mathrm{seq}} \to \infty$,

$$\sqrt{n}T\left(\hat{\beta}_{CupBC} - \beta^0\right) \xrightarrow{d} MN(0, \Sigma).$$

The CupBC is $\sqrt{n}T$ consistent with a limiting distribution that is centered at zero. This type of bias correction approach is also used in Hahn and Kuersteiner (2002), for example, and is not uncommon in panel data analysis. Because the bias-corrected estimator is $\sqrt{n}T$ consistent and has a normal limit distribution, the usual $t$ and Wald tests can be used for inference. Note that the limiting distribution is different from that of the infeasible LSBC estimator, which coincides with LSFM and whose asymptotic variance is $\Sigma_0$ instead of $\Sigma$. Thus, the estimation of $F$ affects the asymptotic distribution of the estimator.

As in the case when $F$ is observed, the bias-corrected estimator can be rewritten as a fully modified estimator. Such a fully-modified estimator is now discussed.

3.3 A Fully Modified Cup Estimator

The CupBC just considered is constructed by estimating the asymptotic bias of $\hat{\beta}_{Cup}$, and then subtracting it from $\hat{\beta}_{Cup}$. In this subsection, we consider a different fully-modified estimator, denoted by $\hat{\beta}_{CupFM}$. Let

$$y^{+}_{it} = y_{it} - \hat{\bar{\Omega}}_{ubi}\hat{\bar{\Omega}}_{bi}^{-1}\begin{pmatrix}\Delta\hat{\bar{x}}_{it}\\ \Delta\hat{F}_t\end{pmatrix}, \qquad \hat{\bar{\delta}}_i = \left(\hat{F}'\hat{F}\right)^{-1}\hat{F}'\hat{\bar{x}}_i$$

where $\hat{\bar{\Omega}}_{ubi}$, $\hat{\bar{\Omega}}_{bi}$, and $\hat{\bar{\Delta}}_{bui}$ are estimates of $\bar{\Omega}_{ubi}$, $\bar{\Omega}_{bi}$ and $\bar{\Delta}_{bui}$, respectively. Recall that $\hat{\beta}_{Cup}$ is obtained by jointly solving (13) and (14).
Consider replacing these equations by the following:

$$\hat{\beta}_{CupFM} = \left(\sum_{i=1}^{n} x_i' M_{\hat{F}}\, x_i\right)^{-1}\sum_{i=1}^{n}\left(x_i' M_{\hat{F}}\, y_i^{+} - T\left(\hat{\bar{\Delta}}^{+}_{\varepsilon ui} - \hat{\bar{\delta}}_i'\hat{\bar{\Delta}}^{+}_{\eta u}\right)\right) \qquad (16)$$

$$\hat{F}\,V_{nT} = \left[\frac{1}{nT^2}\sum_{i=1}^{n}\left(y_i - x_i\hat{\beta}_{CupFM}\right)\left(y_i - x_i\hat{\beta}_{CupFM}\right)'\right]\hat{F}. \qquad (17)$$

Like the FM estimator of Phillips and Hansen (1990), the corrections are made to the data to remove serial correlation and endogeneity. The CupFM estimator for $(\beta, F)$ is obtained by iteratively solving (16) and (17). Thus correction for endogeneity and serial correlation is made during each iteration.

Theorem 3 Suppose Assumptions 1-6 hold. Then as $(n,T)_{\mathrm{seq}} \to \infty$,

$$\sqrt{n}T\left(\hat{\beta}_{CupFM} - \beta^0\right) \xrightarrow{d} MN(0, \Sigma),$$

where $\Sigma$ is given in (15).

The CupFM and CupBC have the same asymptotic distribution, but they are constructed differently. The estimator $\hat{\beta}_{CupBC}$ does the bias correction only once, i.e., at the final stage of the iteration, and $\hat{\beta}_{CupFM}$ does the correction at every iteration. The situation is different from the case of known $F$, in which the bias-corrected estimator and the fully-modified estimator are identical due to the absence of iteration. Again, because of the mixture of normality, hypothesis testing on $\beta$ can proceed with the usual $t$ or chi-square distributions.

Kapetanios, Pesaran, and Yamagata (2006) suggest an alternative estimation procedure based on Pesaran (2006). The model is augmented with additional regressors $\bar{y}_t$ and $\bar{x}_t$, which are cross-sectional averages of $y_{it}$ and $x_{it}$. These averages are used as proxies for $F_t$. The estimator for the slope parameter $\beta$ is shown to be $\sqrt{n}$ consistent, but a fully modified estimator is not considered.

While the focus is on estimating the slope parameters $\beta$, the global stochastic trends $F$ are also of interest. We state this result as a proposition:

Proposition 5 Let $\hat{F}$ be the solution of (17).
Under Assumptions 1-4, we have

$$\frac{1}{T}\sum_{t=1}^{T}\left\|\hat{F}_t - H F^0_t\right\|^2 = O_p\left(\frac{1}{n}\right) + O_p\left(\frac{1}{T^2}\right)$$

where $H$ is an $r \times r$ invertible matrix.

Thus, we can estimate the true global stochastic trends up to a rotation. This is the same rate as in Bai (2004, Lemma B.1), where the regressor $x_{it}$ is absent. Similarly, the factor loadings $\lambda_i$ are estimated with the same rate of convergence as in Bai (2004).

Thus far, our analysis assumes that the number of stochastic trends, $r$, is known. If this is not the case, $r$ can be consistently estimated using the information criterion function developed in Bai and Ng (2002). In particular, let

$$\hat{r} = \arg\min_{1 \le r \le r_{\max}} IC(r)$$

where $r \le r_{\max}$, $r_{\max}$ is a bounded integer and

$$IC(r) = \log\hat{\sigma}^2(r) + r\,g_{nT}$$

where $g_{nT} \to 0$ as $n, T \to \infty$ and $\min[n,T]\,g_{nT} \to \infty$. For example, $g_{nT}$ can be $\log(a_{nT})/a_{nT}$, with $a_{nT} = \frac{nT}{n+T}$. Then $P(\hat{r} = r) \to 1$ as $n, T \to \infty$. This criterion estimates the total number of factors, including I(0) factors. To estimate the number of I(1) factors only, the criterion in Bai (2004) can be used. Ignoring the I(0) factors still leads to consistent estimation of $\beta$. However, our distribution theory assumes cross-sectional independence for the idiosyncratic errors; lumping I(0) factors with the regression errors will violate this assumption. This suggests the use of the Bai-Ng criterion.

4 Further issues

The preceding analysis assumes that there are no deterministic components and that the regressors and the common factors are all I(1) without drifts. This section considers construction of the estimator when these restrictions are relaxed. It will be shown that when there are deterministic components, we can apply the same estimation procedure to the demeaned or detrended series, and the Brownian motion processes in the limiting distribution are replaced by the demeaned and/or detrended versions.
Furthermore, the procedure is robust to the presence of mixed I(1)/I(0) regressors and/or factors. Of course, the convergence rates for I(0) and I(1) regressors will be different, but asymptotic mixed normality and the construction of test statistics (and their limiting distribution) do not depend on the convergence rate. Finally, we also discuss the issue of testing cross-sectional independence.

4.1 Incidental trends

The Cup estimator can be easily extended to models with incidental trends,

$$y_{it} = \alpha_i + \rho_i t + x_{it}'\beta + \lambda_i'F_t + u_{it}. \qquad (18)$$

In the intercept only case ($\rho_i = 0$ for all $i$), we define the projection matrix

$$M_T = I_T - \iota_T\iota_T'/T$$

where $\iota_T$ is a vector of 1's. When a linear trend is also included in the estimation, we define $M_T$ to be the projection matrix orthogonal to $\iota_T$ and to the linear trend. Then

$$M_T y_i = M_T x_i\beta + M_T F\lambda_i + M_T u_i, \quad\text{or}\quad \dot{y}_i = \dot{x}_i\beta + \dot{F}\lambda_i + \dot{u}_i$$

where the dotted variables are demeaned and/or detrended versions. The estimation procedure for the Cup estimator is identical to that of Section 3, except that we use dotted variables.

In the intercept only case, the construction of the FM estimator is also the same as before. Theorems 1-3 hold with the following modification for the limiting distribution. The random processes $B_{\varepsilon,i}$ and $B_\eta$ in $Q_i$ are replaced by the demeaned Brownian motions.

When linear trends are allowed, $\Delta x_{it}$ is now replaced by $\hat{\varepsilon}_{it} = \Delta x_{it} - \overline{\Delta x}_i$, which is the detrended residual of $x_{it}$. But since $\dot{x}_i$ is already a detrended series, and $\hat{F}$ is also asymptotically detrended (since it is estimating $\dot{F}$), $\Delta\dot{x}_{it}$ and $\Delta\hat{F}_t$ are also estimating the detrended residuals. Thus we can simply apply the same procedure prescribed in Section 3 with the dotted variables.
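The demeaning and detrending projections described in this subsection can be illustrated with a short numerical sketch. The function name `projection_matrix` and the simulated series are illustrative assumptions introduced here, not part of the paper; the sketch only builds $M_T$ for the intercept-only case and for the intercept-plus-linear-trend case and applies it to one series.

```python
import numpy as np

def projection_matrix(T, trend=False):
    """Return M_T = I_T - D (D'D)^{-1} D'.

    D holds a column of ones (intercept-only case) and, if trend=True,
    an additional linear trend column 1, 2, ..., T.
    (Hypothetical helper name, used only for this illustration.)
    """
    cols = [np.ones(T)]
    if trend:
        cols.append(np.arange(1, T + 1, dtype=float))
    D = np.column_stack(cols)
    return np.eye(T) - D @ np.linalg.solve(D.T @ D, D.T)

T = 50
t = np.arange(1, T + 1, dtype=float)
rng = np.random.default_rng(0)

# An illustrative I(1) series with an intercept and a linear trend.
x = 3.0 + 0.5 * t + np.cumsum(rng.standard_normal(T))

M_mean = projection_matrix(T)               # orthogonal to iota_T
M_trend = projection_matrix(T, trend=True)  # orthogonal to iota_T and to t

x_dot = M_trend @ x  # the "dotted" (demeaned and detrended) series
```

Applying the same matrix to $y_i$, $x_i$ and (implicitly) $F$ yields the dotted variables on which the Section 3 procedure would then be run.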
The limiting distribution in Theorem 2 and consequently in Theorem 3 is modified upon replacing the random processes $B_{\varepsilon i}$ and $B_\eta$ by the demeaned and detrended Brownian motions (see footnote 8). The test statistics ($t$ and $\chi^2$) have standard asymptotic distributions, not depending on whether the underlying Brownian motion is demeaned or detrended.

When linear trends are included in the estimation, the limiting distribution is invariant to whether or not $y_{it}$, $x_{it}$ and $F_t$ contain a linear trend. Now suppose that these variables do contain a linear trend (drifted random walks). With deterministic cointegration holding (i.e., the cointegrating vector eliminates the trends), the estimated $\beta$ will have a faster convergence rate when a separate linear trend is not included in the estimation. But we do not consider this case. Interested readers are referred to Hansen (1992).

4.2 Mixed I(0)/I(1) Regressors and Common Shocks

So far, we have considered estimation of panel cointegration models when all the regressors and common shocks are I(1). There are no stationary regressors or stationary common shocks. In this section we suggest that the results are robust to mixed I(1)/I(0) regressors and mixed I(1)/I(0) common shocks. Below, we sketch the arguments for the LS estimator assuming the factors are observed. If they are not observed, the limiting distribution is different, but the idea of the argument is the same.

Recall that the LS estimator is $\hat{\beta}_{LS} = \left(\sum_{i=1}^{n} x_i'M_{F^0}x_i\right)^{-1}\sum_{i=1}^{n} x_i'M_{F^0}y_i$. The term

$$M_{F^0}x_i = \left(I_T - F^0(F^{0\prime}F^0)^{-1}F^{0\prime}\right)x_i = x_i - F^0\delta_i$$

with $\delta_i = (F^{0\prime}F^0)^{-1}F^{0\prime}x_i$ plays an important role in the properties of the LS. When $x_{it}$ and $F_t$ are I(1), $\delta_i = O_p(1)$ and thus

$$\frac{(M_{F^0}x_i)_t}{\sqrt{T}} = \frac{x_{it}}{\sqrt{T}} - \delta_i'\frac{F^0_t}{\sqrt{T}} = O_p(1).$$

We now consider this term under mixed I(1) and I(0) assumptions.

I(1) Regressors, I(0) Factors.
Suppose all regressors are I(1) and all common shocks are I(0). With I(0) factors, we have $T^{-1}F^{0\prime}F^0 \xrightarrow{p} \Sigma_F = O_p(1)$. Thus

$$\delta_i = \left(T^{-1}F^{0\prime}F^0\right)^{-1}\frac{1}{T}\sum_{t=1}^{T}F^0_t x_{it}' \xrightarrow{d} \Sigma_F^{-1}\int dB_\eta B_{\varepsilon i}' = O_p(1).$$

Footnote 8: Alternatively, we can use $\hat{\varepsilon}_{it} - \frac{1}{n}\sum_{k=1}^{n}\hat{\varepsilon}_{kt}\hat{a}_{ik}$ in place of $\Delta\hat{\bar{x}}_{it}$ in Section 3. Similarly, we use $\hat{\eta}_t = \Delta\hat{F}_t - \overline{\Delta\hat{F}}$ in place of $\Delta\hat{F}_t$.

It follows that

$$\frac{(M_{F^0}x_i)_t}{\sqrt{T}} = \frac{x_{it} - \delta_i'F^0_t}{\sqrt{T}} = \frac{x_{it}}{\sqrt{T}} + o_p(1)$$

and $\frac{x_{it}}{\sqrt{T}} \xrightarrow{d} B_{\varepsilon i}$ as $T \to \infty$. The limiting distribution of the LS when the factors are I(0) is the same as when all factors are I(1), except that $Q_i$ is now asymptotically the same as $B_{\varepsilon i}$.

For the FM, observe that the submatrix $\Omega_\eta$ in

$$\Omega_{bi} = \begin{pmatrix}\Omega_{\varepsilon i} & \Omega_{\varepsilon\eta i}\\ \Omega_{\eta\varepsilon i} & \Omega_\eta\end{pmatrix}$$

is a zero matrix since $\eta_t = \Delta F^0_t$ is an I($-1$) process and has zero long-run variance. Similarly, $\Omega_{\varepsilon\eta i}$ is also zero. The submatrix $\Omega_{u\eta i}$ in $\Omega_{u.bi} = \Omega_{ui} - \Omega_{ubi}\Omega_{bi}^{-1}\Omega_{bui}$ as well as the submatrices $(\Delta_{\eta ui} \;\; \Delta_{\eta i})$ in $(\Delta_{bui} \;\; \Delta_{bi})$ are also degenerate because the factors are I(0). Note that $\Omega_{bi}$ is not invertible. Under appropriate choice of bandwidth, see Phillips (1995), $\Omega_{bi}^{-1}\Omega_{bui}$ can be consistently estimated, so that FM estimators can be constructed. This argument treats $F_t$ as if it were I(1). If it is known that $F_t$ is I(0), we will simply use $F_t$ instead of $\Delta F_t$ in the FM construction.

I(1) Regressors, Mixed I(0)/I(1) Factors

Consider the model

$$y_{it} = x_{it}'\beta + \lambda_{1i}'F_{1t} + \lambda_{2i}'F_{2t} + u_{it} \qquad (19)$$

where $F_{1t} = \eta_{1t}$ is $r_1 \times 1$ and $\Delta F_{2t} = \eta_{2t}$ is $r_2 \times 1$. We again have $M_{F^0}x_i = x_i - F^0\delta_i$ but $\delta_i = (\delta_{1i}', \delta_{2i}')'$. Then

$$\frac{(M_{F^0}x_i)_t}{\sqrt{T}} = \frac{x_{it}}{\sqrt{T}} - \frac{1}{\sqrt{T}}\begin{pmatrix}\delta_{1i}' & \delta_{2i}'\end{pmatrix}\begin{pmatrix}F^0_{1t}\\ F^0_{2t}\end{pmatrix} = \frac{x_{it}}{\sqrt{T}} - \frac{1}{\sqrt{T}}\left(\delta_{1i}'F^0_{1t} + \delta_{2i}'F^0_{2t}\right) = \frac{x_{it}}{\sqrt{T}} - \delta_{2i}'\frac{F^0_{2t}}{\sqrt{T}} + o_p(1)$$

since $\delta_{1i} = O_p(1)$, $\delta_{2i} = O_p(1)$ but $\frac{F^0_{1t}}{\sqrt{T}} = o_p(1)$. The random matrix $Q_i$ involves $B_{\varepsilon i}$ and $B_{2\eta}$.
In the FM correction, the long-run variance of $(u_{it}, \Delta x_{it}', \Delta F_{1t}', \Delta F_{2t}')'$ is degenerate. With an appropriate choice of bandwidth as in Phillips (1995), the limiting normality still holds.

Mixed I(1)/I(0) Regressors and I(1) Factors

Suppose $k_2$ regressors denoted by $x_{2it}$ are I(1), and $k_1$ regressors denoted by $x_{1it}$ are I(0). Assume $F_t$ is I(1) and $u_{it}$ is I(0) as in (3). Consider

$$y_{it} = \alpha_i + x_{1it}'\beta_1 + x_{2it}'\beta_2 + \lambda_i'F_t + u_{it}, \qquad \Delta x_{2it} = \varepsilon_{2it}.$$

With the inclusion of an intercept, there is no loss of generality in assuming that $x_{1it}$ has a zero mean. For this model, we add the assumption that

$$E(x_{1it}u_{it}) = 0 \qquad (20)$$

to rule out simultaneity bias with I(0) regressors. Otherwise $\beta_1$ cannot be consistently estimated. Alternatively, if $u_{it}$ is correlated with $x_{1it}$, we can project $u_{it}$ onto $x_{1it}$ to obtain the projection residual and still denote it by $u_{it}$ (with abuse of notation), and by definition, $u_{it}$ is uncorrelated with $x_{1it}$. But then $\beta_1$ is no longer the structural parameter. The dynamic least squares approach by adding $\Delta x_{2it}$ is exactly based on this argument, with the purpose of more efficient estimation of $\beta_2$.

If one knows which variable is I(0) and which is I(1), the situation is very simple. Because the I(1) and I(0) variables are asymptotically orthogonal, we can separately analyze the distribution of the estimated $\beta_1$ and $\beta_2$. The estimated $\beta_1$ needs no correction and is asymptotically normal, and the estimated $\beta_2$ has a distribution as if there are no I(0) regressors except the intercept. Note that the FM construction for $\hat{\beta}_2$ is based on the residuals with all regressors included. The rest of the analysis is identical to the situation of all I(1) regressors with an intercept.

In practice, the separation of I(0) or I(1) regressors may not be known in advance.
One can proceed by pretesting to identify the integration order for each variable, and then apply the above argument. One major purpose of separating I(0) and I(1) variables is to derive the relevant rate of convergence for the estimated parameters. But if the ultimate purpose is to do hypothesis testing, there is no need to know the rate of convergence for the estimator since the scaling factors $n$ or $T$ are cancelled out in the end. One can proceed as if all regressors are I(1). Then care should be taken since the long-run covariance matrix is of deficient rank. Phillips (1995) shows that FM estimators can be constructed with appropriate choice of bandwidth. Interested readers are referred to Phillips (1995) for details.

Finally, there is the case of mixed I(1)/I(0) regressors and mixed I(1)/I(0) factors. As explained earlier, I(0) factors do not change the result. In practice, there is no need to know whether $F^0$ is I(1) or I(0), since the Cup estimator only depends on $M_{\hat{F}}$; scaling in $\hat{F}$ does not alter the numerical value of $\hat{\beta}_{Cup}$.

5 Monte Carlo Simulations

In this section, we conduct Monte Carlo experiments to assess the finite sample properties of the proposed CupBC and CupFM estimators. We also compare the performance of the proposed estimators with that of LSDV (least squares dummy variables, i.e., the within group estimator) and 2sFM (2-stage fully modified, which is the CupFM estimator with only one iteration).

Data are generated based on the following design. For $i = 1, \ldots, n$, $t = 1, \ldots, T$,

$$y_{it} = 2x_{it} + c\left(\lambda_i'F_t\right) + u_{it}, \qquad F_t = F_{t-1} + \eta_t, \qquad x_{it} = x_{it-1} + \varepsilon_{it}$$

where, as described in footnote 9,

$$\begin{pmatrix}u_{it}\\ \varepsilon_{it}\\ \eta_t\end{pmatrix} \overset{iid}{\sim} N\left(\begin{pmatrix}0\\ 0\\ 0\end{pmatrix}, \begin{pmatrix}1 & \sigma_{12} & \sigma_{13}\\ \sigma_{21} & 1 & \sigma_{23}\\ \sigma_{31} & \sigma_{32} & 1\end{pmatrix}\right). \qquad (21)$$

We assume a single factor, i.e., $r = 1$; $\lambda_i$ and $\eta_t$ are generated from i.i.d. $N(\mu_\lambda, 1)$ and $N(\mu_\eta, 1)$ respectively. We set $\mu_\lambda = 2$ and $\mu_\eta = 0$.
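A minimal sketch of one draw from this design is given below, in Python with NumPy rather than the GAUSS code used in the paper. Several choices are assumptions made only for illustration: the function names `simulate_panel` and `lsdv`, the zero initial values for $x_{it}$ and $F_t$, the default parameter values, and the way the common shock $\eta_t$ is combined with the unit-specific shocks (here $\eta_t$ is drawn first and $(u_{it}, \varepsilon_{it})$ are drawn conditionally on it, independently across $i$, which reproduces the covariance matrix in (21) for each $i$).

```python
import numpy as np

def simulate_panel(n, T, c=5.0, s21=0.2, s31=0.8, s32=0.4,
                   mu_lam=2.0, beta=2.0, seed=0):
    """One draw of y_it = beta*x_it + c*lambda_i*F_t + u_it with r = 1.

    Assumed for illustration: x_i0 = 0, F_0 = 0, mu_eta = 0, and
    (u_it, eps_it) drawn conditionally on the common shock eta_t.
    """
    rng = np.random.default_rng(seed)
    eta = rng.standard_normal(T)                  # common shock, N(0, 1)

    # Conditional law of (u, eps) given eta: mean s*eta, cov S - s s'.
    s = np.array([s31, s32])
    S = np.array([[1.0, s21], [s21, 1.0]])
    L = np.linalg.cholesky(S - np.outer(s, s))    # needs a positive definite matrix
    z = rng.standard_normal((n, T, 2))
    ue = z @ L.T + eta[None, :, None] * s         # shape (n, T, 2)
    u, eps = ue[..., 0], ue[..., 1]

    F = np.cumsum(eta)                            # F_t = F_{t-1} + eta_t
    x = np.cumsum(eps, axis=1)                    # x_it = x_{it-1} + eps_it
    lam = rng.normal(mu_lam, 1.0, size=n)         # loadings, N(mu_lam, 1)
    y = beta * x + c * lam[:, None] * F[None, :] + u
    return y, x, F, lam, u

def lsdv(y, x):
    """Within-group (LSDV) slope estimate for a single regressor."""
    yd = y - y.mean(axis=1, keepdims=True)
    xd = x - x.mean(axis=1, keepdims=True)
    return (xd * yd).sum() / (xd ** 2).sum()

y, x, F, lam, u = simulate_panel(n=20, T=40)
b_lsdv = lsdv(y, x)
```

Repeating the draw over many seeds and comparing `b_lsdv` with the true slope of 2 would give a rough picture of the LSDV behavior discussed in this section; the Cup estimators themselves are not implemented in this sketch.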
Endogeneity in the system is controlled by only two parameters, $\sigma_{21}$ and $\sigma_{31}$. The parameter $c$ controls the importance of the global stochastic trends. We consider $c = (5, 10)$, $\sigma_{32} = 0.4$, $\sigma_{21} = (0, 0.2, -0.2)$ and $\sigma_{31} = (0, 0.8, -0.8)$.

The long-run covariance matrix is estimated using the KERNEL procedure in COINT 2.0. We use the Bartlett window with the truncation set at five. Results for other kernels, such as Parzen and quadratic spectral kernels, are similar and hence not reported. The maximum number of iterations for the CupBC and CupFM estimators is set to 20.

Table 1 reports the means and standard deviations (in parentheses) of the estimators for sample sizes $T = n = (20, 40, 60, 120)$. The results are based on 10,000 replications. The bias of the LSDV estimator does not decrease as $(n, T)$ increases in general. In terms of mean bias, the CupBC and CupFM are distinctly superior to the LSDV and 2sFM estimators for all cases considered. The 2sFM estimator is less efficient than the CupBC and CupFM estimators, as seen by the larger standard deviations.

To see how the properties of the estimator vary with $n$ and $T$, Table 2 considers 16 different combinations for $n$ and $T$, each ranging from 20 to 120. From Table 2, we see that the LSDV and 2sFM estimators become heavily biased when the importance of the common shock is magnified as we increase $c$ from 5 to 10. On the other hand, the CupBC and CupFM estimators are unaffected by the values of $c$. The results in Table 2 again indicate that the CupBC and CupFM perform well.

The properties of the $t$-statistic for testing $\beta = \beta_0$ are given in Table 3. Here, the LSDV $t$-statistic is the conventional $t$-statistic as reported by standard statistical packages. It is clear that LSDV $t$-statistics and 2sFM $t$-statistics diverge as $(n, T)$ increases and they are not well approximated by a standard N(0,1) distribution. The CupBC and CupFM $t$-statistics are much better approximated by a standard N(0,1). Interestingly, the performance of CupBC is no worse than that of CupFM, even though CupBC does the full modification in the final stage of iteration.

Footnote 9: Random numbers for the error terms $(u_{it}, \varepsilon_{it}, \eta_t)$ are generated by the GAUSS procedure RNDNS. At each replication, we generate an $nT$ length of random numbers and then split it into $n$ series so that each series has the same mean and variance.

Table 4 shows that, as $n$ and $T$ increase, the biases for the $t$-statistics associated with LSDV and 2sFM do not decrease. For CupBC and CupFM, the biases for the $t$-statistics become smaller (except for a small number of cases) as $T$ increases for each fixed $n$. As $n$ increases, no improvement in bias is found. The large standard deviations in the $t$-statistics associated with LSDV and 2sFM indicate their poor performance, especially as $T$ increases. For the CupBC and CupFM, the standard errors converge to 1.0 as $n$ and $T$ (especially as $T$) increase.

6 Conclusion

This paper develops an asymptotic theory for a panel cointegration model with unobservable global stochastic trends. The standard least squares estimator is, in general, inconsistent. In contrast, the proposed Cup estimator is shown to be consistent (at least $T$-consistent). In the absence of endogeneity, the Cup estimator is also $\sqrt{n}T$ consistent. Because we allow the regressors and the unobservable trends to be endogenous, an asymptotic bias exists for the Cup estimator. We further consider two bias-corrected estimators, CupBC and CupFM, and derive their rate of convergence and their limiting distributions. We show that these estimators are $\sqrt{n}T$ consistent and this holds in spite of endogeneity and in spite of spuriousness induced by unobservable I(1) common shocks.
A simulation study shows that the proposed CupBC and CupFM estimators have good finite sample properties.

Appendix

Throughout we use $(n,T)_{\mathrm{seq}} \to \infty$ to denote the sequential limit, i.e., $T \to \infty$ first and followed by $n \to \infty$. We use $MN(0, V)$ to denote a mixed normal distribution with variance $V$. Let $C$ be the $\sigma$-field generated by $\{F^0_t\}$. The first lemma assumes $u_i$ is uncorrelated with $(x_i, F^0)$ for every $i$. This assumption is relaxed in Lemma A.2.

Lemma A.1 Suppose that Assumptions 1-5 hold and that $u_i$ is uncorrelated with $(x_i, F^0)$, then as $(n,T)_{\mathrm{seq}} \to \infty$

(a) $$\frac{1}{n}\sum_{i=1}^{n}\frac{1}{T^2}x_i'M_{F^0}x_i \xrightarrow{p} \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}E\left(\int Q_iQ_i' \,\Big|\, C\right),$$

(b) $$\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\frac{1}{T}x_i'M_{F^0}u_i \xrightarrow{d} MN\left(0, \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\Omega_{ui}E\left(\int Q_iQ_i' \,\Big|\, C\right)\right).$$

Proof. Note that

$$\frac{1}{n}\sum_{i=1}^{n}\frac{1}{T^2}x_i'M_{F^0}x_i = \frac{1}{n}\sum_{i=1}^{n}\frac{1}{T^2}x_i'M_{F^0}M_{F^0}x_i = \frac{1}{n}\sum_{i=1}^{n}\frac{1}{T^2}\sum_{t=1}^{T}\tilde{x}_{it}\tilde{x}_{it}'$$

where $\tilde{x}_{it} = x_{it} - \delta_i'F^0_t$ and

$$\delta_i = \left(F^{0\prime}F^0\right)^{-1}F^{0\prime}x_i = \left(\frac{F^{0\prime}F^0}{T^2}\right)^{-1}\frac{1}{T^2}\sum_{t=1}^{T}F^0_t x_{it}' \xrightarrow{d} \left(\int B_\eta B_\eta'\right)^{-1}\int B_\eta B_{\varepsilon i}',$$

see, e.g., Phillips and Ouliaris (1990). Thus

$$\frac{\tilde{x}_{it}}{\sqrt{T}} = \frac{x_{it}}{\sqrt{T}} - \delta_i'\frac{F^0_t}{\sqrt{T}} \xrightarrow{d} B_{\varepsilon i} - \left[\left(\int B_\eta B_\eta'\right)^{-1}\int B_\eta B_{\varepsilon i}'\right]'B_\eta = Q_i.$$

By the continuous mapping theorem

$$\frac{1}{T^2}\sum_{t=1}^{T}\tilde{x}_{it}\tilde{x}_{it}' \xrightarrow{d} \int Q_iQ_i' = \zeta_{1i}$$

as $T \to \infty$. The variable $\zeta_{1i}$ is independent across $i$ conditional on $C$, which is an invariant $\sigma$-field. Thus conditioning on $C$, the law of large numbers for independent random variables gives

$$\frac{1}{n}\sum_{i=1}^{n}\zeta_{1i} \xrightarrow{p} \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}E\left(\zeta_{1i} \mid C\right) = \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}E\left(\int Q_iQ_i' \,\Big|\, C\right).$$

Thus, the sequential limit is

$$\frac{1}{n}\sum_{i=1}^{n}\frac{1}{T^2}x_i'M_{F^0}x_i \xrightarrow{p} \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}E\left(\int Q_iQ_i' \,\Big|\, C\right).$$

This proves part (a).

Consider (b). Rewrite

$$\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\frac{1}{T}x_i'M_{F^0}u_i = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\frac{1}{T}\sum_{t=1}^{T}\tilde{x}_{it}u_{it}$$

where $\tilde{x}_{it} = x_{it} - \delta_i'F^0_t$ as before.
By assumption, $u_{it}$ is I(0) and is uncorrelated with $\tilde{x}_{it}$. It follows that

$$\frac{1}{T}\sum_{t=1}^{T}\tilde{x}_{it}u_{it} \xrightarrow{d} \int Q_i\,dB_{ui} = \xi_{2i} \sim \Omega_{ui}^{1/2}\left(\int Q_iQ_i'\right)^{1/2} \times Z$$

where $Z \sim N(0, I_k)$ as $T \to \infty$ for a fixed $n$. The variable $\xi_{2i}$ is independent across $i$ conditional on $C$, which is an invariant $\sigma$-field. Thus conditioning on $C$,

$$\frac{1}{n}\sum_{i=1}^{n}\xi_{2i}\xi_{2i}' \xrightarrow{p} \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}E\left(\xi_{2i}\xi_{2i}' \mid C\right) = \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\Omega_{ui}\int E\left(Q_iQ_i' \mid C\right). \qquad (22)$$

Let $I_i$ be the $\sigma$-field generated by $\{F^0_t\}$ and $(\xi_{21}, \ldots, \xi_{2i})$. Then $\{\xi_{2i}, I_i;\ i \ge 1\}$ is a martingale difference sequence (MDS) because $\{\xi_{2i}\}$ are independent across $i$ conditional on $C$ and

$$E\left(\xi_{2i} \mid I_{i-1}\right) = E\left(\xi_{2i} \mid C\right) = 0.$$

From $\sum_{i=1}^{n}\xi_{2i}\xi_{2i}' = O_p(n)$, the conditional Lindeberg condition in Corollary 3.1 of Hall and Heyde (1980) can be written as

$$\frac{1}{n}\sum_{i=1}^{n}E\left(\xi_{2i}\xi_{2i}'\,1\left(\|\xi_{2i}\| > \sqrt{n}\delta\right) \mid I_{i-1}\right) \xrightarrow{p} 0 \qquad (23)$$

for all $\delta > 0$. To see (23), notice that

$$\frac{1}{n}\sum_{i=1}^{n}E\left(\xi_{2i}\xi_{2i}'\,1\left(\|\xi_{2i}\| > \sqrt{n}\delta\right) \mid I_{i-1}\right) = \frac{1}{n}\sum_{i=1}^{n}E\left(\xi_{2i}\xi_{2i}'\,1\left(\|\xi_{2i}\| > \sqrt{n}\delta\right) \mid C\right).$$

Without loss of generality we assume $\xi_{2i}$ is a scalar to save notation. By the Cauchy-Schwarz inequality

$$E\left(\xi_{2i}^2\,1\left(\|\xi_{2i}\| > \sqrt{n}\delta\right) \mid C\right) \le \left\{E\left(\xi_{2i}^4 \mid C\right)\right\}^{1/2}\left\{E\left(1\left(\|\xi_{2i}\| > \sqrt{n}\delta\right) \mid C\right)\right\}^{1/2}.$$

Furthermore,

$$E\left(1\left(\|\xi_{2i}\| > \sqrt{n}\delta\right) \mid C\right) \le \frac{E\left(\xi_{2i}^2 \mid C\right)}{n\delta^2}.$$

It follows that

$$\frac{1}{n}\sum_{i=1}^{n}E\left(\xi_{2i}^2\,1\left(\|\xi_{2i}\| > \sqrt{n}\delta\right) \mid C\right) \le \frac{1}{\sqrt{n}\delta}\left[\frac{1}{n}\sum_{i=1}^{n}\left\{E\left(\xi_{2i}^4 \mid C\right)E\left(\xi_{2i}^2 \mid C\right)\right\}^{1/2}\right] = O_p\left(n^{-1/2}\right)$$

in view of

$$\frac{1}{n}\sum_{i=1}^{n}\left[E\left(\xi_{2i}^4 \mid C\right)E\left(\xi_{2i}^2 \mid C\right)\right]^{1/2} = O_p(1).$$

This proves (23).
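As a brief worked step, the bound used to establish (23) can be assembled from the Cauchy-Schwarz inequality and the Chebyshev-type bound on the indicator stated in the argument for (23). The sketch below treats $\xi_{2i}$ as a scalar, as in the text, and introduces no new assumptions:

```latex
\begin{aligned}
E\left(\xi_{2i}^2\,1\left(|\xi_{2i}| > \sqrt{n}\delta\right) \mid C\right)
 &\le \left\{E\left(\xi_{2i}^4 \mid C\right)\right\}^{1/2}
      \left\{E\left(1\left(|\xi_{2i}| > \sqrt{n}\delta\right) \mid C\right)\right\}^{1/2} \\
 &\le \left\{E\left(\xi_{2i}^4 \mid C\right)\right\}^{1/2}
      \left\{\frac{E\left(\xi_{2i}^2 \mid C\right)}{n\delta^2}\right\}^{1/2} \\
 &= \frac{1}{\sqrt{n}\,\delta}
      \left\{E\left(\xi_{2i}^4 \mid C\right)E\left(\xi_{2i}^2 \mid C\right)\right\}^{1/2}.
\end{aligned}
```

Averaging this inequality over $i = 1, \ldots, n$ gives the $O_p(n^{-1/2})$ rate, provided the average of the bracketed terms is $O_p(1)$.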
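The central limit theorem for martingale difference sequences, e.g., Corollary 3.1 of Hall and Heyde (1980), implies that

$$\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\xi_{2i} \xrightarrow{d} \left[\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}E\left(\xi_{2i}\xi_{2i}' \mid C\right)\right]^{1/2} \times Z \qquad (24)$$

where $Z \sim N(0, I)$ and $Z$ is independent of $\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}E\left(\xi_{2i}\xi_{2i}' \mid C\right)$. Note that

$$\left[\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}E\left(\xi_{2i}\xi_{2i}' \mid C\right)\right]^{1/2} = \left(\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\Omega_{ui}E\left(\int Q_iQ_i' \,\Big|\, C\right)\right)^{1/2}.$$

Thus, as $(n,T)_{\mathrm{seq}} \to \infty$, we have

$$\frac{1}{\sqrt{n}}\frac{1}{T}\sum_{i=1}^{n}\sum_{t=1}^{T}\tilde{x}_{it}u_{it} \xrightarrow{d} \left(\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\Omega_{ui}E\left(\int Q_iQ_i' \,\Big|\, C\right)\right)^{1/2} \times Z$$

which is a mixed normal. The above can be rewritten as

$$\frac{1}{\sqrt{n}}\frac{1}{T}\sum_{i=1}^{n}\sum_{t=1}^{T}\tilde{x}_{it}u_{it} \xrightarrow{d} MN\left(0, \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\Omega_{ui}E\left(\int Q_iQ_i' \,\Big|\, C\right)\right).$$

This proves part (b).

The proofs for Propositions 1 and 2 (with observable $F$) follow immediately from Lemma A.1. Propositions 3 and 4 are proved in the supplementary appendix of Bai et al. (2006).

To derive the limiting distribution for $\hat{\beta}_{Cup}$, we need the following lemma. Hereafter, we define $\delta_{nT} = \min\{\sqrt{n}, T\}$.

Lemma A.2 Suppose Assumptions 1-5 hold. Let $Z_i = M_{F^0}x_i - \frac{1}{n}\sum_{k=1}^{n}M_{F^0}x_k a_{ik}$. Then as $(n,T)_{\mathrm{seq}} \to \infty$

(a) $$\frac{1}{nT^2}\sum_{i=1}^{n}Z_i'Z_i \xrightarrow{p} \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}E\left(\int R_{ni}R_{ni}' \,\Big|\, C\right),$$

(b) If $u_i$ is uncorrelated with $(x_i, F^0)$ for all $i$, then

$$\frac{1}{\sqrt{n}T}\sum_{i=1}^{n}Z_i'u_i \xrightarrow{d} MN\left(0, \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\Omega_{ui}E\left(\int R_{ni}R_{ni}' \,\Big|\, C\right)\right),$$

(c) If $u_i$ is possibly correlated with $(x_i, F^0)$, then

$$\frac{1}{\sqrt{n}T}\sum_{i=1}^{n}Z_i'u_i - \sqrt{n}\,\theta_n \xrightarrow{d} MN\left(0, \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\bar{\Omega}_{u.bi}E\left(\int R_{ni}R_{ni}' \,\Big|\, C\right)\right)$$

where

$$R_{ni} = Q_i - \frac{1}{n}\sum_{k=1}^{n}Q_k a_{ik}, \qquad a_{ik} = \lambda_i'\left(\Lambda'\Lambda/n\right)^{-1}\lambda_k, \qquad Q_i = B_{\varepsilon i} - \left(\int B_{\varepsilon i}B_\eta'\right)\left(\int B_\eta B_\eta'\right)^{-1}B_\eta,$$

$$\theta_n = \frac{1}{n}\sum_{i=1}^{n}\left[\frac{1}{T}Z_i'\left(\Delta\bar{x}_i \;\; \Delta F\right)\bar{\Omega}_{bi}^{-1}\bar{\Omega}_{bui} + \begin{pmatrix}I_k & -\bar{\delta}_i'\end{pmatrix}\begin{pmatrix}\bar{\Delta}^{+}_{\varepsilon ui}\\ \bar{\Delta}^{+}_{\eta u}\end{pmatrix}\right]$$

with $\bar{\delta}_i = \left(F^{0\prime}F^0\right)^{-1}F^{0\prime}\bar{x}_i$, and $\bar{x}_i = x_i - \frac{1}{n}\sum_{k=1}^{n}x_k a_{ik}$.

Proof of (a).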
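Recall

$$M_{F^0}x_i = x_i - F^0\left(F^{0\prime}F^0\right)^{-1}F^{0\prime}x_i = x_i - F^0\delta_i$$

where

$$\delta_i = \left(F^{0\prime}F^0\right)^{-1}F^{0\prime}x_i = \left(\frac{F^{0\prime}F^0}{T^2}\right)^{-1}\frac{1}{T^2}\sum_{t=1}^{T}F^0_t x_{it}' \xrightarrow{d} \left(\int B_\eta B_\eta'\right)^{-1}\int B_\eta B_{\varepsilon i}' = \pi_i$$

is an $r \times k$ matrix as $T \to \infty$. Write $\tilde{x}_i = M_{F^0}x_i = x_i - F^0\delta_i$, a $T \times k$ matrix. Hence

$$Z_i = M_{F^0}x_i - \frac{1}{n}\sum_{k=1}^{n}M_{F^0}x_k a_{ik} = \left(x_i - F^0\delta_i\right) - \frac{1}{n}\sum_{k=1}^{n}\left(x_k - F^0\delta_k\right)a_{ik} = \tilde{x}_i - \frac{1}{n}\sum_{k=1}^{n}\tilde{x}_k a_{ik}$$

where $a_{ik} = \lambda_i'\left(\Lambda'\Lambda/n\right)^{-1}\lambda_k$ is a scalar and

$$\frac{\tilde{x}_{it}}{\sqrt{T}} = \frac{x_{it}}{\sqrt{T}} - \delta_i'\frac{F^0_t}{\sqrt{T}} \xrightarrow{d} B_{\varepsilon i} - \left[\left(\int B_\eta B_\eta'\right)^{-1}\int B_\eta B_{\varepsilon i}'\right]'B_\eta = Q_i,$$

a $k \times 1$ vector, as $T \to \infty$. It follows that

$$\frac{Z_{it}}{\sqrt{T}} \xrightarrow{d} Q_i - \frac{1}{n}\sum_{k=1}^{n}Q_k a_{ik} = R_{ni}$$

and using similar steps as in part (a) of Lemma A.1, as $n \to \infty$,

$$\frac{1}{n}\sum_{i=1}^{n}\int R_{ni}R_{ni}' \xrightarrow{p} \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}E\left(\int R_{ni}R_{ni}' \,\Big|\, C\right).$$

Hence

$$\frac{1}{nT^2}\sum_{i=1}^{n}Z_i'Z_i \xrightarrow{p} \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}E\left(\int R_{ni}R_{ni}' \,\Big|\, C\right)$$

as $(n,T)_{\mathrm{seq}} \to \infty$, showing (a).

Proof of part (b). Notice that

$$\frac{1}{\sqrt{n}T}\sum_{i=1}^{n}Z_i'u_i = \frac{1}{\sqrt{n}T}\sum_{i=1}^{n}\left(M_{F^0}x_i - \frac{1}{n}\sum_{k=1}^{n}M_{F^0}x_k a_{ik}\right)'u_i = \frac{1}{\sqrt{n}T}\sum_{i=1}^{n}\left(M_{F^0}x_i\right)'u_i - \frac{1}{\sqrt{n}T}\sum_{i=1}^{n}\left(\frac{1}{n}\sum_{k=1}^{n}M_{F^0}x_k a_{ik}\right)'u_i = I_b + II_b.$$

The limit of $I_b$ is established in Lemma A.1: as $(n,T)_{\mathrm{seq}} \to \infty$,

$$I_b = \frac{1}{\sqrt{n}T}\sum_{i=1}^{n}\left(M_{F^0}x_i\right)'u_i \xrightarrow{d} MN\left(0, \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\Omega_{ui}E\left(\int Q_iQ_i' \,\Big|\, C\right)\right)$$

if $\tilde{x}_{it}$ and $u_{it}$ are uncorrelated. Similarly, for $II_b$, we have

$$\frac{1}{\sqrt{n}T}\sum_{i=1}^{n}\left(\frac{1}{n}\sum_{k=1}^{n}a_{ik}M_{F^0}x_k\right)'u_i \xrightarrow{d} MN\left(0, \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\Omega_{ui}E\left(C_{ni} \mid C\right)\right)$$

where $C_{ni} = \frac{1}{n}\sum_{k=1}^{n}a_{ik}\int Q_kQ_k'$ and we have used the fact that $\frac{1}{n^2}\sum_{k=1}^{n}\sum_{j=1}^{n}a_{ik}a_{ij} = \frac{1}{n}\sum_{k=1}^{n}a_{ik}$. Thus both $I_b$ and $II_b$ have a proper limiting distribution. These distributions are dependent since they depend on the same $u_i$. We can also derive their joint limiting distribution. Given the form of $Z_i$, it is easy to show that the above convergences imply part (b).

Proof of part (c).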
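Now suppose $\tilde{x}_{it}$ and $u_{it}$ are correlated. It is known that

$$\frac{1}{T}\sum_{t=1}^{T}\tilde{x}_{it}u_{it} = \frac{1}{T}\sum_{t=1}^{T}\left(x_{it} - \delta_i'F^0_t\right)u_{it} = \begin{pmatrix}I_k & -\delta_i'\end{pmatrix}\frac{1}{T}\sum_{t=1}^{T}\begin{pmatrix}x_{it}\\ F^0_t\end{pmatrix}u_{it}$$

$$\xrightarrow{d} \begin{pmatrix}I_k & -\pi_i'\end{pmatrix}\left\{\int\begin{pmatrix}B_{\varepsilon i}\,dB_{ui}\\ B_\eta\,dB_{ui}\end{pmatrix} + \begin{pmatrix}\Delta_{\varepsilon ui}\\ \Delta_{\eta u}\end{pmatrix}\right\} = \int Q_i\,dB_{ui} + \begin{pmatrix}I_k & -\pi_i'\end{pmatrix}\begin{pmatrix}\Delta_{\varepsilon ui}\\ \Delta_{\eta u}\end{pmatrix} \qquad (25)$$

as $T \to \infty$ (e.g., Phillips and Durlauf, 1986). First we note

$$\int Q_i\,dB_{ui} = \int Q_i\,d\left(\Omega_{u.bi}^{1/2}V_i + \Omega_{ubi}\Omega_{bi}^{-1/2}W_i\right) = \int Q_i\,dB_{u.bi} + \int Q_i\,dB_{bi}'\,\Omega_{bi}^{-1}\Omega_{bui}$$

such that

$$E\left(\int Q_i\,dV_i\right) = E\left[E\left(\int Q_i\,dV_i \,\Big|\, \pi_i\right)\right] = E\left[E\left(\int\left(B_{\varepsilon i} - \pi_i'B_\eta\right)dV_i \,\Big|\, \pi_i\right)\right] = 0.$$

Note that

$$\frac{1}{T}x_i'M_{F^0}\left(\Delta x_i \;\; \Delta F\right)\Omega_{bi}^{-1}\Omega_{bui} = \frac{1}{T}\tilde{x}_i'\left(\Delta x_i \;\; \Delta F\right)\Omega_{bi}^{-1}\Omega_{bui} = \begin{pmatrix}I_k & -\delta_i'\end{pmatrix}\frac{1}{T}\sum_{t=1}^{T}\begin{pmatrix}x_{it}\\ F^0_t\end{pmatrix}\begin{pmatrix}\Delta x_{it}' & \Delta F^{0\prime}_t\end{pmatrix}\Omega_{bi}^{-1}\Omega_{bui}$$

$$\xrightarrow{d} \begin{pmatrix}I_k & -\pi_i'\end{pmatrix}\left\{\int\begin{pmatrix}B_{\varepsilon i}\\ B_\eta\end{pmatrix}dB_{bi}'\,\Omega_{bi}^{-1}\Omega_{bui} + \Delta_{bi}\Omega_{bi}^{-1}\Omega_{bui}\right\}.$$

Therefore

$$\frac{1}{T}\tilde{x}_i'u_i - \left\{\frac{1}{T}x_i'M_{F^0}\left(\Delta x_i \;\; \Delta F\right)\Omega_{bi}^{-1}\Omega_{bui} + \begin{pmatrix}I_k & -\delta_i'\end{pmatrix}\left(\Delta_{bui} - \Delta_{bi}\Omega_{bi}^{-1}\Omega_{bui}\right)\right\}$$

$$= \frac{1}{T}\tilde{x}_i'u_i - \left\{\frac{1}{T}x_i'M_{F^0}\left(\Delta x_i \;\; \Delta F\right)\Omega_{bi}^{-1}\Omega_{bui} + \begin{pmatrix}I_k & -\delta_i'\end{pmatrix}\Delta^{+}_{bui}\right\} \qquad (26)$$

$$\xrightarrow{d} \Omega_{u.bi}^{1/2}\int Q_i\,dV_i \sim \Omega_{u.bi}^{1/2}\left(\int Q_iQ_i'\right)^{1/2} \times N(0, I_k)$$

where $\Delta^{+}_{bui} = \Delta_{bui} - \Delta_{bi}\Omega_{bi}^{-1}\Omega_{bui}$. Let

$$\theta^1_n = \frac{1}{n}\sum_{i=1}^{n}\left\{\frac{1}{T}x_i'M_{F^0}\left(\Delta x_i \;\; \Delta F\right)\Omega_{bi}^{-1}\Omega_{bui} + \begin{pmatrix}I_k & -\delta_i'\end{pmatrix}\Delta^{+}_{bui}\right\}.$$

Then we use similar steps as in part (b) of Lemma A.1 to get

$$\frac{1}{\sqrt{n}T}\sum_{i=1}^{n}\tilde{x}_i'u_i - \sqrt{n}\,\theta^1_n = \frac{1}{\sqrt{n}T}\sum_{i=1}^{n}\sum_{t=1}^{T}\tilde{x}_{it}u_{it} - \sqrt{n}\,\theta^1_n \xrightarrow{d} MN\left(0, \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\Omega_{u.bi}E\left(\int Q_iQ_i' \,\Big|\, C\right)\right)$$

as $(n,T)_{\mathrm{seq}} \to \infty$.

Note $Z_i = \tilde{x}_i - \frac{1}{n}\sum_{k=1}^{n}\tilde{x}_k a_{ik}$ is a demeaned $\tilde{x}_i$, where $\frac{1}{n}\sum_{k=1}^{n}\tilde{x}_k a_{ik}$ is the weighted average of the $\tilde{x}_k$ with weights $a_{ik}$. It follows that

$$Z_i = \tilde{x}_i - \frac{1}{n}\sum_{k=1}^{n}\tilde{x}_k a_{ik} = \left(x_i - F^0\delta_i\right) - \frac{1}{n}\sum_{k=1}^{n}\left(x_k - F^0\delta_k\right)a_{ik} = \left(x_i - \frac{1}{n}\sum_{k=1}^{n}x_k a_{ik}\right) - F^0\left(\delta_i - \frac{1}{n}\sum_{k=1}^{n}\delta_k a_{ik}\right) = \bar{x}_i - F^0\bar{\delta}_i$$

where $\bar{x}_i = x_i - \frac{1}{n}\sum_{k=1}^{n}x_k a_{ik}$ and $\bar{\delta}_i = \delta_i - \frac{1}{n}\sum_{k=1}^{n}\delta_k a_{ik}$. We then can modify (25) as

$$\frac{1}{T}\sum_{t=1}^{T}Z_{it}u_{it} = \frac{1}{T}\sum_{t=1}^{T}\left(\bar{x}_{it} - \bar{\delta}_i'F^0_t\right)u_{it} = \begin{pmatrix}I_k & -\bar{\delta}_i'\end{pmatrix}\frac{1}{T}\sum_{t=1}^{T}\begin{pmatrix}\bar{x}_{it}\\ F^0_t\end{pmatrix}u_{it}$$

$$\xrightarrow{d} \begin{pmatrix}I_k & -\bar{\pi}_i'\end{pmatrix}\left\{\int\begin{pmatrix}\bar{B}_{\varepsilon i}\,dB_{ui}\\ B_\eta\,dB_{ui}\end{pmatrix} + \begin{pmatrix}\bar{\Delta}_{\varepsilon ui}\\ \bar{\Delta}_{\eta u}\end{pmatrix}\right\} = \int R_{ni}\,dB_{ui} + \begin{pmatrix}I_k & -\bar{\pi}_i'\end{pmatrix}\begin{pmatrix}\bar{\Delta}_{\varepsilon ui}\\ \bar{\Delta}_{\eta u}\end{pmatrix} \qquad (27)$$

where $\bar{B}_{\varepsilon i} = B_{\varepsilon i} - \frac{1}{n}\sum_{k=1}^{n}B_{\varepsilon k}a_{ik}$ and

$$\bar{\delta}_i = \delta_i - \frac{1}{n}\sum_{k=1}^{n}\delta_k a_{ik} \xrightarrow{d} \left(\int B_\eta B_\eta'\right)^{-1}\int B_\eta\bar{B}_{\varepsilon i}' = \bar{\pi}_i.$$

The $R_{ni}$ term appears in the last line of (27) because

$$\bar{B}_{\varepsilon i} - \bar{\pi}_i'B_\eta = \left(B_{\varepsilon i} - \frac{1}{n}\sum_{k=1}^{n}B_{\varepsilon k}a_{ik}\right) - \left[\left(\int B_\eta B_\eta'\right)^{-1}\int B_\eta\left(B_{\varepsilon i} - \frac{1}{n}\sum_{k=1}^{n}B_{\varepsilon k}a_{ik}\right)'\right]'B_\eta$$

$$= B_{\varepsilon i} - \left[\left(\int B_\eta B_\eta'\right)^{-1}\int B_\eta B_{\varepsilon i}'\right]'B_\eta - \frac{1}{n}\sum_{k=1}^{n}\left\{B_{\varepsilon k} - \left[\left(\int B_\eta B_\eta'\right)^{-1}\int B_\eta B_{\varepsilon k}'\right]'B_\eta\right\}a_{ik} = Q_i - \frac{1}{n}\sum_{k=1}^{n}Q_k a_{ik} = R_{ni}.$$

Let

$$\theta_n = \frac{1}{n}\sum_{i=1}^{n}\left\{\frac{1}{T}Z_i'\left(\Delta\bar{x}_i \;\; \Delta F\right)\bar{\Omega}_{bi}^{-1}\bar{\Omega}_{bui} + \begin{pmatrix}I_k & -\bar{\delta}_i'\end{pmatrix}\bar{\Delta}^{+}_{bui}\right\}.$$

Clearly

$$\frac{1}{\sqrt{n}T}\sum_{i=1}^{n}Z_i'u_i - \sqrt{n}\,\theta_n = \frac{1}{\sqrt{n}T}\sum_{i=1}^{n}\left(\tilde{x}_i - \frac{1}{n}\sum_{k=1}^{n}\tilde{x}_k a_{ik}\right)'u_i - \sqrt{n}\,\theta_n \xrightarrow{d} MN\left(0, \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\bar{\Omega}_{u.bi}E\left(\int R_{ni}R_{ni}' \,\Big|\, C\right)\right)$$

as $(n,T)_{\mathrm{seq}} \to \infty$ with $R_{ni} = Q_i - \frac{1}{n}\sum_{k=1}^{n}Q_k a_{ik}$. This proves (c).

Proof of Theorem 1. This follows directly from Lemma A.2 as $(n,T) \to \infty$ when $\frac{n}{T} \to 0$:

$$\sqrt{n}T\left(\hat{\beta}_{Cup} - \beta\right) - \sqrt{n}\,\phi_{nT} \xrightarrow{d} MN\left(0, D_Z^{-1}\left[\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\bar{\Omega}_{u.bi}E\left(\int R_{ni}R_{ni}' \,\Big|\, C\right)\right]D_Z^{-1}\right)$$

where $D_Z = \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}E\left(\int R_{ni}R_{ni}' \mid C\right)$ and $\phi_{nT} = \left[\frac{1}{nT^2}\sum_{i=1}^{n}Z_i'Z_i\right]^{-1}\theta_n$.

Proof of Theorems 2 and 3. The proof of Theorem 2 is similar to that of Theorem 3 below and is thus omitted. To prove Theorem 3, we need some preliminary results. First we examine the limiting distribution of the infeasible FM estimator, $\tilde{\beta}_{CupFM}$.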
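The endogeneity correction is achieved by modifying the variable $y_{it}$ in (3) with the transformation

$$y^{+}_{it} = y_{it} - \bar{\Omega}_{ubi}\bar{\Omega}_{bi}^{-1}\begin{pmatrix}\Delta\bar{x}_{it}\\ \Delta F^0_t\end{pmatrix} \quad\text{and}\quad u^{+}_{it} = u_{it} - \bar{\Omega}_{ubi}\bar{\Omega}_{bi}^{-1}\begin{pmatrix}\Delta\bar{x}_{it}\\ \Delta F^0_t\end{pmatrix}.$$

By construction $u^{+}_{it}$ has zero long-run covariance with $\left(\Delta\bar{x}_{it}' \;\; \Delta F^{0\prime}_t\right)'$ and hence the endogeneity can be removed. The serial correlation correction term has the form

$$\bar{\Delta}^{+}_{bui} = \begin{pmatrix}\bar{\Delta}^{+}_{\varepsilon ui}\\ \bar{\Delta}^{+}_{\eta u}\end{pmatrix} = \begin{pmatrix}\bar{\Delta}_{bui} & \bar{\Delta}_{bi}\end{pmatrix}\begin{pmatrix}I_k\\ -\bar{\Omega}_{bi}^{-1}\bar{\Omega}_{bui}\end{pmatrix} = \bar{\Delta}_{bui} - \bar{\Delta}_{bi}\bar{\Omega}_{bi}^{-1}\bar{\Omega}_{bui},$$

where $\bar{\Delta}_{bui}$ denotes the one-sided long-run covariance between $u_{it}$ and $(\varepsilon_{it}, \eta_t)$. Therefore, the infeasible FM estimator is

$$\tilde{\beta}_{CupFM} = \left(\sum_{i=1}^{n}x_i'M_{F^0}x_i\right)^{-1}\sum_{i=1}^{n}\left(x_i'M_{F^0}y_i^{+} - T\left(\bar{\Delta}^{+}_{\varepsilon ui} - \bar{\delta}_i'\bar{\Delta}^{+}_{\eta u}\right)\right)$$

with $\bar{\delta}_i = \left(F^{0\prime}F^0\right)^{-1}F^{0\prime}\bar{x}_i$. The following lemma gives the limiting distribution of $\tilde{\beta}_{CupFM}$.

Lemma A.3 Suppose the assumptions in Theorem 1 hold. Then as $(n,T)_{\mathrm{seq}} \to \infty$

$$\sqrt{n}T\left(\tilde{\beta}_{CupFM} - \beta^0\right) \xrightarrow{d} MN\left(0, D_Z^{-1}\left[\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\bar{\Omega}_{u.bi}E\left(\int R_{ni}R_{ni}' \,\Big|\, C\right)\right]D_Z^{-1}\right).$$

Proof. Let $w^{+}_{it} = \left(u^{+}_{it} \;\; \varepsilon_{it}' \;\; \eta_t'\right)'$ and we have

$$\frac{1}{\sqrt{T}}\sum_{t=1}^{[Tr]}w^{+}_{it} \xrightarrow{d} \begin{pmatrix}B^{+}_{ui}\\ B_{\varepsilon i}\\ B_\eta\end{pmatrix} = \begin{pmatrix}B^{+}_{ui}\\ B_{bi}\end{pmatrix} = BM\left(\Omega^{+}_i\right) \quad\text{as } T \to \infty, \qquad (28)$$

where

$$B_{bi} = \begin{pmatrix}B_{\varepsilon i}\\ B_\eta\end{pmatrix}, \qquad \Omega_{u.bi} = \Omega_{ui} - \Omega_{ubi}\Omega_{bi}^{-1}\Omega_{bui},$$

$$\Omega^{+}_i = \begin{pmatrix}\Omega_{u.bi} & 0\\ 0 & \Omega_{bi}\end{pmatrix} = \begin{pmatrix}\Omega_{u.bi} & 0 & 0\\ 0 & \Omega_{\varepsilon i} & \Omega_{\varepsilon\eta i}\\ 0 & \Omega_{\eta\varepsilon i} & \Omega_\eta\end{pmatrix} = \Sigma^{+} + \Gamma^{+} + \Gamma^{+\prime}, \qquad \begin{pmatrix}B^{+}_{ui}\\ B_{bi}\end{pmatrix} = \begin{pmatrix}I & -\Omega_{ubi}\Omega_{bi}^{-1}\\ 0 & I\end{pmatrix}\begin{pmatrix}B_{ui}\\ B_{bi}\end{pmatrix}.$$

Define $\Delta^{+}_i = \Sigma^{+}_i + \Gamma^{+}_i$ and let

$$u^{+}_{1it} = u_{it} - \Omega_{ubi}\Omega_{bi}^{-1}\begin{pmatrix}\Delta x_{it}\\ \Delta F_t\end{pmatrix}.$$

First we notice from (26) in Lemma A.2 that

$$\zeta^{+}_{1iT} = \frac{1}{T}\sum_{t=1}^{T}\tilde{x}_{it}u^{+}_{1it} = \begin{pmatrix}I_k & -\delta_i'\end{pmatrix}\frac{1}{T}\sum_{t=1}^{T}\begin{pmatrix}x_{it}\\ F^0_t\end{pmatrix}u^{+}_{1it}$$

$$= \begin{pmatrix}I_k & -\delta_i'\end{pmatrix}\left[\frac{1}{T}\sum_{t=1}^{T}\begin{pmatrix}x_{it}\\ F^0_t\end{pmatrix}u_{it} - \frac{1}{T}\sum_{t=1}^{T}\begin{pmatrix}x_{it}\\ F^0_t\end{pmatrix}\Omega_{ubi}\Omega_{bi}^{-1}\begin{pmatrix}\Delta x_{it}\\ \Delta F^0_t\end{pmatrix}\right] \xrightarrow{d} \Omega_{u.bi}^{1/2}\int Q_i\,dV_i + \left(\Delta^{+}_{\varepsilon ui} - \pi_i'\Delta^{+}_{\eta u}\right) \qquad (29)$$

as $T \to \infty$. Now let $\zeta^{*}_{1iT} = \zeta^{+}_{1iT} - \left(\Delta^{+}_{\varepsilon ui} - \delta_i'\Delta^{+}_{\eta u}\right)$.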
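Clearly, $\zeta_{1iT}^{*} \xrightarrow{d} \Omega_{u.bi}^{1/2}\int Q_i\,dV_i$. Thus,

$$
\frac{1}{\sqrt{n}\,T}\sum_{i=1}^{n}\left(x_i' M_{F^0}u_{1i}^{+} - T\left(\Delta_{\varepsilon ui}^{+} - \delta_i'\Delta_{\eta u}^{+}\right)\right)
= \frac{1}{\sqrt{n}\,T}\sum_{i=1}^{n}\left(\sum_{t=1}^{T}\tilde{x}_{it}u_{1it}^{+} - T\left(\Delta_{\varepsilon ui}^{+} - \delta_i'\Delta_{\eta u}^{+}\right)\right)
\xrightarrow{d} MN\!\left(0,\ \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\Omega_{u.bi}\,E\!\left(\int Q_iQ_i' \,\Big|\, C\right)\right)
$$

as $(n,T)_{\text{seq}}\to\infty$.

Next, we modify (29):

$$
\begin{aligned}
\frac{1}{T}\sum_{t=1}^{T} Z_{it}u_{it}^{+}
&= \frac{1}{T}\sum_{t=1}^{T}\left(\bar{x}_{it} - \bar{\delta}_i'F_t^0\right)u_{it}^{+} \\
&= \begin{pmatrix} I_k & -\bar{\delta}_i'\end{pmatrix}\left[\frac{1}{T}\sum_{t=1}^{T}\begin{pmatrix}\bar{x}_{it}\\ F_t^0\end{pmatrix}u_{it} - \frac{1}{T}\sum_{t=1}^{T}\begin{pmatrix}\bar{x}_{it}\\ F_t^0\end{pmatrix}\bar{\Omega}_{ubi}\bar{\Omega}_{bi}^{-1}\begin{pmatrix}\Delta\bar{x}_{it}\\ \Delta F_t^0\end{pmatrix}\right] \\
&\xrightarrow{d} \begin{pmatrix} I_k & -\bar{\pi}_i'\end{pmatrix}\left\{\int\begin{pmatrix}\bar{B}_{\varepsilon i}\\ B_{\eta}\end{pmatrix}dB_{ui} + \begin{pmatrix}\bar{\Delta}_{\varepsilon ui}\\ \bar{\Delta}_{\eta u}\end{pmatrix} - \left(\int\begin{pmatrix}\bar{B}_{\varepsilon i}\\ B_{\eta}\end{pmatrix}dB_{bi}'\,\bar{\Omega}_{bi}^{-1}\bar{\Omega}_{bui} + \bar{\Delta}_{bi}\bar{\Omega}_{bi}^{-1}\bar{\Omega}_{bui}\right)\right\} \\
&= \int R_{ni}\,dB_{ui} + \begin{pmatrix} I_k & -\bar{\pi}_i'\end{pmatrix}\begin{pmatrix}\bar{\Delta}_{\varepsilon ui}\\ \bar{\Delta}_{\eta u}\end{pmatrix} - \left(\int R_{ni}\,dB_{bi}'\,\bar{\Omega}_{bi}^{-1}\bar{\Omega}_{bui} + \begin{pmatrix} I_k & -\bar{\pi}_i'\end{pmatrix}\begin{pmatrix}\bar{\Delta}_{\varepsilon i}\\ \bar{\Delta}_{\eta}\end{pmatrix}\bar{\Omega}_{bi}^{-1}\bar{\Omega}_{bui}\right) \\
&= \bar{\Omega}_{u.bi}^{1/2}\int R_{ni}\,dV_i + \left(\bar{\Delta}_{\varepsilon ui}^{+} - \bar{\pi}_i'\bar{\Delta}_{\eta u}^{+}\right).
\end{aligned}
$$

Therefore,

$$
\frac{1}{\sqrt{n}\,T}\sum_{i=1}^{n}\left(Z_i'u_i^{+} - T\left(\bar{\Delta}_{\varepsilon ui}^{+} - \bar{\delta}_i'\bar{\Delta}_{\eta u}^{+}\right)\right)
\xrightarrow{d} MN\!\left(0,\ \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\bar{\Omega}_{u.bi}\,E\!\left(\int R_{ni}R_{ni}' \,\Big|\, C\right)\right)
$$

as $(n,T)_{\text{seq}}\to\infty$. Then

$$
\sqrt{n}\,T\left(\tilde{\beta}_{CupFM} - \beta^0\right) \xrightarrow{d} MN\!\left(0,\ D_Z^{-1}\left[\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\bar{\Omega}_{u.bi}\,E\!\left(\int R_{ni}R_{ni}' \,\Big|\, C\right)\right]D_Z^{-1}\right)
$$

as $(n,T)_{\text{seq}}\to\infty$. This proves the lemma.

To show $\sqrt{n}\,T\left(\hat{\beta}_{CupFM} - \tilde{\beta}_{CupFM}\right) = o_p(1)$, we need the following lemma.

Lemma A.4 Under Assumptions 1-6, we have

(a) $\sqrt{n}\left(\hat{\Delta}_{\varepsilon un}^{+} - \Delta_{\varepsilon un}^{+}\right) = o_p(1)$,

(b) $\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left(\delta_i'\hat{\Delta}_{\eta u}^{+} - \delta_i'\Delta_{\eta u}^{+}\right) = o_p(1)$,

(c) $\frac{1}{\sqrt{n}\,T}\sum_{i=1}^{n}\left(x_i'M_{\hat{F}}\hat{u}_i^{+} - x_i'M_{F^0}u_i^{+}\right) = o_p(1)$,

where $\hat{u}_{it}^{+} = u_{it} - \hat{\Omega}_{ubi}\hat{\Omega}_{bi}^{-1}\begin{pmatrix}\Delta x_{it}\\ \Delta\hat{F}_t\end{pmatrix}$, $\hat{\Delta}_{\varepsilon un}^{+} = \frac{1}{n}\sum_{i=1}^{n}\hat{\Delta}_{\varepsilon ui}^{+}$ and $\Delta_{\varepsilon un}^{+} = \frac{1}{n}\sum_{i=1}^{n}\Delta_{\varepsilon ui}^{+}$.

Note that the lemma also holds when the long-run variances are replaced by their bar versions. Since the proofs are basically the same (as demonstrated in the proof of Theorem 1), the proof below focuses on the variances without the bar.

Proof.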
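First, note that

$$
\Delta_{bui}^{+} = \begin{pmatrix}\Delta_{\varepsilon ui}^{+}\\ \Delta_{\eta u}^{+}\end{pmatrix}
= \begin{pmatrix}\Delta_{bui} & \Delta_{bi}\end{pmatrix}\begin{pmatrix}1\\ -\Omega_{bi}^{-1}\Omega_{bui}\end{pmatrix}
= \Delta_{bui} - \Delta_{bi}\Omega_{bi}^{-1}\Omega_{bui}.
$$

Then $\Delta_{\varepsilon ui}^{+} = \Delta_{\varepsilon ui} - \Delta_{\varepsilon i}\Omega_{\varepsilon i}^{*-1}\Omega_{\varepsilon ui}$, where $\Omega_{\varepsilon i}^{*-1}$ is the first $k\times k$ block of $\Omega_{bi}^{-1}$. Following the arguments in the proofs of Theorems 9 and 10 of Hannan (1970) (see also the similar result of Moon and Perron (2004)), we have

$$
E\left\|\sqrt{n}\left(\hat{\Delta}_{\varepsilon un}^{+} - \Delta_{\varepsilon un}^{+}\right)\right\|^2
\le \sup_i E\left\|\hat{\Delta}_{\varepsilon ui}^{+} - E\hat{\Delta}_{\varepsilon ui}^{+}\right\|^2 + n\,\sup_i\left\|E\hat{\Delta}_{\varepsilon ui}^{+} - \Delta_{\varepsilon ui}^{+}\right\|^2
= O\!\left(\frac{K}{T}\right) + O\!\left(\frac{n}{K^{2q}}\right).
$$

It follows that

$$
\sqrt{n}\left(\hat{\Delta}_{\varepsilon un}^{+} - \Delta_{\varepsilon un}^{+}\right) = O_p\!\left(\max\left\{\sqrt{\frac{K}{T}},\ \sqrt{\frac{n}{K^{2q}}}\right\}\right).
$$

From Assumption 6, $K \sim n^{b}$. Then

$$
\frac{n}{K^{2q}} \sim \frac{n}{n^{2qb}} = n^{(1-2qb)} \to 0
$$

if $1 < 2qb$, that is, $\frac{1}{2q} < b$. Next,

$$
\frac{K}{T} \sim \frac{n^{b}}{T} = \exp\left(\log\frac{n^{b}}{T}\right) = \exp\left(\left(b - \frac{\log T}{\log n}\right)\log n\right) = n^{\,b - \frac{\log T}{\log n}} \le n^{\,b - \liminf\frac{\log T}{\log n}} \to 0
$$

if $b < \liminf\frac{\log T}{\log n}$, which holds by Assumption 6. Then

$$
\sqrt{n}\left(\hat{\Delta}_{\varepsilon un}^{+} - \Delta_{\varepsilon un}^{+}\right) = O_p\!\left(\max\left\{\sqrt{\frac{K}{T}},\ \sqrt{\frac{n}{K^{2q}}}\right\}\right) = o_p(1)
$$

as required. This proves (a).

To establish (b), we note

$$
\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left(\delta_i'\hat{\Delta}_{\eta u}^{+} - \delta_i'\Delta_{\eta u}^{+}\right)
= \left(\frac{1}{n}\sum_{i=1}^{n}\delta_i'\right)\sqrt{n}\left(\hat{\Delta}_{\eta u}^{+} - \Delta_{\eta u}^{+}\right)
= O_p(1)\,O_p\!\left(\max\left\{\sqrt{\frac{K}{T}},\ \sqrt{\frac{n}{K^{2q}}}\right\}\right) = o_p(1)
$$

as required for part (b).

Let $\tilde{u}_{it}^{+} = u_{it} - \hat{\Omega}_{ubi}\hat{\Omega}_{bi}^{-1}\begin{pmatrix}\Delta x_{it}\\ \Delta F_t\end{pmatrix}$. Next,

$$
\begin{aligned}
&\frac{1}{\sqrt{n}\,T}\sum_{i=1}^{n}\left(x_i'M_{\hat{F}}\hat{u}_i^{+} - x_i'M_{F^0}u_i^{+}\right) \\
&\quad= \frac{1}{\sqrt{n}\,T}\sum_{i=1}^{n}\left(x_i'M_{\hat{F}}\hat{u}_i^{+} - x_i'M_{\hat{F}}\tilde{u}_i^{+} + x_i'M_{\hat{F}}\tilde{u}_i^{+} - x_i'M_{\hat{F}}u_i^{+} + x_i'M_{\hat{F}}u_i^{+} - x_i'M_{F^0}u_i^{+}\right) \\
&\quad= \frac{1}{\sqrt{n}\,T}\sum_{i=1}^{n}\left(x_i'M_{\hat{F}}\tilde{u}_i^{+} - x_i'M_{\hat{F}}u_i^{+}\right) + \frac{1}{\sqrt{n}\,T}\sum_{i=1}^{n}\left(x_i'M_{\hat{F}}u_i^{+} - x_i'M_{F^0}u_i^{+}\right) + \frac{1}{\sqrt{n}\,T}\sum_{i=1}^{n}\left(x_i'M_{\hat{F}}\hat{u}_i^{+} - x_i'M_{\hat{F}}\tilde{u}_i^{+}\right) \\
&\quad= \frac{1}{\sqrt{n}\,T}\sum_{i=1}^{n}x_i'M_{\hat{F}}\left(\tilde{u}_i^{+} - u_i^{+}\right) + \frac{1}{\sqrt{n}\,T}\sum_{i=1}^{n}x_i'\left(M_{\hat{F}} - M_{F^0}\right)u_i^{+} + \frac{1}{\sqrt{n}\,T}\sum_{i=1}^{n}x_i'M_{\hat{F}}\left(\hat{u}_i^{+} - \tilde{u}_i^{+}\right) \\
&\quad= I + II + III.
\end{aligned}
$$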
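As a rough numerical companion to parts (a) and (b), the following Python sketch shows one way the kernel quantities and the two rate terms could be computed. It is an illustration under stated assumptions, not the authors' implementation: it assumes a Bartlett kernel (for which $q = 1$ is the usual characteristic exponent), the convention $\Omega = \Sigma + \Gamma + \Gamma'$ and $\Delta = \Sigma + \Gamma$ with $\Gamma = \sum_{j\ge 1} E(w_t w_{t+j}')$, and a bandwidth $K = \lfloor n^{b}\rfloor$. The names `long_run_cov` and `rate_terms`, and the simulated MA(1) input, are hypothetical.

```python
import numpy as np


def long_run_cov(w, K):
    """Bartlett-kernel estimates (Omega_hat, Delta_hat) from a T x m array w.

    Assumed convention: Omega = Sigma + Gamma + Gamma', Delta = Sigma + Gamma,
    where Gamma sums the autocovariances E[w_t w_{t+j}'] over j >= 1.
    """
    w = np.asarray(w, dtype=float)
    w = w - w.mean(axis=0)                       # demean each column
    T = w.shape[0]
    sigma = w.T @ w / T                          # lag-0 covariance
    omega = sigma.copy()
    delta = sigma.copy()
    for j in range(1, K + 1):
        weight = 1.0 - j / (K + 1.0)             # Bartlett weight
        gamma_j = w[:-j].T @ w[j:] / T           # estimate of E[w_t w_{t+j}']
        omega += weight * (gamma_j + gamma_j.T)  # two-sided long-run covariance
        delta += weight * gamma_j                # one-sided long-run covariance
    return omega, delta


def rate_terms(n, T, b, q=1.0):
    """Return K = floor(n^b) and the terms sqrt(K/T), sqrt(n / K^(2q))."""
    K = max(1, int(np.floor(n ** b)))
    return K, np.sqrt(K / T), np.sqrt(n / K ** (2 * q))


# Hypothetical example: a bivariate MA(1) series and a bandwidth with b = 0.75.
rng = np.random.default_rng(0)
e = rng.standard_normal((501, 2))
w = e[1:] + 0.5 * e[:-1]
K, var_term, bias_term = rate_terms(n=100, T=w.shape[0], b=0.75)
omega_hat, delta_hat = long_run_cov(w, K)
```

With these illustrative values, $b = 0.75$ satisfies $1/(2q) < b < \log T/\log n$, so both terms inside the maximum are below one. The asymptotic claim in the lemma is of course about the limit, not about any particular $(n, T)$.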
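From the proof of Proposition 4 in the supplementary appendix,

$$
II = \frac{1}{\sqrt{n}\,T}\sum_{i=1}^{n}x_i'\left(M_{\hat{F}} - M_{F^0}\right)u_i^{+} = o_p(1)
$$

if we replace $u_i$ by $u_i^{+}$. Let $\Delta b_i = \begin{pmatrix}\Delta x_i & \Delta F\end{pmatrix}$ be a $T\times(k+r)$ matrix. Consider $I$:

$$
\begin{aligned}
\frac{1}{\sqrt{n}\,T}\sum_{i=1}^{n}x_i'M_{\hat{F}}\left(\tilde{u}_i^{+} - u_i^{+}\right)
&= \frac{1}{\sqrt{n}\,T}\sum_{i=1}^{n}x_i'M_{\hat{F}}\left(u_i - \Delta b_i\hat{\Omega}_{bi}^{-1}\hat{\Omega}_{bui} - u_i + \Delta b_i\Omega_{bi}^{-1}\Omega_{bui}\right) \\
&= \frac{1}{\sqrt{n}\,T}\sum_{i=1}^{n}x_i'M_{\hat{F}}\,\Delta b_i\left(\Omega_{ubi}\Omega_{bi}^{-1} - \hat{\Omega}_{ubi}\hat{\Omega}_{bi}^{-1}\right) \\
&= \frac{1}{\sqrt{n}\,T}\sum_{i=1}^{n}x_i'\left(I_T - \frac{\hat{F}\hat{F}'}{T^2}\right)\Delta b_i\left(\Omega_{ubi}\Omega_{bi}^{-1} - \hat{\Omega}_{ubi}\hat{\Omega}_{bi}^{-1}\right) \\
&= \frac{1}{\sqrt{n}\,T}\sum_{i=1}^{n}x_i'\Delta b_i\left(\Omega_{ubi}\Omega_{bi}^{-1} - \hat{\Omega}_{ubi}\hat{\Omega}_{bi}^{-1}\right) - \frac{1}{\sqrt{n}\,T}\sum_{i=1}^{n}x_i'\frac{\hat{F}\hat{F}'}{T^2}\Delta b_i\left(\Omega_{ubi}\Omega_{bi}^{-1} - \hat{\Omega}_{ubi}\hat{\Omega}_{bi}^{-1}\right) \\
&= Ic + IIc.
\end{aligned}
$$

Along the same lines as the proofs of Theorems 9 and 10 of Hannan (1970), we can show that

$$
\sup_i E\left\|\hat{\Omega}_{ubi}\hat{\Omega}_{bi}^{-1} - \Omega_{ubi}\Omega_{bi}^{-1}\right\|^2 = O\!\left(\frac{K}{T}\right) + O\!\left(\frac{1}{K^{2q}}\right).
$$

Then we have

$$
\Omega_{ubi}\Omega_{bi}^{-1} - \hat{\Omega}_{ubi}\hat{\Omega}_{bi}^{-1} = O_p\!\left(\max\left\{\sqrt{\frac{K}{T}},\ \sqrt{\frac{1}{K^{2q}}}\right\}\right)
$$

and

$$
\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left\|\Omega_{ubi}\Omega_{bi}^{-1} - \hat{\Omega}_{ubi}\hat{\Omega}_{bi}^{-1}\right\|^2
= \sqrt{n}\,\frac{1}{n}\sum_{i=1}^{n}\left\|\Omega_{ubi}\Omega_{bi}^{-1} - \hat{\Omega}_{ubi}\hat{\Omega}_{bi}^{-1}\right\|^2
\le \sqrt{n}\,\sup_i\left\|\Omega_{ubi}\Omega_{bi}^{-1} - \hat{\Omega}_{ubi}\hat{\Omega}_{bi}^{-1}\right\|^2
= \sqrt{n}\left[O_p\!\left(\max\left\{\sqrt{\frac{K}{T}},\ \sqrt{\frac{1}{K^{2q}}}\right\}\right)\right]^2 .
$$

For $Ic$, by the Cauchy-Schwarz inequality,

$$
\begin{aligned}
\|Ic\| &= \left\|\frac{1}{\sqrt{n}\,T}\sum_{i=1}^{n}x_i'\Delta b_i\left(\Omega_{ubi}\Omega_{bi}^{-1} - \hat{\Omega}_{ubi}\hat{\Omega}_{bi}^{-1}\right)\right\| \\
&\le \left(\sqrt{n}\,\frac{1}{n}\sum_{i=1}^{n}\left\|\frac{x_i'\Delta b_i}{T}\right\|^2\right)^{1/2}\left(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left\|\Omega_{ubi}\Omega_{bi}^{-1} - \hat{\Omega}_{ubi}\hat{\Omega}_{bi}^{-1}\right\|^2\right)^{1/2} \\
&\le \left\{O_p\!\left(\sqrt{n}\right)\right\}^{1/2}\left(\sqrt{n}\right)^{1/2}O_p\!\left(\max\left\{\sqrt{\frac{K}{T}},\ \sqrt{\frac{1}{K^{2q}}}\right\}\right)
= O_p\!\left(\sqrt{n}\right)O_p\!\left(\max\left\{\sqrt{\frac{K}{T}},\ \sqrt{\frac{1}{K^{2q}}}\right\}\right).
\end{aligned}
$$

Similarly,

$$
\begin{aligned}
\|IIc\| &= \left\|\frac{1}{\sqrt{n}\,T}\sum_{i=1}^{n}x_i'\frac{\hat{F}\hat{F}'}{T^2}\Delta b_i\left(\Omega_{ubi}\Omega_{bi}^{-1} - \hat{\Omega}_{ubi}\hat{\Omega}_{bi}^{-1}\right)\right\|
= \left\|\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\frac{x_i'\hat{F}}{T^2}\frac{\hat{F}'\Delta b_i}{T}\left(\Omega_{ubi}\Omega_{bi}^{-1} - \hat{\Omega}_{ubi}\hat{\Omega}_{bi}^{-1}\right)\right\| \\
&\le \left(\sqrt{n}\,\frac{1}{n}\sum_{i=1}^{n}\left\|\frac{x_i'\hat{F}}{T^2}\frac{\hat{F}'\Delta b_i}{T}\right\|^2\right)^{1/2}\left(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left\|\Omega_{ubi}\Omega_{bi}^{-1} - \hat{\Omega}_{ubi}\hat{\Omega}_{bi}^{-1}\right\|^2\right)^{1/2}
= O_p\!\left(\sqrt{n}\right)O_p\!\left(\max\left\{\sqrt{\frac{K}{T}},\ \sqrt{\frac{1}{K^{2q}}}\right\}\right).
\end{aligned}
$$

Combining $Ic$ and $IIc$, we have

$$
\frac{1}{\sqrt{n}\,T}\sum_{i=1}^{n}x_i'M_{\hat{F}}\left(\tilde{u}_i^{+} - u_i^{+}\right)
= O_p\!\left(\sqrt{n}\right)O_p\!\left(\max\left\{\sqrt{\frac{K}{T}},\ \sqrt{\frac{1}{K^{2q}}}\right\}\right)
= O_p\!\left(\max\left\{\sqrt{\frac{nK}{T}},\ \sqrt{\frac{n}{K^{2q}}}\right\}\right).
$$

Recall $K \sim n^{b}$ and $\liminf\frac{\log T}{\log n} > 1$ from Assumption 6. It follows, as in Moon and Perron (2004), that

$$
\frac{nK}{T} \sim \frac{n^{b+1}}{T} = \exp\left(\log\frac{n^{b+1}}{T}\right) = \exp\left(\left(b + 1 - \frac{\log T}{\log n}\right)\log n\right) = n^{\,b+1-\frac{\log T}{\log n}} \le n^{\,b+1-\liminf\frac{\log T}{\log n}} \to 0
$$

by Assumption 6 and $b < \liminf\frac{\log T}{\log n} - 1$. Also note that

$$
\frac{n}{K^{2q}} \sim \frac{n}{n^{2qb}} = n^{(1-2qb)} \to 0
$$

by Assumption 6 and $\frac{1}{2q} < b$. Therefore

$$
\frac{1}{\sqrt{n}\,T}\sum_{i=1}^{n}x_i'M_{\hat{F}}\left(\tilde{u}_i^{+} - u_i^{+}\right) = O_p\!\left(\max\left\{\sqrt{\frac{nK}{T}},\ \sqrt{\frac{n}{K^{2q}}}\right\}\right) = o_p(1).
$$

Let $\Delta\hat{b}_i = \begin{pmatrix}\Delta x_i & \Delta\hat{F}\end{pmatrix}$. Note that

$$
\Delta b_i - \Delta\hat{b}_i = \begin{pmatrix}\Delta x_i & \Delta F\end{pmatrix} - \begin{pmatrix}\Delta x_i & \Delta\hat{F}\end{pmatrix} = \begin{pmatrix}0 & \Delta F - \Delta\hat{F}\end{pmatrix}.
$$

Consider $III$:

$$
\begin{aligned}
\frac{1}{\sqrt{n}\,T}\sum_{i=1}^{n}x_i'M_{\hat{F}}\left(\hat{u}_i^{+} - \tilde{u}_i^{+}\right)
&= \frac{1}{\sqrt{n}\,T}\sum_{i=1}^{n}x_i'M_{\hat{F}}\left(u_i - \Delta\hat{b}_i\hat{\Omega}_{bi}^{-1}\hat{\Omega}_{bui} - u_i + \Delta b_i\hat{\Omega}_{bi}^{-1}\hat{\Omega}_{bui}\right) \\
&= \frac{1}{\sqrt{n}\,T}\sum_{i=1}^{n}x_i'M_{\hat{F}}\left(\Delta b_i - \Delta\hat{b}_i\right)\hat{\Omega}_{bi}^{-1}\hat{\Omega}_{bui} \\
&= \frac{1}{\sqrt{n}\,T}\sum_{i=1}^{n}x_i'M_{\hat{F}}\left(\Delta F - \Delta\hat{F}\right)\hat{\Omega}_{bi}^{-1}\hat{\Omega}_{bui}.
\end{aligned}
$$

We use Lemma 12.3 in Bai (2005) to get

$$
\frac{1}{nT}\sum_{i=1}^{n}x_i'M_{\hat{F}}\left(\Delta F - \Delta\hat{F}\right) = O_p\!\left(\hat{\beta} - \beta^0\right) + O_p\!\left(\frac{1}{\min(n,T)}\right).
$$

It follows that

$$
\frac{1}{\sqrt{n}\,T}\sum_{i=1}^{n}x_i'M_{\hat{F}}\left(\Delta F - \Delta\hat{F}\right)
= \sqrt{n}\left[O_p\!\left(\hat{\beta} - \beta^0\right) + O_p\!\left(\frac{1}{\min(n,T)}\right)\right]
= \sqrt{n}\,O_p\!\left(\frac{1}{T}\right) + O_p\!\left(\frac{\sqrt{n}}{\min(n,T)}\right) = o_p(1)
$$

since $n/T \to 0$ as $(n,T)\to\infty$. Collecting $I$ to $III$, we prove (c).

Proposition A.1 Under Assumptions 1-6,

$$
\sqrt{n}\,T\left(\hat{\beta}_{CupFM} - \tilde{\beta}_{CupFM}\right) = o_p(1).
$$

Proof.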
To save notation, we only show the results with $x_i$ in place of $\bar{x}_i$ and $\delta_i$ in place of $\bar{\delta}_i$, since the steps are basically the same. In the supplementary appendix, it is shown (see the proof of Proposition 4) that

$$
\frac{1}{nT^2}\sum_{i=1}^{n}x_i'M_{\hat{F}}x_i = \frac{1}{nT^2}\sum_{i=1}^{n}x_i'M_{F^0}x_i + o_p(1).
$$

Then

$$
\begin{aligned}
\sqrt{n}\,T\left(\hat{\beta}_{CupFM} - \tilde{\beta}_{CupFM}\right)
&= \left(\frac{1}{nT^2}\sum_{i=1}^{n}x_i'M_{F^0}x_i\right)^{-1}\frac{1}{\sqrt{n}\,T}\left[\sum_{i=1}^{n}\left(x_i'M_{\hat{F}}\hat{u}_i^{+} - T\left(\hat{\Delta}_{\varepsilon ui}^{+} - \delta_i'\hat{\Delta}_{\eta u}^{+}\right)\right) - \sum_{i=1}^{n}\left(x_i'M_{F^0}u_i^{+} - T\left(\Delta_{\varepsilon ui}^{+} - \delta_i'\Delta_{\eta u}^{+}\right)\right)\right] + o_p(1) \\
&= \left(\frac{1}{nT^2}\sum_{i=1}^{n}x_i'M_{F^0}x_i\right)^{-1}\frac{1}{\sqrt{n}\,T}\left[\sum_{i=1}^{n}\left(x_i'M_{\hat{F}}\hat{u}_i^{+} - x_i'M_{F^0}u_i^{+}\right) - nT\left(\hat{\Delta}_{\varepsilon un}^{+} - \Delta_{\varepsilon un}^{+}\right) + T\sum_{i=1}^{n}\left(\delta_i'\hat{\Delta}_{\eta u}^{+} - \delta_i'\Delta_{\eta u}^{+}\right)\right] + o_p(1) \\
&= \left(\frac{1}{nT^2}\sum_{i=1}^{n}x_i'M_{F^0}x_i\right)^{-1}\left[\frac{1}{\sqrt{n}\,T}\sum_{i=1}^{n}\left(x_i'M_{\hat{F}}\hat{u}_i^{+} - x_i'M_{F^0}u_i^{+}\right) - \sqrt{n}\left(\hat{\Delta}_{\varepsilon un}^{+} - \Delta_{\varepsilon un}^{+}\right) + \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left(\delta_i'\hat{\Delta}_{\eta u}^{+} - \delta_i'\Delta_{\eta u}^{+}\right)\right] + o_p(1),
\end{aligned}
$$

where $\hat{\Delta}_{\varepsilon un}^{+} = \frac{1}{n}\sum_{i=1}^{n}\hat{\Delta}_{\varepsilon ui}^{+}$ and $\Delta_{\varepsilon un}^{+} = \frac{1}{n}\sum_{i=1}^{n}\Delta_{\varepsilon ui}^{+}$. Each of the three terms in brackets is $o_p(1)$ by Lemma A.4, so

$$
\sqrt{n}\,T\left(\hat{\beta}_{CupFM} - \tilde{\beta}_{CupFM}\right) = o_p(1).
$$

Proof of Theorem 3: This follows directly from Proposition A.1.

Proof of Proposition 5: In the supplementary appendix, it is shown that

$$
\frac{1}{T}\sum_{t=1}^{T}\left\|\hat{F}_t - HF_t^0\right\|^2 = T\,O_p\!\left(\left\|\hat{\beta} - \beta^0\right\|^2\right) + O_p\!\left(\frac{1}{n}\right) + O_p\!\left(\frac{1}{T^2}\right).
$$

From $\sqrt{n}\,T(\hat{\beta} - \beta^0) = O_p(1)$, the first term on the right-hand side is $O_p(1/(nT))$, which is dominated by $O_p(1/n)$.

References

[1] Andrews, D. W. K. (2005), "Cross-section Regression with Common Shocks," Econometrica, 73, 1551-1585.

[2] Bai, J. (2004), "Estimating Cross-section Common Stochastic Trends in Nonstationary Panel Data," Journal of Econometrics, 122, 137-183.

[3] Bai, J. (2005), "Panel Data Models with Interactive Fixed Effects," working paper.

[4] Bai, J., and Kao, C. (2006), "On the Estimation and Inference of a Panel Cointegration Model with Cross-Sectional Dependence," Contributions to Economic Analysis, edited by Badi Baltagi, Elsevier, 3-30.

[5] Bai, J., Kao, C., and Ng, S. (2006), "Panel Cointegration with Global Stochastic Trends: Supplementary Appendix."

[6] Bai, J., and Ng, S. (2002), "Determining the Number of Factors in Approximate Factor Models," Econometrica, 70, 191-221.

[7] Bai, J., and Ng, S. (2004), "A Panic Attack on Unit Roots and Cointegration," Econometrica, 72, 1127-1177.

[8] Baltagi, B. (2005), Econometric Analysis of Panel Data, New York, Wiley.

[9] Baltagi, B., and Kao, C. (2000), "Nonstationary Panels, Cointegration in Panels and Dynamic Panels: A Survey," Advances in Econometrics, 15, 7-51.

[10] Bernanke, B. S., and Boivin, J. (2003), "Monetary Policy in a Data-Rich Environment," Journal of Monetary Economics, 50, 525-546.

[11] Breitung, J., and Das, S. (2005), "Testing for Unit Roots in Panels with a Factor Structure," forthcoming in Econometric Theory.

[12] Breitung, J., and Pesaran, M. H. (2005), "Unit Roots and Cointegration in Panels," forthcoming in L. Matyas and P. Sevestre, The Econometrics of Panel Data (Third Edition), Kluwer Academic Publishers.

[13] Chang, Y. (2002), "Nonlinear IV Unit Root Tests in Panels with Cross-Sectional Dependence," Journal of Econometrics, 110, 261-292.

[14] Chang, Y. (2004), "Bootstrap Unit Root Tests in Panels with Cross-Sectional Dependence," Journal of Econometrics, 120, 263-293.

[15] Choi, I. (2006), "Combination Unit Root Tests for Cross-Sectionally Correlated Panels," Econometric Theory and Practice: Frontiers of Analysis and Applied Research: Essays in Honor of Peter C. B. Phillips, Cambridge, Cambridge University Press.

[16] Gengenbach, C., Palm, F. C., and Urbain, J.-P. (2005a), "Panel Unit Root Tests in the Presence of Cross-Sectional Dependencies: Comparison and Implications of Modelling," working paper.

[17] Gengenbach, C., Palm, F. C., and Urbain, J.-P. (2005b), "Panel Cointegration Testing in the Presence of Common Factors," working paper.

[18] Hall, P., and Heyde, C. C. (1980), Martingale Limit Theory and Its Applications, Academic Press, Inc., New York.

[19] Hannan, E. (1970), Multiple Time Series, New York, Wiley.

[20] Hansen, B. E. (1992), "Efficient Estimation and Testing of Cointegrating Vectors in the Presence of Deterministic Trends," Journal of Econometrics, 53, 87-121.

[21] Holly, S., Pesaran, M. H., and Yamagata, T. (2006), "A Spatio-Temporal Model of House Prices in the US," IZA Discussion Paper No. 2338, University of Cambridge.

[22] Hsiao, C. (2003), Analysis of Panel Data, Cambridge, Cambridge University Press.

[23] Kao, C. (1999), "Spurious Regression and Residual-Based Tests for Cointegration in Panel Data," Journal of Econometrics, 90, 1-44.

[24] Kapetanios, G., Pesaran, M. H., and Yamagata, T. (2006), "Panels with Nonstationary Multifactor Error Structures," Discussion Paper No. 2243, Queen Mary, University of London.

[25] Moon, H. R., and Perron, B. (2004), "Testing for a Unit Root in Panels with Dynamic Factors," Journal of Econometrics, 122, 81-126.

[26] Moon, H. R., and Phillips, P. C. B. (2000), "Estimation of Autoregressive Roots Near Unity Using Panel Data," Econometric Theory, 927-997.

[27] Moon, H. R., and Phillips, P. C. B. (2004), "GMM Estimation of Autoregressive Roots Near Unity with Panel Data," Econometric Theory, 927-997.

[28] Park, J. (1990), "Testing for Unit Roots and Cointegration By Variable Addition," Advances in Econometrics, ed. by Fomby and Rhodes, 107-133, JAI Press.

[29] Park, J., and Phillips, P. C. B. (1988), "Regressions with Integrated Regressors: Part I," Econometric Theory, 468-498.

[30] Pesaran, M. H. (2006), "Estimation and Inference in Large Heterogeneous Panels with a Multifactor Error Structure," Econometrica, 74, 967-1012.

[31] Pesaran, M. H., and Smith, R. (1995), "Estimation of Long-Run Relationships from Dynamic Heterogeneous Panels," Journal of Econometrics, 68, 79-114.

[32] Pesaran, M. H., and Yamagata, T. (2006), "Testing Slope Homogeneity in Large Panels," working paper.

[33] Phillips, P. C. B. (1995), "Fully-modified least squares and autoregression," Econometrica, 63, 1023-1079.

[34] Phillips, P. C. B., and Durlauf, S. (1986), "Multiple Time Series Regression with Integrated Processes," Review of Economic Studies, 53, 473-495.

[35] Phillips, P. C. B., and Hansen, B. E. (1990), "Statistical inference in instrumental variables regression with I(1) processes," Review of Economic Studies, 57, 99-125.

[36] Phillips, P. C. B., and Moon, H. (1999), "Linear Regression Limit Theory for Nonstationary Panel Data," Econometrica, 67, 1057-1111.

[37] Phillips, P. C. B., and Ouliaris, S. (1990), "Asymptotic Properties of Residual Based Tests for Cointegration," Econometrica, 58, 165-193.

[38] Phillips, P. C. B., and Sul, D. (2003), "Dynamic Panel Estimation and Homogeneity Testing Under Cross Section Dependence," Econometrics Journal, 6, 217-259.

[39] Stock, J. H., and Watson, M. W. (2002), "Forecasting Using Principal Components from a Large Number of Predictors," Journal of the American Statistical Association, 97, 1167-1179.

[40] Westerlund, J. (2006), "On the Estimation of Cointegrated Panel Regressions with Common Factors," working paper.

[41] Westerlund, J., and Edgerton, D. (2005), "Testing the Barro-Gordon Model in Breaking and Dependent Panels: Evidence from the OECD Countries," working paper.
Table 1: Mean Bias and Standard Deviation of Estimators

| | | $\sigma_{31}=0$ | | | | $\sigma_{31}=0.8$ | | | | $\sigma_{31}=-0.8$ | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | LSDV | 2sFM | CupBC | CupFM | LSDV | 2sFM | CupBC | CupFM | LSDV | 2sFM | CupBC | CupFM |
| $\sigma_{21}=0$ | n,T=20 | 1.352 | 0.349 | 0.030 | 0.030 | -0.712 | 0.257 | 0.000 | 0.000 | 2.216 | -0.086 | 0.030 | 0.030 |
| | | (1.559) | (0.387) | (0.030) | (0.029) | (1.505) | (0.372) | (0.030) | (0.029) | (1.524) | (0.394) | (0.029) | (0.029) |
| | n,T=40 | 3.371 | -0.719 | -0.000 | -0.000 | 2.761 | -0.246 | -0.000 | -0.000 | 1.010 | -0.371 | -0.000 | -0.000 |
| | | (1.139) | (0.225) | (0.009) | (0.009) | (1.529) | (0.227) | (0.010) | (0.009) | (1.124) | (0.217) | (0.009) | (0.009) |
| | n,T=60 | -2.006 | 0.094 | -0.000 | -0.000 | -1.393 | 0.038 | -0.000 | -0.000 | -1.073 | 0.199 | -0.000 | -0.000 |
| | | (0.920) | (0.138) | (0.005) | (0.005) | (0.915) | (0.139) | (0.005) | (0.005) | (0.929) | (0.138) | (0.005) | (0.005) |
| | n,T=120 | 0.204 | -0.064 | -0.000 | -0.000 | 0.548 | -0.062 | -0.020 | 0.015 | -0.163 | -0.061 | 0.018 | -0.000 |
| | | (0.645) | (0.056) | (0.018) | (0.002) | (0.646) | (0.056) | (0.002) | (0.002) | (0.643) | (0.056) | (0.002) | (0.002) |
| $\sigma_{21}=0.2$ | n,T=20 | 4.333 | 0.317 | -0.119 | 0.332 | 2.258 | 0.129 | -0.158 | 0.293 | 4.903 | -0.220 | -0.117 | 0.322 |
| | | (1.584) | (0.385) | (0.030) | (0.029) | (1.529) | (0.382) | (0.031) | (0.029) | (1.614) | (0.396) | (0.030) | (0.028) |
| | n,T=40 | 4.567 | -0.768 | -0.113 | 0.100 | 4.051 | -0.333 | -0.117 | 0.101 | 1.964 | -0.376 | -0.115 | 0.102 |
| | | (1.133) | (0.223) | (0.010) | (0.009) | (1.153) | (0.227) | (0.010) | (0.009) | (1.120) | (0.216) | (0.010) | (0.009) |
| | n,T=60 | -1.100 | 0.109 | -0.071 | 0.045 | -0.337 | 0.082 | -0.067 | 0.049 | 0.032 | 0.150 | -0.065 | 0.051 |
| | | (0.923) | (0.138) | (0.005) | (0.005) | (0.925) | (0.139) | (0.005) | (0.005) | (0.938) | (0.140) | (0.005) | (0.005) |
| | n,T=120 | 0.696 | -0.059 | 0.000 | 0.178 | 1.161 | -0.070 | -0.017 | 0.017 | 0.151 | -0.026 | 0.017 | -0.017 |
| | | (0.648) | (0.055) | (0.018) | (0.002) | (0.649) | (0.055) | (0.002) | (0.002) | (0.646) | (0.055) | (0.002) | (0.002) |
| $\sigma_{21}=-0.2$ | n,T=20 | -1.600 | 0.376 | 0.179 | -0.274 | -3.763 | 0.331 | 0.151 | -0.291 | -0.754 | -0.049 | 0.169 | -0.274 |
| | | (1.588) | (0.393) | (0.031) | (0.029) | (1.593) | (0.345) | (0.031) | (0.029) | (1.603) | (0.394) | (0.031) | (0.029) |
| | n,T=40 | 2.086 | -0.653 | 0.105 | -0.108 | 0.812 | -0.077 | 0.101 | -0.113 | -0.353 | -0.313 | 0.096 | -0.112 |
| | | (1.144) | (0.225) | (0.010) | (0.009) | (1.141) | (0.223) | (0.010) | (0.009) | (1.128) | (0.218) | (0.010) | (0.009) |
| | n,T=60 | -2.850 | 0.008 | 0.055 | -0.062 | -2.178 | -0.018 | 0.058 | -0.058 | -1.872 | 0.236 | 0.056 | -0.060 |
| | | (0.917) | (0.142) | (0.005) | (0.005) | (0.905) | (0.136) | (0.005) | (0.005) | (0.921) | (0.138) | (0.005) | (0.005) |
| | n,T=120 | -0.501 | 0.000 | 0.000 | 0.000 | -0.175 | -0.000 | -0.018 | 0.017 | -0.839 | 0.029 | 0.000 | -0.000 |
| | | (0.650) | (0.057) | (0.002) | (0.018) | (0.646) | (0.057) | (0.002) | (0.002) | (0.654) | (0.058) | (0.002) | (0.002) |

Note: (a) The mean biases here have been multiplied by 100. (b) $c = 5$, $\sigma_{32} = 0.4$. Standard deviations are in parentheses.

Table 2: Mean Bias and Standard Deviation of Estimators for Different n and T

| | $c=5$ | | | | $c=10$ | | | |
|---|---|---|---|---|---|---|---|---|
| (n,T) | LSDV | 2sFM | CupBC | CupFM | LSDV | 2sFM | CupBC | CupFM |
| (20, 20) | 2.258 | 0.129 | -0.158 | 0.293 | 1.538 | 0.275 | -0.158 | 0.294 |
| | (1.594) | (0.382) | (0.031) | (0.028) | (3.186) | (0.771) | (0.031) | (0.029) |
| (20, 40) | 4.832 | -0.426 | -0.067 | 0.107 | 8.141 | -0.006 | -0.067 | 0.106 |
| | (1.692) | (0.288) | (0.014) | (0.014) | (3.186) | (0.566) | (0.014) | (0.014) |
| (20, 60) | 0.460 | 0.282 | -0.019 | -0.058 | -0.105 | 0.0561 | -0.186 | 0.058 |
| | (1.560) | (0.206) | (0.009) | (0.009) | (3.121) | (0.412) | (0.009) | (0.009) |
| (20, 120) | 3.018 | 0.040 | 0.010 | 0.021 | -6.550 | 0.067 | 0.010 | 0.021 |
| | (1.572) | (0.123) | (0.005) | (0.005) | (3.144) | (0.245) | (0.005) | (0.004) |
| (40, 20) | 4.012 | -0.566 | -0.225 | 0.320 | 5.092 | -1.087 | -0.226 | 0.320 |
| | (1.126) | (0.280) | (0.0218) | (0.019) | (2.252) | (0.593) | (0.021) | (0.019) |
| (40, 40) | 4.051 | -0.332 | -0.117 | 0.101 | 6.616 | -0.622 | -0.117 | 0.101 |
| | (1.153) | (0.227) | (0.010) | (0.009) | (2.305) | (0.454) | (0.010) | (0.009) |
| (40, 60) | 1.818 | 0.114 | -0.055 | 0.051 | 2.628 | 0.248 | -0.055 | 0.051 |
| | (1.098) | (0.158) | (0.007) | (0.006) | (2.196) | (0.317) | (0.007) | (0.006) |
| (40, 120) | 1.905 | -0.090 | -0.010 | 0.015 | 3.303 | -0.178 | -0.010 | 0.015 |
| | (1.111) | (0.087) | (0.003) | (0.003) | (2.243) | (0.187) | (0.003) | (0.003) |
| (60, 20) | 3.934 | -0.317 | -0.294 | 0.295 | 4.989 | -0.544 | -0.294 | 0.295 |
| | (0.921) | (0.249) | (0.018) | (0.017) | (1.841) | (0.497) | (0.014) | (0.016) |
| (60, 40) | 2.023 | 0.110 | -0.125 | 0.108 | 2.573 | 0.267 | -0.125 | 0.109 |
| | (0.923) | (0.187) | (0.009) | (0.008) | (1.296) | (0.027) | (0.009) | (0.008) |
| (60, 60) | -0.337 | 0.082 | -0.067 | 0.049 | -1.666 | 0.191 | -0.067 | 0.049 |
| | (0.925) | (0.139) | (0.005) | (0.005) | (1.850) | (0.279) | (0.005) | (0.005) |
| (60, 120) | -1.168 | 0.109 | -0.015 | 0.015 | -2.839 | -0.223 | -0.014 | 0.015 |
| | (0.923) | (0.075) | (0.003) | (0.003) | (1.847) | (0.151) | (0.003) | (0.003) |
| (120, 20) | 2.548 | -0.151 | -0.304 | 0.294 | 2.236 | -0.203 | -0.304 | 0.294 |
| | (0.651) | (0.182) | (0.014) | (0.011) | (1.303) | (0.362) | (0.014) | (0.011) |
| (120, 40) | 1.579 | -0.026 | -0.013 | 0.001 | 1.678 | 0.000 | -0.133 | 0.112 |
| | (0.661) | (0.137) | (0.006) | (0.005) | (1.321) | (0.279) | (0.006) | (0.005) |
| (120, 60) | 0.764 | 0.004 | -0.077 | 0.013 | 0.539 | 0.061 | -0.077 | 0.048 |
| | (0.634) | (0.100) | (0.004) | (0.004) | (1.267) | (0.199) | (0.004) | (0.004) |
| (120, 120) | 1.161 | -0.070 | -0.017 | 0.017 | 1.823 | -0.134 | -0.017 | 0.018 |
| | (0.649) | (0.055) | (0.002) | (0.002) | (1.298) | (0.111) | (0.002) | (0.002) |

Note: (a) The mean biases here have been multiplied by 100. (b) $\sigma_{21} = 0.2$, $\sigma_{31} = 0.8$, and $\sigma_{32} = 0.4$. Standard deviations are in parentheses.

Table 3: Mean Bias and Standard Deviation of t-statistics

| | | $\sigma_{31}=0$ | | | | $\sigma_{31}=0.8$ | | | | $\sigma_{31}=-0.8$ | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | LSDV | 2sFM | CupBC | CupFM | LSDV | 2sFM | CupBC | CupFM | LSDV | 2sFM | CupBC | CupFM |
| $\sigma_{21}=0$ | n,T=20 | 0.036 | 0.006 | 0.016 | 0.016 | 0.006 | 0.0224 | 0.001 | 0.001 | 0.041 | -0.001 | 0.019 | 0.019 |
| | | (2.414) | (2.445) | (1.531) | (1.502) | (2.527) | (2.449) | (1.529) | (1.503) | (2.534) | (2.455) | (1.515) | (1.491) |
| | n,T=40 | 0.092 | -0.036 | -0.007 | -0.006 | 0.074 | -0.052 | -0.012 | -0.011 | 0.019 | 0.008 | -0.006 | -0.005 |
| | | (3.576) | (2.589) | (1.276) | (1.256) | (3.592) | (2.618) | (1.273) | (1.254) | (3.588) | (2.581) | (1.278) | (1.217) |
| | n,T=60 | -0.098 | 0.016 | -0.019 | -0.019 | -0.036 | -0.016 | -0.011 | -0.011 | -0.060 | 0.045 | -0.009 | -0.009 |
| | | (4.346) | (2.647) | (1.182) | (1.169) | (4.325) | (2.640) | (1.189) | (1.178) | (4.315) | (2.644) | (1.182) | (1.169) |
| | n,T=120 | 0.046 | -0.019 | -0.003 | -0.003 | 0.099 | -0.019 | -0.075 | 0.102 | -0.088 | -0.040 | 0.068 | -0.011 |
| | | (6.093) | (2.696) | (1.101) | (1.096) | (6.089) | (2.661) | (1.118) | (1.094) | (6.095) | (2.705) | (1.120) | (1.095) |
| $\sigma_{21}=0.2$ | n,T=20 | 0.104 | 0.040 | 0.001 | 0.185 | 0.070 | 0.037 | -0.013 | 0.188 | 0.105 | 0.033 | 0.004 | 0.181 |
| | | (2.508) | (2.454) | (1.558) | (1.497) | (2.529) | (2.453) | (1.561) | (1.442) | (2.539) | (2.465) | (1.543) | (1.483) |
| | n,T=40 | 0.149 | -0.013 | -0.081 | 0.140 | 0.134 | -0.022 | -0.085 | 0.142 | 0.059 | -0.003 | -0.081 | 0.143 |
| | | (3.563) | (2.597) | (1.304) | (1.252) | (3.578) | (2.639) | (1.307) | (1.252) | (3.578) | (2.612) | (1.314) | (1.258) |
| | n,T=60 | -0.032 | 0.039 | -0.100 | 0.115 | 0.027 | 0.013 | -0.094 | 0.123 | 0.011 | 0.038 | -0.087 | 0.127 |
| | | (4.357) | (2.651) | (1.209) | (1.167) | (4.357) | (2.647) | (1.215) | (1.174) | (4.325) | (2.646) | (1.204) | (1.162) |
| | n,T=120 | 0.049 | -0.016 | 0.003 | 0.002 | 0.097 | -0.019 | -0.059 | 0.114 | 0.012 | -0.029 | 0.062 | -0.109 |
| | | (6.060) | (2.640) | (1.096) | (1.092) | (6.084) | (2.645) | (1.115) | (1.093) | (6.043) | (2.635) | (1.111) | (1.089) |
| $\sigma_{21}=-0.2$ | n,T=20 | -0.031 | -0.013 | 0.029 | -0.155 | -0.064 | 0.005 | 0.125 | -0.166 | -0.031 | -0.029 | 0.027 | -0.152 |
| | | (2.519) | (2.456) | (1.559) | (1.497) | (2.528) | (2.439) | (1.556) | (1.498) | (2.538) | (2.458) | (1.556) | (1.498) |
| | n,T=40 | 0.033 | -0.068 | 0.067 | -0.153 | -0.005 | -0.071 | 0.061 | -0.162 | -0.035 | -0.021 | 0.058 | -0.159 |
| | | (3.586) | (2.593) | (1.312) | (1.255) | (3.597) | (2.618) | (1.305) | (1.248) | (3.588) | (2.574) | (1.305) | (1.252) |
| | n,T=60 | -0.162 | 0.002 | 0.062 | -0.154 | -0.093 | -0.035 | 0.067 | -0.146 | -0.114 | 0.028 | 0.067 | -0.147 |
| | | (4.335) | (2.657) | (1.212) | (1.169) | (4.283) | (2.633) | (1.210) | (1.168) | (4.308) | (2.643) | (1.206) | (1.166) |
| | n,T=120 | -0.066 | 0.001 | 0.007 | 0.007 | -0.010 | 0.022 | -0.062 | 0.117 | -0.111 | -0.004 | 0.077 | -0.104 |
| | | (6.098) | (2.679) | (1.106) | (1.106) | (6.152) | (2.577) | (1.116) | (1.092) | (6.119) | (2.691) | (1.125) | (1.101) |

Note: (a) $c = 5$, $\sigma_{32} = 0.4$. Standard deviations are in parentheses.

Table 4: Mean Bias and Standard Deviation of t-statistics for Different n and T

| | $c=5$ | | | | $c=10$ | | | |
|---|---|---|---|---|---|---|---|---|
| (n,T) | LSDV | 2sFM | CupBC | CupFM | LSDV | 2sFM | CupBC | CupFM |
| (20, 20) | 0.070 | 0.037 | -0.013 | 0.169 | 0.036 | 0.030 | -0.013 | 0.169 |
| | (2.529) | (2.453) | (1.561) | (1.497) | (2.532) | (2.562) | (1.560) | (1.496) |
| (20, 40) | 0.130 | -0.007 | -0.009 | 0.110 | 0.106 | -0.011 | -0.009 | 0.110 |
| | (3.539) | (1.863) | (1.313) | (1.286) | (3.541) | (1.896) | (1.313) | (1.286) |
| (20, 60) | 0.029 | 0.009 | 0.015 | 0.085 | 0.009 | 0.003 | 0.016 | 0.085 |
| | (4.303) | (1.553) | (1.253) | (1.239) | (4.305) | (1.569) | (1.253) | (1.239) |
| (20, 120) | -0.090 | 0.015 | 0.057 | 0.064 | -0.105 | 0.013 | 0.057 | 0.064 |
| | (6.131) | (1.222) | (1.156) | (1.151) | (6.132) | (1.220) | (1.156) | (1.151) |
| (40, 20) | 0.119 | -0.015 | -0.086 | 0.242 | 0.073 | -0.019 | -0.086 | 0.241 |
| | (2.518) | (3.376) | (1.549) | (1.443) | (2.520) | (3.610) | (1.549) | (1.443) |
| (40, 40) | 0.134 | -0.022 | -0.085 | 0.142 | 0.100 | -0.026 | -0.085 | 0.142 |
| | (3.578) | (2.639) | (1.307) | (1.252) | (3.580) | (2.739) | (1.307) | (1.252) |
| (40, 60) | 0.113 | 0.012 | -0.048 | 0.109 | 0.085 | 0.008 | -0.047 | 0.109 |
| | (4.328) | (2.164) | (1.209) | (1.177) | (4.329) | (2.222) | (1.209) | (1.176) |
| (40, 120) | 0.133 | -0.014 | -0.007 | 0.059 | 0.113 | -0.019 | -0.007 | 0.059 |
| | (6.097) | (1.519) | (1.131) | (1.123) | (6.098) | (1.535) | (1.131) | (1.123) |
| (60, 20) | 0.123 | 0.005 | -0.161 | 0.276 | 0.067 | -0.002 | -0.160 | 0.276 |
| | (2.521) | (4.042) | (1.579) | (1.424) | (2.524) | (4.409) | (1.579) | (1.425) |
| (60, 40) | 0.100 | 0.069 | -0.109 | 0.192 | 0.059 | 0.065 | -0.109 | 0.192 |
| | (3.532) | (3.206) | (1.352) | (1.272) | (3.534) | (3.375) | (1.352) | (1.272) |
| (60, 60) | 0.027 | 0.013 | -0.094 | 0.123 | -0.006 | 0.010 | -0.094 | 0.122 |
| | (4.426) | (2.613) | (1.215) | (1.174) | (4.359) | (2.751) | (1.215) | (1.174) |
| (60, 120) | -0.020 | 0.031 | -0.024 | 0.077 | -0.044 | 0.030 | -0.025 | 0.077 |
| | (6.131) | (1.866) | (1.118) | (1.104) | (6.132) | (1.902) | (1.118) | (1.104) |
| (120, 20) | 0.139 | 0.044 | -0.243 | 0.386 | 0.060 | 0.063 | -0.243 | 0.386 |
| | (2.478) | (5.269) | (1.681) | (1.404) | (2.479) | (5.969) | (1.681) | (1.404) |
| (120, 40) | 0.135 | 0.037 | -0.186 | 0.268 | 0.078 | 0.040 | -0.186 | 0.268 |
| | (3.588) | (4.369) | (1.366) | (1.233) | (3.589) | (4.706) | (1.366) | (1.233) |
| (120, 60) | 0.099 | 0.011 | -0.162 | 0.174 | 0.052 | 0.004 | -0.162 | 0.174 |
| | (4.272) | (3.683) | (1.249) | (1.166) | (4.273) | (3.902) | (1.249) | (1.167) |
| (120, 120) | 0.097 | -0.189 | -0.589 | 0.114 | 0.063 | -0.027 | -0.059 | 0.114 |
| | (6.084) | (2.645) | (1.115) | (1.093) | (6.086) | (2.741) | (1.115) | (1.093) |

Note: (a) $\sigma_{21} = 0.2$, $\sigma_{31} = 0.8$, and $\sigma_{32} = 0.4$. Standard deviations are in parentheses.
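For readers who want to reproduce summaries of this kind, the Python sketch below shows one plausible way a pair of table entries could be computed from Monte Carlo replications. It is an assumption-laden illustration, not the authors' code: the function names `summarize_estimates` and `summarize_t_stats` are hypothetical, the sample standard deviation uses `ddof=1` (the text does not say which divisor was used), and the "mean bias" of a t-statistic is read here as its Monte Carlo mean, since its nominal mean is zero. The three replication draws are made-up numbers used only to exercise the functions.

```python
import numpy as np


def summarize_estimates(beta_hat, beta0):
    """Mean bias (multiplied by 100, as in the table notes) and standard deviation."""
    beta_hat = np.asarray(beta_hat, dtype=float)
    mean_bias_x100 = 100.0 * (beta_hat.mean() - beta0)
    std_dev = beta_hat.std(ddof=1)   # assumed divisor: R - 1
    return mean_bias_x100, std_dev


def summarize_t_stats(beta_hat, std_err, beta0):
    """Monte Carlo mean and standard deviation of t = (beta_hat - beta0) / se."""
    beta_hat = np.asarray(beta_hat, dtype=float)
    std_err = np.asarray(std_err, dtype=float)
    t_stats = (beta_hat - beta0) / std_err
    return t_stats.mean(), t_stats.std(ddof=1)


# Hypothetical replication output: three slope estimates with a true value of 2.
draws = np.array([2.00, 2.02, 1.98])
std_errs = np.array([0.02, 0.02, 0.02])
bias_x100, sd = summarize_estimates(draws, beta0=2.0)
t_mean, t_sd = summarize_t_stats(draws, std_errs, beta0=2.0)
```

Under correct asymptotic normality the t-statistics should have mean near 0 and standard deviation near 1, which is the benchmark against which the entries of Tables 3 and 4 can be read.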
