Improved maximum likelihood estimators in a heteroskedastic errors-in-variables model


Authors: Alexandre G. Patriota, Artur J. Lemonte, Heleno Bolfarine

Statistical Papers manuscript No. (will be inserted by the editor)

Improved maximum likelihood estimators in a heteroskedastic errors-in-variables model

Alexandre G. Patriota · Artur J. Lemonte · Heleno Bolfarine

Received: date / Accepted: date

Abstract This paper develops a bias correction scheme for a multivariate heteroskedastic errors-in-variables model. The applicability of this model is justified in areas such as astrophysics, epidemiology and analytical chemistry, where the variables are subject to measurement errors and the variances vary with the observations. We conduct Monte Carlo simulations to investigate the performance of the corrected estimators. The numerical results show that the bias correction scheme yields nearly unbiased estimates. We also give an application to a real data set.

Keywords Bias correction · errors-in-variables model · maximum-likelihood estimation · heteroskedastic model

1 Introduction

Heteroskedastic errors-in-variables (or measurement error) models have been extensively studied in the statistical literature and widely applied in astrophysics (to explain relationships between black hole masses and some variates of luminosities), epidemiology (to model cardiovascular events with their risk factors) and analytical chemistry (to compare different types of measurement instruments). Applications of this model abound mainly in the astronomy literature, where all quantities are subject to measurement errors (Akritas and Bershady, 1996).

It is well known that, when the measurement errors are ignored in the estimation process, the maximum-likelihood estimators (MLEs) become inconsistent. More specifically, the estimate of the slope parameter of a simple linear model is attenuated (Fuller, 1987).
When variables are subject to measurement errors, a special inference treatment must be carried out for the model parameters in order to avoid inconsistent estimators. Usually, a measurement equation is added to the model to capture the measurement error effect; the MLEs from this approach are then consistent, efficient and asymptotically normally distributed. A careful and deep exposition of the inferential process in errors-in-variables models can be found in Fuller (1987) and the references therein.

A. G. Patriota · A. J. Lemonte · H. Bolfarine
Departamento de Estatística, Universidade de São Paulo, Rua do Matão, 1010, São Paulo/SP, 05508-090, Brazil
Fax: +55 11 38144135
E-mail: patriota.alexandre@gmail.com

Although consistent, asymptotically efficient and asymptotically normally distributed, the MLEs are oftentimes biased, and point inference can be misleading. This is not a serious problem for relatively large sample sizes, since the bias is typically of order $O(n^{-1})$, while the asymptotic standard errors are of order $O(n^{-1/2})$. However, for small or even moderate values of the sample size $n$, bias can constitute a problem. Bias adjustment has been extensively studied in the statistical literature; see, for example, Cook et al. (1986), Cordeiro (1993), Cordeiro and Vasconcellos (1997), Vasconcellos and Cordeiro (1997) and, more recently, Cordeiro (2008). Additionally, Patriota and Lemonte (2009) obtained general matrix formulae for the second-order biases of the maximum-likelihood estimators in a very general model which includes all the aforementioned works. The model presented by those authors allows the mean vector and the variance-covariance matrix of the observed variable to have parameters in common. This approach includes the heteroskedastic measurement error model that we study in this paper.
The main goal of this article is to define bias-corrected estimators using the general second-order bias expression derived in Patriota and Lemonte (2009), assuming that the model defined by (1) and (2) holds. Additionally, we compare the performance of the bias-corrected estimators with the MLEs in small samples via Monte Carlo simulations. The numerical results show that the bias correction is effective in small samples and leads to estimates that are nearly unbiased and display superior finite-sample behavior.

The rest of the paper is as follows. Section 2 presents the multivariate heteroskedastic errors-in-variables model. Using general results from Patriota and Lemonte (2009), we derive in Section 3 the second-order biases of the MLEs of the parameters. The result is used to define bias-corrected estimates. In Section 4 the $O(n^{-1})$ biases of the estimates $\hat{\mu}_i$ and $\hat{\Sigma}_i$ are given. Monte Carlo simulation results are presented and discussed in Section 5. Section 6 gives an application. Finally, concluding remarks are offered in Section 7.

2 The model

The multivariate model assumed throughout this paper is

$$y_i = \beta_0 + \beta_1 x_i + q_i, \qquad i = 1, \ldots, n, \qquad (1)$$

where $y_i$ is a $(v \times 1)$ latent response vector, $x_i$ is an $(m \times 1)$ latent vector of covariates, $\beta_0$ is a $(v \times 1)$ vector of intercepts, $\beta_1$ is a $(v \times m)$ matrix whose elements are slopes, and $q_i$ is the equation error, which has a multivariate normal distribution with mean zero and variance-covariance matrix $\Sigma_q$. The variables $y_i$ and $x_i$ are not directly observed; instead, surrogate variables $Y_i$ and $X_i$ are measured with the following additive structure:

$$Y_i = y_i + \eta_{y_i} \quad \text{and} \quad X_i = x_i + \eta_{x_i}. \qquad (2)$$

The errors $\eta_{y_i}$ and $\eta_{x_i}$ are assumed to follow a normal distribution given by

$$\begin{pmatrix} \eta_{y_i} \\ \eta_{x_i} \end{pmatrix} \overset{ind}{\sim} N_{v+m}\!\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \tau_{y_i} & 0 \\ 0 & \tau_{x_i} \end{pmatrix} \right),$$

where "$\overset{ind}{\sim}$" means "independently distributed as" and the variance-covariance matrices $\tau_{y_i}$ and $\tau_{x_i}$ are assumed to be known for all $i = 1, \ldots, n$. These matrices may be attained, for example, through an analytical treatment of the data collection mechanism, replications, machine precision, etc.

Model (2) has equation errors for all lines, i.e., $y_i$ and $x_i$ are not perfectly related. These equation errors are justified by the influence of factors other than $x_i$ on the variation of $y_i$. It is very reasonable to include equation errors in (1) to capture extra variability: since the variances $\tau_{y_i}$ are fixed, if some other factor affects the variation of $y_i$, the estimation of the line parameters will clearly be affected.

Supposing that $x_i \overset{iid}{\sim} N_m(\mu_x, \Sigma_x)$, where "$\overset{iid}{\sim}$" means "independent and identically distributed as", and considering that the model errors ($q_i$, $\eta_{y_i}$ and $\eta_{x_i}$) and $x_i$ are independent, we have that the joint distribution of the observed variables can be expressed as

$$\begin{pmatrix} Y_i \\ X_i \end{pmatrix} \overset{ind}{\sim} N_{v+m}\!\left[ \begin{pmatrix} \beta_0 + \beta_1\mu_x \\ \mu_x \end{pmatrix}, \begin{pmatrix} \beta_1\Sigma_x\beta_1^\top + \Sigma_q + \tau_{y_i} & \beta_1\Sigma_x \\ \Sigma_x\beta_1^\top & \Sigma_x + \tau_{x_i} \end{pmatrix} \right]. \qquad (3)$$

Note that in (3) the mean vector and the variance-covariance matrix of the observed variables have the matrix $\beta_1$ in common, i.e., they share $mv$ parameters. Kulathinal et al. (2002) study the univariate case (when $v = 1$ and $m = 1$) and propose an EM (Expectation and Maximization) algorithm to obtain MLEs for the model parameters. In addition, they derived the asymptotic variance of the MLE of the slope parameter, making it possible to build hypothesis tests for it. Also, de Castro et al.
(2008) derive the observed and expected Fisher information and conduct some simulation studies to investigate the behavior of the likelihood ratio, score, Wald and $C(\alpha)$ statistics for testing hypotheses on the parameters, and Patriota et al. (2009) study the asymptotic properties of method-of-moments estimators in the univariate model proposed by Kulathinal et al. (2002). Model (2) is a multivariate version of the model proposed by Kulathinal et al. (2002).

3 Second-order bias of $\hat{\theta}$

In order to follow the same scheme adopted by Patriota and Lemonte (2009), define the vector of parameters

$$\theta = (\beta_0^\top, \operatorname{vec}(\beta_1)^\top, \mu_x^\top, \operatorname{vech}(\Sigma_x)^\top, \operatorname{vech}(\Sigma_q)^\top)^\top,$$

where $\operatorname{vec}(\cdot)$ is the vec operator, which transforms a matrix into a vector by stacking the columns of the matrix, and $\operatorname{vech}(\cdot)$ is the vech operator, which transforms a symmetric matrix into a vector by stacking the on- or above-diagonal elements. Also, consider $Z_i = (Y_i^\top, X_i^\top)^\top$ and the mean and variance-covariance functions

$$\mu_i(\theta) = \begin{pmatrix} \beta_0 + \beta_1\mu_x \\ \mu_x \end{pmatrix} \quad \text{and} \quad \Sigma_i(\theta) = \begin{pmatrix} \beta_1\Sigma_x\beta_1^\top + \Sigma_q + \tau_{y_i} & \beta_1\Sigma_x \\ \Sigma_x\beta_1^\top & \Sigma_x + \tau_{x_i} \end{pmatrix},$$

respectively. Moreover, to simplify the notation, define the quantities $Z = \operatorname{vec}(Z_1, \ldots, Z_n)$, $\mu = \operatorname{vec}(\mu_1(\theta), \ldots, \mu_n(\theta))$, $\Sigma = \operatorname{block\!-\!diag}\{\Sigma_1(\theta), \ldots, \Sigma_n(\theta)\}$ and $u = Z - \mu$. The log-likelihood function for the parameter vector $\theta$ from a random sample, except for constants, can be expressed as

$$\ell(\theta) = -\tfrac{1}{2}\log|\Sigma| - \tfrac{1}{2}\operatorname{tr}\{\Sigma^{-1}uu^\top\}. \qquad (4)$$

Additionally, for the purpose of computing the score function, the Fisher information and the second-order biases, also define

$$a_r = \frac{\partial\mu}{\partial\theta_r}, \quad a_{sr} = \frac{\partial^2\mu}{\partial\theta_s\,\partial\theta_r}, \quad C_r = \frac{\partial\Sigma}{\partial\theta_r}, \quad C_{sr} = \frac{\partial C_r}{\partial\theta_s}, \quad A_r = -\Sigma^{-1}C_r\Sigma^{-1}$$

and

$$F^{(r)}_{\beta_0} = \frac{\partial\beta_0}{\partial\theta_r}, \quad F^{(s)}_{\beta_1} = \frac{\partial\beta_1}{\partial\theta_s}, \quad F^{(s)}_{\mu_x} = \frac{\partial\mu_x}{\partial\theta_s}, \quad F^{(s)}_{\Sigma_x} = \frac{\partial\Sigma_x}{\partial\theta_s} \quad \text{and} \quad F^{(s)}_{\Sigma_q} = \frac{\partial\Sigma_q}{\partial\theta_s}, \qquad (5)$$

with $r, s = 1, 2, \ldots, p$, where $p$ is the dimension of $\theta$. The quantities in (5) are vectors or matrices of zeros with a unit in the position referring to the $s$th element of $\theta$. Let $\widetilde{D} = (a_{\beta_0}, a_{\beta_1}, a_{\mu_x}, 0, 0)$ and $\widetilde{V} = (0, C_{\beta_1}, 0, C_{\Sigma_x}, C_{\Sigma_q})$, with $a_{\beta_0} = (a_1, a_2, \ldots, a_v)$, $a_{\beta_1} = (a_{v+1}, \ldots, a_{v(m+1)})$, $a_{\mu_x} = (a_{v(m+1)+1}, \ldots, a_{v(m+1)+m})$, $C_{\beta_1} = (\operatorname{vec}(C_{v+1}), \ldots, \operatorname{vec}(C_{v(m+1)}))$, $C_{\Sigma_x} = (\operatorname{vec}(C_{(v+1)(m+1)}), \ldots, \operatorname{vec}(C_{p'}))$ and $C_{\Sigma_q} = (\operatorname{vec}(C_{p'+1}), \ldots, \operatorname{vec}(C_p))$, where $p' = v(m+1) + m + m(m+1)/2$.

The first derivative of (4) with respect to the $r$th element of $\theta$ is

$$U_r = \tfrac{1}{2}\operatorname{tr}\{A_r(\Sigma - uu^\top)\} + \operatorname{tr}\{\Sigma^{-1}a_r u^\top\}; \qquad (6)$$

the expectation of the derivative of (6) with respect to the $s$th element of $\theta$ is given by

$$\kappa_{sr} = \tfrac{1}{2}\operatorname{tr}\{A_r C_s\} - a_s^\top\Sigma^{-1}a_r.$$

Under general regularity conditions (Cox and Hinkley, 1974, Ch. 9), $-\kappa_{sr}$ is the $(s, r)$th element of the expected Fisher information. The score function and the expected Fisher information are given, respectively, by

$$U_\theta = \widetilde{D}^\top\Sigma^{-1}u - \tfrac{1}{2}\widetilde{V}^\top\widetilde{\Sigma}^{-1}\operatorname{vec}(\Sigma - uu^\top) \quad \text{and} \quad K_\theta = \widetilde{D}^\top\Sigma^{-1}\widetilde{D} + \tfrac{1}{2}\widetilde{V}^\top\widetilde{\Sigma}^{-1}\widetilde{V},$$

with $\widetilde{\Sigma} = \Sigma \otimes \Sigma$, where $\otimes$ is the Kronecker product. Defining

$$\widetilde{u} = \begin{pmatrix} u \\ -\operatorname{vec}(\Sigma - uu^\top) \end{pmatrix}, \quad \widetilde{F} = \begin{pmatrix} \widetilde{D} \\ \widetilde{V} \end{pmatrix} \quad \text{and} \quad \widetilde{H} = \begin{pmatrix} \Sigma & 0 \\ 0 & 2\widetilde{\Sigma} \end{pmatrix}^{-1},$$

we can write the score function and the Fisher information in short form as $U_\theta = \widetilde{F}^\top\widetilde{H}\widetilde{u}$ and $K_\theta = \widetilde{F}^\top\widetilde{H}\widetilde{F}$. The Fisher scoring method can be used to estimate $\theta$ by iteratively solving the equation

$$\theta^{(m+1)} = (\widetilde{F}^{(m)\top}\widetilde{H}^{(m)}\widetilde{F}^{(m)})^{-1}\widetilde{F}^{(m)\top}\widetilde{H}^{(m)}\widetilde{u}^{*(m)}, \qquad m = 0, 1, 2, \ldots, \qquad (7)$$

where $\widetilde{u}^{*(m)} = \widetilde{F}^{(m)}\theta^{(m)} + \widetilde{u}^{(m)}$. Each loop through the iterative scheme (7) consists of an iteratively reweighted least squares step to optimize the log-likelihood (4).
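Because $\Sigma$ is block-diagonal, the log-likelihood (4) decomposes into a sum of per-observation terms, so the full $(v+m)n \times (v+m)n$ matrix never needs to be formed. The following is a minimal NumPy sketch for the univariate case ($v = m = 1$); the function name and argument ordering are ours, not the authors' (their implementation used R):

```python
import numpy as np

def loglik(theta, Y, X, tau_y, tau_x):
    """Log-likelihood (4) for v = m = 1; theta = (beta0, beta1, mu_x, s2x, s2).

    Sigma = block-diag{Sigma_1, ..., Sigma_n}, so both log|Sigma| and
    tr{Sigma^{-1} u u^T} decompose into sums over the 2x2 blocks.
    """
    b0, b1, mux, s2x, s2 = theta
    ll = 0.0
    for Yi, Xi, ty, tx in zip(Y, X, tau_y, tau_x):
        mu_i = np.array([b0 + b1 * mux, mux])                  # mean function
        Sig_i = np.array([[b1**2 * s2x + s2 + ty, b1 * s2x],   # covariance from (3)
                          [b1 * s2x, s2x + tx]])
        u_i = np.array([Yi, Xi]) - mu_i
        _, logdet = np.linalg.slogdet(Sig_i)
        ll -= 0.5 * (logdet + u_i @ np.linalg.solve(Sig_i, u_i))
    return ll
```

Evaluating block by block costs $O(n)$ small solves rather than one $O(n^3)$ dense factorization, which is what makes the scoring iteration (7) cheap in practice.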
Using equation (7) and any software with a weighted linear regression routine (MAPLE, MATLAB, Ox, R, SAS), one can compute the MLE $\hat{\theta}$ iteratively. An initial approximation $\theta^{(0)}$ is used to evaluate $\widetilde{F}^{(0)}$, $\widetilde{H}^{(0)}$ and $\widetilde{u}^{*(0)}$, from which the next estimate $\theta^{(1)}$ is obtained. This new value updates $\widetilde{F}$, $\widetilde{H}$ and $\widetilde{u}^*$, and the iterations continue until convergence is achieved.

The general matrix formula derived by Patriota and Lemonte (2009) for the $n^{-1}$ bias vector $B(\hat{\theta})$ of $\hat{\theta}$ is

$$B(\hat{\theta}) = (\widetilde{F}^\top\widetilde{H}\widetilde{F})^{-1}\widetilde{F}^\top\widetilde{H}\widetilde{\xi}, \qquad (8)$$

where $\widetilde{\xi} = (\Phi_1, \ldots, \Phi_p)\operatorname{vec}\{(\widetilde{F}^\top\widetilde{H}\widetilde{F})^{-1}\}$ and $\Phi_r = -\tfrac{1}{2}(G_r + J_r)$, $r = 1, 2, \ldots, p$, with

$$G_r = \begin{pmatrix} a_{1r} & \cdots & a_{pr} \\ \operatorname{vec}(C_{1r}) & \cdots & \operatorname{vec}(C_{pr}) \end{pmatrix} \quad \text{and} \quad J_r = \begin{pmatrix} 0 \\ 2(I_{nq} \otimes a_r)\widetilde{D} \end{pmatrix},$$

where $I_k$ denotes the $k \times k$ identity matrix. The bias vector $B(\hat{\theta})$ is simply the set of coefficients from the ordinary weighted least-squares regression of $\widetilde{\xi}$ on the columns of $\widetilde{F}$, using the weights in $\widetilde{H}$. The bias vector $B(\hat{\theta})$ will be small when $\widetilde{\xi}$ is orthogonal to the columns of $\widetilde{H}\widetilde{F}$, and it can be large when $n$ is small. Note that equation (8) involves only simple operations on matrices and vectors, so the bias $B(\hat{\theta})$ can be calculated numerically, with minimal effort, in any software with numerical linear algebra facilities, such as Ox (Doornik, 2006) or R (R Development Core Team, 2008).

After some algebra, we have

$$a_r = \mathbf{1}_n \otimes \begin{pmatrix} F^{(r)}_{\beta_0} \\ 0 \end{pmatrix}, \quad a_s = \mathbf{1}_n \otimes \begin{pmatrix} F^{(s)}_{\beta_1}\mu_x \\ 0 \end{pmatrix}, \quad a_t = \mathbf{1}_n \otimes \begin{pmatrix} \beta_1 F^{(t)}_{\mu_x} \\ F^{(t)}_{\mu_x} \end{pmatrix} \quad \text{and} \quad a_u = 0,$$

for $r = 1, \ldots, v$; $s = v+1, \ldots, v(m+1)$; $t = v(m+1)+1, \ldots, v(m+1)+m$; and $u = (v+1)(m+1), \ldots, p$; where $p = v(m+1) + m + m(m+1)/2 + v(v+1)/2$. (Here, $\mathbf{1}_n$ denotes an $n \times 1$ vector of ones.) Moreover,

$$a_{rs} = \mathbf{1}_n \otimes \begin{pmatrix} F^{(s)}_{\beta_1}F^{(r)}_{\mu_x} \\ 0 \end{pmatrix}$$

for all $r$ and $s$,

$$C_s = I_n \otimes \begin{pmatrix} F^{(s)}_{\beta_1}\Sigma_x\beta_1^\top + \beta_1\Sigma_x F^{(s)\top}_{\beta_1} & F^{(s)}_{\beta_1}\Sigma_x \\ \Sigma_x F^{(s)\top}_{\beta_1} & 0 \end{pmatrix}, \quad C_t = I_n \otimes \begin{pmatrix} \beta_1 F^{(t)}_{\Sigma_x}\beta_1^\top & \beta_1 F^{(t)}_{\Sigma_x} \\ F^{(t)}_{\Sigma_x}\beta_1^\top & F^{(t)}_{\Sigma_x} \end{pmatrix}$$

and

$$C_u = I_n \otimes \begin{pmatrix} F^{(u)}_{\Sigma_q} & 0 \\ 0 & 0 \end{pmatrix},$$

for $s = v+1, \ldots, v(m+1)$; $t = (v+1)(m+1), \ldots, p'$; and $u = p'+1, \ldots, p$. (These index ranges, and the bottom-right entry $F^{(t)}_{\Sigma_x}$ of $C_t$, follow from differentiating $\Sigma_i(\theta)$ with respect to the $\beta_1$, $\Sigma_x$ and $\Sigma_q$ blocks of $\theta$, and are consistent with the univariate derivatives $C_2$, $C_4$ and $C_5$ given in Section 5.) Additionally,

$$C_{rs} = I_n \otimes \begin{pmatrix} F^{(s)}_{\beta_1}\Sigma_x F^{(r)\top}_{\beta_1} + F^{(r)}_{\beta_1}\Sigma_x F^{(s)\top}_{\beta_1} & 0 \\ 0 & 0 \end{pmatrix} \quad \text{and} \quad C_{tu} = I_n \otimes \begin{pmatrix} F^{(u)}_{\beta_1}F^{(t)}_{\Sigma_x}\beta_1^\top + \beta_1 F^{(t)}_{\Sigma_x}F^{(u)\top}_{\beta_1} & F^{(u)}_{\beta_1}F^{(t)}_{\Sigma_x} \\ F^{(t)}_{\Sigma_x}F^{(u)\top}_{\beta_1} & 0 \end{pmatrix},$$

for $r, s, u = v+1, \ldots, v(m+1)$ and $t = (v+1)(m+1), \ldots, p'$; and $C_{rs} = 0$ otherwise. Therefore, in the measurement error model defined by equations (1) and (2), all the quantities necessary to compute the $O(n^{-1})$ bias of $\hat{\theta}$ using expression (8) are given.

On the right-hand side of expression (8), consistent estimates of the parameter $\theta$ can be inserted to define the corrected MLE $\tilde{\theta} = \hat{\theta} - \hat{B}(\hat{\theta})$, where $\hat{B}(\cdot)$ denotes the MLE of $B(\cdot)$, that is, the unknown parameters are replaced by their MLEs. The bias-corrected estimate (BCE) $\tilde{\theta}$ is expected to have better sampling properties than the uncorrected estimator $\hat{\theta}$. In fact, we present simulations in Section 5 showing that $\tilde{\theta}$ has smaller bias than the corresponding MLE, thus suggesting that the bias correction has the effect of shifting the modified estimates toward the true parameter values. The BCEs can always be defined if the joint cumulants of the derivatives of the log-likelihood function and the MLEs exist. Although, in some situations (for example, the homoskedastic simple errors-in-variables model), the first moment of the MLEs is not defined, it is still possible to define such "corrected" estimators from $B(\hat{\theta})$.
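As noted above, expression (8) is literally the coefficient vector of a weighted least-squares regression of $\widetilde{\xi}$ on the columns of $\widetilde{F}$ with weight matrix $\widetilde{H}$, so any linear algebra library computes it in two lines. A sketch (the function name is ours), assuming $\widetilde{F}$, $\widetilde{H}$ and $\widetilde{\xi}$ have already been assembled as arrays:

```python
import numpy as np

def bias_vector(F, H, xi):
    """O(n^{-1}) bias (8): coefficients of the weighted least-squares
    regression of xi on the columns of F with weight matrix H."""
    FtH = F.T @ H
    return np.linalg.solve(FtH @ F, FtH @ xi)
```

A sanity check on the regression interpretation: if $\widetilde{\xi}$ lies in the column space of $\widetilde{F}$, the fit recovers its coordinates exactly, and $B(\hat{\theta}) = 0$ exactly when $\widetilde{F}^\top\widetilde{H}\widetilde{\xi} = 0$.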
When the first moment of the MLE is not defined, the interpretation of $B(\hat{\theta})$ is no longer that of a second-order bias, but it still acts as an adjustment factor for the location of the MLEs. Patriota and Lemonte (2009) present simulation studies for a simple linear errors-in-variables model in which the BCEs are shown to perform better than the MLEs for finite sample sizes. In general, it is very hard to verify whether the MLEs of the parameters of the model considered in this paper have well-defined expectations, but the simulation studies presented in Section 5 indicate a better performance of the corrected estimators over the uncorrected ones; we therefore advise using the corrected estimators.

4 Biases of the MLEs $\hat{\mu}_i$ and $\hat{\Sigma}_i$

In this section, we give matrix formulae for the $O(n^{-1})$ biases of the MLEs of the $i$th mean $\mu_i = \mu_i(\theta)$ and the $i$th variance-covariance vector $\Sigma^*_i = \operatorname{vech}(\Sigma_i(\theta))$. Let $q_1 = v + m$ and $q_2 = q_1(q_1+1)/2$. Additionally, let $A = [A_1, \ldots, A_n]^\top$ be an $np \times p$ matrix, where each $A_i$ is a $p \times p$ matrix; we then define $\operatorname{tr}^*(A) = [\operatorname{tr}(A_1), \ldots, \operatorname{tr}(A_n)]^\top$. From a Taylor series expansion of $\hat{\mu}_i = \mu_i(\hat{\theta})$ we obtain, up to an error of order $O(n^{-2})$,

$$B(\hat{\mu}_i) = L_i B(\hat{\theta}) + \tfrac{1}{2}\operatorname{tr}^*[M_i \operatorname{Cov}(\hat{\theta})],$$

where $L_i$ is a $q_1 \times p$ matrix of first partial derivatives $\partial\mu_i/\partial\theta_r$ (for $r = 1, 2, \ldots, p$), $M_i = [M_{i1}, \ldots, M_{iq_1}]^\top$ is a $q_1 p \times p$ matrix of second partial derivatives, where $M_{il}$ is a $p \times p$ matrix with elements $\partial^2\mu_{il}/\partial\theta_r\partial\theta_s$ (for $r, s = 1, \ldots, p$ and $l = 1, 2, \ldots, q_1$), $\operatorname{Cov}(\hat{\theta}) = K_\theta^{-1}$ is the asymptotic covariance matrix of $\hat{\theta}$, and the vector $B(\hat{\theta})$ was defined before. All quantities in the above equation should be evaluated at $\hat{\theta}$. The asymptotic variance of $\hat{\mu}_i$ can also be expressed explicitly in terms of the covariance of $\hat{\theta}$ as $\operatorname{Var}(\hat{\mu}_i) = L_i\operatorname{Cov}(\hat{\theta})L_i^\top$.
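The Taylor-expansion formula above is mechanical once $L_i$, $M_i$, $B(\hat{\theta})$ and $\operatorname{Cov}(\hat{\theta})$ are available. A sketch (names and array layout are ours), storing $M_i$ as a $(q_1, p, p)$ array of per-component Hessians:

```python
import numpy as np

def bias_mu_i(L_i, M_i, B_theta, Cov_theta):
    """B(mu_i hat) = L_i B(theta hat) + (1/2) tr*[M_i Cov(theta hat)].

    L_i: (q1, p) Jacobian of mu_i; M_i: (q1, p, p) Hessians of each
    component of mu_i; B_theta: (p,) bias vector from (8);
    Cov_theta: (p, p) asymptotic covariance K_theta^{-1}.
    """
    # tr* applies the trace block by block, one Hessian per component of mu_i
    tr_star = np.array([np.trace(M_il @ Cov_theta) for M_il in M_i])
    return L_i @ B_theta + 0.5 * tr_star
```

The same routine, fed $L^*_i$ and $M^*_i$ instead, gives the bias of $\hat{\Sigma}^*_i$ discussed next, and $L_i\,\texttt{Cov\_theta}\,L_i^\top$ gives the asymptotic variance of $\hat{\mu}_i$.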
The second-order bias of $\hat{\Sigma}^*_i$ is obtained by expanding $\hat{\Sigma}^*_i = \Sigma^*_i(\hat{\theta})$ in a Taylor series. The $O(n^{-1})$ bias of $\hat{\Sigma}^*_i$ is then written as

$$B(\hat{\Sigma}^*_i) = L^*_i B(\hat{\theta}) + \tfrac{1}{2}\operatorname{tr}^*[M^*_i \operatorname{Cov}(\hat{\theta})],$$

where $L^*_i$ is a $q_2 \times p$ matrix of first partial derivatives $\partial\Sigma^*_i/\partial\theta_r$ (for $r = 1, 2, \ldots, p$) and $M^*_i = [M^*_{i1}, \ldots, M^*_{iq_2}]^\top$ is a $q_2 p \times p$ matrix of second partial derivatives, where $M^*_{il}$ is a $p \times p$ matrix with elements $\partial^2\Sigma^*_{il}/\partial\theta_r\partial\theta_s$ (for $r, s = 1, \ldots, p$ and $l = 1, 2, \ldots, q_2$).

Therefore, we are now able to define the following second-order bias-corrected estimators of $\hat{\mu}_i$ and $\hat{\Sigma}^*_i$:

$$\tilde{\mu}_i = \hat{\mu}_i - \hat{B}(\hat{\mu}_i) \quad \text{and} \quad \tilde{\Sigma}^*_i = \hat{\Sigma}^*_i - \hat{B}(\hat{\Sigma}^*_i).$$

It is clear that the $O(n^{-1})$ bias of any other function of $\theta$, say $\Psi(\theta)$ ($h \times 1$), can be obtained easily by Taylor series expansion:

$$B(\hat{\Psi}) = \nabla^{(1)}_\Psi B(\hat{\theta}) + \tfrac{1}{2}\operatorname{tr}^*[\nabla^{(2)}_\Psi \operatorname{Cov}(\hat{\theta})],$$

where $\nabla^{(1)}_\Psi$ is an $h \times p$ matrix of first partial derivatives $\partial\Psi/\partial\theta_r$ (for $r = 1, 2, \ldots, p$) and $\nabla^{(2)}_\Psi = [\nabla^{(2)}_{\Psi_1}, \ldots, \nabla^{(2)}_{\Psi_h}]^\top$ is an $hp \times p$ matrix of second partial derivatives, where $\nabla^{(2)}_{\Psi_l}$ is a $p \times p$ matrix with elements $\partial^2\Psi_l/\partial\theta_r\partial\theta_s$ (for $r, s = 1, \ldots, p$ and $l = 1, 2, \ldots, h$).

5 Numerical results

We use Monte Carlo simulation to evaluate the finite-sample performance of the MLEs attained via the iterative formula (7) and of their corresponding bias-corrected versions for the heteroskedastic errors-in-variables model (2) with $m = v = 1$. The sample sizes considered were $n = 40, 60, 100$ and $200$, and the number of Monte Carlo replications was 10,000. All simulations were performed using the R programming language (R Development Core Team, 2008).

We consider the simple errors-in-variables model $Y_i = y_i + \eta_{y_i}$ and $X_i = x_i + \eta_{x_i}$, with $y_i \mid x_i \overset{ind}{\sim} N(\beta_0 + \beta_1 x_i, \sigma^2)$. This model was studied by Kulathinal et al.
(2002). The errors $\eta_{y_i}$ and $\eta_{x_i}$ are independent of the unobservable covariate $x_i$ and are distributed as

$$\begin{pmatrix} \eta_{y_i} \\ \eta_{x_i} \end{pmatrix} \overset{ind}{\sim} N_2\!\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \tau_{y_i} & 0 \\ 0 & \tau_{x_i} \end{pmatrix} \right),$$

where the variances $\tau_{y_i}$ and $\tau_{x_i}$ are known for all $i = 1, \ldots, n$. Supposing in addition that $x_i \overset{iid}{\sim} N(\mu_x, \sigma^2_x)$, we have that the joint distribution of the observed variables can be expressed as

$$\begin{pmatrix} Y_i \\ X_i \end{pmatrix} \overset{ind}{\sim} N_2\!\left( \begin{pmatrix} \beta_0 + \beta_1\mu_x \\ \mu_x \end{pmatrix}, \begin{pmatrix} \beta_1^2\sigma^2_x + \sigma^2 + \tau_{y_i} & \beta_1\sigma^2_x \\ \beta_1\sigma^2_x & \sigma^2_x + \tau_{x_i} \end{pmatrix} \right).$$

Define $\theta = (\beta_0, \beta_1, \mu_x, \sigma^2_x, \sigma^2)^\top$,

$$\mu_i(\theta) = \begin{pmatrix} \beta_0 + \beta_1\mu_x \\ \mu_x \end{pmatrix} \quad \text{and} \quad \Sigma_i(\theta) = \begin{pmatrix} \beta_1^2\sigma^2_x + \sigma^2 + \tau_{y_i} & \beta_1\sigma^2_x \\ \beta_1\sigma^2_x & \sigma^2_x + \tau_{x_i} \end{pmatrix}.$$

From the previous expressions we have immediately that

$$a_1 = \mathbf{1}_n \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad a_2 = \mathbf{1}_n \otimes \begin{pmatrix} \mu_x \\ 0 \end{pmatrix}, \quad a_3 = \mathbf{1}_n \otimes \begin{pmatrix} \beta_1 \\ 1 \end{pmatrix}, \quad a_4 = a_5 = 0,$$

and $a_{rs} = 0$ for all $r, s$ except for

$$a_{23} = a_{32} = \mathbf{1}_n \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix}.$$

Also, $C_1 = C_3 = 0$ and

$$C_2 = I_n \otimes \begin{pmatrix} 2\beta_1\sigma^2_x & \sigma^2_x \\ \sigma^2_x & 0 \end{pmatrix}, \quad C_4 = I_n \otimes \begin{pmatrix} \beta_1^2 & \beta_1 \\ \beta_1 & 1 \end{pmatrix} \quad \text{and} \quad C_5 = I_n \otimes \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}.$$

Additionally, $C_{rs} = 0$ for all $r, s$ except for

$$C_{22} = I_n \otimes \begin{pmatrix} 2\sigma^2_x & 0 \\ 0 & 0 \end{pmatrix} \quad \text{and} \quad C_{24} = C_{42} = I_n \otimes \begin{pmatrix} 2\beta_1 & 1 \\ 1 & 0 \end{pmatrix}.$$

Thus, $\widetilde{D} = (a_1, a_2, a_3, 0, 0)$ and $\widetilde{V} = (0, \operatorname{vec}(C_2), 0, \operatorname{vec}(C_4), \operatorname{vec}(C_5))$. Therefore, all the quantities necessary to calculate $B(\hat{\theta})$ using expression (8) are given.

In order to analyze the point estimation results, we computed, for each sample size and each estimator, the relative bias (the relative bias of an estimator $\hat{\theta}$ is defined as $\{\mathrm{E}(\hat{\theta}) - \theta\}/\theta$, its estimate being obtained by estimating $\mathrm{E}(\hat{\theta})$ by Monte Carlo) and the root mean square error, i.e., $\sqrt{\mathrm{MSE}}$, where MSE is the mean squared error estimated from the 10,000 Monte Carlo replications. For practical reasons and without loss of generality, we adopt the same setting of parameters chosen by de Castro et al. (2008). (The parameters are the MLEs of the model parameters for a real data set presented in the next section.)
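The two criteria just defined are computed directly from the Monte Carlo replications. A small sketch of this bookkeeping step (the function name is ours):

```python
import numpy as np

def rel_bias_and_rmse(estimates, true_value):
    """estimates: Monte Carlo estimates of a scalar parameter.

    Relative bias {E(theta_hat) - theta}/theta, with E(theta_hat)
    replaced by the Monte Carlo average, and the root mean square
    error sqrt(MSE) estimated over the replications."""
    estimates = np.asarray(estimates, dtype=float)
    rel_bias = (estimates.mean() - true_value) / true_value
    rmse = np.sqrt(np.mean((estimates - true_value) ** 2))
    return rel_bias, rmse
```

Applied once per parameter, per sample size and per estimator (MLE and BCE), this produces the entries reported in Tables 1 and 2.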
We take $\beta_0 = -2$, $\beta_1 = 0.5$, $\mu_x = -2$, $\sigma^2_x = 4$ and $\sigma^2 = 10$. We also consider two types of heteroskedasticity, as studied by Patriota et al. (2009), namely:

(a) $\sqrt{\tau_{x_i}} \sim U(0.5, 1.5)$ and $\sqrt{\tau_{y_i}} \sim U(0.5, 4)$, where $U(a, b)$ denotes the uniform distribution on $[a, b]$;

(b) $\sqrt{\tau_{x_i}} = 0.1|x_i|$ and $\sqrt{\tau_{y_i}} = 0.1|-2 + 0.51 x_i|$, i.e., the variances depend on the unknown covariate.

We remark that the variances are considered known and are kept fixed in all Monte Carlo simulations.

Table 1 shows simulation results for an errors-in-variables model with uniform heteroskedasticity. The figures in this table reveal that the maximum-likelihood estimators of the parameters can be substantially biased when the sample size is small, and that the bias correction derived in the previous section is very effective. For instance, when $n = 40$ the biases of the estimators of $\beta_0$, $\beta_1$, $\mu_x$, $\sigma^2_x$ and $\sigma^2$ average $-0.02244$, whereas the biases of the corresponding bias-adjusted estimators average $-0.00276$; that is, the average bias (in absolute value) of the MLEs is almost ten times greater than that of the corrected estimators. In particular, the maximum-likelihood estimators of $\sigma^2_x$ and $\sigma^2$ display substantial bias, and the bias correction proves to be quite effective when applied to these estimators.

Table 2 displays simulation results for an errors-in-variables model with nonuniform heteroskedasticity. We note that the bias-adjusted estimators again display smaller bias than the standard maximum-likelihood estimators. This suggests that the second-order biases of the MLEs should not be ignored in samples of small to moderate size, since they can be nonnegligible. Note also that the root mean square errors decrease with $n$, as expected. Additionally, we note that all estimators have similar root mean square errors.
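For reference, one replication under heteroskedasticity scheme (a), with the parameter values above, can be drawn as follows. This is a sketch in NumPy (the authors' simulations used R; function names are ours); note that the error variances are drawn once and then held fixed across replications, as the paper specifies:

```python
import numpy as np

def draw_tau_scheme_a(n, rng):
    """Scheme (a): sqrt(tau_y) ~ U(0.5, 4), sqrt(tau_x) ~ U(0.5, 1.5).
    Drawn once, then kept fixed over all Monte Carlo replications."""
    return rng.uniform(0.5, 4.0, n) ** 2, rng.uniform(0.5, 1.5, n) ** 2

def simulate(n, rng, tau_y, tau_x,
             beta0=-2.0, beta1=0.5, mu_x=-2.0, s2x=4.0, s2=10.0):
    """One sample (Y, X) from model (1)-(2) with v = m = 1."""
    x = rng.normal(mu_x, np.sqrt(s2x), n)                       # latent covariate
    y = beta0 + beta1 * x + rng.normal(0.0, np.sqrt(s2), n)     # equation error q_i
    X = x + rng.normal(0.0, np.sqrt(tau_x))                     # surrogate covariate
    Y = y + rng.normal(0.0, np.sqrt(tau_y))                     # surrogate response
    return Y, X
```

Each Monte Carlo run would then re-estimate $\theta$ from $(Y, X)$ via (7), treating $\tau_{y_i}$ and $\tau_{x_i}$ as known, and feed the 10,000 estimates into the relative-bias and RMSE summaries.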
Table 1  Relative bias and $\sqrt{\mathrm{MSE}}$ of uncorrected and corrected estimates with uniform heteroskedasticity: $\sqrt{\tau_{x_i}} \sim U(0.5, 1.5)$ and $\sqrt{\tau_{y_i}} \sim U(0.5, 4)$.

                    MLE                    BCE
  n     θ       Rel. bias   √MSE       Rel. bias   √MSE
  40    β0      -0.0173     0.99       -0.0043     0.97
        β1       0.0315     0.38        0.0054     0.37
        µx      -0.0018     0.35       -0.0018     0.35
        σ²x     -0.0351     1.11       -0.0045     1.13
        σ²      -0.0895     3.31       -0.0086     3.38
  60    β0      -0.0139     0.77       -0.0061     0.76
        β1       0.0213     0.29        0.0058     0.29
        µx       0.0009     0.28        0.0009     0.28
        σ²x     -0.0239     0.89       -0.0036     0.90
        σ²      -0.0548     2.60       -0.0018     2.64
  100   β0      -0.0100     0.68       -0.0037     0.67
        β1       0.0168     0.26        0.0042     0.25
        µx       0.0001     0.25        0.0001     0.25
        σ²x     -0.0135     0.80        0.0022     0.81
        σ²      -0.0424     2.40        0.0003     2.43
  200   β0      -0.0049     0.59       -0.0006     0.59
        β1       0.0127     0.22        0.0041     0.22
        µx       0.0013     0.23        0.0013     0.23
        σ²x     -0.0116     0.70        0.0008     0.70
        σ²      -0.0350     2.09       -0.0014     2.11

BCE: bias-corrected estimator.

It is interesting to note that the finite-sample performance of the estimator of $\sigma^2_x$ deteriorates when we pass from the model with uniform heteroskedasticity to the model with nonuniform heteroskedasticity (see Tables 1 and 2). For instance, when $n = 100$, the relative biases of $\hat{\sigma}^2_x$ (MLE) were $-0.0135$ (uniform heteroskedasticity) and $-0.0484$ (nonuniform heteroskedasticity), which amounts to an increase in relative bias of nearly 3.5 times.

6 Application

We now present an application of the model described in Section 2 with $v = m = 1$. We analyze an epidemiological data set from the WHO MONICA (World Health Organization Multinational MONitoring of trends and determinants in CArdiovascular disease) Project. This data set was previously studied by Kulathinal et al. (2002) and de Castro et al. (2008), where the ML approach was adopted to estimate the model parameters.
The main goal of this project is to monitor trends in cardiovascular diseases and relate them to known risk factors. Here, $y$ is the trend in cardiovascular mortality and coronary heart disease, and $x$ is the change in known risk factors. The risk score was defined as a linear combination of smoking status, systolic blood pressure, body mass index and total cholesterol level. Note that these variables are non-observable indexes and therefore need to be estimated in some way. Follow-up studies were conducted using proportional hazards models, which provide the observed indexes ($Y$ and $X$) and the measurement error variances. The latent variables $y$ and $x$ are linearly related as

$$y_i = \beta_0 + \beta_1 x_i + q_i, \qquad i = 1, \ldots, n.$$

Table 2  Relative bias and $\sqrt{\mathrm{MSE}}$ of uncorrected and corrected estimates with nonuniform heteroskedasticity: $\sqrt{\tau_{x_i}} = 0.1|x_i|$ and $\sqrt{\tau_{y_i}} = 0.1|\beta_0 + \beta_1 x_i|$.

                    MLE                    BCE
  n     θ       Rel. bias   √MSE       Rel. bias   √MSE
  40    β0      -0.0026     0.73       -0.0018     0.73
        β1       0.0292     0.27        0.0276     0.27
        µx      -0.0228     0.32       -0.0228     0.32
        σ²x     -0.0594     0.91       -0.0354     0.92
        σ²      -0.0540     2.26       -0.0056     2.30
  60    β0       0.0008     0.59        0.0013     0.59
        β1       0.0203     0.22        0.0192     0.22
        µx      -0.0208     0.26       -0.0208     0.26
        σ²x     -0.0502     0.76       -0.0340     0.75
        σ²      -0.0332     1.85       -0.0002     1.88
  100   β0       0.0013     0.51        0.0016     0.51
        β1       0.0184     0.19        0.0176     0.19
        µx      -0.0198     0.23       -0.0198     0.23
        σ²x     -0.0484     0.65       -0.0363     0.65
        σ²      -0.0223     1.61        0.0027     1.64
  200   β0       0.0036     0.45        0.0039     0.45
        β1       0.0165     0.17        0.0159     0.17
        µx      -0.0186     0.20       -0.0186     0.20
        σ²x     -0.0474     0.59       -0.0377     0.58
        σ²      -0.0204     1.41       -0.0004     1.43

BCE: bias-corrected estimator.

As the variables $y_i$ and $x_i$ are not directly observable, surrogate variables $Y_i$ and $X_i$ are observed in their place. Such surrogate variables are obtained from an analytical treatment of the data collection process.
The data set is divided into two groups, namely men ($n = 38$) and women ($n = 36$). In what follows, we compare the MLEs with the bias-corrected estimators. Table 3 presents the MLEs, their standard errors, their second-order biases and the corrected estimates. It can be seen that the greater the standard error of an MLE, the more distant from zero is its second-order bias. As concluded in the simulation studies, the biases of the variance estimates are larger than those produced by the line estimators.

The second-order biases of the MLEs can be expressed as percentages of the MLEs. That is, for the men data set, the second-order biases are $-0.21\%$, $0.85\%$, $0.00\%$, $-2.92\%$ and $-9.21\%$ of the MLEs of $\beta_0$, $\beta_1$, $\mu_x$, $\sigma^2_x$ and $\sigma^2$, respectively. For the women data set, the second-order biases are $52.96\%$, $1.21\%$, $0.00\%$, $-3.16\%$ and $-10.19\%$ of the MLEs of $\beta_0$, $\beta_1$, $\mu_x$, $\sigma^2_x$ and $\sigma^2$, respectively. This shows that the second-order biases of the MLEs are more pronounced in the women data set, mainly for the intercept estimator.

Table 3  MLEs and bias-corrected estimates.

          Parameter    MLE        S.E.      Bias       BCE
  Men     β0           -2.0799    0.5285     0.0044    -2.0843
          β1            0.4690    0.2339     0.0040     0.4650
          µx           -1.0924    0.3550     0.0000    -1.0924
          σ²x           4.3163    1.0969    -0.1261     4.4423
          σ²            4.8883    1.7790    -0.4501     5.3384
  Women   β0            0.0321    1.1121     0.0170     0.0151
          β1            0.6790    0.4072     0.0082     0.6708
          µx           -2.0677    0.3386     0.0000    -2.0677
          σ²x           3.6243    0.9695    -0.1146     3.7389
          σ²           11.0809    4.2425    -1.1289    12.2098

BCE: bias-corrected estimates.

7 Conclusions

We derived a bias-adjustment scheme to eliminate the second-order biases of the maximum-likelihood estimates in a heteroskedastic multivariate errors-in-variables regression model, using the general matrix formulae for the second-order bias derived by Patriota and Lemonte (2009). The simulation results presented show that the MLEs can be considerably biased. The bias correction derived in this paper is very effective, even when the sample size is large. Indeed, the bias correction mechanism adopted yields modified maximum-likelihood estimates that are nearly unbiased. Additionally, many errors-in-variables models are special cases of the proposed model, and the results obtained here can easily be particularized to these submodels. We also present an application to a real data set.

Acknowledgments

This work was partially supported by FAPESP (Fundação de Amparo à Pesquisa do Estado de São Paulo, Brazil) and CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico, Brazil). The authors thank Dr. Kari Kuulasmaa (National Public Health Institute, Finland) for kindly supplying the data for our application. The authors are also grateful to a referee for helpful comments and suggestions.

References

Akritas, M.G., Bershady, M.A. (1996). Linear regression for astronomical data with measurement errors and intrinsic scatter. The Astrophysical Journal. 470:706–714.

Cook, R.D., Tsai, C., Wei, B. (1986). Bias in nonlinear regression. Biometrika. 73:615–623.

Cordeiro, G.M. (1993). Bartlett corrections and bias correction for two heteroskedastic regression models. Communications in Statistics, Theory and Methods. 22:169–188.

Cordeiro, G.M. (2008). Corrected maximum likelihood estimators in linear heteroskedastic regression models. Brazilian Review of Econometrics. 28:53–67.

Cordeiro, G.M., Vasconcellos, K.L.P. (1997). Bias correction for a class of multivariate nonlinear regression models. Statistics and Probability Letters. 35:155–164.

Cox, D.R., Hinkley, D.V. (1974). Theoretical Statistics. London: Chapman and Hall.

de Castro, M., Galea, M., Bolfarine, H. (2008). Hypothesis testing in an errors-in-variables model with heteroskedastic measurement errors. Statistics in Medicine. 27:5217–5234.

Doornik, J.A. (2006). An Object-Oriented Matrix Language – Ox 4, 5th ed. London: Timberlake Consultants Press.

Fuller, W. (1987). Measurement Error Models. Wiley: Chichester.

Kulathinal, S.B., Kuulasmaa, K., Gasbarra, D. (2002). Estimation of an errors-in-variables regression model when the variances of the measurement error vary between the observations. Statistics in Medicine. 21:1089–1101.

Patriota, A.G., Bolfarine, H., de Castro, M. (2009). A heteroscedastic errors-in-variables model with equation error. Statistical Methodology. Doi:10.1016/j.stamet.2009.02.003.

Patriota, A.G., Lemonte, A.J. (2009). Bias correction in a multivariate normal regression model with general parameterization. Statistics and Probability Letters. Doi:10.1016/j.spl.2009.04.018.

R Development Core Team (2008). R: A Language and Environment for Statistical Computing. Vienna, Austria.

Vasconcellos, K.L.P., Cordeiro, G.M. (1997). Approximate bias for multivariate nonlinear heteroskedastic regressions. Brazilian Journal of Probability and Statistics. 11:141–159.
