Comment: Demystifying Double Robustness: A Comparison of Alternative Strategies for Estimating a Population Mean from Incomplete Data

Statistical Science 2007, Vol. 22, No. 4, 569–573. DOI: 10.1214/07-STS227B. Main article DOI: 10.1214/07-STS227. © Institute of Mathematical Statistics, 2007.

Anastasios A. Tsiatis and Marie Davidian

Anastasios A. Tsiatis is Drexel Professor of Statistics at North Carolina State University, Raleigh, North Carolina 27695-8203, USA (e-mail: tsiatis@stat.ncsu.edu). Marie Davidian is William Neal Reynolds Professor of Statistics at North Carolina State University, Raleigh, North Carolina 27695-8203, USA (e-mail: davidian@stat.ncsu.edu). This is an electronic reprint of the original article published by the Institute of Mathematical Statistics in Statistical Science, 2007, Vol. 22, No. 4, 569–573. This reprint differs from the original in pagination and typographic detail.

INTRODUCTION

We congratulate Drs. Kang and Schafer (KS henceforth) for a careful and thought-provoking contribution to the literature regarding the so-called “double robustness” property, a topic that still engenders some confusion and disagreement. The authors’ approach of focusing on the simplest situation of estimation of the population mean $\mu$ of a response $y$ when $y$ is not observed on all subjects according to a missing at random (MAR) mechanism (equivalently, estimation of the mean of a potential outcome in a causal model under the assumption of no unmeasured confounders) is commendable, as the fundamental issues can be explored without the distractions of the messier notation and considerations required in more complicated settings. Indeed, as the article demonstrates, this simple setting is sufficient to highlight a number of key points.

As noted eloquently by Molenberghs (2005), in regard to how such missing data/causal inference problems are best addressed, two “schools” may be identified: the “likelihood-oriented” school and the “weighting-based” school. As we have emphasized previously (Davidian, Tsiatis and Leon, 2005), we prefer to view inference from the vantage point of semiparametric theory, focusing on the assumptions embedded in the statistical models leading to different “types” of estimators (i.e., “likelihood-oriented” or “weighting-based”) rather than on the forms of the estimators themselves. In this discussion, we hope to complement the presentation of the authors by elaborating on this point of view. Throughout, we use the same notation as in the paper.

SEMIPARAMETRIC THEORY PERSPECTIVE

As demonstrated by Robins, Rotnitzky and Zhao (1994) and Tsiatis (2006), exploiting the relationship between so-called influence functions and estimators is a fruitful approach to studying and contrasting the (large-sample) properties of estimators for parameters of interest in a statistical model. We remind the reader that a statistical model is a class of densities that could have generated the observed data. Our presentation here is for scalar parameters such as $\mu$, but generalizes readily to vector-valued parameters. If one restricts attention to estimators that are regular (i.e., not “pathological”; see Davidian, Tsiatis and Leon, 2005, page 263, and Tsiatis, 2006, pages 26–27), then, for a parameter $\mu$ in a parametric or semiparametric statistical model, an estimator $\hat{\mu}$ for $\mu$ based on independent and identically distributed observed data $z_i$, $i = 1, \ldots$
, $n$, is said to be asymptotically linear if it satisfies

$$ n^{1/2}(\hat{\mu} - \mu_0) = n^{-1/2} \sum_{i=1}^{n} \varphi(z_i) + o_p(1) \tag{1} $$

for $\varphi(z)$ with $E\{\varphi(z)\} = 0$ and $E\{\varphi^2(z)\} < \infty$, where $\mu_0$ is the true value of $\mu$ generating the data, and expectation is with respect to the true distribution of $z$. The function $\varphi(z)$ is the influence function of the estimator $\hat{\mu}$. A regular, asymptotically linear estimator with influence function $\varphi(z)$ is consistent and asymptotically normal with asymptotic variance $E\{\varphi^2(z)\}$. Thus, there is an inextricable connection between estimators and influence functions in that the asymptotic behavior of an estimator is fully determined by its influence function, so that it suffices to focus on the influence function when discussing an estimator’s properties. Many of the estimators discussed by KS are regular and asymptotically linear; in the sequel, we refer to regular and asymptotically linear estimators as simply “estimators.”

We capitalize on this connection by considering the problem of estimating $\mu$ in the setting in KS in terms of statistical models that may be assumed for the observed data, from which influence functions corresponding to estimators valid under the assumed models may be derived. In the situation studied by KS, the “full” data that would ideally be observed are $(t, x, y)$; however, as $y$ is unobserved for some subjects, the observed data available for analysis are $z = (t, x, ty)$. As noted by KS, the MAR assumption states that $y$ and $t$ are conditionally independent given $x$; for example, $P(t = 1 \mid y, x) = P(t = 1 \mid x)$.
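As a small worked illustration of definition (1), offered here as an aside rather than taken from KS, suppose hypothetically that $y$ were observed for every subject and that $\mu$ is estimated by the sample mean $\bar{y}$. Under that assumption the expansion in (1) holds exactly:

```latex
\begin{align*}
n^{1/2}(\bar{y} - \mu_0)
  &= n^{1/2}\Big( n^{-1}\sum_{i=1}^{n} y_i - \mu_0 \Big) \\
  &= n^{-1/2}\sum_{i=1}^{n} (y_i - \mu_0),
\end{align*}
% so the influence function is \varphi(z) = y - \mu_0 and the o_p(1) term is zero.
```

Here $E\{\varphi(z)\} = 0$ and $E\{\varphi^2(z)\} = \operatorname{var}(y)$, so the asymptotic variance read off from the influence function is the familiar one for a sample mean.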
Under this assumption, all joint densities for the observed data have the form

$$ p(z) = p(y \mid x)^{I(t=1)}\, p(t \mid x)\, p(x), \tag{2} $$

where $p(y \mid x)$ is the density of $y$ given $x$, $p(t \mid x)$ is the density of $t$ given $x$, and $p(x)$ is the marginal density of $x$. Let $p_0(z)$ be the density in the class of densities of form (2) generating the observed data (the true joint density).

One may posit different statistical models by making different assumptions on the components of (2). We focus on three such models:

I. Make no assumptions on the forms of $p(x)$ or $p(t \mid x)$, leaving these entirely unspecified. Make a specific assumption on $p(y \mid x)$, namely, that $E(y \mid x) = m(x, \beta)$ for some given function $m(x, \beta)$ depending on parameters $\beta$ ($p \times 1$). Denote the class of densities satisfying these assumptions as $\mathcal{M}_{\mathrm{I}}$.

II. Make no assumptions on the forms of $p(x)$ or $p(y \mid x)$. Make a specific assumption on $p(t \mid x)$ that $P(t = 1 \mid x) = E(t \mid x) = \pi(x, \alpha)$ for some given function $\pi(x, \alpha)$ depending on parameters $\alpha$ ($s \times 1$). Here, we also require the assumption that $P(t = 1 \mid x) \geq \varepsilon > 0$ for all $x$ and some $\varepsilon$. Denote the class of densities satisfying these assumptions as $\mathcal{M}_{\mathrm{II}}$.

III. Make no assumptions on the form of $p(x)$, but make specific assumptions on $p(y \mid x)$ and $p(t \mid x)$, namely, that $E(y \mid x) = m(x, \beta)$ and $P(t = 1 \mid x) = E(t \mid x) = \pi(x, \alpha) \geq \varepsilon > 0$ for all $x$ and some $\varepsilon$ for given functions $m(x, \beta)$ and $\pi(x, \alpha)$ depending on parameters $\beta$ and $\alpha$. The class of densities satisfying these assumptions is $\mathcal{M}_{\mathrm{I}} \cap \mathcal{M}_{\mathrm{II}}$.

All of I–III are semiparametric statistical models in that some aspects of $p(z)$ are left unspecified. Denote by $m_0(x)$ the true function $E(y \mid x)$ and by $\pi_0(x)$ the true function $P(t = 1 \mid x) = E(t \mid x)$ corresponding to the true density $p_0(z)$.
Semiparametric theory yields the form of all influence functions corresponding to estimators for $\mu$ under each of the statistical models I–III. As discussed in Tsiatis (2006, page 52), loosely speaking, a consistent and asymptotically normal estimator for $\mu$ in a statistical model has the property that, for all $p(z)$ in the class of densities defined by the model, $n^{1/2}(\hat{\mu} - \mu) \xrightarrow{D(p)} N\{0, \sigma^2(p)\}$, where $\xrightarrow{D(p)}$ means convergence in distribution under the density $p(z)$, and $\sigma^2(p)$ is the asymptotic variance of $\hat{\mu}$ under $p(z)$.

If model I is correct, then $m_0(x) = m(x, \beta)$ for some $\beta$, and it may be shown (e.g., Tsiatis, 2006, Section 4.5) that all estimators for $\mu$ have influence functions of the form

$$ m_0(x) - \mu + t\, a(x) \{ y - m_0(x) \} \tag{3} $$

for arbitrary functions $a(x)$ of $x$. If model II is correct, then $\pi_0(x) = \pi(x, \alpha)$ for some $\alpha$, and all estimators for $\mu$ have influence functions of the form

$$ \frac{t y}{\pi_0(x)} + \frac{t - \pi_0(x)}{\pi_0(x)}\, h(x) - \mu \tag{4} $$

for arbitrary $h(x)$, which is well known from Robins, Rotnitzky and Zhao (1994). If model III is correct, then $m_0(x) = m(x, \beta)$ and $\pi_0(x) = \pi(x, \alpha)$ for some $\beta$ and $\alpha$, and influence functions for estimators $\hat{\mu}$ have the form

$$ m_0(x) - \mu + t\, a(x) \{ y - m_0(x) \} + \frac{t - \pi_0(x)}{\pi_0(x)}\, h(x) \tag{5} $$

for arbitrary $a(x)$ and $h(x)$. Depending on forms of $m(x, \beta)$ as a function of $\beta$ and $\pi(x, \alpha)$ as a function of $\alpha$, there will be restrictions on the forms of $a(x)$ and $h(x)$; see below.

We now consider estimators discussed by KS from the perspective of influence functions. The regression estimator $\hat{\mu}_{\text{OLS}}$ in (7) of KS comes about naturally if one assumes model I is correct. In terms of influence functions, $\hat{\mu}_{\text{OLS}}$ may be motivated by considering the influence function (3) with $a(x) = 0$, as this leads to the estimator $n^{-1} \sum_{i=1}^{n} m(x_i, \beta)$.
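The regression estimator motivated by (3) with $a(x) = 0$ can be sketched in a few lines. This is a minimal illustration, not the KS implementation: it assumes a single covariate, a straight-line working model $m(x, \beta) = \beta_0 + \beta_1 x$ fitted by least squares to the complete cases, and hypothetical helper names `fit_line` and `mu_regression`. Entries of `y` with `t == 0` are placeholders and are never used.

```python
def fit_line(xs, ys):
    """Closed-form least-squares intercept and slope for one covariate."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def mu_regression(t, x, y):
    """Average the fitted values m(x_i, beta_hat) over ALL subjects.

    beta_hat is estimated from respondents only (t_i == 1); the average
    then runs over respondents and nonrespondents alike.
    """
    obs_x = [xi for ti, xi in zip(t, x) if ti == 1]
    obs_y = [yi for ti, yi in zip(t, y) if ti == 1]
    b0, b1 = fit_line(obs_x, obs_y)
    return sum(b0 + b1 * xi for xi in x) / len(x)


# Toy data (hypothetical): y = 1 + 2x exactly, missing for the two largest x.
t = [1, 1, 1, 0, 0]
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 0.0, 0.0]  # values where t == 0 are placeholders
print(mu_regression(t, x, y))  # 5.0, versus a complete-case mean of 3.0
```

The toy data make the reliance on the outcome model visible: the estimate is correct only because the line fitted on small $x$ extrapolates correctly to the unobserved region.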
In fact, although KS do not discuss it, the “imputation estimator” $\hat{\mu}_{\text{IMP}} = n^{-1} \sum_{i=1}^{n} \{ t_i y_i + (1 - t_i) m(x_i, \beta) \}$ may be motivated by taking $a(x) = 1$ in (3). Of course, in practice, $\beta$ must be estimated. In general, (3) implies that all estimators for $\mu$ that are consistent and asymptotically normal if model I is correct must be asymptotically equivalent to an estimator of the form

$$ n^{-1} \sum_{i=1}^{n} \big[ m(x_i, \hat{\beta}) + t_i\, \tilde{a}(x_i) \{ y_i - m(x_i, \hat{\beta}) \} \big], \tag{6} $$

where $\beta$ is estimated by solving an estimating equation $\sum_{i=1}^{n} t_i A(x_i, \beta) \{ y_i - m(x_i, \beta) \} = 0$ for $A(x, \beta)$ ($p \times 1$). Because $\beta$ is estimated, the influence function of the estimator (6) with a particular $\tilde{a}(x)$ will not be exactly equal to (3) with $a(x) = \tilde{a}(x)$; instead, it may be shown that the influence function of (6) is of form (3) with $a(x)$ in (3) equal to

$$ \tilde{a}(x) - E\big[ \{ \pi_0(x) \tilde{a}(x) - 1 \}\, m_\beta^T(x, \beta_0) \big] \cdot \big[ E\{ \pi_0(x) A(x, \beta_0)\, m_\beta^T(x, \beta_0) \} \big]^{-1} \cdot A(x, \beta_0), \tag{7} $$

where $m_\beta(x, \beta)$ is the vector of partial derivatives of $m(x, \beta)$ with respect to the elements of $\beta$, and $\beta_0$ is such that $m_0(x) = m(x, \beta_0)$.

The IPW estimator $\hat{\mu}_{\text{IPW-POP}}$ in (3) of KS and its variants arise if one assumes model II. In particular, $\hat{\mu}_{\text{IPW-POP}}$ can be motivated via the influence function (4) with $h(x) = -\mu$. The estimator $\hat{\mu}_{\text{IPW-NR}}$ in (4) of KS follows from (4) with $h(x) = -E[y\{1 - \pi(x)\}] / E[\{1 - \pi(x)\}]$. In fact, if one restricts $h(x)$ in (4) to be a constant, then, using the fact that the expectation of the square of (4) is the asymptotic variance of the estimator, one may find the “best” such constant minimizing the variance as $h(x) = -E[y\{1 - \pi(x)\}/\pi(x)] / E[\{1 - \pi(x)\}/\pi(x)]$. An estimator based on this idea was given in (10) of Lunceford and Davidian (2004, page 2943).
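To make the role of $h(x)$ in (4) concrete, the sketch below contrasts two constant choices. With $h(x) = 0$, setting the sample average of (4) to zero gives the unnormalized weighted mean; with $h(x) = -\mu$, (4) reduces to $t(y - \mu)/\pi_0(x)$, and solving $\sum_i t_i (y_i - \mu)/\pi_i = 0$ gives a ratio form. Reading the ratio form as $\hat{\mu}_{\text{IPW-POP}}$ is an assumption on our part, the function names are hypothetical, and the propensities `pi` are taken as given (known or already fitted).

```python
def ipw_unnormalized(t, y, pi):
    """(4) with h(x) = 0: n^{-1} * sum_i t_i * y_i / pi_i."""
    n = len(t)
    return sum(ti * yi / p for ti, yi, p in zip(t, y, pi)) / n


def ipw_ratio(t, y, pi):
    """(4) with h(x) = -mu: solve sum_i t_i * (y_i - mu) / pi_i = 0 for mu."""
    num = sum(ti * yi / p for ti, yi, p in zip(t, y, pi))
    den = sum(ti / p for ti, p in zip(t, pi))
    return num / den


# Toy data (hypothetical): every observed response equals 2.0.
t = [1, 0, 1, 1]
y = [2.0, 0.0, 2.0, 2.0]   # value where t == 0 is a placeholder
pi = [0.5, 0.5, 0.5, 0.5]

print(ipw_unnormalized(t, y, pi))  # 3.0: weights sum to 6, not n = 4
print(ipw_ratio(t, y, pi))         # 2.0: normalizing the weights fixes this
```

The toy example shows why the normalized version is often preferred in finite samples: it is a weighted average of observed responses and so cannot leave their range.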
In general, as for model I, (4) implies that all estimators for $\mu$ that are consistent and asymptotically normal if model II is correct must be asymptotically equivalent to an estimator of the form

$$ n^{-1} \sum_{i=1}^{n} \left\{ \frac{t_i y_i}{\pi(x_i, \hat{\alpha})} + \frac{t_i - \pi(x_i, \hat{\alpha})}{\pi(x_i, \hat{\alpha})}\, \tilde{h}(x_i) \right\}, \tag{8} $$

where $\hat{\alpha}$ is estimated by solving an equation of the form $\sum_{i=1}^{n} \{ t_i - \pi(x_i, \alpha) \} B(x_i, \alpha) = 0$ for some ($s \times 1$) $B(x_i, \alpha)$, almost always maximum likelihood for binary regression. As above, because $\alpha$ is estimated, the influence function of (8) is equal to (4) with $h(x)$ equal to

$$ \tilde{h}(x) - E\big[ \pi_\alpha^T(x, \alpha_0) \{ m_0(x) + \tilde{h}(x) \} / \pi_0(x) \big] \cdot \big[ E\{ B(x, \alpha_0)\, \pi_\alpha^T(x, \alpha_0) \} \big]^{-1} \cdot B(x, \alpha_0)\, \pi_0(x), \tag{9} $$

where $\pi_\alpha(x, \alpha)$ is the vector of partial derivatives of $\pi(x, \alpha)$ with respect to the elements of $\alpha$, and $\alpha_0$ satisfies $\pi_0(x) = \pi(x, \alpha_0)$.

Doubly robust (DR) estimators are estimators that are consistent and asymptotically normal for models in $\mathcal{M}_{\mathrm{I}} \cup \mathcal{M}_{\mathrm{II}}$, that is, under the assumptions of model I or model II. When the true density $p_0(z) \in \mathcal{M}_{\mathrm{I}} \cap \mathcal{M}_{\mathrm{II}}$, then the influence function of any such DR estimator must be equal to (3) with $a(x) = 1/\pi_0(x)$ or, equivalently, equal to (4) with $h(x) = -m_0(x)$. Accordingly, when $p_0(z) \in \mathcal{M}_{\mathrm{I}} \cap \mathcal{M}_{\mathrm{II}}$, that is, both models have been specified correctly, all such DR estimators will have the same asymptotic variance. This also implies that, if both models are correctly specified, the asymptotic properties of the estimator do not depend on the methods used to estimate $\beta$ and $\alpha$.
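The stated equivalence between (3) with $a(x) = 1/\pi_0(x)$ and (4) with $h(x) = -m_0(x)$ can be checked by direct algebra. The short derivation below is a verification added for the reader and uses only the displays (3) and (4):

```latex
\begin{align*}
m_0(x) - \mu + \frac{t}{\pi_0(x)}\{ y - m_0(x) \}
  &= \frac{t y}{\pi_0(x)} - \frac{t}{\pi_0(x)}\, m_0(x) + m_0(x) - \mu \\
  &= \frac{t y}{\pi_0(x)} - \frac{t - \pi_0(x)}{\pi_0(x)}\, m_0(x) - \mu .
\end{align*}
```

The last line is (4) evaluated at $h(x) = -m_0(x)$, so the two descriptions pick out the same influence function.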
KS discuss strategies for constructing DR estimators, and they present several specific examples: $\hat{\mu}_{\text{BC-OLS}}$ in their equation (8); the estimators below (8) using POP or NR weights, which we denote as $\hat{\mu}_{\text{BC-POP}}$ and $\hat{\mu}_{\text{BC-NR}}$, respectively; the estimator $\hat{\mu}_{\text{WLS}}$ in their equation (10); $\hat{\mu}_{\pi\text{-cov}}$ in their equation (12); and a version of $\hat{\mu}_{\pi\text{-cov}}$ equal to the estimator proposed by Scharfstein, Rotnitzky and Robins (1999) and Bang and Robins (2005), which we denote as $\hat{\mu}_{\text{SRR}}$. The results for these estimators under the “Correct-Correct” scenarios ($\mathcal{M}_{\mathrm{I}} \cap \mathcal{M}_{\mathrm{II}}$) in Tables 5–8 of KS are consistent with the asymptotic properties above. We note that $\hat{\mu}_{\pi\text{-cov}}$ is not DR under $\mathcal{M}_{\mathrm{I}} \cup \mathcal{M}_{\mathrm{II}}$ because of the additional assumption that the mean of $y$ given $\pi$ must be equal to a linear combination of basis functions in $\pi$. Making this additional assumption may not be unreasonable in practice; however, strictly speaking, it takes $\hat{\mu}_{\pi\text{-cov}}$ outside the class of DR estimators discussed here, and hence we do not consider it in the remainder of this section. However, $\hat{\mu}_{\text{SRR}}$ is still in this class.

KS suggest that a characteristic distinguishing the performance of DR estimators is whether or not the estimator is within or outside the augmented inverse-probability weighted (AIPW) class. We find this distinction artificial, as all of the above estimators $\hat{\mu}_{\text{BC-OLS}}$, $\hat{\mu}_{\text{BC-POP}}$, $\hat{\mu}_{\text{BC-NR}}$, $\hat{\mu}_{\text{WLS}}$ and $\hat{\mu}_{\text{SRR}}$ can be expressed in an AIPW form.
Namely, all of these estimators are algebraically exactly of the form (8) with $\tilde{h}(x_i)$ replaced by a term $-\hat{\gamma} - m(x_i, \hat{\beta})$, where $\hat{\gamma}_{\text{BC-OLS}} = \hat{\gamma}_{\text{WLS}} = \hat{\gamma}_{\text{SRR}} = 0$,

$$ \begin{aligned} \hat{\gamma}_{\text{BC-POP}} &= \frac{n^{-1} \sum_{i=1}^{n} (t_i / \hat{\pi}_i)(y_i - \hat{m}_i)}{n^{-1} \sum_{i=1}^{n} t_i / \hat{\pi}_i} \quad \text{and} \\ \hat{\gamma}_{\text{BC-NR}} &= \frac{n^{-1} \sum_{i=1}^{n} \{ t_i (1 - \hat{\pi}_i) / \hat{\pi}_i \}(y_i - \hat{m}_i)}{n^{-1} \sum_{i=1}^{n} t_i (1 - \hat{\pi}_i) / \hat{\pi}_i}, \end{aligned} \tag{10} $$

where we write $\hat{\pi}_i = \pi(x_i, \hat{\alpha})$ and $\hat{m}_i = m(x_i, \hat{\beta})$ for brevity. For $\hat{\mu}_{\text{WLS}}$ and $\hat{\mu}_{\text{SRR}}$, this identity follows from the fact that $\sum_{i=1}^{n} (t_i / \hat{\pi}_i)(y_i - \hat{m}_i) = 0$, which for $\hat{\mu}_{\text{WLS}}$ holds because KS restrict to $m(x, \beta) = x^T \beta$, with $x$ including a constant term. Thus, we contend that issues of performance under $\mathcal{M}_{\mathrm{I}} \cup \mathcal{M}_{\mathrm{II}}$ are not linked to whether or not a DR estimator is AIPW, but, rather, are a consequence of forms of the influence functions of estimators under $\mathcal{M}_{\mathrm{I}}$ or $\mathcal{M}_{\mathrm{II}}$.

In particular, under model II, it follows that the above estimators have influence functions of the form (4) with $h(x)$ equal to (9) with $\tilde{h}(x) = -\{ \gamma^* + m(x, \beta^*) \}$, where $\gamma^*$ and $\beta^*$ are the limits in probability of $\hat{\gamma}$ and $\hat{\beta}$, respectively. Thus, features determining performance of these estimators when model II is correct are how close $\gamma^* + m(x, \beta^*)$ is to $m_0(x)$ and how $\alpha$ is estimated, where maximum likelihood is the optimal choice. In fact, this perspective reveals that, for fixed $m(x, \beta)$, using ideas similar to those in Tan (2006), the optimal choice of $\hat{\gamma}$ is as in $\hat{\gamma}_{\text{BC-NR}}$ with $t_i (1 - \hat{\pi}_i) / \hat{\pi}_i$ replaced by $t_i (1 - \hat{\pi}_i) / \hat{\pi}_i^2$.
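The algebraic claim that a bias-corrected estimator is exactly of the AIPW form (8) can be checked numerically. The sketch below does this for the POP-weighted case under one assumption that is not spelled out in this comment: that $\hat{\mu}_{\text{BC-POP}}$ equals the mean of the fitted values plus $\hat{\gamma}_{\text{BC-POP}}$ from (10). The function names and toy numbers are hypothetical, and the fitted values $\hat{\pi}_i$ and $\hat{m}_i$ are simply supplied, since the identity holds for any such values.

```python
def aipw_form(t, y, pi_hat, h_tilde):
    """Display (8): n^{-1} * sum_i [ t*y/pi + (t - pi)/pi * h_tilde ]."""
    n = len(t)
    return sum(ti * yi / p + (ti - p) / p * h
               for ti, yi, p, h in zip(t, y, pi_hat, h_tilde)) / n


def gamma_bc_pop(t, y, pi_hat, m_hat):
    """gamma_hat for the POP-weighted bias correction, as in (10)."""
    num = sum(ti / p * (yi - mi) for ti, yi, p, mi in zip(t, y, pi_hat, m_hat))
    den = sum(ti / p for ti, p in zip(t, pi_hat))
    return num / den


def mu_bc_pop(t, y, pi_hat, m_hat):
    """Assumed form: mean fitted value plus the POP-weighted mean residual."""
    return sum(m_hat) / len(m_hat) + gamma_bc_pop(t, y, pi_hat, m_hat)


# Arbitrary toy inputs (hypothetical); y where t == 0 is a placeholder.
t = [1, 0, 1, 1, 0]
y = [2.0, 0.0, 3.5, 1.0, 0.0]
pi_hat = [0.8, 0.3, 0.5, 0.6, 0.4]
m_hat = [1.5, 2.0, 3.0, 1.2, 2.5]

g = gamma_bc_pop(t, y, pi_hat, m_hat)
h_tilde = [-(g + mi) for mi in m_hat]      # h_tilde(x_i) = -gamma_hat - m_hat_i
lhs = aipw_form(t, y, pi_hat, h_tilde)
rhs = mu_bc_pop(t, y, pi_hat, m_hat)
print(abs(lhs - rhs) < 1e-9)               # the two forms agree up to rounding
```

The agreement follows because $n^{-1} \sum_i (t_i/\hat{\pi}_i)(y_i - \hat{m}_i) = \hat{\gamma}\, n^{-1} \sum_i t_i/\hat{\pi}_i$ by the definition of $\hat{\gamma}_{\text{BC-POP}}$, so the terms involving $n^{-1} \sum_i t_i/\hat{\pi}_i$ cancel and only $n^{-1} \sum_i \hat{m}_i + \hat{\gamma}$ remains.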
Similarly, under model I, the influence functions of these estimators are of the form (3) with $a(x)$ equal to (7) with $\tilde{a}(x) = \psi_1 / \pi(x, \alpha^*) + \psi_2$, where $\alpha^*$ is the limit in probability of $\hat{\alpha}$ and $\psi_1 = 1$ and $\psi_2 = 0$ for $\hat{\mu}_{\text{BC-OLS}}$, $\hat{\mu}_{\text{WLS}}$ and $\hat{\mu}_{\text{SRR}}$; $\psi_1 = 1 / E\{ \pi_0(x) / \pi(x, \alpha^*) \}$ and $\psi_2 = 0$ for $\hat{\mu}_{\text{BC-POP}}$; and $\psi_1$ and $\psi_2$ for $\hat{\mu}_{\text{BC-NR}}$ are more complicated expectations involving $\pi_0(x)$ and $\pi(x, \alpha^*)$. Thus, under model I, features determining performance of these estimators are the form of $\tilde{a}(x)$ and how $\beta$ is estimated through the choice of $A(x, \beta)$.

We may interpret some of the results in Tables 5, 6 and 8 of KS in light of these observations. Under the “$\pi$-model Correct–$y$-model Incorrect” scenario ($\mathcal{M}_{\mathrm{II}} \cap \mathcal{M}_{\mathrm{I}}^c$), $\hat{\mu}_{\text{BC-OLS}}$, $\hat{\mu}_{\text{WLS}}$ and $\hat{\mu}_{\text{SRR}}$ show some nontrivial differences in performance, which, from above, are likely attributable to differences in $m(x, \beta^*)$. Under the “$\pi$-model Incorrect–$y$-model Correct” scenario ($\mathcal{M}_{\mathrm{I}} \cap \mathcal{M}_{\mathrm{II}}^c$), all three estimators share the same $\tilde{a}(x)$ but use different methods to estimate $\beta$, so that any differences are dictated entirely by the choice of $A(x, \beta)$. The poor performance of $\hat{\mu}_{\text{SRR}}$ can be understood from this perspective: “$\beta$” for this estimator is actually $\beta$ in the model $m(x, \beta)$ used by the other two estimators concatenated by an additional element, the coefficient of $\hat{\pi}_i^{-1}$. The $A(x, \beta)$ for $\hat{\mu}_{\text{SRR}}$ thus involves a design matrix that is unstable for small $\hat{\pi}_i$, consistent with the comment of KS at the end of their Section 3.

In summary, we believe that studying the performance of estimators via their influence functions can provide useful insights. Our preceding remarks refer to large-sample performance, which depends directly on the influence function. Estimators with the same influence function can exhibit different finite-sample properties.
It may be possible via higher-order expansions to gain an understanding of some of this behavior; to the best of our knowledge, this is an open question.

BOTH MODELS INCORRECT

The developments in the previous section are relevant in $\mathcal{M}_{\mathrm{I}} \cup \mathcal{M}_{\mathrm{II}}$. Key themes of KS are performance of DR and other estimators outside this class, that is, when both the models $\pi(x, \alpha)$ and $m(x, \beta)$ are incorrectly specified, and choice of estimator under these circumstances.

One way to study performance in this situation is through simulation. KS have devised a very interesting and instructive specific simulation scenario that highlights some important features of various estimators. In particular, the KS scenario emphasizes the difficulties encountered with some of the DR estimators when $\pi(x_i, \hat{\alpha})$ is small for some $x_i$. Indeed, in our experience, poor performance of DR and IPW estimators in practice can result from a few small $\pi(x_i, \hat{\alpha})$. When there are small $\pi(x_i, \hat{\alpha})$, as noted by KS, responses are not observed for some portion of the $x$ space. Consequently, estimators like $\hat{\mu}_{\text{OLS}}$ rely on extrapolation into that part of the $x$ space. KS have constructed a scenario where failure to observe $y$ in a portion of the $x$ space can wreak havoc on some estimators that make use of the $\pi(x_i, \hat{\alpha})$ but has minimal impact on the quality of extrapolations for these $x$ based on $m(x, \hat{\beta})$. One could equally well build a scenario where the $x$ for which $y$ is unobserved are highly influential for the regression $m(x, \beta)$ and hence could result in deleterious performance of $\hat{\mu}_{\text{OLS}}$. We thus reiterate the remark of KS that, although simulations can be illuminating, they cannot yield broadly applicable conclusions.
Given this, we offer some thoughts on other strategies for deriving estimators that may have some robustness properties under the foregoing conditions, that is, offer good performance outside $\mathcal{M}_{\mathrm{I}} \cup \mathcal{M}_{\mathrm{II}}$.

One approach may be to search outside the class of DR estimators valid under $\mathcal{M}_{\mathrm{I}} \cup \mathcal{M}_{\mathrm{II}}$. For example, as suggested by the simulations of KS, estimators in the spirit of $\hat{\mu}_{\pi\text{-cov}}$, which impose additional assumptions rendering them DR in the strict sense only in a subset of $\mathcal{M}_{\mathrm{I}} \cup \mathcal{M}_{\mathrm{II}}$, may compensate for this restriction by yielding more robust performance outside $\mathcal{M}_{\mathrm{I}} \cup \mathcal{M}_{\mathrm{II}}$; further study along these lines would be interesting. An alternative tactic for searching outside $\mathcal{M}_{\mathrm{I}} \cup \mathcal{M}_{\mathrm{II}}$ may be to consider the form of influence functions (5) for estimators valid under $\mathcal{M}_{\mathrm{I}} \cap \mathcal{M}_{\mathrm{II}}$. For instance, a “hybrid” estimator of the form

$$ n^{-1} \sum_{i=1}^{n} \left[ m(x_i, \hat{\beta})\, I\{ \pi(x_i, \hat{\alpha}) < \delta \} + \left\{ \frac{t_i y_i}{\pi(x_i, \hat{\alpha})} + \frac{t_i - \pi(x_i, \hat{\alpha})}{\pi(x_i, \hat{\alpha})}\, \tilde{h}(x_i) \right\} I\{ \pi(x_i, \hat{\alpha}) \geq \delta \} \right], $$

for $\delta$ small, may take advantage of the desirable properties of both $\hat{\mu}_{\text{OLS}}$ and DR estimators.

A second possible strategy for identifying robust estimators arises from the following observation. Consider the estimator

$$ n^{-1} \sum_{i=1}^{n} \left\{ \frac{t_i y_i}{\pi(x_i)} - \frac{t_i - \pi(x_i)}{\pi(x_i)}\, m(x_i, \hat{\beta}) \right\}. \tag{11} $$

If $\pi(x_i) = \pi(x_i, \hat{\alpha})$, then (11) yields one form of a DR estimator. If $\pi(x_i) \equiv 1$, then (11) results in the imputation estimator. If $\pi(x_i) = \infty$, (11) reduces to $\hat{\mu}_{\text{OLS}}$. This suggests that it may be possible to develop estimators based on alternative choices of $\pi(x_i)$ that may have good robustness properties. For example, a method for obtaining estimators $\pi(x_i, \hat{\alpha})$ that shrinks these toward a common value may prove fruitful. The suggestion of KS to move away from logistic regression models for $\pi(x_i, \alpha)$ is in a similar spirit.
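The special cases of (11) and the hybrid estimator can be sketched as follows. This is an illustration under stated assumptions: fitted values $\hat{m}_i$ and $\hat{\pi}_i$ are supplied directly, the function names are hypothetical, a very large constant stands in for $\pi(x_i) = \infty$, and the hybrid uses the illustrative choice $\tilde{h}(x_i) = -\hat{m}_i$, which the text leaves open.

```python
def estimator_11(t, y, m_hat, pi):
    """Display (11): n^{-1} * sum_i [ t*y/pi - (t - pi)/pi * m_hat ]."""
    n = len(t)
    return sum(ti * yi / p - (ti - p) / p * mi
               for ti, yi, mi, p in zip(t, y, m_hat, pi)) / n


def hybrid(t, y, m_hat, pi_hat, delta):
    """Hybrid form with h_tilde(x_i) = -m_hat_i (an illustrative choice).

    Uses the fitted value alone where pi_hat_i < delta, and the
    weighted-plus-augmentation term where pi_hat_i >= delta.
    """
    total = 0.0
    for ti, yi, mi, p in zip(t, y, m_hat, pi_hat):
        if p < delta:
            total += mi
        else:
            total += ti * yi / p - (ti - p) / p * mi
    return total / len(t)


# Arbitrary toy inputs (hypothetical); y where t == 0 is a placeholder.
t = [1, 0, 1, 1, 0]
y = [2.0, 0.0, 3.5, 1.0, 0.0]
m_hat = [1.5, 2.0, 3.0, 1.2, 2.5]
pi_hat = [0.8, 0.3, 0.5, 0.6, 0.4]
n = len(t)

imputation = sum(ti * yi + (1 - ti) * mi for ti, yi, mi in zip(t, y, m_hat)) / n
mean_fit = sum(m_hat) / n

print(abs(estimator_11(t, y, m_hat, [1.0] * n) - imputation) < 1e-9)   # pi = 1
print(abs(estimator_11(t, y, m_hat, [1e12] * n) - mean_fit) < 1e-9)    # pi huge
print(abs(hybrid(t, y, m_hat, pi_hat, 0.0)
          - estimator_11(t, y, m_hat, pi_hat)) < 1e-9)                 # delta = 0
```

Varying `delta` between 0 and 1 moves the hybrid from the DR form toward the pure regression average, which is one way to see how it trades reliance on small fitted propensities against reliance on extrapolation.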
Finally, we note that yet another approach to developing estimators would be to start with the premise that one make no parametric assumption on the forms of $E(y \mid x)$ and $E(t \mid x)$ beyond some mild smoothness conditions. Here, it is likely that first-order asymptotic theory, as in the previous section, may no longer be applicable. It may be necessary to use higher-order asymptotic theory to make progress in this direction; see, for example, Robins and van der Vaart (2006).

CONCLUDING REMARKS

We again compliment the authors for their thoughtful and insightful article, and we appreciate the opportunity to offer our perspectives on this important problem. We look forward to new methodological developments that may overcome some of the challenges brought into focus by KS in their article.

ACKNOWLEDGMENT

This research was supported in part by Grants R01-CA051962, R01-CA085848 and R37-AI031789 from the National Institutes of Health.

REFERENCES

Bang, H. and Robins, J. M. (2005). Doubly robust estimation in missing data and causal inference models. Biometrics 61 962–972. MR2216189

Davidian, M., Tsiatis, A. A. and Leon, S. (2005). Semiparametric estimation of treatment effect in a pretest–posttest study with missing data. Statist. Sci. 20 261–301. MR2189002

Lunceford, J. K. and Davidian, M. (2004). Stratification and weighting via the propensity score in estimation of causal treatment effects: A comparative study. Statistics in Medicine 23 2937–2960.

Molenberghs, G. (2005). Discussion of “Semiparametric estimation of treatment effect in a pretest–posttest study with missing data,” by M. Davidian, A. A. Tsiatis and S. Leon. Statist. Sci. 20 289–292. MR2189002

Robins, J. M., Rotnitzky, A. and Zhao, L. P. (1994). Estimation of regression coefficients when some regressors are not always observed. J. Amer. Statist. Assoc. 89 846–866. MR1294730

Robins, J. and van der Vaart, A. (2006). Adaptive nonparametric confidence sets. Ann. Statist. 34 229–253. MR2275241

Scharfstein, D. O., Rotnitzky, A. and Robins, J. M. (1999). Rejoinder to “Adjusting for nonignorable drop-out using semiparametric nonresponse models.” J. Amer. Statist. Assoc. 94 1135–1146. MR1731478

Tan, Z. (2006). A distributional approach for causal inference using propensity scores. J. Amer. Statist. Assoc. 101 1619–1637. MR2279484

Tsiatis, A. A. (2006). Semiparametric Theory and Missing Data. Springer, New York. MR2233926
