On the definition of a confounder

The Annals of Statistics
2013, Vol. 41, No. 1, 196–220
DOI: 10.1214/12-AOS1058
© Institute of Mathematical Statistics, 2013

ON THE DEFINITION OF A CONFOUNDER¹

By Tyler J. VanderWeele and Ilya Shpitser
Harvard University

The causal inference literature has provided a clear formal definition of confounding expressed in terms of counterfactual independence. The literature has not, however, come to any consensus on a formal definition of a confounder, as it has given priority to the concept of confounding over that of a confounder. We consider a number of candidate definitions arising from various more informal statements made in the literature. We consider the properties satisfied by each candidate definition, principally focusing on (i) whether under the candidate definition control for all “confounders” suffices to control for “confounding” and (ii) whether each confounder in some context helps eliminate or reduce confounding bias. Several of the candidate definitions do not have these two properties. Only one candidate definition of those considered satisfies both properties. We propose that a “confounder” be defined as a pre-exposure covariate $C$ for which there exists a set of other covariates $X$ such that the effect of the exposure on the outcome is unconfounded conditional on $(X, C)$ but such that for no proper subset of $(X, C)$ is the effect of the exposure on the outcome unconfounded given the subset. We also provide a conditional analogue of the above definition; and we propose that a variable that helps reduce bias but not eliminate bias be referred to as a “surrogate confounder.” These definitions are closely related to those given by Robins and Morgenstern [Comput. Math. Appl. 14 (1987) 869–916]. The implications that hold among the various candidate definitions are discussed.

1. Introduction.
Statisticians and epidemiologists had traditionally conceived of a confounder as a pre-exposure variable that was associated with exposure and associated also with the outcome conditional on the exposure, possibly conditional also on other covariates [Miettinen (1974)]. The developments in causal inference over the past two decades have made clear that this definition of a “confounder” is inadequate: there can be pre-exposure variables associated with the exposure and the outcome, the control of which introduces rather than eliminates bias [Greenland, Pearl and Robins (1999), Glymour and Greenland (2008), Pearl (2009)]. The literature has moved away from formal language about “confounders” and instead places the conceptual emphasis on “confounding.” See Morabia (2011) for historical discussion of this point. The causal inference literature has provided a formal definition of “confounding” in terms of dependence of counterfactual outcomes and exposure, possibly conditional on covariates. The absence of confounding (independence of the counterfactual outcomes and the exposure) has been taken as the foundational assumption for drawing causal inferences.

Received December 2011; revised September 2012.
¹Funded by the National Institutes of Health, USA.
AMS 2000 subject classifications. Primary 62A01; secondary 68T30, 62J99.
Key words and phrases. Causal inference, causal diagrams, counterfactual, confounder, minimal sufficiency.
This is an electronic reprint of the original article published by the Institute of Mathematical Statistics in The Annals of Statistics, 2013, Vol. 41, No. 1, 196–220. This reprint differs from the original in pagination and typographic detail.

Such absence of confounding is alternatively referred to as “ignorability” or “ignorable treatment assignment” [Rubin (1978)], “exchangeability” [Greenland and Robins (1986)], “no unmeasured confounding” [Robins (1992)], “selection on observables” [Barnow, Cain and Goldberger (1980), Imbens (2004)] or “exogeneity” [Imbens (2004)]. Today, at least within the formal methodological literature on causality, language concerning “confounders” is generally used only informally, if at all. The priority that has been given to “confounding” over “confounders” has arguably brought clarity and precision to the field. Nevertheless, among practicing statisticians and epidemiologists, language concerning both “confounders” and “confounding” is common. This raises the question as to whether a formal definition of a “confounder” can also be given within the counterfactual framework that coheres with how the word seems to be used in practice.

In this paper we will consider various definitions of a confounder proposed either formally or informally by a number of prominent statisticians and epidemiologists. For each potential definition we will consider the properties satisfied by the candidate definition. Specifically, we state and prove a number of propositions showing whether under each candidate definition (i) control for all “confounders” suffices to control for “confounding” and (ii) whether each confounder in some context helps eliminate or reduce confounding bias. As we will see below, only one candidate definition of those considered satisfies both properties. We consider also the implications that hold between the various definitions themselves.

2. Notation and framework.
We let $A$ denote an exposure, $Y$ the outcome, and we will use $C$, $S$ and $X$ to denote particular pre-exposure covariates or sets of covariates (that may or may not be measured). As noted in the penultimate section of the paper, the restriction to pre-exposure covariates could, in the context of causal diagrams [Pearl (1995, 2009)], be replaced by a restriction to nondescendants of exposure $A$. Within the counterfactual or potential outcomes framework [Neyman (1923), Rubin (1978)], we let $Y_a$ denote the potential outcome for $Y$ if exposure $A$ were set, possibly contrary to fact, to the value $a$. If the exposure is binary, the average causal effect is given by $E(Y_1) - E(Y_0)$. Note that the potential outcomes notation $Y_a$ presupposes that an individual's potential outcome does not depend on the exposures of other individuals. This assumption is sometimes referred to as SUTVA, the stable unit treatment value assumption [Rubin (1990)] or as a no-interference assumption [Cox (1958)].

We use the notation $E \perp\!\!\!\perp F \mid G$ to denote that $E$ is independent of $F$ conditional on $G$. For exposure $A$ and outcome $Y$, we say there is no confounding conditional on $S$ (or that the effect of $A$ on $Y$ is unconfounded given $S$) if $Y_a \perp\!\!\!\perp A \mid S$. We will refer to any such $S$ as a sufficient set or a sufficient adjustment set. If the effect of $A$ on $Y$ is unconfounded given $S$, then the causal effect can be consistently estimated by
$$E(Y_1) - E(Y_0) = \sum_s \{E(Y \mid A = 1, s) - E(Y \mid A = 0, s)\}\,\mathrm{pr}(s)$$
[Rosenbaum and Rubin (1983)]. We will say that $S = (S_1, \ldots, S_n)$ constitutes a minimally sufficient adjustment set if $Y_a \perp\!\!\!\perp A \mid S$ but there is no proper subset $T$ of $S$ such that $Y_a \perp\!\!\!\perp A \mid T$, where “proper subset” here is understood as $T$ being a strict subset of the coordinates of $S = (S_1, \ldots, S_n)$.
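The notion of a minimally sufficient adjustment set can be made concrete with a short enumeration. The Python sketch below is illustrative only: `is_sufficient` is a hypothetical oracle standing in for the condition $Y_a \perp\!\!\!\perp A \mid S$, which in practice would have to come from a causal diagram or from subject-matter assumptions, and the toy rule used here (a set is sufficient exactly when it contains `C1` or `C2`) is an assumption made for the example, not a result from the paper.

```python
from itertools import combinations


def minimally_sufficient_sets(covariates, is_sufficient):
    """Return every sufficient set that has no sufficient proper subset.

    Subsets are visited in order of increasing size, so any sufficient
    proper subset of a candidate already contains a minimal set that was
    recorded earlier.
    """
    minimal = []
    for size in range(len(covariates) + 1):
        for subset in combinations(covariates, size):
            candidate = frozenset(subset)
            has_smaller = any(m < candidate for m in minimal)
            if is_sufficient(candidate) and not has_smaller:
                minimal.append(candidate)
    return minimal


# Hypothetical sufficiency oracle (an assumption for illustration):
# adjusting for C1 or for C2 removes confounding, C3 is irrelevant.
def toy_is_sufficient(s):
    return "C1" in s or "C2" in s


minimal_sets = minimally_sufficient_sets(["C1", "C2", "C3"], toy_is_sufficient)
print([sorted(s) for s in minimal_sets])
```

Under the assumed rule the enumeration keeps only the two singletons; larger sufficient sets such as `{C1, C3}` are dropped because each contains a sufficient proper subset.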
Some of the candidate definitions of a confounder below define “confounder” in terms of “confounding” via reference to “sufficient adjustment sets” or “minimally sufficient adjustment sets.” Such definitions give conceptual priority to “confounding,” as has generally been done in the causal inference literature [Greenland and Robins (1986), Greenland and Morgenstern (2001), Hernán (2008)]. Often after formal definitions of “confounding” are given, a “confounder” is defined as a derivative and sometimes informal concept. For example, in papers by Greenland, Pearl and Robins (1999) and Greenland and Morgenstern (2001), formal definitions are given for “confounding” and then a “confounder” is simply described as a variable that is in some sense “responsible” [Greenland, Robins and Pearl (1999), page 33] for confounding. Although priority arguably has been and should be given to the concept of “confounding” over “confounder,” applied researchers will often use the word “confounder” to refer to a single variable that is perhaps a member of a sufficient adjustment set but does not by itself constitute a sufficient adjustment set, and this raises the question of whether this use of “confounder” can be given a coherent definition within the counterfactual framework.

Most of the definitions and properties we discuss make reference only to counterfactual outcomes. However, one of the definitions and several propositions make reference to causal diagrams. We will thus restrict attention in this paper to causal diagrams. We review concepts and definitions for causal diagrams in the Appendix; the reader can also consult Pearl (1995, 2009). For expository purposes we follow Pearl (1995), but the results in the paper are equally applicable to all of the alternative graphical causal models considered, for example, by Robins and Richardson (2010).
In short, following Pearl (1995), a causal diagram is a very general data generating process corresponding to a set of nonparametric structural equations where each variable $X_i$ is given by its nonparametric structural equation $X_i = f_i(pa_i, \varepsilon_i)$, where $pa_i$ are the parents of $X_i$ on the graph and the $\varepsilon_i$ are mutually independent such that the structural equations encode one-step ahead counterfactual relationships among the variables with other counterfactuals given by recursive substitution [Pearl (1995, 2009)]. The assumption of “faithfulness” is said to be satisfied if all of the conditional independence relationships among the variables are implied by the structure of the graph; see the Appendix for further details. A backdoor path from $A$ to $Y$ is a path to $Y$ which begins with an edge into $A$. Pearl (1995) showed that if a set of pre-exposure covariates $S$ blocks all backdoor paths from $A$ to $Y$, then the effect of $A$ on $Y$ is unconfounded given $S$.

The definitions given below will be stated formally in terms of potential outcomes and causal diagrams. It is assumed that there is an underlying causal diagram which may contain both measured and unmeasured variables; all variables considered in the definitions are variables on the diagram. Whether a variable satisfies the criteria of a particular definition will be relative to the causal diagram. In Section 6 we will consider settings with multiple causal diagrams where one diagram may have variables absent on another.

3. Candidate definitions for a confounder.
Here we give a number of candidate definitions of a confounder motivated by statements made in the methodological literature.
We will cite specific statements from the methodologic literature; we do not necessarily believe these statements were intended as formal definitions of a “confounder” by the authors cited. We simply use these statements to motivate the candidate definitions. As noted above, we believe statements about “confounders,” as opposed to “confounding,” have generally been used only informally and intuitively.

As already noted, the traditional conception of a confounder in statistics and epidemiology has been a variable associated with both the treatment and the outcome. Miettinen (1974) notes that whether such associations hold will depend on what other variables are controlled for in an analysis. This motivates our first candidate definition for a confounder.

Definition 1. A pre-exposure covariate $C$ is a confounder for the effect of $A$ on $Y$ if there exists a set of pre-exposure covariates $X$ such that $C \not\perp\!\!\!\perp A \mid X$ and $C \not\perp\!\!\!\perp Y \mid (A, X)$.

Definition 1 is essentially a generalization of the traditional conceptualization of a confounder.

Pearl (1995) showed that if a set of pre-exposure covariates $X$ blocks all backdoor paths from $A$ to $Y$, then the effect of $A$ on $Y$ is unconfounded given $X$. Hernán (2008) accordingly speaks of a confounder as a variable that “can be used to block a backdoor path between exposure and outcome” (page 355). A similar definition of a confounder is given in Greenland and Pearl [(2007), page 152] and in Glymour and Greenland [(2008), page 193]. This motivates a second candidate definition.

Definition 2. A pre-exposure covariate $C$ is a confounder for the effect of $A$ on $Y$ if it blocks a backdoor path from $A$ to $Y$.
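The association conditions in Definition 1 can be probed numerically. The Python sketch below assumes a hypothetical M-shaped diagram, $A \leftarrow C_1 \rightarrow C_3 \leftarrow C_2 \rightarrow Y$ together with $A \rightarrow Y$, and made-up probabilities; neither the structure nor the numbers are taken from the paper. It computes by exact enumeration whether $C_3$ is associated with $A$ and with $Y$ given $A$ (the conditions of Definition 1 with $X = \emptyset$), and compares the crude contrast with the contrast standardized over $C_3$.

```python
from itertools import product


def bern(p, v):
    """Probability that a Bernoulli(p) variable takes the value v (0 or 1)."""
    return p if v == 1 else 1.0 - p


C1, C2, C3, A, Y = range(5)  # positions of the variables in each outcome tuple

# Assumed structural probabilities (illustrative values only).
joint = {}
for c1, c2, c3, a, y in product((0, 1), repeat=5):
    joint[(c1, c2, c3, a, y)] = (
        bern(0.5, c1)
        * bern(0.5, c2)
        * bern(0.1 + 0.4 * c1 + 0.4 * c2, c3)  # C3 is a collider of C1 and C2
        * bern(0.2 + 0.6 * c1, a)              # A depends on C1 only
        * bern(0.2 + 0.3 * a + 0.4 * c2, y)    # Y depends on A and C2 only
    )


def prob(cond):
    """P(every variable index in cond takes its given value)."""
    return sum(p for k, p in joint.items() if all(k[i] == v for i, v in cond.items()))


def mean_y(cond):
    """E(Y | cond) for binary Y."""
    return prob({**cond, Y: 1}) / prob(cond)


def standardized_effect(adjust):
    """sum_s {E(Y | A=1, s) - E(Y | A=0, s)} pr(s) over the covariates in adjust."""
    total = 0.0
    for values in product((0, 1), repeat=len(adjust)):
        s = dict(zip(adjust, values))
        total += (mean_y({**s, A: 1}) - mean_y({**s, A: 0})) * prob(s)
    return total


true_effect = 0.3  # E(Y_1) - E(Y_0) implied by the assumed equation for Y

# Definition 1 with X empty: C3 associated with A, and with Y given A.
assoc_c3_a = prob({C3: 1, A: 1}) / prob({C3: 1}) - prob({C3: 0, A: 1}) / prob({C3: 0})
assoc_c3_y = mean_y({A: 1, C3: 1}) - mean_y({A: 1, C3: 0})

crude = standardized_effect([])
adjusted_c3 = standardized_effect([C3])
print(assoc_c3_a, assoc_c3_y, crude, adjusted_c3)
```

In this assumed setup both association measures are nonzero, the crude contrast equals the true effect of 0.3, and the contrast standardized over $C_3$ falls below it, so a covariate can meet Definition 1 while adjustment for it introduces bias.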
The second definition is perhaps one that would arise most naturally within the context of causal diagrams; the definition itself of course presupposes a framework of causal diagrams or variants thereof [Spirtes, Glymour and Scheines (1993), Dawid (2002)].

Pearl (2009) speaks of a confounder as “a variable that is a member of every sufficient [adjustment] set” (page 195), that is, control for it must be necessary. Likewise, Robins and Greenland (1986) write, “We will call a covariate a confounder if estimators which are not adjusted for the covariate are biased” (page 393) and Hernán (2008) speaks of a confounder as “any variable that is necessary to eliminate the bias in the analysis” (page 357). Note that a variable is a member of every sufficient adjustment set if and only if it is a member of every minimal sufficient adjustment set. This motivates our third candidate definition.

Definition 3. A pre-exposure covariate $C$ is a confounder for the effect of $A$ on $Y$ if it is a member of every minimally sufficient adjustment set.

Definition 3 captures the notion that controlling for a confounder might be necessary to eliminate bias. The definition makes reference to “every minimally sufficient adjustment set;” this will be relative to a particular causal diagram, a point to which we will return below.

Kleinbaum, Kupper and Morgenstern (1982), in a textbook on epidemiologic research, gave as a definition of a “confounder” a variable that is “a member of a sufficient confounder group” where a sufficient confounder group is defined as “a minimal set of one or more risk factors whose simultaneous control in the analysis will correct for joint confounding in the estimation of the effect of interest” (page 276). Kleinbaum, Kupper and Morgenstern (1982), however, define “confounding” in terms of association rather than counterfactual independence.
As a variant of the Kleinbaum, Kupper and Morgenstern proposal, we could retain the definition “a member of a minimally sufficient adjustment set” but use the counterfactual definition of “confounding.” This motivates the fourth candidate definition.

Definition 4. A pre-exposure covariate $C$ is a confounder for the effect of $A$ on $Y$ if it is a member of some minimally sufficient adjustment set.

Definition 4 can be restated as follows: a pre-exposure covariate $C$ is a confounder for the effect of $A$ on $Y$ if there exists a set of pre-exposure covariates $X$ (possibly empty) such that $Y_a \perp\!\!\!\perp A \mid (X, C)$ but there is no proper subset $T$ of $(X, C)$ such that $Y_a \perp\!\!\!\perp A \mid T$. Robins and Morgenstern (1987) and Dawid (2002) likewise conceive of a confounder in terms of the presence or absence of confounding in such a way that coincides with Definition 4 when there is a single confounder; when there are multiple sets that are sufficient or sets that are sufficient but not minimally sufficient, it is not clear how the definition of Dawid (2002) generalizes; the definitions of Robins and Morgenstern (1987) can be adapted to coincide with Definition 4. Robins and Morgenstern [(1987), Section 2H] say that $C$ is a confounder conditional on $F$ if causal effects are computable given data on $C$ and $F$, but not on $F$ alone. In the framework of Robins and Morgenstern, if one were to take as the (unconditional) definition of a confounder that “there exists some set $F$ such that $C$ is a confounder conditional on $F$ [in the sense of Robins and Morgenstern (1987), Section 2H],” then this would coincide with Definition 4.

Miettinen and Cook (1981) conceive of a confounder as any variable that is helpful in reducing bias. Hernán (2008) likewise speaks of a confounder as “any variable that can be used to reduce [confounding] bias” (page 355).
Geng, Guo and Fung (2002) use a similar definition for confounding. As noted by other authors [Greenland and Morgenstern (2001), Hernán (2008)], whether a variable is helpful in reducing bias will depend on what other variables are being conditioned on in the analysis; a confounder should be helpful for reducing bias in some context. This motivates our fifth definition.

Definition 5. A pre-exposure covariate $C$ is a confounder for the effect of $A$ on $Y$ if there exists a set of pre-exposure covariates $X$ such that
$$\Bigl|\sum_{x,c}\{E(Y \mid A = 1, x, c) - E(Y \mid A = 0, x, c)\}\,\mathrm{pr}(x, c) - \{E(Y_1) - E(Y_0)\}\Bigr| < \Bigl|\sum_{x}\{E(Y \mid A = 1, x) - E(Y \mid A = 0, x)\}\,\mathrm{pr}(x) - \{E(Y_1) - E(Y_0)\}\Bigr|.$$

Definition 5 captures the notion that controlling for $C$ along with $X$ results in lower bias in the estimate of the causal effect than controlling for $X$ alone. A number of variants of Definition 5 could also be considered. Geng, Guo and Fung (2002), for example, considered the analogous definition for the effect of the exposure on the exposed rather than the overall effect of the exposure on the population; one could likewise consider the analogue of Definition 5 for effects conditional on $X$ rather than standardized over $X$ or, alternatively, for different measures of effect, for example, risk ratios or odds ratios rather than causal effects on the difference scale. Definition 5, unlike other definitions, is inherently scale-dependent. Thus, under Definition 5, a variable $C$ might be a confounder for $Y$ but not for $\log(Y)$ or vice versa. This is an important limitation of Definition 5. Note, however, that some authors also consider “confounding” to be scale-dependent [Greenland and Robins (1986, 2009), Greenland and Morgenstern (2001)] and use “ignorability” to refer to the notion of unconfoundedness in the distribution of counterfactuals as given above.
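The inequality in Definition 5 can be evaluated directly once a full discrete distribution is specified. The Python sketch below uses the binary distribution for $(C_1, C_2, A, Y)$ that the paper states in its proof of Proposition 2; the function names (`standardized_effect`, `bias`, `reduces_bias`) and the numerical tolerance are illustrative choices, not part of the paper.

```python
from itertools import product


def bern(p, v):
    """Probability that a Bernoulli(p) variable takes the value v (0 or 1)."""
    return p if v == 1 else 1.0 - p


C1, C2, A, Y = range(4)  # positions of the variables in each outcome tuple

# Distribution stated in the paper's proof of Proposition 2.
joint = {}
for c1, c2, a, y in product((0, 1), repeat=4):
    joint[(c1, c2, a, y)] = (
        bern(1 / 2, c1)
        * bern(1 / 5 + 3 * c1 / 5, c2)
        * bern(1 / 10 + 3 * c1 / 5 + c2 / 10, a)
        * bern(1 / 2 + (1 / 2) * (a - 1 / 2) * c1, y)
    )

TRUE_EFFECT = 0.25  # E(Y_1) - E(Y_0), as reported in the paper


def prob(cond):
    """P(every variable index in cond takes its given value)."""
    return sum(p for k, p in joint.items() if all(k[i] == v for i, v in cond.items()))


def mean_y(cond):
    """E(Y | cond) for binary Y."""
    return prob({**cond, Y: 1}) / prob(cond)


def standardized_effect(adjust):
    """sum_s {E(Y | A=1, s) - E(Y | A=0, s)} pr(s) over the covariates in adjust."""
    total = 0.0
    for values in product((0, 1), repeat=len(adjust)):
        s = dict(zip(adjust, values))
        total += (mean_y({**s, A: 1}) - mean_y({**s, A: 0})) * prob(s)
    return total


def bias(adjust):
    """Absolute difference between the standardized contrast and the true effect."""
    return abs(standardized_effect(adjust) - TRUE_EFFECT)


def reduces_bias(c, x, tol=1e-12):
    """Inequality of Definition 5 for covariate c and set x (tol guards rounding)."""
    return bias(list(x) + [c]) < bias(x) - tol


print(standardized_effect([]), standardized_effect([C2]))
print(reduces_bias(C1, []), reduces_bias(C2, []), reduces_bias(C2, [C1]))
```

The crude and $C_2$-adjusted contrasts come out at approximately 0.2667 and 0.2693, matching the 0.266 and 0.269 reported in the paper up to rounding. With these numbers $C_1$ satisfies the inequality for $X = \emptyset$, whereas $C_2$ satisfies it for neither $X = \emptyset$ nor $X = C_1$.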
Confounders have also sometimes been defined in terms of empirical collapsibility [Miettinen (1976), Breslow and Day (1980)], that is, if one obtains the same estimate with or without adjustment for a variable, then it is not a confounder. In the applied literature the approach is sometimes encapsulated in the “10 percent rule,” that is, discard a covariate if adjustment for it does not change an estimate by more than 10 percent. It is well documented in the literature that collapsibility-based definitions do not work for all effect measures: for measures such as the odds ratio or hazard ratio, marginal and conditional effects may differ even in the absence of confounding [Greenland, Robins and Pearl (1999)]. Such effect measures are sometimes referred to as noncollapsible. However, for at least the risk difference scale (or the risk ratio scale) a collapsibility-based definition of a confounder could be entertained and for completeness we consider it also here. Such a collapsibility-based definition could be formalized as follows.

Definition 6. A pre-exposure covariate $C$ is a confounder for the effect of $A$ on $Y$ if there exists a set of pre-exposure covariates $X$ such that
$$\sum_{x,c}\{E(Y \mid A = 1, x, c) - E(Y \mid A = 0, x, c)\}\,\mathrm{pr}(x, c) \neq \sum_{x}\{E(Y \mid A = 1, x) - E(Y \mid A = 0, x)\}\,\mathrm{pr}(x).$$

Definition 6, like Definition 5, is scale-dependent.

Although not the focus of the present paper, in the Appendix we give some further remarks on the possibility of empirical testing for each of Definitions 1–6 and for confounding and nonconfounding more generally. However, for the most part, notions of confounding and confounders, under these six definitions, are not empirically testable without further experimental data or strong assumptions.

4. Properties of a confounder.
Language about “confounders” occurs of course not simply in methodologic work but in substantive statistical and epidemiologic research. In the design and analysis of observational studies in the applied literature the task of controlling for “confounding” is often construed as that of collecting data on and controlling for all “confounders.” In this section we propose that when language about “confounders” is generally used in statistics and epidemiology, two things are implicitly presupposed: first, that if one were to control for all “confounders,” then this would suffice to control for “confounding” and, second, that control for a “confounder” will in some sense help to reduce or eliminate confounding bias. We would propose that if a formal definition is to be given for a “confounder,” it should in some sense satisfy these two properties. If it does not, it arguably does not cohere with what is typically presupposed when language about “confounders” is used in practice. We give a formalization of these two properties and in the following section we will discuss which of these two properties are satisfied by each of the candidate definitions of the previous section.

We could formalize the first property as follows.

Property 1. If $S$ consists of the set of all confounders for the effect of $A$ on $Y$, then there is no confounding of the effect of $A$ on $Y$ conditional on $S$, that is, $Y_a \perp\!\!\!\perp A \mid S$.

Property 1 makes reference to “all confounders;” to make reference to all such variables, the domain of the variables considered needs to be specified. The domain here will be all pre-exposure variables on a particular causal diagram that qualify as confounders according to whatever definition is in view. See Section 6 for some extensions.

The second property is that control for a confounder should help either reduce or eliminate bias.
The reduction and the elimination of bias are not equivalent and, thus, we will formally give two alternative properties, 2A and 2B.

Property 2A. If $C$ is a confounder for the effect of $A$ on $Y$, then there exists a set of pre-exposure covariates $X$ (possibly empty) such that $Y_a \perp\!\!\!\perp A \mid (X, C)$ but $Y_a \not\perp\!\!\!\perp A \mid X$.

Property 2B. If $C$ is a confounder for the effect of $A$ on $Y$, then there exists a set of pre-exposure covariates $X$ (possibly empty) such that
$$\Bigl|\sum_{x,c}\{E(Y \mid A = 1, x, c) - E(Y \mid A = 0, x, c)\}\,\mathrm{pr}(x, c) - \{E(Y_1) - E(Y_0)\}\Bigr| < \Bigl|\sum_{x}\{E(Y \mid A = 1, x) - E(Y \mid A = 0, x)\}\,\mathrm{pr}(x) - \{E(Y_1) - E(Y_0)\}\Bigr|.$$

Property 2A captures the notion that in some context, that is, conditional on $X$, the covariate $C$ helps eliminate bias. Property 2B captures the notion that in some context, that is, conditional on $X$, the covariate $C$ helps reduce bias. Note that Property 2B, like Definition 5, is inherently scale-dependent and in this sense perhaps less fundamental than Property 2A. For now we simply propose that for a candidate definition of a confounder to adequately capture the intuitive sense in which the word is used, it should satisfy Property 1 and should also satisfy either Property 2A or 2B. It would be peculiar if a confounder were defined in a way that it did not satisfy these two properties. In the next section we consider whether each of the candidate definitions, Definitions 1–6, satisfy Properties 1, 2A and 2B. Of course, one possible outcome of this exercise is that none of the candidate definitions satisfy Property 1 and either Properties 2A or 2B (or even that no candidate definition could). However, as we will see in the next section, this turns out not to be the case.

Fig. 1. Definition 1 does not satisfy Property 2A or 2B.

5. Properties of the candidate definitions.
Definition 1 was a generalization of the traditional epidemiologic conception of a confounder as a variable associated with exposure and outcome. For this definition we have the following result.

Proposition 1. Under faithfulness, for every causal diagram, Definition 1 satisfies Property 1. Definition 1 does not satisfy Properties 2A or 2B.

Proof. We first show that Definition 1 satisfies Property 1 in faithful models. Let $G^* = G_{\mathrm{Nd}(A) \cup \mathrm{An}(Y)}$ be the subgraph of $G$ that has only the nodes in $\mathrm{Nd}(A)$ or $\mathrm{An}(Y)$; see the Appendix. Let $\mathrm{Pa}^*$ be the subset of $\mathrm{Pa}(A)$ in $G^*$ such that every element $P \in \mathrm{Pa}^*$ has some path in $G^*$ to $Y$ not through $A$. Since we consider faithful models, we can use d-connectedness to represent dependence. First we note that every element in $\mathrm{Pa}^*$ satisfies Definition 1. Indeed, any element of $\mathrm{Pa}(A)$ is dependent on $A$ conditioned on any set. For any member of $\mathrm{Pa}^*$, we fix some path $\pi$ to $Y$ (not through $A$). We are now free to pick any set $X$ to make this path d-connected (e.g., we can pick the smallest $X$ that opens all colliders in $\pi$). This set $X$ satisfies Definition 1 for $\mathrm{Pa}^*$ with respect to $A$ and $Y$. Thus, the set of all nodes in $\mathrm{Nd}(A)$ satisfying Definition 1 will include $\mathrm{Pa}^*$. Next, we show that any superset of $\mathrm{Pa}^*$ in $\mathrm{Nd}(A)$ will be a valid adjustment set for $(A, Y)$. Assume this is not the case for a particular $S$, and fix a backdoor path from $A$ to $Y$ which is open given $S$. Then the first node on this path after $A$ must be in $\mathrm{Pa}^*$. But this means the path is blocked by $S$. Our conclusion follows.

We now show Definition 1 does not satisfy Properties 2A or 2B. Consider the causal diagram in Figure 1. The variable $C_3$ is unconditionally associated with $A$ and $Y$; the variables $C_1$ and $C_2$ are each associated with $A$ and $Y$ conditional on $C_3$.
Thus, under Definition 1, all three would qualify as “confounders.” However, there is no set of pre-exposure covariates $X$ on the graph such that control for $C_3$ helps eliminate or reduce bias. To see this, note that if $X$ includes $C_1$ or $C_2$, then the effect estimate is unbiased irrespective of whether adjustment is made for $C_3$. If $X$ includes neither $C_1$ nor $C_2$, then the estimand without adjustment for $C_3$ is unbiased whereas the estimand adjusted for $C_3$ is not. Therefore, Definition 1 does not satisfy Properties 2A or 2B. This completes the proof. □

Fig. 2. Definition 2 does not satisfy Property 2A or 2B.

Intuitively, Definition 1 does not satisfy Properties 2A or 2B because in the causal diagram in Figure 1, the variable $C_3$ is unconditionally associated with $A$ and $Y$ and thus would be a confounder under Definition 1, but control for it will only either not affect bias (if control is made for $C_1$ or $C_2$) or increase bias (if control is made for neither $C_1$ nor $C_2$). The causal structure in Figure 1 and the bias resulting from controlling for $C_3$ are sometimes referred to in the literature as “M-bias” or “collider-stratification” [Greenland (2003), Hernán et al. (2002), Hernán (2008)]. We note that if faithfulness is violated, Definition 1 does not satisfy Property 1 either [Pearl (2009)].

Under Definition 2, a confounder was defined as a pre-exposure covariate that blocks a backdoor path from $A$ to $Y$.

Proposition 2. For every causal diagram, Definition 2 satisfies Property 1. Definition 2 does not satisfy Properties 2A or 2B.

Proof. If $S$ consists of the set of all confounders under Definition 2, then this set $S$ will include all pre-exposure covariates that block a backdoor path from $A$ to $Y$.
From this it follows that $S$ blocks all backdoor paths from $A$ to $Y$ and by Pearl's backdoor path theorem, the effect of $A$ on $Y$ is unconfounded given $S$. Thus, Definition 2 satisfies Property 1.

We now show that it does not satisfy Properties 2A and 2B. Consider the causal diagram in Figure 2. Under Definition 2 both $C_1$ and $C_2$ block a backdoor path from $A$ to $Y$ and thus would qualify as confounders. However, for $C_2$ there is no set of pre-exposure covariates $X$ on the graph such that control for $C_2$ helps eliminate bias since if $X = C_1$, there is no bias without controlling for $C_2$; if $X = \emptyset$, there is bias even with controlling for $C_2$. Thus, Definition 2 does not satisfy Property 2A.

Fig. 3. Definition 3 does not satisfy Property 1.

We now show that it does not satisfy Property 2B. Suppose Figure 2 is a causal diagram for $(C_1, C_2, A, Y)$ where all variables are binary and suppose that $P(C_1 = 1) = 1/2$, $P(C_2 = 1 \mid c_1) = 1/5 + 3c_1/5$, $P(A = 1 \mid c_1, c_2) = 1/10 + 3c_1/5 + c_2/10$, $P(Y = 1 \mid a, c_1, c_2) = 1/2 + (1/2)(a - 1/2)c_1$. One can then verify that
$$E(Y_1) - E(Y_0) = \sum_{c_1, c_2}\{E(Y \mid A = 1, c_1, c_2) - E(Y \mid A = 0, c_1, c_2)\}\,\mathrm{pr}(c_1, c_2) = 0.25 = \sum_{c_1}\{E(Y \mid A = 1, c_1) - E(Y \mid A = 0, c_1)\}\,\mathrm{pr}(c_1),$$
that $E(Y \mid A = 1) - E(Y \mid A = 0) = 0.266$ and that $\sum_{c_2}\{E(Y \mid A = 1, c_2) - E(Y \mid A = 0, c_2)\}\,\mathrm{pr}(c_2) = 0.269$. Under Definition 2, $C_2$ would be considered a confounder since $C_2$ blocks the backdoor path $A \leftarrow C_2 \leftarrow C_1 \rightarrow Y$. However, there is no set $X$ of pre-exposure covariates such that
$$\Bigl|\sum_{x, c_2}\{E(Y \mid A = 1, x, c_2) - E(Y \mid A = 0, x, c_2)\}\,\mathrm{pr}(x, c_2) - \{E(Y_1) - E(Y_0)\}\Bigr| < \Bigl|\sum_{x}\{E(Y \mid A = 1, x) - E(Y \mid A = 0, x)\}\,\mathrm{pr}(x) - \{E(Y_1) - E(Y_0)\}\Bigr|.$$
This is because if $X$ is taken as $C_1$, then the expressions on both sides of the inequality are equal to 0 (controlling for $C_2$ in addition to $C_1$ does not reduce bias); if $X$ is taken as the empty set, we have
$$\Bigl|\sum_{c_2}\{E(Y \mid A = 1, c_2) - E(Y \mid A = 0, c_2)\}\,\mathrm{pr}(c_2) - \{E(Y_1) - E(Y_0)\}\Bigr| = |0.269 - 0.250| = 0.019 > 0.016 = |0.266 - 0.250| = \bigl|\{E(Y \mid A = 1) - E(Y \mid A = 0)\} - \{E(Y_1) - E(Y_0)\}\bigr|$$
and again controlling for $C_2$ does not reduce (but rather increases) bias. Definition 2 thus does not satisfy Property 2B. This completes the proof. □

If we consider the causal diagram in Figure 2, then under Definition 2 both $C_1$ and $C_2$ block a backdoor path from $A$ to $Y$ and thus would qualify as confounders. However, for $C_2$ there is no set of pre-exposure covariates $X$ on the graph such that control for $C_2$ helps eliminate bias (Property 2A) since if $X = C_1$, there is no bias without controlling for $C_2$; if $X = \emptyset$, there is bias even with controlling for $C_2$. Likewise, examples can be constructed as in the proof above in which control for $C_2$ will only increase bias, that is, control for $C_2$ does not help reduce bias (Property 2B).

Under Definition 3, a confounder was defined as a member of every minimally sufficient adjustment set.

Proposition 3. Definition 3 does not satisfy Property 1. Definition 3 satisfies Property 2A.

Proof. Consider the causal diagram in Figure 3. Here, either $C_1$ or $C_2$ would constitute minimally sufficient adjustment sets and thus neither are a member of every minimally sufficient adjustment set and under Definition 3, neither would be confounders. If we control for nothing, there is still confounding for the effect of $A$ on $Y$ and, thus, for Figure 3, controlling for all confounders under Definition 3 would not suffice to control for confounding. Thus, Definition 3 does not satisfy Property 1.
If $C$ is a member of every minimally sufficient adjustment set, then it is a member of a minimally sufficient adjustment set and from this it trivially follows that it satisfies the requirements in Property 2A. This completes the proof.

A variable $C$ that is a confounder under Definition 3 will in general satisfy Property 2B as well but may not always because there are cases in which there is confounding in the distribution of counterfactual outcomes conditional on $C$, so that $C$ is a confounder under Definition 3, but with the average causal effect on the additive scale not confounded [Greenland, Robins and Pearl (1999)]. Intuitively, to see that Definition 3 does not satisfy Property 1, consider the causal diagram in Figure 3. Here, either $C_1$ or $C_2$ alone would constitute a minimally sufficient adjustment set and thus neither is a member of every minimally sufficient adjustment set. Under Definition 3, there would thus be no confounders for the effect of $A$ on $Y$; clearly, however, if we control for nothing, there is still confounding for the effect of $A$ on $Y$.

Under Definition 4, a confounder was defined as a member of some minimally sufficient adjustment set.

Proposition 4. For every causal diagram, Definition 4 satisfies Property 1. Definition 4 satisfies Property 2A.

Proof. We will show that Definition 4 satisfies Property 1. We first claim that any minimally sufficient adjustment set for $(A, Y)$ must lie in $G_{\mathrm{An}(A) \cup \mathrm{An}(Y)}$, the subgraph of $G$ that has only the nodes in $\mathrm{An}(A)$ or $\mathrm{An}(Y)$; see the Appendix. Assume this is not true, and pick some minimally sufficient set $S$ with elements outside $\mathrm{An}(A) \cup \mathrm{An}(Y)$. This means $S \cap (\mathrm{An}(A) \cup \mathrm{An}(Y))$ is not sufficient. Note that any ancestor of a node in the set $\mathrm{An}(A) \cup \mathrm{An}(Y)$ will also be in $\mathrm{An}(A) \cup \mathrm{An}(Y)$.
From this it follows that any backdoor path from $A$ to $Y$ which has a node outside $\mathrm{An}(A) \cup \mathrm{An}(Y)$ will require a collider to get back into $\mathrm{An}(A) \cup \mathrm{An}(Y)$. However, those colliders must be opened by elements in $S$. We have a contradiction. We have shown that any minimally sufficient adjustment set must be a subset of $\mathrm{An}(A) \cup \mathrm{An}(Y)$ and, thus, any variable that is a confounder under Definition 4 must be in $\mathrm{An}(A) \cup \mathrm{An}(Y)$.

Next we note that $\mathrm{Pa}(A)$ is a sufficient adjustment set for $(A, Y)$. Pick a minimal subset $\mathrm{Pa}^+$ of $\mathrm{Pa}(A)$ that is sufficient. Our claim is that every element $P$ in $\mathrm{Pa}(A) \setminus \mathrm{Pa}^+$ is such that $P$ is not connected to $Y$ in the graph $(G_{\mathrm{An}(A) \cup \mathrm{An}(Y)})_{a}$ except by paths that are blocked conditional on $\mathrm{Pa}^+$. Assume this is not true, and fix a path $\omega$ from $P$ to $Y$ that is not blocked by $\mathrm{Pa}^+$ in $(G_{\mathrm{An}(A) \cup \mathrm{An}(Y)})_{a}$. If this path has no colliders, then appending $\omega$ with the edge $P \rightarrow A$ produces a backdoor path from $A$ to $Y$ not blocked by $\mathrm{Pa}^+$, contradicting the earlier claim that $\mathrm{Pa}^+$ is a valid adjustment set. If $\omega$ only contains colliders ancestral of $\mathrm{Pa}^+$, then either $\omega$ has a noncollider triple blocked by $\mathrm{Pa}^+$ (in which case we are done with that path) or $\omega$ appended with $P \rightarrow A$ produces a backdoor path open conditional on $\mathrm{Pa}^+$, which is a contradiction. If $\omega$ contains collider triples ancestral of $\mathrm{Pa}(A) \setminus \mathrm{Pa}^+$ (but not ancestral of $\mathrm{Pa}^+$), let $W$ be the central node of the last such collider triple on the path from $P$ to $Y$. Let $P'$ be a member of $\mathrm{Pa}(A) \setminus \mathrm{Pa}^+$ of which $W$ is an ancestor. Consider instead of $\omega$ a new path: $A \leftarrow P' \leftarrow \cdots \leftarrow W$ appended with the subpath of $\omega$ that begins with the node on $\omega$ after $W$ and ends with $Y$.
This path either has a noncollider triple blocked by $\mathrm{Pa}^+$ (in which case so does $\omega$ and we are done with $\omega$), or it is open conditional on $\mathrm{Pa}^+$, in which case we have a contradiction, or it contains collider triples ancestral of $Y$ not through $\mathrm{Pa}(A)$. In the last case, let $Z$ be the central node of the first such collider triple on the currently considered path from $A$ to $Y$. Consider instead a new path which appends a subpath of the currently considered path extending from $A$ to $Z$, and the segment $Z \rightarrow \cdots \rightarrow Y$. This path has no blocked colliders by construction, and thus must either have a noncollider triple blocked by $\mathrm{Pa}^+$ (in which case so does $\omega$ and we are done with $\omega$) or it is open conditional on $\mathrm{Pa}^+$, in which case we have a contradiction.

Our final claim is that any superset $S$ of $\mathrm{Pa}^+$ in $\mathrm{Nd}(A) \cap (\mathrm{An}(A) \cup \mathrm{An}(Y))$ is a valid adjustment set for $(A, Y)$. Assume this were not so and fix an open backdoor path $\rho$ from $A$ to $Y$ given $S$. The first node on $\rho$ after $A$ must lie either in $\mathrm{Pa}^+$ or in $\mathrm{Pa}(A) \setminus \mathrm{Pa}^+$. In the first case, the path is blocked. In the second case, we have shown above that every path from $\mathrm{Pa}(A) \setminus \mathrm{Pa}^+$ to $Y$ in $(G_{\mathrm{An}(A) \cup \mathrm{An}(Y)})_{a}$ is blocked by $\mathrm{Pa}^+$ and, thus, the path must be blocked in the second case as well. There thus cannot be an open backdoor path from $A$ to $Y$ given $S$ and we have a contradiction. We have that $\mathrm{Pa}^+$ is a sufficient adjustment set; any variable that is a confounder under Definition 4 will be a member of $\mathrm{Nd}(A) \cap (\mathrm{An}(A) \cup \mathrm{An}(Y))$ and, thus, we have that the set of variables that are confounders under Definition 4 will be a sufficient adjustment set. Definition 4 thus satisfies Property 1. Definition 4 satisfies Property 2A trivially. This completes the proof.
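The content of Property 1 for Definition 4 can be illustrated by brute force on small graphs. The sketch below is an illustration under stated assumptions, not the construction used in the proof: it uses the backdoor criterion as a stand-in for sufficiency, tests d-separation with the standard moralized ancestral graph criterion, and the two graphs (`chain`, `m_graph`) are hypothetical diagrams chosen to be consistent with how Figures 3 and 1 are described in the text, not copies of them. All function names are illustrative.

```python
from itertools import combinations

def ancestors(parents, nodes):
    """Return `nodes` together with all of their ancestors."""
    seen, stack = set(nodes), list(nodes)
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def d_separated(parents, x, y, z):
    """d-separation of x and y given the set z, via the moralized ancestral graph."""
    keep = ancestors(parents, {x, y} | z)
    adj = {n: set() for n in keep}
    for child in keep:
        pa = sorted(parents.get(child, ()))
        for p in pa:                       # undirected parent-child edges
            adj[child].add(p)
            adj[p].add(child)
        for p, q in combinations(pa, 2):   # "marry" parents of a common child
            adj[p].add(q)
            adj[q].add(p)
    seen, stack = {x}, [x]
    while stack:                           # search that never enters z
        for m in adj[stack.pop()]:
            if m not in seen and m not in z:
                seen.add(m)
                stack.append(m)
    return y not in seen

def minimal_sufficient_sets(parents, a, y):
    """Minimal sets of nondescendants of a that block every backdoor path to y."""
    nodes = set(parents).union(*parents.values())
    desc = {n for n in nodes if a in ancestors(parents, {n})}   # includes a itself
    candidates = sorted(nodes - desc - {y})
    backdoor = {n: set(ps) - {a} for n, ps in parents.items()}  # drop edges out of a
    sufficient = [set(s) for k in range(len(candidates) + 1)
                  for s in combinations(candidates, k)
                  if d_separated(backdoor, a, y, set(s))]
    minimal = [s for s in sufficient if not any(t < s for t in sufficient)]
    return minimal, backdoor

# Hypothetical graphs, given as child -> set of parents.
chain = {"A": {"C1"}, "C2": {"C1"}, "Y": {"A", "C2"}}                # A <- C1 -> C2 -> Y
m_graph = {"A": {"C1"}, "C3": {"C1", "C2"}, "Y": {"A", "C2"}}        # M-structure

chain_min, chain_bd = minimal_sufficient_sets(chain, "A", "Y")
chain_union = set().union(*chain_min)        # confounders under Definition 4
chain_common = set.intersection(*chain_min)  # confounders under Definition 3
m_min, m_bd = minimal_sufficient_sets(m_graph, "A", "Y")

print(chain_min, d_separated(chain_bd, "A", "Y", chain_union))
print(m_min)
```

On the chain graph the union of the minimal sets is again sufficient, whereas their intersection (the confounders under Definition 3) is empty and not sufficient. On the M-structure the empty set is the only minimal sufficient set, so no variable is a confounder under Definition 4.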
A variable that is a confounder under Definition 4 will in general satisfy Property 2B as well but may not always because, as before, there may be confounding in distribution without the average causal effect on the additive scale being confounded. Definition 4 thus satisfies Property 2A, generally Property 2B, and, as shown in the proof above, also satisfies Property 1 for all causal diagrams. That Definition 4 satisfies Property 1 can be restated as the proposition that the union of all minimally sufficient adjustment sets is itself a sufficient adjustment set. Definition 4 thus satisfies the properties which arguably ought to be required for a reasonable definition of a “confounder.”

Under Definition 5, a confounder was essentially defined as a pre-exposure covariate, the control for which helped reduce bias.

Proposition 5. Definition 5 does not satisfy Property 1. Definition 5 satisfies Property 2B but not 2A.

Fig. 4. Definition 5 does not satisfy Property 2A.

Proof. Suppose that $Y_a \perp\!\!\!\perp A \mid C$, that $(C, A, Y)$ are all binary and that $P(C = 1) = 1/2$, $P(A = 1 \mid c) = 1/4 + c/2$, $P(Y = 1 \mid a, c) = 4/10 - 4c/10 - 3a/10 + 8ac/10$. One can then verify that $E(Y_1) = \sum_c E(Y \mid A = 1, c)\,\mathrm{pr}(c) = 3/10$, $E(Y \mid A = 1) = 4/10$, $E(Y_0) = \sum_c E(Y \mid A = 0, c)\,\mathrm{pr}(c) = 2/10$, $E(Y \mid A = 0) = 3/10$. Thus,

$$\Bigl|\sum_c \{E(Y \mid A = 1, c) - E(Y \mid A = 0, c)\}\,\mathrm{pr}(c) - \{E(Y_1) - E(Y_0)\}\Bigr| = 0 = \bigl|\{E(Y \mid A = 1) - E(Y \mid A = 0)\} - \{E(Y_1) - E(Y_0)\}\bigr|$$

and so under Definition 5, $C$ would not be a confounder. The set of variables defined as confounders under Definition 5 would thus be empty. However, it is not the case that adjustment for the empty set suffices to control for confounding since, for example, $E(Y_1) = 3/10 \neq 4/10 = E(Y \mid A = 1)$. Thus, Definition 5 does not satisfy Property 1.
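The cancellation in this example can be confirmed with a few lines of exact arithmetic. This is a minimal sketch with illustrative names (`p_a`, `e_y`, `crude_mean`); it assumes, as in the proof, that $Y_a \perp\!\!\!\perp A \mid C$, so that standardizing over $C$ gives $E(Y_a)$.

```python
from fractions import Fraction as F

p_c = {0: F(1, 2), 1: F(1, 2)}  # P(C = c)

def p_a(a, c):
    """P(A = a | C = c), with P(A = 1 | c) = 1/4 + c/2."""
    p1 = F(1, 4) + F(1, 2) * c
    return p1 if a == 1 else 1 - p1

def e_y(a, c):
    """E(Y | A = a, C = c) as specified in the example."""
    return F(4, 10) - F(4, 10) * c - F(3, 10) * a + F(8, 10) * a * c

# Standardized means; these equal E(Y_a) under the assumed independence.
e_y1 = sum(e_y(1, c) * p_c[c] for c in (0, 1))
e_y0 = sum(e_y(0, c) * p_c[c] for c in (0, 1))

def crude_mean(a):
    """E(Y | A = a), averaging over the distribution of C among those with A = a."""
    num = sum(e_y(a, c) * p_a(a, c) * p_c[c] for c in (0, 1))
    den = sum(p_a(a, c) * p_c[c] for c in (0, 1))
    return num / den

causal_diff = e_y1 - e_y0                    # 1/10
crude_diff = crude_mean(1) - crude_mean(0)   # 1/10

print(e_y1, crude_mean(1), e_y0, crude_mean(0), causal_diff, crude_diff)
```

Each arm is biased by the same amount (the crude means exceed the standardized means by 1/10 in both arms), so the biases cancel in the risk difference even though neither counterfactual mean is recovered without adjustment.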
We now show that Definition 5 does not satisfy Property 2A. Consider the causal diagram in Figure 4. Although control for $C_2$ might reduce bias compared to an unadjusted estimate and thus satisfy Definition 5 with $X = \emptyset$, there is no $X$ such that the effect of $A$ on $Y$ is unconfounded conditional on $(X, C_2)$ but not on $X$ alone. Thus, Definition 5 does not satisfy Property 2A. Definition 5 satisfies Property 2B trivially. This completes the proof.

Definition 5 does not satisfy Property 1 because an unadjusted estimate of the causal risk difference may be correct, even in the presence of confounding, because the bias due to confounding for $E(Y_1)$ may cancel that for $E(Y_0)$; said another way, there may be confounding in the distribution of counterfactual outcomes without there being confounding in a particular measure. That Definition 5 satisfies Property 2B is essentially embedded in Definition 5 itself. Intuitively, to see that Definition 5 does not satisfy Property 2A, consider the causal diagram in Figure 4. Although control for $C_2$ might reduce bias compared to an unadjusted estimate and thus satisfy Definition 5 with $X = \emptyset$, there would be no $X$ such that the effect of $A$ on $Y$ is unconfounded conditional on $(X, C_2)$ but not on $X$ alone.

Under Definition 6, a confounder was defined as a pre-exposure covariate, the control for which in some context changed the effect estimate.

Proposition 6. Definition 6 does not satisfy Property 1. Definition 6 does not satisfy Properties 2A or 2B.

Proof. In the first example in the proof of Proposition 5, the set of confounders under Definition 6 would be empty because with $X$ empty the adjusted and unadjusted risk differences coincide: $\sum_{x,c}\{E(Y \mid A = 1, x, c) - E(Y \mid A = 0, x, c)\}\,\mathrm{pr}(x, c) = 1/10 = \sum_{x}\{E(Y \mid A = 1, x) - E(Y \mid A = 0, x)\}\,\mathrm{pr}(x)$.
However, the effect of $A$ on $Y$ is not unconfounded conditional on the empty set. Thus, Definition 6 does not satisfy Property 1.

We now show Definition 6 does not satisfy Properties 2A or 2B. Consider the causal diagram in Figure 1. If we let $X$ denote the empty set, then $C_3$ will satisfy Definition 6 and so would be a confounder under Definition 6. However, if we consider Properties 2A and 2B, there is no set of pre-exposure covariates $X$ on the graph such that control for $C_3$ helps eliminate or reduce bias. To see this, note that if $X$ includes $C_1$ or $C_2$, then the effect estimate is unbiased irrespective of whether adjustment is made for $C_3$. If $X$ includes neither $C_1$ nor $C_2$, then the estimand without adjustment for $C_3$ is unbiased whereas the estimand adjusted for $C_3$ is not. Therefore, Definition 6 does not satisfy Properties 2A and 2B. This completes the proof.

As with Definition 5, Definition 6 does not satisfy Property 1 because of the possibility of cancellations: there may be confounding in the distribution of counterfactual outcomes without there being confounding in a particular measure. Definition 6 also fails to satisfy Properties 2A or 2B. It fails because of the possibility of “M-bias” or “collider-stratification” structures as in Figure 1 [Greenland (2003), Hernán et al. (2002)]. Controlling for a variable such as $C_3$ may change the estimate, but it may be that it is the estimate without control for that variable (e.g., $C_3$ in Figure 1) that is unbiased. Also, as noted above, the collapsibility-based definitions fail for odds ratio and hazard ratio measures for other reasons, namely, because marginal and conditional measures are not comparable even in the absence of confounding. See Greenland, Robins and Pearl (1999), Geng et al.
(2001) and Geng and Li (2002) for further discussion of the relationship between, and general nonequivalence of, confounding and collapsibility.

Candidate definitions for a confounder might thus include Definition 4 and, if the issue of scale dependence is set aside, Definition 5. Note, however, that a variable that satisfies Definition 5 but not Definition 4 will never help to eliminate confounding bias, only to reduce such bias. Such a variable reduces bias essentially by serving as a proxy for a variable that does satisfy Definition 4. We therefore propose that a confounder be defined as in Definition 4, “a pre-exposure covariate that is a member of some minimally sufficient adjustment set,” and that any variable that satisfies Definition 5 but not Definition 4 be referred to as a “surrogate confounder.” The terminology of a “surrogate confounder” or “proxy confounder” appears elsewhere [Greenland and Morgenstern (2001), Hernán (2008)]; here we have provided a formal criterion for such a “surrogate confounder.” See Greenland and Pearl (2011) and Ogburn and VanderWeele (2012) for properties of such surrogate confounders.

Interestingly, Definition 4 is closely related to definitions concerning confounders proposed by Robins and Morgenstern (1987), though their definitions were not universally adopted by the epidemiologic community over the ensuing 25 years. Robins and Morgenstern (1987) were not principally concerned with how the word “confounder” is employed in practice when used in an unqualified sense, but rather with whether a particular variable would still, in some sense, be a confounder if data were also available on other variables.
As noted above, Robins and Morgenstern [(1987), Section 2H] say that $C$ is a confounder conditional on $F$ if causal effects are computable given data on $C$ and $F$, but not on $F$ alone. In the framework of Robins and Morgenstern, if one were to take as the (unconditional) definition of a confounder that “there exists some set $F$ such that $C$ is a confounder conditional on $F$ [in the sense of Robins and Morgenstern (1987), Section 2H],” then this would coincide with Definition 4. Note that Robins and Morgenstern, in their definitions, in some sense go further than Definition 4 in having the investigator explicitly specify the other variables $F$ for which control might be made. This would indeed be useful in practice, though current use of language has not generally adopted this convention. It might in the future be helpful to distinguish between the unqualified use of the word “confounder” as defined in Definition 4, and “confounder in the context of having data also on $F$” as in Robins and Morgenstern (1987). The former is arguably how the word “confounder” is often used in practice; the latter would be useful in making decisions about data collection and confounder control.

6. Some extensions, implications and further results. In the discussion above we have considered whether a covariate is a “confounder” in an unconditional sense. However, we might also speak about whether a variable $C$ is a confounder for the effect of $A$ on $Y$ conditional on some set of covariates $L$ which an investigator is going to condition on irrespective of whether control is made for $C$.
Definition 4 above, the definition for an “unconditional confounder,” could be restated as follows: a pre-exposure covariate $C$ is a confounder for the effect of $A$ on $Y$ if there exists a set of pre-exposure covariates $X$ such that $Y_a \perp\!\!\!\perp A \mid (X, C)$ but there is no proper subset $T$ of $(X, C)$ such that $Y_a \perp\!\!\!\perp A \mid T$. The conditional analogue would then be as follows: we say that a pre-exposure covariate $C$ is a confounder for the effect of $A$ on $Y$ conditional on $L$ if there exists a set of pre-exposure covariates $X$ such that $Y_a \perp\!\!\!\perp A \mid (X, L, C)$ but there is no proper subset $T$ of $(X, C)$ such that $Y_a \perp\!\!\!\perp A \mid (T, L)$. Consider again the causal diagram in Figure 3. Here, $C_2$ would be a confounder under Definition 4. However, $C_2$ is not a confounder for the effect of $A$ on $Y$ conditional on $L = C_1$. Consider once more the causal diagram in Figure 1. Here, neither $C_1$ nor $C_2$ would be a confounder under Definition 4. However, conditional on $L = C_3$, both $C_1$ and $C_2$ would be confounders.

An analogue of Definition 4 could also be given for a particular causal parameter of interest rather than for the condition of nonconfounding in distribution $Y_a \perp\!\!\!\perp A \mid S$. For example, $C$ could be defined to be a confounder for a particular causal parameter (e.g., the causal risk difference or causal risk ratio) if there exists a set of pre-exposure covariates $X$ such that the parameter is identified by adjusting for $(X, C)$ and if for no proper subset $T$ of $(X, C)$ is the parameter identified by adjusting for $T$ [cf. Robins and Morgenstern (1987)]. However, when we restrict attention to particular parameters we reintroduce some of the complications with cancellations that were noted above. For example, due to cancellations, a variable $C$ may be a confounder for the causal risk difference but not for the causal risk ratio [cf.
VanderWeele (2012)].

We have restricted our attention in this paper thus far to pre-exposure covariates as potential confounders. We have done so in order to correspond as closely as possible to the discussion in the epidemiologic and potential outcomes literatures. However, within the context of causal diagrams, a somewhat broader range of variables could be considered as “confounders” in that all of the discussion above is applicable if we consider all nondescendants of $A$ as potential confounders rather than simply considering pre-exposure covariates.

Throughout the paper we have given all definitions with respect to a particular underlying causal diagram. However, for a given exposure $A$ and a given outcome $Y$, there will be multiple causal diagrams that correctly represent the causal structure relating these variables to one another and to covariates. One diagram may be an elaboration of another and contain variables that the other does not. It is straightforward to verify that if a variable $C$ is classified as a confounder under Definitions 1, 2, 4, 5 or 6, then $C$ will also be a confounder under each of those definitions respectively on any expanded causal diagram with additional variables. In the case of Definition 1, this is because associations that hold conditional on covariates $X$ for one diagram will clearly also hold for the other. In the case of Definition 2, if $C$ blocks a backdoor path on one causal diagram, it will block a backdoor path on any larger diagram that also correctly describes the causal structure. In the case of Definition 4, if there is some minimally sufficient adjustment set $S$ of which $C$ is a member, then that set will also be minimally sufficient on any larger diagram that also correctly describes the causal structure.
In the case of Definitions 5 and 6, if the inequalities in these definitions hold for some covariate set $X$ for one diagram, they will clearly also hold for the other. Only Definition 3 does not share this property. To see this, consider Figure 3; if in Figure 3, we collapsed over $C_2$ so that the causal diagram involved only $C_1$, $A$ and $Y$, then $C_1$ would be a member of every minimally sufficient adjustment set for this diagram and thus a confounder under Definition 3. However, as we saw above, $C_1$ is not a confounder under Definition 3 for Figure 3 itself which includes the extra variable $C_2$. This failure is a serious problem with Definition 3, but, as we also saw above, Definition 3 suffers from other limitations as well.

Several fairly trivial implications follow from Definition 4 and may be worth noting for the sake of completeness. First, if a causal diagram had a variable $C$ with an arrow to $\log(C)$ (or vice versa) and if $C$ were a member of a minimally sufficient adjustment set, then, under Definition 4, both $C$ and $\log(C)$ would be considered “confounders,” though $\log(C)$ would not be a confounder conditional on $C$, and likewise $C$ would not be a confounder conditional on $\log(C)$. We believe that this is in accord with epidemiologic usage, though it would be peculiar to consider both $C$ and $\log(C)$ simultaneously, just as it would be peculiar to include both $C$ and $\log(C)$ on a causal diagram. Second, if a variable $C$ is measured with error, taking value $C^*$, and if the measurement error term $\varepsilon = C^* - C$ were also represented on the causal diagram, then, if $C$ were a confounder under Definition 4, $C^*$ and $\varepsilon$ would also both be confounders under Definition 4.
We believe this is also in accord with standard epidemiologic usage of “confounder,” though we would in practice rarely refer to $\varepsilon$ as a “confounder” since we rarely have access to $\varepsilon$. Once again, however, neither $C^*$ nor $\varepsilon$ would be confounders conditional on $C$. Finally, suppose $C_1$ were height in meters and $C_2$ were weight in kilograms and that $C_1$ and $C_2$ together sufficed to control for confounding but neither alone did; let $C_3 = C_2/C_1^2$ be body mass index (BMI) and suppose that controlling for $C_3$ alone sufficed to control for confounding. Then under Definition 4, $C_1$, $C_2$ and $C_3$ would each be confounders, though $C_3$ would not be a confounder conditional on $(C_1, C_2)$ and likewise neither $C_1$ nor $C_2$ would be a confounder conditional on $C_3$. Once again, we believe this is in accord with traditional epidemiologic usage of “confounder.”

Several implications hold between the different definitions of a confounder as stated in the following result.

Proposition 7. On a causal diagram, if a variable is a confounder under Definition 3, then it is a confounder under Definitions 4, 2 and 1; if under Definition 4, then under Definitions 2 and 1; if under Definition 5, then under Definitions 6 and 1; if under Definition 6, then under Definition 1. No other implications hold without further assumptions.

Proof. On a causal diagram, if a variable is a member of every minimally sufficient adjustment set, it must be a member of a minimally sufficient adjustment set (the existence of a minimally sufficient adjustment set is guaranteed by the variables lying on a causal diagram). Thus, if a variable is a confounder under Definition 3, then it is a confounder under Definition 4.
Suppose a variable $C$ satisfies Definition 4, that is, is a member of some minimally sufficient adjustment set $(X, C)$, but that it does not satisfy Definition 2, that is, it is not on a backdoor path from $A$ to $Y$. By Theorem 5 of Shpitser, VanderWeele and Robins (2010), $(X, C)$ blocks all backdoor paths from $A$ to $Y$. If $C$ does not lie on a backdoor path from $A$ to $Y$, then $X$ alone would block all backdoor paths from $A$ to $Y$, which would contradict that $(X, C)$ is a minimally sufficient adjustment set. Thus, if $C$ is a confounder under Definition 4, it is a confounder under Definition 2. That $C$ being a confounder under Definition 4 implies $C$ is a confounder under Definition 1 follows from the contrapositive of Corollary 4.1 of Robins (1997). If $C$ is a confounder under Definition 5, it must be a confounder under Definition 6 because the only way $C$ can be a confounder under Definition 5 is if $\sum_{x,c}\{E(Y \mid A = 1, x, c) - E(Y \mid A = 0, x, c)\}\,\mathrm{pr}(x, c)$ and $\sum_{x}\{E(Y \mid A = 1, x) - E(Y \mid A = 0, x)\}\,\mathrm{pr}(x)$ are not equal. If $C$ is not a confounder under Definition 1, then for every $X$, $C$ is independent of $Y$ conditional on $(A, X)$ or of $A$ conditional on $X$ and from this it easily follows that $\sum_{x,c}\{E(Y \mid A = 1, x, c) - E(Y \mid A = 0, x, c)\}\,\mathrm{pr}(x, c) = \sum_{x}\{E(Y \mid A = 1, x) - E(Y \mid A = 0, x)\}\,\mathrm{pr}(x)$ and thus that $C$ is not a confounder under Definition 6. Thus, if $C$ is a confounder under Definition 6, it must be a confounder under Definition 1.

We now argue that without further assumptions no other implications between the definitions hold. The variable $C_2$ in Figure 4 could satisfy Definition 1 but does not satisfy Definition 2, so Definition 1 does not imply Definition 2.
The variable $C_3$ in Figure 1 could satisfy Definition 1, but does not satisfy Definitions 3, 4 or 5; thus, Definition 1 does not imply Definitions 3, 4 or 5. If $C$ is a confounder under Definition 1, in general it will be under Definition 6 as well, but it may not because of cancellations due to scale-dependence. If $C$ satisfies the conditions for Definition 2 (i.e., lies on a backdoor path from $A$ to $Y$), it will generally do so for Definitions 1 and 6 but may fail to do so because of failure of faithfulness or cancellations due to scale-dependence. In the example given concerning Property 2B in Proposition 2, the variable $C_2$ in Figure 2 satisfied Definition 2 but does not satisfy Definitions 3, 4 or 5; thus, Definition 2 does not imply Definitions 3, 4 or 5.

It was shown above that if $C$ satisfies the conditions for Definition 3, it will satisfy the conditions for Definitions 4, 2 and 1. If $C$ satisfies the conditions for Definition 3, it will generally satisfy the conditions for Definitions 5 and 6, but it may not do so due to scale-dependence.

It was shown above that if $C$ satisfies the conditions for Definition 4, it will satisfy the conditions for Definitions 2 and 1. In Figure 3, $C_2$ satisfies the conditions for Definition 4 but not Definition 3; therefore, Definition 4 does not imply Definition 3. If $C$ satisfies the conditions for Definition 4, it will generally satisfy the conditions for Definitions 5 and 6, but it may not do so due to scale-dependence.

It was shown above that if $C$ satisfies the conditions for Definition 5, it will satisfy the conditions for Definitions 6 and 1. In the example given concerning Property 2A in Proposition 5, the variable $C_2$ in Figure 4 satisfied Definition 5 but does not satisfy Definitions 2, 3 or 4; thus, Definition 5 does not imply Definitions 2, 3 or 4.

Fig. 5. Logical relationships that hold among definitions.
Dashed arrows indicate implications that will generally hold but may fail due to scale dependence of definitions.

It was shown above that if $C$ satisfies the conditions for Definition 6, it will satisfy the conditions for Definition 1. The variable $C_2$ in Figure 4 could satisfy Definition 6 but does not satisfy Definition 2, so Definition 6 does not imply Definition 2. The variable $C_3$ in Figure 1 could satisfy Definition 6, but does not satisfy Definitions 3, 4 or 5; thus, Definition 6 does not imply Definitions 3, 4 or 5.

The implications between the definitions are plotted in Figure 5. Those implications that will generally hold but may not hold because of cancellations due to scale-dependence are indicated with dashed arrows.

The properties themselves that we have been considering also bear certain relations to one another insofar as it is not difficult to show that if Property 2A is itself taken as the definition of a confounder, then, on causal diagrams, this definition of a confounder also satisfies Property 1. This is because if $S$ denotes the set of all nodes $C$ which obey Property 2A and if $S$ is not a sufficient adjustment set (so there is an open backdoor path $\pi$ from $A$ to $Y$), then if we let $W$ be all nondescendants of $A$ other than $A$ and noncollider nodes on $\pi$, if we choose a node $K$ on $\pi$ that does not contain descendants of $A$, then it is the case that $K$ satisfies Property 2A, and is not a part of $S$, which would be a contradiction.

Although it is the case that if Property 2A is itself taken as the definition of a confounder then this definition also satisfies Property 1 on causal diagrams, this does not hold generally within a counterfactual framework.
Note also that, even on causal diagrams, it is not the case that Property 2A implies Property 1; a counterexample to this was given in Proposition 3 for Definition 3, which satisfies Property 2A but not Property 1. Rather, if Property 2A is itself taken as the definition of a confounder, then, on causal diagrams, this definition would satisfy Property 1 as well. This raises the question as to whether Property 2A itself could be taken as the definition of a confounder, as such a definition would satisfy Property 2A (by definition) and Property 1 on causal diagrams. Although such a definition would satisfy Properties 1 and 2A on causal diagrams, it would also follow from this definition that $C_1$ is a confounder for the effect of $A$ on $Y$ in Figure 1, even though the effect of $A$ on $Y$ is unconfounded without controlling for any covariates. This is because if Property 2A is taken as the definition of a confounder, then $C_1$ satisfies Property 2A with $X$ taken as $C_3$. In general, however, if the effect of $A$ on $Y$ is unconfounded without controlling for any covariates, we would probably simply say that there are no confounders for the unconditional effect of $A$ on $Y$.

7. Concluding remarks. The causal inference literature has provided a formal definition of confounding with reference to distributions of counterfactual outcomes. The literature now rightly emphasizes the concept of confounding control over that of a “confounder.” Nonetheless, the word “confounder” is often still used among applied researchers and in this paper we have shown that at least one formal counterfactual-based definition coheres with the way in which the word is generally used. We have considered a number of candidate proposals often arising from more informal statements made in the literature.
We have considered whether each of these definitions satisfies two properties, namely, (i) that on any causal diagram, control for all confounders so defined will control for confounding and (ii) any variable qualifying as a confounder under this criterion will in some context remove confounding. Only one of the definitions considered here satisfied both of these two properties. We thus proposed that a pre-exposure covariate $C$ be considered a confounder for the effect of $A$ on $Y$ if there exists a set of covariates $X$ such that the effect of the exposure on the outcome is unconfounded conditional on $(X, C)$ but for no proper subset of $(X, C)$ is the effect of the exposure on the outcome unconfounded given the subset. Equivalently, a confounder is a “member of a minimally sufficient adjustment set.” This is closely related to the definitions concerning confounders given in Robins and Morgenstern (1987), though Robins and Morgenstern suggest specifying the other variables for which control might be made as well. We have further provided a conditional analogue of the proposed definition of a confounder; and we have proposed that a variable that helps reduce bias but not eliminate bias be referred to as a “surrogate confounder.” The definition of a “confounder” above is given rigorously in terms of counterfactuals and, we believe, is also in accord with the intuitive properties of a “confounder” implicitly presupposed by practicing statisticians and epidemiologists. From a more theoretical perspective, Definition 4, unlike the other definitions, gives rise to elegant and useful results which itself lends further support for its being taken as the definition of a confounder.

APPENDIX

Review of causal diagrams. A directed graph consists of a set of nodes and directed edges among nodes.
A path is a sequence of distinct nodes connected by edges regardless of arrowhead direction; a directed path is a path which follows the edges in the direction indicated by the graph's arrows. A directed graph is acyclic if there is no node with a sequence of directed edges back to itself. The nodes with directed edges into a node $A$ are said to be the parents of $A$; the nodes into which there are directed edges from $A$ are said to be the children of $A$. We say that node $A$ is an ancestor of node $B$ if there is a directed path from $A$ to $B$; if $A$ is an ancestor of $B$, then $B$ is said to be a descendant of $A$. If $X$ denotes a set of nodes, then $\mathrm{An}(X)$ will denote the ancestors of $X$ and $\mathrm{Nd}(X)$ will denote the set of nondescendants of $X$. For a given graph $G$, and a set of nodes $S$, the graph $G_S$ denotes a subgraph of $G$ containing only vertices of $G$ in $S$ and only edges of $G$ between vertices in $S$. On the other hand, the graph $G_{\overline{S}}$ denotes the graph obtained from $G$ by removing all edges with arrowheads pointing to $S$. A node is said to be a collider for a particular path if it is such that both the preceding and subsequent nodes on the path have directed edges going into that node. A path between two nodes, $A$ and $B$, is said to be blocked given some set of nodes $C$ if either there is a variable in $C$ on the path that is not a collider for the path or if there is a collider on the path such that neither the collider itself nor any of its descendants are in $C$. For disjoint sets of nodes $A$, $B$ and $C$, we say that $A$ and $B$ are d-separated given $C$ if every path from any node in $A$ to any node in $B$ is blocked given $C$. Directed acyclic graphs are sometimes used as statistical models to encode independence relationships among variables represented by the nodes on the graph [Lauritzen (1996)].
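The definitions of a collider, a blocked path and d-separation translate directly into a small path-enumeration check. The following is a minimal sketch for single nodes rather than sets of nodes; the function names and the example edge set (an M-structure without the edge from $A$ to $Y$) are illustrative assumptions, and enumerating all paths is only practical for small graphs.

```python
def descendants(edges, node):
    """All nodes reachable from `node` along directed edges (parent, child)."""
    found, stack = set(), [node]
    while stack:
        current = stack.pop()
        for parent, child in edges:
            if parent == current and child not in found:
                found.add(child)
                stack.append(child)
    return found

def simple_paths(edges, start, end, path=None):
    """Yield every path of distinct nodes from start to end, ignoring arrow direction."""
    path = path or [start]
    if start == end:
        yield path
        return
    for parent, child in edges:
        for here, there in ((parent, child), (child, parent)):
            if here == start and there not in path:
                yield from simple_paths(edges, there, end, path + [there])

def is_blocked(path, edges, cond):
    """Apply the blocking rule to each interior node of the path."""
    for prev, mid, nxt in zip(path, path[1:], path[2:]):
        if (prev, mid) in edges and (nxt, mid) in edges:
            # Collider: blocks unless it or one of its descendants is conditioned on.
            opened = ({mid} | descendants(edges, mid)) & cond
            if not opened:
                return True
        elif mid in cond:
            # Noncollider that is conditioned on blocks the path.
            return True
    return False

def d_separated(edges, a, b, cond):
    """True if every path between nodes a and b is blocked given the set cond."""
    return all(is_blocked(p, edges, cond) for p in simple_paths(edges, a, b))

# Hypothetical M-structure: A <- C1 -> C3 <- C2 -> Y.
m_edges = {("C1", "A"), ("C1", "C3"), ("C2", "C3"), ("C2", "Y")}

sep_empty = d_separated(m_edges, "A", "Y", set())          # collider C3 blocks the path
sep_c3 = d_separated(m_edges, "A", "Y", {"C3"})            # conditioning on C3 opens it
sep_c3_c1 = d_separated(m_edges, "A", "Y", {"C3", "C1"})   # C1 blocks it again

print(sep_empty, sep_c3, sep_c3_c1)
```

The example shows the behavior that makes colliders special: conditioning on nothing leaves the only path blocked, conditioning on the collider opens it, and additionally conditioning on a noncollider on the path blocks it again.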
The variables corresponding to the nodes on a graph are said to satisfy the global Markov property for the directed acyclic graph (or to have a distribution compatible with the graph) if for any disjoint sets of nodes A, B, C we have that $A \perp\!\!\!\perp B \mid C$ whenever A and B are d-separated given C. The distribution of some set of variables V on the graph is said to be faithful to the graph if for all disjoint sets A, B, C of V we have that $A \perp\!\!\!\perp B \mid C$ only when A and B are d-separated given C.

Directed acyclic graphs can be interpreted as representing causal relationships. Pearl (1995) defined a causal directed acyclic graph as a directed acyclic graph with nodes $(X_1, \ldots, X_n)$ corresponding to variables such that each variable $X_i$ is given by its nonparametric structural equation $X_i = f_i(pa_i, \varepsilon_i)$, where $pa_i$ are the parents of $X_i$ on the graph and the $\varepsilon_i$ are mutually independent. For a causal diagram, the nonparametric structural equations encode counterfactual relationships among the variables represented on the graph. The equations themselves represent one-step ahead counterfactuals with other counterfactuals given by recursive substitution [see Pearl (2009) for further discussion]. A causal directed acyclic graph defined by nonparametric structural equations satisfies the global Markov property as stated above [Pearl (2009)]. The requirement that the $\varepsilon_i$ be mutually independent is essentially a requirement that there is no variable absent from the graph which, if included on the graph, would be a parent of two or more variables [Pearl (1995, 2009)]. Throughout we assume the exposure A consists of a single node. A backdoor path from A to Y is a path to Y which begins with an edge into A.
A set of variables X is said to satisfy the backdoor path criterion with respect to (A, Y) if no variable in X is a descendant of A and if X blocks all backdoor paths from A to Y. Pearl (1995) showed that if X satisfies the backdoor path criterion with respect to (A, Y), then the effect of A on Y is unconfounded given X, that is, $Y_a \perp\!\!\!\perp A \mid X$.

Empirical testing for confounders and confounding. The absence of confounding conditional on a set of covariates S, that is, $Y_a \perp\!\!\!\perp A \mid S$, is not a property that can be tested empirically with data. One must rely on subject matter knowledge, which may sometimes take the form of a causal diagram. Nonetheless, a few things can be said about empirical testing concerning confounding and confounders. For the sake of completeness, we will consider each of Definitions 1–6. It is possible to verify empirically whether a variable is a confounder under Definition 1 since the definition refers to observed associations; however, it is not possible, without further knowledge, to empirically verify that a variable does not satisfy Definition 1 because a variable may satisfy Definition 1 for some X that involves an unmeasured variable U. One would have to know that data were available for all variables on a causal diagram to empirically verify that a variable was a nonconfounder under Definition 1. Because of this, even though Definition 1 satisfies Property 1 under faithfulness, this cannot be used as an empirical test for confounding since (i) we cannot empirically verify that a variable is a nonconfounder under Definition 1 and (ii) we cannot empirically verify whether faithfulness holds. Without further assumptions, we cannot empirically verify that a variable is a confounder or a nonconfounder under Definition 2 because Definition 2 makes reference to backdoor paths.
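Once a causal diagram is assumed, the backdoor path criterion itself can be checked mechanically, even though the diagram cannot be verified from data. The sketch below follows the definitions in the appendix directly: it enumerates paths from A to Y, keeps those that begin with an edge into A, and applies the blocking rule to each. The edge-list representation, the function names and the example diagram are hypothetical choices made for illustration, and path enumeration is only practical for small graphs.

```python
def descendants(edges, node):
    """Return all descendants of `node`; `edges` is a list of (parent, child) pairs."""
    children = {}
    for u, v in edges:
        children.setdefault(u, set()).add(v)
    seen, stack = set(), [node]
    while stack:
        for w in children.get(stack.pop(), ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def paths_between(edges, start, end):
    """Yield each sequence of distinct nodes from start to end, ignoring arrow direction."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)

    def walk(path):
        if path[-1] == end:
            yield path
            return
        for w in nbrs.get(path[-1], ()):
            if w not in path:
                yield from walk(path + [w])

    yield from walk([start])


def is_blocked(edges, path, cond):
    """A path is blocked by a conditioned non-collider, or by a collider that
    is not conditioned on and has no conditioned descendant."""
    edge_set = set(edges)
    for prev, node, nxt in zip(path, path[1:], path[2:]):
        collider = (prev, node) in edge_set and (nxt, node) in edge_set
        if collider:
            if node not in cond and not (descendants(edges, node) & cond):
                return True
        elif node in cond:
            return True
    return False


def satisfies_backdoor(edges, a, y, x):
    """Check the backdoor path criterion for covariate set x with respect to (a, y)."""
    x = set(x)
    if descendants(edges, a) & x:
        return False  # x may not contain descendants of the exposure
    edge_set = set(edges)
    for path in paths_between(edges, a, y):
        starts_into_a = (path[1], a) in edge_set
        if starts_into_a and not is_blocked(edges, path, x):
            return False
    return True


# Hypothetical diagram: U1 -> A, U1 -> C, U2 -> C, U2 -> Y, A -> Y.
edges = [("U1", "A"), ("U1", "C"), ("U2", "C"), ("U2", "Y"), ("A", "Y")]
for x in (set(), {"C"}, {"C", "U1"}):
    print(sorted(x), satisfies_backdoor(edges, "A", "Y", x))
```

In this hypothetical diagram the empty set satisfies the criterion, the set {C} does not because C is a collider on the only backdoor path, and {C, U1} satisfies it again.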
Whether a variable lies on a backdoor path cannot be tested empirically without further assumptions; one would have to know the structure of the underlying causal diagram. Likewise, for Definitions 3 and 4, one would need to know all minimally sufficient adjustment sets, which itself would require checking the “no confounding” condition $Y_a \perp\!\!\!\perp A \mid S$, which is, as noted above, not empirically testable; though see below for some qualifications. For Definition 5, we could empirically reject the inequality in Definition 5 for observed X if $\sum_{x,c} \{E(Y \mid A=1,x,c) - E(Y \mid A=0,x,c)\}\,\mathrm{pr}(x,c) = \sum_{x} \{E(Y \mid A=1,x) - E(Y \mid A=0,x)\}\,\mathrm{pr}(x)$. However, we cannot empirically reject the inequality in Definition 5 for unobserved X and we, moreover, cannot empirically verify the inequality in Definition 5 because $E(Y_1) - E(Y_0)$ will not in general be empirically identified if there are unobserved variables. We can verify empirically whether a variable is a confounder under Definition 6 since the definition refers to only observed variables; however, it is not possible, without further knowledge, to empirically verify that a variable does not satisfy Definition 6 because a variable may satisfy Definition 6 for some X that involves an unmeasured variable U. One would have to know that data were available for all variables on a causal diagram to empirically verify that a variable was a nonconfounder under Definition 6. Because of this we cannot empirically verify that a variable is a nonconfounder under Definition 6.

Determining whether a variable is a confounder requires making untestable assumptions. The only real progress that can be made with empirical testing for confounders is by making other untestable assumptions that logically imply a test for assumptions we care about.
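To make the comparison for Definition 5 concrete, the sketch below computes the two standardized differences that appear in the equality above, one adjusting for the covariates with C and one without. It assumes a binary exposure coded 0/1, that every stratum contains both exposed and unexposed units, and a list-of-dictionaries data layout; the function names and the data are made up for illustration. In a real sample the comparison would also need a statistical test, which the sketch omits.

```python
from collections import defaultdict


def standardized_difference(records, covariates):
    """Sum over strata s of {E(Y | A=1, s) - E(Y | A=0, s)} pr(s).

    `records` is a list of dicts with keys "A", "Y" and the covariate names.
    """
    strata = defaultdict(lambda: {0: [], 1: []})
    for r in records:
        key = tuple(r[name] for name in covariates)
        strata[key][r["A"]].append(r["Y"])
    n = len(records)
    total = 0.0
    for groups in strata.values():
        exposed, unexposed = groups[1], groups[0]
        if not exposed or not unexposed:
            raise ValueError("a stratum lacks exposed or unexposed units")
        diff = sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)
        weight = (len(exposed) + len(unexposed)) / n
        total += weight * diff
    return total


def cell(c, a, n, n_events):
    """Hypothetical helper: n records in cell (C=c, A=a), n_events of them with Y=1."""
    return [{"C": c, "A": a, "Y": 1 if i < n_events else 0} for i in range(n)]


# Made-up data in which C is associated with both exposure and outcome.
records = cell(1, 1, 8, 6) + cell(1, 0, 2, 1) + cell(0, 1, 2, 1) + cell(0, 0, 8, 2)
with_c = standardized_difference(records, ["C"])
without_c = standardized_difference(records, [])
print(with_c, without_c)
```

Here X is taken to be empty, so the second quantity is the crude difference. The two quantities differ in this made-up data set, so the equality that would lead to rejecting the inequality in Definition 5 does not hold for it.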
For example, suppose we assume we have some set S that we are sure constitutes a sufficient adjustment set. In this case, we can sometimes remove variables as unnecessary for confounding control. In particular, Robins (1997) showed that if we knew that for covariate sets $S_1$ and $S_2$ we had that $Y_a \perp\!\!\!\perp A \mid (S_1, S_2)$, then we would also have that $Y_a \perp\!\!\!\perp A \mid S_1$ if $S_2$ can be decomposed into two disjoint subsets $T_1$ and $T_2$ such that $A \perp\!\!\!\perp T_1 \mid S_1$ and $Y \perp\!\!\!\perp T_2 \mid A, S_1, T_1$. Both of these latter conditions are empirically testable. Geng et al. (2001) provide some analogous results for the effect of exposure on the exposed. VanderWeele and Shpitser (2011) note that if for covariate set S we have that $Y_a \perp\!\!\!\perp A \mid S$, then if a backward selection procedure is applied to S such that variables are iteratively discarded that are independent of Y conditional on both exposure A and the members of S that have not yet been discarded, then the resulting set of covariates will suffice for confounding control. They also show that under an additional assumption of faithfulness, if, for covariate set S, we have that $Y_a \perp\!\!\!\perp A \mid S$, then if a forward selection procedure is applied to S such that, starting with the empty set, variables are iteratively added which are associated with Y conditional on both exposure A and the variables that have already been added, then the resulting set of covariates will suffice for confounding control. Note, however, all of these results require knowledge that for some set S, $Y_a \perp\!\!\!\perp A \mid S$, which is not itself empirically testable without experimental interventions.

Acknowledgments. The authors thank Sander Greenland, James Robins and Miguel Hernán for helpful comments on this paper.

REFERENCES

Barnow, B. S., Cain, G. G. and Goldberger, A. S. (1980). Issues in the analysis of selectivity bias.
In Evaluation Studies (E. Stromsdorfer and G. Farkas, eds.) 5. Sage, San Francisco.
Breslow, N. E. and Day, N. E. (1980). Statistical Methods in Cancer Research, Vol. 1: The Analysis of Case–Control Studies. International Agency for Research on Cancer, Lyon, France.
Cox, D. R. (1958). Planning of Experiments. Wiley, New York. MR0095561
Dawid, A. P. (2002). Influence diagrams for causal modeling and inference. Int. Statist. Rev. 70 161–189.
Geng, Z., Guo, J. and Fung, W.-K. (2002). Criteria for confounders in epidemiological studies. J. R. Stat. Soc. Ser. B Stat. Methodol. 64 3–15. MR1881841
Geng, Z. and Li, G. (2002). Conditions for non-confounding and collapsibility without knowledge of completely constructed causal diagrams. Scand. J. Stat. 29 169–181. MR1894389
Geng, Z., Guo, J., Lau, T. S. and Fung, W.-K. (2001). Confounding, homogeneity and collapsibility for causal effects in epidemiologic studies. Statist. Sinica 11 63–75. MR1820001
Glymour, M. M. and Greenland, S. (2008). Causal diagrams. In Modern Epidemiology, 3rd ed. (K. J. Rothman, S. Greenland and T. L. Lash, eds.) 12. Lippincott Williams and Wilkins, Philadelphia, PA.
Greenland, S. (2003). Quantifying biases in causal models: Classical confounding versus collider-stratification bias. Epidemiology 14 300–306.
Greenland, S. and Morgenstern, H. (2001). Confounding in health research. Annual Rev. Public Health 22 189–212.
Greenland, S., Pearl, J. and Robins, J. M. (1999). Causal diagrams for epidemiologic research. Epidemiology 10 37–48.
Greenland, S. and Pearl, J. (2007). Causal diagrams. In Encyclopedia of Epidemiology (S. Boslaugh, ed.) 149–156. Sage, Thousand Oaks, CA.
Greenland, S. and Pearl, J. (2011). Adjustments and their consequences—collapsibility analysis using graphical models. International Statistical Review 79 401–426.
Greenland, S. and Robins, J. M. (1986).
Identifiability, exchangeability, and epidemiological confounding. Int. J. Epidemiol. 15 413–419.
Greenland, S., Robins, J. M. and Pearl, J. (1999). Confounding and collapsibility in causal inference. Statist. Sci. 14 29–46.
Greenland, S. and Robins, J. M. (2009). Identifiability, exchangeability and confounding revisited. Epidemiol. Perspect. Innov. 6 4.
Hernán, M. A. (2008). Confounding. In Encyclopedia of Quantitative Risk Assessment and Analysis (B. Everitt and E. Melnick, eds.) 353–362. Wiley, Chichester, UK.
Hernán, M. A., Hernández-Díaz, S., Werler, M. M. and Mitchell, A. A. (2002). Causal knowledge as a prerequisite for confounding evaluation: An application to birth defects epidemiology. American Journal of Epidemiology 155 176–184.
Imbens, G. W. (2004). Nonparametric estimation of average treatment effects under exogeneity: A review. Rev. Econom. Statist. 86 4–29.
Kleinbaum, D. G., Kupper, L. L. and Morgenstern, H. (1982). Epidemiologic Research: Principles and Quantitative Methods. Lifetime Learning Publications [Wadsworth], Belmont, CA. MR0684361
Lauritzen, S. L. (1996). Graphical Models. Oxford Univ. Press, New York.
Miettinen, O. S. (1974). Confounding and effect modification. Am. J. Epidemiol. 100 350–353.
Miettinen, O. S. (1976). Stratification by a multivariate confounder score. Am. J. Epidemiol. 104 609–620.
Miettinen, O. S. and Cook, E. F. (1981). Confounding: Essence and detection. Am. J. Epidemiol. 114 593–603.
Morabia, A. (2011). History of the modern epidemiological concept of confounding. J. Epidemiol. Community Health 65 297–300.
Neyman, J. (1923). Sur les applications de la thar des probabilities aux experiences Agaricales: Essay des principle. Excerpts reprinted (1990) in English (D. Dabrowska and T. Speed, trans.). Statist. Sci. 5 463–472.
Ogburn, E. L. and VanderWeele, T. J. (2012).
On the nondifferential misclassification of a binary confounder. Epidemiology 23 433–439.
Pearl, J. (1995). Causal diagrams for empirical research. Biometrika 82 669–710. MR1380809
Pearl, J. (2009). Causality: Models, Reasoning, and Inference, 2nd ed. Cambridge Univ. Press, Cambridge. MR2548166
Robins, J. (1992). Estimation of the time-dependent accelerated failure time model in the presence of confounding factors. Biometrika 79 321–334. MR1185134
Robins, J. M. (1997). Causal inference from complex longitudinal data. In Latent Variable Modeling and Applications to Causality (Los Angeles, CA, 1994) (M. Berkane, ed.). Lecture Notes in Statistics 120 69–117. Springer, New York. MR1601279
Robins, J. M. and Greenland, S. (1986). The role of model selection in causal inference from nonexperimental data. Am. J. Epidemiol. 123 392–402.
Robins, J. M. and Morgenstern, H. (1987). The foundations of confounding in epidemiology. Comput. Math. Appl. 14 869–916. MR0922790
Robins, J. M. and Richardson, T. S. (2010). Alternative graphical causal models and the identification of direct effects. In Causality and Psychopathology: Finding the Determinants of Disorders and Their Cures (P. E. Shrout, K. M. Keyes and K. Ornstein, eds.) 103–158. Oxford Univ. Press, New York.
Rosenbaum, P. R. and Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika 70 41–55. MR0742974
Rubin, D. B. (1978). Bayesian inference for causal effects: The role of randomization. Ann. Statist. 6 34–58. MR0472152
Rubin, D. B. (1990). Formal modes of statistical inference for causal effects. J. Statist. Plann. Inference 25 279–292.
Shpitser, I., VanderWeele, T. J. and Robins, J. M. (2010). On the validity of covariate adjustment for estimating causal effects. In Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence 527–536.
AUAI Press, Corvallis, OR.
Spirtes, P., Glymour, C. and Scheines, R. (1993). Causation, Prediction, and Search. Lecture Notes in Statistics 81. Springer, New York. MR1227558
VanderWeele, T. J. (2012). Confounding and effect modification: Distribution and measure. Epidemiologic Methods 1 55–82.
VanderWeele, T. J. and Shpitser, I. (2011). A new criterion for confounder selection. Biometrics 67 1406–1413. MR2872391

Departments of Epidemiology and Biostatistics
Harvard School of Public Health
677 Huntington Avenue
Boston, Massachusetts 02115
USA
E-mail: tvanderw@hsph.harvard.edu
URL: http://www.hsph.harvard.edu/faculty/tyler-vanderweele/

Department of Epidemiology
Harvard School of Public Health
677 Huntington Avenue
Boston, Massachusetts 02115
USA
E-mail: shpitse@hsph.harvard.edu
