On the definition of a confounder

The causal inference literature has provided a clear formal definition of confounding expressed in terms of counterfactual independence. The literature has not, however, come to any consensus on a formal definition of a confounder, as it has given pr…

Authors: Tyler J. VanderWeele, Ilya Shpitser

The Annals of Statistics 2013, Vol. 41, No. 1, 196–220. DOI: 10.1214/12-AOS1058. © Institute of Mathematical Statistics, 2013.

By Tyler J. VanderWeele and Ilya Shpitser, Harvard University

The causal inference literature has provided a clear formal definition of confounding expressed in terms of counterfactual independence. The literature has not, however, come to any consensus on a formal definition of a confounder, as it has given priority to the concept of confounding over that of a confounder. We consider a number of candidate definitions arising from various more informal statements made in the literature. We consider the properties satisfied by each candidate definition, principally focusing on (i) whether under the candidate definition control for all “confounders” suffices to control for “confounding” and (ii) whether each confounder in some context helps eliminate or reduce confounding bias. Several of the candidate definitions do not have these two properties. Only one candidate definition of those considered satisfies both properties. We propose that a “confounder” be defined as a pre-exposure covariate $C$ for which there exists a set of other covariates $X$ such that the effect of the exposure on the outcome is unconfounded conditional on $(X, C)$ but such that for no proper subset of $(X, C)$ is the effect of the exposure on the outcome unconfounded given the subset. We also provide a conditional analogue of the above definition; and we propose that a variable that helps reduce bias but not eliminate bias be referred to as a “surrogate confounder.” These definitions are closely related to those given by Robins and Morgenstern [Comput. Math. Appl. 14 (1987) 869–916]. The implications that hold among the various candidate definitions are discussed.

1. Introduction.
Statisticians and epidemiologists had traditionally conceived of a confounder as a pre-exposure variable that was associated with exposure and associated also with the outcome conditional on the exposure, possibly conditional also on other covariates [Miettinen (1974)]. The developments in causal inference over the past two decades have made clear that this definition of a “confounder” is inadequate: there can be pre-exposure variables associated with the exposure and the outcome, the control of which introduces rather than eliminates bias [Greenland, Pearl and Robins (1999), Glymour and Greenland (2008), Pearl (2009)]. The literature has moved away from formal language about “confounders” and instead places the conceptual emphasis on “confounding.” See Morabia (2011) for historical discussion of this point. The causal inference literature has provided a formal definition of “confounding” in terms of dependence of counterfactual outcomes and exposure, possibly conditional on covariates. The absence of confounding (independence of the counterfactual outcomes and the exposure) has been taken as the foundational assumption for drawing causal inferences.

Received December 2011; revised September 2012. Funded by the National Institutes of Health, USA. AMS 2000 subject classifications. Primary 62A01; secondary 68T30, 62J99. Key words and phrases. Causal inference, causal diagrams, counterfactual, confounder, minimal sufficiency. This is an electronic reprint of the original article published by the Institute of Mathematical Statistics in The Annals of Statistics, 2013, Vol. 41, No. 1, 196–220. This reprint differs from the original in pagination and typographic detail.
Such absence of confounding is alternatively referred to as “ignorability” or “ignorable treatment assignment” [Rubin (1978)], “exchangeability” [Greenland and Robins (1986)], “no unmeasured confounding” [Robins (1992)], “selection on observables” [Barnow, Cain and Goldberger (1980), Imbens (2004)] or “exogeneity” [Imbens (2004)]. Today, at least within the formal methodological literature on causality, language concerning “confounders” is generally used only informally, if at all. The priority that has been given to “confounding” over “confounders” has arguably brought clarity and precision to the field. Nevertheless, among practicing statisticians and epidemiologists, language concerning both “confounders” and “confounding” is common. This raises the question as to whether a formal definition of a “confounder” can also be given within the counterfactual framework that coheres with how the word seems to be used in practice.

In this paper we will consider various definitions of a confounder proposed either formally or informally by a number of prominent statisticians and epidemiologists. For each potential definition we will consider the properties satisfied by the candidate definition. Specifically, we state and prove a number of propositions showing whether under each candidate definition (i) control for all “confounders” suffices to control for “confounding” and (ii) whether each confounder in some context helps eliminate or reduce confounding bias. As we will see below, only one candidate definition of those considered satisfies both properties. We consider also the implications that hold between the various definitions themselves.

2. Notation and framework.
We let $A$ denote an exposure, $Y$ the outcome, and we will use $C$, $S$ and $X$ to denote particular pre-exposure covariates or sets of covariates (that may or may not be measured). As noted in the penultimate section of the paper, the restriction to pre-exposure covariates could, in the context of causal diagrams [Pearl (1995, 2009)], be replaced by a restriction to nondescendants of exposure $A$. Within the counterfactual or potential outcomes framework [Neyman (1923), Rubin (1978)], we let $Y_a$ denote the potential outcome for $Y$ if exposure $A$ were set, possibly contrary to fact, to the value $a$. If the exposure is binary, the average causal effect is given by $E(Y_1) - E(Y_0)$. Note that the potential outcomes notation $Y_a$ presupposes that an individual's potential outcome does not depend on the exposures of other individuals. This assumption is sometimes referred to as SUTVA, the stable unit treatment value assumption [Rubin (1990)], or as a no-interference assumption [Cox (1958)].

We use the notation $E \perp\!\!\!\perp F \mid G$ to denote that $E$ is independent of $F$ conditional on $G$. For exposure $A$ and outcome $Y$, we say there is no confounding conditional on $S$ (or that the effect of $A$ on $Y$ is unconfounded given $S$) if $Y_a \perp\!\!\!\perp A \mid S$. We will refer to any such $S$ as a sufficient set or a sufficient adjustment set. If the effect of $A$ on $Y$ is unconfounded given $S$, then the causal effect can be consistently estimated by

$$E(Y_1) - E(Y_0) = \sum_{s} \{E(Y \mid A=1, s) - E(Y \mid A=0, s)\}\,\mathrm{pr}(s)$$

[Rosenbaum and Rubin (1983)]. We will say that $S = (S_1, \ldots, S_n)$ constitutes a minimally sufficient adjustment set if $Y_a \perp\!\!\!\perp A \mid S$ but there is no proper subset $T$ of $S$ such that $Y_a \perp\!\!\!\perp A \mid T$, where “proper subset” here is understood as $T$ being a strict subset of the coordinates of $S = (S_1, \ldots, S_n)$.
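The standardization formula above can be illustrated with a small numerical sketch. The stratum probabilities and conditional means below are hypothetical numbers chosen only for illustration, and the function name `standardized_effect` is not from the paper; the sketch assumes a single discrete adjustment variable $S$ that is taken to be sufficient.

```python
def standardized_effect(p_s, e_y):
    """Return sum_s {E(Y|A=1,s) - E(Y|A=0,s)} pr(s).

    p_s maps each stratum s to pr(s); e_y maps (a, s) to E(Y | A=a, s).
    """
    return sum(p * (e_y[(1, s)] - e_y[(0, s)]) for s, p in p_s.items())


# Hypothetical inputs (assumed for illustration, not taken from the paper).
p_s = {0: 0.6, 1: 0.4}
e_y = {(1, 0): 0.30, (0, 0): 0.20, (1, 1): 0.70, (0, 1): 0.40}

effect = standardized_effect(p_s, e_y)  # 0.6 * 0.10 + 0.4 * 0.30
```

If $S$ is in fact a sufficient adjustment set, this quantity equals $E(Y_1) - E(Y_0)$; if it is not, the same arithmetic still runs but the result need not have a causal interpretation.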
Some of the candidate definitions of a confounder below define “confounder” in terms of “confounding” via reference to “sufficient adjustment sets” or “minimally sufficient adjustment sets.” Such definitions give conceptual priority to “confounding,” as has generally been done in the causal inference literature [Greenland and Robins (1986), Greenland and Morgenstern (2001), Hernán (2008)]. Often after formal definitions of “confounding” are given, a “confounder” is defined as a derivative and sometimes informal concept. For example, in papers by Greenland, Pearl and Robins (1999) and Greenland and Morgenstern (2001), formal definitions are given for “confounding” and then a “confounder” is simply described as a variable that is in some sense “responsible” [Greenland, Robins and Pearl (1999), page 33] for confounding. Although priority arguably has been and should be given to the concept of “confounding” over “confounder,” applied researchers will often use the word “confounder” to refer to a single variable that is perhaps a member of a sufficient adjustment set but does not by itself constitute a sufficient adjustment set. This raises the question of whether this use of “confounder” can be given a coherent definition within the counterfactual framework.

Most of the definitions and properties we discuss make reference only to counterfactual outcomes. However, one of the definitions and several propositions make reference to causal diagrams. We will thus restrict attention in this paper to causal diagrams. We review concepts and definitions for causal diagrams in the Appendix; the reader can also consult Pearl (1995, 2009). For expository purposes we follow Pearl (1995), but the results in the paper are equally applicable to all of the alternative graphical causal models considered, for example, by Robins and Richardson (2010).
In short, following Pearl (1995), a causal diagram is a very general data generating process corresponding to a set of nonparametric structural equations where each variable $X_i$ is given by its nonparametric structural equation $X_i = f_i(pa_i, \varepsilon_i)$, where $pa_i$ are the parents of $X_i$ on the graph and the $\varepsilon_i$ are mutually independent, such that the structural equations encode one-step ahead counterfactual relationships among the variables with other counterfactuals given by recursive substitution [Pearl (1995, 2009)]. The assumption of “faithfulness” is said to be satisfied if all of the conditional independence relationships among the variables are implied by the structure of the graph; see the Appendix for further details. A backdoor path from $A$ to $Y$ is a path to $Y$ which begins with an edge into $A$. Pearl (1995) showed that if a set of pre-exposure covariates $S$ blocks all backdoor paths from $A$ to $Y$, then the effect of $A$ on $Y$ is unconfounded given $S$.

The definitions given below will be stated formally in terms of potential outcomes and causal diagrams. It is assumed that there is an underlying causal diagram which may contain both measured and unmeasured variables; all variables considered in the definitions are variables on the diagram. Whether a variable satisfies the criteria of a particular definition will be relative to the causal diagram. In Section 6 we will consider settings with multiple causal diagrams where one diagram may have variables absent on another.

3. Candidate definitions for a confounder. Here we give a number of candidate definitions of a confounder motivated by statements made in the methodological literature.
We will cite specific statements from the methodologic literature; we do not necessarily believe these statements were intended as formal definitions of a “confounder” by the authors cited. We simply use these statements to motivate the candidate definitions. As noted above, we believe statements about “confounders,” as opposed to “confounding,” have generally been used only informally and intuitively.

As already noted, the traditional conception of a confounder in statistics and epidemiology has been a variable associated with both the treatment and the outcome. Miettinen (1974) notes that whether such associations hold will depend on what other variables are controlled for in an analysis. This motivates our first candidate definition for a confounder.

Definition 1. A pre-exposure covariate $C$ is a confounder for the effect of $A$ on $Y$ if there exists a set of pre-exposure covariates $X$ such that $C \not\perp\!\!\!\perp A \mid X$ and $C \not\perp\!\!\!\perp Y \mid (A, X)$.

Definition 1 is essentially a generalization of the traditional conceptualization of a confounder.

Pearl (1995) showed that if a set of pre-exposure covariates $X$ blocks all backdoor paths from $A$ to $Y$, then the effect of $A$ on $Y$ is unconfounded given $X$. Hernán (2008) accordingly speaks of a confounder as a variable that “can be used to block a backdoor path between exposure and outcome” (page 355). A similar definition of a confounder is given in Greenland and Pearl [(2007), page 152] and in Glymour and Greenland [(2008), page 193]. This motivates a second candidate definition.

Definition 2. A pre-exposure covariate $C$ is a confounder for the effect of $A$ on $Y$ if it blocks a backdoor path from $A$ to $Y$.
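As a rough illustration of Definition 2, the sketch below enumerates backdoor paths on a small directed acyclic graph and collects the covariates that lie on some backdoor path as non-colliders. Treating “blocks a backdoor path” as “is a non-collider on at least one backdoor path” is an assumption of this sketch, one possible reading rather than the paper's formal statement. The edge list is a hypothetical diagram, and the function names are illustrative only.

```python
def backdoor_paths(edges, a, y):
    """List simple paths from a to y whose first edge points into a."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    found = []

    def extend(path):
        if path[-1] == y:
            found.append(path)
            return
        for nxt in sorted(neighbours[path[-1]]):
            if nxt not in path:
                extend(path + [nxt])

    # A backdoor path must start with an edge parent -> a.
    for parent in sorted(u for u, v in edges if v == a):
        extend([a, parent])
    return found


def noncolliders(edges, path):
    """Interior nodes of the path that are not colliders on it."""
    edge_set = set(edges)
    return {m for l, m, r in zip(path, path[1:], path[2:])
            if not ((l, m) in edge_set and (r, m) in edge_set)}


def definition2_confounders(edges, a, y, pre_exposure):
    """Covariates that are non-colliders on some backdoor path (assumed reading)."""
    hits = set()
    for path in backdoor_paths(edges, a, y):
        hits |= noncolliders(edges, path) & set(pre_exposure)
    return hits


# Hypothetical diagram: C1 -> C2 -> A, C1 -> Y, A -> Y.
edges = [("C1", "C2"), ("C2", "A"), ("C1", "Y"), ("A", "Y")]
paths = backdoor_paths(edges, "A", "Y")
confounders = definition2_confounders(edges, "A", "Y", ["C1", "C2"])
```

On this hypothetical diagram the only backdoor path is A ← C2 ← C1 → Y, and both covariates are non-colliders on it. The enumeration is exponential in the worst case, so the sketch is suited only to small diagrams.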
The second definition is perhaps one that would arise most naturally within the context of causal diagrams; the definition itself of course presupposes a framework of causal diagrams or variants thereof [Spirtes, Glymour and Scheines (1993), Dawid (2002)].

Pearl (2009) speaks of a confounder as “a variable that is a member of every sufficient [adjustment] set” (page 195), that is, control for it must be necessary. Likewise, Robins and Greenland (1986) write, “We will call a covariate a confounder if estimators which are not adjusted for the covariate are biased” (page 393) and Hernán (2008) speaks of a confounder as “any variable that is necessary to eliminate the bias in the analysis” (page 357). Note that a variable is a member of every sufficient adjustment set if and only if it is a member of every minimal sufficient adjustment set. This motivates our third candidate definition.

Definition 3. A pre-exposure covariate $C$ is a confounder for the effect of $A$ on $Y$ if it is a member of every minimally sufficient adjustment set.

Definition 3 captures the notion that controlling for a confounder might be necessary to eliminate bias. The definition makes reference to “every minimally sufficient adjustment set;” this will be relative to a particular causal diagram, a point to which we will return below.

Kleinbaum, Kupper and Morgenstern (1982), in a textbook on epidemiologic research, gave as a definition of a “confounder” a variable that is “a member of a sufficient confounder group” where a sufficient confounder group is defined as “a minimal set of one or more risk factors whose simultaneous control in the analysis will correct for joint confounding in the estimation of the effect of interest” (page 276).
Kleinbaum, Kupper and Morgenstern (1982), however, define “confounding” in terms of association rather than counterfactual independence. As a variant of the Kleinbaum, Kupper and Morgenstern proposal, we could retain the definition “a member of a minimally sufficient adjustment set” but use the counterfactual definition of “confounding.” This motivates the fourth candidate definition.

Definition 4. A pre-exposure covariate $C$ is a confounder for the effect of $A$ on $Y$ if it is a member of some minimally sufficient adjustment set.

Definition 4 can be restated as follows: a pre-exposure covariate $C$ is a confounder for the effect of $A$ on $Y$ if there exists a set of pre-exposure covariates $X$ (possibly empty) such that $Y_a \perp\!\!\!\perp A \mid (X, C)$ but there is no proper subset $T$ of $(X, C)$ such that $Y_a \perp\!\!\!\perp A \mid T$. Robins and Morgenstern (1987) and Dawid (2002) likewise conceive of a confounder in terms of the presence or absence of confounding in such a way that coincides with Definition 4 when there is a single confounder. When there are multiple sets that are sufficient, or sets that are sufficient but not minimally sufficient, it is not clear how the definition of Dawid (2002) generalizes; the definitions of Robins and Morgenstern (1987) can be adapted to coincide with Definition 4. Robins and Morgenstern [(1987), Section 2H] say that $C$ is a confounder conditional on $F$ if causal effects are computable given data on $C$ and $F$, but not on $F$ alone. In the framework of Robins and Morgenstern, if one were to take as the (unconditional) definition of a confounder that “there exists some set $F$ such that $C$ is a confounder conditional on $F$ [in the sense of Robins and Morgenstern (1987), Section 2H],” then this would coincide with Definition 4.
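Definitions 3 and 4 differ only in whether membership is required in every or in some minimally sufficient adjustment set, and the contrast can be sketched as an intersection versus a union. The sketch below assumes that sufficiency of a covariate set is supplied by a user-provided predicate; the predicate `toy_oracle` is hypothetical (it declares a set sufficient exactly when it contains C1 or C2), and all function names are illustrative.

```python
from itertools import combinations


def minimally_sufficient_sets(covariates, is_sufficient):
    """Sufficient subsets having no sufficient proper subset."""
    subsets = [frozenset(c)
               for r in range(len(covariates) + 1)
               for c in combinations(covariates, r)]
    sufficient = [s for s in subsets if is_sufficient(s)]
    # Minimality is checked against ALL proper subsets, since sufficiency
    # need not be preserved when covariates are added or removed.
    return [s for s in sufficient if not any(t < s for t in sufficient)]


def definition3(minimal_sets):
    """Members of every minimally sufficient set (empty if none exists)."""
    return set(frozenset.intersection(*minimal_sets)) if minimal_sets else set()


def definition4(minimal_sets):
    """Members of some minimally sufficient set."""
    return set().union(*minimal_sets)


def toy_oracle(s):
    # Hypothetical sufficiency rule, assumed for illustration only.
    return "C1" in s or "C2" in s


minimal = minimally_sufficient_sets(["C1", "C2"], toy_oracle)
conf_def3 = definition3(minimal)
conf_def4 = definition4(minimal)
```

Under this hypothetical rule the minimally sufficient sets are {C1} and {C2}, so the intersection is empty while the union contains both covariates. Returning the empty set when no sufficient set exists is a convention of the sketch, not something the definitions themselves settle.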
Miettinen and Cook (1981) conceive of a confounder as any variable that is helpful in reducing bias. Hernán (2008) likewise speaks of a confounder as “any variable that can be used to reduce [confounding] bias” (page 355). Geng, Guo and Fung (2002) use a similar definition for confounding. As noted by other authors [Greenland and Morgenstern (2001), Hernán (2008)], whether a variable is helpful in reducing bias will depend on what other variables are being conditioned on in the analysis; a confounder should be helpful for reducing bias in some context. This motivates our fifth definition.

Definition 5. A pre-exposure covariate $C$ is a confounder for the effect of $A$ on $Y$ if there exists a set of pre-exposure covariates $X$ such that

$$\Bigl|\sum_{x,c} \{E(Y \mid A=1, x, c) - E(Y \mid A=0, x, c)\}\,\mathrm{pr}(x, c) - \{E(Y_1) - E(Y_0)\}\Bigr| < \Bigl|\sum_{x} \{E(Y \mid A=1, x) - E(Y \mid A=0, x)\}\,\mathrm{pr}(x) - \{E(Y_1) - E(Y_0)\}\Bigr|.$$

Definition 5 captures the notion that controlling for $C$ along with $X$ results in lower bias in the estimate of the causal effect than controlling for $X$ alone. A number of variants of Definition 5 could also be considered. Geng, Guo and Fung (2002), for example, considered the analogous definition for the effect of the exposure on the exposed rather than the overall effect of the exposure on the population; one could likewise consider the analogue of Definition 5 for effects conditional on $X$ rather than standardized over $X$ or, alternatively, for different measures of effect, for example, risk ratios or odds ratios rather than causal effects on the difference scale. Definition 5, unlike other definitions, is inherently scale-dependent. Thus, under Definition 5, a variable $C$ might be a confounder for $Y$ but not for $\log(Y)$ or vice versa. This is an important limitation of Definition 5.
Note, however, that some authors also consider “confounding” to be scale-dependent [Greenland and Robins (1986, 2009), Greenland and Morgenstern (2001)] and use “ignorability” to refer to the notion of unconfoundedness in the distribution of counterfactuals as given above.

Confounders have also sometimes been defined in terms of empirical collapsibility [Miettinen (1976), Breslow and Day (1980)], that is, if one obtains the same estimate with or without adjustment for a variable, then it is not a confounder. In the applied literature the approach is sometimes encapsulated in the “10 percent rule,” that is, discard a covariate if adjustment for it does not change an estimate by more than 10 percent. It is well documented in the literature that collapsibility-based definitions do not work for all effect measures, such as the odds ratio or hazard ratios, for which marginal and conditional effects may differ even in the absence of confounding [Greenland, Robins and Pearl (1999)]. Such effect measures are sometimes referred to as noncollapsible. However, for at least the risk difference scale (or the risk ratio scale) a collapsibility-based definition of a confounder could be entertained and for completeness we consider it also here. Such a collapsibility-based definition could be formalized as follows.

Definition 6. A pre-exposure covariate $C$ is a confounder for the effect of $A$ on $Y$ if there exists a set of pre-exposure covariates $X$ such that

$$\sum_{x,c} \{E(Y \mid A=1, x, c) - E(Y \mid A=0, x, c)\}\,\mathrm{pr}(x, c) \neq \sum_{x} \{E(Y \mid A=1, x) - E(Y \mid A=0, x)\}\,\mathrm{pr}(x).$$

Definition 6, like Definition 5, is scale-dependent.

Although not the focus of the present paper, in the Appendix we give some further remarks on the possibility of empirical testing for each of Definitions 1–6 and for confounding and nonconfounding more generally.
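The remark that collapsibility-based definitions fail for the odds ratio can be made concrete with a small numerical sketch. The risks below are hypothetical, and the covariate $C$ is assumed to be independent of exposure $A$ (as under randomization), so that $C$ cannot confound the effect of $A$ on $Y$; variable names are illustrative.

```python
def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1 - p)


# Hypothetical setup: C is binary with pr(C=1) = 0.5 and independent of A.
p_c = {0: 0.5, 1: 0.5}
# risk[(a, c)] = P(Y=1 | A=a, C=c), assumed values for illustration.
risk = {(0, 0): 0.2, (1, 0): 0.5, (0, 1): 0.5, (1, 1): 0.8}

# Stratum-specific (conditional) measures.
cond_or = {c: odds(risk[(1, c)]) / odds(risk[(0, c)]) for c in p_c}
cond_rd = {c: risk[(1, c)] - risk[(0, c)] for c in p_c}

# Marginal measures; because C is independent of A, averaging over pr(c)
# gives P(Y=1 | A=a).
marg_risk = {a: sum(p_c[c] * risk[(a, c)] for c in p_c) for a in (0, 1)}
marg_or = odds(marg_risk[1]) / odds(marg_risk[0])
marg_rd = marg_risk[1] - marg_risk[0]
```

In this sketch the odds ratio is 4 within each stratum of $C$ but roughly 3.45 marginally, while the risk difference is 0.3 both within strata and marginally. A change-in-estimate rule applied on the odds ratio scale would therefore flag $C$ even though, by construction, it does not confound.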
However, for the most part, notions of confounding and confounders, under these six definitions, are not empirically testable without further experimental data or strong assumptions.

4. Properties of a confounder. Language about “confounders” occurs of course not simply in methodologic work but in substantive statistical and epidemiologic research. In the design and analysis of observational studies in the applied literature the task of controlling for “confounding” is often construed as that of collecting data on and controlling for all “confounders.” In this section we propose that when language about “confounders” is generally used in statistics and epidemiology, two things are implicitly presupposed: first, that if one were to control for all “confounders,” then this would suffice to control for “confounding” and, second, that control for a “confounder” will in some sense help to reduce or eliminate confounding bias. We would propose that if a formal definition is to be given for a “confounder,” it should in some sense satisfy these two properties. If it does not, it arguably does not cohere with what is typically presupposed when language about “confounders” is used in practice. We give a formalization of these two properties and in the following section we will discuss which of these two properties are satisfied by each of the candidate definitions of the previous section.

We could formalize the first property as follows.

Property 1. If $S$ consists of the set of all confounders for the effect of $A$ on $Y$, then there is no confounding of the effect of $A$ on $Y$ conditional on $S$, that is, $Y_a \perp\!\!\!\perp A \mid S$.

The definition makes reference to “all confounders;” to make reference to all such variables, the domain of the variables considered needs to be specified.
The domain here will be all pre-exposure variables on a particular causal diagram that qualify as confounders according to whatever definition is in view. See Section 6 for some extensions.

The second property is that control for a confounder should help either reduce or eliminate bias. The reduction and the elimination of bias are not equivalent and, thus, we will formally give two alternative properties, 2A and 2B.

Property 2A. If $C$ is a confounder for the effect of $A$ on $Y$, then there exists a set of pre-exposure covariates $X$ (possibly empty) such that $Y_a \perp\!\!\!\perp A \mid (X, C)$ but $Y_a \not\perp\!\!\!\perp A \mid X$.

Property 2B. If $C$ is a confounder for the effect of $A$ on $Y$, then there exists a set of pre-exposure covariates $X$ (possibly empty) such that

$$\Bigl|\sum_{x,c} \{E(Y \mid A=1, x, c) - E(Y \mid A=0, x, c)\}\,\mathrm{pr}(x, c) - \{E(Y_1) - E(Y_0)\}\Bigr| < \Bigl|\sum_{x} \{E(Y \mid A=1, x) - E(Y \mid A=0, x)\}\,\mathrm{pr}(x) - \{E(Y_1) - E(Y_0)\}\Bigr|.$$

Property 2A captures the notion that in some context, that is, conditional on $X$, the covariate $C$ helps eliminate bias. Property 2B captures the notion that in some context, that is, conditional on $X$, the covariate $C$ helps reduce bias. Note that Property 2B, like Definition 5, is inherently scale-dependent and in this sense perhaps less fundamental than Property 2A. For now we simply propose that for a candidate definition of a confounder to adequately capture the intuitive sense in which the word is used, it should satisfy Property 1 and should also satisfy either Property 2A or 2B. It would be peculiar if a confounder were defined in a way that it did not satisfy these two properties. In the next section we consider whether each of the candidate definitions, Definitions 1–6, satisfies Properties 1, 2A and 2B.
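The condition in Property 2A is an existence statement over covariate sets $X$, which can be sketched as a brute-force search when sufficiency is available as a predicate. The predicate `m_shape_oracle` below is hypothetical: it treats a set as sufficient unless it contains C3 without also containing C1 or C2, a pattern chosen to resemble a collider-type structure. The function name `witness_2a` is likewise illustrative.

```python
from itertools import combinations


def witness_2a(c, covariates, is_sufficient):
    """Return a set X with (X, c) sufficient but X not sufficient, else None."""
    others = [v for v in covariates if v != c]
    for r in range(len(others) + 1):
        for combo in combinations(others, r):
            x = frozenset(combo)
            if is_sufficient(x | {c}) and not is_sufficient(x):
                return x
    return None


def m_shape_oracle(s):
    # Hypothetical rule: adjusting for C3 alone is harmful; adding C1 or C2
    # (or omitting C3) restores sufficiency.
    return "C3" not in s or "C1" in s or "C2" in s


covariates = ["C1", "C2", "C3"]
witnesses = {c: witness_2a(c, covariates, m_shape_oracle) for c in covariates}
```

Under this hypothetical rule the search finds $X = \{C_3\}$ as a witness for C1 and for C2, and finds no witness for C3, since every set not containing C3 is already sufficient. The result is compared against `None` rather than tested for truthiness because the empty set is a legitimate witness.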
Of course, one possible outcome of this exercise is that none of the candidate definitions satisfy Property 1 and either Property 2A or 2B (or even that no candidate definition could). However, as we will see in the next section, this turns out not to be the case.

Fig. 1. Definition 1 does not satisfy Property 2A or 2B.

5. Properties of the candidate definitions. Definition 1 was a generalization of the traditional epidemiologic conception of a confounder as a variable associated with exposure and outcome. For this definition we have the following result.

Proposition 1. Under faithfulness, for every causal diagram, Definition 1 satisfies Property 1. Definition 1 does not satisfy Properties 2A or 2B.

Proof. We first show that Definition 1 satisfies Property 1 in faithful models. Let $G^* = G_{\mathrm{Nd}(A) \cup \mathrm{An}(Y)}$ be the subgraph of $G$ that has only the nodes in $\mathrm{Nd}(A)$ or $\mathrm{An}(Y)$; see the Appendix. Let $\mathrm{Pa}^*$ be the subset of $\mathrm{Pa}(A)$ in $G^*$ such that every element $P \in \mathrm{Pa}^*$ has some path in $G^*$ to $Y$ not through $A$. Since we consider faithful models, we can use d-connectedness to represent dependence. First we note that every element in $\mathrm{Pa}^*$ satisfies Definition 1. Indeed, any element of $\mathrm{Pa}(A)$ is dependent on $A$ conditioned on any set. For any member of $\mathrm{Pa}^*$, we fix some path $\pi$ to $Y$ (not through $A$). We are now free to pick any set $X$ to make this path d-connected (e.g., we can pick the smallest $X$ that opens all colliders in $\pi$). With this set $X$, the chosen member of $\mathrm{Pa}^*$ satisfies Definition 1 with respect to $A$ and $Y$. Thus, the set of all nodes in $\mathrm{Nd}(A)$ satisfying Definition 1 will include $\mathrm{Pa}^*$. Next, we show that any superset of $\mathrm{Pa}^*$ in $\mathrm{Nd}(A)$ will be a valid adjustment set for $(A, Y)$. Assume this is not the case for a particular $S$, and fix a backdoor path from $A$ to $Y$ which is open given $S$.
Then the first node on this path after $A$ must be in $\mathrm{Pa}^*$. But this means the path is blocked by $S$. Our conclusion follows.

We now show Definition 1 does not satisfy Properties 2A or 2B. Consider the causal diagram in Figure 1. The variable $C_3$ is unconditionally associated with $A$ and $Y$; the variables $C_1$ and $C_2$ are each associated with $A$ and $Y$ conditional on $C_3$. Thus, under Definition 1, all three would qualify as “confounders.” However, there is no set of pre-exposure covariates $X$ on the graph such that control for $C_3$ helps eliminate or reduce bias. To see this, note that if $X$ includes $C_1$ or $C_2$, then the effect estimate is unbiased irrespective of whether adjustment is made for $C_3$. If $X$ includes neither $C_1$ nor $C_2$, then the estimand without adjustment for $C_3$ is unbiased whereas the estimand adjusted for $C_3$ is not. Therefore, Definition 1 does not satisfy Properties 2A or 2B. This completes the proof.

Fig. 2. Definition 2 does not satisfy Property 2A or 2B.

Intuitively, Definition 1 does not satisfy Properties 2A or 2B because in the causal diagram in Figure 1, the variable $C_3$ is unconditionally associated with $A$ and $Y$ and thus would be a confounder under Definition 1, but control for it will only either not affect bias (if control is made for $C_1$ or $C_2$) or increase bias (if control is made for neither $C_1$ nor $C_2$). The causal structure in Figure 1 and the bias resulting from controlling for $C_3$ is sometimes referred to in the literature as “M-bias” or “collider-stratification” [Greenland (2003), Hernán et al. (2002), Hernán (2008)]. We note that if faithfulness is violated, Definition 1 does not satisfy Property 1 either [Pearl (2009)].

Under Definition 2, a confounder was defined as a pre-exposure covariate that blocks a backdoor path from $A$ to $Y$.

Proposition 2.
For every causal diagram, Definition 2 satisfies Property 1. Definition 2 does not satisfy Properties 2A or 2B.

Proof. If $S$ consists of the set of all confounders under Definition 2, then this set $S$ will include all pre-exposure covariates that block a backdoor path from $A$ to $Y$. From this it follows that $S$ blocks all backdoor paths from $A$ to $Y$ and, by Pearl's backdoor path theorem, the effect of $A$ on $Y$ is unconfounded given $S$. Thus, Definition 2 satisfies Property 1. We now show that it does not satisfy Properties 2A and 2B. Consider the causal diagram in Figure 2. Under Definition 2 both $C_1$ and $C_2$ block a backdoor path from $A$ to $Y$ and thus would qualify as confounders. However, for $C_2$ there is no set of pre-exposure covariates $X$ on the graph such that control for $C_2$ helps eliminate bias, since if $X = C_1$, there is no bias without controlling for $C_2$; if $X = \emptyset$, there is bias even with controlling for $C_2$. Thus, Definition 2 does not satisfy Property 2A.

We now show that it does not satisfy Property 2B. Suppose Figure 2 is a causal diagram for $(C_1, C_2, A, Y)$ where all variables are binary and suppose that $P(C_1 = 1) = 1/2$, $P(C_2 = 1 \mid c_1) = 1/5 + 3c_1/5$, $P(A = 1 \mid c_1, c_2) = 1/10 + 3c_1/5 + c_2/10$, $P(Y = 1 \mid a, c_1, c_2) = 1/2 + (1/2)(a - 1/2)c_1$. One can then verify that

$$E(Y_1) - E(Y_0) = \sum_{c_1, c_2} \{E(Y \mid A=1, c_1, c_2) - E(Y \mid A=0, c_1, c_2)\}\,\mathrm{pr}(c_1, c_2) = 0.25 = \sum_{c_1} \{E(Y \mid A=1, c_1) - E(Y \mid A=0, c_1)\}\,\mathrm{pr}(c_1),$$

that $E(Y \mid A=1) - E(Y \mid A=0) = 0.266$ and that $\sum_{c_2} \{E(Y \mid A=1, c_2) - E(Y \mid A=0, c_2)\}\,\mathrm{pr}(c_2) = 0.269$. Under Definition 2, $C_2$ would be considered a confounder since $C_2$ blocks the backdoor path $A \leftarrow C_2 \leftarrow C_1 \rightarrow Y$.
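The quantities stated in this example can be checked by exact enumeration over the eight cells of $(C_1, C_2, A)$. The sketch below takes the four conditional probabilities from the text as given; the function names are illustrative, and the true effect is computed from the structural form of $P(Y = 1 \mid a, c_1, c_2)$, which is an assumption about how the example is meant to be read.

```python
from itertools import product


def p_c2(c2, c1):
    p = 1 / 5 + 3 * c1 / 5
    return p if c2 == 1 else 1 - p


def p_a(a, c1, c2):
    p = 1 / 10 + 3 * c1 / 5 + c2 / 10
    return p if a == 1 else 1 - p


def e_y(a, c1):
    # P(Y=1 | a, c1, c2) does not depend on c2 in this example.
    return 1 / 2 + (1 / 2) * (a - 1 / 2) * c1


def joint(c1, c2, a):
    return 0.5 * p_c2(c2, c1) * p_a(a, c1, c2)  # P(C1=1) = 1/2


def matches(c1, c2, fixed):
    values = {"c1": c1, "c2": c2}
    return all(values[k] == v for k, v in fixed.items())


def cond_mean_y(a, fixed):
    """E(Y | A=a, fixed covariate values), by enumeration."""
    cells = [(c1, c2) for c1, c2 in product((0, 1), repeat=2)
             if matches(c1, c2, fixed)]
    den = sum(joint(c1, c2, a) for c1, c2 in cells)
    return sum(joint(c1, c2, a) * e_y(a, c1) for c1, c2 in cells) / den


def prob(fixed):
    return sum(joint(c1, c2, a) for c1, c2, a in product((0, 1), repeat=3)
               if matches(c1, c2, fixed))


def standardized(names):
    """sum_x {E(Y|A=1,x) - E(Y|A=0,x)} pr(x) for the named covariates."""
    total = 0.0
    for values in product((0, 1), repeat=len(names)):
        fixed = dict(zip(names, values))
        total += (cond_mean_y(1, fixed) - cond_mean_y(0, fixed)) * prob(fixed)
    return total


truth = sum(0.5 * (e_y(1, c1) - e_y(0, c1)) for c1 in (0, 1))
crude = standardized([])          # no adjustment
adj_c2 = standardized(["c2"])     # adjust for C2 only
adj_c1 = standardized(["c1"])     # adjust for C1 only
adj_both = standardized(["c1", "c2"])
```

Enumeration gives a true effect of 0.25, a crude contrast of about 0.2667 (reported in the text as 0.266) and a $C_2$-adjusted contrast of about 0.2693, so adjusting for $C_2$ alone moves the estimate slightly further from the truth; adjusting for $C_1$, with or without $C_2$, recovers 0.25.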
However, there is no set $X$ of pre-exposure covariates such that

$$\Bigl|\sum_{x, c_2} \{E(Y \mid A=1, x, c_2) - E(Y \mid A=0, x, c_2)\}\,\mathrm{pr}(x, c_2) - \{E(Y_1) - E(Y_0)\}\Bigr| < \Bigl|\sum_{x} \{E(Y \mid A=1, x) - E(Y \mid A=0, x)\}\,\mathrm{pr}(x) - \{E(Y_1) - E(Y_0)\}\Bigr|.$$

This is because if $X$ is taken as $C_1$, then the expressions on both sides of the inequality are equal to 0 (controlling for $C_2$ in addition to $C_1$ does not reduce bias); if $X$ is taken as the empty set, we have

$$\Bigl|\sum_{c_2} \{E(Y \mid A=1, c_2) - E(Y \mid A=0, c_2)\}\,\mathrm{pr}(c_2) - \{E(Y_1) - E(Y_0)\}\Bigr| = |0.269 - 0.250| = 0.019 > 0.016 = |0.266 - 0.250| = \bigl|\{E(Y \mid A=1) - E(Y \mid A=0)\} - \{E(Y_1) - E(Y_0)\}\bigr|$$

and again controlling for $C_2$ does not reduce (but rather increases) bias. Definition 2 thus does not satisfy Property 2B. This completes the proof.

Fig. 3. Definition 3 does not satisfy Property 1.

If we consider the causal diagram in Figure 2, then under Definition 2 both $C_1$ and $C_2$ block a backdoor path from $A$ to $Y$ and thus would qualify as confounders. However, for $C_2$ there is no set of pre-exposure covariates $X$ on the graph such that control for $C_2$ helps eliminate bias (Property 2A) since if $X = C_1$, there is no bias without controlling for $C_2$; if $X = \emptyset$, there is bias even with controlling for $C_2$. Likewise, examples can be constructed as in the proof above in which control for $C_2$ will only increase bias, that is, control for $C_2$ does not help reduce bias (Property 2B).

Under Definition 3, a confounder was defined as a member of every minimally sufficient adjustment set.

Proposition 3. Definition 3 does not satisfy Property 1. Definition 3 satisfies Property 2A.

Proof. Consider the causal diagram in Figure 3.
Here, either $C_1$ or $C_2$ would constitute a minimally sufficient adjustment set and thus neither is a member of every minimally sufficient adjustment set; under Definition 3, neither would be a confounder. If we control for nothing, there is still confounding for the effect of $A$ on $Y$ and, thus, for Figure 3, controlling for all confounders under Definition 3 would not suffice to control for confounding. Thus, Definition 3 does not satisfy Property 1. If $C$ is a member of every minimally sufficient adjustment set, then it is a member of a minimally sufficient adjustment set and from this it trivially follows that it satisfies the requirements in Property 2A. This completes the proof. □

A variable $C$ that is a confounder under Definition 3 will in general satisfy Property 2B as well but may not always, because there are cases in which there is confounding in the distribution of counterfactual outcomes conditional on $C$, so that $C$ is a confounder under Definition 3, but with the average causal effect on the additive scale not confounded [Greenland, Robins and Pearl (1999)]. Intuitively, to see that Definition 3 does not satisfy Property 1, consider the causal diagram in Figure 3. Here, either $C_1$ or $C_2$ would constitute a minimally sufficient adjustment set and thus neither is a member of every minimally sufficient adjustment set. Under Definition 3, there would thus be no confounders for the effect of $A$ on $Y$; clearly, however, if we control for nothing, there is still confounding for the effect of $A$ on $Y$.

Under Definition 4, a confounder was defined as a member of some minimally sufficient adjustment set.

Proposition 4. For every causal diagram, Definition 4 satisfies Property 1. Definition 4 satisfies Property 2A.

Proof. We will show that Definition 4 satisfies Property 1.
We first claim that any minimally sufficient adjustment set for $(A, Y)$ must lie in $G_{\mathrm{An}(A) \cup \mathrm{An}(Y)}$, the subgraph of $G$ that has only the nodes in $\mathrm{An}(A)$ or $\mathrm{An}(Y)$; see the Appendix. Assume this is not true, and pick some minimally sufficient set $S$ with elements outside $\mathrm{An}(A) \cup \mathrm{An}(Y)$. This means $S \cap (\mathrm{An}(A) \cup \mathrm{An}(Y))$ is not sufficient. Note that any ancestor of a node in the set $\mathrm{An}(A) \cup \mathrm{An}(Y)$ will also be in $\mathrm{An}(A) \cup \mathrm{An}(Y)$. From this it follows that any backdoor path from $A$ to $Y$ which has a node outside $\mathrm{An}(A) \cup \mathrm{An}(Y)$ will require a collider to get back into $\mathrm{An}(A) \cup \mathrm{An}(Y)$. However, those colliders must be opened by elements in $S$. We have a contradiction. We have shown that any minimally sufficient adjustment set must be a subset of $\mathrm{An}(A) \cup \mathrm{An}(Y)$ and, thus, any variable that is a confounder under Definition 4 must be in $\mathrm{An}(A) \cup \mathrm{An}(Y)$.

Next we note that $\mathrm{Pa}(A)$ is a sufficient adjustment set for $(A, Y)$. Pick a minimal subset $\mathrm{Pa}^+$ of $\mathrm{Pa}(A)$ that is sufficient. Our claim is that every element $P$ in $\mathrm{Pa}(A) \setminus \mathrm{Pa}^+$ is such that $P$ is not connected to $Y$ in the graph $(G_{\mathrm{An}(A) \cup \mathrm{An}(Y)})_{\overline{a}}$ except by paths that are blocked conditional on $\mathrm{Pa}^+$. Assume this is not true, and fix a path $\omega$ from $P$ to $Y$ that is not blocked by $\mathrm{Pa}^+$ in $(G_{\mathrm{An}(A) \cup \mathrm{An}(Y)})_{\overline{a}}$. If this path has no colliders, then appending $\omega$ with the edge $P \rightarrow A$ produces a backdoor path from $A$ to $Y$ not blocked by $\mathrm{Pa}^+$, contradicting the earlier claim that $\mathrm{Pa}^+$ is a valid adjustment set. If $\omega$ only contains colliders ancestral of $\mathrm{Pa}^+$, then either $\omega$ has a noncollider triple blocked by $\mathrm{Pa}^+$ (in which case we are done with that path) or $\omega$ appended with $P \rightarrow A$ produces a backdoor path open conditional on $\mathrm{Pa}^+$, which is a contradiction.
If $\omega$ contains collider triples ancestral of $\mathrm{Pa}(A) \setminus \mathrm{Pa}^+$ (but not ancestral of $\mathrm{Pa}^+$), let $W$ be the central node of the last such collider triple on the path from $P$ to $Y$. Let $P'$ be a member of $\mathrm{Pa}(A) \setminus \mathrm{Pa}^+$ of which $W$ is an ancestor. Consider instead of $\omega$ a new path: $A \leftarrow P' \leftarrow \cdots \leftarrow W$ appended with the subpath of $\omega$ that begins with the node on $\omega$ after $W$ and ends with $Y$. This path either has a noncollider triple blocked by $\mathrm{Pa}^+$ (in which case so does $\omega$ and we are done with $\omega$) or it is open conditional on $\mathrm{Pa}^+$, in which case we have a contradiction, or it contains collider triples ancestral of $Y$ not through $\mathrm{Pa}(A)$. In the last case, let $Z$ be the central node of the first such collider triple on the currently considered path from $A$ to $Y$. Consider instead a new path which appends a subpath of the currently considered path extending from $A$ to $Z$, and the segment $Z \rightarrow \cdots \rightarrow Y$. This path has no blocked colliders by construction, and thus must either have a noncollider triple blocked by $\mathrm{Pa}^+$ (in which case so does $\omega$ and we are done with $\omega$) or it is open conditional on $\mathrm{Pa}^+$, in which case we have a contradiction.

Our final claim is that any superset $S$ of $\mathrm{Pa}^+$ in $\mathrm{Nd}(A) \cap (\mathrm{An}(A) \cup \mathrm{An}(Y))$ is a valid adjustment set for $(A, Y)$. Assume this were not so and fix an open backdoor path $\rho$ from $A$ to $Y$ given $S$. The first node on $\rho$ after $A$ must lie either in $\mathrm{Pa}^+$ or in $\mathrm{Pa}(A) \setminus \mathrm{Pa}^+$. In the first case, the path is blocked. In the second case, we have shown above that every path from $\mathrm{Pa}(A) \setminus \mathrm{Pa}^+$ to $Y$ in $(G_{\mathrm{An}(A) \cup \mathrm{An}(Y)})_{\overline{a}}$ is blocked by $\mathrm{Pa}^+$ and, thus, the path must be blocked in the second case as well. There thus cannot be an open backdoor path from $A$ to $Y$ given $S$ and we have a contradiction.
We have that $\mathrm{Pa}^+$ is a sufficient adjustment set; any variable that is a confounder under Definition 4 will be a member of $\mathrm{Nd}(A) \cap (\mathrm{An}(A) \cup \mathrm{An}(Y))$ and, thus, we have that the set of variables that are confounders under Definition 4 will be a sufficient adjustment set. Definition 4 thus satisfies Property 1. Definition 4 satisfies Property 2A trivially. This completes the proof. □

A variable that is a confounder under Definition 4 will in general satisfy Property 2B as well but may not always because, as before, there may be confounding in distribution without the average causal effect on the additive scale being confounded. Definition 4 thus satisfies Property 2A, generally Property 2B, and, as shown in the proof above, also satisfies Property 1 for all causal diagrams. That Definition 4 satisfies Property 1 can be restated as the proposition that the union of all minimally sufficient adjustment sets is itself a sufficient adjustment set. Definition 4 thus satisfies the properties which arguably ought to be required for a reasonable definition of a "confounder."

Fig. 4. Definition 5 does not satisfy Property 2A.

Under Definition 5, a confounder was essentially defined as a pre-exposure covariate, the control for which helped reduce bias.

Proposition 5. Definition 5 does not satisfy Property 1. Definition 5 satisfies Property 2B but not 2A.

Proof. Suppose that $Y_a \perp\!\!\!\perp A \mid C$, that $(C, A, Y)$ are all binary and that $P(C = 1) = 1/2$, $P(A = 1 \mid c) = 1/4 + c/2$, $P(Y = 1 \mid a, c) = 4/10 - 4c/10 - 3a/10 + 8ac/10$. One can then verify that $E(Y_1) = \sum_{c} E(Y \mid A = 1, c)\,\mathrm{pr}(c) = 3/10$, $E(Y \mid A = 1) = 4/10$, $E(Y_0) = \sum_{c} E(Y \mid A = 0, c)\,\mathrm{pr}(c) = 2/10$, $E(Y \mid A = 0) = 3/10$.
Thus, $|\sum_{c}\{E(Y \mid A = 1, c) - E(Y \mid A = 0, c)\}\,\mathrm{pr}(c) - \{E(Y_1) - E(Y_0)\}| = 0 = |\{E(Y \mid A = 1) - E(Y \mid A = 0)\} - \{E(Y_1) - E(Y_0)\}|$ and so under Definition 5, $C$ would not be a confounder. The set of variables defined as confounders under Definition 5 would thus be empty. However, it is not the case that adjustment for the empty set suffices to control for confounding since, for example, $E(Y_1) = 3/10 \neq 4/10 = E(Y \mid A = 1)$. Thus, Definition 5 does not satisfy Property 1. We now show that Definition 5 does not satisfy Property 2A. Consider the causal diagram in Figure 4. Although control for $C_2$ might reduce bias compared to an unadjusted estimate and thus satisfy Definition 5 with $X = \emptyset$, there is no $X$ such that the effect of $A$ on $Y$ is unconfounded conditional on $(X, C_2)$ but not on $X$ alone. Thus, Definition 5 does not satisfy Property 2A. Definition 5 satisfies Property 2B trivially. This completes the proof. □

Definition 5 does not satisfy Property 1 because an unadjusted estimate of the causal risk difference may be correct, even in the presence of confounding, because the bias due to confounding for $E(Y_1)$ may cancel that for $E(Y_0)$; said another way, there may be confounding in the distribution of counterfactual outcomes without there being confounding in a particular measure. That Definition 5 satisfies Property 2B is essentially embedded in Definition 5 itself. Intuitively, to see that Definition 5 does not satisfy Property 2A, consider the causal diagram in Figure 4. Although control for $C_2$ might reduce bias compared to an unadjusted estimate and thus satisfy Definition 5 with $X = \emptyset$, there would be no $X$ such that the effect of $A$ on $Y$ is unconfounded conditional on $(X, C_2)$ but not on $X$ alone.
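The cancellation in the proof of Proposition 5 can be reproduced with a few lines of exact arithmetic. This is a minimal sketch under the distribution stated in that proof; the helper names `counterfactual_mean` and `observed_mean` are illustrative, and the first relies on the stated assumption $Y_a \perp\!\!\!\perp A \mid C$ to equate $E(Y_a)$ with the standardized mean.

```python
from fractions import Fraction as F

def p_c(c):
    return F(1, 2)                                   # P(C = c) for binary C

def p_a(a, c):
    p = F(1, 4) + F(1, 2) * c                        # P(A = 1 | c)
    return p if a == 1 else 1 - p

def e_y(a, c):
    # E(Y | A = a, C = c) as given in the proof
    return F(4, 10) - F(4, 10) * c - F(3, 10) * a + F(8, 10) * a * c

def counterfactual_mean(a):
    # E(Y_a) = sum_c E(Y | A = a, c) pr(c), using Y_a independent of A given C
    return sum(e_y(a, c) * p_c(c) for c in (0, 1))

def observed_mean(a):
    # E(Y | A = a) = sum_c E(Y | A = a, c) pr(c | A = a)
    pr_a = sum(p_a(a, c) * p_c(c) for c in (0, 1))
    return sum(e_y(a, c) * p_a(a, c) * p_c(c) for c in (0, 1)) / pr_a

causal_rd = counterfactual_mean(1) - counterfactual_mean(0)
crude_rd = observed_mean(1) - observed_mean(0)
print(causal_rd, crude_rd)
```

Each arm is individually biased ($E(Y \mid A = 1)$ exceeds $E(Y_1)$ by 1/10, and $E(Y \mid A = 0)$ exceeds $E(Y_0)$ by 1/10), yet the two biases cancel in the risk difference, which is why a definition based on change in a particular effect measure can miss confounding in distribution.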
Under Definition 6, a confounder was defined as a pre-exposure covariate, the control for which in some context changed the effect estimate.

Proposition 6. Definition 6 does not satisfy Property 1. Definition 6 does not satisfy Properties 2A or 2B.

Proof. In the first example in the proof of Proposition 5, the set of confounders under Definition 6 would be empty because with $X$ empty we have $\sum_{x,c}\{E(Y \mid A = 1, x, c) - E(Y \mid A = 0, x, c)\}\,\mathrm{pr}(x, c) = 1/10 = \sum_{x}\{E(Y \mid A = 1, x) - E(Y \mid A = 0, x)\}\,\mathrm{pr}(x)$. However, the effect of $A$ on $Y$ is not unconfounded conditional on the empty set. Thus, Definition 6 does not satisfy Property 1. We now show Definition 6 does not satisfy Properties 2A or 2B. Consider the causal diagram in Figure 1. If we let $X$ denote the empty set, then $C_3$ will satisfy Definition 6 and so would be a confounder under Definition 6. However, if we consider Properties 2A and 2B, there is no set of pre-exposure covariates $X$ on the graph such that control for $C_3$ helps eliminate or reduce bias. To see this, note that if $X$ includes $C_1$ or $C_2$, then the effect estimate is unbiased irrespective of whether adjustment is made for $C_3$. If $X$ includes neither $C_1$ nor $C_2$, then the estimand without adjustment for $C_3$ is unbiased whereas the estimand adjusted for $C_3$ is not. Therefore, Definition 6 does not satisfy Properties 2A and 2B. This completes the proof. □

As with Definition 5, Definition 6 does not satisfy Property 1 because of the possibility of cancellations: there may be confounding in the distribution of counterfactual outcomes without there being confounding in a particular measure. Definition 6 also fails to satisfy Properties 2A or 2B. It fails because of the possibility of "M-bias" or "collider-stratification" structures as in Figure 1 [Greenland (2003), Hernán et al. (2002)].
Controlling for a variable such as $C_3$ may change the estimate, but it may be that it is the estimate without control for that variable (e.g., $C_3$ in Figure 1) that is unbiased. Also, as noted above, the collapsibility-based definitions fail for odds ratio and hazard ratio measures for other reasons, namely, because marginal and conditional measures are not comparable even in the absence of confounding. See Greenland, Robins and Pearl (1999), Geng et al. (2001) and Geng and Li (2002) for further discussion of the relationship between, and general nonequivalence of, confounding and collapsibility.

Candidate definitions for a confounder might thus include Definition 4 and, if the issue of scale dependence is set aside, Definition 5. Note, however, that a variable that satisfies Definition 5 but not Definition 4 will never help to eliminate confounding bias, only to reduce such bias. Such a variable reduces bias essentially by serving as a proxy for a variable that does satisfy Definition 4. We therefore propose that a confounder be defined as in Definition 4, "a pre-exposure covariate that is a member of some minimally sufficient adjustment set," and that any variable that satisfies Definition 5 but not Definition 4 be referred to as a "surrogate confounder." The terminology of a "surrogate confounder" or "proxy confounder" appears elsewhere [Greenland and Morgenstern (2001), Hernán (2008)]; here we have provided a formal criterion for such a "surrogate confounder." See Greenland and Pearl (2011) and Ogburn and VanderWeele (2012) for properties of such surrogate confounders.

Interestingly, Definition 4 is closely related to definitions concerning confounders proposed by Robins and Morgenstern (1987), though their definitions were not universally adopted by the epidemiologic community over the ensuing 25 years. Robins and Morgenstern (1987) were not principally concerned with how the word "confounder" is employed in practice when used in an unqualified sense, but rather with whether a particular variable would still, in some sense, be a confounder if data were also available on other variables. As noted above, Robins and Morgenstern [(1987), Section 2H] say that $C$ is a confounder conditional on $F$ if causal effects are computable given data on $C$ and $F$, but not on $F$ alone. In the framework of Robins and Morgenstern, if one were to take as the (unconditional) definition of a confounder that "there exists some set $F$ such that $C$ is a confounder conditional on $F$ [in the sense of Robins and Morgenstern (1987), Section 2H]," then this would coincide with Definition 4. Note that Robins and Morgenstern, in their definitions, in some sense go further than Definition 4 in having the investigator explicitly specify the other variables $F$ for which control might be made. This would indeed be useful in practice, though current use of language has not generally adopted this convention. It might in the future be helpful to distinguish between the unqualified use of the word "confounder" as defined in Definition 4, and "confounder in the context of having data also on $F$" as in Robins and Morgenstern (1987). The former is arguably how the word "confounder" is often used in practice; the latter would be useful in making decisions about data collection and confounder control.

6. Some extensions, implications and further results. In the discussion above we have considered whether a covariate is a "confounder" in an unconditional sense.
However, we might also speak about whether a variable $C$ is a confounder for the effect of $A$ on $Y$ conditional on some set of covariates $L$ which an investigator is going to condition on irrespective of whether control is made for $C$. Definition 4 above, the definition for an "unconditional confounder," could be restated as follows: a pre-exposure covariate $C$ is a confounder for the effect of $A$ on $Y$ if there exists a set of pre-exposure covariates $X$ such that $Y_a \perp\!\!\!\perp A \mid (X, C)$ but there is no proper subset $T$ of $(X, C)$ such that $Y_a \perp\!\!\!\perp A \mid T$. The conditional analogue would then be as follows: we say that a pre-exposure covariate $C$ is a confounder for the effect of $A$ on $Y$ conditional on $L$ if there exists a set of pre-exposure covariates $X$ such that $Y_a \perp\!\!\!\perp A \mid (X, L, C)$ but there is no proper subset $T$ of $(X, C)$ such that $Y_a \perp\!\!\!\perp A \mid (T, L)$. Consider again the causal diagram in Figure 3. Here, $C_2$ would be a confounder under Definition 4. However, $C_2$ is not a confounder for the effect of $A$ on $Y$ conditional on $L = C_1$. Consider once more the causal diagram in Figure 1. Here, neither $C_1$ nor $C_2$ would be a confounder under Definition 4. However, conditional on $L = C_3$, both $C_1$ and $C_2$ would be confounders.

An analogue of Definition 4 could also be given for a particular causal parameter of interest rather than for the condition of nonconfounding in distribution $Y_a \perp\!\!\!\perp A \mid S$. For example, $C$ could be defined to be a confounder for a particular causal parameter (e.g., the causal risk difference or causal risk ratio) if there exists a set of pre-exposure covariates $X$ such that the parameter is identified by adjusting for $(X, C)$ and if for no proper subset $T$ of $(X, C)$ is the parameter identified by adjusting for $T$ [cf. Robins and Morgenstern (1987)].
However, when we restrict attention to particular parameters we reintroduce some of the complications with cancellations that were noted above. For example, due to cancellations, a variable $C$ may be a confounder for the causal risk difference but not for the causal risk ratio [cf. VanderWeele (2012)].

We have restricted our attention in this paper thus far to pre-exposure covariates as potential confounders. We have done so in order to correspond as closely as possible to the discussion in the epidemiologic and potential outcomes literatures. However, within the context of causal diagrams, a somewhat broader range of variables could be considered as "confounders" in that all of the discussion above is applicable if we consider all nondescendants of $A$ as potential confounders rather than simply considering pre-exposure covariates.

Throughout the paper we have given all definitions with respect to a particular underlying causal diagram. However, for a given exposure $A$ and a given outcome $Y$, there will be multiple causal diagrams that correctly represent the causal structure relating these variables to one another and to covariates. One diagram may be an elaboration of another and contain variables that the other does not. It is straightforward to verify that if a variable $C$ is classified as a confounder under Definitions 1, 2, 4, 5 or 6, then $C$ will also be a confounder under each of those definitions respectively on any expanded causal diagram with additional variables. In the case of Definition 1, this is because associations that hold conditional on covariates $X$ for one diagram will clearly also hold for the other. In the case of Definition 2, if $C$ blocks a backdoor path on one causal diagram, it will block a backdoor path on any larger diagram that also correctly describes the causal structure.
In the case of Definition 4, if there is some minimally sufficient adjustment set $S$ of which $C$ is a member, then that set will also be minimally sufficient on any larger diagram that also correctly describes the causal structure. In the case of Definitions 5 and 6, if the inequalities in these definitions hold for some covariate set $X$ for one diagram, they will clearly also hold for the other. Only Definition 3 does not share this property. To see this, consider Figure 3; if in Figure 3, we collapsed over $C_2$ so that the causal diagram involved only $C_1$, $A$ and $Y$, then $C_1$ would be a member of every minimally sufficient adjustment set for this diagram and thus a confounder under Definition 3. However, as we saw above, $C_1$ is not a confounder under Definition 3 for Figure 3 itself which includes the extra variable $C_2$. This failure is a serious problem with Definition 3, but, as we also saw above, Definition 3 suffers from other limitations as well.

Several fairly trivial implications follow from Definition 4 and may be worth noting for the sake of completeness. First, if a causal diagram had a variable $C$ with an arrow to $\log(C)$ (or vice versa) and if $C$ were a member of a minimally sufficient adjustment set, then, under Definition 4, both $C$ and $\log(C)$ would be considered "confounders," though $\log(C)$ would not be a confounder conditional on $C$, and likewise $C$ would not be a confounder conditional on $\log(C)$. We believe that this is in accord with epidemiologic usage, though it would be peculiar to consider both $C$ and $\log(C)$ simultaneously, just as it would be peculiar to include both $C$ and $\log(C)$ on a causal diagram.
Second, if a variable $C$ is measured with error, taking value $C^*$, and if the measurement error term $\varepsilon = C^* - C$ were also represented on the causal diagram, then, if $C$ were a confounder under Definition 4, $C^*$ and $\varepsilon$ would also both be confounders under Definition 4. We believe this is also in accord with standard epidemiologic usage of "confounder," though we would in practice rarely refer to $\varepsilon$ as a "confounder" since we rarely have access to $\varepsilon$. Once again, however, neither $C^*$ nor $\varepsilon$ would be confounders conditional on $C$. Finally, suppose $C_1$ were height in meters and $C_2$ were weight in kilograms and that $C_1$ and $C_2$ together sufficed to control for confounding but neither alone did; let $C_3 = C_2 / C_1^2$ be body mass index (BMI) and suppose that controlling for $C_3$ alone sufficed to control for confounding. Then under Definition 4, $C_1$, $C_2$ and $C_3$ would each be confounders, though $C_3$ would not be a confounder conditional on $(C_1, C_2)$ and likewise neither $C_1$ nor $C_2$ would be a confounder conditional on $C_3$. Once again, we believe this is in accord with traditional epidemiologic usage of "confounder."

Several implications hold between the different definitions of a confounder as stated in the following result.

Proposition 7. On a causal diagram, if a variable is a confounder under Definition 3, then it is a confounder under Definitions 4, 2 and 1; if under Definition 4, then under Definitions 2 and 1; if under Definition 5, then under Definitions 6 and 1; if under Definition 6, then under Definition 1. No other implications hold without further assumptions.

Proof. On a causal diagram, if a variable is a member of every minimally sufficient adjustment set, it must be a member of a minimally sufficient adjustment set (the existence of a minimally sufficient adjustment set is guaranteed by the variables lying on a causal diagram).
Thus, if a variable is a confounder under Definition 3, then it is a confounder under Definition 4. Suppose a variable $C$ satisfies Definition 4, that is, is a member of some minimally sufficient adjustment set $(X, C)$, but that it does not satisfy Definition 2, that is, it is not on a backdoor path from $A$ to $Y$. By Theorem 5 of Shpitser, VanderWeele and Robins (2010), $(X, C)$ blocks all backdoor paths from $A$ to $Y$. If $C$ does not lie on a backdoor path from $A$ to $Y$, then $X$ alone would block all backdoor paths from $A$ to $Y$, which would contradict that $(X, C)$ is a minimally sufficient adjustment set. Thus, if $C$ is a confounder under Definition 4, it is a confounder under Definition 2. That $C$ being a confounder under Definition 4 implies $C$ is a confounder under Definition 1 follows from the contrapositive of Corollary 4.1 of Robins (1997). If $C$ is a confounder under Definition 5, it must be a confounder under Definition 6 because the only way $C$ can be a confounder under Definition 5 is if $\sum_{x,c}\{E(Y \mid A = 1, x, c) - E(Y \mid A = 0, x, c)\}\,\mathrm{pr}(x, c)$ and $\sum_{x}\{E(Y \mid A = 1, x) - E(Y \mid A = 0, x)\}\,\mathrm{pr}(x)$ are not equal. If $C$ is not a confounder under Definition 1, then for every $X$, $C$ is independent of $Y$ conditional on $(A, X)$ or of $A$ conditional on $X$ and from this it easily follows that $\sum_{x,c}\{E(Y \mid A = 1, x, c) - E(Y \mid A = 0, x, c)\}\,\mathrm{pr}(x, c) = \sum_{x}\{E(Y \mid A = 1, x) - E(Y \mid A = 0, x)\}\,\mathrm{pr}(x)$ and thus that $C$ is not a confounder under Definition 6. Thus, if $C$ is a confounder under Definition 6, it must be a confounder under Definition 1.

We now argue that without further assumptions no other implications between the definitions hold. The variable $C_2$ in Figure 4 could satisfy Definition 1 but does not satisfy Definition 2, so Definition 1 does not imply Definition 2.
The variable $C_3$ in Figure 1 could satisfy Definition 1, but does not satisfy Definitions 3, 4 or 5; thus, Definition 1 does not imply Definitions 3, 4 or 5. If $C$ is a confounder under Definition 1, in general it will be under Definition 6 as well, but it may not because of cancellations due to scale-dependence. If $C$ satisfies the conditions for Definition 2 (i.e., lies on a backdoor path from $A$ to $Y$), it will generally do so for Definitions 1 and 6 but may fail to do so because of failure of faithfulness or cancellations due to scale-dependence. In the example given concerning Property 2B in Proposition 2, the variable $C_2$ in Figure 2 satisfied Definition 2 but does not satisfy Definitions 3, 4 or 5; thus, Definition 2 does not imply Definitions 3, 4 or 5. It was shown above that if $C$ satisfies the conditions for Definition 3, it will satisfy the conditions for Definitions 4, 2 and 1. If $C$ satisfies the conditions for Definition 3, it will generally satisfy the conditions for Definitions 5 and 6, but it may not do so due to scale-dependence. It was shown above that if $C$ satisfies the conditions for Definition 4, it will satisfy the conditions for Definitions 2 and 1. In Figure 3, $C_2$ satisfies the conditions for Definition 4 but not Definition 3; therefore, Definition 4 does not imply Definition 3. If $C$ satisfies the conditions for Definition 4, it will generally satisfy the conditions for Definitions 5 and 6, but it may not do so due to scale-dependence. It was shown above that if $C$ satisfies the conditions for Definition 5, it will satisfy the conditions for Definitions 6 and 1. In the example given concerning Property 2A in Proposition 5, the variable $C_2$ in Figure 4 satisfied Definition 5 but does not satisfy Definitions 2, 3 or 4; thus, Definition 5 does not imply Definitions 2, 3 or 4. It was shown above that if $C$ satisfies the conditions for Definition 6, it will satisfy the conditions for Definition 1. The variable $C_2$ in Figure 4 could satisfy Definition 6 but does not satisfy Definition 2, so Definition 6 does not imply Definition 2. The variable $C_3$ in Figure 1 could satisfy Definition 6, but does not satisfy Definitions 3, 4 or 5; thus, Definition 6 does not imply Definitions 3, 4 or 5. □

The implications between the definitions are plotted in Figure 5. Those implications that will generally hold but may not hold because of cancellations due to scale-dependence are indicated with dashed arrows.

Fig. 5. Logical relationships that hold among definitions. Dashed arrows indicate implications that will generally hold but may fail due to scale dependence of definitions.

The properties themselves that we have been considering also bear certain relations to one another insofar as it is not difficult to show that if Property 2A is itself taken as the definition of a confounder, then, on causal diagrams, this definition of a confounder also satisfies Property 1. This is because if $S$ denotes the set of all nodes $C$ which obey Property 2A and if $S$ is not a sufficient adjustment set (so there is an open backdoor path $\pi$ from $A$ to $Y$), then if we let $W$ be all nondescendants of $A$ other than $A$ and noncollider nodes on $\pi$, and if we choose a node $K$ on $\pi$ that does not contain descendants of $A$, then it is the case that $K$ satisfies Property 2A and is not a part of $S$, which would be a contradiction.

Although it is the case that if Property 2A is itself taken as the definition of a confounder then this definition also satisfies Property 1 on causal diagrams, this does not hold generally within a counterfactual framework.
Note also that, even on causal diagrams, it is not the case that Property 2A implies Property 1; a counterexample to this was given in Proposition 3 for Definition 3, which satisfies Property 2A but not Property 1. Rather, if Property 2A is itself taken as the definition of a confounder, then, on causal diagrams, this definition would satisfy Property 1 as well. This raises the question as to whether Property 2A itself could be taken as the definition of a confounder, as such a definition would satisfy Property 2A (by definition) and Property 1 on causal diagrams. Although such a definition would satisfy Properties 1 and 2A on causal diagrams, it would also follow from this definition that $C_1$ is a confounder for the effect of $A$ on $Y$ in Figure 1, even though the effect of $A$ on $Y$ is unconfounded without controlling for any covariates. This is because if Property 2A is taken as the definition of a confounder, then $C_1$ satisfies Property 2A with $X$ taken as $C_3$. In general, however, if the effect of $A$ on $Y$ is unconfounded without controlling for any covariates, we would probably simply say that there are no confounders for the unconditional effect of $A$ on $Y$.

7. Concluding remarks. The causal inference literature has provided a formal definition of confounding with reference to distributions of counterfactual outcomes. The literature now rightly emphasizes the concept of confounding control over that of a "confounder." Nonetheless, the word "confounder" is often still used among applied researchers and in this paper we have shown that at least one formal counterfactual-based definition coheres with the way in which the word is generally used. We have considered a number of candidate proposals often arising from more informal statements made in the literature.
We have considered whether each of these definitions satisfies two properties, namely, (i) that on any causal diagram, control for all confounders so defined will control for confounding and (ii) any variable qualifying as a confounder under this criterion will in some context remove confounding. Only one of the definitions considered here satisfied both of these two properties. We thus proposed that a pre-exposure covariate $C$ be considered a confounder for the effect of $A$ on $Y$ if there exists a set of covariates $X$ such that the effect of the exposure on the outcome is unconfounded conditional on $(X, C)$ but for no proper subset of $(X, C)$ is the effect of the exposure on the outcome unconfounded given the subset. Equivalently, a confounder is a "member of a minimally sufficient adjustment set." This is closely related to the definitions concerning confounders given in Robins and Morgenstern (1987), though Robins and Morgenstern suggest specifying the other variables for which control might be made as well. We have further provided a conditional analogue of the proposed definition of a confounder; and we have proposed that a variable that helps reduce bias but not eliminate bias be referred to as a "surrogate confounder." The definition of a "confounder" above is given rigorously in terms of counterfactuals and, we believe, is also in accord with the intuitive properties of a "confounder" implicitly presupposed by practicing statisticians and epidemiologists. From a more theoretical perspective, Definition 4, unlike the other definitions, gives rise to elegant and useful results, which itself lends further support for its being taken as the definition of a confounder.

APPENDIX

Review of causal diagrams. A directed graph consists of a set of nodes and directed edges among nodes.
A path is a sequence of distinct nodes connected by edges regardless of arrowhead direction; a directed path is a path which follows the edges in the direction indicated by the graph’s arrows. A directed graph is acyclic if there is no node with a sequence of directed edges back to itself. The nodes with directed edges into a node A are said to be the parents of A; the nodes into which there are directed edges from A are said to be the children of A. We say that node A is an ancestor of node B if there is a directed path from A to B; if A is an ancestor of B, then B is said to be a descendant of A. If X denotes a set of nodes, then An(X) will denote the ancestors of X and Nd(X) will denote the set of nondescendants of X. For a given graph G, and a set of nodes S, the graph $G_S$ denotes a subgraph of G containing only vertices of G in S and only edges of G between vertices in S. On the other hand, the graph $G_{\overline{S}}$ denotes the graph obtained from G by removing all edges with arrowheads pointing to S. A node is said to be a collider for a particular path if it is such that both the preceding and subsequent nodes on the path have directed edges going into that node. A path between two nodes, A and B, is said to be blocked given some set of nodes C if either there is a variable in C on the path that is not a collider for the path or if there is a collider on the path such that neither the collider itself nor any of its descendants are in C. For disjoint sets of nodes A, B and C, we say that A and B are d-separated given C if every path from any node in A to any node in B is blocked given C. Directed acyclic graphs are sometimes used as statistical models to encode independence relationships among variables represented by the nodes on the graph [Lauritzen (1996)].
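The blocking and d-separation definitions above can be illustrated with a small sketch. This is not code from the paper: the function names (`descendants`, `simple_paths`, `is_blocked`, `d_separated`) and the toy graph are hypothetical, and the approach simply enumerates every path and applies the blocking rule to it, which is practical only for small diagrams.

```python
from itertools import product


def descendants(children, node):
    """Return every node reachable from `node` along a directed path."""
    seen, stack = set(), [node]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def simple_paths(neighbors, start, end, path=None):
    """Yield all paths of distinct nodes from start to end, ignoring arrow direction."""
    path = [start] if path is None else path
    if start == end:
        yield list(path)
        return
    for nxt in neighbors.get(start, ()):
        if nxt not in path:
            path.append(nxt)
            yield from simple_paths(neighbors, nxt, end, path)
            path.pop()


def is_blocked(path, edges, children, given):
    """Apply the blocking rule to each interior node of one path."""
    for prev, node, nxt in zip(path, path[1:], path[2:]):
        is_collider = (prev, node) in edges and (nxt, node) in edges
        if is_collider:
            # A collider blocks unless it or one of its descendants is in `given`.
            if node not in given and not (descendants(children, node) & given):
                return True
        elif node in given:
            # A non-collider blocks when it is conditioned on.
            return True
    return False


def d_separated(edges, a_nodes, b_nodes, given):
    """True if every path between a_nodes and b_nodes is blocked given `given`."""
    edges, given = set(edges), set(given)
    children, neighbors = {}, {}
    for parent, child in edges:
        children.setdefault(parent, set()).add(child)
        neighbors.setdefault(parent, set()).add(child)
        neighbors.setdefault(child, set()).add(parent)
    for a, b in product(a_nodes, b_nodes):
        for path in simple_paths(neighbors, a, b):
            if not is_blocked(path, edges, children, given):
                return False
    return True


# Hypothetical toy diagram (not a figure from the paper): C is a collider on
# the only path between A and Y, and D is a child of C.
toy_edges = [("U1", "A"), ("U1", "C"), ("U2", "C"), ("U2", "Y"), ("C", "D")]
print(d_separated(toy_edges, {"A"}, {"Y"}, set()))        # True
print(d_separated(toy_edges, {"A"}, {"Y"}, {"C"}))        # False
print(d_separated(toy_edges, {"A"}, {"Y"}, {"C", "U1"}))  # True
print(d_separated(toy_edges, {"A"}, {"Y"}, {"D"}))        # False
```

In the toy diagram, conditioning on the collider C, or on its descendant D, opens the path between A and Y, while additionally conditioning on the non-collider U1 blocks it again.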
The variables corresponding to the nodes on a graph are said to satisfy the global Markov property for the directed acyclic graph (or to have a distribution compatible with the graph) if for any disjoint sets of nodes A, B, C we have that $A \perp\!\!\!\perp B \mid C$ whenever A and B are d-separated given C. The distribution of some set of variables V on the graph is said to be faithful to the graph if for all disjoint sets A, B, C of V we have that $A \perp\!\!\!\perp B \mid C$ only when A and B are d-separated given C.

Directed acyclic graphs can be interpreted as representing causal relationships. Pearl (1995) defined a causal directed acyclic graph as a directed acyclic graph with nodes $(X_1, \ldots, X_n)$ corresponding to variables such that each variable $X_i$ is given by its nonparametric structural equation $X_i = f_i(pa_i, \varepsilon_i)$, where $pa_i$ are the parents of $X_i$ on the graph and the $\varepsilon_i$ are mutually independent. For a causal diagram, the nonparametric structural equations encode counterfactual relationships among the variables represented on the graph. The equations themselves represent one-step-ahead counterfactuals, with other counterfactuals given by recursive substitution [see Pearl (2009) for further discussion]. A causal directed acyclic graph defined by nonparametric structural equations satisfies the global Markov property as stated above [Pearl (2009)]. The requirement that the $\varepsilon_i$ be mutually independent is essentially a requirement that there is no variable absent from the graph which, if included on the graph, would be a parent of two or more variables [Pearl (1995, 2009)]. Throughout we assume the exposure A consists of a single node. A backdoor path from A to Y is a path to Y which begins with an edge into A.
A set of variables X is said to satisfy the backdoor path criterion with respect to (A, Y) if no variable in X is a descendant of A and if X blocks all backdoor paths from A to Y. Pearl (1995) showed that if X satisfies the backdoor path criterion with respect to (A, Y), then the effect of A on Y is unconfounded given X, that is, $Y_a \perp\!\!\!\perp A \mid X$.

Empirical testing for confounders and confounding. The absence of confounding conditional on a set of covariates S, that is, $Y_a \perp\!\!\!\perp A \mid S$, is not a property that can be tested empirically with data. One must rely on subject matter knowledge, which may sometimes take the form of a causal diagram. Nonetheless, a few things can be said about empirical testing concerning confounding and confounders. For the sake of completeness, we will consider each of Definitions 1–6. It is possible to verify empirically whether a variable is a confounder under Definition 1 since the definition refers to observed associations; however, it is not possible, without further knowledge, to empirically verify that a variable does not satisfy Definition 1 because a variable may satisfy Definition 1 for some X that involves an unmeasured variable U. One would have to know that data were available for all variables on a causal diagram to empirically verify that a variable was a nonconfounder under Definition 1. Because of this, even though Definition 1 satisfies Property 1 under faithfulness, this cannot be used as an empirical test for confounding since (i) we cannot empirically verify that a variable is a nonconfounder under Definition 1 and (ii) we cannot empirically verify whether faithfulness holds. Without further assumptions, we cannot empirically verify that a variable is a confounder or a nonconfounder under Definition 2 because Definition 2 makes reference to backdoor paths.
Whether a variable lies on a backdoor path cannot be tested empirically without further assumptions; one would have to know the structure of the underlying causal diagram. Likewise, for Definitions 3 and 4, one would need to know all minimally sufficient adjustment sets, which itself would require checking the “no confounding” condition $Y_a \perp\!\!\!\perp A \mid S$, which is, as noted above, not empirically testable; though see below for some qualifications. For Definition 5, we could empirically reject the inequality in Definition 5 for observed X if

\[
\sum_{x,c} \{E(Y \mid A=1, x, c) - E(Y \mid A=0, x, c)\}\,\mathrm{pr}(x, c) = \sum_{x} \{E(Y \mid A=1, x) - E(Y \mid A=0, x)\}\,\mathrm{pr}(x).
\]

However, we cannot empirically reject the inequality in Definition 5 for unobserved X and we, moreover, cannot empirically verify the inequality in Definition 5 because $E(Y_1) - E(Y_0)$ will not in general be empirically identified if there are unobserved variables. We can verify empirically whether a variable is a confounder under Definition 6 since the definition refers to only observed variables; however, it is not possible, without further knowledge, to empirically verify that a variable does not satisfy Definition 6 because a variable may satisfy Definition 6 for some X that involves an unmeasured variable U. One would have to know that data were available for all variables on a causal diagram to empirically verify that a variable was a nonconfounder under Definition 6. Because of this we cannot empirically verify that a variable is a nonconfounder under Definition 6. Determining whether a variable is a confounder requires making untestable assumptions.
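The empirical comparison mentioned above for Definition 5 can be sketched with sample analogues of the two standardized contrasts. This is a minimal illustration under stated assumptions, not a procedure from the paper: the exposure is assumed binary and coded 0/1, covariates are assumed discrete, every stratum is assumed to contain both exposed and unexposed units, and the function name `standardized_difference`, the helper `make_rows` and the toy counts are all hypothetical. In practice the equality concerns population quantities, so a formal statistical test accounting for sampling variability would be needed rather than a direct comparison of two numbers.

```python
from collections import defaultdict


def standardized_difference(rows, covariates):
    """Sample analogue of sum_s {E(Y|A=1,s) - E(Y|A=0,s)} pr(s) over strata s."""
    strata = defaultdict(lambda: {0: [], 1: []})
    for row in rows:
        key = tuple(row[name] for name in covariates)
        strata[key][row["A"]].append(row["Y"])
    total = len(rows)
    result = 0.0
    for key, arms in strata.items():
        if not arms[0] or not arms[1]:
            raise ValueError(f"stratum {key} lacks exposed or unexposed units")
        difference = sum(arms[1]) / len(arms[1]) - sum(arms[0]) / len(arms[0])
        weight = (len(arms[0]) + len(arms[1])) / total
        result += difference * weight
    return result


def make_rows(c, a, ones, zeros):
    """Build `ones` rows with Y=1 and `zeros` rows with Y=0 for fixed C and A."""
    return [{"C": c, "A": a, "Y": 1}] * ones + [{"C": c, "A": a, "Y": 0}] * zeros


# Made-up counts for illustration only: C is associated with both A and Y.
toy_rows = (
    make_rows(0, 0, 2, 4) + make_rows(0, 1, 1, 1)
    + make_rows(1, 0, 1, 1) + make_rows(1, 1, 4, 2)
)

with_c = standardized_difference(toy_rows, ["C"])  # left-hand side, X empty
without_c = standardized_difference(toy_rows, [])  # right-hand side, X empty
print(round(with_c, 4), round(without_c, 4))       # 0.1667 0.25
```

Here the two sample contrasts differ, so these toy data would give no grounds for rejecting the inequality in Definition 5 with X taken as the empty set; had they agreed, the inequality would be rejected for that observed X.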
The only real progress that can be made with empirical testing for confounders is by making other untestable assumptions that logically imply a test for assumptions we care about. For example, suppose we assume we have some set S that we are sure constitutes a sufficient adjustment set. In this case, we can sometimes remove variables as unnecessary for confounding control. In particular, Robins (1997) showed that if we knew that for covariate sets $S_1$ and $S_2$ we had that $Y_a \perp\!\!\!\perp A \mid (S_1, S_2)$, then we would also have that $Y_a \perp\!\!\!\perp A \mid S_1$ if $S_2$ can be decomposed into two disjoint subsets $T_1$ and $T_2$ such that $A \perp\!\!\!\perp T_1 \mid S_1$ and $Y \perp\!\!\!\perp T_2 \mid A, S_1, T_1$. Both of these latter conditions are empirically testable. Geng et al. (2001) provide some analogous results for the effect of exposure on the exposed. VanderWeele and Shpitser (2011) note that if for covariate set S we have that $Y_a \perp\!\!\!\perp A \mid S$, then if a backward selection procedure is applied to S such that variables are iteratively discarded that are independent of Y conditional on both exposure A and the members of S that have not yet been discarded, then the resulting set of covariates will suffice for confounding control. They also show that under an additional assumption of faithfulness, if, for covariate set S, we have that $Y_a \perp\!\!\!\perp A \mid S$, then if a forward selection procedure is applied to S such that, starting with the empty set, variables are iteratively added which are associated with Y conditional on both exposure A and the variables that have already been added, then the resulting set of covariates will suffice for confounding control. Note, however, that all of these results require knowledge that for some set S, $Y_a \perp\!\!\!\perp A \mid S$, which is not itself empirically testable without experimental interventions.

Acknowledgments.
The authors thank Sander Greenland, James Robins and Miguel Hernán for helpful comments on this paper.

REFERENCES

Barnow, B. S., Cain, G. G. and Goldberger, A. S. (1980). Issues in the analysis of selectivity bias. In Evaluation Studies (E. Stromsdorfer and G. Farkas, eds.) 5. Sage, San Francisco.
Breslow, N. E. and Day, N. E. (1980). Statistical Methods in Cancer Research, Vol. 1: The Analysis of Case–Control Studies. International Agency for Research on Cancer, Lyon, France.
Cox, D. R. (1958). Planning of Experiments. Wiley, New York. MR0095561
Dawid, A. P. (2002). Influence diagrams for causal modeling and inference. Int. Statist. Rev. 70 161–189.
Geng, Z., Guo, J. and Fung, W.-K. (2002). Criteria for confounders in epidemiological studies. J. R. Stat. Soc. Ser. B Stat. Methodol. 64 3–15. MR1881841
Geng, Z. and Li, G. (2002). Conditions for non-confounding and collapsibility without knowledge of completely constructed causal diagrams. Scand. J. Stat. 29 169–181. MR1894389
Geng, Z., Guo, J., Lau, T. S. and Fung, W.-K. (2001). Confounding, homogeneity and collapsibility for causal effects in epidemiologic studies. Statist. Sinica 11 63–75. MR1820001
Glymour, M. M. and Greenland, S. (2008). Causal diagrams. In Modern Epidemiology, 3rd ed. (K. J. Rothman, S. Greenland and T. L. Lash, eds.) 12. Lippincott Williams and Wilkins, Philadelphia, PA.
Greenland, S. (2003). Quantifying biases in causal models: Classical confounding versus collider-stratification bias. Epidemiology 14 300–306.
Greenland, S. and Morgenstern, H. (2001). Confounding in health research. Annual Rev. Public Health 22 189–212.
Greenland, S., Pearl, J. and Robins, J. M. (1999). Causal diagrams for epidemiologic research. Epidemiology 10 37–48.
Greenland, S. and Pearl, J. (2007). Causal diagrams. In Encyclopedia of Epidemiology (S. Boslaugh, ed.) 149–156. Sage, Thousand Oaks, CA.
Greenland, S. and Pearl, J. (2011). Adjustments and their consequences: collapsibility analysis using graphical models. International Statistical Review 79 401–426.
Greenland, S. and Robins, J. M. (1986). Identifiability, exchangeability, and epidemiological confounding. Int. J. Epidemiol. 15 413–419.
Greenland, S., Robins, J. M. and Pearl, J. (1999). Confounding and collapsibility in causal inference. Statist. Sci. 14 29–46.
Greenland, S. and Robins, J. M. (2009). Identifiability, exchangeability and confounding revisited. Epidemiol. Perspect. Innov. 6 4.
Hernán, M. A. (2008). Confounding. In Encyclopedia of Quantitative Risk Assessment and Analysis (B. Everitt and E. Melnick, eds.) 353–362. Wiley, Chichester, UK.
Hernán, M. A., Hernández-Díaz, S., Werler, M. M. and Mitchell, A. A. (2002). Causal knowledge as a prerequisite for confounding evaluation: An application to birth defects epidemiology. American Journal of Epidemiology 155 176–184.
Imbens, G. W. (2004). Nonparametric estimation of average treatment effects under exogeneity: A review. Rev. Econom. Statist. 86 4–29.
Kleinbaum, D. G., Kupper, L. L. and Morgenstern, H. (1982). Epidemiologic Research: Principles and Quantitative Methods. Lifetime Learning Publications [Wadsworth], Belmont, CA. MR0684361
Lauritzen, S. L. (1996). Graphical Models. Oxford Univ. Press, New York.
Miettinen, O. S. (1974). Confounding and effect modification. Am. J. Epidemiol. 100 350–353.
Miettinen, O. S. (1976). Stratification by a multivariate confounder score. Am. J. Epidemiol. 104 609–620.
Miettinen, O. S. and Cook, E. F. (1981). Confounding: Essence and detection. Am. J. Epidemiol. 114 593–603.
Morabia, A. (2011). History of the modern epidemiological concept of confounding. J. Epidemiol. Community Health 65 297–300.
Neyman, J. (1923). Sur les applications de la thar des probabilities aux experiences Agaricales: Essay des principle. Excerpts reprinted (1990) in English (D. Dabrowska and T. Speed, trans.). Statist. Sci. 5 463–472.
Ogburn, E. L. and VanderWeele, T. J. (2012). On the nondifferential misclassification of a binary confounder. Epidemiology 23 433–439.
Pearl, J. (1995). Causal diagrams for empirical research. Biometrika 82 669–710. MR1380809
Pearl, J. (2009). Causality: Models, Reasoning, and Inference, 2nd ed. Cambridge Univ. Press, Cambridge. MR2548166
Robins, J. (1992). Estimation of the time-dependent accelerated failure time model in the presence of confounding factors. Biometrika 79 321–334. MR1185134
Robins, J. M. (1997). Causal inference from complex longitudinal data. In Latent Variable Modeling and Applications to Causality (Los Angeles, CA, 1994) (M. Berkane, ed.). Lecture Notes in Statistics 120 69–117. Springer, New York. MR1601279
Robins, J. M. and Greenland, S. (1986). The role of model selection in causal inference from nonexperimental data. Am. J. Epidemiol. 123 392–402.
Robins, J. M. and Morgenstern, H. (1987). The foundations of confounding in epidemiology. Comput. Math. Appl. 14 869–916. MR0922790
Robins, J. M. and Richardson, T. S. (2010). Alternative graphical causal models and the identification of direct effects. In Causality and Psychopathology: Finding the Determinants of Disorders and Their Cures (P. E. Shrout, K. M. Keyes and K. Ornstein, eds.) 103–158. Oxford Univ. Press, New York.
Rosenbaum, P. R. and Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika 70 41–55. MR0742974
Rubin, D. B. (1978). Bayesian inference for causal effects: The role of randomization. Ann. Statist. 6 34–58. MR0472152
Rubin, D. B. (1990). Formal modes of statistical inference for causal effects. J. Statist. Plann. Inference 25 279–292.
Shpitser, I., VanderWeele, T. J. and Robins, J. M. (2010). On the validity of covariate adjustment for estimating causal effects. In Proceedings of the 26th Conference on Uncertainty and Artificial Intelligence 527–536. AUAI Press, Corvallis, OR.
Spirtes, P., Glymour, C. and Scheines, R. (1993). Causation, Prediction, and Search. Lecture Notes in Statistics 81. Springer, New York. MR1227558
VanderWeele, T. J. (2012). Confounding and effect modification: Distribution and measure. Epidemiologic Methods 1 55–82.
VanderWeele, T. J. and Shpitser, I. (2011). A new criterion for confounder selection. Biometrics 67 1406–1413. MR2872391

Departments of Epidemiology and Biostatistics
Harvard School of Public Health
677 Huntington Avenue
Boston, Massachusetts 02115
USA
E-mail: tvanderw@hsph.harvard.edu
URL: http://www.hsph.harvard.edu/faculty/tyler-vanderweele/

Departments of Epidemiology
Harvard School of Public Health
677 Huntington Avenue
Boston, Massachusetts 02115
USA
E-mail: shpitse@hsph.harvard.edu
