Design-based inference for generalized causal effects in randomized experiments

Xinyuan Chen¹,* and Fan Li²,†

¹ Department of Mathematics and Statistics, Mississippi State University, MS, USA
² Department of Biostatistics, Yale School of Public Health, CT, USA
* xchen@math.msstate.edu
† fan.f.li@yale.edu

February 27, 2026

Abstract

Generalized causal effect estimands, including the Mann-Whitney parameter and causal net benefit, provide flexible summaries of treatment effects in randomized experiments with non-Gaussian or multivariate outcomes. We develop a unified design-based inference framework for regression adjustment and variance estimation of a broad class of generalized causal effect estimands defined through pairwise contrast functions. Leveraging the theory of U-statistics and finite-population asymptotics, we establish the consistency and asymptotic normality of regression estimators constructed from individual pairs and per-unit pair averages, even when the working models are misspecified. Consequently, these estimators are model-assisted rather than model-based. In contrast to classical average treatment effect estimands, we show that for nonlinear contrast functions, covariate adjustment preserves consistency but does not admit a universal efficiency guarantee. For inference, we demonstrate that standard heteroskedasticity-robust and cluster-robust variance estimators are generally inconsistent in this setting. As a remedy, we prove that a complete two-way cluster-robust variance estimator, which fully accounts for pairwise dependence and reverse comparisons, is consistent.
Keywords: design-based inference, finite-population central limit theorem, pairwise comparisons, randomized experiments, U-statistics

1 Introduction

Randomized experiments increasingly evaluate treatments using complex outcomes, including ordinal responses (Wang and Pocock, 2016), prioritized composite endpoints (Pocock et al., 2011), multivariate clinical measures (Pocock, 1997), and outcomes for which distributional shifts are more meaningful than mean differences (De Schryver and De Neve, 2019). In such settings, treatment effects can be summarized through pairwise comparison estimands that contrast outcomes across counterfactual conditions. Examples include the probabilistic index (Thas et al., 2012) and the causal net benefit estimands (Bebu and Lachin, 2016). Beyond the classical average treatment effect (ATE), these quantities provide flexible, interpretable summaries of treatment superiority that extend beyond linear contrasts and are particularly suited to non-Gaussian or multi-component outcomes.

In this article, we develop a unified design-based framework for inference on a broad class of generalized causal effect (GCE) estimands in randomized experiments, specifically accommodating non-Gaussian, multivariate, and composite outcomes. Mathematically, let $Y = (Y_1, \ldots, Y_Q)^\top$ denote the $Q$-dimensional ($Q \ge 1$) outcomes, where each component outcome $Y_q$ ($q = 1, \ldots, Q$) is unconstrained and can be of any type. Under the finite-population potential outcomes framework, the GCE estimand is defined as

$$\lambda(a, 1-a) = \frac{1}{N(N-1)} \sum_{1 \le i \ne j \le N} w\{Y_i(a), Y_j(1-a)\}, \tag{1}$$

where $N$ is the total number of units in the experiment, $Y_i(a) = (Y_{1,i}(a), \ldots, Y_{Q,i}(a))^\top$ is the potential outcome vector under treatment condition $a = 0, 1$, and $w$ is a contrast function encoding the comparison rule between two potential outcome vectors.
Here, the summation averages contrasts over all distinct ordered pairs of units, representing a between-unit comparison in the finite population. This formulation accommodates arbitrary outcome structures, including multivariate, prioritized, or composite outcomes, and encompasses a wide range of nonlinear treatment contrasts (Section 2.1). For example, when $Q = 1$ and $w(y_1, y_2) = \mathbb{1}(y_1 > y_2) + \tfrac{1}{2}\mathbb{1}(y_1 = y_2)$, (1) reduces to the probabilistic index underlying Mann-Whitney type effects (Thas et al., 2012; De Schryver and De Neve, 2019); when $Q = 1$ and $w(y_1, y_2) = y_1 - y_2$ is a linear contrast, (1) reduces to the classical ATE given by $\lambda(a, 1-a) = N^{-1} \sum_{i=1}^{N} \{Y_i(a) - Y_i(1-a)\}$. The GCE, therefore, unifies classical linear estimands and modern nonlinear contrast functionals. Combining $\lambda(a, 1-a)$ and $\lambda(1-a, a)$ leads to causal net-benefit summaries such as $\tau(a) = \lambda(a, 1-a) - \lambda(1-a, a)$, which quantify the advantage of allocation $a$ relative to its complement $1-a$; see, for example, Bebu and Lachin (2016) and Scheidegger et al. (2025).

For randomized experiments with a univariate outcome ($Q = 1$) and $w(y_1, y_2) = y_1 - y_2$, Lin (2013) demonstrated that covariate adjustment via linear regression can improve asymptotic efficiency under complete randomization when treatment-covariate interactions are included, and that heteroskedasticity-robust variance estimation can be used for asymptotically conservative inference. Importantly, this result does not rely on the correct specification of the linear outcome model; rather, regression adjustment serves as a model-assisted estimator whose efficiency gains do not compromise the ATE estimand. In comparison, the design-based theory for estimating the GCE has been elusive, despite a few prior efforts.
For example, for a nonlinear $w$ with univariate outcomes ($Q = 1$), Vermeulen et al. (2015) and Scheidegger et al. (2025) have suggested that a similar efficiency gain for estimating $\lambda(a, 1-a)$ in (1) can be obtained through the probabilistic index model (PIM; Thas et al., 2012), a model that resembles linear regression but specifically exploits the relationship between the probabilistic index and covariates. However, neither Vermeulen et al. (2015) nor Scheidegger et al. (2025) provided detailed analytical discussions of the impact of covariate adjustment on estimation efficiency or of the consequences of different regression model specifications. In addition, and perhaps more importantly, design-based theory for variance estimation with a nonlinear contrast function remains largely unexplored in randomized experiments. We therefore provide a comprehensive treatment of model-assisted analysis of the GCE under the design-based framework, along with asymptotically valid variance estimators.

Specifically, for estimating $\lambda(a, 1-a)$ in randomized experiments, we consider two classes of estimators using different levels of data: individual pairs and per-unit pair averages. For each class of estimators, we study three regression model specifications: the Neyman-type (Splawa-Neyman, 1923), regressing the outcome on the treatment indicators; the Fisher-type (Fisher, 1935), on the treatment indicators and covariates; and the Lin-type (Lin, 2013), on the treatment indicators and treatment-covariate interactions. We prove that these estimators are consistent and asymptotically normal for the GCE even when the working models are arbitrarily misspecified. Leveraging the theory of U-statistics, we compare the leading terms in the Neyman finite-sample randomization variances (hereafter referred to as Neyman variances) of these estimators.
Our results demonstrate a fundamental departure from the classical ATE setting. That is, under nonlinear contrast functions, covariate adjustment admits no universal guarantee of asymptotic efficiency gains, regardless of the choice of model-assisted estimators. This result points to the intrinsic limits of regression adjustment for the GCE, including the probabilistic index estimands. However, covariate-adjusted estimators remain more efficient empirically when the covariates are prognostic for the outcome, as shown in the simulation studies. Of note, the theoretical development underlying the technical results is non-trivial. This is because all estimators are functions of finite-population U-statistics arising from pairwise contrasts, which induce complex dependence structures through shared units and reverse comparisons. Establishing consistency and asymptotic normality, therefore, requires extending finite-population central limit theorems (Li and Ding, 2017) to this nonlinear comparison setting.

Finally, for inference, we demonstrate that, surprisingly, neither the heteroskedasticity-robust (HR) (Huber, 1967) nor the cluster-robust (CR) variance estimator (Liang and Zeger, 1986) is consistent, as neither correctly accounts for the correlation structure. Instead, the two-way (TW) cluster-robust variance estimator (Cameron et al., 2011) is consistent for the variance of $\hat{\lambda}(a, 1-a)$, but not for the covariance between $\hat{\lambda}(a, 1-a)$ and $\hat{\lambda}(1-a, a)$, and may therefore be anti-conservative for the final variance of the summary estimand $\hat{\tau}(a)$. As a solution, we propose a complete two-way (CTW) cluster-robust variance estimator that also accounts for the within-pair covariance, i.e., between $w\{Y_i(a), Y_j(1-a)\}$ and $w\{Y_j(1-a), Y_i(a)\}$, which consistently estimates the variance of $\hat{\tau}(a)$ in randomized experiments.
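As a concrete reading of the estimand in (1) and the net benefit $\tau(a)$, the following Python sketch evaluates both on a toy finite population in which both potential outcomes of every unit are known (which is possible only in simulation). The function names `gce` and `net_benefit`, the use of NumPy, and the toy data are illustrative assumptions and are not taken from the article.

```python
import numpy as np

def prob_index(y1, y2):
    """Mann-Whitney contrast: 1(y1 > y2) + 0.5 * 1(y1 = y2)."""
    return (y1 > y2) + 0.5 * (y1 == y2)

def linear(y1, y2):
    """Linear contrast y1 - y2, which recovers the classical ATE."""
    return y1 - y2

def gce(y_a, y_b, w):
    """lambda(a, 1-a): average of w{Y_i(a), Y_j(1-a)} over ordered pairs i != j."""
    y_a = np.asarray(y_a, dtype=float)
    y_b = np.asarray(y_b, dtype=float)
    n = len(y_a)
    W = np.asarray(w(y_a[:, None], y_b[None, :]), dtype=float)  # N x N contrasts
    off_diag = ~np.eye(n, dtype=bool)                           # drop pairs with i == j
    return W[off_diag].sum() / (n * (n - 1))

def net_benefit(y_a, y_b, w):
    """tau(a) = lambda(a, 1-a) - lambda(1-a, a)."""
    return gce(y_a, y_b, w) - gce(y_b, y_a, w)

# Hypothetical potential outcomes for N = 5 units (univariate, Q = 1).
y1 = np.array([3.0, 1.0, 2.0, 5.0, 4.0])
y0 = np.array([2.0, 2.0, 0.0, 1.0, 4.0])
print(gce(y1, y0, prob_index), net_benefit(y1, y0, prob_index))
print(gce(y1, y0, linear), y1.mean() - y0.mean())  # equal for the linear contrast
```

For the probabilistic-index contrast, $w(y_1, y_2) + w(y_2, y_1) = 1$, so in this sketch $\lambda(1,0) + \lambda(0,1) = 1$ and $\tau(1) = 2\lambda(1,0) - 1$.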
2 Regression estimators for generalized causal effect

2.1 Notation and framework

We consider a completely randomized experiment with $N$ units, and let $A_i = a \in \{0, 1\}$ denote the treatment indicator for individual $i = 1, \ldots, N$. We assume $N_a$ units are randomly assigned to treatment $a$, with $N_0 + N_1 = N$. Let $\pi_a = N_a / N$ for $a = 0, 1$ be the assignment proportion. Define $X_i$ as the $d_x$-dimensional pre-treatment covariate vector, and $Y_i = (Y_{1,i}, \ldots, Y_{Q,i})^\top$ as the $Q$ outcomes of interest, where $Q \ge 1$. Under the stable unit treatment value assumption (SUTVA), the observed outcome is $Y_i = A_i Y_i(1) + (1 - A_i) Y_i(0)$. The observed data for unit $i$ is $(Y_i, A_i, X_i)$. We pursue the design-based causal inference framework such that $(N_a, \pi_a)$ are known in the experiment planning phase, and the potential outcomes $Y_i(a)$ and covariates $X_i$ are considered fixed. The sole source of randomness comes from the treatment assignment process; that is, only the treatment assignment $A_i$ is treated as random. This notation allows us to define the GCE estimand in equation (1). Below, we offer two key remarks on this target estimand.

Remark 1 (Further examples of target estimand) With multiple outcomes ($Q \ge 2$), we write $Y_q(a) \in \mathcal{Y}_q$ and $\boldsymbol{\mathcal{Y}} = \times_{q=1}^{Q} \mathcal{Y}_q$. The contrast $w : \boldsymbol{\mathcal{Y}} \times \boldsymbol{\mathcal{Y}} \mapsto \mathcal{W} \subseteq \mathbb{R}$ may be constructed in at least the following ways. In the non-prioritized setting, where no consensus exists on relative clinical importance, one may use a dimension-wise aggregation $w(y_1, y_2) = \mathcal{A}\{w_1(y_{1,1}, y_{1,2}), \ldots, w_Q(y_{Q,1}, y_{Q,2})\}$, where $y_{q,k}$ denotes the $q$-th component of $y_k$, $w_q(y_{q,1}, y_{q,2}) : \mathcal{Y}_q \times \mathcal{Y}_q \mapsto \mathcal{W}_q \subseteq \mathbb{R}$ is a contrast function for the $q$-th outcome, e.g., the Heaviside step function $w_q(y_{q,1}, y_{q,2}) = \mathbb{1}(y_{q,1} > y_{q,2}) + \tfrac{1}{2}\mathbb{1}(y_{q,1} = y_{q,2})$, and $\mathcal{A} : \times_{q=1}^{Q} \mathcal{W}_q \mapsto \mathcal{W}$ is an aggregation operator.
A notable example is the non-prioritized average, where $\mathcal{A} = \sum_{q=1}^{Q} \omega_q w_q$ with $\omega_q \ge 0$ and $\sum_{q=1}^{Q} \omega_q = 1$. In the prioritized setting, where outcomes can be ranked by clinical importance, a joint lexicographic rule may be defined as $w(y_1, y_2) = \mathbb{1}(y_1 \succeq_L y_2)$, where $y_1 \succ_L y_2$ if the first component on which $y_1$ and $y_2$ differ favors $y_1$ (Bebu and Lachin, 2016).

Remark 2 (An alternative estimand) The estimand (1) averages contrasts over distinct ordered pairs and therefore corresponds to a U-type GCE estimand. However, an alternative, equally valid causal estimand includes diagonal terms and can be written as

$$\lambda^{\dagger}(a, 1-a) = \frac{1}{N^2} \sum_{1 \le i, j \le N} w\{Y_i(a), Y_j(1-a)\}. \tag{2}$$

We refer to (2) as the V-type GCE estimand. The two estimands differ only by

$$\lambda^{\dagger}(a, 1-a) - \lambda(a, 1-a) = \frac{1}{N^2} \sum_{i=1}^{N} w\{Y_i(a), Y_i(1-a)\} - \frac{\sum_{1 \le i \ne j \le N} w\{Y_i(a), Y_j(1-a)\}}{N^2 (N-1)},$$

which is of order $O(N^{-1})$ under mild boundedness conditions on $w$. Hence, under standard finite-population asymptotics with $N \to \infty$, the U- and V-type estimands share the same limiting value and coincide with the corresponding super-population contrast when potential outcomes are viewed as independent draws. From a randomization-based perspective, the estimators developed in this article are asymptotically unbiased and consistent for both the U- and V-type estimands. For definiteness, we therefore focus on the U-type formulation throughout, noting that all asymptotic results extend directly to the V-type counterpart.

To proceed, we define the indicator function $\mathbb{1}_i(a) = \mathbb{1}(A_i = a)$. For a pair of units $i$ and $j$, define the potential contrast as $W_{i,j}(a, a') = w\{Y_i(a), Y_j(a')\}$ for $a, a' = 0, 1$.
Thus, the observed contrast is $W_{i,j} = w(Y_i, Y_j) = \sum_{a=0}^{1} \sum_{a'=0}^{1} \mathbb{1}_i(a) \mathbb{1}_j(a') W_{i,j}(a, a')$. Since $\lambda(a, a)$ is not practically meaningful, we only focus on cases where $a' = 1 - a$. Note that although $\lambda(a, a) \to 0$ as $N \to \infty$, $\lambda(a, a)$ is not necessarily zero in a finite population of size $N$. In general, the contrast function $w$ is asymmetric, and therefore $W_{i,j} \ne W_{j,i}$, implying that there is a pair of observed contrasts $(W_{i,j}, W_{j,i})$ for each pair $(Y_i, Y_j)$. Thus, $(Y_i, Y_j)$ contributes to the estimation of both $\lambda(a, 1-a)$ and $\lambda(1-a, a)$ if $A_i \ne A_j$. For notational simplicity, we further define $X_{i,j} = X_i - X_j$. Thus, $X_{j,i} = -X_{i,j}$ and $\sum_{1 \le i \ne j \le N} X_{i,j} = 0$, i.e., $X_{i,j}$ is a centered covariate vector.

2.2 Regression adjustment using individual pairs

To estimate the GCE in randomized experiments, we first introduce regression estimators that enable baseline covariate adjustment using individual pairs. Define $\mathcal{S}(a, 1-a) = \{(i, j) \mid A_i = a, A_j = 1-a, 1 \le i \ne j \le N\}$. A natural nonparametric estimator for $\lambda(a, 1-a)$ in (1) is given by

$$\hat{\lambda}_{\mathrm{I}}(a, 1-a) = \frac{1}{|\mathcal{S}(a, 1-a)|} \sum_{(i,j) \in \mathcal{S}(a, 1-a)} W_{i,j}. \tag{3}$$

Hence, we consider the regression model

$$W_{i,j} \sim Z_{\mathrm{I},i,j}^{\top} \equiv \big(A_i(1 - A_j),\ (1 - A_i)A_j\big). \tag{4}$$

Then, $\hat{\lambda}_{\mathrm{I}}(1, 0)$ and $\hat{\lambda}_{\mathrm{I}}(0, 1)$ in (3) are the coefficients of $A_i(1 - A_j)$ and $(1 - A_i)A_j$ from the ordinary least squares (OLS) fit of the saturated regression model (4). This is the pairwise or dyadic extension of the Neyman difference-in-means estimator (Splawa-Neyman, 1923) for the classical ATE in randomized experiments. Since this estimator is built upon averaging individual pairs, we include the subscript 'I' to signify this feature. To potentially improve efficiency through covariate adjustment, Fisher (1935) proposed a regression estimator for the ATE in randomized experiments.
Here, we consider its analogue in the pairwise setting,

$$W_{i,j} \sim Z_{\mathrm{I},i,j}^{\mathrm{acv}\top} \equiv \big(A_i(1 - A_j),\ (1 - A_i)A_j,\ X_{i,j}^{\top}\big), \tag{5}$$

which can be considered an analysis of covariance (ANCOVA) model, but now for pairwise comparisons. The ANCOVA estimators using individual pairs, $\hat{\lambda}_{\mathrm{I}}^{\mathrm{acv}}(1, 0)$ and $\hat{\lambda}_{\mathrm{I}}^{\mathrm{acv}}(0, 1)$, are the coefficients of $A_i(1 - A_j)$ and $(1 - A_i)A_j$ from the OLS fit of (5), respectively. Finally, based on Fisher (1935) and Freedman (2008), Lin (2013) developed a regression estimator that includes treatment indicators and treatment indicator-covariate interactions. In the pairwise comparison setting, its analogue is given by

$$W_{i,j} \sim Z_{\mathrm{I},i,j}^{\mathrm{adj}\top} \equiv \big(A_i(1 - A_j),\ (1 - A_i)A_j,\ A_i(1 - A_j)X_{i,j}^{\top},\ (1 - A_i)A_j X_{i,j}^{\top}\big). \tag{6}$$

Therefore, the covariate-adjusted estimators using individual pairs, $\hat{\lambda}_{\mathrm{I}}^{\mathrm{adj}}(1, 0)$ and $\hat{\lambda}_{\mathrm{I}}^{\mathrm{adj}}(0, 1)$, are the coefficients of $A_i(1 - A_j)$ and $(1 - A_i)A_j$ from the OLS fit of (6), respectively.

2.3 Regression adjustment using per-unit pair averages

Based on the pairwise comparison observation $W_{i,j}$, regression adjustment using individual pairs is not the only possible approach for estimating the GCE. Here, we present an alternative approach that operates upon per-unit pair averages. First, one can rewrite $\lambda(a, 1-a)$ in (1) as

$$\lambda(a, 1-a) = \frac{1}{2N} \sum_{i=1}^{N} \frac{1}{N-1} \sum_{j: j \ne i} W_{i,j}(a, 1-a) + \frac{1}{2N} \sum_{j=1}^{N} \frac{1}{N-1} \sum_{i: i \ne j} W_{i,j}(a, 1-a) = \frac{1}{2N} \sum_{i=1}^{N} \big\{ W_{i,\cdot}(a, 1-a) + W_{\cdot,i}(a, 1-a) \big\}, \tag{7}$$

where the second equality defines the per-unit pair averages as

$$W_{i,\cdot}(a, 1-a) = \frac{1}{N-1} \sum_{j: j \ne i} W_{i,j}(a, 1-a), \qquad W_{\cdot,i}(a, 1-a) = \frac{1}{N-1} \sum_{j: j \ne i} W_{j,i}(a, 1-a).$$

Of note, the $1/2$ on each term in the first equality of (7) is for symmetry and can be replaced by any non-negative normalized weights.
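The identity in (7) can be checked numerically. The sketch below builds the matrix of potential contrasts $W_{i,j}(a, 1-a)$ for a simulated finite population and compares the pair-level average in (1) with the average of the per-unit row and column means, including a non-symmetric weighting. The function names, the `weight` argument, and the simulated outcomes are hypothetical choices made for illustration.

```python
import numpy as np

def potential_contrasts(y_a, y_b, w):
    """Matrix with entries W_{i,j}(a, 1-a) = w{Y_i(a), Y_j(1-a)}."""
    return np.asarray(w(y_a[:, None], y_b[None, :]), dtype=float)

def gce_from_pairs(W):
    """Equation (1): average over all ordered pairs i != j."""
    n = W.shape[0]
    return (W.sum() - np.trace(W)) / (n * (n - 1))

def gce_from_unit_averages(W, weight=0.5):
    """Equation (7): weighted average of per-unit row and column means."""
    n = W.shape[0]
    off = W - np.diag(np.diag(W))        # zero out the diagonal terms
    row = off.sum(axis=1) / (n - 1)      # W_{i,.}(a, 1-a)
    col = off.sum(axis=0) / (n - 1)      # W_{.,i}(a, 1-a)
    return np.mean(weight * row + (1.0 - weight) * col)

rng = np.random.default_rng(1)
y0 = rng.normal(size=30)
y1 = y0 + 0.3 + rng.normal(size=30)
W10 = potential_contrasts(y1, y0, lambda u, v: (u > v) + 0.5 * (u == v))
print(gce_from_pairs(W10), gce_from_unit_averages(W10), gce_from_unit_averages(W10, 0.2))
```

The three printed values agree up to floating-point error, because the row means and the column means each average to $\lambda(a, 1-a)$.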
Equation (7) further implies that one can estimate $\lambda(a, 1-a)$ using per-unit pair averages or, equivalently, per-unit pair totals. That is, we can define $W_{i,\cdot}^{\mathrm{A}} = N_{1-A_i}^{-1} \sum_{j: A_j = 1-A_i} W_{i,j}$ and $W_{\cdot,i}^{\mathrm{A}} = N_{1-A_i}^{-1} \sum_{j: A_j = 1-A_i} W_{j,i}$ as the observed per-unit pair averages, and consider the two separate regression models

$$W_{i,\cdot}^{\mathrm{A}} \sim Z_{\mathrm{A},i,\cdot}^{\top} \equiv (A_i,\ 1 - A_i), \qquad W_{\cdot,i}^{\mathrm{A}} \sim Z_{\mathrm{A},\cdot,i}^{\top} \equiv (1 - A_i,\ A_i). \tag{8}$$

Then the coefficient of $A_i$ from the OLS fit of $W_{i,\cdot}^{\mathrm{A}} \sim Z_{\mathrm{A},i,\cdot}^{\top}$ and the coefficient of $1 - A_i$ from the OLS fit of $W_{\cdot,i}^{\mathrm{A}} \sim Z_{\mathrm{A},\cdot,i}^{\top}$ give exactly the same estimator $\hat{\lambda}_{\mathrm{A}}(1, 0)$ for $\lambda(1, 0)$, because they both use the same pairs in $\mathcal{S}(1, 0)$ under different index labels. Similarly, we have $\hat{\lambda}_{\mathrm{A}}(0, 1)$ for $\lambda(0, 1)$. The subscript 'A' is added to signify that the estimator is obtained using per-unit pair averages. It is straightforward to show that the estimator $\hat{\lambda}_{\mathrm{A}}(a, 1-a)$ is algebraically equivalent to $\hat{\lambda}_{\mathrm{I}}(a, 1-a)$.

Similar to (5), the paired ANCOVA models based on per-unit pair averages are

$$W_{i,\cdot}^{\mathrm{A}} \sim Z_{\mathrm{A},i,\cdot}^{\mathrm{acv}\top} \equiv (A_i,\ 1 - A_i,\ X_{i,\cdot}^{\mathrm{A}\top}), \qquad W_{\cdot,i}^{\mathrm{A}} \sim Z_{\mathrm{A},\cdot,i}^{\mathrm{acv}\top} \equiv (1 - A_i,\ A_i,\ X_{\cdot,i}^{\mathrm{A}\top}), \tag{9}$$

where $X_{i,\cdot}^{\mathrm{A}} = N_{1-A_i}^{-1} \sum_{j: A_j = 1-A_i} X_{i,j}$ and $X_{\cdot,i}^{\mathrm{A}} = N_{1-A_i}^{-1} \sum_{j: A_j = 1-A_i} X_{j,i}$ represent the per-unit pair averages of baseline covariates. Importantly, the final ANCOVA estimator using per-unit pair averages for $\lambda(1, 0)$ is the weighted average of the coefficient of $A_i$ from the OLS fit of $W_{i,\cdot}^{\mathrm{A}} \sim Z_{\mathrm{A},i,\cdot}^{\mathrm{acv}\top}$ and the coefficient of $1 - A_i$ from the OLS fit of $W_{\cdot,i}^{\mathrm{A}} \sim Z_{\mathrm{A},\cdot,i}^{\mathrm{acv}\top}$, since these two coefficients are not algebraically equivalent due to baseline covariate adjustment. Similar observations apply to the estimator for $\lambda(0, 1)$.
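The algebraic equivalences stated above can be illustrated with a short Python sketch. It computes $\hat{\lambda}_{\mathrm{I}}(1,0)$ from (3), the OLS coefficient from the pairwise model (4), and the two OLS coefficients from the per-unit submodels in (8), and then reports the two ANCOVA coefficients from (9), which need not coincide. The helper names, the simulated data, and the decision to fit (4) on all ordered pairs $i \ne j$ are assumptions made for this illustration; NumPy's `lstsq` stands in for any OLS routine.

```python
import numpy as np

def ols(Z, y):
    """Least-squares coefficients of y on the columns of Z (no intercept added)."""
    return np.linalg.lstsq(Z, y, rcond=None)[0]

def pair_and_unit_estimates(Y, A, X, w):
    Y = np.asarray(Y, dtype=float)
    A = np.asarray(A).astype(bool)
    X = np.asarray(X, dtype=float).reshape(len(Y), -1)
    a = A.astype(float)
    W = np.asarray(w(Y[:, None], Y[None, :]), dtype=float)   # observed W_{i,j}

    # (3): mean of W_{i,j} over pairs with i treated and j control.
    lam_I = W[np.ix_(A, ~A)].mean()

    # (4): OLS on all ordered pairs i != j (assumed fitting sample).
    i, j = np.nonzero(~np.eye(len(Y), dtype=bool))
    Z_I = np.column_stack([a[i] * (1 - a[j]), (1 - a[i]) * a[j]])
    lam_I_ols = ols(Z_I, W[i, j])[0]

    # Observed per-unit averages, taken over the opposite arm.
    W_row = np.where(A, W[:, ~A].mean(axis=1), W[:, A].mean(axis=1))  # W^A_{i,.}
    W_col = np.where(A, W[~A, :].mean(axis=0), W[A, :].mean(axis=0))  # W^A_{.,i}
    X_row = X - np.where(A[:, None], X[~A].mean(axis=0), X[A].mean(axis=0))  # X^A_{i,.}
    X_col = -X_row                                                          # X^A_{.,i}

    Z1 = np.column_stack([a, 1 - a])   # (A_i, 1 - A_i)
    Z2 = np.column_stack([1 - a, a])   # (1 - A_i, A_i)
    return {
        "lam_I": lam_I,
        "lam_I_ols": lam_I_ols,
        "lam_A1": ols(Z1, W_row)[0],                              # coef of A_i in (8)
        "lam_A2": ols(Z2, W_col)[0],                              # coef of 1 - A_i in (8)
        "acv_A1": ols(np.column_stack([Z1, X_row]), W_row)[0],    # coef of A_i in (9)
        "acv_A2": ols(np.column_stack([Z2, X_col]), W_col)[0],    # coef of 1 - A_i in (9)
    }

rng = np.random.default_rng(0)
N = 40
X = rng.normal(size=(N, 1))
A = rng.permutation(np.repeat([1, 0], N // 2))   # complete randomization, N_1 = N_0
Y = X[:, 0] + 0.5 * A + rng.normal(size=N)
est = pair_and_unit_estimates(Y, A, X, lambda u, v: (u > v) + 0.5 * (u == v))
print(est)
```

In this sketch the first four entries agree up to floating-point error, whereas `acv_A1` and `acv_A2` are separate estimates of $\lambda(1,0)$.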
The optimal weights can be determined using the Neyman variances of the coefficients and their covariance, which requires involved estimation, as we detail in Section 4. Therefore, in practice, we recommend selecting the coefficient with the smaller estimated variance as the final estimator. For simplicity of exposition, in the remainder of the article, we label the estimator using only coefficients from the OLS fit of $W_{i,\cdot}^{\mathrm{A}} \sim Z_{\mathrm{A},i,\cdot}^{\mathrm{acv}\top}$ as $\hat{\lambda}_{\mathrm{A},1}^{\mathrm{acv}}(a, 1-a)$ and the corresponding estimator from the OLS fit of $W_{\cdot,i}^{\mathrm{A}} \sim Z_{\mathrm{A},\cdot,i}^{\mathrm{acv}\top}$ as $\hat{\lambda}_{\mathrm{A},2}^{\mathrm{acv}}(a, 1-a)$, and focus our theoretical discussions on these estimators.

Finally, to allow for treatment-by-covariate interactions using per-unit pair averages, we consider the following regression analogues of the Lin (2013)-type adjustment:

$$W_{i,\cdot}^{\mathrm{A}} \sim Z_{\mathrm{A},i,\cdot}^{\mathrm{adj}\top} \equiv \big(A_i,\ 1 - A_i,\ A_i X_{i,\cdot}^{\mathrm{A}\top},\ (1 - A_i) X_{i,\cdot}^{\mathrm{A}\top}\big), \qquad W_{\cdot,i}^{\mathrm{A}} \sim Z_{\mathrm{A},\cdot,i}^{\mathrm{adj}\top} \equiv \big(1 - A_i,\ A_i,\ (1 - A_i) X_{\cdot,i}^{\mathrm{A}\top},\ A_i X_{\cdot,i}^{\mathrm{A}\top}\big). \tag{10}$$

The final covariate-adjusted estimator using per-unit pair averages for $\lambda(1, 0)$ is the weighted average of $\hat{\lambda}_{\mathrm{A},1}^{\mathrm{adj}}(1, 0)$, the coefficient of $A_i$ from the OLS fit of $W_{i,\cdot}^{\mathrm{A}} \sim Z_{\mathrm{A},i,\cdot}^{\mathrm{adj}\top}$, and $\hat{\lambda}_{\mathrm{A},2}^{\mathrm{adj}}(1, 0)$, the coefficient of $1 - A_i$ from the OLS fit of $W_{\cdot,i}^{\mathrm{A}} \sim Z_{\mathrm{A},\cdot,i}^{\mathrm{adj}\top}$, since these two coefficients are again not algebraically equivalent due to baseline covariate adjustment. Similarly, we label the resulting estimators for $\lambda(0, 1)$ as $\hat{\lambda}_{\mathrm{A},1}^{\mathrm{adj}}(0, 1)$ and $\hat{\lambda}_{\mathrm{A},2}^{\mathrm{adj}}(0, 1)$. We provide a summary of all six regression estimators in Table 1.

Table 1: A summary of possible regression estimators for the generalized causal effect $\lambda(1, 0)$. The results for $\lambda(0, 1)$ are symmetric and omitted. All models are fitted using ordinary least squares.
| Regression unit | Formula | Equation | Estimator for $\lambda(1, 0)$ |
|---|---|---|---|
| Individual pair | $W_{i,j} \sim Z_{\mathrm{I},i,j}^{\top}$ | (4) | coef$\{A_i(1 - A_j)\}$ |
| Individual pair | $W_{i,j} \sim Z_{\mathrm{I},i,j}^{\mathrm{acv}\top}$ | (5) | coef$\{A_i(1 - A_j)\}$ |
| Individual pair | $W_{i,j} \sim Z_{\mathrm{I},i,j}^{\mathrm{adj}\top}$ | (6) | coef$\{A_i(1 - A_j)\}$ |
| Per-unit pair average | $W_{i,\cdot}^{\mathrm{A}} \sim Z_{\mathrm{A},i,\cdot}^{\top}$ and $W_{\cdot,i}^{\mathrm{A}} \sim Z_{\mathrm{A},\cdot,i}^{\top}$ | (8) | coef$(A_i)$ in the first submodel or, equivalently, coef$(1 - A_i)$ in the second submodel |
| Per-unit pair average | $W_{i,\cdot}^{\mathrm{A}} \sim Z_{\mathrm{A},i,\cdot}^{\mathrm{acv}\top}$ and $W_{\cdot,i}^{\mathrm{A}} \sim Z_{\mathrm{A},\cdot,i}^{\mathrm{acv}\top}$ | (9) | weighted average of coef$(A_i)$ in the first submodel and coef$(1 - A_i)$ in the second submodel |
| Per-unit pair average | $W_{i,\cdot}^{\mathrm{A}} \sim Z_{\mathrm{A},i,\cdot}^{\mathrm{adj}\top}$ and $W_{\cdot,i}^{\mathrm{A}} \sim Z_{\mathrm{A},\cdot,i}^{\mathrm{adj}\top}$ | (10) | weighted average of coef$(A_i)$ in the first submodel and coef$(1 - A_i)$ in the second submodel |

Remark 3 (Connection with rank regression) Under specific contrast functions, e.g., $w(Y_1, Y_2) = \mathbb{1}(Y_1 \succeq_L Y_2)$, the models in (8)-(10) are unit-level rank-average or aggregated pairwise rank regressions (De Neve and Thas, 2015) applied to randomized experiments. The per-unit average $W_{i,\cdot}(a, 1-a)$ is the proportion of wins for unit $i$ in treatment group $a$ against all units in treatment group $1-a$, or the average rank of $Y_i(a)$ among all $Y_j(1-a)$, with $W_{i,\cdot}^{\mathrm{A}}$ being the observed version of $W_{i,\cdot}(a, 1-a)$. Under $w(Y_1, Y_2) = \mathbb{1}(Y_1 \succeq_L Y_2)$, the estimand $\lambda(a, 1-a) = N^{-1} \sum_{i=1}^{N} W_{i,\cdot}(a, 1-a) \to P\{Y(a) \succeq_L Y(1-a)\}$ as $N \to \infty$, which is the causal Mann-Whitney estimand under a super-population framework.

3 Asymptotic properties of regression estimators

3.1 Preliminaries

To study the asymptotic properties of the regression estimators under arbitrary model misspecification, we adopt the finite-population asymptotic regime of Li and Ding (2017), considering a sequence of finite populations with $N \to \infty$. All quantities depend on $N$, and we suppress this dependence in notation for simplicity.
The following regularity conditions are required to establish these results.

Condition 1 $\pi_a$ has a limit in $(0, 1)$ for $a = 0, 1$, and $d_x$ is finite and independent of $N$.

Condition 2 $\{N(N-1)\}^{-1} \sum_{i \ne j} W_{i,j}(a, 1-a)^4 = O(1)$ for $a = 0, 1$. $\{N(N-1)\}^{-1} \sum_{i \ne j} \|X_{i,j}\|_{\infty}^4 = O(1)$ and $\max_{i,j: i \ne j} \|X_{i,j}\|_{\infty}^2 = o(N)$.

Condition 3 The limit of $\{N(N-1)\}^{-1} \sum_{i \ne j} X_{i,j} X_{i,j}^{\top}$ is positive definite, and the limit of $\{N(N-1)\}^{-1} \sum_{i \ne j} X_{i,j} W_{i,j}$ is finite.

Conditions 1-3 are typical regularity conditions invoked in design-based causal inference; see, for example, Lin (2013) and Li and Ding (2017). Condition 1 requires that the numbers of units under the two treatments are not too imbalanced, and restricts the covariates to a fixed number. Condition 2 restricts the moments of the potential outcomes and covariates, ruling out highly influential or outlying observations and preventing the magnitude of the covariates from growing too rapidly. Condition 3 requires the components in the design matrices to be well-behaved.

We introduce additional notation in preparation for the main results. Let $\bar{W}(a, 1-a) = \{N(N-1)\}^{-1} \sum_{i \ne j} W_{i,j}(a, 1-a)$. Define $\epsilon_{i,j}(a, 1-a) = W_{i,j}(a, 1-a) - \bar{W}(a, 1-a)$, with $\epsilon_{i,\cdot}(a, 1-a) = (N-1)^{-1} \sum_{j: j \ne i} \epsilon_{i,j}(a, 1-a) = W_{i,\cdot}(a, 1-a) - \bar{W}(a, 1-a)$ and $\epsilon_{\cdot,i}(a, 1-a) = (N-1)^{-1} \sum_{j: j \ne i} \epsilon_{j,i}(a, 1-a) = W_{\cdot,i}(a, 1-a) - \bar{W}(a, 1-a)$. Note that $\sum_{i=1}^{N} \epsilon_{i,\cdot}(a, 1-a) = 0$ and $\sum_{i=1}^{N} \epsilon_{\cdot,i}(a, 1-a) = 0$ by definition. Let $\mathcal{E}_i(a, 1-a) \equiv \{\mathcal{E}_{i,\cdot}(a, 1-a), \mathcal{E}_{\cdot,i}(a, 1-a)\}$ for $i = 1, \ldots, N$ be pairs of quantities with $\sum_{i=1}^{N} \mathcal{E}_{i,\cdot}(a, 1-a) = 0$ and $\sum_{i=1}^{N} \mathcal{E}_{\cdot,i}(a, 1-a) = 0$.
Motivated by the form of the Neyman variance, we further define

$$V_c\{\mathcal{E}_i(a, 1-a)\} = \frac{1}{\pi_a N} \sum_{i=1}^{N} \mathcal{E}_{i,\cdot}(a, 1-a)^2 + \frac{1}{\pi_{1-a} N} \sum_{i=1}^{N} \mathcal{E}_{\cdot,i}(a, 1-a)^2,$$

$$V\{\mathcal{E}_i(a, 1-a)\} = V_c\{\mathcal{E}_i(a, 1-a)\} - \frac{1}{N} \sum_{i=1}^{N} \{\mathcal{E}_{i,\cdot}(a, 1-a) + \mathcal{E}_{\cdot,i}(a, 1-a)\}^2.$$

Also, for the covariances, we define

$$C_V(\mathcal{E}_i) = \frac{1}{\pi_a N} \sum_{i=1}^{N} \mathcal{E}_{i,\cdot}(a, 1-a)\, \mathcal{E}_{\cdot,i}(1-a, a) + \frac{1}{\pi_{1-a} N} \sum_{i=1}^{N} \mathcal{E}_{i,\cdot}(1-a, a)\, \mathcal{E}_{\cdot,i}(a, 1-a) - \frac{1}{N} \sum_{i=1}^{N} \{\mathcal{E}_{i,\cdot}(a, 1-a) + \mathcal{E}_{\cdot,i}(a, 1-a)\} \{\mathcal{E}_{i,\cdot}(1-a, a) + \mathcal{E}_{\cdot,i}(1-a, a)\}.$$

Finally, for the Neyman variance of the causal net benefit estimators, we also define

$$V_c(\mathcal{E}_i) = \frac{1}{\pi_a N} \sum_{i=1}^{N} \{\mathcal{E}_{i,\cdot}(a, 1-a) - \mathcal{E}_{\cdot,i}(1-a, a)\}^2 + \frac{1}{\pi_{1-a} N} \sum_{i=1}^{N} \{\mathcal{E}_{i,\cdot}(1-a, a) - \mathcal{E}_{\cdot,i}(a, 1-a)\}^2,$$

$$V(\mathcal{E}_i) = V_c(\mathcal{E}_i) - \frac{1}{N} \sum_{i=1}^{N} \big[ \{\mathcal{E}_{i,\cdot}(a, 1-a) - \mathcal{E}_{\cdot,i}(1-a, a)\} + \{\mathcal{E}_{i,\cdot}(1-a, a) - \mathcal{E}_{\cdot,i}(a, 1-a)\} \big]^2.$$

3.2 Regression estimators using individual pairs

Let $\epsilon_i(a, 1-a) = \{\epsilon_{i,\cdot}(a, 1-a), \epsilon_{\cdot,i}(a, 1-a)\}$. We have the following result for $\hat{\lambda}_{\mathrm{I}}(a, 1-a)$.

Theorem 1 Under complete randomization and Conditions 1 and 2, $\hat{\lambda}_{\mathrm{I}}(a, 1-a) = \lambda(a, 1-a) + o_P(1)$; if further $V\{\epsilon_i(a, 1-a)\} \not\to 0$, then $N^{1/2} \{\hat{\lambda}_{\mathrm{I}}(a, 1-a) - \lambda(a, 1-a)\} / \sigma_{\mathrm{I}}(a, 1-a) \xrightarrow{d} \mathcal{N}(0, 1)$ with $\sigma_{\mathrm{I}}(a, 1-a) = V\{\epsilon_i(a, 1-a)\} + O(N^{-1})$. For the covariance, if $C_V(\epsilon_i) \not\to 0$, then $\nu_{\mathrm{I}} = N \mathrm{Cov}\{\hat{\lambda}_{\mathrm{I}}(a, 1-a), \hat{\lambda}_{\mathrm{I}}(1-a, a)\} = C_V(\epsilon_i) + O(N^{-1})$.

Theorem 1 presents the consistency and asymptotic normality of the nonparametric estimator $\hat{\lambda}_{\mathrm{I}}(a, 1-a)$. The condition $\{N(N-1)\}^{-1} \sum_{i \ne j} W_{i,j}(a, 1-a)^4 = O(1)$ can be relaxed to $\{N(N-1)\}^{-1} \sum_{i \ne j} W_{i,j}(a, 1-a)^4 = o(N)$ for $a = 0, 1$.
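To make the leading variance term $V\{\epsilon_i(a, 1-a)\}$ in Theorem 1 concrete, the following Python sketch evaluates it from a simulated finite population with both potential outcomes known. The function names and the simulated outcomes are hypothetical; the formulas follow the definitions of $\epsilon_{i,\cdot}$, $\epsilon_{\cdot,i}$, $V_c$ and $V$ given above.

```python
import numpy as np

def centered_unit_averages(y_a, y_b, w):
    """Return (eps_{i,.}(a,1-a), eps_{.,i}(a,1-a)) for all units i."""
    n = len(y_a)
    W = np.asarray(w(y_a[:, None], y_b[None, :]), dtype=float)
    np.fill_diagonal(W, 0.0)                 # only pairs with i != j enter
    w_bar = W.sum() / (n * (n - 1))          # overall mean of W_{i,j}(a,1-a)
    eps_row = W.sum(axis=1) / (n - 1) - w_bar
    eps_col = W.sum(axis=0) / (n - 1) - w_bar
    return eps_row, eps_col

def leading_variance(eps_row, eps_col, pi_a):
    """V{E_i(a,1-a)} = V_c{E_i(a,1-a)} - N^{-1} sum_i (E_{i,.} + E_{.,i})^2."""
    n = len(eps_row)
    v_c = (eps_row ** 2).sum() / (pi_a * n) + (eps_col ** 2).sum() / ((1 - pi_a) * n)
    return v_c - ((eps_row + eps_col) ** 2).sum() / n

def prob_index(u, v):
    return (u > v) + 0.5 * (u == v)

rng = np.random.default_rng(0)
Y0 = rng.normal(size=200)
Y1 = Y0 + 1.0 + 0.5 * rng.normal(size=200)
pi_1 = 0.5                                    # assumed assignment proportion N_1 / N

V_10 = leading_variance(*centered_unit_averages(Y1, Y0, prob_index), pi_a=pi_1)
V_01 = leading_variance(*centered_unit_averages(Y0, Y1, prob_index), pi_a=1 - pi_1)
print(V_10, V_01)
```

For the linear contrast $w(y_1, y_2) = y_1 - y_2$, the same function returns a value within $O(N^{-1})$ of the classical Neyman expression $S_1^2/\pi_1 + S_0^2/\pi_0 - S_\tau^2$ (with population variances of $Y_i(1)$, $Y_i(0)$ and $Y_i(1) - Y_i(0)$), which gives a quick sanity check of the sketch.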
The key to obtaining the results in Theorem 1 is to apply the Hoeffding decomposition (van der Vaart, 1998, § 11.4) to $\hat{\lambda}_{\mathrm{I}}(a, 1-a)$, which is an order-two U-statistic. This yields a Bahadur representation of the U-statistic as a linear summation of Hájek projections plus an uncorrelated degenerate remainder of order $o_P(N^{-1/2})$. Thus, the asymptotic properties of $\hat{\lambda}_{\mathrm{I}}(a, 1-a)$ can be obtained by investigating the dominant linear summation term, which is also linear in $A_i$. Then, under Conditions 1 and 2, consistency is obtained by Chebyshev's inequality, and asymptotic normality is obtained by applying the finite-population central limit theorem for simple random sampling (Li and Ding, 2017, Theorem 1) to the linear terms. We obtain the leading terms of the Neyman variances of $\hat{\lambda}_{\mathrm{I}}(a, 1-a)$ and $\hat{\lambda}_{\mathrm{I}}(1-a, a)$, as well as their covariance, rather than their full exact expressions, where the $O(N^{-1})$ terms in $\sigma_{\mathrm{I}}(a, 1-a)$ and $\nu_{\mathrm{I}}$ are from the uncorrelated degenerate remainder.

Next, let $\gamma_{\mathrm{I}}^{\mathrm{adj}}(a, 1-a)$ denote the coefficient from the theoretical OLS fit of $\epsilon_{i,j}(a, 1-a)$ on $X_{i,j}$, which is the finite-population probability limit of $\hat{\gamma}_{\mathrm{I}}^{\mathrm{adj}}(a, 1-a)$, the coefficient of $X_{i,j}$ from the OLS fit of (6). Define $r_{\mathrm{I},i,j}^{\mathrm{adj}}(a, 1-a) = \epsilon_{i,j}(a, 1-a) - X_{i,j}^{\top} \gamma_{\mathrm{I}}^{\mathrm{adj}}(a, 1-a)$ and $r_{\mathrm{I},j,i}^{\mathrm{adj}}(a, 1-a) = \epsilon_{j,i}(a, 1-a) - X_{j,i}^{\top} \gamma_{\mathrm{I}}^{\mathrm{adj}}(a, 1-a)$, with $r_{\mathrm{I},i,\cdot}^{\mathrm{adj}}(a, 1-a) = (N-1)^{-1} \sum_{j: j \ne i} r_{\mathrm{I},i,j}^{\mathrm{adj}}(a, 1-a) = \epsilon_{i,\cdot}(a, 1-a) - X_{i,\cdot}^{\top} \gamma_{\mathrm{I}}^{\mathrm{adj}}(a, 1-a)$ and $r_{\mathrm{I},\cdot,i}^{\mathrm{adj}}(a, 1-a) = (N-1)^{-1} \sum_{j: j \ne i} r_{\mathrm{I},j,i}^{\mathrm{adj}}(a, 1-a) = \epsilon_{\cdot,i}(a, 1-a) - X_{\cdot,i}^{\top} \gamma_{\mathrm{I}}^{\mathrm{adj}}(a, 1-a)$, where $X_{i,\cdot} = (N-1)^{-1} \sum_{j: j \ne i} X_{i,j}$ and $X_{\cdot,i} = (N-1)^{-1} \sum_{j: j \ne i} X_{j,i}$. Let $r_{\mathrm{I},i}^{\mathrm{adj}}(a, 1-a) = \{r_{\mathrm{I},i,\cdot}^{\mathrm{adj}}(a, 1-a), r_{\mathrm{I},\cdot,i}^{\mathrm{adj}}(a, 1-a)\}$.
We have the following result for $\hat{\lambda}_{\mathrm{I}}^{\mathrm{adj}}(a, 1-a)$.

Theorem 2 Under complete randomization and Conditions 1-3, $\hat{\lambda}_{\mathrm{I}}^{\mathrm{adj}}(a, 1-a) = \lambda(a, 1-a) + o_P(1)$; if further $V\{r_{\mathrm{I},i}^{\mathrm{adj}}(a, 1-a)\} \not\to 0$, then $N^{1/2} \{\hat{\lambda}_{\mathrm{I}}^{\mathrm{adj}}(a, 1-a) - \lambda(a, 1-a)\} / \sigma_{\mathrm{I}}^{\mathrm{adj}}(a, 1-a) \xrightarrow{d} \mathcal{N}(0, 1)$ with $\sigma_{\mathrm{I}}^{\mathrm{adj}}(a, 1-a) = V\{r_{\mathrm{I},i}^{\mathrm{adj}}(a, 1-a)\} + O(N^{-1})$. For the covariance, if $C_V(r_{\mathrm{I},i}^{\mathrm{adj}}) \not\to 0$, then $\nu_{\mathrm{I}}^{\mathrm{adj}} = N \mathrm{Cov}\{\hat{\lambda}_{\mathrm{I}}^{\mathrm{adj}}(a, 1-a), \hat{\lambda}_{\mathrm{I}}^{\mathrm{adj}}(1-a, a)\} = C_V(r_{\mathrm{I},i}^{\mathrm{adj}}) + O(N^{-1})$.

Theorem 2 presents the consistency and asymptotic normality of $\hat{\lambda}_{\mathrm{I}}^{\mathrm{adj}}(a, 1-a)$ under arbitrary model misspecification. Interestingly, the form of $V\{r_{\mathrm{I},i}^{\mathrm{adj}}(a, 1-a)\}$ suggests that $\hat{\lambda}_{\mathrm{I}}^{\mathrm{adj}}(a, 1-a)$ may be less efficient than $\hat{\lambda}_{\mathrm{I}}(a, 1-a)$, as $\gamma_{\mathrm{I}}^{\mathrm{adj}}(a, 1-a)$ is not the coefficient of $X_{i,\cdot}$ from the OLS fit of $\epsilon_{i,\cdot}(a, 1-a)$ on $X_{i,\cdot}$. This phenomenon was first identified by Lin (2013) for estimating the ATE in individual randomized experiments. A related result is also provided by Su and Ding (2021), who expanded the results of Lin (2013) to cluster randomization, where the Neyman variance of estimators based on individual-level data does not incorporate the "correct" regression coefficient, whereas estimators constructed from scaled cluster totals do. Consequently, the latter is guaranteed to be at least as statistically efficient as the corresponding unadjusted estimators.

Finally, let $\gamma_{\mathrm{I}}^{\mathrm{acv}}$ denote the coefficient from the theoretical OLS fit of $\sum_{a=0}^{1} \sum_{a'=0}^{1} \pi_a \pi_{a'} W_{i,j}(a, a')$ on $X_{i,j}$. Define $r_{\mathrm{I},i,j}^{\mathrm{acv}}(a, 1-a) = \epsilon_{i,j}(a, 1-a) - X_{i,j}^{\top} \gamma_{\mathrm{I}}^{\mathrm{acv}}$ and $r_{\mathrm{I},j,i}^{\mathrm{acv}}(a, 1-a) = \epsilon_{j,i}(a, 1-a) - X_{j,i}^{\top} \gamma_{\mathrm{I}}^{\mathrm{acv}}$, with $r_{\mathrm{I},i,\cdot}^{\mathrm{acv}}(a, 1-a) = (N-1)^{-1} \sum_{j: j \ne i} r_{\mathrm{I},i,j}^{\mathrm{acv}}(a, 1-a)$ and $r_{\mathrm{I},\cdot,i}^{\mathrm{acv}}(a, 1-a) = (N-1)^{-1} \sum_{j: j \ne i} r_{\mathrm{I},j,i}^{\mathrm{acv}}(a, 1-a)$.
Let $r_{\mathrm{I},i}^{\mathrm{acv}}(a, 1-a) = \{r_{\mathrm{I},i,\cdot}^{\mathrm{acv}}(a, 1-a), r_{\mathrm{I},\cdot,i}^{\mathrm{acv}}(a, 1-a)\}$. We have the following result for $\hat{\lambda}_{\mathrm{I}}^{\mathrm{acv}}(a, 1-a)$.

Theorem 3 Under complete randomization and Conditions 1-3, $\hat{\lambda}_{\mathrm{I}}^{\mathrm{acv}}(a, 1-a) = \lambda(a, 1-a) + o_P(1)$; if further $V\{r_{\mathrm{I},i}^{\mathrm{acv}}(a, 1-a)\} \not\to 0$, then $N^{1/2} \{\hat{\lambda}_{\mathrm{I}}^{\mathrm{acv}}(a, 1-a) - \lambda(a, 1-a)\} / \sigma_{\mathrm{I}}^{\mathrm{acv}}(a, 1-a) \xrightarrow{d} \mathcal{N}(0, 1)$ with $\sigma_{\mathrm{I}}^{\mathrm{acv}}(a, 1-a) = V\{r_{\mathrm{I},i}^{\mathrm{acv}}(a, 1-a)\} + O(N^{-1})$. For the covariance, if $C_V(r_{\mathrm{I},i}^{\mathrm{acv}}) \not\to 0$, then $\nu_{\mathrm{I}}^{\mathrm{acv}} = N \mathrm{Cov}\{\hat{\lambda}_{\mathrm{I}}^{\mathrm{acv}}(a, 1-a), \hat{\lambda}_{\mathrm{I}}^{\mathrm{acv}}(1-a, a)\} = C_V(r_{\mathrm{I},i}^{\mathrm{acv}}) + O(N^{-1})$.

Theorem 3 presents the consistency and asymptotic normality of $\hat{\lambda}_{\mathrm{I}}^{\mathrm{acv}}(a, 1-a)$ under arbitrary model misspecification. As with $\hat{\lambda}_{\mathrm{I}}^{\mathrm{adj}}(a, 1-a)$, a noteworthy finding is that $\hat{\lambda}_{\mathrm{I}}^{\mathrm{acv}}(a, 1-a)$ may also be less efficient than $\hat{\lambda}_{\mathrm{I}}(a, 1-a)$. In the classical setting of estimating the ATE with linear contrasts, Lin (2013) established that covariate adjustment under a fully interacted specification does not result in asymptotic efficiency loss under complete randomization. Our result clarifies that, although consistency holds under arbitrary model misspecification, this efficiency property does not extend to GCEs based on nonlinear contrast functions. The departure arises from the non-separable structure of the GCE, under which regression adjustment alters the leading term of the Neyman variance in ways that need not reduce variability. This difference points to a fundamental limitation of regression adjustment beyond linear estimands and clarifies that efficiency guarantees for covariate adjustment can be estimand-dependent when moving to nonlinear contrast functionals. We offer more details on this discussion in the ensuing subsection.

It is worth noting that in some cases, the contrast functions can be anti-symmetric such that $W_{i,j} = C - W_{j,i}$ for some constant $C$.
Some examples include $w(Y_i, Y_j) = Y_i - Y_j$, where $C = 0$, and $w(Y_i, Y_j) = \mathbf{1}(Y_i \succeq_L Y_j)$, where $C = 1$. Some intrinsic connections among estimators using individual pair data emerge under such contrast function specifications.

Proposition 1 If the contrast function $w$ satisfies anti-symmetry, i.e., $W_{i,j} = C - W_{j,i}$ for some constant $C$, then, for $a = 0, 1$,
(i) $\gamma^{\mathrm{adj}}_{\mathrm{I}}(a,1-a) = \gamma^{\mathrm{adj}}_{\mathrm{I}}(1-a,a)$;
(ii) $V\{\epsilon_i(a,1-a)\} = V\{\epsilon_i(1-a,a)\}$, $V\{r^{\mathrm{adj}}_{\mathrm{I},i}(a,1-a)\} = V\{r^{\mathrm{adj}}_{\mathrm{I},i}(1-a,a)\}$, and $V\{r^{\mathrm{acv}}_{\mathrm{I},i}(a,1-a)\} = V\{r^{\mathrm{acv}}_{\mathrm{I},i}(1-a,a)\}$; and
(iii) $CV(\epsilon_i) = -V\{\epsilon_i(a,1-a)\} = -V\{\epsilon_i(1-a,a)\}$, $CV(r^{\mathrm{adj}}_{\mathrm{I},i}) = -V\{r^{\mathrm{adj}}_{\mathrm{I},i}(a,1-a)\} = -V\{r^{\mathrm{adj}}_{\mathrm{I},i}(1-a,a)\}$, and $CV(r^{\mathrm{acv}}_{\mathrm{I},i}) = -V\{r^{\mathrm{acv}}_{\mathrm{I},i}(a,1-a)\} = -V\{r^{\mathrm{acv}}_{\mathrm{I},i}(1-a,a)\}$.

Proposition 1 states that if the contrast function satisfies anti-symmetry, then $\gamma^{\mathrm{adj}}_{\mathrm{I}}(a,1-a) = \gamma^{\mathrm{adj}}_{\mathrm{I}}(1-a,a)$. For variances, $\hat\lambda_{\mathrm{I}}(a,1-a)$ and $\hat\lambda_{\mathrm{I}}(1-a,a)$ have the same leading term of the Neyman variances, and the same property holds for $\hat\lambda^{\mathrm{adj}}_{\mathrm{I}}(a,1-a)$ and $\hat\lambda^{\mathrm{adj}}_{\mathrm{I}}(1-a,a)$, as well as $\hat\lambda^{\mathrm{acv}}_{\mathrm{I}}(a,1-a)$ and $\hat\lambda^{\mathrm{acv}}_{\mathrm{I}}(1-a,a)$. The leading term of the covariance between $\hat\lambda_{\mathrm{I}}(a,1-a)$ and $\hat\lambda_{\mathrm{I}}(1-a,a)$ is the negative of the leading term of the variance, and the same property holds for the covariance between $\hat\lambda^{\mathrm{adj}}_{\mathrm{I}}(a,1-a)$ and $\hat\lambda^{\mathrm{adj}}_{\mathrm{I}}(1-a,a)$, and the one between $\hat\lambda^{\mathrm{acv}}_{\mathrm{I}}(a,1-a)$ and $\hat\lambda^{\mathrm{acv}}_{\mathrm{I}}(1-a,a)$.

3.3 Regression estimators using per-unit pair averages

Next, for regression estimators using per-unit pair averages, we focus the discussion on $\hat\lambda^{\mathrm{adj}}_{\mathrm{A}}(a,1-a)$ and $\hat\lambda^{\mathrm{acv}}_{\mathrm{A}}(a,1-a)$, since $\hat\lambda_{\mathrm{A}}(a,1-a)$ and $\hat\lambda_{\mathrm{I}}(a,1-a)$ are algebraically equivalent. We first state the following regularity conditions tailored for per-unit pair averages.
Condition 4 $N^{-1}\sum_{i=1}^{N} W_{i,\cdot}(a,1-a)^4 = O(1)$ for $a = 0, 1$. $N^{-1}\sum_{i=1}^{N}\|X_{i,\cdot}\|_\infty^4 = O(1)$ and $\max_i \|X_{i,\cdot}\|_\infty^2 = O(1)$.

Condition 5 The limit of $N^{-1}\sum_{i=1}^{N} X_{i,\cdot}X_{i,\cdot}^\top$ is positive definite, and the limit of $N^{-1}\sum_{i=1}^{N} X_{i,\cdot}W_{i,\cdot}$ is finite.

Let $\{\hat\gamma^{\mathrm{adj}}_{\mathrm{A},1}(1,0), \hat\gamma^{\mathrm{adj}}_{\mathrm{A},1}(0,1)\}$ and $\{\hat\gamma^{\mathrm{adj}}_{\mathrm{A},2}(1,0), \hat\gamma^{\mathrm{adj}}_{\mathrm{A},2}(0,1)\}$ denote the coefficients of $A_i X^{\mathrm{A}}_{i,\cdot}$ and $(1-A_i)X^{\mathrm{A}}_{i,\cdot}$, and of $(1-A_i)X^{\mathrm{A}}_{\cdot,i}$ and $A_i X^{\mathrm{A}}_{\cdot,i}$, from the OLS fit of the models in (10), respectively. Let $\gamma^{\mathrm{adj}}_{\mathrm{A},1}(a,1-a)$ denote the coefficient from the theoretical OLS fit of $\epsilon_{i,\cdot}(a,1-a)$ on $X_{i,\cdot}$, and $\gamma^{\mathrm{adj}}_{\mathrm{A},2}(a,1-a)$ the coefficient from the theoretical OLS fit of $\epsilon_{\cdot,i}(a,1-a)$ on $X_{\cdot,i}$, which are the finite-population probability limits of $\hat\gamma^{\mathrm{adj}}_{\mathrm{A},1}(a,1-a)$ and $\hat\gamma^{\mathrm{adj}}_{\mathrm{A},2}(a,1-a)$, respectively. For $o = 1, 2$, define $r^{\mathrm{adj}}_{\mathrm{A},o,i,\cdot}(a,1-a) = \epsilon_{i,\cdot}(a,1-a) - X_{i,\cdot}^\top\gamma^{\mathrm{adj}}_{\mathrm{A},o}(a,1-a)$ and $r^{\mathrm{adj}}_{\mathrm{A},o,\cdot,i}(a,1-a) = \epsilon_{\cdot,i}(a,1-a) - X_{\cdot,i}^\top\gamma^{\mathrm{adj}}_{\mathrm{A},o}(a,1-a)$. Let $r^{\mathrm{adj}}_{\mathrm{A},o,i}(a,1-a) = \{r^{\mathrm{adj}}_{\mathrm{A},o,i,\cdot}(a,1-a), r^{\mathrm{adj}}_{\mathrm{A},o,\cdot,i}(a,1-a)\}$. We have the following result for $\hat\lambda^{\mathrm{adj}}_{\mathrm{A},o}(a,1-a)$ for $o = 1, 2$.

Theorem 4 Under complete randomization and Conditions 1, 4 and 5, for $o = 1, 2$, $\hat\lambda^{\mathrm{adj}}_{\mathrm{A},o}(a,1-a) = \lambda(a,1-a) + o_P(1)$; if further $V\{r^{\mathrm{adj}}_{\mathrm{A},o,i}(a,1-a)\} \not\to 0$, then $N^{1/2}\{\hat\lambda^{\mathrm{adj}}_{\mathrm{A},o}(a,1-a) - \lambda(a,1-a)\}/\sigma^{\mathrm{adj}}_{\mathrm{A},o}(a,1-a) \xrightarrow{d} N(0,1)$ with $\sigma^{\mathrm{adj}}_{\mathrm{A},o}(a,1-a) = V\{r^{\mathrm{adj}}_{\mathrm{A},o,i}(a,1-a)\} + O(N^{-1})$. For the covariance, if $CV(r^{\mathrm{adj}}_{\mathrm{A},o,i}) \not\to 0$, then $\nu^{\mathrm{adj}}_{\mathrm{A},o} = N\,\mathrm{Cov}\{\hat\lambda^{\mathrm{adj}}_{\mathrm{A},o}(a,1-a), \hat\lambda^{\mathrm{adj}}_{\mathrm{A},o}(1-a,a)\} = CV(r^{\mathrm{adj}}_{\mathrm{A},o,i}) + O(N^{-1})$.

Theorem 4 presents the consistency and asymptotic normality of $\hat\lambda^{\mathrm{adj}}_{\mathrm{A},o}(a,1-a)$ for $o = 1, 2$, under arbitrary model misspecification.
Similar to the regression estimators using individual pairs, the form of $V\{r^{\mathrm{adj}}_{\mathrm{A},o,i}(a,1-a)\}$ suggests that $\hat\lambda^{\mathrm{adj}}_{\mathrm{A},o}(a,1-a)$ may also be less efficient than $\hat\lambda_{\mathrm{I}}(a,1-a)$. For $V\{r^{\mathrm{adj}}_{\mathrm{A},1,i}(a,1-a)\}$, $\gamma^{\mathrm{adj}}_{\mathrm{A},1}(a,1-a)$ is the coefficient of $X_{i,\cdot}$ from the OLS fit of $\epsilon_{i,\cdot}(a,1-a)$ on $X_{i,\cdot}$, but not the coefficient of $X_{\cdot,i}$ from the OLS fit of $\epsilon_{\cdot,i}(a,1-a)$ on $X_{\cdot,i}$; the correct coefficient for the latter fit is $\gamma^{\mathrm{adj}}_{\mathrm{A},2}(a,1-a)$. The reverse holds for $V\{r^{\mathrm{adj}}_{\mathrm{A},2,i}(a,1-a)\}$.

An intuitive interpretation of $V\{r^{\mathrm{adj}}_{\mathrm{A},1,i}(a,1-a)\}$ is that it is the Neyman variance of the ANCOVA estimator in a hypothetical setting of estimating the ATE with potential outcomes $W_{i,\cdot}(a,1-a)$ under treatment $a$ and $W_{\cdot,i}(a,1-a)$ under $1-a$. Therefore, by Lin (2013), the relative efficiency ordering between $\hat\lambda^{\mathrm{adj}}_{\mathrm{A},1}(a,1-a)$ and $\hat\lambda_{\mathrm{A}}(a,1-a)$ is not definitive and can depend on the data-generating process. Specifically, let $W_{\mathcal{S}}(a,1-a) = |\mathcal{S}(a,1-a)|^{-1}\sum_{i\neq j}\mathbf{1}_i(a)\mathbf{1}_j(1-a)W_{i,j}$ and $X_{\mathcal{S}}(a,1-a) = |\mathcal{S}(a,1-a)|^{-1}\sum_{i\neq j}\mathbf{1}_i(a)\mathbf{1}_j(1-a)X_{i,j}$. By the property of the OLS, we have $\hat\lambda^{\mathrm{adj}}_{\mathrm{A},o}(a,1-a) = W_{\mathcal{S}}(a,1-a) - X_{\mathcal{S}}^\top(a,1-a)\hat\gamma^{\mathrm{adj}}_{\mathrm{A},o}(a,1-a)$ for $o = 1, 2$. Under regularity conditions, it is shown in the Supplementary Materials that $\hat\lambda^{\mathrm{adj}}_{\mathrm{A},o}(a,1-a) - \lambda(a,1-a) = W_{\mathcal{S}}(a,1-a) - W(a,1-a) - X_{\mathcal{S}}^\top(a,1-a)\gamma^{\mathrm{adj}}_{\mathrm{A},o}(a,1-a) + o_P(N^{-1})$.
Applying the Hoeffding decomposition to $W_{\mathcal{S}}(a,1-a) - W(a,1-a) - X_{\mathcal{S}}^\top(a,1-a)\gamma^{\mathrm{adj}}_{\mathrm{A},o}(a,1-a)$ yields the individual summand in the leading term of the Bahadur representation as $\mathbf{1}_i(a)\,r^{\mathrm{adj}}_{\mathrm{A},o,i,\cdot}(a,1-a) + \mathbf{1}_i(1-a)\,r^{\mathrm{adj}}_{\mathrm{A},o,\cdot,i}(a,1-a)$, where $r^{\mathrm{adj}}_{\mathrm{A},o,i,\cdot}(a,1-a)$ and $r^{\mathrm{adj}}_{\mathrm{A},o,\cdot,i}(a,1-a)$ share the same regression coefficient $\gamma^{\mathrm{adj}}_{\mathrm{A},o}(a,1-a)$, analogous to an ANCOVA model. The theory of randomized sampling yields the leading term in the Neyman variance, which always contains a term with the "incorrect" regression coefficient, as mentioned previously. This result indirectly suggests that one should adopt the approach in Lin (2013) for estimands under linear contrast functions.

Finally, let $\gamma^{\mathrm{acv}}_{\mathrm{A},o} = \pi_a\gamma^{\mathrm{adj}}_{\mathrm{A},o}(a,1-a) + \pi_{1-a}\gamma^{\mathrm{adj}}_{\mathrm{A},o}(1-a,a)$ for $o = 1, 2$. Define $r^{\mathrm{acv}}_{\mathrm{A},o,i,\cdot}(a,1-a) = \epsilon_{i,\cdot}(a,1-a) - X_{i,\cdot}^\top\gamma^{\mathrm{acv}}_{\mathrm{A},o}$ and $r^{\mathrm{acv}}_{\mathrm{A},o,\cdot,i}(a,1-a) = \epsilon_{\cdot,i}(a,1-a) - X_{\cdot,i}^\top\gamma^{\mathrm{acv}}_{\mathrm{A},o}$. Let $r^{\mathrm{acv}}_{\mathrm{A},o,i}(a,1-a) = \{r^{\mathrm{acv}}_{\mathrm{A},o,i,\cdot}(a,1-a), r^{\mathrm{acv}}_{\mathrm{A},o,\cdot,i}(a,1-a)\}$. We have the following theorem for $\hat\lambda^{\mathrm{acv}}_{\mathrm{A},o}(a,1-a)$ for $o = 1, 2$.

Theorem 5 Under complete randomization and Conditions 1, 4 and 5, for $o = 1, 2$, $\hat\lambda^{\mathrm{acv}}_{\mathrm{A},o}(a,1-a) = \lambda(a,1-a) + o_P(1)$; if further $V\{r^{\mathrm{acv}}_{\mathrm{A},o,i}(a,1-a)\} \not\to 0$, then $N^{1/2}\{\hat\lambda^{\mathrm{acv}}_{\mathrm{A},o}(a,1-a) - \lambda(a,1-a)\}/\sigma^{\mathrm{acv}}_{\mathrm{A},o}(a,1-a) \xrightarrow{d} N(0,1)$ with $\sigma^{\mathrm{acv}}_{\mathrm{A},o}(a,1-a) = V\{r^{\mathrm{acv}}_{\mathrm{A},o,i}(a,1-a)\} + O(N^{-1})$. For the covariance, if $CV(r^{\mathrm{acv}}_{\mathrm{A},o,i}) \not\to 0$, then $\nu^{\mathrm{acv}}_{\mathrm{A},o} = N\,\mathrm{Cov}\{\hat\lambda^{\mathrm{acv}}_{\mathrm{A},o}(a,1-a), \hat\lambda^{\mathrm{acv}}_{\mathrm{A},o}(1-a,a)\} = CV(r^{\mathrm{acv}}_{\mathrm{A},o,i}) + O(N^{-1})$.

Theorem 5 presents the consistency and asymptotic normality of $\hat\lambda^{\mathrm{acv}}_{\mathrm{A},o}(a,1-a)$ under arbitrary model misspecification.
Similar to $\hat\lambda^{\mathrm{adj}}_{\mathrm{A},o}(a,1-a)$, $\hat\lambda^{\mathrm{acv}}_{\mathrm{A},o}(a,1-a)$ may be less efficient than $\hat\lambda_{\mathrm{I}}(a,1-a)$. Following Proposition 1, we also have the following proposition regarding estimators using per-unit pair averages under contrast functions satisfying anti-symmetry.

Proposition 2 If the contrast function $w$ satisfies anti-symmetry, then, for $a = 0, 1$ and $o = 1, 2$, in general,
(i) $\gamma^{\mathrm{adj}}_{\mathrm{A},o}(a,1-a) \neq \gamma^{\mathrm{adj}}_{\mathrm{A},o}(1-a,a)$, but $\gamma^{\mathrm{adj}}_{\mathrm{A},1}(a,1-a) = \gamma^{\mathrm{adj}}_{\mathrm{A},2}(1-a,a)$, which implies that $V\{r^{\mathrm{adj}}_{\mathrm{A},1,i}(a,1-a)\} = V\{r^{\mathrm{adj}}_{\mathrm{A},2,i}(1-a,a)\}$ and $CV(r^{\mathrm{adj}}_{\mathrm{A},1,i}) = CV(r^{\mathrm{adj}}_{\mathrm{A},2,i})$;
(ii) $\gamma^{\mathrm{acv}}_{\mathrm{A},1} = \pi_a\gamma^{\mathrm{adj}}_{\mathrm{A},1}(a,1-a) + \pi_{1-a}\gamma^{\mathrm{adj}}_{\mathrm{A},1}(1-a,a) = \pi_a\gamma^{\mathrm{adj}}_{\mathrm{A},2}(1-a,a) + \pi_{1-a}\gamma^{\mathrm{adj}}_{\mathrm{A},2}(a,1-a) \neq \gamma^{\mathrm{acv}}_{\mathrm{A},2}$, unless $\pi_0 = \pi_1 = 1/2$, and thus, $V\{r^{\mathrm{acv}}_{\mathrm{A},1,i}(a,1-a)\} = V\{r^{\mathrm{acv}}_{\mathrm{A},2,i}(1-a,a)\}$ and $CV(r^{\mathrm{acv}}_{\mathrm{A},1,i}) = CV(r^{\mathrm{acv}}_{\mathrm{A},2,i})$ if $\pi_0 = \pi_1 = 1/2$.

Proposition 2 states that the asymptotic equivalence between $\hat\lambda^{\mathrm{adj}}_{\mathrm{I}}(a,1-a)$ and $\hat\lambda^{\mathrm{acv}}_{\mathrm{I}}(a,1-a)$ under contrast functions satisfying anti-symmetry does not extend to $\hat\lambda^{\mathrm{adj}}_{\mathrm{A},o}(a,1-a)$ and $\hat\lambda^{\mathrm{acv}}_{\mathrm{A},o}(a,1-a)$. However, the leading terms of the Neyman variances of $\hat\lambda^{\mathrm{adj}}_{\mathrm{A},1}(a,1-a)$ and $\hat\lambda^{\mathrm{adj}}_{\mathrm{A},2}(1-a,a)$ are equal. The results for covariate-adjusted estimators, in general, do not extend to ANCOVA estimators unless $\pi_0 = \pi_1 = 1/2$.

3.4 Estimating the causal net benefit

3.4.1 Estimators from previous regression models

The causal net benefit is defined as $\tau(a) = \lambda(a,1-a) - \lambda(1-a,a)$, with $\tau(1-a) = -\tau(a)$ for $a = 0, 1$. The natural estimator for $\tau(a)$ is $\hat\tau(a) = \hat\lambda(a,1-a) - \hat\lambda(1-a,a)$.
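As an illustration of the plug-in estimator just defined, the following NumPy sketch computes the unadjusted $\hat\lambda(a,1-a)$ as the average of $W_{i,j}$ over pairs with unit $i$ in arm $a$ and unit $j$ in arm $1-a$, and then forms $\hat\tau(a)$. The function names, the simulated outcomes, and the choice $w(Y_i, Y_j) = \mathbf{1}(Y_i > Y_j)$ are assumptions made for illustration only.

```python
import numpy as np

def pairwise_contrast(y, w):
    """Return the N x N matrix with entries W[i, j] = w(y[i], y[j])."""
    return w(y[:, None], y[None, :])

def lambda_hat(W, A, a):
    """Average of W[i, j] over pairs with A[i] == a and A[j] == 1 - a."""
    rows = (A == a)
    cols = (A == 1 - a)
    # The two arms are disjoint, so i != j holds automatically.
    return W[np.ix_(rows, cols)].mean()

def tau_hat(W, A, a=1):
    """Plug-in net benefit: lambda_hat(a, 1 - a) - lambda_hat(1 - a, a)."""
    return lambda_hat(W, A, a) - lambda_hat(W, A, 1 - a)

# Hypothetical data: complete randomization with equal arm sizes.
rng = np.random.default_rng(0)
N = 200
A = rng.permutation(np.repeat([0, 1], N // 2))
y = rng.normal(size=N) + 0.4 * A          # illustrative observed outcomes
W = pairwise_contrast(y, lambda yi, yj: (yi > yj).astype(float))

lam10 = lambda_hat(W, A, 1)
lam01 = lambda_hat(W, A, 0)
tau1 = tau_hat(W, A, 1)
tau0 = tau_hat(W, A, 0)
```

With a continuous outcome there are no ties, so this contrast is anti-symmetric with $C = 1$; the two estimates then sum to one, and `tau0` equals `-tau1` by construction.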
Therefore, the previously proposed estimators for $\lambda(a,1-a)$ can be straightforwardly applied to the estimation of $\tau(a)$, and their consistency and asymptotic normality under arbitrary model misspecification also extend. We have the following corollary.

Corollary 1 Under Conditions 1-3,
(i) $\hat\tau_{\mathrm{I}}(a) = \hat\lambda_{\mathrm{I}}(a,1-a) - \hat\lambda_{\mathrm{I}}(1-a,a) = \tau(a) + o_P(1)$, and if further $V(\epsilon_{\mathrm{I},i}) \not\to 0$, then $N^{1/2}\{\hat\tau_{\mathrm{I}}(a) - \tau(a)\}/\sigma_{\tau,\mathrm{I}} \xrightarrow{d} N(0,1)$ with $\sigma_{\tau,\mathrm{I}} = V(\epsilon_{\mathrm{I},i}) + O(N^{-1})$;
(ii) $\hat\tau^{\mathrm{acv}}_{\mathrm{I}}(a) = \hat\lambda^{\mathrm{acv}}_{\mathrm{I}}(a,1-a) - \hat\lambda^{\mathrm{acv}}_{\mathrm{I}}(1-a,a) = \tau(a) + o_P(1)$, and if further $V(r^{\mathrm{acv}}_{\mathrm{I},i}) \not\to 0$, then $N^{1/2}\{\hat\tau^{\mathrm{acv}}_{\mathrm{I}}(a) - \tau(a)\}/\sigma^{\mathrm{acv}}_{\tau,\mathrm{I}} \xrightarrow{d} N(0,1)$ with $\sigma^{\mathrm{acv}}_{\tau,\mathrm{I}} = V(r^{\mathrm{acv}}_{\mathrm{I},i}) + O(N^{-1})$;
(iii) $\hat\tau^{\mathrm{adj}}_{\mathrm{I}}(a) = \hat\lambda^{\mathrm{adj}}_{\mathrm{I}}(a,1-a) - \hat\lambda^{\mathrm{adj}}_{\mathrm{I}}(1-a,a) = \tau(a) + o_P(1)$, and if further $V(r^{\mathrm{adj}}_{\mathrm{I},i}) \not\to 0$, then $N^{1/2}\{\hat\tau^{\mathrm{adj}}_{\mathrm{I}}(a) - \tau(a)\}/\sigma^{\mathrm{adj}}_{\tau,\mathrm{I}} \xrightarrow{d} N(0,1)$ with $\sigma^{\mathrm{adj}}_{\tau,\mathrm{I}} = V(r^{\mathrm{adj}}_{\mathrm{I},i}) + O(N^{-1})$.
Under Conditions 1, 4 and 5, for $o = 1, 2$,
(i) $\hat\tau^{\mathrm{acv}}_{\mathrm{A},o}(a) = \hat\lambda^{\mathrm{acv}}_{\mathrm{A},o}(a,1-a) - \hat\lambda^{\mathrm{acv}}_{\mathrm{A},o}(1-a,a) = \tau(a) + o_P(1)$, and if further $V(r^{\mathrm{acv}}_{\mathrm{A},o,i}) \not\to 0$, then $N^{1/2}\{\hat\tau^{\mathrm{acv}}_{\mathrm{A},o}(a) - \tau(a)\}/\sigma^{\mathrm{acv}}_{\tau,\mathrm{A},o} \xrightarrow{d} N(0,1)$ with $\sigma^{\mathrm{acv}}_{\tau,\mathrm{A},o} = V(r^{\mathrm{acv}}_{\mathrm{A},o,i}) + O(N^{-1})$;
(ii) $\hat\tau^{\mathrm{adj}}_{\mathrm{A},o}(a) = \hat\lambda^{\mathrm{adj}}_{\mathrm{A},o}(a,1-a) - \hat\lambda^{\mathrm{adj}}_{\mathrm{A},o}(1-a,a) = \tau(a) + o_P(1)$, and if further $V(r^{\mathrm{adj}}_{\mathrm{A},o,i}) \not\to 0$, then $N^{1/2}\{\hat\tau^{\mathrm{adj}}_{\mathrm{A},o}(a) - \tau(a)\}/\sigma^{\mathrm{adj}}_{\tau,\mathrm{A},o} \xrightarrow{d} N(0,1)$ with $\sigma^{\mathrm{adj}}_{\tau,\mathrm{A},o} = V(r^{\mathrm{adj}}_{\mathrm{A},o,i}) + O(N^{-1})$.

Following Propositions 1 and 2, we have the following proposition regarding estimating $\tau(a)$ under contrast functions satisfying anti-symmetry.
Proposition 3 If the contrast function $w$ satisfies anti-symmetry, then, for $a = 0, 1$,
(i) $V(\epsilon_{\mathrm{I},i}) = 4V\{\epsilon_{\mathrm{I},i}(a,1-a)\} = 4V\{\epsilon_{\mathrm{I},i}(1-a,a)\}$, $V(r^{\mathrm{adj}}_{\mathrm{I},i}) = 4V\{r^{\mathrm{adj}}_{\mathrm{I},i}(a,1-a)\} = 4V\{r^{\mathrm{adj}}_{\mathrm{I},i}(1-a,a)\}$, and $V(r^{\mathrm{acv}}_{\mathrm{I},i}) = 4V\{r^{\mathrm{acv}}_{\mathrm{I},i}(a,1-a)\} = 4V\{r^{\mathrm{acv}}_{\mathrm{I},i}(1-a,a)\}$; and
(ii) $V(r^{\mathrm{adj}}_{\mathrm{A},1,i}) = V(r^{\mathrm{adj}}_{\mathrm{A},2,i})$.

Proposition 3 states that, under contrast functions satisfying anti-symmetry, we have $V(\epsilon_{\mathrm{I},i}) = 4V\{\epsilon_{\mathrm{I},i}(a,1-a)\}$ because, by Proposition 1, $CV(\epsilon_{\mathrm{I},i}) = -V\{\epsilon_{\mathrm{I},i}(a,1-a)\} = -V\{\epsilon_{\mathrm{I},i}(1-a,a)\}$; the same holds for $V(r^{\mathrm{adj}}_{\mathrm{I},i})$ and $V(r^{\mathrm{acv}}_{\mathrm{I},i})$. For $\hat\tau^{\mathrm{adj}}_{\mathrm{A},1}(a)$ and $\hat\tau^{\mathrm{adj}}_{\mathrm{A},2}(a)$, $V(r^{\mathrm{adj}}_{\mathrm{A},1,i}) = V(r^{\mathrm{adj}}_{\mathrm{A},2,i})$, because $V\{r^{\mathrm{adj}}_{\mathrm{A},1,i}(a,1-a)\} = V\{r^{\mathrm{adj}}_{\mathrm{A},2,i}(1-a,a)\}$ and $CV(r^{\mathrm{adj}}_{\mathrm{A},1,i}) = CV(r^{\mathrm{adj}}_{\mathrm{A},2,i})$ by Proposition 2.

3.4.2 Estimators from the probabilistic index models

An alternative approach for directly estimating the causal net benefit is based on PIMs (Thas et al., 2012; Scheidegger et al., 2025). Although PIMs have been previously proposed, their design-based properties under complete randomization have not been formally characterized since their introduction. Under randomization, we prove that the coefficient from a linear PIM with individual pair data is always consistent for the causal net benefit estimand, even when the working PIM is arbitrarily misspecified. This result affirms that the linear PIM estimator is model-assisted in the same sense as the regression estimators studied above. To formalize this connection, define $D_{i,j} = A_i - A_j$, and consider the regression models

$$
W_{i,j} \sim Z_{\mathrm{P},i,j}^{\top} \equiv D_{i,j}, \qquad
W_{i,j} \sim Z_{\mathrm{P},i,j}^{\mathrm{acv}\,\top} \equiv (D_{i,j}, X_{i,j}^\top), \qquad
W_{i,j} \sim Z_{\mathrm{P},i,j}^{\mathrm{int}\,\top} \equiv (D_{i,j}, D_{i,j}X_{i,j}^\top), \qquad
W_{i,j} \sim Z_{\mathrm{P},i,j}^{\mathrm{adj}\,\top} \equiv (D_{i,j}, X_{i,j}^\top, D_{i,j}X_{i,j}^\top).
$$
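Let $\hat\beta_{\mathrm{P}}$, $\hat\beta^{\mathrm{acv}}_{\mathrm{P}}$, $\hat\beta^{\mathrm{int}}_{\mathrm{P}}$, and $\hat\beta^{\mathrm{adj}}_{\mathrm{P}}$ be the respective coefficients of $D_{i,j}$ from OLS fits of the above models. Then, the PIM estimators for $\tau(1)$ are $\hat\tau_{\mathrm{P}}(1) = 2\hat\beta_{\mathrm{P}}$, $\hat\tau^{\mathrm{acv}}_{\mathrm{P}}(1) = 2\hat\beta^{\mathrm{acv}}_{\mathrm{P}}$, $\hat\tau^{\mathrm{int}}_{\mathrm{P}}(1) = 2\hat\beta^{\mathrm{int}}_{\mathrm{P}}$, and $\hat\tau^{\mathrm{adj}}_{\mathrm{P}}(1) = 2\hat\beta^{\mathrm{adj}}_{\mathrm{P}}$. We then obtain the following results.

Theorem 6 Under complete randomization and Conditions 1-3,
(i) $\hat\tau_{\mathrm{P}}(a) = \tau(a) + o_P(1)$; if further $V(\epsilon_{\mathrm{I},i}) \not\to 0$, then $N^{1/2}\{\hat\tau_{\mathrm{P}}(a) - \tau(a)\}/\sigma_{\tau,\mathrm{I}} \xrightarrow{d} N(0,1)$ with $\sigma_{\tau,\mathrm{I}} = V(\epsilon_{\mathrm{I},i}) + O(N^{-1})$.
(ii) $\hat\tau^{\mathrm{acv}}_{\mathrm{P}}(a) = \tau(a) + o_P(1)$; if further $V(r^{\mathrm{acv}}_{\mathrm{I},i}) \not\to 0$, then $N^{1/2}\{\hat\tau^{\mathrm{acv}}_{\mathrm{P}}(a) - \tau(a)\}/\sigma^{\mathrm{acv}}_{\tau,\mathrm{I}} \xrightarrow{d} N(0,1)$ with $\sigma^{\mathrm{acv}}_{\tau,\mathrm{I}} = V(r^{\mathrm{acv}}_{\mathrm{I},i}) + O(N^{-1})$.
(iii) $\hat\tau^{\mathrm{int}}_{\mathrm{P}}(a) = \tau(a) + o_P(1)$; if further $V(\epsilon_{\mathrm{I},i}) \not\to 0$, then $N^{1/2}\{\hat\tau^{\mathrm{int}}_{\mathrm{P}}(a) - \tau(a)\}/\sigma_{\tau,\mathrm{I}} \xrightarrow{d} N(0,1)$ with $\sigma_{\tau,\mathrm{I}} = V(\epsilon_{\mathrm{I},i}) + O(N^{-1})$.
(iv) $\hat\tau^{\mathrm{adj}}_{\mathrm{P}}(a) = \tau(a) + o_P(1)$; if further $V(r^{\mathrm{acv}}_{\mathrm{I},i}) \not\to 0$, then $N^{1/2}\{\hat\tau^{\mathrm{adj}}_{\mathrm{P}}(a) - \tau(a)\}/\sigma^{\mathrm{acv}}_{\tau,\mathrm{I}} \xrightarrow{d} N(0,1)$ with $\sigma^{\mathrm{acv}}_{\tau,\mathrm{I}} = V(r^{\mathrm{acv}}_{\mathrm{I},i}) + O(N^{-1})$.

Theorem 6 summarizes the consistency and asymptotic normality of linear PIM estimators for $\tau(a)$. The results first show that $\hat\tau_{\mathrm{P}}(a)$ and $\hat\tau_{\mathrm{I}}(a)$ are equivalent. Second, $\hat\tau_{\mathrm{P}}(a)$ and $\hat\tau^{\mathrm{int}}_{\mathrm{P}}(a)$ are asymptotically equivalent, and $\hat\tau^{\mathrm{acv}}_{\mathrm{P}}(a)$ and $\hat\tau^{\mathrm{adj}}_{\mathrm{P}}(a)$ are also asymptotically equivalent. That is, adjusting for $D_{i,j}X_{i,j}$ is not meaningful asymptotically. The intuition is that $D_{i,j}X_{i,j}$ forces the coefficient of this term to take opposite signs for groups with $D_{i,j} = 1$ and $D_{i,j} = -1$, which, coupled with $X_{i,\cdot} = -X_{\cdot,i}$, cancels out the related terms in the leading terms of the Neyman variances. Lastly, both $\hat\tau^{\mathrm{acv}}_{\mathrm{P}}(a)$ and $\hat\tau^{\mathrm{adj}}_{\mathrm{P}}(a)$ are asymptotically equivalent to $\hat\tau^{\mathrm{acv}}_{\mathrm{I}}(a)$.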
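The exact equivalence between $\hat\tau_{\mathrm{P}}(a)$ and $\hat\tau_{\mathrm{I}}(a)$ can be checked numerically. The sketch below is a minimal illustration under two assumptions that the text above does not spell out: the unadjusted PIM is fit by OLS over all ordered pairs $i \neq j$, and the fit has no intercept. Adding an intercept leaves the slope unchanged, because $D_{i,j}$ sums to zero over all ordered pairs. The simulated outcomes and variable names are hypothetical.

```python
import numpy as np

# Hypothetical data: complete randomization, continuous outcome.
rng = np.random.default_rng(1)
N = 60
A = rng.permutation(np.repeat([0, 1], N // 2))
y = rng.normal(size=N) + 0.5 * A

W = (y[:, None] > y[None, :]).astype(float)   # W[i, j] = 1(Y_i > Y_j)
D = A[:, None] - A[None, :]                   # D[i, j] = A_i - A_j

off = ~np.eye(N, dtype=bool)                  # all ordered pairs with i != j
w_vec = W[off]
d_vec = D[off].astype(float)

# No-intercept OLS of W_ij on D_ij (assumed form of the unadjusted PIM).
beta_P = (d_vec @ w_vec) / (d_vec @ d_vec)
tau_P = 2.0 * beta_P

# Same slope when an intercept is included, since D has mean zero.
X_int = np.column_stack([np.ones_like(d_vec), d_vec])
coef_int, *_ = np.linalg.lstsq(X_int, w_vec, rcond=None)

# Plug-in estimator from between-arm pair averages.
treated = (A == 1)
control = (A == 0)
lam10 = W[np.ix_(treated, control)].mean()
lam01 = W[np.ix_(control, treated)].mean()
tau_I = lam10 - lam01
```

Pairs with $D_{i,j} = 0$ contribute nothing to the numerator or denominator of the slope, which is why only between-arm pairs matter in the end.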
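These findings provide a theoretical ground for justifying the linear PIM as a model-assisted estimator for the causal net benefit.

4 Variance estimation for regression estimators

4.1 Estimators using individual pairs

We propose to quantify the uncertainty of the estimators using individual pairs via the complete two-way (CTW) clustering variance estimator, which accounts for the two-way correlation ($i$ and $j$) among individual pairs and the reverse effect. For the CTW estimator, stack $Z_{\mathrm{I},i,j}^\top$, $Z_{\mathrm{I},i,j}^{\mathrm{acv}\,\top}$, and $Z_{\mathrm{I},i,j}^{\mathrm{adj}\,\top}$ defined in (4)-(6) to create the respective design matrices $Z_{\mathrm{I}}$, $Z^{\mathrm{acv}}_{\mathrm{I}}$, and $Z^{\mathrm{adj}}_{\mathrm{I}}$. Denote the residuals from the OLS fit of the models in (4)-(6) by $\hat r_{\mathrm{I},i,j}$, $\hat r^{\mathrm{acv}}_{\mathrm{I},i,j}$, and $\hat r^{\mathrm{adj}}_{\mathrm{I},i,j}$, respectively. Then, using $\hat\lambda^{\mathrm{adj}}_{\mathrm{I}}(1,0)$ as an example, its CTW variance estimator is

$$
\hat{\mathrm{se}}^2_{\mathrm{CTW}}\{\hat\lambda^{\mathrm{adj}}_{\mathrm{I}}(1,0)\} = \left[ \left(Z^{\mathrm{adj}\,\top}_{\mathrm{I}} Z^{\mathrm{adj}}_{\mathrm{I}}\right)^{-1} M^{\mathrm{adj}}_{\mathrm{I},\mathrm{CTW}} \left(Z^{\mathrm{adj}\,\top}_{\mathrm{I}} Z^{\mathrm{adj}}_{\mathrm{I}}\right)^{-1} \right]_{(1,1)}, \tag{11}
$$

where $[\cdot]_{(1,1)}$ denotes the $(1,1)$th element of the matrix inside $[\cdot]$. Similarly, for $\hat{\mathrm{se}}^2_{\mathrm{CTW}}\{\hat\lambda^{\mathrm{adj}}_{\mathrm{I}}(0,1)\}$, we have $[\cdot]_{(2,2)}$, and for $\hat{\mathrm{se}}^2_{\mathrm{CTW}}\{\hat\tau^{\mathrm{adj}}_{\mathrm{I}}(a)\}$, we have $[\cdot]_{(1,1)+(2,2)-2(1,2)}$, which is the sum of the $(1,1)$th and $(2,2)$th elements minus twice the $(1,2)$th element. The middle matrix is

$$
\begin{aligned}
M^{\mathrm{adj}}_{\mathrm{I},\mathrm{CTW}} ={}& \sum_{i=1}^{N} \Big( \sum_{j:j\neq i} Z^{\mathrm{adj}}_{\mathrm{I},i,j}\hat r^{\mathrm{adj}}_{\mathrm{I},i,j} \Big) \Big( \sum_{j:j\neq i} Z^{\mathrm{adj}}_{\mathrm{I},i,j}\hat r^{\mathrm{adj}}_{\mathrm{I},i,j} \Big)^{\top}
+ \sum_{j=1}^{N} \Big( \sum_{i:i\neq j} Z^{\mathrm{adj}}_{\mathrm{I},i,j}\hat r^{\mathrm{adj}}_{\mathrm{I},i,j} \Big) \Big( \sum_{i:i\neq j} Z^{\mathrm{adj}}_{\mathrm{I},i,j}\hat r^{\mathrm{adj}}_{\mathrm{I},i,j} \Big)^{\top} \\
&+ \sum_{i=1}^{N} \Big( \sum_{j:j\neq i} Z^{\mathrm{adj}}_{\mathrm{I},i,j}\hat r^{\mathrm{adj}}_{\mathrm{I},i,j} \Big) \Big( \sum_{j:j\neq i} Z^{\mathrm{adj}}_{\mathrm{I},j,i}\hat r^{\mathrm{adj}}_{\mathrm{I},j,i} \Big)^{\top}
+ \sum_{j=1}^{N} \Big( \sum_{i:i\neq j} Z^{\mathrm{adj}}_{\mathrm{I},i,j}\hat r^{\mathrm{adj}}_{\mathrm{I},i,j} \Big) \Big( \sum_{i:i\neq j} Z^{\mathrm{adj}}_{\mathrm{I},j,i}\hat r^{\mathrm{adj}}_{\mathrm{I},j,i} \Big)^{\top} \\
&- \sum_{i\neq j} \Big\{ Z^{\mathrm{adj}}_{\mathrm{I},i,j} Z^{\mathrm{adj}\,\top}_{\mathrm{I},j,i}\, \hat r^{\mathrm{adj}}_{\mathrm{I},i,j}\hat r^{\mathrm{adj}}_{\mathrm{I},j,i} + Z^{\mathrm{adj}}_{\mathrm{I},i,j} Z^{\mathrm{adj}\,\top}_{\mathrm{I},i,j}\, (\hat r^{\mathrm{adj}}_{\mathrm{I},i,j})^2 \Big\},
\end{aligned} \tag{12}
$$

which consists of the main effects of $i$ and $j$, the reverse effects of $i$ and $j$, and the correction for overcounting. For $\hat\lambda_{\mathrm{I}}(a,1-a)$ and $\hat\lambda^{\mathrm{acv}}_{\mathrm{I}}(a,1-a)$, we obtain their CTW variance estimators by substituting $Z^{\mathrm{adj}}_{\mathrm{I}}$ and $\hat r^{\mathrm{adj}}_{\mathrm{I},i,j}$ in (11) and (12) with the corresponding design matrix and residuals. Theorem 7 states that the CTW variance estimators are consistent for the leading terms of the Neyman variances of estimators using individual pairs. Since the remaining terms of the Neyman variance are of a higher order, the leading term dominates as $N \to \infty$, and thus, the CTW variance estimators are consistent for the Neyman variances.

Theorem 7 Under complete randomization and Conditions 1-3,
$N\hat{\mathrm{se}}^2_{\mathrm{CTW}}\{\hat\lambda_{\mathrm{I}}(a,1-a)\} = V_c\{\epsilon_{\mathrm{I},i}(a,1-a)\} + o_P(1)$, $N\hat{\mathrm{se}}^2_{\mathrm{CTW}}\{\hat\tau_{\mathrm{I}}(a)\} = V_c(\epsilon_{\mathrm{I},i}) + o_P(1)$;
$N\hat{\mathrm{se}}^2_{\mathrm{CTW}}\{\hat\lambda^{\mathrm{acv}}_{\mathrm{I}}(a,1-a)\} = V_c\{r^{\mathrm{acv}}_{\mathrm{I},i}(a,1-a)\} + o_P(1)$, $N\hat{\mathrm{se}}^2_{\mathrm{CTW}}\{\hat\tau^{\mathrm{acv}}_{\mathrm{I}}(a)\} = V_c(r^{\mathrm{acv}}_{\mathrm{I},i}) + o_P(1)$;
$N\hat{\mathrm{se}}^2_{\mathrm{CTW}}\{\hat\lambda^{\mathrm{adj}}_{\mathrm{I}}(a,1-a)\} = V_c\{r^{\mathrm{adj}}_{\mathrm{I},i}(a,1-a)\} + o_P(1)$, $N\hat{\mathrm{se}}^2_{\mathrm{CTW}}\{\hat\tau^{\mathrm{adj}}_{\mathrm{I}}(a)\} = V_c(r^{\mathrm{adj}}_{\mathrm{I},i}) + o_P(1)$.

For PIM estimators, stack $Z_{\mathrm{P},i,j}^\top$, $Z_{\mathrm{P},i,j}^{\mathrm{acv}\,\top}$, $Z_{\mathrm{P},i,j}^{\mathrm{int}\,\top}$, and $Z_{\mathrm{P},i,j}^{\mathrm{adj}\,\top}$ defined in Section 3.4.2 to create the respective design matrices $Z_{\mathrm{P}}$, $Z^{\mathrm{acv}}_{\mathrm{P}}$, $Z^{\mathrm{int}}_{\mathrm{P}}$, and $Z^{\mathrm{adj}}_{\mathrm{P}}$. Denote the residuals from the OLS fit of these models by $\hat r_{\mathrm{P},i,j}$, $\hat r^{\mathrm{acv}}_{\mathrm{P},i,j}$, $\hat r^{\mathrm{int}}_{\mathrm{P},i,j}$, and $\hat r^{\mathrm{adj}}_{\mathrm{P},i,j}$, respectively. Then, the CTW variance estimator for the variance of $\hat\tau^{\mathrm{adj}}_{\mathrm{P}}(a)$ is

$$
\hat{\mathrm{se}}^2_{\mathrm{CTW}}\{\hat\tau^{\mathrm{adj}}_{\mathrm{P}}(a)\} = \left[ \left(Z^{\mathrm{adj}\,\top}_{\mathrm{P}} Z^{\mathrm{adj}}_{\mathrm{P}}\right)^{-1} M^{\mathrm{adj}}_{\mathrm{P},\mathrm{CTW}} \left(Z^{\mathrm{adj}\,\top}_{\mathrm{P}} Z^{\mathrm{adj}}_{\mathrm{P}}\right)^{-1} \right]_{(1,1)}, \tag{13}
$$

where the middle matrix is

$$
\begin{aligned}
M^{\mathrm{adj}}_{\mathrm{P},\mathrm{CTW}} ={}& \sum_{i=1}^{N} \Big( \sum_{j:j\neq i} Z^{\mathrm{adj}}_{\mathrm{P},i,j}\hat r^{\mathrm{adj}}_{\mathrm{P},i,j} \Big) \Big( \sum_{j:j\neq i} Z^{\mathrm{adj}}_{\mathrm{P},i,j}\hat r^{\mathrm{adj}}_{\mathrm{P},i,j} \Big)^{\top}
+ \sum_{j=1}^{N} \Big( \sum_{i:i\neq j} Z^{\mathrm{adj}}_{\mathrm{P},i,j}\hat r^{\mathrm{adj}}_{\mathrm{P},i,j} \Big) \Big( \sum_{i:i\neq j} Z^{\mathrm{adj}}_{\mathrm{P},i,j}\hat r^{\mathrm{adj}}_{\mathrm{P},i,j} \Big)^{\top} \\
&+ \sum_{i=1}^{N} \Big( \sum_{j:j\neq i} Z^{\mathrm{adj}}_{\mathrm{P},i,j}\hat r^{\mathrm{adj}}_{\mathrm{P},i,j} \Big) \Big( \sum_{j:j\neq i} Z^{\mathrm{adj}}_{\mathrm{P},j,i}\hat r^{\mathrm{adj}}_{\mathrm{P},j,i} \Big)^{\top}
+ \sum_{j=1}^{N} \Big( \sum_{i:i\neq j} Z^{\mathrm{adj}}_{\mathrm{P},i,j}\hat r^{\mathrm{adj}}_{\mathrm{P},i,j} \Big) \Big( \sum_{i:i\neq j} Z^{\mathrm{adj}}_{\mathrm{P},j,i}\hat r^{\mathrm{adj}}_{\mathrm{P},j,i} \Big)^{\top} \\
&- \sum_{i\neq j} \Big\{ Z^{\mathrm{adj}}_{\mathrm{P},i,j} Z^{\mathrm{adj}\,\top}_{\mathrm{P},j,i}\, \hat r^{\mathrm{adj}}_{\mathrm{P},i,j}\hat r^{\mathrm{adj}}_{\mathrm{P},j,i} + Z^{\mathrm{adj}}_{\mathrm{P},i,j} Z^{\mathrm{adj}\,\top}_{\mathrm{P},i,j}\, (\hat r^{\mathrm{adj}}_{\mathrm{P},i,j})^2 \Big\}.
\end{aligned} \tag{14}
$$

For $\hat\tau_{\mathrm{P}}(a)$, $\hat\tau^{\mathrm{acv}}_{\mathrm{P}}(a)$, and $\hat\tau^{\mathrm{int}}_{\mathrm{P}}(a)$, we obtain their CTW variance estimators by substituting $Z^{\mathrm{adj}}_{\mathrm{P}}$ and $\hat r^{\mathrm{adj}}_{\mathrm{P},i,j}$ in (13) and (14) with the corresponding design matrix and residuals. We have the following theorem.

Theorem 8 Under complete randomization and Conditions 1-3, $N\hat{\mathrm{se}}^2_{\mathrm{CTW}}\{\hat\tau_{\mathrm{P}}(a)\} = V_c(\epsilon_{\mathrm{I},i}) + o_P(1)$; $N\hat{\mathrm{se}}^2_{\mathrm{CTW}}\{\hat\tau^{\mathrm{acv}}_{\mathrm{P}}(a)\} = V_c(r^{\mathrm{acv}}_{\mathrm{P},i}) + o_P(1)$; $N\hat{\mathrm{se}}^2_{\mathrm{CTW}}\{\hat\tau^{\mathrm{int}}_{\mathrm{P}}(a)\} = V_c(\epsilon_{\mathrm{I},i}) + o_P(1)$; $N\hat{\mathrm{se}}^2_{\mathrm{CTW}}\{\hat\tau^{\mathrm{adj}}_{\mathrm{P}}(a)\} = V_c(r^{\mathrm{acv}}_{\mathrm{P},i}) + o_P(1)$.

Similar to Theorem 7, Theorem 8 states that the CTW variance estimators are also consistent for the Neyman variances of PIM estimators.

Remark 4 The CTW variance estimator provides an explicit, design-based representation of the generic sandwich variance estimator originally discussed in Thas et al. (2012) for PIMs. However, Thas et al. (2012) developed their variance expressions under a superpopulation, sampling-based framework and did not pursue a finite-population randomization perspective. In contrast, our CTW formulation explicitly characterizes the randomization variance induced by known treatment assignment, accounting for the two-way dependence structure arising from shared units and reverse comparisons. The standard two-way (TW) clustering variance estimator (Cameron et al., 2011) is consistent for the marginal variances of $\hat\lambda(a,1-a)$ and $\hat\lambda(1-a,a)$, but omits their covariance.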
Therefore, it is not consistent for the variances of the PIM estimators. The heteroskedasticity-robust (HR) variance estimator (Huber, 1967) is consistent for neither the marginal variances nor the covariances because it omits the correlation among pairs and the reverse effect. The (one-way) cluster-robust (CR) variance estimator (Liang and Zeger, 1986) is also not consistent for either the marginal variances or the covariances because it omits half of the correlations and the reverse effect. We prove these results in Section S5 of the Supplementary Materials.

4.2 Estimators using per-unit pair averages

The variances and covariances of estimators using per-unit pair averages can also be estimated via the CTW variance estimator. Stack $Z^{\mathrm{acv}\,\top}_{\mathrm{A},i,\cdot}$, $Z^{\mathrm{acv}\,\top}_{\mathrm{A},\cdot,i}$, $Z^{\mathrm{adj}\,\top}_{\mathrm{A},i,\cdot}$, and $Z^{\mathrm{adj}\,\top}_{\mathrm{A},\cdot,i}$ defined in (9) and (10) to create the corresponding design matrices $Z^{\mathrm{acv}}_{\mathrm{A},1}$, $Z^{\mathrm{acv}}_{\mathrm{A},2}$, $Z^{\mathrm{adj}}_{\mathrm{A},1}$, and $Z^{\mathrm{adj}}_{\mathrm{A},2}$. Denote the residuals from the OLS fit of the models in (9) and (10) by $\hat r^{\mathrm{acv}}_{\mathrm{A},1,i,\cdot}$ and $\hat r^{\mathrm{acv}}_{\mathrm{A},2,\cdot,i}$, and $\hat r^{\mathrm{adj}}_{\mathrm{A},1,i,\cdot}$ and $\hat r^{\mathrm{adj}}_{\mathrm{A},2,\cdot,i}$, respectively. The missing residuals, i.e., $\hat r^{\mathrm{acv}}_{\mathrm{A},1,\cdot,i}$, $\hat r^{\mathrm{acv}}_{\mathrm{A},2,i,\cdot}$, $\hat r^{\mathrm{adj}}_{\mathrm{A},1,\cdot,i}$, and $\hat r^{\mathrm{adj}}_{\mathrm{A},2,i,\cdot}$, need to be computed manually, e.g., $\hat r^{\mathrm{adj}}_{\mathrm{A},1,\cdot,i} = W^{\mathrm{A}}_{\cdot,i} - \mathbf{1}_i(a)\hat\lambda^{\mathrm{adj}}_{\mathrm{A},1}(a,1-a) - \mathbf{1}_i(a)X^{\mathrm{A}\,\top}_{\cdot,i}\hat\gamma^{\mathrm{adj}}_{\mathrm{A},1}(a,1-a)$ and $\hat r^{\mathrm{adj}}_{\mathrm{A},2,i,\cdot} = W^{\mathrm{A}}_{i,\cdot} - \mathbf{1}_i(1-a)\hat\lambda^{\mathrm{adj}}_{\mathrm{A},2}(a,1-a) - \mathbf{1}_i(1-a)X^{\mathrm{A}\,\top}_{i,\cdot}\hat\gamma^{\mathrm{adj}}_{\mathrm{A},2}(a,1-a)$.
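Using $\hat\lambda^{\mathrm{adj}}_{\mathrm{A},1}(1,0)$ as an example, its CTW variance estimator is

$$
\begin{aligned}
\hat{\mathrm{se}}^2_{\mathrm{CTW}}\{\hat\lambda^{\mathrm{adj}}_{\mathrm{A},1}(1,0)\} ={}& \left[ \left(Z^{\mathrm{adj}\,\top}_{\mathrm{A},1} Z^{\mathrm{adj}}_{\mathrm{A},1}\right)^{-1} \left\{ \sum_{i=1}^{N} Z^{\mathrm{adj}}_{\mathrm{A},i,\cdot} Z^{\mathrm{adj}\,\top}_{\mathrm{A},i,\cdot} (\hat r^{\mathrm{adj}}_{\mathrm{A},1,i,\cdot})^2 \right\} \left(Z^{\mathrm{adj}\,\top}_{\mathrm{A},1} Z^{\mathrm{adj}}_{\mathrm{A},1}\right)^{-1} \right]_{(1,1)} \\
&+ \left[ \left(Z^{\mathrm{adj}\,\top}_{\mathrm{A},2} Z^{\mathrm{adj}}_{\mathrm{A},2}\right)^{-1} \left\{ \sum_{i=1}^{N} Z^{\mathrm{adj}}_{\mathrm{A},\cdot,i} Z^{\mathrm{adj}\,\top}_{\mathrm{A},\cdot,i} (\hat r^{\mathrm{adj}}_{\mathrm{A},1,\cdot,i})^2 \right\} \left(Z^{\mathrm{adj}\,\top}_{\mathrm{A},2} Z^{\mathrm{adj}}_{\mathrm{A},2}\right)^{-1} \right]_{(1,1)}.
\end{aligned} \tag{15}
$$

Similarly, for $\hat{\mathrm{se}}^2_{\mathrm{CTW}}\{\hat\lambda^{\mathrm{adj}}_{\mathrm{A},1}(0,1)\}$, we have $[\cdot]_{(2,2)}$. The covariance estimator is

$$
\begin{aligned}
\widehat{\mathrm{Cov}}_{\mathrm{CTW}}\{\hat\lambda^{\mathrm{adj}}_{\mathrm{A},1}(1,0), \hat\lambda^{\mathrm{adj}}_{\mathrm{A},1}(0,1)\} ={}& \left[ \left(Z^{\mathrm{adj}\,\top}_{\mathrm{A},1} Z^{\mathrm{adj}}_{\mathrm{A},1}\right)^{-1} \left( \sum_{i=1}^{N} Z^{\mathrm{adj}}_{\mathrm{A},i,\cdot} Z^{\mathrm{adj}\,\top}_{\mathrm{A},\cdot,i}\, \hat r^{\mathrm{adj}}_{\mathrm{A},1,i,\cdot}\hat r^{\mathrm{adj}}_{\mathrm{A},1,\cdot,i} \right) \left(Z^{\mathrm{adj}\,\top}_{\mathrm{A},2} Z^{\mathrm{adj}}_{\mathrm{A},2}\right)^{-1} \right]_{(1,2)} \\
&+ \left[ \left(Z^{\mathrm{adj}\,\top}_{\mathrm{A},2} Z^{\mathrm{adj}}_{\mathrm{A},2}\right)^{-1} \left( \sum_{i=1}^{N} Z^{\mathrm{adj}}_{\mathrm{A},\cdot,i} Z^{\mathrm{adj}\,\top}_{\mathrm{A},i,\cdot}\, \hat r^{\mathrm{adj}}_{\mathrm{A},1,i,\cdot}\hat r^{\mathrm{adj}}_{\mathrm{A},1,\cdot,i} \right) \left(Z^{\mathrm{adj}\,\top}_{\mathrm{A},1} Z^{\mathrm{adj}}_{\mathrm{A},1}\right)^{-1} \right]_{(1,2)}.
\end{aligned} \tag{16}
$$

No correction terms are needed in this case, since the correlations between pairs are implicit. The variance estimator for $\hat\tau^{\mathrm{adj}}_{\mathrm{A},1}(a)$ is $\hat{\mathrm{se}}^2_{\mathrm{CTW}}\{\hat\tau^{\mathrm{adj}}_{\mathrm{A},1}(a)\} = \hat{\mathrm{se}}^2_{\mathrm{CTW}}\{\hat\lambda^{\mathrm{adj}}_{\mathrm{A},1}(1,0)\} + \hat{\mathrm{se}}^2_{\mathrm{CTW}}\{\hat\lambda^{\mathrm{adj}}_{\mathrm{A},1}(0,1)\} - 2\,\widehat{\mathrm{Cov}}_{\mathrm{CTW}}\{\hat\lambda^{\mathrm{adj}}_{\mathrm{A},1}(1,0), \hat\lambda^{\mathrm{adj}}_{\mathrm{A},1}(0,1)\}$. The CTW variance and covariance estimators for $\hat\lambda^{\mathrm{adj}}_{\mathrm{A},2}(a,1-a)$ are obtained following the same procedure. For $\hat\lambda^{\mathrm{acv}}_{\mathrm{A},1}(a,1-a)$ and $\hat\lambda^{\mathrm{acv}}_{\mathrm{A},2}(a,1-a)$, we obtain their CTW variance estimators by substituting $Z^{\mathrm{adj}}_{\mathrm{A},o}$, $\hat r^{\mathrm{adj}}_{\mathrm{A},o,i,\cdot}$, $\hat r^{\mathrm{adj}}_{\mathrm{A},o,\cdot,i}$, and $\hat r^{\mathrm{adj}}_{\mathrm{A},o,i,j}$ in (15) and (16) with $Z^{\mathrm{acv}}_{\mathrm{A},o}$, $\hat r^{\mathrm{acv}}_{\mathrm{A},o,i,\cdot}$, $\hat r^{\mathrm{acv}}_{\mathrm{A},o,\cdot,i}$, and $\hat r^{\mathrm{acv}}_{\mathrm{A},o,i,j}$, for $o = 1, 2$. We have the following result.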
Theorem 9 Under complete randomization and Conditions 1, 4, and 5, for $o = 1, 2$,
$N\hat{\mathrm{se}}^2_{\mathrm{CTW}}\{\hat\lambda^{\mathrm{acv}}_{\mathrm{A},o}(a,1-a)\} = V_c\{r^{\mathrm{acv}}_{\mathrm{A},o,i}(a,1-a)\} + o_P(1)$, $N\hat{\mathrm{se}}^2_{\mathrm{CTW}}\{\hat\tau^{\mathrm{acv}}_{\mathrm{A},o}(a)\} = V_c(r^{\mathrm{acv}}_{\mathrm{A},o,i}) + o_P(1)$;
$N\hat{\mathrm{se}}^2_{\mathrm{CTW}}\{\hat\lambda^{\mathrm{adj}}_{\mathrm{A},o}(a,1-a)\} = V_c\{r^{\mathrm{adj}}_{\mathrm{A},o,i}(a,1-a)\} + o_P(1)$, $N\hat{\mathrm{se}}^2_{\mathrm{CTW}}\{\hat\tau^{\mathrm{adj}}_{\mathrm{A},o}(a)\} = V_c(r^{\mathrm{adj}}_{\mathrm{A},o,i}) + o_P(1)$.

Theorem 9 states that the CTW variance estimators are consistent for the Neyman variances of estimators using per-unit pair averages. To follow up on Remark 4, we show in Section S5 of the Supplementary Materials that the HR variance estimator is not consistent for the marginal variances or the covariances, and the TW variance estimator is not consistent for the covariances. Hence, the CTW variance estimators are recommended as a unified recipe for asymptotically valid inference on the GCE estimand, regardless of the choice of working models.

4.3 Summary and recommendations

We summarize the results and provide recommendations in this section. Table 2 summarizes the theoretical results. Since there is no strict relative efficiency ordering, the recommendations for selecting estimators are necessarily data-dependent. For estimating $\lambda(a,1-a)$, if the covariates are prognostic for the outcome, then adjusting for covariates could increase the estimation efficiency, implying that the ANCOVA and covariate-adjusted estimators are preferred. If the covariates are not prognostic for the outcome, then adjusting for them could hurt the estimation efficiency, suggesting that the unadjusted estimator is preferred. From a computational perspective, if the dataset is large, estimators using individual pairs may be prohibitive because the number of pairs grows as $N(N-1)$. Therefore, estimators using per-unit pair averages are preferred for practical considerations.
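To make the computational cost concrete, the following NumPy sketch evaluates the CTW middle matrix in (12) and the sandwich in (11) for the unadjusted individual-pair regression. It assumes that model (4) regresses $W_{i,j}$ on $(A_i(1-A_j), A_j(1-A_i))$ over all ordered pairs $i \neq j$ without an intercept, as the adjustment column of Table 2 suggests; the function name `ctw_meat`, the array layout, and the simulated data are hypothetical. The five sums in (12) are collapsed using row sums `R` and column sums `C` of the pair-level scores.

```python
import numpy as np

def ctw_meat(Z, r):
    """CTW middle matrix for pair regressors Z (N, N, p) and residuals r (N, N).

    Entries with i == j are ignored.
    """
    N = r.shape[0]
    off = ~np.eye(N, dtype=bool)
    S = Z * (r * off)[:, :, None]      # scores Z_ij * r_ij, zero on the diagonal
    R = S.sum(axis=1)                  # R[i] = sum_j Z_ij r_ij
    C = S.sum(axis=0)                  # C[i] = sum_j Z_ji r_ji
    main = R.T @ R + C.T @ C           # main effects of i and j
    reverse = R.T @ C + C.T @ R        # reverse effects of i and j
    St = S.transpose(1, 0, 2)          # St[i, j] = S[j, i]
    overcount = (np.einsum('ijp,ijq->pq', S, St)
                 + np.einsum('ijp,ijq->pq', S, S))
    return main + reverse - overcount

# Hypothetical data: complete randomization, W_ij = 1(Y_i > Y_j).
rng = np.random.default_rng(3)
N = 40
A = rng.permutation(np.repeat([0, 1], N // 2))
y = rng.normal(size=N) + 0.4 * A
W = (y[:, None] > y[None, :]).astype(float)

# Assumed unadjusted design: Z_ij = (A_i (1 - A_j), A_j (1 - A_i)).
Z = np.stack([np.outer(A, 1 - A), np.outer(1 - A, A)], axis=2).astype(float)
off = ~np.eye(N, dtype=bool)
Zmat = Z[off]                                   # shape (N (N - 1), 2)
beta, *_ = np.linalg.lstsq(Zmat, W[off], rcond=None)
r = W - Z @ beta                                # pair-level residuals

bread = np.linalg.inv(Zmat.T @ Zmat)
M = ctw_meat(Z, r)
V = bread @ M @ bread
se2_lam10 = V[0, 0]                             # [.]_(1,1)
se2_lam01 = V[1, 1]                             # [.]_(2,2)
se2_tau = V[0, 0] + V[1, 1] - 2.0 * V[0, 1]     # [.]_(1,1)+(2,2)-2(1,2)
```

The score array has $N \times N \times p$ entries, which is the memory cost that makes individual-pair estimators expensive for large $N$. Because this contrast is anti-symmetric with $C = 1$ when there are no ties, the sketch returns `se2_tau` equal to four times `se2_lam10`, in line with Proposition 3(i).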
For estimating $\tau(a)$, the PIM estimators are more convenient to implement, and per previous discussions, the interaction term between $D_{i,j}$ and $X_{i,j}$ does not need to be included in the adjustment since there is no asymptotic efficiency gain. As with the estimation of $\lambda(a,1-a)$, covariate adjustments are preferred if the covariates are prognostic of the outcomes, and vice versa.

Table 2: The summary of theoretical results.

| Estimator | Data level | Adjustment | Asy. results | Var. estimator |
|---|---|---|---|---|
| GCE $\lambda(a,1-a)$ | | | | |
| $\hat\lambda_{\mathrm{I}}(a,1-a)$ | Individual pairs | $A_i(1-A_j)$, $A_j(1-A_i)$ | Thm. 1 | Thm. 7 |
| $\hat\lambda^{\mathrm{adj}}_{\mathrm{I}}(a,1-a)$ | Individual pairs | $A_i(1-A_j)$, $A_j(1-A_i)$, $A_i(1-A_j)X_{i,j}$, $A_j(1-A_i)X_{i,j}$ | Thm. 2 | Thm. 7 |
| $\hat\lambda^{\mathrm{acv}}_{\mathrm{I}}(a,1-a)$ | Individual pairs | $A_i(1-A_j)$, $A_j(1-A_i)$, $X_{i,j}$ | Thm. 3 | Thm. 7 |
| $\hat\lambda^{\mathrm{adj}}_{\mathrm{A},1}(a,1-a)$ ⋆ | Per-unit ave. of pairs | $A_i$, $1-A_i$, $A_iX_{i,\cdot}$, $(1-A_i)X_{i,\cdot}$ | Thm. 4 | Thm. 9 |
| $\hat\lambda^{\mathrm{acv}}_{\mathrm{A},1}(a,1-a)$ ⋆ | Per-unit ave. of pairs | $A_i$, $1-A_i$, $X_{i,\cdot}$ | Thm. 5 | Thm. 9 |
| Causal net benefit $\tau(a)$ | | | | |
| $\hat\tau_{\mathrm{P}}(a)$ †, ⋄ | Individual pairs | $D_{i,j}$ | Thm. 6 | Thm. 8 |
| $\hat\tau^{\mathrm{acv}}_{\mathrm{P}}(a)$ ‡ | Individual pairs | $D_{i,j}$, $X_{i,j}$ | Thm. 6 | Thm. 8 |
| $\hat\tau^{\mathrm{int}}_{\mathrm{P}}(a)$ † | Individual pairs | $D_{i,j}$, $D_{i,j}X_{i,j}$ | Thm. 6 | Thm. 8 |
| $\hat\tau^{\mathrm{adj}}_{\mathrm{P}}(a)$ ‡ | Individual pairs | $D_{i,j}$, $X_{i,j}$, $D_{i,j}X_{i,j}$ | Thm. 6 | Thm. 8 |
| $\hat\tau_{\mathrm{I}}(a)$ ⋄, $\hat\tau^{\mathrm{adj}}_{\mathrm{I}}(a)$, $\hat\tau^{\mathrm{acv}}_{\mathrm{I}}(a)$ ‡ | Individual pairs | same as $\hat\lambda_{\mathrm{I}}$, $\hat\lambda^{\mathrm{adj}}_{\mathrm{I}}$, $\hat\lambda^{\mathrm{acv}}_{\mathrm{I}}$ | Coro. 1 | Thm. 7 |
| $\hat\tau^{\mathrm{adj}}_{\mathrm{A},1}(a)$ ⋆, $\hat\tau^{\mathrm{acv}}_{\mathrm{A},1}(a)$ ⋆ | Per-unit ave. of pairs | same as $\hat\lambda^{\mathrm{adj}}_{\mathrm{A},1}$, $\hat\lambda^{\mathrm{acv}}_{\mathrm{A},1}$ | Coro. 1 | Thm. 9 |

† $\hat\tau_{\mathrm{P}}$ and $\hat\tau^{\mathrm{int}}_{\mathrm{P}}$ are asymptotically equivalent.
‡ $\hat\tau^{\mathrm{acv}}_{\mathrm{P}}$, $\hat\tau^{\mathrm{adj}}_{\mathrm{P}}$, and $\hat\tau^{\mathrm{acv}}_{\mathrm{I}}$ are asymptotically equivalent.
⋄ $\hat\tau_{\mathrm{P}}$ and $\hat\tau_{\mathrm{I}}$ are equivalent.
⋆ We omit $\hat\lambda^{\mathrm{adj}}_{\mathrm{A},2}$ and $\hat\lambda^{\mathrm{acv}}_{\mathrm{A},2}$ ($\hat\tau^{\mathrm{adj}}_{\mathrm{A},2}$ and $\hat\tau^{\mathrm{acv}}_{\mathrm{A},2}$) to save space. They share the same properties as $\hat\lambda^{\mathrm{adj}}_{\mathrm{A},1}$ and $\hat\lambda^{\mathrm{acv}}_{\mathrm{A},1}$ ($\hat\tau^{\mathrm{adj}}_{\mathrm{A},1}$ and $\hat\tau^{\mathrm{acv}}_{\mathrm{A},1}$).

5 Simulation studies

5.1 Simulation design

We conduct simulation studies to illustrate the finite-sample behavior of the proposed regression point estimators and variance estimators. Although our theoretical results demonstrate that covariate adjustment under nonlinear contrast functions does not admit a universal efficiency guarantee, we design scenarios to show that empirical efficiency gains can nonetheless arise when baseline covariates are prognostic for the outcomes. At the same time, we also construct counterexamples in which covariate adjustment fails to improve, or may even reduce, efficiency, thereby providing practical insight into the limits identified by our theory. We further evaluate the empirical performance of the proposed CTW variance estimator and compare it with commonly used HR and CR alternatives in finite samples. Throughout, our primary focus is on the U-type GCE estimand defined in (1); additional simulation results for the V-type GCE estimand are provided in Web Tables S8 and S9 in Section S7 of the Supplementary Materials, where we confirm that the asymptotic equivalence established in Section 2 is reflected in finite-sample performance.

We simulate two sample size scenarios with $N = 200$ and $N = 500$ units, and an independent treatment assignment $A_i \sim B(0.5)$. We simulate 1,000 replicates for each setting-sample size combination. For the GCEs, we compare $\hat\lambda_{\mathrm{I}}$, $\hat\lambda^{\mathrm{adj}}_{\mathrm{I}}$, $\hat\lambda^{\mathrm{acv}}_{\mathrm{I}}$, $\hat\lambda^{\mathrm{adj}}_{\mathrm{A},1}$, $\hat\lambda^{\mathrm{adj}}_{\mathrm{A},2}$, $\hat\lambda^{\mathrm{acv}}_{\mathrm{A},1}$, and $\hat\lambda^{\mathrm{acv}}_{\mathrm{A},2}$. For the causal net benefit, we compare $\hat\tau_{\mathrm{I}}$, $\hat\tau^{\mathrm{adj}}_{\mathrm{I}}$, $\hat\tau^{\mathrm{acv}}_{\mathrm{I}}$, $\hat\tau^{\mathrm{adj}}_{\mathrm{A},1}$, $\hat\tau^{\mathrm{adj}}_{\mathrm{A},2}$, $\hat\tau^{\mathrm{acv}}_{\mathrm{A},1}$, $\hat\tau^{\mathrm{acv}}_{\mathrm{A},2}$, and the PIM estimators, $\hat\tau_{\mathrm{P}}$, $\hat\tau^{\mathrm{acv}}_{\mathrm{P}}$, $\hat\tau^{\mathrm{int}}_{\mathrm{P}}$, and $\hat\tau^{\mathrm{adj}}_{\mathrm{P}}$. We consider the following five studies.
In simulation study I, we consider a univariate outcome setting, where $Y_i(1) = 2/5 + X_{1,i} + \sin(X_{2,i}) + \epsilon_i$ and $Y_i(0) = X_{1,i} + \cos(X_{2,i}) + e_i$, with $e_i \sim G_c(1,1)$ being the random noise following a centered gamma distribution with parameters $(1,1)$. The observed outcome is $Y_i = A_iY_i(1) + (1-A_i)Y_i(0)$, with contrast function $w(Y_i, Y_j) = \mathbf{1}(Y_i > Y_j)$. The covariates are $X_i = (X_{1,i}, X_{2,i})^\top$, with $X_{1,i} \sim B(0.5)$ and $X_{2,i} \sim N(0,1)$. The standard errors are obtained via the CTW variance estimator.

In simulation study II, we consider a more complex setting with bivariate composite outcomes, where $Y_i(1) = (Y_{1,i}(1), Y_{2,i}(1))^\top$ and $Y_i(0) = (Y_{1,i}(0), Y_{2,i}(0))^\top$. Specifically, $Y_{1,i}(a) \in \{1, 2, 3\}$ follows the three-category ordinal logistic regression model

$$
\log\frac{P\{Y_{1,i}(a) \leq \imath\}}{P\{Y_{1,i}(a) > \imath\}} = \alpha_{\imath,a} + X_{1,i} + \sin(X_{2,i}) + X_{3,i} + X_{4,i} + \zeta_i,
$$

for $\imath = 1, 2$ and $a = 0, 1$, where $\alpha_{1,1} = 0$, $\alpha_{2,1} = 2$, $\alpha_{1,0} = 0$, and $\alpha_{2,0} = 1.5$, with $\zeta_i \sim G(1,1)$ being a gamma frailty inducing positive correlation between potential outcomes for the same unit; $Y_{2,i}(1) = 2/5 + X_{1,i} + \sin(X_{2,i}) + X_{3,i} + X_{4,i} + \zeta_i + e_i$, and $Y_{2,i}(0) = X_{1,i} + \cos(X_{2,i}) + X_{3,i} + X_{4,i} + \zeta_i + e_i$, where $e_i \sim G_c(1,1)$ is the individual-level random noise following a centered gamma distribution with parameters $(1,1)$. The observed outcome is $Y_i = A_iY_i(1) + (1-A_i)Y_i(0)$. We use a contrast function common for non-prioritized composite outcomes, $w(Y_i, Y_j) = 0.5\,w_1(Y_{1,i}, Y_{1,j}) + 0.5\,w_2(Y_{2,i}, Y_{2,j})$, where $w_1(Y_{1,i}, Y_{1,j}) = \mathbf{1}(Y_{1,i} > Y_{1,j}) + 0.5\,\mathbf{1}(Y_{1,i} = Y_{1,j})$ and $w_2(Y_{2,i}, Y_{2,j}) = \mathbf{1}(Y_{2,i} > Y_{2,j})$. The covariates are $X_i = (X_{1,i}, X_{2,i}, X_{3,i}, X_{4,i})^\top$, with $X_{1,i} \sim B(0.5)$ and $X_{2,i}, X_{3,i}, X_{4,i} \sim N(0,1)$.
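For concreteness, the simplest of these designs, simulation study I, can be sketched in NumPy as follows. The sketch assumes that both potential outcomes share the same noise draw $e_i$ (the description of study I writes $\epsilon_i$ in $Y_i(1)$ without defining it separately), takes the centered gamma to be a Gamma(1, 1) draw minus its mean of 1, and uses hypothetical function names. It also evaluates the finite-population estimands from (1) on the full set of potential outcomes.

```python
import numpy as np

def simulate_study_one(N, rng):
    """Draw covariates and both potential outcomes for one finite population."""
    X1 = rng.binomial(1, 0.5, size=N).astype(float)       # X_1 ~ B(0.5)
    X2 = rng.normal(size=N)                               # X_2 ~ N(0, 1)
    e = rng.gamma(shape=1.0, scale=1.0, size=N) - 1.0     # centered gamma (assumed form)
    Y1 = 0.4 + X1 + np.sin(X2) + e                        # assumes shared noise e_i
    Y0 = X1 + np.cos(X2) + e
    return np.column_stack([X1, X2]), Y1, Y0

def gce(Ya, Yb):
    """lambda = {N (N - 1)}^{-1} sum_{i != j} 1(Ya[i] > Yb[j]), as in (1)."""
    N = len(Ya)
    W = (Ya[:, None] > Yb[None, :]).astype(float)
    np.fill_diagonal(W, 0.0)                              # drop the i == j terms
    return W.sum() / (N * (N - 1))

rng = np.random.default_rng(2026)
N = 500
X, Y1, Y0 = simulate_study_one(N, rng)

lam10 = gce(Y1, Y0)          # true lambda(1, 0) for this finite population
lam01 = gce(Y0, Y1)          # true lambda(0, 1)
tau1 = lam10 - lam01         # true causal net benefit tau(1)

A = rng.binomial(1, 0.5, size=N)                          # A_i ~ B(0.5)
Y = A * Y1 + (1 - A) * Y0                                 # observed outcome
```

Repeating the last two lines over many independent assignments, while holding `Y1` and `Y0` fixed, gives the randomization distribution against which bias and coverage are assessed in a design-based evaluation.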
The standard errors are obtained via the CTW v ariance estimator. In simulation studies II I and IV, we inv estigate settings in whic h adjusting for co v ariates ma y not increase the estimation eﬃciency . Sp eciﬁcally , in sim ulation study I I I, we follow the same data-generating pro cess as in sim ulation study I, but ﬁt estimators adjusting for co v ariates, e.g., cov ariate-adjusted and ANCOV A, using unrelated cov ariates ( ‹ X 1 ,i , ‹ X 2 ,i ) with ‹ X 1 ,i , ‹ X 2 ,i ∼ N (0 , 1). In sim ulation study IV, w e follo w the same data-generating pro cess as in sim ulation study I I, while ﬁtting estimators adjusting for co v ariates using noisy co v ariates ‹ X k,i = X k,i + ε k,i with ε k,i ∼ N (0 , 5 2 ) for k = 1 , 2 , 3 , 4. Cov ariates in these t wo settings are not prognostic for the outcome. Finally , in simulation study V, w e demonstrate the consistency of the CTW v ariance estimator and sho w that other v ariance estimators, i.e., HR, CR, and TW, are not consistent for the Neyman v ariances and co v ariances. The settings are the same as in sim ulation studies I and I I. 5.2 Sim ulation results Figure 1 presen ts the estimation results of λ (1 , 0) from simulation studies I-IV. The rest of the results are summarized in W eb T ables S1-S4 and W eb Figures S1-S5 in Section S7 of the Supplemen tary Materials. Across simulation studies I-IV, all estimators are consis- ten t despite the fact that the w orking model div erges from the data-generating processes, conﬁrming that these estimators are mo del-assisted rather than mo del-based. Also, the empirical co verage p ercentages (ECPs) of 95% CIs are at their nominal lev el with some 28 o ver 95%, conﬁrming the consistency of the CTW v ariance estimator (Section 4 ). 
[Figure 1 has four panels: (a) Simulation study I: univariate outcome; (b) Simulation study II: composite outcomes; (c) Simulation study III: unrelated covariates; (d) Simulation study IV: noisy covariates. Each panel shows the bias, coverage, and empirical SE of $\widehat{\lambda}_{\mathrm{I}}$, $\widehat{\lambda}^{\mathrm{adj}}_{\mathrm{I}}$, $\widehat{\lambda}^{\mathrm{acv}}_{\mathrm{I}}$, $\widehat{\lambda}^{\mathrm{adj}}_{\mathrm{A},1}$, $\widehat{\lambda}^{\mathrm{adj}}_{\mathrm{A},2}$, $\widehat{\lambda}^{\mathrm{acv}}_{\mathrm{A},1}$, and $\widehat{\lambda}^{\mathrm{acv}}_{\mathrm{A},2}$.]

Figure 1: Bias, coverage percentages of 95% CIs, and empirical standard errors (SEs) for $\widehat{\lambda}(1, 0)$ from simulation studies I-IV. The number of units is $N = 500$.

Comparing the empirical SEs, results from simulation studies I and II show that, with prognostic covariates, estimators that adjust for covariates are more efficient than those that do not.
Estimators that use individual pairs while adjusting for covariates tend to yield smaller empirical SEs than the others, suggesting that they could be favored for their efficiency in applications. On the other hand, results from simulation studies III and IV show that, when the covariates are uninformative, adjusting for covariates does not increase the estimation efficiency. In all settings, the contrast functions satisfy anti-symmetry and $\pi_a = 1/2$ for $a = 0, 1$, and the simulation results confirm Propositions 1-3. Additionally, for the PIM estimators, the simulation results confirm that the interaction term $D_{i,j} X_{i,j}$ is not meaningful asymptotically.

Table 3: Results from simulation study V under the setting in simulation study I with sample size $N = 500$. ESE: empirical standard error; ASE: average standard error; ECP: empirical coverage percentage of the 95% confidence interval. HR: the heteroskedasticity-robust variance estimator; CR: the cluster-robust variance estimator; TW: the two-way clustering variance estimator; CTW: the complete two-way clustering variance estimator.

$\lambda(1, 0)$

| Estimator | ESE | HR ASE | HR ECP | CR ASE | CR ECP | TW ASE | TW ECP | CTW ASE | CTW ECP |
|---|---|---|---|---|---|---|---|---|---|
| $\widehat{\lambda}_{\mathrm{I}}$ | .0256 | .0024 | .151 | .0197 | .874 | .0256 | .951 | .0256 | .951 |
| $\widehat{\lambda}^{\mathrm{adj}}_{\mathrm{I}}$ | .0221 | .0024 | .150 | .0157 | .835 | .0219 | .949 | .0218 | .948 |
| $\widehat{\lambda}^{\mathrm{acv}}_{\mathrm{I}}$ | .0221 | .0023 | .138 | .0158 | .838 | .0219 | .948 | .0218 | .947 |
| $\widehat{\lambda}^{\mathrm{adj}}_{\mathrm{A},1}$ | .0237 | .0148 | .779 | - | - | .0221 | .935 | .0221 | .935 |
| $\widehat{\lambda}^{\mathrm{adj}}_{\mathrm{A},2}$ | .0229 | .0148 | .790 | - | - | .0221 | .950 | .0221 | .950 |
| $\widehat{\lambda}^{\mathrm{acv}}_{\mathrm{A},1}$ | .0221 | .0156 | .834 | - | - | .0232 | .962 | .0233 | .965 |
| $\widehat{\lambda}^{\mathrm{acv}}_{\mathrm{A},2}$ | .0221 | .0156 | .833 | - | - | .0232 | .962 | .0233 | .964 |

$\lambda(0, 1)$

| Estimator | ESE | HR ASE | HR ECP | CR ASE | CR ECP | TW ASE | TW ECP | CTW ASE | CTW ECP |
|---|---|---|---|---|---|---|---|---|---|
| $\widehat{\lambda}_{\mathrm{I}}$ | .0256 | .0024 | .151 | .0165 | .806 | .0256 | .951 | .0256 | .951 |
| $\widehat{\lambda}^{\mathrm{adj}}_{\mathrm{I}}$ | .0221 | .0024 | .150 | .0153 | .835 | .0219 | .949 | .0218 | .948 |
| $\widehat{\lambda}^{\mathrm{acv}}_{\mathrm{I}}$ | .0221 | .0023 | .138 | .0152 | .819 | .0219 | .948 | .0218 | .947 |
| $\widehat{\lambda}^{\mathrm{adj}}_{\mathrm{A},1}$ | .0229 | .0148 | .790 | - | - | .0221 | .950 | .0221 | .950 |
| $\widehat{\lambda}^{\mathrm{adj}}_{\mathrm{A},2}$ | .0237 | .0148 | .779 | - | - | .0221 | .935 | .0221 | .935 |
| $\widehat{\lambda}^{\mathrm{acv}}_{\mathrm{A},1}$ | .0221 | .0156 | .833 | - | - | .0232 | .962 | .0233 | .964 |
| $\widehat{\lambda}^{\mathrm{acv}}_{\mathrm{A},2}$ | .0221 | .0156 | .834 | - | - | .0232 | .962 | .0233 | .965 |

$\tau(1)$

| Estimator | ESE | HR ASE | HR ECP | CR ASE | CR ECP | TW ASE | TW ECP | CTW ASE | CTW ECP |
|---|---|---|---|---|---|---|---|---|---|
| $\widehat{\tau}_{\mathrm{I}}$ | .0512 | .0035 | .105 | .0257 | .671 | .0363 | .845 | .0513 | .951 |
| $\widehat{\tau}^{\mathrm{adj}}_{\mathrm{I}}$ | .0443 | .0034 | .110 | .0219 | .657 | .0309 | .834 | .0437 | .949 |
| $\widehat{\tau}^{\mathrm{acv}}_{\mathrm{I}}$ | .0443 | .0032 | .102 | .0220 | .660 | .0310 | .835 | .0437 | .947 |
| $\widehat{\tau}_{\mathrm{P}}$ | .0512 | .0040 | .125 | .0516 | .953 | .0730 | .995 | .0511 | .951 |
| $\widehat{\tau}^{\mathrm{acv}}_{\mathrm{P}}$ | .0443 | .0038 | .114 | .0500 | .976 | .0705 | 1.000 | .0436 | .947 |
| $\widehat{\tau}^{\mathrm{int}}_{\mathrm{P}}$ | .0512 | .0040 | .124 | .0514 | .952 | .0726 | .995 | .0511 | .951 |
| $\widehat{\tau}^{\mathrm{adj}}_{\mathrm{P}}$ | .0443 | .0038 | .114 | .0500 | .976 | .0706 | 1.000 | .0436 | .947 |
| $\widehat{\tau}^{\mathrm{adj}}_{\mathrm{A},1}$ | .0443 | .0209 | .635 | - | - | .0313 | .847 | .0430 | .947 |
| $\widehat{\tau}^{\mathrm{adj}}_{\mathrm{A},2}$ | .0443 | .0209 | .635 | - | - | .0313 | .847 | .0430 | .947 |
| $\widehat{\tau}^{\mathrm{acv}}_{\mathrm{A},1}$ | .0443 | .0222 | .661 | - | - | .0329 | .866 | .0454 | .960 |
| $\widehat{\tau}^{\mathrm{acv}}_{\mathrm{A},2}$ | .0443 | .0222 | .661 | - | - | .0329 | .866 | .0454 | .960 |

For variance estimation, Table 3 presents the results from simulation study V under the setting in simulation study I with sample size $N = 500$, and the rest of the results are given in Web Tables S5-S7 in Section S7 of the Supplementary Materials. Results from simulation study V show that only the CTW estimator is consistent across all variances and covariances, confirming Theorems 7-9. The HR and CR estimators are not consistent in general, whereas the TW estimator misses the covariance between $\widehat{\lambda}(a, 1 - a)$ and $\widehat{\lambda}(1 - a, a)$ and is thus not consistent for the variance of $\widehat{\tau}(a)$.

6 Data example

We illustrate the proposed GCE estimators using data from the Best Apnea Interventions for Research (BestAIR) trial, an individually randomized, parallel-group study designed to evaluate the effect of continuous positive airway pressure (CPAP) treatment on health outcomes among patients with obstructive sleep apnea and elevated cardiovascular risk but without severe sleepiness (Zhao et al., 2017).
Participants were recruited from outpatient clinics at three medical centers in Boston, MA, and randomized to receive either CPAP-based therapy or conservative medical therapy. In the analytic sample considered here, there are 169 participants, with 83 assigned to the CPAP group and 86 assigned to the control group. A set of baseline patient-level covariates was collected prior to treatment assignment, and clinical outcomes were measured at baseline, 6 months, and 12 months. For illustration, we consider estimating the treatment effect of CPAP on two outcomes measured at 6 months.

We conduct two analyses. The first focuses on the objective outcome ($Y_1$), the 24-hour systolic blood pressure (SBP) measured every 20 minutes during the daytime and every 30 minutes during sleep, with the contrast function $w(u, v) = \mathbb{1}(u < v)$. The second considers, in addition to $Y_1$, a subjective outcome ($Y_2$), daytime self-reported sleepiness measured by the Epworth Sleepiness Scale (ESS), with the contrast function $w(\mathbf{u}, \mathbf{v}) = 0.5\, \mathbb{1}(u_1 < v_1) + 0.5\, \mathbb{1}(u_2 < v_2)$, as we assign equal weights to the two outcomes. For covariate adjustment, we consider a total of 10 baseline covariates, including demographics (age, gender, race, ethnicity), body mass index, Apnea-Hypopnea Index (AHI), average seated radial pulse rate (SDP), trial site, and baseline outcome measures (baseline average blood pressure and ESS). The target estimand is the causal net benefit $\tau(1)$, where a positive $\widehat{\tau}(1)$ with an estimated 95% confidence interval excluding zero indicates that CPAP is more effective. We implement $\widehat{\tau}_{\mathrm{I}}$, $\widehat{\tau}^{\mathrm{adj}}_{\mathrm{I}}$, $\widehat{\tau}^{\mathrm{acv}}_{\mathrm{I}}$, $\widehat{\tau}^{\mathrm{adj}}_{\mathrm{A},1}$, $\widehat{\tau}^{\mathrm{adj}}_{\mathrm{A},2}$, $\widehat{\tau}^{\mathrm{acv}}_{\mathrm{A},1}$, $\widehat{\tau}^{\mathrm{acv}}_{\mathrm{A},2}$, and the PIM estimators, $\widehat{\tau}_{\mathrm{P}}$, $\widehat{\tau}^{\mathrm{acv}}_{\mathrm{P}}$, $\widehat{\tau}^{\mathrm{int}}_{\mathrm{P}}$, and $\widehat{\tau}^{\mathrm{adj}}_{\mathrm{P}}$. Standard errors are estimated using the CTW variance estimator. Data analysis results are presented in Table 4.
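The equal-weight contrast function used in the second analysis can be sketched in code as follows. This is an illustration only, under several assumptions: the net benefit is taken to be the mean "win" score minus the mean "loss" score over all treated-control pairs, which is a simple unadjusted plug-in and not necessarily identical to the paper's $\widehat{\tau}_{\mathrm{I}}$; the interval is a normal-based Wald interval; the function names are hypothetical; and the toy outcome values are made up and are not BestAIR data.

```python
import numpy as np
from statistics import NormalDist


def composite_contrast(u, v, weights=(0.5, 0.5)):
    """w(u, v) = sum_q weights[q] * 1(u_q < v_q); smaller outcomes are better."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(np.dot(weights, (u < v).astype(float)))


def plug_in_net_benefit(y_treated, y_control, weights=(0.5, 0.5)):
    """Hypothetical unadjusted plug-in: mean win score minus mean loss score."""
    wins = np.mean([[composite_contrast(t, c, weights) for c in y_control]
                    for t in y_treated])
    losses = np.mean([[composite_contrast(c, t, weights) for c in y_control]
                      for t in y_treated])
    return float(wins - losses)


def wald_interval(estimate, std_error, level=0.95):
    """Assumed normal-based interval: estimate +/- z * standard error."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return estimate - z * std_error, estimate + z * std_error


# Made-up (SBP, ESS) pairs for two treated and two control participants.
toy_treated = [(120.0, 5.0), (130.0, 8.0)]
toy_control = [(125.0, 7.0), (135.0, 6.0)]
toy_net_benefit = plug_in_net_benefit(toy_treated, toy_control)
interval = wald_interval(0.228, 0.091)
print(toy_net_benefit, interval)
```

With a single outcome, passing one-element tuples and `weights=(1.0,)` recovers $w(u, v) = \mathbb{1}(u < v)$. The standard error passed to `wald_interval` would come from a variance estimator such as CTW, which this sketch does not implement.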
Table 4: Data analysis results. EST: the estimate. SE: the standard error from the CTW variance estimator. CI: 95% confidence interval. *: the estimated 95% confidence interval does not contain zero.

| Estimator | $Y_1$: EST (SE) | $Y_1$: CI | $Y_1, Y_2$: EST (SE) | $Y_1, Y_2$: CI |
|---|---|---|---|---|
| $\widehat{\tau}_{\mathrm{I}}$ | .228 (.091)* | (.050, .406) | .165 (.061)* | (.046, .284) |
| $\widehat{\tau}^{\mathrm{adj}}_{\mathrm{I}}$ | .159 (.064)* | (.033, .285) | .134 (.047)* | (.041, .227) |
| $\widehat{\tau}^{\mathrm{acv}}_{\mathrm{I}}$ | .154 (.063)* | (.030, .278) | .132 (.047)* | (.039, .224) |
| $\widehat{\tau}^{\mathrm{adj}}_{\mathrm{A},1}$ | .168 (.085)* | (.002, .334) | .143 (.065)* | (.016, .270) |
| $\widehat{\tau}^{\mathrm{adj}}_{\mathrm{A},2}$ | .168 (.085)* | (.002, .334) | .134 (.066)* | (.004, .264) |
| $\widehat{\tau}^{\mathrm{acv}}_{\mathrm{A},1}$ | .158 (.088) | (-.015, .331) | .136 (.067)* | (.004, .267) |
| $\widehat{\tau}^{\mathrm{acv}}_{\mathrm{A},2}$ | .158 (.088) | (-.015, .331) | .131 (.067) | (-.001, .264) |
| $\widehat{\tau}_{\mathrm{P}}$ | .228 (.089)* | (.053, .403) | .165 (.059)* | (.050, .281) |
| $\widehat{\tau}^{\mathrm{acv}}_{\mathrm{P}}$ | .154 (.061)* | (.035, .273) | .132 (.045)* | (.044, .219) |
| $\widehat{\tau}^{\mathrm{int}}_{\mathrm{P}}$ | .228 (.089)* | (.053, .404) | .165 (.059)* | (.049, .281) |
| $\widehat{\tau}^{\mathrm{adj}}_{\mathrm{P}}$ | .154 (.061)* | (.035, .272) | .132 (.045)* | (.044, .219) |

From Table 4, we observe that, when only considering $Y_1$, all estimators give positive estimates for the causal net benefit, suggesting that CPAP is more effective; all estimators except $\widehat{\tau}^{\mathrm{acv}}_{\mathrm{A},1}$ and $\widehat{\tau}^{\mathrm{acv}}_{\mathrm{A},2}$ give 95% CIs excluding zero, confirming the effectiveness of the CPAP intervention. When considering $Y_1$ and $Y_2$ simultaneously, these patterns largely persist, with only $\widehat{\tau}^{\mathrm{acv}}_{\mathrm{A},2}$ giving a 95% CI that includes zero. Therefore, it could be concluded that the CPAP intervention is more effective in reducing high cardiovascular disease risk and obstructive sleep apnea without causing severe sleepiness. Estimators that adjust for covariates generally produce smaller SEs and narrower CIs than those that do not adjust. This is particularly apparent for estimators using individual pairs, implying that these estimators are more efficient.
Estimators using per-unit pair averages, however, show smaller efficiency gains when only considering $Y_1$ and somewhat larger SEs than the unadjusted estimators when considering both $Y_1$ and $Y_2$, suggesting that they might be less favored in settings where the dataset is of moderate or small size.

7 Concluding remarks

In this article, we develop a unified design-based theory for inference on GCE estimands defined through pairwise contrast functions in randomized experiments. By representing probabilistic index effects, the causal net benefit, and related nonlinear summaries within a common finite-population framework, we extend classical regression adjustment theory beyond the linear ATE. We establish that regression estimators based on both individual pairs and per-unit averages remain model-assisted and asymptotically normal under arbitrary misspecification. At the same time, we discuss a fundamental departure from the linear setting: for nonlinear contrast functions, covariate adjustment does not admit a universal efficiency guarantee. This clarifies that efficiency gains from regression adjustment are inherently estimand-dependent when moving beyond additive contrasts. We further provide a randomization-based variance theory tailored to pairwise dependence and introduce complete two-way cluster-robust variance estimators that are consistent. Finally, we have pursued complete randomization in this work as a starting point. Extending the present framework to rerandomized designs would require characterizing the joint asymptotic distribution of randomization-based U-statistics under correlated assignment mechanisms (Li and Ding, 2020). Understanding the interplay between rerandomization and analytic covariate adjustment can help further elucidate the potential for more efficient estimators of the GCE estimand.
Acknowledgement

Research in this article was supported by the United States National Institutes of Health (NIH), National Heart, Lung, and Blood Institute (NHLBI, grant number 1R01HL178513). All statements in this report, including its findings and conclusions, are solely those of the authors and do not necessarily represent the views of the NIH. The authors declare that there are no conflicts of interest relevant to this work.

References

Bebu, I. and Lachin, J. M. (2016), "Large sample inference for a win ratio analysis of a composite outcome based on prioritized components," Biostatistics, 17, 178–187.

Cameron, A. C., Gelbach, J. B., and Miller, D. L. (2011), "Robust inference with multiway clustering," Journal of Business & Economic Statistics, 29, 238–249.

De Neve, J. and Thas, O. (2015), "A regression framework for rank tests based on the probabilistic index model," Journal of the American Statistical Association, 110, 1276–1283.

De Schryver, M. and De Neve, J. (2019), "A tutorial on probabilistic index models: Regression models for the effect size P(Y1 < Y2)," Psychological Methods, 24, 403.

Fisher, R. A. (1935), The Design of Experiments, Edinburgh, London: Oliver and Boyd, 1st edition.

Freedman, D. A. (2008), "On regression adjustments to experimental data," Advances in Applied Mathematics, 40, 180–193.

Huber, P. J. (1967), "The behavior of maximum likelihood estimates under nonstandard conditions," in Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, volume 1, Berkeley, CA: University of California Press, 221–233.

Li, X. and Ding, P. (2017), "General forms of finite population central limit theorems with applications to causal inference," Journal of the American Statistical Association, 112, 1759–1769.

Li, X. and Ding, P. (2020), "Rerandomization and regression adjustment," Journal of the Royal Statistical Society Series B: Statistical Methodology, 82, 241–268.

Liang, K.-Y. and Zeger, S. L. (1986), "Longitudinal data analysis using generalized linear models," Biometrika, 73, 13–22.

Lin, W. (2013), "Agnostic notes on regression adjustments to experimental data: Reexamining Freedman's critique," The Annals of Applied Statistics, 7, 295–318.

Pocock, S. J. (1997), "Clinical trials with multiple outcomes: a statistical perspective on their design, analysis, and interpretation," Controlled Clinical Trials, 18, 530–545.

Pocock, S. J., Ariti, C. A., Collier, T. J., and Wang, D. (2011), "The win ratio: A new approach to the analysis of composite endpoints in clinical trials based on clinical priorities," European Heart Journal, 33, 176–182.

Scheidegger, C., Wandel, S., and Mütze, T. (2025), "Covariate adjustment for the win odds: Application to cardiovascular outcomes trials," arXiv preprint arXiv:2511.14292.

Splawa-Neyman, J. (1923), "On the application of probability theory to agricultural experiments. Essay on principles. Section 9 (translated). Reprinted ed." Statistical Science, 5, 465–472.

Su, F. and Ding, P. (2021), "Model-assisted analyses of cluster-randomized experiments," Journal of the Royal Statistical Society: Series B (Statistical Methodology), 83, 994–1015.

Thas, O., Neve, J. D., Clement, L., and Ottoy, J.-P. (2012), "Probabilistic index models," Journal of the Royal Statistical Society Series B: Statistical Methodology, 74, 623–671.

van der Vaart, A. (1998), Asymptotic Statistics, Cambridge University Press.

Vermeulen, K., Thas, O., and Vansteelandt, S. (2015), "Increasing the power of the Mann-Whitney test in randomized experiments through flexible covariate adjustment," Statistics in Medicine, 34, 1012–1030.

Wang, D. and Pocock, S. (2016), "A win ratio approach to comparing continuous non-normal outcomes in clinical trials," Pharmaceutical Statistics, 15, 238–245.

Zhao, Y. Y., Wang, R., Gleason, K. J., Lewis, E. F., Quan, S. F., Toth, C. M., Morrical, M., Rueschman, M., Weng, J., Ware, J. H., Mittleman, M. A., Redline, S., and on behalf of the BestAIR Investigators (2017), "Effect of Continuous Positive Airway Pressure Treatment on Health-Related Quality of Life and Sleepiness in High Cardiovascular Risk Individuals With Sleep Apnea: Best Apnea Interventions for Research (BestAIR) Trial," Sleep, 40, zsx040.
