Federated Causal Inference from Multi-Site Observational Data via Propensity Score Aggregation


Causal inference typically assumes centralized access to individual-level data. Yet, in practice, data are often decentralized across multiple sites, making centralization infeasible due to privacy, logistical, or legal constraints. We address this problem by estimating the Average Treatment Effect (ATE) from decentralized observational data via a Federated Learning (FL) approach, allowing inference through the exchange of aggregate statistics rather than individual-level data. We propose a novel method to estimate propensity scores via a federated weighted average of local scores using Membership Weights (MW), defined as probabilities of site membership conditional on covariates. MW can be flexibly estimated with parametric or non-parametric classification models using standard FL algorithms. The resulting propensity scores are used to construct Federated Inverse Propensity Weighting (Fed-IPW) and Augmented IPW (Fed-AIPW) estimators. In contrast to meta-analysis methods, which fail when any site violates positivity, our approach exploits heterogeneity in treatment assignment across sites to improve overlap. We show that Fed-IPW and Fed-AIPW perform well under site-level heterogeneity in sample sizes, treatment mechanisms, and covariate distributions. Theoretical analysis and experiments on simulated and real-world data demonstrate clear advantages over meta-analysis and related approaches.

💡 Research Summary

The paper tackles the problem of estimating the average treatment effect (ATE) when observational data are distributed across multiple sites that cannot share individual‑level records due to privacy, legal, or logistical constraints. Traditional causal inference assumes centralized access to covariates, treatment, and outcomes, which is often infeasible in health‑care networks, finance, or multi‑institutional research. Moreover, existing decentralized approaches such as two‑stage meta‑analysis rely on a strong local positivity assumption (each site must contain both treated and untreated subjects across the covariate space). This assumption is frequently violated when a site follows a deterministic treatment policy or when treatment mechanisms differ dramatically across sites.

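The authors propose a federated causal inference framework that builds a global propensity score by aggregating site‑specific propensity scores using Membership Weights (MW). For a covariate vector \(x\), the global propensity score is expressed as
\[
e(x) = \sum_{k=1}^{K} \omega_k(x)\, e_k(x),
\]
where \(K\) is the number of sites, \(e_k(x)\) is the propensity score at site \(k\), and \(\omega_k(x)\) is the membership weight, that is, the probability that a unit with covariates \(x\) belongs to site \(k\).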
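The aggregation idea can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single discrete covariate, so that membership weights and local propensity scores can be estimated from per-cell counts rather than from the classification models the paper allows. The function names (`local_summaries`, `federated_propensity`, `local_ipw_sum`) and the simulated data are hypothetical. The sketch writes the global score as a membership-weighted average of local scores, \(e(x) = \sum_k \omega_k(x)\, e_k(x)\), and plugs it into the standard IPW formula. Site 0 treats everyone and site 1 treats no one, so local positivity fails at both sites, yet the aggregated score stays strictly between 0 and 1.

```python
import numpy as np

def local_summaries(x, w, n_cells):
    """Per-site aggregates: number of units and of treated units per covariate cell."""
    n = np.bincount(x, minlength=n_cells).astype(float)
    n_treated = np.bincount(x, weights=w, minlength=n_cells)
    return n, n_treated

def federated_propensity(summaries):
    """Membership-weighted average of local propensity scores, cell by cell."""
    counts = np.stack([n for n, _ in summaries])    # shape (K, n_cells)
    treated = np.stack([t for _, t in summaries])   # shape (K, n_cells)
    total = counts.sum(axis=0)
    # Membership weight: share of the units in each cell that belong to site k.
    omega = np.divide(counts, total, out=np.zeros_like(counts), where=total > 0)
    # Local propensity score; left at 0 where a site has no units (its weight is 0 there).
    e_local = np.divide(treated, counts, out=np.zeros_like(counts), where=counts > 0)
    return (omega * e_local).sum(axis=0)

def local_ipw_sum(x, w, y, e):
    """Site-level sum of IPW terms; only this scalar and the sample size are shared."""
    ex = e[x]
    return np.sum(w * y / ex - (1 - w) * y / (1 - ex)), len(x)

# Simulated data (assumption): two sites with deterministic, opposite treatment policies.
rng = np.random.default_rng(0)
n_cells = 3
x0 = rng.choice(n_cells, size=500, p=[0.2, 0.3, 0.5])
w0 = np.ones(500)                                   # site 0 treats everyone
x1 = rng.choice(n_cells, size=300, p=[0.5, 0.3, 0.2])
w1 = np.zeros(300)                                  # site 1 treats no one
y0 = 2.0 * w0 + x0 + rng.normal(size=500)           # true treatment effect is 2
y1 = 2.0 * w1 + x1 + rng.normal(size=300)

# Step 1: sites share per-cell counts; the server builds the global propensity score.
e_fed = federated_propensity([
    local_summaries(x0, w0, n_cells),
    local_summaries(x1, w1, n_cells),
])

# Reference: the propensity score computed on pooled individual-level data.
x_all = np.concatenate([x0, x1])
w_all = np.concatenate([w0, w1])
e_pooled = np.bincount(x_all, weights=w_all, minlength=n_cells) / np.bincount(
    x_all, minlength=n_cells
)

# Step 2: each site computes its IPW sum with the global score; the server averages.
parts = [local_ipw_sum(x0, w0, y0, e_fed), local_ipw_sum(x1, w1, y1, e_fed)]
tau_fed = sum(s for s, _ in parts) / sum(n for _, n in parts)

print("federated propensity:", e_fed)
print("pooled propensity:   ", e_pooled)
print("Fed-IPW style ATE estimate:", tau_fed)
```

In this count-based setting the federated score coincides with the pooled one, which is what the weighted-average identity predicts. With continuous covariates, the counts would be replaced by fitted models for the local propensity scores and the membership weights.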
