Causal Discovery for Explainable AI: A Dual-Encoding Approach
Understanding causal relationships among features is fundamental for explaining machine learning model decisions. However, traditional causal discovery methods face challenges with categorical variables due to numerical instability in conditional independence testing. We propose a dual-encoding causal discovery approach that addresses these limitations by running constraint-based algorithms with complementary encoding strategies and merging results through majority voting. Applied to the Titanic dataset, our method identifies causal structures that align with established explainable methods.
💡 Research Summary
The paper addresses a critical gap in explainable artificial intelligence (XAI): most post‑hoc explanation techniques such as SHAP, LIME, and partial dependence plots provide feature importance or marginal effects but do not reveal why a feature matters or how it causally interacts with other features. To bridge this gap, the authors propose a dual‑encoding causal discovery framework that adapts constraint‑based algorithms (specifically Fast Causal Inference, FCI) to mixed‑type data sets containing both continuous and categorical variables.
The core technical problem stems from the fact that conditional independence tests used in PC, FCI, and related algorithms (most commonly Fisher’s z‑test) assume continuous variables and rely on the inversion of a covariance matrix. When categorical variables are one‑hot encoded, the resulting design matrix is rank‑deficient because the dummy variables for a single categorical attribute sum to a constant. This singularity makes the covariance matrix non‑invertible, causing the statistical test to fail and leading to unstable or incomplete causal graphs.
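The rank-deficiency problem is easy to reproduce. The sketch below (a minimal illustration, not code from the paper; the `embarked` column is a hypothetical Titanic-style feature) shows that the covariance matrix of a fully one-hot-encoded three-level variable has rank 2, not 3, while dropping one dummy column restores full rank:

```python
import numpy as np
import pandas as pd

# Toy categorical column with three levels.
df = pd.DataFrame({"embarked": ["S", "C", "Q", "S", "C", "Q", "S", "S"]})

# Full one-hot encoding: the three dummy columns sum to 1 in every row,
# so after centering they are linearly dependent and the covariance
# matrix is singular (rank 2 for a 3x3 matrix).
full = pd.get_dummies(df["embarked"]).astype(float)
cov_full = np.cov(full.T)
print(np.linalg.matrix_rank(cov_full))   # 2 -> non-invertible

# Dropping one dummy column restores full rank.
reduced = pd.get_dummies(df["embarked"], drop_first=True).astype(float)
cov_red = np.cov(reduced.T)
print(np.linalg.matrix_rank(cov_red))    # 2 == number of columns -> invertible
```

Any z-test that inverts `cov_full` (or a partial-correlation matrix derived from it) fails at exactly this point, which is what motivates the encoding schemes that follow.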
To overcome this, the authors introduce two complementary encoding schemes: (1) “drop‑first” encoding, which removes the first dummy column for each categorical variable, and (2) “drop‑last” encoding, which removes the last dummy column. Each scheme restores full rank, yielding a valid covariance matrix and allowing the z‑test to be performed. The same FCI algorithm is run independently on the two encoded data sets, producing two potentially different causal graphs.
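The two encodings can be produced straightforwardly with pandas. The helper below is an illustrative sketch (the function name, interface, and example columns are mine, not the paper's); pandas offers `drop_first` directly, and the "drop-last" variant is obtained by encoding fully and removing the final dummy of each categorical variable:

```python
import pandas as pd

def dual_encode(df, cat_cols):
    """Return the two complementary encodings: drop-first and drop-last.

    Each removes one dummy column per categorical variable, so each
    design matrix is full rank. (Helper name and interface are
    illustrative, not taken from the paper.)
    """
    drop_first = pd.get_dummies(df, columns=cat_cols, drop_first=True).astype(float)

    full = pd.get_dummies(df, columns=cat_cols).astype(float)
    # pandas has no drop_last flag, so drop the final dummy per variable.
    last_cols = [full.filter(regex=f"^{c}_").columns[-1] for c in cat_cols]
    drop_last = full.drop(columns=last_cols)
    return drop_first, drop_last

df = pd.DataFrame({
    "embarked": ["S", "C", "Q", "S"],
    "fare": [7.25, 71.28, 8.05, 53.1],
})
enc_a, enc_b = dual_encode(df, ["embarked"])
print(list(enc_a.columns))  # ['fare', 'embarked_Q', 'embarked_S']
print(list(enc_b.columns))  # ['fare', 'embarked_C', 'embarked_Q']
```

FCI would then be run once on `enc_a` and once on `enc_b`, producing the two graphs that the merging step reconciles.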
The next step is graph merging via majority voting. An edge is kept in the unified graph if it appears in at least one of the two graphs; if it appears in both, the orientation must agree for the directed edge to be retained. In case of conflicting directions, the edge is kept undirected. This merging strategy filters out edges that are artefacts of a particular reference category while preserving relationships that are robust across encodings.
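The voting rule above can be sketched as follows. The representation is an assumption on my part (the paper does not specify a data structure): each graph maps an unordered edge `{u, v}` to an orientation, either a tuple `(u, v)` meaning `u -> v` or `None` for undirected, and the example variable names are illustrative Titanic features:

```python
def merge_graphs(g1, g2):
    """Merge two causal graphs per the voting rule described above.

    Each graph maps frozenset({u, v}) to (u, v) for u -> v, or to None
    for an undirected edge. (Illustrative sketch, not the paper's code.)
    """
    merged = {}
    for edge in g1.keys() | g2.keys():
        if edge in g1 and edge in g2:
            # Found under both encodings: keep the direction only if the
            # two runs agree; on conflict, fall back to undirected.
            merged[edge] = g1[edge] if g1[edge] == g2[edge] else None
        else:
            # Found under exactly one encoding: keep it as discovered.
            merged[edge] = g1.get(edge, g2.get(edge))
    return merged

e = frozenset
g_first = {e({"sex", "survived"}): ("sex", "survived"),
           e({"fare", "pclass"}): ("pclass", "fare")}
g_last  = {e({"sex", "survived"}): ("survived", "sex"),
           e({"age", "survived"}): ("age", "survived")}
merged = merge_graphs(g_first, g_last)
# sex-survived: conflicting directions -> kept undirected (None)
# pclass -> fare and age -> survived: kept from their single run
```

Note that with only two graphs the "appears in at least one" rule is a union over edges; the voting element applies to orientations, where agreement is required to keep a direction.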
Before causal discovery, continuous variables are discretized using supervised entropy-based binning. This step turns continuous features such as Age into a small set of interpretable intervals whose cut points are chosen to be informative about the target.
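Supervised entropy-based binning can be approximated with a shallow entropy-criterion decision tree whose split thresholds become the bin edges. This is a common stand-in for MDLP-style discretization; the paper's exact procedure may differ, and the toy Age/survival data below is synthetic:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def entropy_bins(x, y, max_bins=4):
    """Sketch of supervised entropy-based binning.

    Fits a shallow entropy-criterion tree of the target on one
    continuous feature and reads the split thresholds off as cut
    points. (Stand-in for the paper's procedure, not its code.)
    """
    tree = DecisionTreeClassifier(criterion="entropy",
                                  max_leaf_nodes=max_bins,
                                  random_state=0)
    tree.fit(np.asarray(x).reshape(-1, 1), y)
    # Internal nodes (children_left != -1) carry the split thresholds.
    t = tree.tree_
    cuts = sorted(t.threshold[t.children_left != -1])
    edges = [-np.inf] + list(cuts) + [np.inf]
    return np.digitize(x, cuts), edges

rng = np.random.default_rng(0)
age = rng.uniform(0, 80, 500)
survived = (age < 16).astype(int)      # toy signal: children survive
codes, edges = entropy_bins(age, survived, max_bins=3)
```

On this toy data the learned cut point lands near the true boundary at 16, and the resulting bin codes are what would be fed to the two encodings above.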