An Odd Estimator for Shapley Values


The Shapley value is a ubiquitous framework for attribution in machine learning, encompassing feature importance, data valuation, and causal inference. However, its exact computation is generally intractable, necessitating efficient approximation methods. While the most effective and popular estimators leverage the paired sampling heuristic to reduce estimation error, the theoretical mechanism driving this improvement has remained opaque. In this work, we provide an elegant and fundamental justification for paired sampling: we prove that the Shapley value depends exclusively on the odd component of the set function, and that paired sampling orthogonalizes the regression objective to filter out the irrelevant even component. Leveraging this insight, we propose OddSHAP, a novel consistent estimator that performs polynomial regression solely on the odd subspace. By utilizing the Fourier basis to isolate this subspace and employing a proxy model to identify high-impact interactions, OddSHAP overcomes the combinatorial explosion of higher-order approximations. Through an extensive benchmark evaluation, we find that OddSHAP achieves state-of-the-art estimation accuracy.


💡 Research Summary

The paper tackles the long‑standing challenge of efficiently estimating Shapley values for arbitrary machine‑learning models, where exact computation is exponential in the number of players (features, data points, or causal variables). While many model‑agnostic estimators rely on Monte‑Carlo sampling or regression on a surrogate function, the paired‑sampling heuristic—where each sampled coalition S is paired with its complement Sᶜ—has been empirically successful but lacked a solid theoretical explanation.

The authors first prove a fundamental structural property: any set function f can be uniquely decomposed into an odd component f_odd(S) = (f(S) − f(Sᶜ))/2 and an even component f_even(S) = (f(S) + f(Sᶜ))/2, and the Shapley value depends exclusively on the odd component (ϕ_i(f) = ϕ_i(f_odd) for all i). The even part is therefore irrelevant for attribution.
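The decomposition can be checked numerically on a small game. The following is a minimal sketch, not code from the paper: `toy_game`, its weights, and the helper names are hypothetical, and the Shapley values are computed by brute force, which is only feasible for a handful of players.

```python
from itertools import combinations
from math import factorial


def shapley_values(f, d):
    """Exact Shapley values of set function f over players 0..d-1 (brute force)."""
    phi = [0.0] * d
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for size in range(d):
            # Classic Shapley weight for a coalition of this size that excludes i.
            weight = factorial(size) * factorial(d - size - 1) / factorial(d)
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (f(s | {i}) - f(s))
    return phi


def odd_part(f, d):
    """f_odd(S) = (f(S) - f(S^c)) / 2."""
    full = frozenset(range(d))
    return lambda s: 0.5 * (f(s) - f(full - s))


def even_part(f, d):
    """f_even(S) = (f(S) + f(S^c)) / 2."""
    full = frozenset(range(d))
    return lambda s: 0.5 * (f(s) + f(full - s))


def toy_game(s):
    """Hypothetical 4-player game: main effects plus two interactions."""
    w = [1.0, -2.0, 0.5, 3.0]
    value = 10.0 + sum(w[i] for i in s)
    if {0, 1} <= s:
        value += 1.5
    if {1, 2, 3} <= s:
        value -= 0.7
    return value


d = 4
phi_full = shapley_values(toy_game, d)
phi_odd = shapley_values(odd_part(toy_game, d), d)
phi_even = shapley_values(even_part(toy_game, d), d)
```

Under these assumptions, `phi_full` and `phi_odd` should agree up to floating-point error, while every entry of `phi_even` should be numerically zero.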

Building on this, they show that paired sampling orthogonalizes the weighted regression problem underlying KernelSHAP and its variants. By simultaneously observing f(S) and f(Sᶜ), the regression splits into two independent sub‑problems: one for the odd part and one for the even part. Because the even part contributes nothing to Shapley values, paired sampling filters it out as noise, reducing variance and improving accuracy. The authors also generalize earlier results showing that paired sampling with a k‑th order polynomial fit is equivalent to a (k + 1)‑th order fit, confirming the conjecture that odd‑order fits gain an extra degree of expressiveness when paired sampling is used.
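The orthogonalization can be illustrated with a small numerical check. This is a sketch under stated assumptions rather than the paper's construction: it uses ±1 parity (Fourier) features, a Shapley-kernel-style weight that depends only on coalition size, and hypothetical helper names (`chi`, `kernel_weight`, `max_cross_product`). It measures the largest weighted inner product between any odd-order and any even-order feature column, first on unpaired samples and then on the same samples with their complements added.

```python
import random
from itertools import combinations
from math import comb


def chi(T, s):
    """Parity feature: product over i in T of (+1 if i in s else -1)."""
    sign = 1
    for i in T:
        if i not in s:
            sign = -sign
    return sign


def kernel_weight(s, d):
    """Size-only weight, symmetric under complement (assumed Shapley-kernel form)."""
    k = len(s)
    return 1.0 / (comb(d, k) * k * (d - k))


def max_cross_product(samples, d, odd_terms, even_terms):
    """Largest |weighted inner product| between an odd and an even feature column."""
    worst = 0.0
    for T in odd_terms:
        for U in even_terms:
            ip = sum(kernel_weight(s, d) * chi(T, s) * chi(U, s) for s in samples)
            worst = max(worst, abs(ip))
    return worst


d = 6
rng = random.Random(0)
full = frozenset(range(d))

# 20 random coalitions of size 1..d-1, then the same set with complements added.
unpaired = [frozenset(rng.sample(range(d), rng.randint(1, d - 1))) for _ in range(20)]
paired = unpaired + [full - s for s in unpaired]

odd_terms = [T for k in (1, 3) for T in combinations(range(d), k)]
even_terms = [T for k in (0, 2) for T in combinations(range(d), k)]

gap_unpaired = max_cross_product(unpaired, d, odd_terms, even_terms)
gap_paired = max_cross_product(paired, d, odd_terms, even_terms)
```

With pairing, each term from S is cancelled by the term from Sᶜ, because an odd-order feature flips sign under complement while an even-order feature and the weight do not. The odd and even blocks of the weighted Gram matrix therefore decouple, which is the sense in which the regression splits into two independent problems.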

Armed with this theory, the paper introduces OddSHAP, a novel consistent estimator that operates solely in the odd subspace. The key design choices are:

  1. Fourier basis – In the Fourier representation of set functions, basis functions of odd cardinality are odd functions, while even‑cardinality bases are even. Thus, selecting only odd‑order Fourier terms directly targets the relevant subspace.
  2. Sparse interaction detection – Even within the odd subspace the number of possible terms grows combinatorially (e.g., O(d³) for third‑order interactions). The authors employ a proxy model (gradient‑boosted trees) to identify a small set of high‑impact odd‑order interactions, leveraging recent interaction‑detection methods such as SPEX and ProxySPEX.
  3. Two‑stage regression – First, the proxy model is trained on the sampled coalitions to produce importance scores for odd‑order Fourier coefficients. Second, a weighted least‑squares regression is solved only on the selected odd basis, dramatically reducing the regression matrix size from |T|² to |T|·m, where |T| is the number of selected odd terms and m is the number of sampled coalitions.
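The odd-only idea behind these design choices can be sketched on a toy game. The code below is a simplified illustration, not OddSHAP itself: it has no proxy model or sparse term selection, it enumerates every complement pair instead of sampling, and it keeps all odd terms up to a chosen order. With full enumeration the odd parity columns are mutually orthogonal, so the least-squares coefficients reduce to simple averages. Converting coefficients to Shapley values uses the identity ϕ_i = Σ 2·a_T/|T| over odd-size T containing i, which holds for the +1/−1 convention used here and is an assumption of this sketch rather than a formula quoted from the summary. All function names are hypothetical.

```python
from itertools import combinations
from math import factorial


def chi(T, s):
    """Parity feature: product over i in T of (+1 if i in s else -1)."""
    sign = 1
    for i in T:
        if i not in s:
            sign = -sign
    return sign


def odd_fourier_shapley(f, d, max_order=3):
    """Shapley estimates from an odd-only parity surrogate fitted on complement pairs."""
    full = frozenset(range(d))
    # One representative per complement pair: every coalition that contains player 0.
    reps = [frozenset((0,) + c) for k in range(d) for c in combinations(range(1, d), k)]
    # Odd targets use both members of each pair.
    targets = [0.5 * (f(s) - f(full - s)) for s in reps]
    terms = [T for k in range(1, max_order + 1, 2) for T in combinations(range(d), k)]
    phi = [0.0] * d
    for T in terms:
        # Orthogonal columns: the least-squares coefficient is a plain average.
        a_T = sum(y * chi(T, s) for y, s in zip(targets, reps)) / len(reps)
        for i in T:
            phi[i] += 2.0 * a_T / len(T)
    return phi


def exact_shapley(f, d):
    """Brute-force reference, feasible only for small d."""
    phi = [0.0] * d
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for size in range(d):
            weight = factorial(size) * factorial(d - size - 1) / factorial(d)
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (f(s | {i}) - f(s))
    return phi


def toy_game(s):
    """Hypothetical 5-player game with interactions up to order three."""
    w = [1.0, -2.0, 0.5, 3.0, 0.25]
    value = sum(w[i] for i in s)
    if {0, 1} <= s:
        value += 1.5
    if {1, 2, 3} <= s:
        value -= 0.7
    return value


d = 5
phi_exact = exact_shapley(toy_game, d)
phi_order3 = odd_fourier_shapley(toy_game, d, max_order=3)
phi_order1 = odd_fourier_shapley(toy_game, d, max_order=1)
```

Because this toy game has no odd structure beyond order three, the order-3 surrogate should reproduce the exact values, while the order-1 surrogate misses the third-order term's share for players 1, 2, and 3. In a realistic setting the pairs would be sampled and the retained terms chosen sparsely, which is where the proxy model described above comes in.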

OddSHAP retains the consistency property of KernelSHAP/PolySHAP: as the query budget m approaches the full coalition space (2ᵈ), the fitted odd surrogate converges to the true odd component, guaranteeing exact recovery of Shapley values.

The empirical evaluation spans eight benchmark datasets (including DistilBERT, ViT‑16, NHANES, and a crime dataset) with dimensions ranging from 14 to 101. Using a budget of roughly 100·d coalition evaluations, OddSHAP achieves the lowest average mean‑squared error and the best rank (average rank 1.50) compared to state‑of‑the‑art methods such as RegressionMSR, PolySHAP, LeverageSHAP, and various permutation‑sampling baselines. Notably, OddSHAP matches or exceeds the accuracy of higher‑order polynomial estimators while being orders of magnitude faster due to the reduced basis size.

In summary, the paper makes three major contributions: (1) a rigorous theoretical justification for paired sampling via the odd/even decomposition of set functions; (2) the design of OddSHAP, a consistent, odd‑only estimator that combines Fourier sparsity with proxy‑driven interaction selection; and (3) extensive empirical evidence that OddSHAP sets a new performance frontier for model‑agnostic Shapley value estimation. The work opens a new direction for attribution research, suggesting that focusing exclusively on the odd component can yield both statistical efficiency and computational scalability across a wide range of explainability applications.

