Near-Optimal Private Tests for Simple and MLR Hypotheses
We develop a near-optimal testing procedure under the framework of Gaussian differential privacy for simple hypotheses, as well as for one- and two-sided hypotheses under monotone likelihood ratio conditions. Our mechanism is based on a private mean estimator with data-driven clamping bounds, whose population risk matches the private minimax rate up to logarithmic factors. Using this estimator, we construct private test statistics that achieve the same asymptotic relative efficiency as the non-private most powerful tests while maintaining conservative type I error control. In addition to our theoretical results, our numerical experiments show that our private tests outperform competing DP methods and offer power comparable to the non-private most powerful tests, even at moderately small sample sizes and privacy loss budgets.
💡 Research Summary
This paper tackles the problem of hypothesis testing under Gaussian Differential Privacy (GDP), focusing on simple hypotheses and one‑ and two‑sided tests that satisfy a monotone likelihood ratio (MLR) condition. The authors introduce a two‑stage methodology that first constructs a private mean estimator with data‑dependent clamping bounds (GDP‑MeanEst) and then builds test statistics from this estimator that achieve near‑optimal power.
The private mean estimator improves upon prior work (Canonne et al., 2019; Huang et al., 2021) by allowing the clamping interval to be refined through a T‑step binary search where each step consumes ε/T of the privacy budget. By treating the bin width w = (b−a)/2^T as a tunable parameter, the algorithm balances discretization error against the noise introduced for privacy. Lemma 3.1 shows that, under the condition that each bin contains at most one data point, the rank error is bounded by τ + 1 with τ = (1/(ε q))·2^T·log(T/β). This yields a logarithmic‑factor improvement over fixed‑width schemes and ensures that the estimator's risk matches the DP minimax lower bound up to log factors: |μ̂_DP − X̄| = O_p(s/(ε √n)).
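The flavor of this two-stage construction can be conveyed with a simplified sketch (not the paper's exact GDP‑MeanEst): a T‑step noisy binary search first locates the data privately, and a Gaussian-noised clamped mean is then released. The clamping radius `r` is taken as a user-supplied parameter here, whereas the paper derives the bounds data-adaptively; the budget split `mu_total / sqrt(T + 1)` follows the standard GDP composition rule for T + 1 Gaussian-mechanism releases.

```python
import numpy as np

def gdp_mean_estimate(x, a, b, r, mu_total=1.0, T=8, rng=None):
    """Illustrative sketch of a private mean with data-driven clamping.

    Stage 1: T-step noisy binary search inside [a, b] for a rough
             location (the median), one noisy count per step.
    Stage 2: clamp the data around that location and release a
             Gaussian-noised mean.

    Each of the T + 1 releases gets budget mu_total / sqrt(T + 1), so
    the whole procedure satisfies mu_total-GDP by GDP composition.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = len(x)
    mu_step = mu_total / np.sqrt(T + 1)

    lo, hi = a, b
    for _ in range(T):
        mid = (lo + hi) / 2.0
        # The count of points below mid has sensitivity 1, so Gaussian
        # noise with std 1/mu_step makes this step mu_step-GDP.
        noisy_count = np.sum(x <= mid) + rng.normal(0.0, 1.0 / mu_step)
        if noisy_count < n / 2.0:
            lo = mid  # median appears to lie above mid
        else:
            hi = mid
    m_hat = (lo + hi) / 2.0  # private rough location estimate

    # Clamp to a data-driven interval, then add noise scaled to the
    # clamped mean's sensitivity 2r/n.
    clamped = np.clip(x, m_hat - r, m_hat + r)
    sens = 2.0 * r / n
    return clamped.mean() + rng.normal(0.0, sens / mu_step)
```

Because the clamp is centered where the data actually live, the interval width (and hence the noise scale) no longer depends on the crude a-priori range [a, b], which is the source of the improvement over fixed-clamping baselines.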
Using this estimator, the authors construct a test statistic Ẑ = (μ̂_DP − μ₀)/σ̂. Under the MLR assumption, the log‑likelihood ratio is a monotone function of the sufficient statistic, so Ẑ inherits the same asymptotic normal distribution as the non‑private most powerful test. The paper proves that the test attains the correct asymptotic level α (conservative type I error control) and that its asymptotic relative efficiency (ARE) with respect to the classical uniformly most powerful (UMP) test equals one in both the Pitman (local alternatives) and Bahadur (fixed alternatives) frameworks. Consequently, the private test requires essentially the same sample size as its non‑private counterpart to achieve a given power.
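A minimal sketch of the resulting test, assuming for simplicity a known σ and a known standard deviation `noise_sd` of the privacy noise added to the mean estimate (both simplifications relative to the paper, which also estimates σ privately): folding the privacy noise variance into the null variance is what keeps type I error control conservative.

```python
import numpy as np
from statistics import NormalDist

def private_z_test(mu_hat_dp, mu0, sigma, n, noise_sd, alpha=0.05):
    """One-sided test of H0: mu = mu0 vs H1: mu > mu0 from a private
    mean estimate mu_hat_dp.

    The standard error accounts for both the sampling variance
    sigma**2 / n and the privacy noise variance noise_sd**2, so the
    null distribution of z_hat is approximately standard normal and
    the level-alpha test remains conservative.
    """
    se = np.sqrt(sigma**2 / n + noise_sd**2)
    z_hat = (mu_hat_dp - mu0) / se
    p_value = 1.0 - NormalDist().cdf(z_hat)
    return z_hat, p_value, p_value < alpha
```

When `noise_sd` is of order s/(ε√n), it is dominated asymptotically by the σ/√n sampling term, which is the intuition behind the ARE-equals-one results in both the Pitman and Bahadur senses.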
Empirical evaluations confirm the theoretical claims. In mean‑estimation experiments, GDP‑MeanEst reduces mean‑squared error by 10–30 % compared with standard DP estimators (Laplace or Gaussian mechanisms with fixed clamping), especially for privacy budgets ε ≈ 0.5–1.0. In hypothesis‑testing simulations, the proposed DP tests outperform DP versions of Kolmogorov‑Smirnov, χ², and Cramér‑von‑Mises tests, delivering 5–15 % higher power at the same significance level while maintaining strict type I error control. Notably, even with moderate sample sizes (n ≈ 50) and modest privacy loss, the private tests achieve power comparable to the non‑private most powerful tests.
Overall, the paper makes three substantive contributions: (1) a data‑driven clamping mechanism that yields a private mean estimator achieving near‑minimax risk, (2) a unified testing framework for simple and MLR hypotheses that attains the same asymptotic efficiency as the classical UMP tests, and (3) a rigorous analysis of the trade‑off between discretization error and privacy noise, enabling practical deployment of DP hypothesis testing in small‑sample scientific studies. This work substantially narrows the gap between private and non‑private inference, paving the way for broader adoption of differential privacy in rigorous statistical practice.