Functional Approximation Methods for Differentially Private Distribution Estimation
The cumulative distribution function (CDF) is fundamental for characterizing random variables, making it essential in applications that require privacy-preserving data analysis. This paper introduces a novel framework for constructing differentially private CDFs inspired by functional analysis and the functional mechanism. We develop two variants: a polynomial projection method, which projects the empirical CDF into a polynomial space, and a sparse approximation method via matching pursuit, which projects it into arbitrary function spaces constructed from dictionaries. In both cases, the empirical CDF is approximated within the chosen space, and the corresponding coefficients are privatized to guarantee differential privacy. Compared with existing approaches such as histogram queries and adaptive quantiles, our methods achieve comparable or superior performance. Our methods are particularly well-suited to decentralized settings and scenarios where CDFs must be efficiently updated with newly collected or streaming data. In addition, we investigate the influence of parameters such as dictionary size and systematically evaluate different dictionary constructions, including Legendre polynomials, B-splines, and distribution-based functions. Overall, our contributions advance the development of practical and reliable methods for privacy-preserving CDF estimation.
💡 Research Summary
This paper introduces a novel framework for constructing differentially private (DP) cumulative distribution functions (CDFs) by leveraging functional approximation techniques. The authors observe that existing DP CDF methods, such as histogram queries and adaptive quantiles, are rigid (fixed binning), require multiple communication rounds in decentralized settings, and are inefficient when data arrive incrementally. To address these issues, they first approximate the empirical CDF (eCDF) within a pre‑specified finite‑dimensional function space and then privatize the resulting coefficients. Two concrete instantiations are presented.
- Polynomial Projection (PP) – The eCDF is projected onto a subspace spanned by orthogonal polynomials (the paper uses Legendre polynomials as a concrete example). By the projection theorem, the optimal coefficients are inner products of the eCDF with each basis function, which are directly related to data moments. These coefficients have low ℓ₂‑sensitivity, allowing the addition of calibrated Gaussian (or Laplace) noise to achieve (ε,δ)‑DP. After noise injection, a simple post‑processing step enforces monotonicity and the
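
The polynomial projection idea can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the data are already scaled to [-1, 1], uses NumPy's Legendre utilities, and computes each coefficient exactly as (2k+1)/2 times the average of the integral of P_k from x_i to 1. The function names, the conservative replace-one ℓ₂-sensitivity bound 2·sqrt(K+1)/n (derived here from |P_k| ≤ 1, not taken from the paper), the classical Gaussian-mechanism noise scale, and the clip-plus-running-maximum post-processing are all assumptions made for illustration.

```python
import numpy as np
from numpy.polynomial import legendre as L


def ecdf_legendre_coeffs(x, degree):
    """Legendre coefficients of the eCDF of samples x, assumed to lie in [-1, 1]."""
    x = np.asarray(x, dtype=float)
    coeffs = np.empty(degree + 1)
    for k in range(degree + 1):
        antideriv = L.Legendre.basis(k).integ()  # Q_k with Q_k' = P_k
        # Integral of 1[x_i <= t] * P_k(t) over [-1, 1] equals Q_k(1) - Q_k(x_i).
        tail = antideriv(1.0) - antideriv(x)
        # (2k + 1) / 2 normalizes by the squared norm of P_k on [-1, 1].
        coeffs[k] = (2 * k + 1) / 2.0 * tail.mean()
    return coeffs


def privatize(coeffs, n, epsilon, delta, rng):
    """Add Gaussian noise to the coefficient vector (illustrative calibration).

    Assumption: each record contributes at most 1 in magnitude to every
    coefficient, so replacing one record moves the mean vector by at most
    2 * sqrt(K + 1) / n in l2 norm. The noise scale is the classical
    Gaussian-mechanism formula, which is stated for epsilon in (0, 1).
    """
    sensitivity = 2.0 * np.sqrt(len(coeffs)) / n
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return coeffs + rng.normal(0.0, sigma, size=coeffs.shape)


def evaluate_cdf(coeffs, grid):
    """Evaluate the series on an ascending grid, then clip and make it monotone."""
    raw = L.legval(grid, coeffs)
    return np.maximum.accumulate(np.clip(raw, 0.0, 1.0))


rng = np.random.default_rng(0)
samples = rng.uniform(-1.0, 1.0, size=2000)   # synthetic data for the demo
coeffs = ecdf_legendre_coeffs(samples, degree=6)
noisy = privatize(coeffs, n=len(samples), epsilon=0.5, delta=1e-5, rng=rng)
grid = np.linspace(-1.0, 1.0, 201)
cdf_hat = evaluate_cdf(noisy, grid)
```

Because each coefficient is an average of per-record terms, two parties could in principle combine their coefficient vectors by a sample-size-weighted average, which is consistent with the summary's emphasis on decentralized and incremental updates. The post-processing here does not force the estimate to reach 0 and 1 at the endpoints; that choice is left open in this sketch.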