Sequential stratified inference for the mean

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

We develop conservative tests for the mean of a bounded population under stratified sampling and apply them to risk-limiting post-election audits. The tests are ``anytime valid’’ under sequential sampling, allowing optional stopping in each stratum. Our core method expresses a global hypothesis about the population mean as a union of intersection hypotheses describing within-stratum means. It tests each intersection hypothesis using independent test supermartingales (TSMs) combined across strata by multiplication. A $P$-value for each intersection hypothesis is the reciprocal of that test statistic, and the largest $P$-value in the union is a $P$-value for the global hypothesis. This approach has two primary moving parts: the rule selecting which stratum to draw from next given the sample so far, and the form of the TSM within each stratum. These rules may vary over intersection hypotheses. We construct the test with the smallest expected stopping time and present a few strategies for approximating that optimum. In instances that arise in auditing and other applications, its expected sample size is substantially smaller than that of previous methods.

💡 Research Summary

This paper addresses the problem of making non‑parametric, finite‑sample‑valid inference on the mean of a bounded population when the data are collected by stratified sampling. The authors develop a family of sequential, “anytime‑valid” hypothesis tests that allow optional stopping within each stratum, a feature especially valuable for risk‑limiting audits (RLAs) and other regulatory audits where the analyst may wish to stop as soon as sufficient evidence is accumulated.

The central methodological innovation is to rewrite the global null hypothesis
(H_{0}: \mu(X)\le \eta_{0})
as a union of intersection hypotheses. Let (\eta = (\eta_{1},\dots,\eta_{K})) denote a vector of stratum‑specific mean bounds. The set
(E_{0}= {\eta\in

Sequential stratified inference for the mean

💡 Research Summary

Comments & Academic Discussion

Leave a Comment