Auditing a collection of races simultaneously

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

A collection of races in a single election can be audited as a group by auditing a random sample of batches of ballots and combining observed discrepancies in the races represented in those batches in a particular way: the maximum across-race relative overstatement of pairwise margins (MARROP). A risk-limiting audit for the entire collection of races can be built on this ballot-based auditing using a variety of probability sampling schemes. The audit controls the familywise error rate (the chance that one or more incorrect outcomes fails to be corrected by a full hand count) at a cost that can be lower than that of controlling the per-comparison error rate with independent audits. The approach is particularly efficient if batches are drawn with probability proportional to a bound on the MARROP (PPEB sampling).

💡 Research Summary

The paper addresses the practical challenge of auditing many contests that appear on the same set of ballots in a single election. Conducting separate risk‑limiting audits (RLAs) for each contest quickly becomes costly and logistically burdensome. To overcome this, the author introduces a single summary statistic called the Maximum Across‑Race Relative Overstatement of Pairwise margins (MARROP). For each audited batch of ballots p, the relative overstatement e_{pwℓ} is computed for every contest r and every winner‑loser pair (w,ℓ) that appears in that contest. The batch’s error contribution e_p is the maximum of these values across all contests and pairs. Summing over all batches yields E = Σ_p e_p. If E < 1, the apparent winners of every contest are guaranteed to be the true winners; thus testing the null hypothesis “E ≥ 1” directly controls the family‑wise error rate (FWER) for the whole set of contests.
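The per-batch statistic can be sketched in a few lines of Python. This is a minimal illustration, not the paper's code: the data layout and the names `batch_marrop`, `reported`, `audited`, `pairs`, and `margins` are hypothetical, and the sketch assumes the usual definition of relative overstatement (the reported margin in the batch minus the hand-counted margin in the batch, divided by the overall reported margin V_{wℓ}).

```python
from typing import Dict, List, Tuple

# Hypothetical layout: contest -> candidate -> votes in this one batch.
Counts = Dict[str, Dict[str, int]]


def batch_marrop(
    reported: Counts,
    audited: Counts,
    pairs: Dict[str, List[Tuple[str, str]]],
    margins: Dict[Tuple[str, str, str], int],
) -> float:
    """Return e_p, the maximum relative overstatement over all contests
    and (winner, loser) pairs represented in one batch.

    Assumed definition of one term:
    ((reported w - reported l) - (audited w - audited l)) / V_wl.
    """
    values = []
    for contest, wl_pairs in pairs.items():
        if contest not in reported:
            continue  # this contest is not on the ballots in this batch
        for w, l in wl_pairs:
            reported_margin = reported[contest][w] - reported[contest][l]
            audited_margin = audited[contest][w] - audited[contest][l]
            overall_margin = margins[(contest, w, l)]
            values.append((reported_margin - audited_margin) / overall_margin)
    # A batch containing none of the contests contributes no error.
    return max(values, default=0.0)


# Illustrative numbers only (not from the paper).
pairs = {"A": [("w", "l")], "B": [("x", "y")]}
margins = {("A", "w", "l"): 2000, ("B", "x", "y"): 500}
reported = {"A": {"w": 120, "l": 80}, "B": {"x": 100, "y": 90}}
audited = {"A": {"w": 118, "l": 81}, "B": {"x": 100, "y": 90}}
e_p = batch_marrop(reported, audited, pairs, margins)
```

Summing `batch_marrop` over every batch would give E, which is why E can only be known exactly after a full hand count.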

Because e_p is known only for batches that have actually been hand counted, the paper derives an a priori upper bound u_p for every batch from the batch’s size and the reported results: u_p = max_{r,w,ℓ} (v_{wp}−v_{ℓp}+b_{rp}) / V_{wℓ}, where the maximum runs over every contest r and every winner‑loser pair (w,ℓ) in that contest, v_{wp} and v_{ℓp} are the reported votes for winner w and loser ℓ in batch p, b_{rp} is an upper bound on the number of valid votes for contest r in that batch, and V_{wℓ} is the reported overall margin of w over ℓ. Summing the u_p values over all batches gives the total error bound U = Σ_p u_p.
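The bound formula translates directly into code. The following is a sketch under the same kind of assumptions as any illustration here: the function name `batch_error_bound` and the dictionary layout are hypothetical, and the numbers are made up rather than taken from the paper.

```python
from typing import Dict, List, Tuple


def batch_error_bound(
    reported: Dict[str, Dict[str, int]],
    ballot_bounds: Dict[str, int],
    pairs: Dict[str, List[Tuple[str, str]]],
    margins: Dict[Tuple[str, str, str], int],
) -> float:
    """Return u_p = max over contests r and pairs (w, l) of
    (v_wp - v_lp + b_rp) / V_wl for a single batch p."""
    values = []
    for contest, wl_pairs in pairs.items():
        if contest not in reported:
            continue  # contest does not appear in this batch
        b_rp = ballot_bounds[contest]  # bound on valid votes in the batch
        for w, l in wl_pairs:
            v_wp = reported[contest][w]
            v_lp = reported[contest][l]
            values.append((v_wp - v_lp + b_rp) / margins[(contest, w, l)])
    return max(values, default=0.0)


# Illustrative numbers only (not from the paper).
pairs = {"A": [("w", "l")], "B": [("x", "y")]}
margins = {("A", "w", "l"): 2000, ("B", "x", "y"): 500}
reported = {"A": {"w": 120, "l": 80}, "B": {"x": 100, "y": 90}}
ballot_bounds = {"A": 200, "B": 200}

u_p = batch_error_bound(reported, ballot_bounds, pairs, margins)
# Contest A gives (120 - 80 + 200) / 2000 = 0.12,
# contest B gives (100 - 90 + 200) / 500 = 0.42, so u_p = 0.42.
```

In this toy batch the contest with the smaller overall margin dominates the bound, which is the typical pattern: narrow contests drive u_p and therefore the sampling weights.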

Sampling can be performed in several ways; the paper focuses on probability‑proportional‑to‑error‑bound (PPEB) sampling, in which each draw selects batch p with probability u_p / U, independently and with replacement. For each selected batch the “taint” τ_p = e_p / u_p is recorded. Using the Kaplan‑Markov inequality, a P‑value P = ∏_{i=1}^n (1 − 1/U) / (1 − T_i) is constructed, where T_i is the taint observed on the i‑th draw and n is the number of draws. If P falls below a pre‑specified risk limit α (e.g., 0.25), the audit stops and the election outcome is certified; otherwise a full hand count is triggered.
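The selection step can be sketched with Python's standard library. This is an illustration only: `ppeb_sample` and `taint` are hypothetical names, and seeding a pseudo-random generator with an integer is an assumption made for reproducibility here; the summary does not say how the paper generates its random draws.

```python
import random
from typing import Dict, List


def ppeb_sample(bounds: Dict[str, float], n: int, seed: int = 0) -> List[str]:
    """Draw n batch identifiers with replacement, each draw selecting
    batch p with probability u_p / U, where U is the sum of all bounds."""
    rng = random.Random(seed)  # assumption: a simple seeded PRNG
    batch_ids = list(bounds)
    weights = [bounds[p] for p in batch_ids]
    # random.choices normalizes the weights, so U need not be computed here.
    return rng.choices(batch_ids, weights=weights, k=n)


def taint(e_p: float, u_p: float) -> float:
    """Observed error in a batch as a fraction of its error bound."""
    return e_p / u_p


# Illustrative bounds only (not from the paper).
bounds = {"p1": 0.05, "p2": 0.0, "p3": 0.08}
draws = ppeb_sample(bounds, n=50, seed=1)
```

Because draws are made with replacement, the same batch can appear more than once in `draws`; it only needs to be hand counted once, but its taint enters the test statistic once per draw.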

An illustrative example involves a hypothetical jurisdiction with 200 precincts, three overlapping contests (A, B, C), and 400 auditable batches (in‑precinct and vote‑by‑mail). The margins are modest (≈10‑15 % for losers), and each batch contains either 400 or 200 ballots. The paper computes u_p for eight batch types, obtaining values ranging from 0.0350 to 0.0852, and finds U = 22.718. With a PPEB design that guarantees at least a 75 % chance of a full hand count if any contest outcome is wrong (α = 0.25), drawing n = 36 batches with replacement yields an expected 34.3 distinct batches (≈8.6 % of all batches) and about 11,400 ballots (≈9.5 % of the total). If five of the 36 draws show taint τ = 0.04 and the remaining 31 show τ = 0, then P = 0.243, which is below α = 0.25, so the audit can stop.
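The reported P-value can be checked numerically. The sketch below assumes the standard Kaplan-Markov form, in which each draw contributes a factor (1 − 1/U) / (1 − taint); the function name `kaplan_markov_p` is hypothetical. With the example's figures (U = 22.718, 36 draws, five taints of 0.04) this form reproduces the stated 0.243.

```python
from typing import Iterable


def kaplan_markov_p(taints: Iterable[float], total_bound: float) -> float:
    """Kaplan-Markov P-value for PPEB draws (assumed standard form):
    the product over draws of (1 - 1/U) / (1 - taint)."""
    p = 1.0
    for t in taints:
        if t >= 1.0:
            # A taint of 1 or more makes the factor undefined or negative;
            # this sketch simply refuses such input.
            raise ValueError("taint must be less than 1")
        p *= (1.0 - 1.0 / total_bound) / (1.0 - t)
    return p


U = 22.718
taints = [0.04] * 5 + [0.0] * 31  # 36 draws in total
p_value = kaplan_markov_p(taints, U)
print(round(p_value, 3))  # 0.243
```

Each error-free draw multiplies the P-value by 1 − 1/U ≈ 0.956, so a smaller total bound U lets the audit reach the risk limit in fewer draws.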

The paper compares three strategies: (1) a single MARROP‑based audit controlling FWER at α; (2) independent audits of each contest controlling the per‑comparison error rate (PCER) at α; and (3) separate audits of each contest while still controlling FWER by allocating risk across contests. The MARROP approach achieves the desired FWER with fewer batches and less work than independent audits, and it avoids the inflated overall error probability that can arise when PCER is used for multiple contests.
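The gap between per-comparison and familywise control can be made concrete with standard multiple-testing arithmetic. The sketch below is not taken from the paper: it assumes the separate audits are statistically independent, and the Bonferroni and Šidák allocations shown are generic ways to split a familywise limit, not necessarily the allocation the paper uses.

```python
def familywise_bound_independent(alpha: float, k: int) -> float:
    """Upper bound on the chance that at least one of k wrong outcomes
    escapes correction when k independent audits each run at risk alpha."""
    return 1.0 - (1.0 - alpha) ** k


def bonferroni_per_contest(alpha: float, k: int) -> float:
    """Per-contest risk limit that keeps the familywise risk at most alpha
    without any independence assumption."""
    return alpha / k


def sidak_per_contest(alpha: float, k: int) -> float:
    """Per-contest risk limit giving familywise risk exactly alpha
    when the k audits are independent."""
    return 1.0 - (1.0 - alpha) ** (1.0 / k)


# Three contests at alpha = 0.25, matching the scale of the example.
fw = familywise_bound_independent(0.25, 3)  # 1 - 0.75**3 = 0.578125
bonf = bonferroni_per_contest(0.25, 3)
sidak = sidak_per_contest(0.25, 3)
```

Under these assumptions, three independent audits at α = 0.25 each could let a wrong outcome slip through with probability up to about 0.58, while restoring a familywise limit of 0.25 requires each separate audit to run at a risk limit below 0.1.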

Key contributions are: (i) the definition of MARROP as a unifying error metric for simultaneous contests; (ii) a rigorous risk‑limiting audit framework that tests the family of contest outcomes as a whole; and (iii) a demonstration that PPEB sampling, guided by batch‑specific error bounds, yields substantial efficiency gains over naïve independent audits. The methodology provides election officials with a statistically sound, operationally feasible tool for auditing complex elections where many races share the same ballots.
