EVE: Efficient Verification of Data Erasure through Customized Perturbation in Approximate Unlearning


Verifying whether the machine unlearning process has been properly executed is critical but remains underexplored. Some existing approaches propose unlearning verification methods based on backdooring techniques. However, these methods typically require participation in the model's initial training phase to backdoor the model for later verification, which is inefficient and impractical. In this paper, we propose an efficient verification of erasure method (EVE) for verifying machine unlearning without requiring involvement in the model's initial training process. The core idea is to perturb the unlearning data to ensure the model prediction of the specified samples will change before and after unlearning with perturbed data. The unlearning users can leverage the observation of the changes as a verification signal. Specifically, the perturbations are designed with two key objectives: ensuring the unlearning effect and altering the unlearned model's prediction of target samples. We formalize the perturbation generation as an adversarial optimization problem, solving it by aligning the unlearning gradient with the gradient of boundary change for target samples. We conducted extensive experiments, and the results show that EVE can verify machine unlearning without involving the model's initial training process, unlike backdoor-based methods. Moreover, EVE significantly outperforms state-of-the-art unlearning verification methods, offering significant speedup in efficiency while enhancing verification accuracy. The source code of EVE is released at https://anonymous.4open.science/r/EVE-C143, providing a novel tool for verification of machine unlearning.


💡 Research Summary

The paper addresses a critical but under‑explored problem in machine learning as a service (MLaaS): how can a user verify that a request to “be forgotten” has been correctly executed by the service provider? Existing verification techniques rely on backdoor insertion during the original model training phase, which is impractical because it requires prior knowledge of which data will later need to be erased and introduces unnecessary overhead.

EVE (Efficient Verification of Erasure) proposes a fundamentally different approach that does not involve the initial training process at all. The key insight is that the user who submits the data to be unlearned can deliberately perturb that data before sending it to the server. The perturbation is crafted to satisfy two simultaneous objectives: (1) it must not diminish the effectiveness of the unlearning algorithm (the perturbed samples should still be erased as if they were clean), and (2) it must cause a measurable shift in the decision boundary of the model after unlearning, such that a set of pre‑selected “verification samples”—which the original model classifies with high confidence—are mis‑predicted after the unlearning step.

Formally, the authors model this as a bilevel optimization problem. The lower level corresponds to the chosen approximate unlearning algorithm U applied to the perturbed deletion set D_u + δ, yielding an unlearned model θ_u. The upper level seeks a perturbation δ that maximizes the loss on the verification samples (forcing mis‑classification) while keeping the unlearning loss low and respecting an ℓ∞ norm bound (‖δ‖_∞ ≤ d) to ensure stealthiness. Directly solving the bilevel problem is intractable, so the paper introduces a gradient‑matching relaxation: it aligns the gradient of the unlearning loss on the perturbed deletion set with the gradient of the verification loss that pushes the target samples toward an incorrect label, both taken with respect to the model parameters. The alignment is enforced by minimizing one minus the cosine similarity of the two gradients, yielding the objective φ(δ, θ) = 1 − cos(∇_θ L_u(D_u + δ; θ), ∇_θ L_u((x_t, y′); θ_u)), where y′ ≠ y_t denotes an incorrect label for the verification sample x_t.
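To make the alignment objective concrete, here is a minimal NumPy sketch for a binary logistic model, where the loss gradient has a closed form. The model, data shapes, and the helper names (`grad_logistic`, `alignment_objective`) are illustrative assumptions, not the paper's implementation; in practice the gradients would come from autodiff over a deep network.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_logistic(theta, X, y):
    """Gradient of the binary cross-entropy loss w.r.t. theta
    for a logistic model p = sigmoid(X @ theta)."""
    p = sigmoid(X @ theta)
    return X.T @ (p - y) / len(y)

def alignment_objective(theta, X_u, y_u, delta, x_t, y_flip):
    """phi(delta, theta) = 1 - cos(g_unlearn, g_verify): small when
    erasing the perturbed data also moves the decision boundary
    against the verification sample (x_t, y_flip)."""
    g_unlearn = grad_logistic(theta, X_u + delta, y_u)
    g_verify = grad_logistic(theta, x_t[None, :], np.array([y_flip]))
    denom = np.linalg.norm(g_unlearn) * np.linalg.norm(g_verify) + 1e-12
    return 1.0 - float(g_unlearn @ g_verify) / denom

rng = np.random.default_rng(0)
theta = rng.normal(size=3)                      # fixed pre-trained weights
X_u = rng.normal(size=(8, 3))                   # deletion set D_u
y_u = rng.integers(0, 2, size=8).astype(float)
x_t, y_t = rng.normal(size=3), 1.0              # verification sample
delta = np.zeros_like(X_u)                      # perturbation to optimize
phi = alignment_objective(theta, X_u, y_u, delta, x_t, 1.0 - y_t)
print(round(phi, 4))  # phi lies in [0, 2]; 0 means perfectly aligned gradients
```

The optimizer would then drive φ toward zero over δ, subject to the ℓ∞ bound.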

To compute δ efficiently, the authors first adopt an Adam‑based multi‑restart scheme similar to prior adversarial attacks, but they observe that this approach can fail to find a suitable perturbation within a limited number of restarts. They therefore propose a "perturbation descent" strategy: treat δ as the only trainable parameter while keeping the model parameters θ fixed, and iteratively update δ to minimize φ. This yields more stable convergence and reduces the dependence on random restarts.
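The perturbation-descent loop can be sketched as projected gradient descent over δ alone. This is a toy NumPy illustration, not the paper's code: central finite differences stand in for autodiff, and a simple quadratic stands in for φ(δ, θ) with θ held fixed.

```python
import numpy as np

def numerical_grad(f, delta, h=1e-5):
    """Central-difference gradient of a scalar objective w.r.t. delta
    (a stand-in for backpropagating through phi)."""
    g = np.zeros_like(delta)
    it = np.nditer(delta, flags=["multi_index"])
    for _ in it:
        idx = it.multi_index
        old = delta[idx]
        delta[idx] = old + h; f_plus = f(delta)
        delta[idx] = old - h; f_minus = f(delta)
        delta[idx] = old
        g[idx] = (f_plus - f_minus) / (2 * h)
    return g

def perturbation_descent(f, delta, eps, lr=0.1, steps=50):
    """Minimize f over delta with the model weights held fixed,
    projecting each step onto the l-infinity ball ||delta||_inf <= eps."""
    for _ in range(steps):
        delta = delta - lr * numerical_grad(f, delta)
        delta = np.clip(delta, -eps, eps)  # projection onto the norm ball
    return delta

# Quadratic placeholder for phi: pulled toward `target`, clipped at eps=0.5.
target = np.array([0.3, -0.7, 0.2])
f = lambda d: float(np.sum((d - target) ** 2))
delta = perturbation_descent(f, np.zeros(3), eps=0.5)
print(delta)  # each entry respects the 0.5 bound
```

The projection step is what keeps the perturbation stealthy: components that would exceed the bound (here, −0.7) get pinned to the ball's surface (−0.5).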

Beyond the optimization, EVE incorporates a statistical hypothesis‑testing layer. After unlearning, the user queries the server on the verification samples and records the predicted class probabilities. A paired statistical test (e.g., a paired t‑test or a non‑parametric alternative) compares the pre‑unlearning and post‑unlearning probability distributions. If the resulting p‑value falls below a pre‑specified significance level α, the user can conclude with statistical confidence that the model's behavior on the verification samples has changed, i.e., that unlearning was performed. This statistical guarantee distinguishes EVE from heuristic‑only methods.
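As a self-contained illustration of this testing layer, the sketch below uses an exact two-sided sign test (a simple non-parametric alternative to the paired t-test; the paper's exact test choice may differ) on the true-class confidences before and after unlearning. The confidence values are made up for the example.

```python
from math import comb

def sign_test(before, after, alpha=0.05):
    """Exact two-sided sign test on paired prediction confidences.
    A small p-value means the post-unlearning model behaves
    differently on the verification samples."""
    diffs = [a - b for a, b in zip(after, before) if a != b]  # drop ties
    n = len(diffs)
    k = sum(d > 0 for d in diffs)  # number of confidence increases
    # two-sided binomial tail probability under H0: P(increase) = 0.5
    tail = min(k, n - k)
    p = 2 * sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    p = min(p, 1.0)
    return p, p < alpha

# hypothetical true-class confidences, before vs. after unlearning
before = [0.97, 0.95, 0.99, 0.96, 0.98, 0.94, 0.97, 0.95]
after  = [0.31, 0.22, 0.40, 0.18, 0.27, 0.35, 0.29, 0.24]
p, unlearned = sign_test(before, after)
print(p, unlearned)  # → 0.0078125 True
```

Since all eight confidences dropped, the two-sided p-value is 2·(1/2)⁸ ≈ 0.008 < α = 0.05, so the user would accept that unlearning took place.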

The experimental evaluation spans four benchmark image datasets (CIFAR‑10, CIFAR‑100, SVHN, and an ImageNet subset) and three representative approximate unlearning algorithms (SISA, Fisher‑based, and gradient‑based unlearning). The baselines include several backdoor‑based verifiers (MIB, Athena, Verify‑in‑the‑dark) and the recent posterior‑difference verifier (T‑APE). Results show that EVE achieves:

  1. Speedup – because it bypasses any need to retrain or modify the original model, verification is on average about five times faster than with backdoor‑based methods.
  2. Higher verification accuracy – the rate of correctly detecting whether unlearning was performed rises by 2–4 % relative to the best existing baselines.
  3. Negligible impact on model utility – the ℓ∞‑bounded perturbations cause less than 0.2 % drop in overall test accuracy, confirming that the unlearning effectiveness is preserved.
  4. Robustness across algorithms and datasets – the method works consistently for all three unlearning techniques and across all datasets, demonstrating its generality.

In summary, the contributions of the paper are:

  • A novel verification framework that operates solely on the unlearning step, eliminating the need for any prior backdoor insertion.
  • An adversarial‑style perturbation generation method based on gradient matching and a dedicated perturbation‑descent optimizer, ensuring both verification signal strength and unlearning efficacy.
  • The integration of hypothesis testing to provide statistical confidence in verification outcomes.
  • Extensive empirical validation showing substantial efficiency gains and improved accuracy over state‑of‑the‑art verifiers.

The authors acknowledge some limitations: generating δ still incurs non‑trivial computational cost, especially for very large models; the selection of verification samples currently requires manual curation; and the theoretical analysis assumes a fixed pre‑trained model for gradient alignment, which may not hold for highly non‑convex landscapes. Future work could explore lightweight perturbation generation, automated sample selection, and tighter theoretical guarantees for a broader class of unlearning algorithms.

Overall, EVE offers a practical, scalable, and statistically sound solution for data‑erasure verification in real‑world MLaaS deployments, directly addressing regulatory demands such as GDPR’s “right to be forgotten.”

