Non-Linear Trajectory Modeling for Multi-Step Gradient Inversion Attacks in Federated Learning

Notice: This research summary and analysis were automatically generated using AI. For accuracy, please refer to the original arXiv source.

Federated Learning (FL) enables collaborative training while preserving privacy, yet Gradient Inversion Attacks (GIAs) pose severe threats by reconstructing private data from shared gradients. In realistic FedAvg scenarios with multi-step updates, existing surrogate methods such as SME rely on linear interpolation to approximate client trajectories for privacy leakage. However, we show that this linear assumption fundamentally underestimates the nonlinearity of SGD: with only one-dimensional expressiveness, it faces irreducible approximation barriers in non-convex loss landscapes. We propose Non-Linear Surrogate Model Extension (NL-SME), the first framework to introduce learnable quadratic Bézier curves for trajectory modeling in GIAs against FL. NL-SME leverages a $(|w|+1)$-dimensional control-point parameterization combined with dvec scaling and regularization mechanisms to achieve superior approximation accuracy. Extensive experiments on CIFAR-100 and FEMNIST demonstrate that NL-SME significantly outperforms baselines across all metrics, achieving 94%–98% performance gaps and order-of-magnitude improvements in cosine similarity loss while maintaining computational efficiency. This work exposes critical privacy vulnerabilities in FL's multi-step paradigm and provides insights for robust defense development.


💡 Research Summary

This paper presents a significant advancement in Gradient Inversion Attacks (GIAs) against Federated Learning (FL), specifically targeting the prevalent multi-step local update paradigm of FedAvg. The core innovation is the introduction of nonlinear trajectory modeling to overcome the fundamental limitations of existing linear approximation methods.

FL allows multiple clients to collaboratively train a model without sharing raw data, communicating only model parameter updates (gradients). However, GIAs threaten this privacy by attempting to reconstruct clients’ private training data from these shared gradients. While early attacks were effective in single-step scenarios, the realistic FedAvg protocol, where clients perform many (T>1) local Stochastic Gradient Descent (SGD) steps before communicating, poses a major challenge. The Surrogate Model Extension (SME) method addressed this by approximating the client’s parameter trajectory from the initial (w0) to final (wT) weights using simple linear interpolation. The authors identify a critical flaw in this approach: real SGD optimization paths through non-convex loss landscapes are inherently nonlinear. Linear interpolation, with its one-dimensional expressiveness, cannot capture this curvature, leading to irreducible approximation errors and convergence barriers, especially as T increases.
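The linear surrogate that SME uses can be sketched in a few lines. This is a minimal numpy illustration of the idea described above, not the paper's code; the function and variable names are ours.

```python
import numpy as np

def linear_surrogate(w0, wT, t):
    """SME-style linear interpolation between the initial client weights w0
    and the final weights wT; t in [0, 1] selects a point on the segment.
    The whole path lives on a single line segment, which is the
    one-dimensional expressiveness limitation the authors criticize."""
    return (1.0 - t) * w0 + t * wT

# Toy 2-parameter "model": the surrogate can only ever sit on the chord
# between w0 and wT, never on a curved SGD trajectory between them.
w0 = np.array([0.0, 0.0])
wT = np.array([1.0, 2.0])
mid = linear_surrogate(w0, wT, 0.5)  # midpoint of the segment
```

Any true SGD path that bends away from this chord contributes approximation error that no choice of `t` can remove.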

To solve this, the authors propose the Non-Linear Surrogate Model Extension (NL-SME). NL-SME replaces the linear path with a learnable quadratic Bézier curve. A quadratic Bézier curve is defined by two fixed endpoints (w0 and wT) and a learnable control point (P1). By optimizing P1, along with a curve parameter (t) and a novel per-parameter scaling vector (d), NL-SME can model curved trajectories that more accurately mimic the true SGD path. The framework jointly optimizes the surrogate parameters and the dummy data (D̃) to minimize the cosine distance between the observed parameter update (Δw = wT − w0) and the gradient computed on the dummy data at the surrogate model's point. Regularization terms are added to ensure training stability and prevent the control point from deviating excessively.
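The curve and the attack objective described above can be sketched as follows. This is an illustrative numpy sketch under our own naming assumptions (the paper's dvec scaling and regularization terms are omitted for brevity); it is not the authors' implementation.

```python
import numpy as np

def bezier_surrogate(w0, wT, P1, t):
    """Quadratic Bezier point B(t) = (1-t)^2 * w0 + 2t(1-t) * P1 + t^2 * wT.
    Endpoints w0 and wT are fixed by the observed client update;
    the control point P1 (and t itself) are learnable."""
    return (1 - t) ** 2 * w0 + 2 * t * (1 - t) * P1 + t ** 2 * wT

def cosine_distance(u, v, eps=1e-12):
    """1 - cosine similarity between two flattened vectors, e.g. the
    observed update delta_w and the gradient on the dummy data."""
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + eps)

# Toy 3-parameter example: the curve interpolates the endpoints exactly,
# while P1 bends the path off the straight chord between them.
w0 = np.zeros(3)
wT = np.array([1.0, 1.0, 0.0])
P1 = np.array([0.8, 0.1, 0.3])  # learnable control point (illustrative values)
b_mid = bezier_surrogate(w0, wT, P1, 0.5)
delta_w = wT - w0  # the observed parameter update the attacker matches against
```

Because B(0) = w0 and B(1) = wT by construction, the learnable degrees of freedom go entirely into shaping the interior of the trajectory, which is exactly where linear interpolation has zero capacity.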

Extensive experiments on CIFAR-100 and FEMNIST datasets demonstrate NL-SME’s overwhelming superiority. Compared to strong baselines including SME, Inverting Gradients (IG), and Deep Leakage from Gradients (DLG), NL-SME achieves performance gaps of 94% to 98% across all metrics like Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), and Mean Squared Error (MSE). Most strikingly, it reduces the cosine similarity loss by an order of magnitude. The advantages of NL-SME are further amplified under highly heterogeneous (non-IID) data distributions, which are common in real-world FL, suggesting privacy risks in production systems are even more severe than previously estimated. Importantly, NL-SME maintains computational efficiency comparable to SME.

In conclusion, this work makes several key contributions: 1) It rigorously analyzes and exposes the expressiveness bottleneck of linear trajectory assumptions in multi-step GIAs. 2) It proposes NL-SME, the first framework to employ learnable nonlinear parametric curves (quadratic Bézier) for trajectory modeling in this context. 3) It demonstrates unprecedented attack performance improvements, fundamentally raising the assessed threat level for FedAvg-style FL. The research serves as a critical warning for FL system designers and paves the way for developing next-generation defenses that must account for the nonlinear geometry of optimization trajectories.
