TinyGuard: A Lightweight Byzantine Defense for Resource-Constrained Federated Learning via Statistical Update Fingerprints
Existing Byzantine-robust aggregation mechanisms typically rely on full-dimensional gradient comparisons or pairwise distance computations, resulting in computational overhead that limits their applicability in large-scale and resource-constrained federated systems. This paper proposes TinyGuard, a lightweight Byzantine defense that augments the standard FedAvg algorithm with statistical update fingerprinting. Instead of operating directly on high-dimensional gradients, TinyGuard extracts compact statistical fingerprints capturing key behavioral properties of client updates, including norm statistics, layer-wise ratios, sparsity measures, and low-order moments. Byzantine clients are identified by measuring robust statistical deviations in this low-dimensional fingerprint space with O(nd) complexity, without modifying the underlying optimization procedure. Extensive experiments on MNIST and Fashion-MNIST, with ViT-Lite and ViT-Small models using LoRA adapters, demonstrate that TinyGuard preserves FedAvg convergence in benign settings and achieves up to 95% accuracy under multiple Byzantine attack scenarios, including sign-flipping, scaling, noise injection, and label poisoning. Against adaptive white-box adversaries, a Pareto-frontier analysis across four orders of magnitude confirms that attackers cannot simultaneously evade detection and achieve effective poisoning, a property we term statistical handcuffs. Ablation studies validate stable detection precision of approximately 0.8 across varying client counts (50–150), threshold parameters, and extreme data heterogeneity. The proposed framework is architecture-agnostic and well-suited for federated fine-tuning of foundation models, where traditional Byzantine defenses become impractical.
💡 Research Summary
TinyGuard introduces a lightweight Byzantine‑robust defense for federated learning that operates as a pre‑aggregation filter compatible with the standard FedAvg algorithm. Instead of comparing full‑dimensional gradients—a process that incurs O(n²d) computational cost—TinyGuard compresses each client’s update into a low‑dimensional statistical fingerprint. The fingerprint consists of (1) gradient norm statistics (L2, L1, L∞), (2) layer‑wise L2‑norm ratios, (3) low‑order moments (mean, variance, and skewness), (4) sparsity measured by the proportion of near‑zero entries, and (5) top‑k magnitude concentration. These m features (m ≪ d) are computed with simple arithmetic, yielding O(nd) time for all clients and O(nm) ≈ O(n) for the subsequent detection step.
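A minimal sketch of what such a fingerprint extractor might look like, assuming a flattened update vector and known per-layer index ranges (the function name, feature ordering, and thresholds here are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def fingerprint(update, layer_slices, k=10, eps=1e-6):
    """Compress a flat client update into a low-dimensional statistical fingerprint.

    update: 1-D array of update entries; layer_slices: slices into `update`,
    one per layer; k: number of top-magnitude entries for the concentration feature.
    """
    u = np.asarray(update, dtype=np.float64)
    feats = [
        np.linalg.norm(u, 2),                              # L2 norm
        np.linalg.norm(u, 1),                              # L1 norm
        np.max(np.abs(u)),                                 # L-infinity norm
        u.mean(),                                          # mean
        u.var(),                                           # variance
        ((u - u.mean()) ** 3).mean() / (u.std() + eps) ** 3,  # skewness
        np.mean(np.abs(u) < eps),                          # sparsity: near-zero fraction
        np.sort(np.abs(u))[-k:].sum() / (np.abs(u).sum() + eps),  # top-k mass
    ]
    # layer-wise L2-norm ratios relative to the whole update
    total = np.linalg.norm(u, 2) + eps
    feats += [np.linalg.norm(u[s], 2) / total for s in layer_slices]
    return np.array(feats)
```

Each pass over the update is linear in d, so fingerprinting all n clients costs O(nd) as described above, and everything downstream only touches the m-dimensional fingerprints.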
On the server side, robust statistics (median and median absolute deviation) are used to estimate a central fingerprint μ and a scale σ. For each client i, a Mahalanobis‑like distance d_i = ‖(φ_i − μ)/σ‖₂ is calculated; if d_i exceeds a pre‑set threshold λ (empirically chosen in the range 2.5–10), the client is flagged as Byzantine and its update is discarded. The remaining updates are aggregated by the unchanged FedAvg rule, guaranteeing that in the absence of attacks the algorithm behaves identically to vanilla FedAvg.
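The server-side scoring step described above can be sketched as follows; the per-feature median/MAD normalization and the threshold λ follow the summary's description, while `flag_byzantine` and its exact signature are hypothetical:

```python
import numpy as np

def flag_byzantine(fps, lam=3.0, eps=1e-9):
    """Flag outlier clients from an (n_clients, m) fingerprint matrix.

    Returns a boolean mask: True means the client is flagged and its
    update should be excluded before the usual FedAvg aggregation.
    """
    fps = np.asarray(fps, dtype=np.float64)
    mu = np.median(fps, axis=0)                         # robust center per feature
    sigma = np.median(np.abs(fps - mu), axis=0) + eps   # MAD scale per feature
    d = np.linalg.norm((fps - mu) / sigma, axis=1)      # Mahalanobis-like distance
    return d > lam
```

Because the filter only removes flagged updates and leaves the aggregation rule untouched, rounds with no flags reduce exactly to vanilla FedAvg.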
The authors evaluate TinyGuard on MNIST and Fashion‑MNIST, using both convolutional networks and Vision Transformers (ViT‑Lite and ViT‑Small) fine‑tuned with LoRA adapters. Experiments span 50, 100, and 150 clients, Byzantine fractions up to half the population, and a suite of attacks: sign‑flipping, scaling (β ≫ 1), Gaussian noise injection, label poisoning, and adaptive white‑box attacks. TinyGuard consistently preserves 95%–97% test accuracy under attack, outperforms classical robust aggregators (Krum, Trimmed Mean, Bulyan) by 3–5% in accuracy, and achieves detection precision around 0.80 across all settings.
A Pareto‑frontier analysis of adaptive white‑box attacks shows that attackers cannot simultaneously minimize detection probability and maximize poisoning effect; this “statistical handcuffs” property stems from the need to manipulate multiple independent statistical descriptors at once, dramatically raising the attack cost. Ablation studies reveal that layer‑wise ratio and sparsity features contribute most to detection power, while the choice of λ trades off false‑positive rate against detection recall. The method remains stable when the data heterogeneity parameter α is set to 0.1 (highly non‑IID) and when the client count varies, confirming robustness to realistic federated scenarios.
From a computational perspective, TinyGuard reduces server‑side overhead by an order of magnitude compared with O(n² d) methods, making it suitable for resource‑constrained edge devices and for federated fine‑tuning of large foundation models where full‑gradient processing is prohibitive. The approach is architecture‑agnostic because fingerprints are derived solely from update statistics, not from model topology.
Limitations include the lack of experiments on truly massive models (billions of parameters) and the reliance on static statistical descriptors, which may be vulnerable to highly sophisticated attacks that precisely mimic honest statistics. Future work is suggested to extend the fingerprint set, incorporate dynamic threshold adaptation, and evaluate on real‑world large‑scale foundation model federations.
In summary, TinyGuard offers a practical, low‑overhead, and theoretically motivated defense that preserves FedAvg convergence, delivers high accuracy under diverse Byzantine threats, and scales gracefully to the demanding settings of modern federated learning.