An Empirical Analysis of Federated Learning Models Subject to Label-Flipping Adversarial Attack
In this paper, we empirically analyze adversarial attacks on selected federated learning models. The specific learning models considered are Multinomial Logistic Regression (MLR), Support Vector Classifier (SVC), Multilayer Perceptron (MLP), Convolutional Neural Network (CNN), Random Forest, XGBoost, and Long Short-Term Memory (LSTM). For each model, we simulate label-flipping attacks, experimenting extensively with 10 federated clients and 100 federated clients. We vary the percentage of adversarial clients from 10% to 100% and, simultaneously, vary the percentage of labels flipped by each adversarial client from 10% to 100%. Among other results, we find that models differ in their inherent robustness to the two vectors in our label-flipping attack, i.e., the percentage of adversarial clients and the percentage of labels flipped by each adversarial client. We discuss the potential practical implications of our results.
💡 Research Summary
This paper presents a systematic empirical study of label‑flipping poisoning attacks in federated learning (FL) environments. The authors focus on seven representative learning algorithms—Multinomial Logistic Regression (MLR), Support Vector Classifier (SVC), Multilayer Perceptron (MLP), Convolutional Neural Network (CNN), Random Forest, XGBoost, and Long Short‑Term Memory (LSTM). Experiments are conducted under two client population sizes, ten and one hundred participants, to capture the effect of scale.
The dataset used throughout is the classic MNIST handwritten‑digit benchmark, consisting of 60 000 training and 10 000 test images (28 × 28 grayscale). Images are normalized to zero mean and unit variance before being distributed among clients using a random split. The FL framework is built on the open‑source Flower library; most models are aggregated with the standard FedAvg algorithm, while tree‑based models employ a bagging‑style aggregation to accommodate their non‑parameter‑based updates. Each FL run lasts ten communication rounds, with local epochs adjusted so that the total number of local updates matches the convergence requirements of each model.
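The preprocessing and partitioning step described above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function name `partition_dataset` and the use of NumPy are assumptions, and the paper itself builds on the Flower framework.

```python
import numpy as np

def partition_dataset(images, labels, num_clients, seed=0):
    """Normalize images and split them randomly among clients.

    A minimal sketch of the setup described in the paper: images are
    standardized to zero mean and unit variance, then distributed
    among clients via a random (IID) split.
    """
    images = (images - images.mean()) / images.std()

    # Shuffle indices and split into num_clients equal shards.
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(images))
    shards = np.array_split(indices, num_clients)
    return [(images[s], labels[s]) for s in shards]

# Example with MNIST-sized synthetic data: 60,000 samples, 10 clients.
X = np.random.rand(60_000, 28, 28).astype(np.float32)
y = np.random.randint(0, 10, size=60_000)
clients = partition_dataset(X, y, num_clients=10)
```

With ten clients each shard holds 6,000 examples; with one hundred clients, 600 each.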
First, a non‑adversarial baseline is established. Hyper‑parameters for each model are tuned via grid search (details are provided in an appendix). Baseline accuracies show that, except for MLR, all models experience a modest drop when moving from ten to one hundred clients; the decline is especially pronounced for Random Forest and XGBoost, whereas SVC and MLP remain relatively stable.
The core contribution is the evaluation of label‑flipping attacks. For every experiment, a proportion of clients is designated as adversarial (10 % to 100 % in 10 % increments). Each adversarial client then flips a specified percentage of its local labels (also 10 % to 100 % in 10 % steps). This yields a 10 × 10 grid (100 data points) per model per client‑size scenario, resulting in three‑dimensional accuracy surfaces that capture the joint impact of the two attack parameters.
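The attack parameterization above can be illustrated with a short sketch. The helper `flip_labels` is hypothetical (the paper does not publish this routine); it implements an untargeted flip, reassigning each chosen label to a different random class.

```python
import numpy as np

def flip_labels(labels, flip_fraction, num_classes=10, seed=0):
    """Untargeted label-flipping on one adversarial client's shard:
    a random fraction of labels is replaced with a different class.
    Illustrative sketch, not the authors' implementation.
    """
    rng = np.random.default_rng(seed)
    flipped = labels.copy()
    n_flip = int(round(flip_fraction * len(labels)))
    victims = rng.choice(len(labels), size=n_flip, replace=False)
    for i in victims:
        # Add a nonzero offset mod num_classes so the new label differs.
        flipped[i] = (flipped[i] + rng.integers(1, num_classes)) % num_classes
    return flipped

# The 10 x 10 experimental grid: (adversarial-client fraction, flip fraction).
grid = [(a / 10, f / 10) for a in range(1, 11) for f in range(1, 11)]
```

Each of the 100 grid points corresponds to one full FL training run per model and client-population size.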
Key findings include:
- Model‑specific robustness – SVC and MLP demonstrate the greatest resilience; their accuracy degrades slowly even when many clients are malicious or when a large fraction of labels is corrupted. CNN, despite achieving the highest clean accuracy, is highly sensitive to the label‑flipping rate, with performance collapsing once the flip proportion exceeds roughly 30 %.
- Interaction of attack dimensions – Some models (notably LSTM) are more vulnerable when a small number of adversarial clients flip a large share of labels, whereas they tolerate many adversarial clients that flip only a few labels. Conversely, Random Forest and XGBoost are more affected by the sheer number of malicious participants, reflecting the averaging nature of the bagging aggregation.
- Effect of client population – Scaling to one hundred clients generally amplifies the impact of attacks. The larger the pool, the more weight adversarial updates receive in the global model, leading to steeper accuracy declines, especially for ensemble methods.
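The findings above follow from the mechanics of FedAvg aggregation: each client's update is weighted by its local dataset size, so with equal shards an adversarial coalition's influence on the global model grows linearly with its share of the population. A minimal sketch (the function name `fedavg` and the toy parameter vectors are illustrative):

```python
import numpy as np

def fedavg(client_params, client_sizes):
    """FedAvg: weighted average of client parameter vectors by local
    dataset size. With equal shards, each client contributes weight
    1/num_clients, so poisoned updates scale with the adversarial
    fraction of the population.
    """
    total = sum(client_sizes)
    return sum(p * (n / total) for p, n in zip(client_params, client_sizes))

# Toy round: 10 equally sized clients, 3 adversarial.
# Honest updates push parameters toward +1, poisoned ones toward -1.
honest = [np.ones(4)] * 7
adversarial = [-np.ones(4)] * 3
global_params = fedavg(honest + adversarial, [100] * 10)
# Each element equals 0.7 * 1 + 0.3 * (-1) = 0.4.
```

The same arithmetic explains why tree ensembles aggregated by bagging suffer more as the count of malicious participants grows: every adversarial member contributes a full vote to the ensemble.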
The paper mentions an outlier‑detection module on the server that discards suspicious updates, but it does not provide a quantitative assessment of its efficacy or the specifics of the detection algorithm. Consequently, the defensive side remains under‑explored.
Overall, the study convincingly shows that label‑flipping attacks can dramatically erode FL model performance, and that the degree of degradation depends on both the model architecture and the attack configuration. Practitioners should therefore align model selection with anticipated threat models (e.g., expected proportion of compromised devices and the likely extent of label corruption). The work also highlights several avenues for future research: extending experiments to diverse datasets (e.g., natural images, text, time‑series), investigating targeted versus untargeted flips, and integrating robust aggregation or anomaly‑detection mechanisms with rigorous evaluation.