FedEFC: Federated Learning Using Enhanced Forward Correction Against Noisy Labels
Federated Learning (FL) is a powerful framework for privacy-preserving distributed learning. It enables multiple clients to collaboratively train a global model without sharing raw data. However, handling noisy labels in FL remains a major challenge due to heterogeneous data distributions and communication constraints, which can severely degrade model performance. To address this issue, we propose FedEFC, a novel method designed to tackle the impact of noisy labels in FL. FedEFC mitigates label noise through two key techniques: (1) prestopping, which prevents overfitting to mislabeled data by dynamically halting training at an optimal point, and (2) loss correction, which adjusts model updates to account for label noise. In particular, we develop an effective loss-correction scheme tailored to the unique challenges of FL, including data heterogeneity and decentralized training. Furthermore, we provide a theoretical analysis, leveraging the composite proper loss property, to demonstrate that the FL objective function under noisy label distributions can be aligned with the clean label distribution. Extensive experimental results validate the effectiveness of our approach, showing that it consistently outperforms existing FL techniques in mitigating the impact of noisy labels, particularly under heterogeneous data settings (e.g., achieving up to 41.64% relative performance improvement over the existing loss correction method).
💡 Research Summary
Federated learning (FL) enables multiple clients to collaboratively train a global model without sharing raw data, but the presence of noisy labels can severely degrade performance, especially under heterogeneous (non‑IID) data distributions. This paper introduces FedEFC, a two‑phase framework designed to mitigate noisy‑label effects in FL.
Phase 1 – Prestopping
Each client evaluates the current global model on its local training set and reports the local accuracy to the server. The server aggregates these accuracies into an estimated global accuracy A(t). A patience counter τp tracks how many consecutive rounds the accuracy fails to improve beyond the best observed value Amax. When τp reaches a predefined threshold γthr, the current round is declared the “prestopping point” Te, indicating that the model has begun to over‑fit the noisy labels. This mechanism adapts the early‑stopping principle to FL, where no clean validation set is available, and automatically determines the optimal stopping time based solely on the dynamics of the aggregated training accuracies.
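The patience logic above can be sketched in a few lines of Python (an illustrative sketch; variable names mirror the summary's notation, not the authors' code):

```python
def prestopping_round(round_accuracies, gamma_thr=3):
    """Return the prestopping point T_e: the first round at which the
    aggregated training accuracy has failed to improve on the best
    observed value A_max for gamma_thr consecutive rounds (None if the
    threshold is never reached)."""
    a_max = float("-inf")   # best aggregated accuracy A_max so far
    tau_p = 0               # patience counter
    for t, acc in enumerate(round_accuracies):
        if acc > a_max:
            a_max = acc
            tau_p = 0       # any improvement resets the patience counter
        else:
            tau_p += 1
            if tau_p >= gamma_thr:
                return t    # declare this round the prestopping point T_e
    return None
```

For example, with accuracies that peak at round 2 and then plateau, `gamma_thr = 3` flags round 5 as the prestopping point.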
Phase 2 – Enhanced Forward Correction (EFC)
After Te, clients switch from standard training to a loss‑correction regime. They first estimate a noise transition matrix T̂k for each client k. Unlike classic forward‑correction methods that rely on a pretrained model, FedEFC builds a count matrix C̃y|y from the predictions of the global model at Te and the observed noisy labels. The count matrix records how often each true label (inferred with high confidence) co‑occurs with each observed label, providing a robust empirical estimate of the conditional probabilities that define T̂k. Because the global model at Te has already learned useful representations, the estimated transition matrix closely matches the true noise matrix, as demonstrated by high cosine similarity scores.
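The count-matrix construction can be sketched as follows (a hypothetical numpy sketch: `tau` is an assumed confidence threshold for treating the global model's prediction as the inferred true label, and the paper's exact filtering rule may differ):

```python
import numpy as np

def estimate_transition(probs, noisy_labels, num_classes, tau=0.9):
    """Estimate a client's noise transition matrix T_hat from the global
    model's softmax outputs at the prestopping point T_e and the observed
    noisy labels. C[i, j] counts how often inferred true label i (a
    prediction with confidence >= tau) co-occurs with observed label j;
    row-normalizing C yields the conditional probabilities of T_hat."""
    C = np.zeros((num_classes, num_classes))
    conf = probs.max(axis=1)
    pred = probs.argmax(axis=1)
    for p, c, y in zip(pred, conf, noisy_labels):
        if c >= tau:
            C[p, y] += 1
    row = C.sum(axis=1, keepdims=True)
    row[row == 0] = 1.0          # avoid division by zero for unseen classes
    return C / row               # rows are conditional distributions
```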
The loss is then corrected by composing the model's prediction with the estimated transition matrix, in the spirit of classic forward correction:

L̂(f(x), ỹ) = ℓ(T̂kᵀ f(x), ỹ)

where ℓ is the base cross‑entropy loss and T̂kᵀ f(x) is the model's induced distribution over noisy labels. Training on the corrected loss therefore behaves as if the data were noise‑free.
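A minimal numpy sketch of the corrected loss, following the classic forward-correction recipe in which the model's clean-class posterior is mapped through the transition matrix into noisy-label space before the cross-entropy is taken (function and argument names are illustrative):

```python
import numpy as np

def forward_corrected_ce(probs, noisy_labels, T_hat, eps=1e-12):
    """Forward-corrected cross-entropy: the clean-class posterior f(x)
    (rows of `probs`) is pushed through the estimated transition matrix
    so the loss is measured against the noisy labels the model actually
    observes. With T_hat = I this reduces to ordinary cross-entropy."""
    corrected = probs @ T_hat     # row i: predicted noisy-label distribution
    picked = corrected[np.arange(len(noisy_labels)), noisy_labels]
    return -np.mean(np.log(picked + eps))
```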
Theoretical contribution
The authors prove that, under the composite proper loss property, the FL objective with noisy‑label distribution q(ỹ|x) aligns with the clean‑label objective p(y|x) when the corrected loss is used. Consequently, each client can achieve the same optimality as if it were training on entirely clean data, despite the presence of asymmetric, class‑dependent noise.
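In the summary's notation, and under the standard conditions for forward correction (ℓ proper composite, T̂ equal to the true transition matrix), the alignment amounts to a minimizer equivalence:

argmin_f E(x,ỹ)∼q [ L̂(f(x), ỹ) ] = argmin_f E(x,y)∼p [ ℓ(f(x), y) ]

so the optimum learned from noisy data coincides with the clean-data optimum.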
Experimental setup
Data heterogeneity is simulated by combining a Dirichlet distribution (parameter α_dir) for the number of samples per client with a Bernoulli distribution (parameter p) that determines which classes appear on each client. Asymmetric label noise is generated using a sparsity parameter ζ and a noise rate ρ, following the scheme of confident learning. Experiments are conducted on CIFAR‑10, CIFAR‑100, and FEMNIST with 10–30 clients, comparing FedEFC against FedAvg, FedProx, Ditto, FedCorr, FedNoRo, and other recent FL‑noise mitigation methods.
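The partitioning scheme described above might look like the following hypothetical sketch (the paper's exact sampling procedure may differ; the two mask fix-ups simply guarantee that every client holds at least one class and every class is held by at least one client):

```python
import numpy as np

def partition_clients(labels, num_clients, alpha_dir=0.5, p=0.7, seed=0):
    """Simulate non-IID client data: a Dirichlet(alpha_dir) draw sets each
    client's share of the data, and independent Bernoulli(p) draws decide
    which classes each client may hold. Samples of a class are dealt only
    to clients whose mask admits that class."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    shares = rng.dirichlet(alpha_dir * np.ones(num_clients))
    mask = rng.random((num_clients, len(classes))) < p    # class presence
    # Fix-ups: each client gets >= 1 class, each class gets >= 1 client.
    mask[np.arange(num_clients), rng.integers(0, len(classes), num_clients)] = True
    mask[rng.integers(0, num_clients, len(classes)), np.arange(len(classes))] = True
    parts = [[] for _ in range(num_clients)]
    for ci, c in enumerate(classes):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        eligible = np.flatnonzero(mask[:, ci])
        w = shares[eligible] / shares[eligible].sum()
        counts = rng.multinomial(len(idx), w)   # split class c among clients
        start = 0
        for k, n in zip(eligible, counts):
            parts[k].extend(idx[start:start + n])
            start += n
    return parts
```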
Results
Across all settings, FedEFC consistently outperforms baselines. In highly heterogeneous scenarios with 40%+ asymmetric noise, it achieves up to 41.64% relative improvement over the strongest existing loss‑correction method. The prestopping detection accurately captures the point where training loss begins to rise due to noise, and the transition matrix estimated from the global model at Te matches that from a fully pretrained model, confirming the practicality of the approach.
Limitations and future work
- The prestopping phase requires clients to transmit accuracy metrics each round, adding communication overhead.
- The method assumes the availability of a reasonably trained global model at Te; in early training stages, the count matrix may be unreliable.
- Re‑estimating the transition matrix each round incurs computational cost on resource‑constrained devices.
- The approach focuses on class‑dependent label flips; more complex noise types (e.g., label omission, adversarial label attacks) are not addressed.
Potential extensions include privacy‑preserving aggregation of accuracy signals, self‑supervised techniques to bootstrap the transition matrix without a pretrained model, multi‑noise modeling to handle heterogeneous noise sources, and clustering‑based collaborative estimation of transition matrices across clients.
Conclusion
FedEFC offers a theoretically grounded and empirically validated solution to noisy‑label challenges in federated learning. By coupling a data‑driven prestopping criterion with an enhanced forward‑correction that leverages the global model’s predictions, it delivers robust performance under realistic non‑IID and noisy conditions, making it a valuable addition to the toolbox of privacy‑preserving distributed AI.