Federated Learning Clients Clustering with Adaptation to Data Drifts


Federated Learning (FL) trains deep models across edge devices without centralizing raw data, preserving user privacy. However, client heterogeneity slows down convergence and limits global model accuracy. Clustered FL (CFL) mitigates this by grouping clients with similar representations and training a separate model for each cluster. In practice, client data evolves over time, a phenomenon we refer to as data drift, which breaks cluster homogeneity and degrades performance. Data drift can take different forms depending on whether changes occur in the output values, the input features, or the relationship between them. We propose FIELDING, a CFL framework for handling diverse types of data drift with low overhead. FIELDING detects drift at individual clients and performs selective re-clustering to balance cluster quality and model performance, while remaining robust to malicious clients and varying levels of heterogeneity. Experiments show that FIELDING improves final model accuracy by 1.9-5.9% and achieves target accuracy 1.16x-2.23x faster than existing state-of-the-art CFL methods.


💡 Research Summary

Federated Learning (FL) enables many edge devices to collaboratively train a shared model while keeping raw data local, thus preserving privacy. A major obstacle in practical FL deployments is client heterogeneity: differences in data size, distribution, hardware, and usage patterns slow convergence and degrade the global model’s accuracy. Clustered FL (CFL) mitigates this by grouping clients with similar data characteristics and training a separate model per cluster, thereby reducing intra‑cluster heterogeneity. However, most existing CFL approaches assume a static data distribution. In real‑world scenarios, client data evolves over time—a phenomenon the authors term “data drift.” Data drift can manifest as (1) label shift (changing class frequencies while the conditional distribution P(x|y) stays stable), (2) covariate shift (changing input distribution P(x) while P(y|x) remains unchanged), or (3) concept shift (the conditional relationship P(y|x) itself changes). As drift accumulates, clusters become as heterogeneous as the whole client population, nullifying the benefits of CFL.
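The three drift types can be stated compactly in terms of how a client's joint distribution factorizes. The notation below (distributions $P_t$ and $P_{t'}$ at two points in time $t < t'$) is an assumption introduced for illustration and may differ from the paper's own notation.

```latex
\begin{align*}
P_t(x, y) &= P_t(y)\,P_t(x \mid y) = P_t(x)\,P_t(y \mid x) \\[4pt]
\text{Label shift:}     &\quad P_t(y) \neq P_{t'}(y), \qquad P_t(x \mid y) = P_{t'}(x \mid y) \\
\text{Covariate shift:} &\quad P_t(x) \neq P_{t'}(x), \qquad P_t(y \mid x) = P_{t'}(y \mid x) \\
\text{Concept shift:}   &\quad P_t(y \mid x) \neq P_{t'}(y \mid x)
\end{align*}
```

Under this reading, label and covariate shift each change one marginal while holding a conditional fixed, whereas concept shift changes the input-to-label relationship itself.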

The paper introduces FIELDING, a CFL framework designed to handle all three drift types with minimal overhead. FIELDING’s core ideas are: (i) client‑side drift detection using lightweight representations, (ii) a hybrid re‑clustering strategy that combines per‑client adjustments with selective global re‑clustering, and (iii) a pluggable representation mechanism that chooses the cheapest sufficient descriptor (label‑distribution vectors for label and covariate shift, gradients or loss‑based features for concept shift). When a client detects drift, it sends its updated representation to a centralized coordinator. The coordinator first reassigns the drifting client to the nearest cluster based on the chosen distance metric, without immediately updating cluster centroids. After processing all drifting clients, the coordinator recomputes the centroids and measures the average inter‑cluster distance θ. If any centroid has moved more than θ/3, a full k‑means re‑clustering of all clients is triggered; otherwise, the new assignments are kept and training proceeds. The number of clusters K is automatically selected as the value that maximizes the silhouette score.
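The drift-handling step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the use of Euclidean distance, and the reading of θ as the mean pairwise distance between the recomputed centroids are all assumptions.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def recompute_centroids(reps, assign, old_centroids):
    """Mean representation per cluster; an emptied cluster keeps its old centroid."""
    new = []
    for k, old in enumerate(old_centroids):
        members = [reps[c] for c, a in assign.items() if a == k]
        if members:
            new.append([sum(col) / len(members) for col in zip(*members)])
        else:
            new.append(list(old))
    return new

def handle_drift(reps, assign, centroids, drifted):
    """Hypothetical helper: reassign drifted clients, then test the theta/3 trigger.

    reps:      client id -> current representation (already updated after drift)
    assign:    client id -> cluster index
    centroids: list of centroid vectors before this round's drift handling
    drifted:   ids of clients that reported drift
    Returns (new_assign, new_centroids, needs_global_reclustering).
    """
    new_assign = dict(assign)
    # 1. Move each drifted client to the nearest *old* centroid; centroids
    #    are deliberately not updated while clients are being reassigned.
    for c in drifted:
        new_assign[c] = min(range(len(centroids)),
                            key=lambda k: dist(reps[c], centroids[k]))
    # 2. Recompute centroids once all drifted clients have been processed.
    new_centroids = recompute_centroids(reps, new_assign, centroids)
    # 3. theta = average pairwise distance between the recomputed centroids.
    n = len(new_centroids)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    theta = (sum(dist(new_centroids[i], new_centroids[j]) for i, j in pairs)
             / len(pairs)) if pairs else 0.0
    # 4. A full re-clustering is needed if any centroid moved more than theta / 3.
    moved = max(dist(o, m) for o, m in zip(centroids, new_centroids))
    return new_assign, new_centroids, moved > theta / 3

# Toy example: two clusters of two clients each; client "b" drifts slightly.
reps = {"a": [0.0, 0.0], "b": [0.0, 1.2], "c": [10.0, 0.0], "d": [10.0, 1.0]}
assign = {"a": 0, "b": 0, "c": 1, "d": 1}
centroids = [[0.0, 0.5], [10.0, 0.5]]
_, _, trigger = handle_drift(reps, assign, centroids, ["b"])
print(trigger)  # False: the centroid moved far less than theta / 3
```

When the returned flag is true, the coordinator would discard the per-client reassignments and run k-means over all clients; otherwise it keeps `new_assign` and proceeds to training.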

Algorithmically, FIELDING proceeds as follows: (1) initial clustering of clients using k‑means on their representations; (2) for each global round, drift handling (Algorithm 2) as described above; (3) standard FL training within each cluster (local SGD on a sampled subset of clients, aggregation of updates); (4) periodic evaluation. The framework is compatible with any client selection strategy, aggregation rule, or model architecture.
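Step (1), together with the silhouette-based choice of K mentioned earlier, might look like the following sketch. It uses only the standard library; the deterministic farthest-point initialization, the Euclidean metric, and the names `kmeans`, `silhouette`, and `choose_k` are assumptions made for this example rather than details taken from the paper.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def init_centroids(points, k):
    """Farthest-point initialization (assumed here because it is deterministic)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(dist(p, c) for c in centroids))
        centroids.append(list(far))
    return centroids

def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centroids = init_centroids(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient; points in singleton clusters score 0."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return -1.0  # silhouette is undefined for a single cluster
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster
        b = min(sum(dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def choose_k(points, k_max):
    """Return the K in [2, k_max] whose k-means clustering has the best silhouette."""
    scores = {k: silhouette(points, kmeans(points, k))
              for k in range(2, k_max + 1)}
    return max(scores, key=scores.get)

# Toy client representations forming three well-separated groups.
points = [[0, 0], [0, 1], [1, 0],
          [10, 10], [10, 11], [11, 10],
          [20, 0], [20, 1], [21, 0]]
print(choose_k(points, 4))  # 3
```

In a real deployment the points would be the client representations (for example, label-distribution vectors), and a library implementation of k-means and the silhouette score would normally replace these hand-written versions.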

The authors provide a theoretical analysis that assumes client representations become increasingly stable as training progresses. Under this assumption they derive per‑round utility bounds and convergence guarantees, showing that the selective global re‑clustering step does not increase the expected loss beyond that of standard CFL. Moreover, the probability of triggering a full re‑clustering is bounded by the magnitude of drift, ensuring that the extra communication and computation cost remains low.

Empirically, FIELDING is implemented on top of the FedScale simulation engine, extended to support streaming data. Experiments involve up to 5,078 image‑streaming clients derived from the Functional Map of the World (FMoW) satellite‑image dataset, with four distinct drift scenarios that mix label, covariate, and concept shifts. Baselines include IFCA, FlexCFL, FedDrift, Auxo, and FedAC. Key findings are:

  • Intra‑cluster heterogeneity: FIELDING maintains the lowest mean client distance throughout training, outperforming pure per‑client adjustment (which can become worse than no clustering) and pure global re‑clustering (which suffers from instability under small drifts).
  • Final model accuracy: Across all scenarios, FIELDING improves the best‑cluster model’s test accuracy by 1.9%–5.9% over the strongest baseline.
  • Convergence speed: FIELDING reaches the target accuracy 1.16×–2.23× faster than existing state‑of‑the‑art CFL methods.
  • Communication/computation overhead: Clients that detect drift send only label‑distribution vectors (a few dozen floats), so the additional bandwidth is < 5% of standard FL traffic. Gradient‑based representations are requested only from the subset of clients experiencing concept drift, adding < 12% extra compute time.
  • Robustness to malicious clients: In experiments where a fraction of clients falsify their label distributions, the centroid‑based reassignment and the centroid‑movement threshold τ (presumably the θ/3 re‑clustering trigger) safeguard cluster integrity and limit the impact on overall accuracy.
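To make the "few dozen floats" concrete, the sketch below computes a label-distribution representation and applies a simple client-side drift test. The summary does not state the detection rule FIELDING uses, so the L1-distance threshold and the names `label_distribution` and `has_drifted` are hypothetical choices for illustration only.

```python
from collections import Counter

def label_distribution(labels, num_classes):
    """Normalized class-frequency vector for a client's current local data."""
    if not labels:
        raise ValueError("client has no local samples")
    counts = Counter(labels)
    return [counts.get(c, 0) / len(labels) for c in range(num_classes)]

def has_drifted(old_rep, new_rep, threshold=0.1):
    """Hypothetical drift test: L1 distance between representations exceeds a threshold."""
    return sum(abs(a - b) for a, b in zip(old_rep, new_rep)) > threshold

# A client whose data shifts from mostly class 0 to mostly classes 1 and 2.
before = label_distribution([0, 0, 1, 2], num_classes=3)
after = label_distribution([1, 1, 2, 2], num_classes=3)
print(before)                      # [0.5, 0.25, 0.25]
print(has_drifted(before, after))  # True: L1 distance is 1.0
```

The representation has one float per class, so its size is independent of the model and of the amount of local data, which is consistent with the low bandwidth overhead reported above.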

FIELDING’s design addresses two fundamental challenges in drift‑aware CFL: (1) adapting to varying drift magnitudes without incurring the full cost of global re‑clustering at every step, and (2) detecting and responding to heterogeneous drift types using a unified, low‑cost representation pipeline. By reassigning all drifted clients, rather than only the subset sampled for training in a given round, it avoids stale cluster assignments that would otherwise degrade performance. The selective global re‑clustering trigger, based on centroid movement, balances stability and responsiveness and prevents the oscillations observed in naïve global re‑clustering under minor drifts.

In summary, the paper makes the following contributions:

  1. A comprehensive drift taxonomy for FL and an analysis of how each drift type affects cluster homogeneity.
  2. FIELDING, a hybrid CFL framework that combines per‑client migration with threshold‑driven global re‑clustering, supporting pluggable client representations.
  3. Theoretical guarantees on per‑round utility and convergence under realistic assumptions about representation stability.
  4. Extensive empirical validation on large‑scale streaming datasets, demonstrating superior accuracy, faster convergence, and low overhead compared to state‑of‑the‑art CFL methods.
  5. Practical robustness to adversarial behavior and varying degrees of client heterogeneity.

Overall, FIELDING advances the state of federated learning by providing a scalable, drift‑aware clustering mechanism that can be integrated into existing FL pipelines with minimal engineering effort, paving the way for more reliable deployment of FL in dynamic, real‑world environments.

