Fed-ADE: Adaptive Learning Rate for Federated Post-adaptation under Distribution Shift

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Federated learning (FL) in post-deployment settings must adapt to non-stationary data streams across heterogeneous clients without access to ground-truth labels. A major challenge is learning rate selection under client-specific, time-varying distribution shifts, where fixed learning rates often lead to underfitting or divergence. We propose Fed-ADE (Federated Adaptation with Distribution Shift Estimation), an unsupervised federated adaptation framework that leverages lightweight estimators of distribution dynamics. Specifically, Fed-ADE employs uncertainty dynamics estimation to capture changes in predictive uncertainty and representation dynamics estimation to detect covariate-level feature drift, combining them into a per-client, per-timestep adaptive learning rate. We provide theoretical analyses showing that our dynamics estimation approximates the underlying distribution shift and yields dynamic regret and convergence guarantees. Experiments on image and text benchmarks under diverse distribution shifts (label and covariate) demonstrate consistent improvements over strong baselines. These results highlight that distribution shift-aware adaptation enables effective and robust federated post-adaptation under real-world non-stationarity.


💡 Research Summary

Federated learning (FL) has become a cornerstone for training models across decentralized devices while preserving privacy. In real‑world deployments, however, models pre‑trained on a central dataset quickly degrade as each client observes a non‑stationary, unlabeled data stream that may undergo label shift, covariate shift, or both. Existing FL adaptation methods either assume a fixed learning rate, rely on costly ensembles, or require labeled data, which makes them unsuitable for continual post‑deployment adaptation.

Fed‑ADE (Federated Adaptation with Distribution Shift Estimation) addresses this gap by automatically adjusting the learning rate for each client at every timestep based on an unsupervised estimate of the local distribution shift. Two lightweight, label‑free signals are computed on the client side: (1) Uncertainty dynamics – the cosine distance between the mean softmax vectors of consecutive batches, capturing changes in predictive confidence; (2) Representation dynamics – the cosine distance between ℓ₂‑normalized batch‑wise feature means extracted by the shared backbone, capturing covariate drift. The two signals are averaged to obtain a unified shift score (S_{t}^{c}\in
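The two client-side signals described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation. It assumes that cosine distance means one minus cosine similarity and that the batch mean of the features is computed first and then ℓ₂‑normalized. The function names (`shift_score`, `cosine_distance`, and the other helpers) are hypothetical. How the score is mapped to a learning rate is not shown in this summary, so the sketch stops at the score.

```python
import math
from typing import List, Sequence


def mean_vector(rows: Sequence[Sequence[float]]) -> List[float]:
    """Column-wise mean of a batch (one row per sample)."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one logit vector."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def l2_normalize(v: Sequence[float], eps: float = 1e-12) -> List[float]:
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / max(norm, eps) for a in v]


def cosine_distance(u: Sequence[float], v: Sequence[float], eps: float = 1e-12) -> float:
    """Assumed definition: 1 - cosine similarity."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / max(nu * nv, eps)


def shift_score(prev_logits, curr_logits, prev_feats, curr_feats) -> float:
    """Average of the two label-free signals for consecutive batches."""
    # Uncertainty dynamics: distance between mean softmax vectors.
    p_prev = mean_vector([softmax(row) for row in prev_logits])
    p_curr = mean_vector([softmax(row) for row in curr_logits])
    uncertainty = cosine_distance(p_prev, p_curr)

    # Representation dynamics: distance between normalized feature means
    # (assumption: mean first, then normalize).
    f_prev = l2_normalize(mean_vector(prev_feats))
    f_curr = l2_normalize(mean_vector(curr_feats))
    representation = cosine_distance(f_prev, f_curr)

    return 0.5 * (uncertainty + representation)
```

Under these assumptions, two identical consecutive batches give a score of approximately zero, and the score grows as the mean prediction or the mean feature direction rotates between batches. Because cosine distance ignores vector length, the explicit normalization step does not change the result here; it is kept only to mirror the description.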

