Federated Balanced Learning
Federated learning is a collaborative paradigm in which clients jointly train a model by sharing parameters instead of raw data. In non-IID settings, however, the global model suffers from client drift, which can severely degrade final performance. Previous methods tend to correct an already-drifted global model via the loss function or gradients, overlooking the impact of the samples held on each client. In this paper, we rethink the role of the client side and propose Federated Balanced Learning (FBL), which prevents the issue at its source through sample balancing on the client. Technically, FBL lets clients with imbalanced data achieve class balance through knowledge filling and knowledge sampling using edge-side generative models, subject to a fixed total number of samples per client. Furthermore, we design a Knowledge Alignment Strategy to bridge the gap between synthetic and real data, and a Knowledge Drop Strategy to regularize our method. We also scale our method to realistic, complex scenarios, allowing different clients to adopt different strategies, and extend the framework to further improve performance. Extensive experiments show that our method outperforms state-of-the-art baselines. The code will be released upon acceptance.
💡 Research Summary
Federated learning (FL) suffers from severe performance degradation when client data are non‑IID, leading to client drift where the global model overfits dominant domains. Existing remedies such as FedProx, SCAFFOLD, MOON, and FedGen operate after the drift has occurred, adjusting loss terms or gradients but never addressing the root cause: the imbalance of samples across classes on each client. This paper introduces Federated Balanced Learning (FBL), a paradigm that tackles the problem at the data level by enforcing class‑wise balance on every client while keeping the total number of local samples fixed.
The method first computes a per‑client balance point B_k as the floor of the average number of samples per class. Classes are then partitioned into three groups: data‑excessive (n_c^k > B_k), data‑scarce (0 < n_c^k < B_k), and data‑missing (n_c^k = 0). For data‑excessive classes, a loss‑based knowledge sampling strategy is employed: each sample's cross‑entropy loss is evaluated, and higher‑loss samples are preferentially retained while the rest are discarded until the class size reaches B_k. This preserves the most informative examples.
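The balancing step above can be sketched as follows. This is a minimal illustration, not the paper's code: `balance_plan` and its inputs are hypothetical names chosen for clarity.

```python
import math
from collections import Counter

def balance_plan(labels, losses, num_classes):
    """Sketch of FBL's per-client balancing step (hypothetical helper).

    labels: per-sample class ids held by the client
    losses: per-sample cross-entropy losses (same order as labels)
    Returns the balance point B_k, the class partition, and the sample
    indices retained for data-excessive classes (highest-loss first).
    """
    counts = Counter(labels)
    # Balance point: floor of the average number of samples per class.
    bk = math.floor(len(labels) / num_classes)

    # Partition classes by how their counts compare to B_k.
    excessive = [c for c in range(num_classes) if counts[c] > bk]
    scarce = [c for c in range(num_classes) if 0 < counts[c] < bk]
    missing = [c for c in range(num_classes) if counts[c] == 0]

    keep = []
    for c in excessive:
        idx = [i for i, y in enumerate(labels) if y == c]
        # Knowledge sampling: keep the B_k highest-loss (most informative) samples.
        idx.sort(key=lambda i: losses[i], reverse=True)
        keep.extend(idx[:bk])
    return bk, (excessive, scarce, missing), sorted(keep)
```

Data-scarce and data-missing classes returned by the partition would then be topped up to B_k via generative knowledge filling, described next.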
For data‑scarce and data‑missing classes, FBL leverages edge‑side generative models (e.g., a lightweight Stable Diffusion) to synthesize new images according to class‑specific prompts. To bridge the inevitable domain gap between synthetic and real images, a Knowledge Alignment Strategy is introduced. A learnable class embedding P_i is added element‑wise to the feature extracted by the backbone (F) before feeding it to the classifier head (H): ŷ_ij = H(F(I_ij) ⊕ P_i). This embedding compensates for style and distribution differences, allowing synthetic samples to contribute effectively.
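A minimal NumPy sketch of the alignment forward pass, assuming a linear head `W_head` and zero‑initialized embeddings `P` as illustrative stand‑ins for H and P_i (the actual backbone F is abstracted away as a precomputed feature):

```python
import numpy as np

rng = np.random.default_rng(0)
feat_dim, num_classes = 16, 4

# Hypothetical stand-ins: a linear classifier head H and the
# learnable per-class alignment embeddings P_i (trained in practice).
W_head = rng.normal(size=(num_classes, feat_dim))
P = np.zeros((num_classes, feat_dim))

def forward(feature, class_id, synthetic):
    """y_hat = H(F(I) (+) P_i): add the class embedding element-wise to the
    backbone feature before the head, only for synthetic samples."""
    if synthetic:
        feature = feature + P[class_id]  # element-wise alignment term
    return W_head @ feature              # logits from the head H
```

Because the addition is element‑wise in feature space, the alignment term costs only one extra vector per class and leaves the backbone and head architectures untouched.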
A Knowledge Drop Strategy further regularizes the alignment process: with a drop rate ζ, a random subset of synthetic samples in each batch bypasses the embedding addition, akin to dropout, preventing over‑fitting of Pi.
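The drop strategy can be illustrated as below; `apply_knowledge_drop` and the fixed seed are assumptions made for this sketch, not names from the paper:

```python
import numpy as np

rng = np.random.default_rng(42)

def apply_knowledge_drop(features, class_ids, P, zeta):
    """Knowledge Drop sketch: with drop rate zeta, a synthetic sample
    bypasses the embedding addition, analogous to dropout on the
    alignment term, to prevent over-fitting of P."""
    drop = rng.random(len(features)) < zeta      # samples that bypass P
    out = features.copy()
    out[~drop] += P[class_ids[~drop]]            # the rest get P_i added
    return out, drop
```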
FBL also acknowledges heterogeneous computational resources across clients. Devices with ample compute generate more synthetic data, while resource‑constrained devices rely mainly on sampling, all within a unified framework that still follows the standard FL communication protocol (local training → server aggregation).
Extensive experiments on image, video, and satellite datasets under various non‑IID configurations (label skew, domain skew) demonstrate that FBL consistently outperforms state‑of‑the‑art baselines by 2–7 percentage points in top‑1 accuracy. The gains are especially pronounced when certain classes are severely under‑represented, confirming the effectiveness of knowledge filling. Communication overhead remains comparable to traditional FL because generation occurs locally on the edge.
In summary, FBL redefines the role of edge clients from passive data holders to active data balancers, using on‑device generative models to pre‑emptively mitigate client drift. The paper opens new research directions, including privacy‑preserving synthetic data (e.g., differential privacy noise), personalized generative models, and collaborative generation across clients.