Federated Learning (FL) enables collaborative training across multiple clients while preserving data privacy, yet it struggles with data heterogeneity, where clients' data are not distributed independently and identically (non-IID). This causes local drift, hindering global model convergence. To address this, we introduce Federated Learning with Feedback Alignment (FLFA), a novel framework that integrates feedback alignment into FL. FLFA uses the global model's weights as a shared feedback matrix during local training's backward pass, aligning local updates with the global model efficiently. This approach mitigates local drift with minimal additional computational cost and no extra communication overhead.
Our theoretical analysis supports FLFA's design by showing how it alleviates local drift and establishes robust convergence for both local and global models. Empirical evaluations, including accuracy comparisons and measurements of local drift, further show that FLFA can enhance other FL methods, demonstrating its effectiveness.
Federated Learning (FL) has gained significant attention as an innovative approach enabling multiple clients to collaboratively train a shared model without centralizing their data [15]. In a typical FL setup, each client trains a local model using its own dataset and shares only the model updates with a central server. The server aggregates these updates to refine a global model, leveraging the diverse data from all clients. However, FL faces a critical challenge: data heterogeneity [12,16,21,27]. When clients' data are non-IID (not independently and identically distributed), local model updates deviate from the global optimum, a phenomenon called local drift, degrading convergence and overall performance [1,6,22,23].
Addressing data heterogeneity remains an active area of research. Recent studies have proposed regularization-based methods [7, 9-11] that add auxiliary terms to the local objective function to constrain local updates. Although these methods effectively reduce local drift, they introduce considerable computational overhead through additional loss terms and their associated gradient computations, which accumulate across multiple local epochs.
To address data heterogeneity without auxiliary loss terms, we take a fundamentally different approach: instead of constraining what the model learns, we modify how gradients are computed during training by revisiting the backpropagation (BP) algorithm itself. We draw inspiration from Feedback Alignment (FA) [14], originally proposed as a biologically plausible alternative to BP. In BP, the weight matrix W is used for both the forward and backward passes. FA instead uses W for the forward pass but a fixed random matrix B for the backward pass. Despite this asymmetry, the network learns effectively as W aligns with B.
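To make the BP/FA distinction concrete, here is a minimal NumPy sketch of a single training step for a two-layer network. The layer sizes, ReLU activation, and squared-error signal are illustrative assumptions, not taken from the paper; the only point is that FA replaces the transpose of the forward weights with a fixed random matrix B in the backward pass.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative two-layer network: x -> h = relu(W1 x) -> y = W2 h
W1 = rng.normal(size=(16, 8))
W2 = rng.normal(size=(4, 16))
B = rng.normal(size=(16, 4))   # FA: fixed random feedback matrix (plays the role of W2.T)

x = rng.normal(size=8)
target = rng.normal(size=4)

# Forward pass (identical under BP and FA)
h = np.maximum(W1 @ x, 0.0)
y = W2 @ h
e = y - target                           # output error signal

# Backward pass
grad_W2 = np.outer(e, h)                 # output-layer gradient: same in BP and FA
delta_bp = (W2.T @ e) * (h > 0)          # BP: error transported via W2.T
delta_fa = (B @ e) * (h > 0)             # FA: error transported via fixed random B
grad_W1_fa = np.outer(delta_fa, x)       # hidden-layer gradient under FA
```

Only the matrix used to propagate the error backward changes; the forward computation and the output-layer gradient are untouched.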
Building on this principle, we propose Federated Learning with Feedback Alignment (FLFA), which uses the global model weights as a shared feedback matrix B for all clients during local training. Unlike traditional FA with a random B, our B equals the global weights received each round. Instead of each client using its own diverging local weights W for BP, all clients use the same global model weights as their feedback matrix B. This shared feedback acts as a common reference that naturally aligns local updates across heterogeneous clients without any auxiliary loss term.
Our main contributions are as follows:
• Theoretical foundation: We develop a theoretical foundation that identifies the key factors contributing to local drift in federated learning and establishes how FLFA can mitigate this drift. We provide a convergence analysis demonstrating that FLFA achieves robust convergence for both local and global models.
• FLFA framework: Building on this theoretical foundation, we introduce FLFA, a novel approach that employs the global model weights as a unified backward feedback mechanism. FLFA mitigates local drift while introducing only minimal computational overhead and no extra communication overhead. This makes FLFA highly compatible with existing FL algorithms and practical for real-world deployment.
• Empirical validation: Empirically, FLFA consistently improves accuracy when integrated into diverse FL methods, and our ablation studies validate the design choices derived from our theoretical analysis. Through a detailed local drift analysis, we provide empirical evidence that FLFA's shared feedback mechanism enables clients to implicitly perceive information about other clients through the global model.
To mitigate local drift in FL, recent studies have proposed adding an auxiliary term to the loss function or developing strategies to aggregate local models on the server. This section provides an overview of existing FL algorithms related to our work.
FedAvg is the pioneering FL algorithm that aggregates locally trained models on the server using weighted averaging [15]. Although FedAvg demonstrates strong performance in IID settings, it suffers from weight divergence under non-IID data distributions due to the independent optimization of local models. To address this, several studies focus on regularizing the local training process by adding an auxiliary term to the local objective function. Model-contrastive federated learning (MOON) [10] improves FL robustness by employing contrastive learning between local and global model representations. By enforcing consistency across client updates, MOON alleviates representation drift. FedProx [11] introduces a proximal term that constrains local updates, preventing them from straying too far from the global model.
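The two baselines above can be summarized in a few lines. The sketch below is a simplified NumPy rendering, assuming each model is a single flat weight array; real implementations operate per-parameter-tensor. FedAvg's server-side aggregation is a dataset-size-weighted average [15], and FedProx's modification is an added proximal penalty on the local objective [11] (the coefficient name `mu` follows the FedProx paper; its value here is arbitrary).

```python
import numpy as np

def fedavg_aggregate(client_weights, client_sizes):
    """FedAvg server step: average client models weighted by local dataset size."""
    total = sum(client_sizes)
    return sum((n / total) * w for w, n in zip(client_weights, client_sizes))

def fedprox_local_loss(task_loss, w_local, w_global, mu=0.01):
    """FedProx local objective: task loss plus (mu/2) * ||w_local - w_global||^2,
    penalizing local updates that stray far from the global model."""
    return task_loss + (mu / 2.0) * float(np.sum((w_local - w_global) ** 2))

# Hypothetical two-client example with flat weight vectors.
w_a, w_b = np.array([0.0, 4.0]), np.array([4.0, 0.0])
w_global = fedavg_aggregate([w_a, w_b], client_sizes=[1, 3])   # -> [3.0, 1.0]
loss = fedprox_local_loss(1.0, np.array([1.0, 1.0]), np.zeros(2), mu=0.1)
```

Note that the proximal term's gradient, mu * (w_local - w_global), must be computed at every local step, which is exactly the per-epoch overhead that auxiliary-loss methods accumulate.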
Other recent approaches tackle label distribution skew by intervening at the classifier’s output. FedRS [13] modifies the final layer by restricting the softmax function to locally available classes. Similarly, FedLC [26] regularizes the model’s logits before the softmax calculation to prevent over-confidence in local majority classes.
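For intuition on the classifier-level interventions, the following is a minimal sketch of a restricted softmax in the spirit of FedRS [13]: logits of classes absent from the local dataset are scaled by a factor alpha < 1 before the softmax, limiting the gradient pressure those missing classes exert during local training. The function name, the masking convention, and the alpha value are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def restricted_softmax(logits, present_mask, alpha=0.5):
    """Softmax with scores of locally-missing classes scaled by alpha < 1
    (FedRS-style sketch). present_mask[c] is True if class c appears locally."""
    scale = np.where(present_mask, 1.0, alpha)   # dampen missing classes
    z = logits * scale
    z = z - z.max()                              # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()

# Hypothetical client that only holds classes 0 and 2 locally.
probs = restricted_softmax(np.array([2.0, 1.0, 0.0]),
                           present_mask=np.array([True, False, True]))
```

FedLC [26] operates at the same point in the pipeline but calibrates the logits based on per-class label frequencies rather than a fixed scaling factor.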
Although these methods effectively