NSC-SL: Bandwidth-Aware Neural Subspace Compression for Communication-Efficient Split Learning
The expanding scale of neural networks poses a major challenge for distributed machine learning, particularly under limited communication resources. While split learning (SL) alleviates the client computational burden by distributing model layers between clients and a server, it incurs substantial communication overhead from the frequent transmission of intermediate activations and gradients. To tackle this issue, we propose NSC-SL, a bandwidth-aware adaptive compression algorithm for communication-efficient SL. NSC-SL first dynamically determines the optimal rank of the low-rank approximation from the singular value distribution, adapting to real-time bandwidth constraints. Then, NSC-SL performs error-compensated tensor factorization using alternating orthogonal iteration with residual feedback, effectively minimizing truncation loss. Together, these mechanisms enable NSC-SL to achieve high compression ratios while preserving the semantically rich information essential for convergence. Extensive experiments demonstrate the strong performance of NSC-SL.
💡 Research Summary
The paper addresses the communication bottleneck inherent in Split Learning (SL), where clients and a server exchange intermediate activations (“smashed data”) and gradients for every mini‑batch. Existing compression techniques (low‑rank projection, quantization, sparsification) either use a fixed rank or ignore real‑time bandwidth constraints, leading to sub‑optimal performance in practical deployments. To overcome these limitations, the authors propose NSC‑SL (Neural Subspace Compression for Split Learning), a two‑stage framework that dynamically adapts compression strength to the available network capacity while preserving the information needed for model convergence.
1. Bandwidth‑Aware Adaptive Rank Selection (BAS).
Instead of performing a full singular value decomposition (SVD) on each activation tensor, NSC‑SL employs a randomized subspace method to estimate the singular spectrum efficiently (O(m n r) where r ≪ min(m,n)). A Gaussian random matrix Ω is multiplied with the target matrix M, followed by iterative orthogonalization and a thin QR factorization to obtain a low‑dimensional proxy B = QᵀM. A small SVD on B yields approximate singular values Σ and subspaces. The algorithm then selects the smallest rank r_η that captures a predefined energy coverage ratio η (e.g., 90 %). Simultaneously, it enforces a communication budget B_max by requiring 4 r (m + n) bytes ≤ B_max (single‑precision factors) and respects a user‑defined rank cap r_cap. The final rank is r = min(r_η, ⌊B_max/(4(m+n))⌋, r_cap). This mechanism automatically tightens compression when bandwidth shrinks and relaxes it when more capacity is available.
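The rank-selection rule above can be sketched in NumPy. This is a minimal illustration of the described procedure, not the authors' code: the function name, oversampling margin, power-iteration count, and seeding are assumptions; only the energy-coverage threshold η, the budget bound ⌊B_max/(4(m+n))⌋, and the cap r_cap come from the text.

```python
import numpy as np

def select_rank(M, eta=0.9, B_max=None, r_cap=64, oversample=10, n_iter=2, seed=0):
    """Bandwidth-aware rank selection (a sketch of BAS).

    Estimates the singular spectrum of M with a randomized subspace method,
    picks the smallest rank capturing an energy fraction eta, then clips it
    by the communication budget B_max (in bytes, fp32 factors) and r_cap.
    """
    m, n = M.shape
    rng = np.random.default_rng(seed)
    k = min(r_cap + oversample, min(m, n))
    # Gaussian sketch: Y = M @ Omega approximately spans the top-k subspace.
    Omega = rng.standard_normal((n, k))
    Y = M @ Omega
    # A few power iterations with re-orthogonalization sharpen the estimate.
    for _ in range(n_iter):
        Q, _ = np.linalg.qr(Y)
        Q, _ = np.linalg.qr(M.T @ Q)
        Y = M @ Q
    Q, _ = np.linalg.qr(Y)                     # thin QR: Q is m x k
    B = Q.T @ M                                # low-dimensional proxy, k x n
    s = np.linalg.svd(B, compute_uv=False)     # approximate singular values
    # Smallest rank whose cumulative spectral energy reaches eta.
    energy = np.cumsum(s**2) / np.sum(s**2)
    r_eta = int(np.searchsorted(energy, eta) + 1)
    r = min(r_eta, r_cap)
    if B_max is not None:
        # Budget constraint: 4*r*(m+n) bytes of single-precision factors.
        r = min(r, B_max // (4 * (m + n)))
    return max(int(r), 1)
```

Because the sketch only touches M through matrix products, the cost stays at O(mnk) with k ≪ min(m, n), avoiding a full SVD per mini-batch.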
2. Orthogonal Alternating Subspace Approximation with Error‑Correction Loop (OASA + ECL).
Given the selected rank, NSC‑SL approximates M by iteratively updating left (P) and right (Q) factor matrices while maintaining orthogonality:
P(t) ← ϕ(M Q(t−1)),  Q(t) ← ϕ(Mᵀ P(t)),
where ϕ(·) denotes column orthonormalization (e.g., a thin QR factorization). Only the rank-r factors are transmitted; the error-correction loop then feeds the truncation residual E = M − P PᵀM back into the next round's input, so that compression error is compensated over successive rounds rather than allowed to accumulate.
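The alternating scheme with residual feedback can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: ϕ is taken to be a thin QR factorization, the right factor is initialized randomly, and the function names and iteration count are illustrative, not from the paper.

```python
import numpy as np

def oasa_compress(M, r, n_iter=3, seed=0):
    """Alternating orthogonal iteration for a rank-r factorization
    (a sketch of OASA). Returns P (m x r, orthonormal columns) and
    the coefficient matrix C = P^T M (r x n)."""
    m, n = M.shape
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((n, r)))  # random right factor
    for _ in range(n_iter):
        P, _ = np.linalg.qr(M @ Q)    # update + orthonormalize left factor
        Q, _ = np.linalg.qr(M.T @ P)  # update + orthonormalize right factor
    return P, P.T @ M

def compress_with_feedback(M, residual, r):
    """One round of error-compensated compression (a sketch of the ECL):
    add the residual carried over from the previous round before
    compressing, then store the new truncation error for the next round."""
    target = M + residual
    P, C = oasa_compress(target, r)
    new_residual = target - P @ C
    return P, C, new_residual
```

A client would call `compress_with_feedback` once per mini-batch, keeping `new_residual` in local state; the server reconstructs the activation as `P @ C` from the two transmitted factors.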