GS-KAN: Parameter-Efficient Kolmogorov-Arnold Networks via Sprecher-Type Shared Basis Functions

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

The Kolmogorov-Arnold representation theorem offers a theoretical alternative to Multi-Layer Perceptrons (MLPs) by placing learnable univariate functions on edges rather than nodes. While recent implementations such as Kolmogorov-Arnold Networks (KANs) demonstrate high approximation capabilities, they suffer from significant parameter inefficiency due to the requirement of maintaining unique parameterizations for every network edge. In this work, we propose GS-KAN (Generalized Sprecher-KAN), a lightweight architecture inspired by David Sprecher’s refinement of the superposition theorem. GS-KAN constructs unique edge functions by applying learnable linear transformations to a single learnable, shared parent function per layer. We evaluate GS-KAN against existing KAN architectures and MLPs across synthetic function approximation, tabular data regression and image classification tasks. Our results demonstrate that GS-KAN outperforms both MLPs and standard KAN baselines on continuous function approximation tasks while maintaining superior parameter efficiency. Additionally, GS-KAN achieves competitive performance with existing KAN architectures on tabular regression and outperforms MLPs on high-dimensional classification tasks. Crucially, the proposed architecture enables the deployment of KAN-based architectures in high-dimensional regimes under strict parameter constraints, a setting where standard implementations are typically infeasible due to parameter explosion. The source code is available at https://github.com/rambamn48/gs-impl.


💡 Research Summary

The paper addresses a critical scalability limitation of Kolmogorov‑Arnold Networks (KANs), namely the explosion of parameters caused by learning a distinct univariate function for every edge. By revisiting Sprecher’s refinement of the Kolmogorov‑Arnold representation, the authors propose Generalized Sprecher‑KAN (GS‑KAN), a lightweight architecture that shares a single learnable B‑spline basis per layer and generates edge‑specific functions through learnable linear transformations (scale λ and shift ϵ).

Mathematically, a standard KAN computes each node as y_q = Σ_p φ_{q,p}(x_p), where the φ_{q,p} are independent splines. GS-KAN replaces this with y_q = Σ_p λ_{p,q} · ψ_ℓ(x_p + ε_q), where ψ_ℓ is a single spline shared across layer ℓ, λ_{p,q} is a per-edge scalar weight, and ε_q is a per-node translation. With C denoting the number of spline coefficients, this reduces the parameter count from O(C · N_in · N_out) to O(N_in · N_out) + C, matching the asymptotic complexity of a conventional MLP while preserving the expressive power of KANs.
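The layer equation above can be sketched in a few lines of NumPy. This is an illustrative reconstruction, not the authors' implementation: a piecewise-linear lookup table (`np.interp`) stands in for the learnable cubic B-spline ψ_ℓ, and the class and parameter names are hypothetical.

```python
import numpy as np

class GSKANLayer:
    """Sketch of a GS-KAN layer: one shared univariate function per layer,
    specialized per edge by a scale lam[p, q] and per node by a shift eps[q].
    np.interp over a fixed grid stands in for the paper's learnable cubic
    B-spline psi_l; lam, eps, and the table values are the trainable parameters."""

    def __init__(self, n_in, n_out, n_knots=16, rng=None):
        rng = np.random.default_rng(rng)
        self.knots = np.linspace(-2.0, 2.0, n_knots)   # fixed grid for psi_l
        self.values = rng.normal(size=n_knots)         # shared-function parameters (C of them)
        self.lam = rng.normal(size=(n_in, n_out))      # per-edge scalar weights
        self.eps = np.linspace(0.0, 1.0, n_out)        # per-node translations

    def psi(self, x):
        # shared univariate function, applied elementwise
        return np.interp(x, self.knots, self.values)

    def __call__(self, x):
        # x: (batch, n_in) -> y: (batch, n_out)
        # y[b, q] = sum_p lam[p, q] * psi(x[b, p] + eps[q])
        shifted = x[:, :, None] + self.eps[None, None, :]   # (batch, n_in, n_out)
        return np.einsum('pq,bpq->bq', self.lam, self.psi(shifted))

def param_count(layer):
    # O(n_in * n_out) + C, versus O(C * n_in * n_out) for a standard KAN
    return layer.lam.size + layer.eps.size + layer.values.size
```

Note how the parameter count decouples the spline resolution C from the edge count: a layer with 4 inputs, 3 outputs, and 16 knots holds 4·3 + 3 + 16 = 31 parameters, whereas a standard KAN layer of the same shape would hold 16·4·3 = 192 spline coefficients.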

Implementation details include: (1) learning ψₗ as a cubic B‑spline with trainable coefficients and a fixed knot vector; (2) fixing the spline domain to
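The first implementation detail, a cubic B-spline with trainable coefficients on a fixed knot vector, can be sketched with SciPy. The clamped knot construction and the [-1, 1] domain below are illustrative assumptions, not the paper's actual choices.

```python
import numpy as np
from scipy.interpolate import BSpline

def make_shared_spline(coeffs, lo=-1.0, hi=1.0, degree=3):
    """Build a layer's shared function psi_l as a cubic B-spline whose
    coefficients `coeffs` are the trainable parameters, on a fixed knot
    vector over [lo, hi]. Endpoints are repeated (clamped knots) so the
    spline is well-defined at the domain boundaries; the domain here is
    an illustrative choice."""
    n = len(coeffs)
    # BSpline needs n + degree + 1 knots: repeat each endpoint `degree` extra times
    interior = np.linspace(lo, hi, n - degree + 1)
    knots = np.concatenate([[lo] * degree, interior, [hi] * degree])
    return BSpline(knots, np.asarray(coeffs, float), degree, extrapolate=False)
```

During training only the coefficient vector changes; the knot vector stays fixed, so every edge function λ_{p,q} · ψ_ℓ(x_p + ε_q) is a cheap affine reuse of the same spline evaluation.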

