Learning Nonlinear Heterogeneity in Physical Kolmogorov-Arnold Networks
Physical neural networks typically train linear synaptic weights while treating device nonlinearities as fixed. We show the opposite: training the synaptic nonlinearity itself, as in Kolmogorov-Arnold Network (KAN) architectures, yields markedly higher task performance per physical resource and better performance-parameter scaling than conventional linear-weight networks, demonstrating the ability of KAN topologies to exploit reconfigurable nonlinear physical dynamics. We experimentally realise physical KANs in silicon-on-insulator devices we term ‘Synaptic Nonlinear Elements’ (SYNEs), operating at room temperature, microampere currents, 2 MHz speeds and ~750 fJ per nonlinear operation, with no observed degradation over 10^13 measurements and months-long timescales. We demonstrate nonlinear function regression, classification, and prediction of Li-Ion battery dynamics from noisy real-world multi-sensor data. Physical KANs outperform equivalently parameterised software multilayer perceptron networks across all tasks, with up to two orders of magnitude fewer parameters, and two orders of magnitude fewer devices than linear-weight-based physical networks. These results establish learned physical nonlinearity as a hardware-native computational primitive for compact and efficient learning systems, and SYNE devices as effective substrates for heterogeneous nonlinear computing.
💡 Research Summary
This paper introduces a hardware‑native implementation of Kolmogorov‑Arnold Networks (KANs) by training the nonlinear synaptic functions themselves rather than conventional linear weights. The authors fabricate silicon‑on‑insulator (SOI) “Synaptic Nonlinear Elements” (SYNEs), disk‑shaped devices doped with phosphorus via polymer‑grafting. Each SYNE has four active terminals: an input voltage (V_in), an output current (I_out), and two tunable control voltages (V_tune1, V_tune2) that shape the I‑V characteristic. By sweeping these control voltages, SYNEs can produce a broad family of nonlinear transfer functions, including positive and negative differential resistance, non‑monotonic turning points, and multi‑valued regions.
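The role of the two control voltages can be illustrated with a toy model. The function `syne_iv` below is purely illustrative (it does not reproduce the device physics reported in the paper): it simply shows how two tuning voltages can reshape a single I-V curve, including producing the non-monotonic, negative-differential-resistance-like regions described above.

```python
import numpy as np

def syne_iv(v_in, v_tune1, v_tune2):
    """Toy stand-in for a SYNE transfer function (illustrative only).

    v_tune1 shifts the conduction threshold; v_tune2 sets the position and
    strength of a resonant bump that can create NDR-like, non-monotonic
    regions. Units are arbitrary (voltages in volts, current in toy uA).
    """
    base = 0.5 * np.tanh(v_in - v_tune1)                       # thresholded conduction
    bump = 0.4 * v_tune2 * np.exp(-((v_in - v_tune2) ** 2) / 0.1)  # tunable resonance
    return base + bump

v = np.linspace(-2, 2, 401)          # +/-2 V bias range quoted in the paper
curve_a = syne_iv(v, v_tune1=0.0, v_tune2=0.8)
curve_b = syne_iv(v, v_tune1=-0.5, v_tune2=-0.8)
# Different control voltages yield qualitatively different transfer curves;
# curve_a contains a non-monotonic (NDR-like) region.
print(np.any(np.diff(curve_a) < 0))  # -> True
```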
To construct a KAN synapse, multiple SYNEs are connected in parallel and operated in a time‑multiplexed hardware‑in‑the‑loop (HIL) scheme. Each SYNE contributes five learnable parameters (two V_tune values, a linear gain G, and two input‑voltage scaling limits). Neurons remain linear summation units with a trainable bias. Because a single SYNE cannot span the full functional space required by KANs, the parallel combination provides a piecewise‑defined nonlinear basis analogous to the B‑splines used in software KANs.
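A minimal sketch of such a synapse, under the assumption of a toy `syne` response (not the measured device physics): each parallel element carries the five learnable values listed above, and the edge output is the gain-weighted sum of the element currents.

```python
import numpy as np

def syne(v, t1, t2):
    # Toy SYNE response (illustrative; not the measured device physics).
    return 0.5 * np.tanh(v - t1) + 0.4 * t2 * np.exp(-((v - t2) ** 2) / 0.1)

def kan_synapse(x, params):
    """One KAN edge built from parallel SYNEs.

    `params` holds, per element, the five learnable values:
    (v_tune1, v_tune2, gain G, input limits v_lo, v_hi). A normalised
    input x in [0, 1] is mapped into each element's voltage window, and
    the gain-weighted element currents are summed.
    """
    out = 0.0
    for t1, t2, gain, v_lo, v_hi in params:
        v_in = v_lo + (v_hi - v_lo) * x        # per-element input scaling
        out += gain * syne(v_in, t1, t2)       # weighted parallel current
    return out

# Three parallel SYNEs -> 15 learnable parameters on this single edge.
params = [(0.0, 0.8, 1.0, -2.0, 2.0),
          (-0.5, -0.8, 0.7, -1.0, 1.0),
          (0.3, 0.2, -0.5, 0.0, 2.0)]
x = np.linspace(0, 1, 101)
phi = kan_synapse(x, params)
print(phi.shape)  # -> (101,)
```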
Training proceeds via a differentiable surrogate (“digital twin”) model: an MLP trained on experimentally measured I‑V data predicts SYNE behavior, enabling gradient‑based optimization of the tunable parameters. After convergence, the optimized control voltages are programmed back onto the physical SYNEs, thereby imprinting the learned nonlinear functions onto the hardware. Linear summations are performed off‑chip digitally, while the nonlinear operations are fully realized by the SYNEs.
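The training loop can be sketched as follows. For simplicity the differentiable twin is assumed to match the device exactly (in the paper it is an MLP fitted to experimental I-V sweeps), and backpropagation through the surrogate is stood in for by finite-difference gradients; all function names and constants here are illustrative.

```python
import numpy as np

# Stand-in "measured" device; the twin would be fit to real I-V sweeps.
def device(v, t1, t2):
    return 0.5 * np.tanh(v - t1) + 0.4 * t2 * np.exp(-((v - t2) ** 2) / 0.1)

twin = device  # assumption: perfect surrogate (an MLP in the paper)

def loss(p, v, target):
    return np.mean((twin(v, p[0], p[1]) - target) ** 2)

v = np.linspace(-2, 2, 201)
target = device(v, 0.4, 0.6)           # behaviour we want to imprint
params = np.array([0.0, 0.0])          # initial control voltages
loss_init = loss(params, v, target)

eps, lr = 1e-4, 0.1
for _ in range(500):
    grad = np.zeros_like(params)
    for i in range(2):                 # finite differences in place of autodiff
        d = np.zeros_like(params); d[i] = eps
        grad[i] = (loss(params + d, v, target)
                   - loss(params - d, v, target)) / (2 * eps)
    params -= lr * grad                # gradient step through the twin

loss_final = loss(params, v, target)
# The converged control voltages would now be programmed onto the SYNE.
print(loss_final < loss_init)  # -> True
```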
The authors evaluate the physical KAN on four tasks: (1) regression of single nonlinear functions, (2) composition of nested functions f(g(x)), (3) classification on a standard benchmark, and (4) prediction of Li‑Ion battery dynamics from noisy multi‑sensor data. Across all tasks, the physical KAN matches or exceeds the performance of software multilayer perceptrons (MLPs) that have the same number of trainable parameters, while using up to two orders of magnitude fewer physical devices. In comparison with linear‑weight‑based physical neural networks, the KAN shows markedly better performance‑to‑parameter scaling, confirming that learning programmable nonlinearities can dramatically reduce the required device count.
Device metrics are impressive: SYNEs operate at room temperature, consume ~750 fJ per nonlinear operation, support 2 MHz bandwidth, and draw microampere‑scale currents (0.1–1 µA) under ±2 V bias. Reliability tests show no degradation after 10¹³ measurements and months of continuous operation. The authors also introduce an “Epsilon Expressivity” metric to quantify the functional richness of a given SYNE configuration; higher expressivity correlates strongly with improved regression accuracy, providing a useful design guideline before fabrication.
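The exact definition of the Epsilon Expressivity metric is not given in this summary; one plausible reading, sketched below as our own illustrative guess (not the paper's formula), is to count how many transfer curves from a control-voltage grid are pairwise separated by more than a tolerance ε, again using a toy SYNE response.

```python
import numpy as np
from itertools import product

def syne(v, t1, t2):
    # Toy SYNE response (illustrative only).
    return 0.5 * np.tanh(v - t1) + 0.4 * t2 * np.exp(-((v - t2) ** 2) / 0.1)

def epsilon_expressivity(eps, n_grid=8):
    """Hypothetical richness measure: greedily count transfer curves that are
    pairwise more than `eps` apart in RMS distance over the bias range.
    This is an illustrative reading, not the paper's exact definition."""
    v = np.linspace(-2, 2, 101)
    tunes = np.linspace(-1, 1, n_grid)
    curves = [syne(v, t1, t2) for t1, t2 in product(tunes, tunes)]
    kept = []
    for c in curves:
        if all(np.sqrt(np.mean((c - k) ** 2)) > eps for k in kept):
            kept.append(c)
    return len(kept)

# A finer tolerance distinguishes more curves, i.e. higher counted expressivity.
print(epsilon_expressivity(0.05) >= epsilon_expressivity(0.3))
```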
Overall, the work demonstrates that programmable physical nonlinearity is a viable computational primitive. By shifting learning from linear weight matrices to per‑synapse nonlinear functions, KANs exploit the intrinsic heterogeneity of physical devices, achieving compact, energy‑efficient learning systems. The SYNE platform, with its reconfigurable I‑V response, offers a promising route toward neuromorphic, in‑memory, and low‑power edge AI hardware, and the methodology can be extended to other material systems (e.g., memristive oxides, 2D materials) for broader applicability.