A White-Box Deep-Learning Method for Electrical Energy System Modeling Based on Kolmogorov-Arnold Network
Deep learning methods have been widely used as an end-to-end modeling strategy for electrical energy systems because of their convenience and powerful pattern-recognition capability. However, owing to their “black-box” nature, deep learning methods have long been criticized for poor interpretability when modeling a physical system. In this paper, we introduce a novel neural network structure, the Kolmogorov-Arnold Network (KAN), to achieve “white-box” modeling of electrical energy systems and thereby enhance interpretability. The most distinctive feature of KAN lies in its learnable activation functions, together with the sparsification training and symbolification process. Consequently, KAN can express a physical process with concise and explicit mathematical formulas while retaining the nonlinear-fitting capability of deep neural networks. Simulation results on three electrical energy systems demonstrate the effectiveness of KAN in terms of interpretability, accuracy, robustness, and generalization ability.
💡 Research Summary
The paper introduces a novel neural network architecture called the Kolmogorov‑Arnold Network (KAN) to address the long‑standing trade‑off between the high predictive power of deep learning and the interpretability required for physical system modeling, specifically in electrical energy systems. The authors begin by reviewing traditional physics‑based modeling approaches (e.g., equivalent circuit models for batteries, differential‑equation‑based models for converters) and modern data‑driven methods such as LSTM and multilayer perceptrons (MLP). While data‑driven models excel at capturing complex nonlinear dynamics, they are criticized for being “black boxes” because their internal parameters and activation functions are fixed and opaque. Existing interpretability techniques—attention mechanisms, physics‑informed neural networks, SHAP/LIME—are discussed, and their limitations in providing true structural transparency are highlighted.
The theoretical foundation of KAN rests on the Kolmogorov‑Arnold representation theorem, which guarantees that any continuous multivariate function on a bounded domain can be represented exactly as a finite composition of additions and continuous univariate functions. Translating this theorem into a neural network, KAN places learnable univariate activation functions on the edges of the network rather than on the nodes. Each edge’s activation φ is modeled as the sum of a simple base function b(x) and a weighted spline correction term w_s·spline(x). The spline itself is expressed as a linear combination of B‑splines with learnable coefficients, allowing flexible yet compact functional forms.
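To make the edge parameterization concrete, the following is a minimal NumPy sketch of a single edge activation φ(x) = w_b·b(x) + w_s·spline(x). It assumes a SiLU base function and a uniform, unclamped knot grid; the function and variable names are illustrative, not the authors' code.

```python
import numpy as np

def silu(x):
    # SiLU base function b(x) = x * sigmoid(x)
    return x / (1.0 + np.exp(-x))

def bspline_basis(x, knots, i, k):
    """Cox-de Boor recursion: i-th degree-k B-spline basis evaluated at x."""
    if k == 0:
        return np.where((knots[i] <= x) & (x < knots[i + 1]), 1.0, 0.0)
    left = (x - knots[i]) / (knots[i + k] - knots[i]) \
        * bspline_basis(x, knots, i, k - 1)
    right = (knots[i + k + 1] - x) / (knots[i + k + 1] - knots[i + 1]) \
        * bspline_basis(x, knots, i + 1, k - 1)
    return left + right

def make_edge_activation(n_basis=8, degree=3, seed=0):
    """One edge activation phi(x) = w_b*b(x) + w_s * sum_i c_i B_i(x)."""
    rng = np.random.default_rng(seed)
    # Uniform, strictly increasing knots extending past [-1, 1],
    # so no denominator in the recursion vanishes
    knots = np.linspace(-1.5, 1.5, n_basis + degree + 1)
    coeffs = rng.normal(scale=0.1, size=n_basis)  # learnable spline coefficients
    w_b, w_s = 1.0, 1.0                           # learnable mixing weights

    def phi(x):
        spline = sum(c * bspline_basis(x, knots, i, degree)
                     for i, c in enumerate(coeffs))
        return w_b * silu(x) + w_s * spline
    return phi

phi = make_edge_activation()
y = phi(np.linspace(-1.0, 1.0, 5))
```

In a full KAN, every edge of the network carries one such activation, and `coeffs`, `w_b`, and `w_s` are updated by gradient descent alongside all other edges.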
Training KAN proceeds through four distinct stages:
- Initialization – A minimal two‑layer architecture (input → hidden → output) is constructed, and each edge is assigned a randomly initialized activation function.
- Sparsification – The loss function is augmented with an L1 regularization term (penalizing the average magnitude of activation outputs) and an entropy regularization term (whose minimization concentrates importance on a few edges rather than spreading it uniformly). Together these terms push the network toward a sparse topology, pruning edges and nodes whose contributions fall below a preset threshold.
- Symbolification – After sparsification, the remaining spline‑based activations are replaced with functions from a small library of elementary candidates (linear, quadratic, sinusoidal, etc.). Replacement is either user‑directed or automatic, selecting the candidate that maximizes the coefficient of determination (R²) between the original spline and the candidate function. This step converts the network into an explicit symbolic expression.
- Parameter Refinement – The parameters of the selected basis functions are fine‑tuned until the loss stabilizes, typically when it reaches the precision of the underlying measurement system.
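The sparsification stage above can be sketched in a few lines. This is an illustrative reconstruction (names such as `sparsity_penalty` and `prune_mask` are hypothetical, not from the paper): an L1 term on each edge's mean activation magnitude plus an entropy term whose minimization concentrates importance on a few edges.

```python
import numpy as np

def sparsity_penalty(act_outputs, lam_l1=1e-2, lam_ent=1e-2):
    """Sparsification regularizer (sketch, not the authors' exact loss).

    act_outputs: array of shape (n_samples, n_edges), each column holding
    one edge activation's outputs over a training batch.
    """
    # L1 term: average |output| of each edge, summed over edges
    edge_mag = np.mean(np.abs(act_outputs), axis=0)
    l1 = edge_mag.sum()
    # Entropy term over normalized magnitudes; driving it down concentrates
    # importance on a few edges, reinforcing sparsity
    p = edge_mag / (l1 + 1e-12)
    entropy = -np.sum(p * np.log(p + 1e-12))
    return lam_l1 * l1 + lam_ent * entropy

def prune_mask(act_outputs, threshold=1e-2):
    """Keep only edges whose mean |output| clears the pruning threshold."""
    return np.mean(np.abs(act_outputs), axis=0) >= threshold

rng = np.random.default_rng(0)
acts = rng.normal(size=(64, 10))   # dummy batch over 10 edges
penalty = sparsity_penalty(acts)   # added to the data-fitting loss
keep = prune_mask(acts)            # Boolean mask of surviving edges
```

After training with this augmented loss, edges where `keep` is `False` are removed, yielding the sparse topology that makes the later symbolification step tractable.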
The interpretability of KAN stems from two aspects: (i) the final network is extremely sparse, often containing fewer than a dozen nodes and edges, which can be directly written as a set of concise equations; (ii) after symbolification, each activation corresponds to a known mathematical function, making it straightforward to map the learned relationships to physical phenomena.
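The automatic symbolification step can be sketched as follows: fit each candidate elementary function to samples of a learned spline by least squares and keep the candidate with the highest R². This is a simplified sketch; a fuller implementation would also search an affine transform of the input, whereas this version fits only an output scale and offset, and the candidate library shown is illustrative.

```python
import numpy as np

def fit_r2(x, y, f):
    """Least-squares fit of y ≈ a*f(x) + c; returns (R², a, c)."""
    A = np.column_stack([f(x), np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot, coef[0], coef[1]

# Hypothetical library of elementary candidate functions
CANDIDATES = {
    "linear": lambda x: x,
    "quadratic": lambda x: x ** 2,
    "sin": np.sin,
    "exp": np.exp,
}

def symbolify(x, spline_values):
    """Pick the candidate whose fit best matches the learned spline."""
    scores = {name: fit_r2(x, spline_values, f)[0]
              for name, f in CANDIDATES.items()}
    return max(scores, key=scores.get)

x = np.linspace(-2.0, 2.0, 200)
best = symbolify(x, 0.5 * x ** 2 + 0.1)   # a spline that is secretly quadratic
```

Here `best` resolves to `"quadratic"`, since the affine fit of x² reproduces the samples exactly (R² = 1); substituting the winning candidate for each surviving spline is what turns the pruned network into the concise equations described above.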
Experimental validation is performed on three representative electrical energy systems:
- Battery Equivalent Circuit Model (ECM) – KAN learns the voltage‑current relationship with fewer parameters than an LSTM while achieving lower mean‑square error (MSE) and better robustness to measurement noise.
- Virtual Synchronous Generator – The network captures the nonlinear coupling between electrical and mechanical states, outperforming a traditional MLP in both accuracy and generalization to unseen operating points.
- Power Electronic Converter Impedance – KAN accurately models frequency‑domain impedance, demonstrating superior extrapolation capability when the test data include frequencies outside the training range.
Across all cases, KAN consistently shows (a) higher predictive accuracy (8‑12 % MSE reduction compared to LSTM, 5‑9 % improvement over physics‑based baselines), (b) strong noise resilience (error increase < 3 % under 5 % additive noise), (c) excellent generalization to different temperatures and load conditions (error increase < 4 %), and (d) clear, compact symbolic representations that align with known physical laws.
The authors also explore an unsupervised application of KAN, where the input‑output mapping is not predefined. By allowing the network to discover edge activations purely from data, KAN can reveal latent nonlinear relationships among variables, offering a physics‑compatible alternative to conventional clustering or dimensionality‑reduction techniques.
Limitations and future work are acknowledged. Scaling KAN to very high‑dimensional systems (thousands of variables) may pose computational challenges, and the current library of basis functions could be expanded to cover a broader range of physical phenomena. Real‑time deployment would require further model compression, and integrating explicit physics constraints into the loss (e.g., conservation laws) is identified as a promising direction.
In conclusion, the paper demonstrates that leveraging the Kolmogorov‑Arnold theorem to design a network with learnable edge activations, combined with regularization‑driven sparsification and symbolic replacement, yields a “white‑box” deep learning model. KAN achieves a rare combination of interpretability, accuracy, robustness, and generalization, making it a compelling tool for modeling complex electrical energy systems and potentially other physical domains where transparency is essential.