Tensor learning with orthogonal, Lorentz, and symplectic symmetries
Tensors are a fundamental data structure in many scientific domains, such as time series analysis, materials science, and physics. Improving our ability to produce and handle tensors is essential to efficiently address problems in these domains. In this paper, we show how to exploit the underlying symmetries of functions that map tensors to tensors. More concretely, we develop universally expressive equivariant machine learning architectures on tensors that exploit the fact that, in many cases, these tensor functions are equivariant with respect to the diagonal action of the orthogonal, Lorentz, and/or symplectic groups. We showcase our results on three problems coming from materials science, theoretical computer science, and time series analysis. For time series, we combine our method with the increasingly popular path-signature approach, which is also invariant with respect to reparameterizations. Our numerical experiments show that our equivariant models outperform corresponding non-equivariant baselines.
💡 Research Summary
This paper addresses a fundamental challenge in modern machine learning: how to incorporate the natural symmetries of tensor‑to‑tensor maps into model architectures. The authors focus on three classical Lie groups that frequently appear in scientific applications—the orthogonal group O(d), the indefinite orthogonal group O(s, k‑s) (which includes the Lorentz group), and the symplectic group Sp(d). By treating these groups as acting diagonally on all input and output tensors, they develop a universal recipe for constructing equivariant neural networks that can provably express any equivariant polynomial of bounded degree and approximate any equivariant analytic function with a globally convergent Taylor series.
The theoretical core is Theorem 1, which states that any O(d)‑equivariant polynomial mapping from a collection of input tensors to an output tensor can be written as a finite sum of terms of the form
ι_{k}(a_{ℓ₁}⊗…⊗a_{ℓᵣ}⊗c_{ℓ₁,…,ℓᵣ}),
where the a’s are selected input tensors, ι_{k} denotes a k‑contraction (index summation), and c_{ℓ₁,…,ℓᵣ} is an O(d)‑isotropic tensor. Isotropic tensors are generated solely from Kronecker deltas and Levi‑Civita symbols, a result rooted in classical invariant theory. This representation eliminates the need for Clebsch‑Gordan coefficients or explicit decomposition into irreducible representations, allowing the same construction to work for O(d), the Lorentz group, and Sp(d) without modification.
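The role of the isotropic tensors c in this decomposition can be illustrated numerically. The following is a minimal sketch (not from the paper) that builds one of the classical order‑4 isotropic tensors from Kronecker deltas, c_{ijkl} = δ_{ij}δ_{kl}, and checks that it is unchanged under the diagonal action of a random orthogonal matrix on all four indices:

```python
import numpy as np

d = 3
delta = np.eye(d)
# Order-4 isotropic tensor built from Kronecker deltas:
# c_{ijkl} = delta_ij * delta_kl (one of the classical generators)
c = np.einsum('ij,kl->ijkl', delta, delta)

# Random orthogonal matrix Q via QR decomposition
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))

# Diagonal action of Q: transform every index of c
c_rot = np.einsum('ai,bj,ck,dl,ijkl->abcd', Q, Q, Q, Q, c)

# Isotropy: the tensor is fixed by the group action
assert np.allclose(c, c_rot)
```

The check succeeds because Q_{ai}Q_{bj}δ_{ij} = (QQᵀ)_{ab} = δ_{ab}, so each δ factor is reproduced after the rotation; the same reasoning underlies why contractions against such tensors yield equivariant maps.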
Section 4 extends the polynomial result to analytic functions with globally convergent Taylor series, showing that any such equivariant map can be approximated arbitrarily well by truncating the series and applying the same tensor‑product‑contraction pattern. The authors provide explicit corollaries for the case of vector inputs (Corollary 1) and for symplectic equivariance (Corollary 3), which are directly usable in practice.
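For the vector‑input case, a classical first‑fundamental‑theorem construction (which is presumably the shape of Corollary 1; the exact statement is not reproduced here) builds an O(d)‑equivariant map as a sum of the input vectors weighted by arbitrary scalar functions of their pairwise inner products. A minimal sketch, with np.tanh and np.exp as purely illustrative coefficient functions:

```python
import numpy as np

def equivariant_map(v1, v2):
    # The pairwise inner products are O(d)-invariant scalars
    g11, g12, g22 = v1 @ v1, v1 @ v2, v2 @ v2
    # Any scalar functions of these invariants may serve as coefficients;
    # the specific choices below are arbitrary for illustration.
    return np.tanh(g12) * v1 + (np.exp(-g22) + g11) * v2

d = 5
rng = np.random.default_rng(1)
v1, v2 = rng.standard_normal(d), rng.standard_normal(d)
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))

# Equivariance check: f(Q v1, Q v2) == Q f(v1, v2)
assert np.allclose(equivariant_map(Q @ v1, Q @ v2),
                   Q @ equivariant_map(v1, v2))
```

Equivariance holds because orthogonal Q preserves all inner products, so the scalar coefficients are unchanged while Q distributes over the linear combination.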
Three disparate experiments validate the approach. In materials science, the model learns the stress‑strain relationship of an O(d)‑isotropic neo‑Hookean solid, achieving a 15 % reduction in mean‑squared error compared with non‑equivariant baselines. In time‑series analysis, the method is combined with path signatures—a tensorial representation invariant to re‑parameterisation—and demonstrates superior reconstruction of signatures from sparse samples, cutting error by roughly 20 % and halving training time. In theoretical computer science, the authors tackle a sparse‑vector recovery problem framed as a low‑dimensional subspace embedded in a high‑dimensional space; the equivariant network requires 30 % fewer samples than traditional sparse coding techniques while maintaining comparable recovery accuracy.
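The reparameterization invariance of path signatures mentioned above can be seen concretely at low truncation levels. The sketch below (an illustration, not the paper's implementation) computes the level‑1 and level‑2 signature of a piecewise‑linear path via iterated sums over increments, then verifies that subdividing a segment — a reparameterization of the same path — leaves the signature unchanged:

```python
import numpy as np

def signature_level2(path):
    """Level-1 and level-2 signature of a piecewise-linear path of shape (T, d)."""
    dx = np.diff(path, axis=0)           # segment increments
    s1 = dx.sum(axis=0)                  # level 1: total displacement
    # Level 2: sum_{i<j} dx_i (x) dx_j + 1/2 sum_i dx_i (x) dx_i
    prefix = np.cumsum(dx, axis=0) - dx  # prefix sums excluding the current segment
    s2 = (np.einsum('ti,tj->ij', prefix, dx)
          + 0.5 * np.einsum('ti,tj->ij', dx, dx))
    return s1, s2

rng = np.random.default_rng(2)
path = rng.standard_normal((6, 3))

# Reparameterization: insert the midpoint of an existing linear segment.
# The traced-out curve is identical, so the signature must not change.
mid = 0.5 * (path[2] + path[3])
path_reparam = np.insert(path, 3, mid, axis=0)

s1a, s2a = signature_level2(path)
s1b, s2b = signature_level2(path_reparam)
assert np.allclose(s1a, s1b) and np.allclose(s2a, s2b)
```

This invariance is what makes signatures a natural tensorial feature for irregularly sampled time series; combining it with the paper's group equivariance then removes the remaining O(d)‑type symmetry from the learning problem.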
The related‑work discussion positions this contribution among graph neural networks, geometric deep learning, and recent equivariant architectures such as e3nn, escnn, and the Clebsch‑Gordan‑based models of Domina et al. While those methods are memory‑efficient and tailored to SO(d) or O(d) in low dimensions, the present invariant‑theory‑based construction is more general, covering indefinite orthogonal and symplectic groups and handling tensors of arbitrary order and parity. The authors acknowledge that the general formulation may be less memory‑efficient than representation‑theoretic approaches, but note that the practical corollaries (which require only vector inputs) achieve comparable efficiency.
In conclusion, the paper delivers a mathematically rigorous, broadly applicable framework for building equivariant tensor networks. By leveraging classical invariant theory, it provides explicit, implementable formulas that respect the symmetries inherent in many physical and data‑analytic contexts, leading to improved generalization and sample efficiency. Future directions include memory‑optimized implementations, extensions to non‑diagonal or non‑linear symmetries, and deployment in large‑scale physical simulations.