Diffeomorphism-Equivariant Neural Networks
Incorporating group symmetries via equivariance into neural networks has emerged as a powerful way to reduce the data and compute demands of modern deep learning. While most existing approaches, such as group convolutions and averaging-based methods, focus on compact, finite, or low-dimensional groups with linear actions, this work explores how equivariance can be extended to infinite-dimensional groups. We propose a strategy that induces diffeomorphism equivariance in pre-trained neural networks via energy-based canonicalisation. Formulating equivariance as an optimisation problem gives us access to the rich toolbox of well-established differentiable image registration methods. Empirical results on segmentation and classification tasks confirm that our approach achieves approximate equivariance and generalises to unseen transformations without relying on extensive data augmentation or retraining.
💡 Research Summary
The paper introduces DiffeoNN, a framework that endows pre‑trained neural networks with equivariance to the infinite‑dimensional group of diffeomorphisms without any additional training of the base model. The key idea is to canonicalise each input image by finding a transformation that maps it onto a representative that is “close” to the limited labelled training set. Transformations are parameterised by stationary velocity fields (SVFs), a common representation in deformable image registration, which guarantees smooth, invertible mappings and efficient computation of inverses.
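The paper does not include code, but the SVF construction it relies on is standard in deformable registration: a stationary velocity field `v` is integrated into a deformation `exp(v)` by scaling and squaring, and the inverse comes essentially for free as `exp(-v)`. A minimal 1-D sketch of that integration (function names and the displacement-field representation are our own illustrative choices, not the paper's):

```python
import numpy as np

def integrate_svf(v, n_steps=6):
    """Integrate a stationary velocity field into a displacement field
    by scaling and squaring: exp(v) = (exp(v / 2^n))^(2^n).

    `v` is a 1-D velocity sampled on an integer grid; the returned `phi`
    represents the map x -> x + phi(x).
    """
    x = np.arange(len(v), dtype=float)
    phi = v / (2.0 ** n_steps)  # scaling: a near-identity initial displacement
    for _ in range(n_steps):
        # squaring: compose the map with itself,
        # (phi o phi)(x) = phi(x) + phi(x + phi(x)), with linear interpolation
        phi = phi + np.interp(x + phi, x, phi)
    return phi
```

Because the field is stationary, `integrate_svf(-v)` approximates the inverse deformation, which is what makes mapping predictions back to the input coordinates cheap in this framework.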
Canonicalisation is posed as an optimisation problem that minimises a task‑specific energy E_can(x, g). This energy combines three components: (i) a reconstruction loss derived from a variational auto‑encoder that measures similarity to the training distribution, (ii) a regularisation term on the SVF that enforces smoothness and orientation‑preserving properties, and (iii) an adversarial discriminator loss that forces canonicalised images to be indistinguishable from real training samples. Gradient‑based optimisation (e.g., Adam) is used, leveraging modern automatic differentiation tools.
Once the canonicalising diffeomorphism g_x is obtained, the input is transformed to x_c = g_x·x and fed to the original network f_θ, which has been trained only on the sparse labelled set X_E. The network’s output y_c = f_θ(x_c) is then mapped back to the original coordinate system by applying the inverse transformation g_x⁻¹, yielding the final prediction ŷ = g_x⁻¹·y_c. For segmentation tasks this procedure guarantees equivariance (the output mask transforms exactly as the input image does), while for classification tasks it yields invariance (the class label remains unchanged under any diffeomorphic deformation).
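The canonicalise → predict → map-back pipeline is model-agnostic, so it can be written as a generic wrapper. The sketch below uses toy stand-ins throughout: a scalar shift plays the role of the diffeomorphism g_x, an argmax-based centring rule replaces the energy minimisation, and a thresholding function replaces the pre-trained segmentation network f_θ:

```python
import numpy as np

def shift_warp(a, s):
    """Toy group action g . a: evaluate `a` at coordinates shifted by `s`."""
    t = np.arange(len(a), dtype=float)
    return np.interp(t + s, t, a)

def equivariant_wrap(f, canonicalise):
    """Wrap a pre-trained model f with the canonicalisation pipeline:
    transform the input to its canonical pose, predict, then pull the
    prediction back to the original coordinate system."""
    def wrapped(x):
        s = canonicalise(x)            # g_x: the canonicalising transformation
        y_c = f(shift_warp(x, s))      # predict on the canonical input x_c = g_x . x
        return shift_warp(y_c, -s)     # final prediction y = g_x^{-1} . y_c
    return wrapped
```

Because the model only ever sees canonicalised inputs, it can be trained on a small set of canonical poses, and the wrapper makes its predictions transform consistently with the input, which is exactly the equivariance property the paper targets.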
The authors demonstrate the approach on two families of experiments. In synthetic nested‑square segmentation and real chest‑X‑ray lung segmentation, DiffeoNN raises Intersection‑over‑Union from 0.877 (naïve application of the base model) to 0.956, outperforming a strong data‑augmentation baseline and producing far fewer outlier predictions. In a topological‑invariant MNIST classification experiment, accuracy improves from 92 % to 97 % when test images are randomly warped by diffeomorphisms. Energy analyses show that the canonicalisation step is itself diffeomorphism‑invariant, confirming the theoretical equivariance property.
Technical contributions include (1) extending energy‑based canonicalisation—previously limited to finite‑dimensional Lie groups—to the infinite‑dimensional diffeomorphism group by using SVF parametrisation, and (2) integrating VAE reconstruction and adversarial discrimination into the canonicalisation energy to preserve structural consistency beyond pixel‑wise similarity. The framework is model‑agnostic: any pre‑trained network can be wrapped with the canonicalisation–inverse‑canonicalisation pipeline to obtain diffeomorphism equivariance without architectural changes or retraining.
Overall, DiffeoNN offers a practical, theoretically grounded solution for achieving equivariance to complex, non‑linear spatial transformations, opening new possibilities for robust medical‑image analysis, histopathology, and any domain where data exhibit rich, continuous deformations. Future work may explore time‑dependent velocity fields, 3‑D volumetric data, and meta‑learning strategies to accelerate the canonicalisation optimisation.