A theoretical analysis of matrix inversion via Neural Networks designed with the Strassen algorithm
We construct a Neural Network that approximates the matrix multiplication operator for any activation function for which scalar multiplication can itself be approximated by a Neural Network. In particular, we use the Strassen algorithm to reduce the number of weights and layers needed for such Neural Networks. This in turn allows us to define a Neural Network that approximates the matrix inversion operator. Moreover, relying on the Galerkin method, we apply these Neural Networks to solve parametric elliptic PDEs for a whole set of parameters at once. Finally, we discuss improvements with respect to prior results.
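The Strassen scheme referenced in the abstract replaces the eight block products of naive 2×2 block multiplication with seven, which is what reduces the multiplication count (and hence the network size) in the constructions above. A minimal numerical sketch of one Strassen level is below; it is an illustration of the classical algorithm, not code from the paper.

```python
import numpy as np

def strassen_step(A, B):
    """One level of Strassen's algorithm on an even-sized square matrix:
    seven block products instead of the naive eight."""
    n = A.shape[0] // 2
    A11, A12, A21, A22 = A[:n, :n], A[:n, n:], A[n:, :n], A[n:, n:]
    B11, B12, B21, B22 = B[:n, :n], B[:n, n:], B[n:, :n], B[n:, n:]
    # The seven Strassen products
    M1 = (A11 + A22) @ (B11 + B22)
    M2 = (A21 + A22) @ B11
    M3 = A11 @ (B12 - B22)
    M4 = A22 @ (B21 - B11)
    M5 = (A11 + A12) @ B22
    M6 = (A21 - A11) @ (B11 + B12)
    M7 = (A12 - A22) @ (B21 + B22)
    # Recombine into the four blocks of C = A @ B
    C11 = M1 + M4 - M5 + M7
    C12 = M3 + M5
    C21 = M2 + M4
    C22 = M1 - M2 + M3 + M6
    return np.block([[C11, C12], [C21, C22]])
```

Applied recursively, this yields the O(n^log2(7)) multiplication count that the paper exploits to bound the number of weights in the multiplication network.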
💡 Research Summary
The paper proposes a theoretical framework for approximating matrix multiplication and matrix inversion using neural networks (NNs) designed around the Strassen matrix multiplication algorithm. The authors begin by assuming the existence of an activation function for which a neural network can approximate the scalar multiplication operation on any bounded interval. This assumption covers common activations such as ReLU, the quadratic function, and ReLU², thereby extending the class of admissible activations beyond the specific choice made in prior work.
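To see why an activation like ReLU² admits scalar multiplication, note the polarization identity xy = ((x+y)² - x² - y²)/2, together with the fact that t² = ReLU(t)² + ReLU(-t)², so a two-neuron ReLU² layer represents squaring exactly. The sketch below checks this identity numerically; it is our own illustration of the standard construction, not the paper's network.

```python
import numpy as np

def relu(t):
    return np.maximum(t, 0.0)

def square_via_relu2(t):
    # Exact identity: ReLU(t)^2 + ReLU(-t)^2 == t^2 for all real t
    return relu(t) ** 2 + relu(-t) ** 2

def product_net(x, y):
    # Polarization identity: x*y = ((x+y)^2 - x^2 - y^2) / 2,
    # with each square realized by the ReLU^2 activation above
    return 0.5 * (square_via_relu2(x + y) - square_via_relu2(x) - square_via_relu2(y))
```

With ReLU (rather than ReLU²), the squaring map can only be approximated, not represented exactly, which is where the approximation rates in such results come from.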