A novel Krylov subspace method for approximating Fréchet derivatives of large-scale matrix functions
We present a novel Krylov subspace method for approximating $L_f(A, E) \mathbf{b}$, the matrix-vector product of the Fréchet derivative $L_f(A, E)$ of a large-scale matrix function $f(A)$ in direction $E$, a task that arises naturally in the sensitivity analysis of quantities involving matrix functions, such as centrality measures for networks. It also arises in the context of gradient-based methods for optimization problems that feature matrix functions, e.g., when fitting an evolution equation to an observed solution trajectory. In principle, the well-known identity
\[
f\left( \begin{bmatrix} A & E \\ 0 & A \end{bmatrix} \right) \begin{bmatrix} 0 \\ \mathbf{b} \end{bmatrix} = \begin{bmatrix} L_f(A, E) \mathbf{b} \\ f(A) \mathbf{b} \end{bmatrix},
\]
allows one to directly apply any standard Krylov subspace method, such as the Arnoldi algorithm, to address this task. However, this comes with the major disadvantage that the involved block triangular matrix has unfavorable spectral properties, which impede the convergence analysis and, to a certain extent, also the observed convergence. To avoid these difficulties, we propose a novel modification of the Arnoldi algorithm that aims at better preserving the block triangular structure. In turn, this allows one to bound the convergence of the modified method by the best polynomial approximation of the derivative $f^\prime$ on the numerical range of $A$. Several numerical experiments illustrate our findings.
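The block triangular identity can be checked numerically on a small dense example. The sketch below is only an illustration and not part of the paper: it assumes $f = \exp$, uses SciPy's `expm` and `expm_frechet` as reference routines, and the sizes, seed, and variable names are arbitrary choices.

```python
import numpy as np
from scipy.linalg import expm, expm_frechet

# Small random test problem (sizes and scaling are illustrative assumptions).
rng = np.random.default_rng(0)
n = 6
A = rng.standard_normal((n, n)) / np.sqrt(n)
E = rng.standard_normal((n, n))
b = rng.standard_normal(n)

# Block upper triangular matrix [[A, E], [0, A]] and the vector [0; b].
M = np.block([[A, E], [np.zeros((n, n)), A]])
v = np.concatenate([np.zeros(n), b])

# Left-hand side of the identity, split into its two blocks.
y = expm(M) @ v
top, bottom = y[:n], y[n:]

# Right-hand side: expm_frechet returns (exp(A), L_exp(A, E)).
fA, L = expm_frechet(A, E)

err_derivative = np.linalg.norm(top - L @ b)     # top block vs. L_f(A, E) b
err_function = np.linalg.norm(bottom - fA @ b)   # bottom block vs. f(A) b
print(err_derivative, err_function)
```

Both differences should be at the level of rounding error. Forming `expm(M)` densely is only feasible for small $n$, which is why Krylov subspace methods are used in the large-scale setting.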
💡 Research Summary
The paper addresses the computational challenge of evaluating the Fréchet derivative of a large‑scale matrix function, specifically the product $L_f(A,E)\,\mathbf{b}$ where $A\in\mathbb{C}^{n\times n}$, $E\in\mathbb{C}^{n\times n}$ and $\mathbf{b}\in\mathbb{C}^n$. Such quantities appear in sensitivity analysis of network centrality measures, gradient‑based optimization involving matrix functions, and other applications where only the directional derivative applied to a vector is required.
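A classical approach exploits the block‑triangular identity
\[
f\left( \begin{bmatrix} A & E \\ 0 & A \end{bmatrix} \right) \begin{bmatrix} 0 \\ \mathbf{b} \end{bmatrix} = \begin{bmatrix} L_f(A, E)\,\mathbf{b} \\ f(A)\,\mathbf{b} \end{bmatrix}.
\]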
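The classical approach, i.e. running the standard Arnoldi algorithm on the $2n \times 2n$ block matrix, can be sketched as follows. This is a minimal illustration and not the modified method proposed in the paper: it assumes real data and $f = \exp$, the helper names `arnoldi` and `frechet_times_vector` are hypothetical, and the block matrix is applied through matrix-vector products only.

```python
import numpy as np
from scipy.linalg import expm, expm_frechet

def arnoldi(matvec, v, m):
    """Up to m Arnoldi steps with modified Gram-Schmidt.

    Returns an orthonormal basis V (N x k), the Hessenberg matrix H (k x k)
    and beta = ||v||, where k <= m (k < m only on breakdown).
    """
    N = v.shape[0]
    beta = np.linalg.norm(v)
    V = np.zeros((N, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = v / beta
    for j in range(m):
        w = matvec(V[:, j])
        for i in range(j + 1):
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12:  # breakdown: the Krylov space is invariant
            return V[:, : j + 1], H[: j + 1, : j + 1], beta
        V[:, j + 1] = w / H[j + 1, j]
    return V[:, :m], H[:m, :m], beta

def frechet_times_vector(A, E, b, m):
    """Approximate (L_exp(A, E) b, exp(A) b) via Arnoldi on [[A, E], [0, A]]."""
    n = A.shape[0]

    def block_matvec(x):
        # Apply the block matrix without forming it explicitly.
        top, bottom = x[:n], x[n:]
        return np.concatenate([A @ top + E @ bottom, A @ bottom])

    v = np.concatenate([np.zeros(n), b])
    V, H, beta = arnoldi(block_matvec, v, m)
    y = beta * (V @ expm(H)[:, 0])  # f(M) v is approximated by beta * V f(H) e_1
    return y[:n], y[n:]

# Illustrative test problem (sizes, scaling, and seed are assumptions).
rng = np.random.default_rng(1)
n = 40
A = rng.standard_normal((n, n)) / np.sqrt(n)
E = rng.standard_normal((n, n)) / np.sqrt(n)
b = rng.standard_normal(n)

Lb, fb = frechet_times_vector(A, E, b, m=30)
fA, L = expm_frechet(A, E)  # dense reference, feasible only for small n
err_Lb = np.linalg.norm(Lb - L @ b)
err_fb = np.linalg.norm(fb - fA @ b)
print(err_Lb, err_fb)
```

The orthogonalization here mixes the top and bottom halves of the basis vectors, so the projected matrix `H` does not inherit the block triangular structure of the original matrix. The modification proposed in the paper aims to preserve this structure better.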