Relationships between full-space and subspace quadratic interpolation models and simplex derivatives
Quadratic interpolation models and simplex derivatives are fundamental tools in numerical optimization, particularly in derivative-free optimization. When constructed in suitably chosen affine subspaces, these tools have been shown to be especially effective for high-dimensional derivative-free optimization problems, where full-space model construction is often impractical. In this paper, we analyze the relationships between full-space and subspace formulations of these tools. In particular, we derive explicit conversion formulas between full-space and subspace models, including minimum-norm models, minimum Frobenius norm models, and least Frobenius norm updating models, as well as models constructed via generalized simplex gradients and Hessians. We show that the full-space and subspace models coincide on the affine subspace but, in general, differ along directions in the orthogonal complement. Overall, our results provide a theoretical framework for understanding subspace approximation techniques and offer insight into the design and analysis of derivative-free optimization methods.
💡 Research Summary
The paper investigates the precise mathematical relationship between full-space and subspace constructions of two fundamental tools in derivative-free optimization (DFO): quadratic interpolation models and simplex-based derivative approximations. In high-dimensional settings, building a full-space quadratic model requires $(n+1)(n+2)/2$ function evaluations, which quickly becomes prohibitive. Consequently, recent DFO methods construct models in a low-dimensional affine subspace $x_{0}+Q\mathbb{R}^{d}$ (with $d<n$ and $Q$ having orthonormal columns). While such subspace approaches have demonstrated practical success, a rigorous understanding of how the subspace models relate to their full-space counterparts has been lacking.
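The quadratic growth of the evaluation cost is easy to quantify. A trivial illustration (the dimensions $n=100$ and $d=5$ are chosen arbitrarily, not taken from the paper):

```python
def quad_points(n):
    # Number of coefficients -- hence interpolation points needed -- to
    # determine a full quadratic model in R^n: 1 constant + n linear
    # + n(n+1)/2 Hessian entries = (n+1)(n+2)/2.
    return (n + 1) * (n + 2) // 2

print(quad_points(100))  # full space, n = 100: 5151 evaluations
print(quad_points(5))    # subspace,  d = 5:   21 evaluations
```

Even a modest subspace dimension thus reduces the cost of a fully determined model by orders of magnitude.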
The authors assume (1) the sample set $Y$ lies entirely in the subspace, i.e., $Y\subset\{x_{0}+Qb : b\in\mathbb{R}^{d}\}$, and (2) there exists at least one quadratic function that interpolates the data on $Y$. They define a reduced-dimension objective $\tilde f(b)=f(x_{0}+Qb)$ and study three widely used quadratic model families:
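The reduced objective is straightforward to set up in code. A minimal sketch (the objective `f` is a placeholder, and $Q$ is built here from a QR factorization of a random matrix, neither taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 3
# Orthonormal basis Q of a random d-dimensional subspace of R^n
Q, _ = np.linalg.qr(rng.standard_normal((n, d)))
x0 = rng.standard_normal(n)

def f(x):
    return float(np.sum(x ** 2))   # placeholder full-space objective

def f_tilde(b):
    """Reduced objective f~(b) = f(x0 + Q b) on the affine subspace."""
    return f(x0 + Q @ b)

# Sanity checks: Q has orthonormal columns, and b = 0 maps to the base point
assert np.allclose(Q.T @ Q, np.eye(d))
assert np.isclose(f_tilde(np.zeros(d)), f(x0))
```

Any sample point $y = x_0 + Qb \in Y$ then satisfies $f(y) = \tilde f(b)$, which is what allows models of $\tilde f$ to interpolate the original data.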
- Minimum-norm (MN) models, obtained by minimizing $\frac{1}{2}\|\alpha\|^{2}+\frac{1}{2}\|H\|_{F}^{2}$ subject to the interpolation constraints, where $\alpha$ collects the low-order (constant and linear) model coefficients and $H$ is the model Hessian.
- Minimum-Frobenius-norm (MFN) models, which minimize $\frac{1}{2}\|H\|_{F}^{2}$ alone, leaving the low-order coefficients unpenalized (so the model gradient need not be unique).
- Least-Frobenius-norm updating (LFU) models, which start from a prior Hessian $H^{\text{old}}$ and minimize $\frac{1}{2}\|H-H^{\text{old}}\|_{F}^{2}$ under the same constraints.
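As a concrete sketch of the MFN family (my own minimal implementation, not code from the paper): the KKT conditions for minimizing $\frac{1}{2}\|H\|_{F}^{2}$ subject to interpolation force $H=\sum_i\lambda_i d_i d_i^{\top}$, where the multipliers $\lambda$ and the constant/linear coefficients solve a small saddle-point system.

```python
import numpy as np

def mfn_model(D, fvals):
    """Minimum-Frobenius-norm quadratic interpolant
    m(x0 + d) = c + g^T d + 0.5 d^T H d.
    Rows of D are displacements d_i from the base point.  The KKT system
    uses Phi_ij = 0.5 (d_i^T d_j)^2 and yields H = sum_i lam_i d_i d_i^T,
    with the side conditions sum_i lam_i = 0 and sum_i lam_i d_i = 0."""
    m, n = D.shape
    Phi = 0.5 * (D @ D.T) ** 2
    X = np.hstack([np.ones((m, 1)), D])          # columns for c and g
    K = np.block([[Phi, X], [X.T, np.zeros((n + 1, n + 1))]])
    rhs = np.concatenate([fvals, np.zeros(n + 1)])
    sol = np.linalg.solve(K, rhs)
    lam, c, g = sol[:m], sol[m], sol[m + 1:]
    H = (D.T * lam) @ D                          # sum_i lam_i * d_i d_i^T
    return c, g, H

rng = np.random.default_rng(0)
D = rng.standard_normal((5, 2))                  # 5 displacements in R^2
fvals = np.array([np.exp(d).sum() for d in D])
c, g, H = mfn_model(D, fvals)
# The model interpolates the data on the sample set
for di, fi in zip(D, fvals):
    assert np.isclose(c + g @ di + 0.5 * di @ H @ di, fi)
```

An LFU model can then be obtained from the same routine by applying it to the residuals $f_i - m^{\text{old}}(y_i)$ and setting $H = H^{\text{old}} + \sum_i \lambda_i d_i d_i^{\top}$.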
For each family the paper derives explicit conversion formulas that map the subspace model parameters $(\nabla\tilde f,\nabla^{2}\tilde f)$ to the full-space parameters $(\nabla f,\nabla^{2} f)$. The central result (Theorem 1) for MN models states that the full-space model is obtained from the subspace model by lifting through $Q$:
$$
\nabla f = Q\,\nabla\tilde f, \qquad \nabla^{2} f = Q\,\nabla^{2}\tilde f\,Q^{\top}.
$$
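This conversion can be checked numerically for minimum-norm interpolants. The sketch below is my own construction, not the paper's code: it computes least-norm quadratic interpolants via `numpy.linalg.lstsq` (whose objective also penalizes the constant term, a simplifying assumption relative to the MN definition above) in both the full space and the subspace, for a sample set lying in $x_0+Q\mathbb{R}^{d}$, and verifies the lifting formulas.

```python
import numpy as np

def min_norm_model(D, fvals):
    """Least-norm quadratic interpolant m(x0 + d) = c + g^T d + 0.5 d^T H d.
    Rows of D are displacements from the base point.  lstsq returns the
    minimum-norm coefficient vector (c, g, vec(H)), i.e. it minimizes
    c^2 + ||g||^2 + ||H||_F^2 subject to interpolation."""
    n = D.shape[1]
    A = np.vstack([np.concatenate(([1.0], d, 0.5 * np.outer(d, d).ravel()))
                   for d in D])
    coef, *_ = np.linalg.lstsq(A, fvals, rcond=None)
    return coef[0], coef[1:1 + n], coef[1 + n:].reshape(n, n)

rng = np.random.default_rng(1)
n, d = 6, 2
Q, _ = np.linalg.qr(rng.standard_normal((n, d)))   # orthonormal columns
x0 = rng.standard_normal(n)
B = rng.standard_normal((5, d))                    # subspace coordinates b_i
Y = x0 + B @ Q.T                                   # sample set in x0 + Q R^d
fvals = np.array([np.cos(y).sum() for y in Y])

cF, gF, HF = min_norm_model(Y - x0, fvals)         # full-space MN-type model
cS, gS, HS = min_norm_model(B, fvals)              # subspace MN-type model

# Conversion formulas: the full-space parameters lift through Q
assert np.isclose(cF, cS)
assert np.allclose(gF, Q @ gS)
assert np.allclose(HF, Q @ HS @ Q.T)
```

The equality holds because, with all sample points in $x_0+Q\mathbb{R}^{d}$, the interpolation constraints only see $Q^{\top}g$ and $Q^{\top}HQ$, so norm-minimality forces the full-space gradient into the range of $Q$ and annihilates the Hessian on the orthogonal complement.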