Optimization on Weak Riemannian Manifolds

Valentina Zalbertus, Max Pfeffer, and Alexander Schmeding

Abstract. Riemannian structures on infinite-dimensional manifolds arise naturally in shape analysis and shape optimization. These applications lead to optimization problems on manifolds which are not modeled on Banach spaces. The present article develops the basic framework for optimization via gradient descent on weak Riemannian manifolds, leading to the notion of a Hesse manifold. Further, foundational properties for optimization are established for several classes of weak Riemannian manifolds connected to shape analysis and shape optimization.

Contents
1. Introduction
2. Preliminaries
3. Weak Riemannian Manifolds in Optimization
4. Optimality Conditions
4.1. First-Order Optimality Conditions
4.2. Second-Order Optimality Conditions
5. The Riemannian Gradient Descent Method
6. Classes of Hesse Manifolds and their Optimization-relevant Properties
6.1. Robust Riemannian manifolds
6.2. Strong Riemannian manifolds
7. Computation of the Riemannian Gradient and the Riemannian Hessian
8. Numerical Experiments
Appendix A. Sprays, connections and metrics
References

1. Introduction

In recent years, standard first- and second-order methods from continuous optimization in Euclidean space have been generalized to Riemannian manifolds, thus kickstarting the very active field of Riemannian optimization. In particular, much research has been done for matrix manifolds [1, 8]. Even nonsmooth optimization on smooth Riemannian manifolds has been studied extensively [9, 39]. In higher dimensions, it has been recognized that tensor trees form Riemannian manifolds, allowing for the adaptation of methods from matrix manifolds [21]. However, things are much less clear when it comes to Riemannian manifolds of infinite dimension.
For the special case of Hilbert manifolds, optimization using gradient descent is classical; see, e.g., the literature overview in [40]. There are natural geometric applications for gradients and their flows on Hilbert (sub-)manifolds: Morse theory [33, 35], energy functionals [10] for knot deformations, and optimal transport on Wasserstein space (see e.g. [2, 27, 32] for discussions). Beyond Hilbert manifolds, gradient descent techniques typically use conjugate gradients in reflexive Banach spaces, see e.g. [13, 14, 42].

Date: March 27, 2026.
2020 Mathematics Subject Classification: 49K27, 58B20, 58C20, 90C48, 58D30, 49Q10.
Key words and phrases: (weak) Riemannian manifold, infinite-dimensional optimization, first-order conditions, variational analysis, shape analysis, shape optimization.

In the present article we discuss basic theory for optimization on infinite-dimensional manifolds using gradients and Hessians beyond the setting of Hilbert manifolds. One of several challenges arising in the passage to infinite dimensions is the split between different regimes of Riemannian geometry: Hilbert manifolds admit strong Riemannian metrics, but manifolds modeled on more general spaces only admit weak Riemannian metrics, see [37]. For strong Riemannian metrics, the theory develops along the finite-dimensional lines, see e.g. [12, 20, 33]. Since infinite-dimensional manifolds are not locally compact, extra conditions (e.g. the Palais–Smale condition (C), [35]) are required to ensure convergence of the gradient sequences. Second-order theory using the Riemannian Hessian becomes more involved already on Hilbert manifolds. Beyond Hilbert manifolds, every Riemannian metric is necessarily a weak Riemannian metric, i.e., the induced inner products on the tangent spaces are only continuous and do not induce the native topology.
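The failure of norm equivalence can be made concrete in a finite truncation of $\ell^2$. The weighted inner product below, with weights $1/n^2$, is a hypothetical illustration chosen by us (not an example from this article): on unit basis vectors the weak norm collapses to $1/k$, so the two norms admit no uniform equivalence constant.

```python
import numpy as np

def weak_norm(x):
    # Weighted "weak" inner product norm: <x, x>_w = sum_n x_n^2 / n^2.
    n = np.arange(1, len(x) + 1)
    return np.sqrt(np.sum(x**2 / n**2))

# The unit basis vectors e_k all have l2-norm 1, but weak norm 1/k -> 0,
# so the weak norm is not bounded below by any multiple of the l2-norm.
ratios = []
for k in [1, 10, 100, 1000]:
    e = np.zeros(1000)
    e[k - 1] = 1.0
    ratios.append(weak_norm(e))  # equals 1/k
print(ratios)
```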
Even on an open subset of an infinite-dimensional Hilbert space, the inner product induced by a weak Riemannian metric is in general not equivalent to the Hilbert space product of the model space. Weak Riemannian metrics arise in many applications. We list several settings where gradients, gradient flows and questions from optimization are of central interest in an infinite-dimensional setting:

- As pioneered by V.I. Arnold, certain partial differential equations (PDEs) lift to geodesic equations on manifolds of Sobolev mappings (cf. [37, Chapter 7]). These are Hilbert manifolds with weak Riemannian metrics, cf. e.g. [32].
- Shape analysis studies invariant metrics and flows on weak Riemannian manifolds of mappings and diffeomorphism groups, cf. e.g. [5, 28, 30, 31, 43]. Here optimization is relevant in large deformation diffeomorphic metric mapping (LDDMM), [41]. See e.g. [4] for a concrete example involving the gradient flow.
- Shape optimization studies gradients for weak Riemannian metrics on infinite-dimensional manifolds, see e.g. [25, 26].
- (Time-)evolving embedded manifolds and evolution equations on them lead to gradient flows on weak Riemannian manifolds. The curve-shortening flow studied by Gage and Hamilton and related flows are of this type, cf. [15, 38].

The state of the art for treating these problems is to employ one of the following strategies. The first is to treat the qualitative behaviour of the gradient flows: for example, for time-evolving and shape manifolds, the development of singularities of the gradient flows and geodesic equations is studied without directly employing numerical methods, [15, 32]. For optimization schemes based on infinite-dimensional manifolds, there are two main approaches: in many relevant examples, the infinite-dimensional gradient equations can be translated to finite-dimensional (partial) differential equations.
These are then numerically solved using PDE methods (e.g. [6, 25, 26]). In the Hilbert manifold setting, discretisations of the equations are applied together with conditions ensuring convergence and convergence rates, see e.g. [12, 33, 40]. These techniques have been generalised to Banach manifolds (e.g. [10, 13, 14, 41, 42]) using weaker notions of gradients and dualities not necessarily induced by (weak) Riemannian metrics. These approaches either require strong settings (strong metrics, Hilbert manifolds) or exploit connections to finite-dimensional geometry for the discretisation and computation of the descent scheme. To the best of our knowledge, a general investigation of basic optimization algorithms for weak Riemannian manifolds is so far missing.

One aim of the present article is to provide an introduction to basic optimization techniques on infinite-dimensional manifolds in the weak setting. We highlight pitfalls and challenges arising on Riemannian manifolds beyond the Hilbert setting. Further, fundamental optimality conditions and convergence results for optimization on weak Riemannian manifolds are provided. While much of the classical intuition from finite-dimensional optimization (as presented in [8]) carries over, the absence of the Hilbert/Banach space structure makes it a priori unclear in which sense standard optimality conditions generalise to weak Riemannian manifolds.

Theorem 1.1 (First-Order Optimality). Let $f : M \to \mathbb{R}$ be continuously differentiable on a weak Riemannian manifold $M$. Then every local minimizer $p \in M$ satisfies $\nabla f(p) = 0$, where $\nabla f$ denotes the Riemannian gradient of $f$.

This recovers the familiar necessary condition from finite-dimensional optimization [8].
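The first-order condition can already be illustrated in the simplest finite-dimensional case. On the unit sphere $S^2 \subset \mathbb{R}^3$ with the induced metric, the Riemannian gradient of $f(x) = \langle a, x \rangle$ is the tangential projection of the Euclidean gradient, and it vanishes exactly at the minimizer $x^* = -a/\|a\|$. A minimal sketch (the sphere and the linear objective are our illustrative choices, not an example from the article):

```python
import numpy as np

def riem_grad(a, x):
    # Riemannian gradient of f(x) = <a, x> on the sphere:
    # project the Euclidean gradient a onto the tangent space at x.
    return a - np.dot(a, x) * x

a = np.array([1.0, 2.0, 2.0])        # |a| = 3
x_min = -a / np.linalg.norm(a)       # global minimizer of f on S^2
x_other = np.array([1.0, 0.0, 0.0])  # some other point on S^2

print(np.linalg.norm(riem_grad(a, x_min)))    # ~ 0: first-order condition holds
print(np.linalg.norm(riem_grad(a, x_other)))  # > 0: not a critical point
```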
To extend first-order optimality conditions to algorithms, we show that, under an additional assumption ensuring sufficient structure on weak Riemannian manifolds, the classical finite-dimensional convergence result for Riemannian gradient descent [8] carries over to our present setting.

Theorem 1.2. All accumulation points of the sequence of iterates $(p_n)_{n \in \mathbb{N}}$ generated by the Riemannian descent algorithm are critical points, and $\lim_{n \to \infty} \|\nabla f(p_n)\| = 0$, where $\|\cdot\|$ is the norm induced by the weak Riemannian metric.

Second-order optimality is more complicated due to the intricate structure arising in the critical points of the Hessian (cf. e.g. Theorem 7.6). Nevertheless, one can prove the following:

Theorem 1.3 (Second-Order Optimality). A point $p \in M$ with $\nabla f(p) = 0$ and $\operatorname{Hess} f(p)$ positive definite is a local minimizer if and only if the Riemannian Hessian is coercive at that point, i.e. there exists $\mu > 0$ such that
$$g_p(\operatorname{Hess} f(p)[v], v) \ge \mu \|v\|_p^2 \quad \forall v \in T_p M.$$

Unlike the finite-dimensional setting, where positive definiteness of the Hessian suffices, coercivity is more restrictive here, failing to follow from positive definiteness on weak Riemannian manifolds. Note that this describes a typical phenomenon beyond Hilbert spaces. For example, it is well known that convexity properties of functions used in finite-dimensional optimization typically force a Banach space to be either reflexive or even a Hilbert space (see e.g. [7, 16]).

To establish second-order optimality conditions that provide, in addition to necessary conditions, a sufficient condition for local minima, we require several additional properties of the underlying weak Riemannian manifold. These properties ensure that the Hessian is well behaved and allow us to draw conclusions about local extrema. A weak Riemannian manifold satisfying these properties will be called a Hesse manifold.
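The gap between positive definiteness and coercivity is easy to see in a truncated model of $\ell^2$. The diagonal operator with entries $1/n$ below is our own illustrative choice of "Hessian": every truncation is positive definite, yet the quadratic form on unit vectors becomes arbitrarily small, so no uniform $\mu > 0$ can exist as the dimension grows.

```python
import numpy as np

N = 1000
A = np.diag(1.0 / np.arange(1, N + 1))  # diagonal operator with eigenvalues 1/n > 0

# Positive definite: the smallest eigenvalue of the truncation is positive.
lambda_min = np.linalg.eigvalsh(A).min()

# Not coercive in the limit: on unit basis vectors e_n the quadratic form
# <A e_n, e_n> = 1/n -> 0, so no single mu > 0 bounds it from below for all n.
quad_forms = [A[n, n] for n in (0, 9, 99, 999)]
print(lambda_min, quad_forms)
```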
We show that Hesse manifolds constitute a refinement of the existing classification into weak, robust, and strong Riemannian manifolds. In particular, we demonstrate:

Theorem 1.4. Every robust Riemannian $C^\infty$-manifold $(M, g)$ is a Hesse manifold.

We then study the robust metrics introduced in [28] with respect to their application in optimization. As a new result, we prove that the class of elastic metrics from shape analysis is robust. Summing up, this leads to the following hierarchy of Riemannian manifolds:

[Hierarchy diagram: among (possibly) infinite-dimensional manifolds, strong Riemannian manifolds form a subclass of robust Riemannian manifolds, which form a subclass of Hesse manifolds, which in turn form a subclass of weak Riemannian manifolds. Examples attached to the diagram: finite-dimensional paracompact manifolds and Hilbert manifolds; Grossmann's ellipsoid (Theorem 6.12); the $L^2$-metric and the elastic metric (Theorem 6.7; Theorem 6.8); the twisted $\ell^2$-metric (Theorem 3.4).]

The structure of the article is as follows: To establish Riemannian optimization on weak Riemannian manifolds, we first address the primary structural challenges. Section 3 introduces two fundamental restrictions enabling Riemannian optimization in this generality, presents examples of pathological behavior without them, and verifies that these restrictions preserve the essential structure of weak Riemannian manifolds.

Building on this foundation, Section 4 derives first- and second-order optimality conditions in terms of the Riemannian gradient and Hessian. Section 5 introduces the Riemannian gradient descent method and analyzes its convergence, showing that classical results carry over under mild additional conditions.

We then introduce two key classes, strong and robust Riemannian manifolds, focusing on the latter's construction and structural properties (Section 6), while proving simplifications for the former. Finally, Section 7 provides explicit formulas for Riemannian gradients and Hessians, complemented by numerical examples (Section 8).

Acknowledgements. V.Z.
was funded by the German Research Foundation (DFG, Projektnummer 448293816). V. Zalbertus thanks the mathematical institute at NTNU for the hospitality during a research stay while part of this work was conducted.

2. Preliminaries

Weak Riemannian manifolds are often modeled on locally convex spaces which are in general not Banach manifolds. The usual calculus, also called Fréchet differentiability, has to be replaced. We employ Bastiani calculus, see [37, Section 1.4], which is based on directional derivatives. This means that a continuous function $f : E \supseteq U \to F$ on an open subset of a locally convex space is $C^1$ if for every $x \in U$, $v \in E$ the directional derivative
$$df(x; v) := \lim_{h \to 0} h^{-1}\left(f(x + hv) - f(x)\right)$$
exists and yields a continuous map $df : U \times E \to F$. Using iterated directional derivatives, one likewise defines $C^k$-mappings for $k \in \mathbb{N}$. A map which is $C^k$ for all $k \in \mathbb{N}$ is called smooth or $C^\infty$. The usual assertions such as linearity of the derivative and the chain rule remain valid.

As the chain rule is valid, we can define manifolds via charts, as in finite dimensions. A manifold is called a Hilbert/Banach/Fréchet manifold if all the modelling spaces of the manifold are Hilbert/Banach/Fréchet spaces. Further, for a manifold $M$ the tangent spaces $T_p M$ are defined via equivalence classes of curves [37, Def. 1.41] and are canonically isomorphic to the model space of the manifold. Similarly, the tangent bundle and differentiability of mappings on manifolds can be defined. For the tangent map of a $C^1$-map $f : M \to N$ we will write $D_p f : T_p M \to T_{f(p)} N$, $[\gamma] \mapsto [f \circ \gamma]$. For a vector bundle $\pi : E \to M$ on a smooth manifold, we will write $\Gamma(E)$ for the space of smooth bundle sections. In the special case that $E = TM$ is the tangent bundle, we also write $\mathcal{V}(M) := \Gamma(TM)$.
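The Bastiani $C^1$ condition is a pointwise limit that can be checked numerically in simple cases by a difference quotient $h^{-1}(f(x + hv) - f(x))$. A minimal sketch (the function and direction are our illustrative choices):

```python
import numpy as np

def f(x):
    # A smooth map R^3 -> R; its directional derivative is df(x; v) = 2 <x, v>.
    return float(np.dot(x, x))

def directional_derivative(f, x, v, h=1e-6):
    # One-sided difference quotient h^{-1} (f(x + h v) - f(x)).
    return (f(x + h * v) - f(x)) / h

x = np.array([1.0, -2.0, 0.5])
v = np.array([0.3, 0.0, 1.0])
numeric = directional_derivative(f, x, v)
exact = 2 * np.dot(x, v)  # analytic value 2<x, v> = 1.6
print(numeric, exact)
```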
When establishing Riemannian metrics on locally convex manifolds beyond the Hilbert setting, a crucial distinction arises between weak and strong Riemannian metrics, essential for the subsequent optimization.

Definition 2.1 (Weak/Strong Riemannian Manifold). Let $M$ be a $C^1$-manifold. A weak Riemannian metric $g$ on $M$ is a smooth map $g : TM \oplus TM \to \mathbb{R}$, $(v_p, w_p) \mapsto g_p(v_p, w_p)$, such that $g_p$ is symmetric, bilinear on $T_p M \times T_p M$, and $g_p(v, v) \ge 0$ with equality iff $v = 0$. If the topology induced by $g_p$ on $T_p M$ coincides with the subspace topology of $T_p M \subseteq TM$, then $g$ is strong. We then call $(M, g)$ a weak/strong Riemannian manifold.

Since we operate beyond the Banach setting, there is no natural norm on the spaces we consider. Although the inner products induce norms, these do not generate the natural topology, and in particular, the spaces are not complete with respect to these norms.

Remark 2.2. To avoid confusion, we write $\|v\|_p := \sqrt{g_p(v, v)}$ for the norm on $T_p M$ induced by the inner product $g_p$, which need not be complete, and $\|v\|$ for a Banach norm if we are working in the Banach case.

To facilitate Riemannian optimization in our setting, we introduce:

Definition 2.3 (Riemannian Gradient). Let $(M, g)$ be a weak Riemannian $C^1$-manifold and $f : M \to \mathbb{R}$ a $C^1$-map. A vector field $\nabla f$ satisfying
$$D_p f(v) = g_p(\nabla f(p), v) \quad \forall v \in T_p M$$
is the Riemannian gradient of $f$.

Definition 2.4 (Riemannian Hessian). Let $(M, g)$ be a $C^2$-manifold with first-order¹ Levi–Civita connection $\nabla$, and $f : M \to \mathbb{R}$ a $C^2$-function with Riemannian gradient $\nabla f$. The Riemannian Hessian of $f$ at $p$ is the map $\operatorname{Hess} f(p) : T_p M \to T_p M$, $v \mapsto \nabla_v \nabla f(p)$.

¹A connection is first order if its value at a point depends at most on the 1-jets of the sections at the point; see Remark 4.5. Every connection on a finite-dimensional manifold is of first order.

All definitions and results from infinite-dimensional differential geometry follow [37]. For the reader's convenience we recall some essential technical objects in Section A.

3. Weak Riemannian Manifolds in Optimization

To introduce the subsequent chapters on optimization on weak Riemannian manifolds, we first specify the setting in which Riemannian optimization techniques can be applied. Although the objective of this work is to develop optimization methods on spaces as general as possible, namely weak Riemannian manifolds, the weak structure of the underlying geometry requires us to impose several structural assumptions in order to establish a well-defined framework.

Since our optimization approach relies on Riemannian methods, we focus on first- and second-order differential objects, in particular the Riemannian gradient and the Riemannian Hessian. These quantities are essential for the formulation and analysis of first- and second-order optimality conditions and gradient-based optimization algorithms. On weak Riemannian manifolds, however, these objects are not available in general.

Recall that for a weak Riemannian $C^1$-manifold $(M, g)$ the Riemannian gradient of a $C^1$-function $f$ is defined as the unique vector field satisfying $D_p f(v) = g_p(\nabla f(p), v)$ for all $v \in T_p M$. Since on weak Riemannian manifolds the musical morphism between the tangent bundle and its dual is not necessarily surjective [37, 4.4], the existence of the Riemannian gradient of a function cannot be guaranteed. The following example demonstrates a situation in which the Riemannian gradient fails to exist in the tangent space under consideration.

Example 3.1. We consider the space $\mathrm{Imm}(S^1, \mathbb{R}^2)$ of all smooth immersions with the invariant $H^1$-metric:
$$g^{H^1}_{\mathrm{inv}, c}(u, v) := g_{\mathrm{inv}, c}(u, v) + g_{\mathrm{inv}, c}(\dot{u}, \dot{v}).$$
In [37, Section 4] it has been shown that $\left(\mathrm{Imm}(S^1, \mathbb{R}^2), g^{H^1}_{\mathrm{inv}}\right)$ is indeed a weak Riemannian manifold. We then consider the length functional
$$\mathcal{L} : \mathrm{Imm}(S^1, \mathbb{R}^2) \to \mathbb{R}, \quad \mathcal{L}(c) := \int_{S^1} |\dot{c}| \, d\mu.$$
In [38, Section 4.1] the invariant $H^1$-gradient of the length functional $\mathcal{L}$ was computed using a Green's function to solve the arising ODE. Using the arc-length reparametrisation for $c$, we write $\gamma : [0, L] \to \mathbb{R}^2$, $s \mapsto c(\exp(is/2\pi))$ with $L := \mathcal{L}(c)$, and the Riemannian gradient becomes
$$\nabla \mathcal{L}(s) = \gamma(s) + \int_0^L \gamma(t) \frac{\cosh\left(|s - t| - \frac{L}{2}\right)}{2 \sinh\left(-\frac{L}{2}\right)} \, dt. \tag{1}$$
Now (1) will in general not be differentiable in $s$ (namely in the contribution by the Green's function), whence the Riemannian gradient of $\mathcal{L}$ does not exist as an element of $T\,\mathrm{Imm}(S^1, \mathbb{R}^2)$ (or, for that matter, in the tangent space of the once continuously differentiable immersions, which is the context studied in [38]). Here the gradient only exists as an element of the completion of the tangent space, which can be identified with the space $H^1(S^1, \mathbb{R}^2)$ of all Sobolev $H^1$-functions.

Remark 3.2. The gradient flow induced by the length functional with respect to the invariant $L^2$-metric corresponds to the famous curve shortening flow studied in [15]. With respect to the invariant $H^1$-metric, the corresponding gradient flow has been studied in [38].

Nevertheless, assuming the existence of a Riemannian gradient does not turn out to be overly restrictive, since its existence does not, for instance, imply that the metric is strong. In Section 7, we present several examples illustrating the computation of Riemannian gradients on weak Riemannian manifolds.
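Formula (1) can also be evaluated numerically. As a sanity check (our own discretization, not taken from the article): for the arc-length parametrised unit circle the kernel depends only on the circular distance between $s$ and $t$, so by symmetry the formula reduces to $\nabla\mathcal{L}(s) = \tfrac{1}{2}\gamma(s)$.

```python
import numpy as np

# Unit circle, arc-length parametrised: gamma(s) = (cos s, sin s), L = 2*pi.
L = 2 * np.pi
N = 4000
t = np.linspace(0.0, L, N)  # integration grid, endpoints included
gamma = np.stack([np.cos(t), np.sin(t)], axis=1)

def trapezoid(y, x):
    # Composite trapezoid rule along axis 0 for a vector-valued integrand.
    dx = np.diff(x)
    return ((y[:-1] + y[1:]) * 0.5 * dx[:, None]).sum(axis=0)

def grad_length(s_idx):
    # Discretisation of formula (1): gamma(s) + int_0^L gamma(t) k(s, t) dt
    # with kernel k(s, t) = cosh(|s - t| - L/2) / (2 sinh(-L/2)).
    s = t[s_idx]
    kernel = np.cosh(np.abs(s - t) - L / 2) / (2 * np.sinh(-L / 2))
    return gamma[s_idx] + trapezoid(gamma * kernel[:, None], t)

g0 = grad_length(0)       # expected ~ 0.5 * gamma(0) = (0.5, 0)
g1 = grad_length(N // 4)  # expected ~ 0.5 * gamma(t[N // 4])
print(g0, g1)
```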
For instance, Example 7.5 provides an explicit computation of the Riemannian gradient of the length functional $\mathcal{L}$ on the space of smooth immersions $\mathrm{Imm}(S^1, \mathbb{R}^2)$ endowed with the invariant $L^2$-metric, thereby demonstrating that the existence of the Riemannian gradient of a function depends not only on the function itself but also on the chosen metric.

In the context of Riemannian optimization, where the Riemannian gradient is essential but its existence cannot be guaranteed when working on weak Riemannian manifolds, we introduce the following definition for notational convenience.

Definition 3.3. A $C^1$-function $f : M \to \mathbb{R}$ on a weak Riemannian $C^1$-manifold $(M, g)$ is called a gradient-admitting function (abbreviated gaf) if the Riemannian gradient $\nabla f(p)$ exists for all $p \in M$.

In addition to the Riemannian gradient, the Riemannian Hessian encodes second-order information about the local behavior of the function. Consider a weak Riemannian $C^\infty$-manifold $M$ that admits a first-order Levi-Civita connection $\nabla$. For a gradient-admitting $C^2$-function $f$ on $M$, recall that the Riemannian Hessian of $f$ at $p \in M$ is defined by
$$\operatorname{Hess} f(p)[u] = \nabla_u \nabla f, \quad u \in T_p M.$$
Consequently, the definition of the Riemannian Hessian requires not only the existence of the Riemannian gradient but also the availability of a first-order Levi-Civita connection. This imposes an additional structural restriction on the underlying manifold. In particular, on weak Riemannian manifolds such a connection does not exist in general. An explicit example of a weak Riemannian manifold without a Levi-Civita connection is given in [5, p. 12].

However, the existence of a Levi-Civita connection alone is still not sufficient for our subsequent analysis. In order to carry out basis-independent arguments, we additionally require the existence of a metric spray, cf. Section A.
A spray is a second-order vector field which, when compatible with the metric, plays the same role as the Christoffel symbols. Such a spray not only induces a first-order Levi-Civita connection, but also provides the covariant derivative structure necessary for intrinsic arguments. Similarly to the Levi-Civita connection, a metric spray does not exist on weak Riemannian manifolds in general.

Example 3.4. Consider the Hilbert space $M = (\ell^2, \langle \cdot, \cdot \rangle)$ of all square-summable real sequences, equipped with the weak Riemannian metric
$$g : T\ell^2 \oplus T\ell^2 \to \mathbb{R}, \quad T_p \ell^2 \times T_p \ell^2 \ni \left((x_n)_n, (y_n)_n\right) \mapsto e^{-\|p\|^2} \sum_{n \in \mathbb{N}} \frac{x_n y_n}{n^3}.$$
As shown in [37, 4.22], this metric does not admit a metric spray. By contrast, [37, 5.7] computes the metric spray for a large class of weak Riemannian manifolds of the form $\left(C^\infty(S^1, M), g_{L^2}\right)$, where $(M, g)$ is a strong Riemannian manifold and $g_{L^2}$ denotes the induced $L^2$-metric, showing that this additional assumption does not imply that the metric is strong.

Example 3.4 thus demonstrates that additional structural assumptions are necessary to ensure the existence and well-posedness of the Riemannian Hessian. Accordingly, the following definition establishes notation and identifies the class of weak Riemannian manifolds considered in this work.

Definition 3.5. A weak Riemannian $C^\infty$-manifold $(M, g)$ is called a Hesse manifold if it admits a metric spray $S_g$.

4. Optimality Conditions

In this chapter, we derive first- and second-order optimality conditions for optimization on weak Riemannian manifolds under the structural assumptions introduced in the previous chapter. The goal is to show that, once these restrictions are imposed, the local optimality theory closely parallels the one on strong or finite-dimensional Riemannian manifolds. Our exposition follows the framework developed by Boumal in [8] for finite-dimensional Riemannian manifolds.
We adopt his definitions of critical points, Riemannian gradients and Riemannian Hessians, and adapt the corresponding arguments to the present setting of weak Riemannian manifolds. In particular, we show that under the stated assumptions, first-order necessary optimality conditions can be formulated in terms of vanishing Riemannian gradients. While second-order conditions in the finite-dimensional setting typically only require positive definiteness of the Riemannian Hessian to guarantee a local minimum, in the infinite-dimensional setting considered here positive definiteness alone is not sufficient. Instead, an additional requirement is needed: the Riemannian Hessian must be coercive at the point of interest. These results justify the use of classical optimization intuition in the more general weak Riemannian setting for first-order conditions; however, this intuition does not carry over to second-order conditions, where additional assumptions and analytical tools are required to rigorously establish local optimality.

Throughout this chapter, $(M, g)$ denotes a weak Riemannian $C^1$-manifold.

4.1. First-Order Optimality Conditions. As a first step towards establishing optimization conditions on weak Riemannian manifolds, we consider the notion of critical points. In the finite-dimensional and strong Riemannian settings, critical points are characterized by the vanishing of the Riemannian gradient and are directly linked to first-order necessary conditions.

In the present weak Riemannian setting, however, this characterization is not immediate, as the definition of differentials and tangent spaces relies on Bastiani calculus rather than on a Hilbert space structure. We therefore begin by verifying that Boumal's definition of critical points is compatible with the differential structure adopted here.

Definition 4.1. Let $f : M \to \mathbb{R}$ be a $C^1$-map.
A point $p \in M$ is called a critical point of $f$ if $(f \circ \gamma)'(0) \ge 0$ for all $C^1$-curves $\gamma$ on $M$ passing through $p$.

Despite the weak Riemannian structure, critical points admit the same characterization as in the finite-dimensional setting: critical points can be characterized equivalently by the vanishing of the differential and by the vanishing of the Riemannian gradient. The calculations are the same as in the finite-dimensional setting and, for the reader's convenience, we highlight only where the weak structure is needed.

Proposition 4.2. Let $f : M \to \mathbb{R}$ be $C^1$ and $p \in M$. The point $p$ is a critical point of $f$ if and only if
(1) $D_p f(v) = 0$ for all $v \in T_p M$,
(2) $\nabla f(p) = 0$, if $f$ is a gaf.
Finally, every local minimizer of $f$ is a critical point.

Proof. The equivalence with (1) and the addendum can be proved exactly as in the finite-dimensional case; see e.g. [8, Proposition 4.5], which only uses the continuity of $f \circ c$ for a smooth curve $c$ on $M$. For (2) we observe that, as
$$D_p f(v) = g_p(\nabla f(p), v) = 0 \quad \forall v \in T_p M, \tag{2}$$
(1) implies (2) because a weak Riemannian metric is non-degenerate; thus the gradient vanishes if and only if $p$ is critical. □

This result enables us to establish the fundamental link between minimizers and critical points. Consequently, the classical first-order necessary optimality condition remains valid in the weak Riemannian framework considered here. This provides the foundation for the second-order analysis developed below.

4.2. Second-Order Optimality Conditions. We now establish sufficient second-order optimality conditions on Hesse manifolds, that is, manifolds equipped with a Levi-Civita connection induced by a metric spray. The metric spray framework allows us to define covariant derivatives of vector fields along curves in a basis-independent manner.
This intrinsic notion of differentiation is crucial for formulating a second-order Taylor expansion of functions along suitable curves without assuming the existence of a basis of the underlying vector space.

We show that, unlike in the finite-dimensional setting where positive definiteness of the Riemannian Hessian alone suffices, a critical point must not only admit a positive definite Hessian but also satisfy a coercivity condition in order to be a strict local minimizer. This highlights an important distinction between finite-dimensional optimization and optimization in the weak Riemannian setting.

We briefly recall the definition of the Riemannian Hessian for convenience.

Definition 4.3. Let $(M, g)$ be a Hesse manifold and $f : M \to \mathbb{R}$ be a $C^2$-gaf. Then the Riemannian Hessian of $f$ at $p \in M$ is defined as
$$\operatorname{Hess} f(p) : T_p M \to T_p M, \quad u \mapsto \nabla_u \nabla f.$$

To relate the Riemannian Hessian to local minimality, we analyze the second-order expansion of $f$ along smooth curves. Let $c : I \to M$ be a smooth curve with $c(0) = p$, and define $g = f \circ c$. Since $g : I \to \mathbb{R}$ is a classical $C^2$-function, we have the standard Taylor expansion
$$f(c(t)) = g(t) = g(0) + t g'(0) + \frac{t^2}{2} g''(0) + O(t^3). \tag{3}$$
The first derivative follows from the chain rule:
$$g'(t) = D_{c(t)} f(c'(t)) = g_{c(t)}\left(\nabla f(c(t)), c'(t)\right). \tag{4}$$
In particular, $(f \circ c)'(0) = g_p(\nabla f(p), c'(0))$. Thus, first-order behavior is completely determined by the Riemannian gradient.

To compute the second derivative $g''(t)$, we must differentiate $g_{c(t)}\left(\nabla f(c(t)), c'(t)\right)$. This requires a notion of differentiation of vector fields along curves. Those vector fields are defined analogously to [8, Definition 5.28] as follows:

Definition 4.4. Let $M$ be a manifold and $c : I \to M$ be a curve on $M$.
A (smooth) map $Z : I \to TM$ is called a (smooth) vector field on $c$ if $Z(t) \in T_{c(t)} M$ for all $t \in I$. The set of all smooth vector fields on $c$ is denoted by $\mathcal{V}(c)$.

To make sense of differentiation of vector fields on curves, we require an appropriate operator with certain properties. Since not all vector fields $Z \in \mathcal{V}(c)$ are of the form $X \circ c$ for some $X \in \mathcal{V}(M)$, we cannot simply use the Levi-Civita connection on $M$ and must introduce a different concept for differentiating such vector fields. This is precisely where the metric spray structure becomes essential.

Remark 4.5. It is a standard argument that every connection $\nabla$ on a finite-dimensional vector bundle is of first order in the sense that for sections $X, Y$ and $m \in M$, the value $\nabla_X Y(m)$ depends only on the value $X(m)$ and the first-order jet of $Y$. Unfortunately, the finite-dimensional proof does not generalise without further assumptions. One can prove that every connection associated to a spray, cf. Section A, is a first-order connection in this sense. It is unknown whether there exist connections on infinite-dimensional manifolds which are not of first order.

If the Levi–Civita connection is induced by a metric spray, then one obtains a canonical differentiation operator along curves, called the covariant derivative along $c$.

Theorem 4.6. Let $(M, g)$ be a Hesse manifold. For every smooth curve $c : I \to M$, there exists a unique operator $\frac{D}{dt} : \mathcal{V}(c) \to \mathcal{V}(c)$, called the covariant derivative along $c$, that satisfies the following properties for all $Y, Z \in \mathcal{V}(c)$, $X \in \mathcal{V}(M)$, $g \in C^1(I, \mathbb{R})$ and $a, b \in \mathbb{R}$:
(1) $\mathbb{R}$-linearity: $\frac{D}{dt}(aY + bZ) = a \frac{D}{dt} Y + b \frac{D}{dt} Z$,
(2) Leibniz rule: $\frac{D}{dt}(g Z) = g' Z + g \frac{D}{dt} Z$,
(3) Chain rule: $\left(\frac{D}{dt}(X \circ c)\right)(t) = \nabla_{c'(t)} X$ for all $t \in I$.
(4) Product rule: $\frac{d}{dt} g(Y, Z) = g\left(\frac{D}{dt} Y, Z\right) + g\left(Y, \frac{D}{dt} Z\right)$, where $g(Y, Z) \in C^1(I, \mathbb{R})$ is defined by $g(Y, Z)(t) = g_{c(t)}(Y(t), Z(t))$.

Proof. The existence and uniqueness of such an operator follows from Proposition 4.36 in [37]. The construction presented there is based on the metric spray and yields a covariant derivative along curves satisfying properties (1)–(4). □

Remark 4.7. In the finite-dimensional setting, analogous constructions are often carried out using local frames and coordinate representations, as for instance done by Boumal in [8, Theorem 5.29]. Such arguments rely on the existence of finite-dimensional bases of the tangent spaces. In contrast, the present approach is based on the spray-induced connection and does not require the use of local frames. The differentiation operator along curves is constructed intrinsically, without resorting to basis expansions. This makes the argument directly applicable in the weak infinite-dimensional Riemannian setting considered here.

To relate the Riemannian Hessian to the second-order expansion along curves, we express it in terms of the induced covariant derivative. Let $c : I \to M$ be a smooth curve with $c(0) = p$ and $c'(0) = v$. By the chain rule for the induced covariant derivative along $c$, we obtain
$$\operatorname{Hess} f(p)[v] = \nabla_v \nabla f = \left.\frac{D}{dt} \nabla f(c(t))\right|_{t=0}. \tag{5}$$
Using the representation of the Riemannian Hessian in terms of the induced covariant derivative (5) and the structural properties established in Theorem 4.6, the computation of the second derivative of $g = f \circ c$ proceeds exactly as in the finite-dimensional case in [8, 5.9]. As the argument uses only structural properties of the covariant derivative, it remains valid in the present weak Riemannian framework.
Hence,
$$g''(t) = g_{c(t)}\big(\operatorname{Hess} f(c(t))[c'(t)], c'(t)\big) + g_{c(t)}\big(\nabla f(c(t)), c''(t)\big). \tag{6}$$
Consequently, the second-order Taylor expansion of $f \circ c$ is given by
$$f(c(t)) = f(p) + t\, g_p\big(\nabla f(p), v\big) + \frac{t^2}{2}\, g_p\big(\operatorname{Hess} f(p)[v], v\big) + \frac{t^2}{2}\, g_p\big(\nabla f(p), c''(0)\big) + O(t^3). \tag{7}$$

Having expressed the second-order Taylor expansion in terms of the Riemannian gradient and the Riemannian Hessian, we now adopt the notion of second-order critical points as introduced in the finite-dimensional setting by Boumal [8, Section 6.1]. These points will be shown to coincide precisely with the local minimizers of a function if, in addition, the Riemannian Hessian at these points is coercive. Establishing this result relies on the second-order Taylor expansion of $f \circ c$ (cf. (7)).

Definition 4.8. Let $M$ be a $C^2$-manifold and $f : M \to \mathbb{R}$ a $C^2$-function. A point $p \in M$ is called a second-order critical point for $f$ if it is a critical point and $(f \circ c)''(0) \geq 0$ for all smooth curves $c$ on $M$ such that $c(0) = p$.

In direct analogy to the finite-dimensional case [8, Proposition 6.3], one can show that second-order critical points are exactly the points where the Riemannian gradient vanishes and the Riemannian Hessian is positive semi-definite. The proof carries over directly to the weak Riemannian setting, as it relies solely on the first and second derivatives of $f \circ c$, which we have established in (4) and (6).

Proposition 4.9. Let $f : M \to \mathbb{R}$ be a smooth gaf on a Hesse manifold $M$. Then $x$ is a second-order critical point if and only if $\nabla f(x) = 0$ and $\operatorname{Hess} f(x) \succeq 0$.

We now turn to the proof of the main result. While the Riemannian gradient condition provides a necessary criterion, this theorem goes further by establishing when a critical point is indeed a minimizer.
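On the flat Hesse manifold $(\mathbb{R}^2, \langle\cdot,\cdot\rangle)$ the covariant derivative along $c$ is the ordinary derivative, so (6) reduces to $g''(t) = c'(t)^\top \operatorname{Hess} f(c(t))\, c'(t) + \langle \nabla f(c(t)), c''(t)\rangle$, which can be checked against a finite-difference second derivative. A minimal sketch with an arbitrary polynomial $f$ and curve $c$ chosen for illustration:

```python
import numpy as np

# On (R^2, Euclidean metric), identity (6) reads
#   g''(t) = c'(t)^T Hess f(c(t)) c'(t) + <grad f(c(t)), c''(t)>.
f    = lambda p: p[0] ** 2 * p[1] + p[1] ** 3
grad = lambda p: np.array([2 * p[0] * p[1], p[0] ** 2 + 3 * p[1] ** 2])
hess = lambda p: np.array([[2 * p[1], 2 * p[0]],
                           [2 * p[0], 6 * p[1]]])
c = lambda t: np.array([1.0 + t, t ** 2])      # c(0) = (1, 0)

h = 1e-5
g = lambda t: f(c(t))
g2_fd = (g(h) - 2 * g(0) + g(-h)) / h ** 2     # finite-difference g''(0)

cp  = np.array([1.0, 0.0])                     # c'(0)
cpp = np.array([0.0, 2.0])                     # c''(0)
g2_formula = cp @ hess(c(0)) @ cp + grad(c(0)) @ cpp
```

Here both evaluations give $g''(0) = 2$ up to discretisation error.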
This result demonstrates that intuition from finite-dimensional optimization does not directly carry over to the more general setting of weak Riemannian manifolds and must be applied with caution.

Proposition 4.10. Let $(M, g)$ be a Hesse manifold and let $f : M \to \mathbb{R}$ be a $C^2$-gaf. For $p \in M$, suppose that the Riemannian Hessian is coercive, i.e. there exists $\mu > 0$ such that
$$g_p\big(\operatorname{Hess} f(p)[v], v\big) \geq \mu \|v\|_p^2 \quad \text{for all } v \in T_pM. \tag{8}$$
Then, if $p$ is a second-order critical point of $f$, it is a strict local minimizer.

Proof. Let $\varphi : U_\varphi \to V_\varphi$ be a chart around $p$ with $\varphi(p) = 0$. Since $V_\varphi$ is an open subset of a locally convex space, there exists an open convex neighborhood $W_\varphi \subseteq V_\varphi$ containing $0$. For any $x \in W_\varphi$, define a smooth curve on $M$ via $c(t) := \varphi^{-1}(tx)$. By the second-order Taylor expansion of $f$ along $c$ (cf. (7)) and the fact that $p$ is a critical point, we obtain
$$f(c(t)) = f(p) + \frac{t^2}{2}\, g_p\big(\operatorname{Hess} f(p)[c'(0)], c'(0)\big) + R(t),$$
where $R(t) = O(t^3)$. By the coercivity of the Hessian at $p$, we have
$$g_p\big(\operatorname{Hess} f(p)[c'(0)], c'(0)\big) \geq \mu \|c'(0)\|_p^2 = \mu \big\| D_{\varphi(p)}\varphi^{-1}(x) \big\|_p^2,$$
and therefore
$$f(c(t)) \geq f(p) + \frac{t^2 \mu}{2} \big\| D_{\varphi(p)}\varphi^{-1}(x) \big\|_p^2 + R(t). \tag{9}$$
On $E_\varphi$ we define a norm as follows:
$$\|x\|_{\varphi(p)} := \sqrt{g_p\big( D_{\varphi(p)}\varphi^{-1}(x),\, D_{\varphi(p)}\varphi^{-1}(x) \big)}, \quad x \in E_\varphi.$$
By construction, with respect to this norm, the linear mapping $D_{\varphi(p)}\varphi^{-1} : \big(E_\varphi, \|\cdot\|_{\varphi(p)}\big) \to \big(T_pM, g_p\big)$ is continuous, where we identified $T_{\varphi(p)}V_\varphi \cong E_\varphi$. Bounding by the operator norm $A > 0$,
$$\big\| D_{\varphi(p)}\varphi^{-1}(x) \big\|_p^2 \leq A^2 \|x\|_{\varphi(p)}^2 \quad \text{for all } x \in W_\varphi.$$
Since $R(t) = O(t^3)$, there exists $\xi > 0$ such that $|R(t)| \leq \frac{t^2 \mu}{2A^2}$ for all $t \in (0, \min\{1, \xi\})$.
Using (9), we obtain
$$f(c(t)) \geq f(p) + \frac{t^2\mu}{2A^2}\|x\|_{\varphi(p)}^2 + R(t) \geq f(p) + \frac{t^2\mu}{2A^2}\|x\|_{\varphi(p)}^2 - \frac{t^2\mu}{2A^2} = f(p) + \frac{t^2\mu}{2A^2}\big( \|x\|_{\varphi(p)}^2 - 1 \big).$$
Now restrict to $x \in W_\varphi$ with $\|x\|_{\varphi(p)} < 1$. Then $\|x\|_{\varphi(p)}^2 - 1 < 0$, and thus $f(c(t)) > f(p)$ for all $t \in (0, \min\{1, \xi\})$ and all $x \in W_\varphi$ with $0 < \|x\|_{\varphi(p)} < 1$. Define
$$Y_\varphi := \big\{ \varphi^{-1}(tx) \;\big|\; t \in (0, \min\{1, \xi\}),\; x \in W_\varphi,\; \|x\|_{\varphi(p)} < 1 \big\}.$$
Since $\varphi$ is a homeomorphism and the set $\{tx \mid t \in (0, \min\{1, \xi\}),\, x \in W_\varphi,\, \|x\|_{\varphi(p)} < 1\}$ is open in $V_\varphi$ with respect to the locally convex topology, the set $Y_\varphi$ is open in $M$. By the preceding estimate, we have $f(q) > f(p)$ for all $q \in Y_\varphi$, so $p$ is a strict local minimizer of $f$. □

Remark 4.11. The coercivity of the Riemannian Hessian represents a key difference compared to the finite-dimensional case. This is well known; see e.g. [11] for the use of coercivity conditions on Banach manifolds in relation to the Palais-Smale condition (C). Condition (C) replaces compactness arguments which are not available in our setting. In particular, coercivity does not follow from the positive definiteness of the Riemannian Hessian and must therefore be assumed separately.

Having established first- and second-order optimality conditions on weak Riemannian manifolds, we now turn to a concrete descent method. In Section 8, we will apply these optimality conditions to specific examples alongside this method.

5. The Riemannian Gradient Descent Method

In this chapter, we introduce a basic descent method, namely the Riemannian gradient descent (RGD) algorithm, and establish convergence results for this method. Before we can state the algorithm, we need an auxiliary structure. In finite-dimensional optimization on manifolds [8, Chapter 3.6] one defines:

Definition 5.1.
A smooth map $R : TM \to M$ is called a retraction if for every $v \in T_xM$ the smooth curve $c_v(t) := R(tv)$ satisfies $c_v(0) = x$ and $\dot{c}_v(0) = v$.

We deviate slightly from loc. cit. and will allow retractions defined only on an open neighborhood $\Omega$ of the zero-section in $TM$. However, even with this relaxation, we will see that retractions are not sufficient, as the next example shows.

Example 5.2. Let $S^1 \subseteq \mathbb{R}^2$ be the unit circle. We recall from [37, Example 3.8] that the diffeomorphism group $\operatorname{Diff}(S^1)$ is an infinite-dimensional Lie group not modelled on a Banach space. The tangent bundle of the Lie group is trivial [37, Lemma 3.12 (b)], i.e. the group multiplication $m$ induces a diffeomorphism
$$\Phi^{-1} : T\operatorname{Diff}(S^1) \to \mathcal{V}(S^1) \times \operatorname{Diff}(S^1), \quad \Phi^{-1}(v_g) := \big(g, Dm(0_{g^{-1}}, v_g)\big),$$
where the space of vector fields $\mathcal{V}(S^1)$ is identified with the tangent space at the identity. Further, the Lie group exponential of $\operatorname{Diff}(S^1)$ is the map
$$\exp : \mathcal{V}(S^1) \to \operatorname{Diff}(S^1), \quad X \mapsto \operatorname{Fl}^X_1,$$
sending a vector field to its time-$1$ flow. Now the map
$$R : T\operatorname{Diff}(S^1) \to \operatorname{Diff}(S^1), \quad v_g \mapsto g \circ \exp\big(Dm(0_{g^{-1}}, v_g)\big)$$
is smooth and satisfies $R(0_g) = g \circ \exp(0_{\operatorname{id}}) = g \circ \operatorname{id} = g$. Exploiting that $Dm(0_{g^{-1}}, \cdot)$ is continuous linear and $D_0\exp = \operatorname{id}_{\mathcal{V}(S^1)}$, the chain rule yields
$$\frac{d}{dt}\Big|_{t=0} R(tv_g) = Dm\Big(0_g,\, D_0\exp\Big(\frac{d}{dt}\Big|_{t=0} t\, Dm(0_{g^{-1}}, v_g)\Big)\Big) = v_g.$$
Hence $R$ is a retraction, but it is well known that this retraction does not restrict to a local diffeomorphism from any zero-neighborhood in $T_g\operatorname{Diff}(S^1)$ to any neighborhood of $g \in \operatorname{Diff}(S^1)$. Indeed one can show, see e.g. [37, Example 3.42] for details, that in any neighborhood of $g$ there are infinitely many points not in the image of $R$. One can even find continuous curves which intersect the image of $R|_{T_g\operatorname{Diff}(S^1)}$ only in $g$.
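For contrast, in finite dimensions this failure does not occur: on the sphere $S^2$, the metric projection retraction $R_x(v) = (x+v)/\|x+v\|$ reaches every point of the open hemisphere $\{y : \langle x, y\rangle > 0\}$, with explicit inverse $v = y/\langle x, y\rangle - x$. The following minimal numerical sketch (foot point and target chosen arbitrarily) verifies this:

```python
import numpy as np

# On S^2, the metric projection retraction R_x(v) = (x + v)/||x + v|| reaches
# every point y of the open hemisphere <x, y> > 0; its inverse on that set is
# v = y/<x, y> - x, which is tangent at x.
def retract(x, v):
    return (x + v) / np.linalg.norm(x + v)

def retract_inv(x, y):                 # defined whenever <x, y> > 0
    return y / (x @ y) - x

x = np.array([0.0, 0.0, 1.0])          # foot point on the sphere
y = np.array([0.6, 0.0, 0.8])          # target in the open hemisphere around x

v = retract_inv(x, y)                  # tangent vector with R_x(v) = y
```

The recovered $v$ is tangent at $x$ and retracts exactly onto $y$, so the image of $R|_{T_x S^2}$ is a full neighborhood of the foot point, unlike in Example 5.2.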
A similar result holds for diffeomorphism groups of arbitrary compact manifolds of dimension $\geq 2$. Summing up, Example 5.2 shows that the retraction condition from Definition 5.1 can lead to mappings whose image fails to be a neighborhood of the foot point. In other words, in infinite dimensions the retraction property fails to yield mappings that allow us to step in all directions from the foot point. This is certainly undesirable, whence the following definition is more suitable:

Definition 5.3. Let $M$ be a smooth manifold. A smooth map $\Sigma : TM \supseteq \Omega \to M$, defined on an open neighborhood $\Omega$ of the zero-section, is called a local addition if it satisfies
(1) $\Sigma(0_x) = x$ for all $x \in M$,
(2) the map $\theta := (\pi_M, \Sigma) : \Omega \to M \times M$, $\theta(v_x) = (x, \Sigma(v_x))$, induces a diffeomorphism onto its open image $\theta(\Omega) \subseteq M \times M$.
We call the local addition normalised if $D(\Sigma|_{\Omega \cap T_xM})_{0_x} = \operatorname{id}_{T_xM}$ for all $x \in M$.

Before we give examples of (non-trivial) retractions and local additions in Example 5.5, we first illustrate the relation between local additions and retractions.

Lemma 5.4. Let $M$ be a smooth manifold.
(1) Every local addition $\Sigma : \Omega \to M$ induces a normalised local addition $\Sigma_N$ which is a retraction on $\Omega$.
(2) If, in addition, $M$ is a paracompact Banach manifold, then every retraction $R$ induces a normalised local addition.
(3) If, in addition, $(M, g)$ is a paracompact strong Riemannian manifold, then every local addition induces a (normalised) local addition on $TM$.

Proof. (1) By [3, A.14] every local addition can be modified to yield a normalised local addition $\Sigma_N : \Omega \to M$. Shrinking $\Omega$, we may assume without loss of generality that $\Omega_x := T_xM \cap \Omega$ is star-shaped around $0_x$. Hence, for $v \in \Omega_x$ we have $\Sigma_N(0_x) = x$ and, since $\Sigma_N$ is normalised, the chain rule yields $\frac{d}{dt}\big|_{t=0} \Sigma_N(tv) = v$. So $\Sigma_N$ is a retraction on $\Omega_x$ for every $x \in M$.
(2) Let $R : \tilde\Omega \to M$ be a retraction. Since $\frac{d}{dt}\big|_{t=0} R(tv) = v$ for all $v \in TM$, we see that the derivative of $R|_{\tilde\Omega \cap T_xM}$ at the zero-section is the identity map. Then paracompactness and the inverse function theorem show that we can shrink $\tilde\Omega$ to an open neighborhood on which $R$ restricts to a normalised local addition. The details are recorded in [22, Lemma 3.15].
(3) Finally, if we are given a local addition $\Sigma : \Omega \to M$ on some open neighborhood of the zero-section, it can be extended using the argument in [29, Lemma 10.2] to a (normalised) local addition on all of $TM$. □

Summing up, Lemma 5.4 implies that for finite-dimensional (paracompact) manifolds, normalised local additions are equivalent to retractions as defined in [8]. The point in having a retraction is that, starting at $x$, we can locally reach every point near $x$ by a suitable tangent curve. In infinite dimensions a (normalised) local addition ensures this, whence the stronger concept is preferred over a retraction.

Example 5.5. Let $(M, g)$ be a strong Riemannian manifold. Then, as in finite dimensions, $M$ admits a Riemannian exponential map $\exp : TM \supseteq \Omega \to M$, cf. [20, Chapter 1.6]. The Riemannian exponential map is smooth and satisfies $D(\exp|_{\Omega \cap T_xM})_{0_x} = \operatorname{id}_{T_xM}$ for all $x \in M$. Hence it is a normalised local addition (this is the standard source of retractions on finite-dimensional manifolds). For any compact manifold $K$, the set of smooth functions $C^\infty(K, M)$ can then be endowed with the structure of a Fréchet manifold such that $TC^\infty(K, M) \cong C^\infty(K, TM)$. Here the identification takes $T_h C^\infty(K, M) \cong \{F \in C^\infty(K, TM) : \pi_M \circ F = h\}$. Further, the pushforward
$$\exp_* : C^\infty(K, \Omega) \to C^\infty(K, M), \quad \exp_*(g) = \exp \circ\, g$$
is smooth. Since the pushforwards of the associated mappings $\theta = (\pi_M, \exp)$ and $\theta^{-1}$ are also smooth, we deduce that $\exp_*$ is a local addition.
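The pushforward $\exp_*$ acts pointwise, which a discretised sketch makes concrete for $K = S^1$ and $M = S^2$: a tangent vector at a loop $h$ is a field $t \mapsto v(t) \in T_{h(t)}S^2$, and $\exp_*$ applies the sphere's exponential map at each point (the loop and field below are arbitrary illustrative choices):

```python
import numpy as np

# Discretised pushforward exp_* on the loop space C^inf(S^1, S^2): a tangent
# vector at a loop h is a field v(t) in T_{h(t)}S^2, and exp_* applies the
# sphere exponential exp_x(w) = cos(|w|) x + sin(|w|) w/|w| pointwise.
def sphere_exp(x, w):
    n = np.linalg.norm(w)
    return x if n == 0 else np.cos(n) * x + np.sin(n) * (w / n)

theta = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
zeros, ones = np.zeros_like(theta), np.ones_like(theta)
h = np.stack([np.cos(theta), np.sin(theta), zeros], axis=1)   # equator loop
v = 0.3 * np.stack([zeros, zeros, ones], axis=1)              # tangent field along h

new_loop = np.array([sphere_exp(x, w) for x, w in zip(h, v)]) # exp_*(v)
```

The result is again a loop on the sphere (here shifted to constant latitude $\sin(0.3)$), illustrating that $\exp_*$ maps into $C^\infty(S^1, S^2)$.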
The identification of the tangent bundle yields, see [37, 2.22], $D(\exp_*) = (D\exp)_*$, whence $\exp_*$ is a normalised local addition on $C^\infty(K, M)$.

For a $C^1$-weak Riemannian manifold, the Riemannian gradient descent method can be formulated as follows.

Algorithm 1 (Riemannian Gradient Descent Method on $(M, g)$).
Input: $x_0 \in M$, $f \in C^1(M, \mathbb{R})$, normalised local addition $R$ on $M$.
For $k = 0, 1, 2, \ldots$: pick a step size $\alpha_k > 0$ and set $x_{k+1} = R_{x_k}(s_k)$ for $s_k = -\alpha_k \nabla f(x_k)$.

Our exposition follows the structure of Boumal [8, Section 4.3], where RGD is discussed in the finite-dimensional setting. We show that, under an additional assumption, these results carry over to the weak Riemannian setting. In particular, we show that every accumulation point of the sequence of iterates generated by Algorithm 1 is a critical point of $f$ and that the norms of the corresponding gradients converge to zero.

In order to prove this result, we require a notion of continuity for the Riemannian gradient $\nabla f$. In particular, we need $\nabla f$ to be sequentially continuous. This property cannot be inferred directly from the defining property of the Riemannian gradient, due to the incompatibility of the topologies on the tangent bundle of a weak Riemannian manifold. In the following we will show that $\nabla f$ is sequentially continuous whenever the sequence $(\nabla f(p_n))_{n \in \mathbb{N}}$ converges in $TM$ for every convergent sequence $(p_n)_{n \in \mathbb{N}} \subset M$.

Lemma 5.6. Let $(M, g)$ be a weak Riemannian $C^1$-manifold, and let $(p_n)_{n \in \mathbb{N}} \subset M$ be a sequence converging to $p \in M$. Let $f : M \to \mathbb{R}$ be a gaf such that the sequence $(\nabla f(p_n))_{n \in \mathbb{N}}$ converges in $TM$. Then $\lim_{n \to \infty} \nabla f(p_n) = \nabla f(p)$.

Proof. Since $(\nabla f(p_n))_{n \in \mathbb{N}}$ converges in $TM$ and $\pi_M$ is continuous, it follows that
$$\lim_{n \to \infty} \pi_M\big(\nabla f(p_n)\big) = \pi_M\big(\lim_{n \to \infty} \nabla f(p_n)\big) = p.$$
We localise in a chart $(\varphi, U)$ of $M$ around $p$, so that without loss of generality $TU = U \times E$ (suppressing the identification). As $g$ and $Df$ are continuous, we obtain for all $v \in T_pM$
$$g_p(\nabla f(p), v) = D_pf(v) = \lim_{n \to \infty} D_{p_n}f(v) = \lim_{n \to \infty} g_{p_n}(\nabla f(p_n), v) = g_p\big(\lim_{n \to \infty} \nabla f(p_n), v\big).$$
Since $g_p$ is non-degenerate, we conclude that $\lim_{n \to \infty} \nabla f(p_n) = \nabla f(p)$. □

With this result, the sequential continuity of the Riemannian gradient can be characterised solely by requiring that the Riemannian gradients of convergent sequences converge within the tangent bundle.

Corollary 5.7. Let $(M, g)$ be a weak Riemannian $C^r$-manifold, $r \geq 1$, and let $f : M \to \mathbb{R}$ be a gaf. If for all $(p_n)_{n \in \mathbb{N}} \subset M$ that converge in $M$ the sequence $(\nabla f(p_n))_{n \in \mathbb{N}}$ converges in $TM$, then $\nabla f$ is sequentially continuous.

Equipped with this result, we can establish the main result of this section under the following assumptions.

A 5.1. There exists $f_{\mathrm{low}} \in \mathbb{R}$ such that $f(p) \geq f_{\mathrm{low}}$ for all $p \in M$.

A 5.2. At each iteration, the algorithm achieves sufficient decrease for $f$, in that there exists a constant $c > 0$ such that, for all $k$,
$$f(p_k) - f(p_{k+1}) \geq c \|\nabla f(p_k)\|_{p_k}^2. \tag{10}$$

A 5.3. For every sequence $(p_n)_{n \in \mathbb{N}} \subset M$ that is convergent in $M$, the sequence $(\nabla f(p_n))_{n \in \mathbb{N}}$ converges in $TM$.

Proposition 5.8. Let $f$ be a $C^1$-function satisfying A 5.1 and A 5.3 on a weak Riemannian $C^r$-manifold, $r \geq 1$. Let $p_0, p_1, p_2, \ldots$ be iterates satisfying A 5.2 with constant $c$. Then $\lim_{n \to \infty} \|\nabla f(p_n)\|_{p_n} = 0$; in particular, all accumulation points are critical points. Furthermore, for all $K \geq 1$, there exists $k \in \{0, \ldots, K-1\}$ such that
$$\|\nabla f(p_k)\|_{p_k} \leq \sqrt{\frac{f(p_0) - f_{\mathrm{low}}}{c}}\; \frac{1}{\sqrt{K}}.$$

Proof. The proof proceeds analogously to that in [8, 4.7], relying on a telescoping sum argument together with the sequential continuity of $\nabla f$ and $\|\cdot\|$. Consequently, it extends directly to the weak Riemannian setting.
□

Remark 5.9. Assumptions A 5.1 and A 5.2 are standard assumptions known from finite-dimensional Riemannian optimization. The proof in [8, 4.7] shows that Assumptions A 5.1 and A 5.2 are sufficient to guarantee that the norm of the Riemannian gradient along the iteration sequence converges to zero. However, in the infinite-dimensional setting we additionally require the sequential continuity of the Riemannian gradient, ensured by Assumption A 5.3, in order to conclude that all accumulation points are critical points. In the next example, however, we will see that Assumption A 5.3 is not guaranteed a priori in the infinite-dimensional setting.

Example 5.10. We consider the length functional on the space $C^\infty(S^1, \mathbb{R}^2)$,
$$L : C^\infty(S^1, \mathbb{R}^2) \to \mathbb{R}, \quad L(c) := \int_{S^1} |\dot{c}|\, d\mu.$$
The space $C^\infty(S^1, \mathbb{R}^2)$, viewed as a locally convex space equipped with the weak Riemannian metric $g(h, k) = \int_{S^1} \langle h, k \rangle\, d\mu$, forms a weak Riemannian manifold. Up to the factor $|\dot{c}|$, we compute the Riemannian gradient of $L$ analogously to Theorem 7.5. For curves $c \in \operatorname{Imm}(S^1, \mathbb{R}^2)$, the Riemannian gradient of $L$ is given by
$$\nabla L(c) = -k_c\, N_c\, |\dot{c}| \in C^\infty(S^1, \mathbb{R}^2),$$
where $N_c(z) = (-y'(z), x'(z))^\top$ denotes the normal vector to the curve $c(z) = (x(z), y(z))$ and $k_c$ its signed curvature. We emphasize that this expression is only well defined for immersions, since the signed curvature $k_c$ requires a non-vanishing derivative of $c$ and is undefined at points where $\dot{c} = 0$. In particular, for curves that leave the space of immersions, the curvature-based Riemannian gradient no longer exists in a classical sense. We define a sequence $(c_k)_{k \in \mathbb{N}} \subset \operatorname{Imm}(S^1, \mathbb{R}^2)$ by
$$c_k = \frac{(-1)^k}{k}\, \operatorname{id}_{S^1}, \quad k \in \mathbb{N}.$$
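The behaviour of the gradient along this sequence can be illustrated on a polygonal discretisation. The sketch below computes the discrete $L^2$-gradient of the length of a closed polygon directly from differences of unit tangents; the overall sign depends on the orientation convention chosen for the normal, but what matters here is that the gradients alternate between approximately $\pm\operatorname{id}_{S^1}$ while the curves themselves shrink to $0$:

```python
import numpy as np

def grad_length(c):
    """Discrete L^2 gradient of the length L(c) = sum_i |c_{i+1} - c_i| of a
    closed polygon c (N x 2): minus the difference of consecutive unit
    tangents, divided by the vertex weight dtheta."""
    d = np.roll(c, -1, axis=0) - c                       # edges c_{i+1} - c_i
    T = d / np.linalg.norm(d, axis=1, keepdims=True)     # unit tangents
    dtheta = 2.0 * np.pi / len(c)
    return (np.roll(T, 1, axis=0) - T) / dtheta          # -(T_i - T_{i-1})/dtheta

theta = np.linspace(0.0, 2.0 * np.pi, 400, endpoint=False)
circle = np.stack([np.cos(theta), np.sin(theta)], axis=1)    # samples of id_{S^1}

# gradients along c_k = ((-1)^k / k) id_{S^1}: the curves converge to 0 while
# the gradients alternate between (approximately) +id_{S^1} and -id_{S^1}
grads = [grad_length(((-1.0) ** k / k) * circle) for k in (1, 2, 3, 4)]
```

The alternating gradients have no limit in the (discretised) tangent bundle even though $(c_k)$ converges, mirroring the failure of Assumption A 5.3.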
Observe that for $c = r \cdot \operatorname{id}_{S^1}$ with $r \neq 0$, the Riemannian gradient of $L$ at $c \in \operatorname{Imm}(S^1, \mathbb{R}^2)$ is given by
$$\nabla L(r \cdot \operatorname{id}_{S^1}) = -\operatorname{sgn}(r)\, \operatorname{id}_{S^1}.$$
Clearly, $c_k \to 0$ as $k \to \infty$, and thus $(c_k)_{k \in \mathbb{N}}$ converges within $C^\infty(S^1, \mathbb{R}^2)$. Nevertheless, since $\nabla L(c_n) = (-1)^{n+1} \operatorname{id}_{S^1}$, the sequence of Riemannian gradients $(\nabla L(c_n))_{n \in \mathbb{N}}$ does not converge within $TM$.

Remark 5.11. Observe that Assumption A 5.2, which imposes a sufficient decrease condition, depends indirectly on the choice of retractions $R_p$, $p \in M$. In this paper, we do not further address the selection of step sizes or the construction of retractions that satisfy this assumption; this is deferred to future work, particularly since retractions on weak Riemannian manifolds present additional challenges. Provided that a suitable retraction exists, one may expect an analogue of a result from the finite-dimensional setting [8, 4.4].

6. Classes of Hesse Manifolds and their Optimization-relevant Properties

In the preceding sections, we established first-order and second-order optimality conditions for weak Riemannian manifolds and analyzed the Riemannian gradient descent method together with its convergence properties. Although our framework is formulated for general weak Riemannian manifolds, we imposed additional structural assumptions to ensure that these optimization results hold. This led to the notion of a Hesse manifold, which is a weak Riemannian manifold endowed with extra properties that make Riemannian optimization well defined and analytically tractable. Recall from Theorem 3.5 that a Hesse manifold is a weak Riemannian manifold which admits a metric spray. In this chapter, we present two important classes of Hesse manifolds and investigate both their fundamental geometric features and their optimization-related properties.
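Before turning to these classes, the method of Section 5 can be illustrated in the simplest strong (hence Hesse) Riemannian setting. The sketch below runs Algorithm 1 on the unit sphere $S^{n-1}$ for the Rayleigh quotient $f(x) = x^\top A x$, using the metric projection as normalised local addition; the iterates approach an eigenvector for the smallest eigenvalue, consistent with the guarantees of Proposition 5.8 (the matrix, starting point and step size are arbitrary illustrative choices):

```python
import numpy as np

def rgd_sphere(A, x0, alpha=0.1, iters=500):
    """Algorithm 1 on the unit sphere for f(x) = x^T A x, with the metric
    projection R_x(v) = (x + v)/||x + v|| as normalised local addition."""
    x = x0 / np.linalg.norm(x0)
    for _ in range(iters):
        egrad = 2.0 * A @ x                    # Euclidean gradient of f
        rgrad = egrad - (x @ egrad) * x        # project onto T_x S^{n-1}
        x = x - alpha * rgrad                  # step s_k = -alpha * grad f(x_k)
        x = x / np.linalg.norm(x)              # retract back onto the sphere
    return x

A = np.diag([1.0, 2.0, 5.0])                   # arbitrary symmetric example
x_min = rgd_sphere(A, np.array([1.0, 1.0, 1.0]))
# x_min approaches an eigenvector for the smallest eigenvalue of A
```

Here $f(x_{\min}) \approx 1$, the smallest eigenvalue, and the iterate aligns with the corresponding coordinate axis.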
Our primary focus will be on robust Riemannian manifolds; we then turn to the more classical strong Riemannian manifolds.

6.1. Robust Riemannian manifolds. An important class of weak Riemannian manifolds that are suitable for optimization purposes, yet do not qualify as strong Riemannian manifolds, consists of the robust Riemannian manifolds, as they possess a Levi-Civita connection by definition. We next examine their geometric structure, provide concrete examples, and characterize when a weak Riemannian manifold qualifies as robust. Robust Riemannian manifolds were introduced by Micheli and collaborators in [28]. This strengthening of the notion of a weak Riemannian metric allows, for example, curvature calculations for Riemannian submersions.

Definition 6.1. Let $(M, g)$ be a weak Riemannian manifold. We say $g$ is a robust Riemannian metric if
(1) the Hilbert space completions $\overline{T_xM}^{\,g_x}$ of the fibres with respect to the inner products $g_x$ form a smooth vector bundle $\overline{TM} = \bigcup_{x \in M} \overline{T_xM}^{\,g_x}$ over $M$ whose trivialisations extend the bundle trivialisations of $TM$, and
(2) the metric derivative of $g$ exists.
A weak Riemannian manifold with a robust Riemannian metric will be called a robust Riemannian manifold.

Remark 6.2. Note that condition (1) in Definition 6.1 entails that the inner products $g_x$ induced by the weak Riemannian metric are locally (in a chart) equivalent to each other and thus induce the same Hilbert space completion of the fibres $T_xM$.

Before we consider examples of robust Riemannian metrics, let us first assert:

Proposition 6.3. Every robust Riemannian manifold $(M, g)$ is a Hesse manifold.

Proof. By property (1) of a robust Riemannian manifold, $\overline{TM} \to M$ is a Hilbert bundle over $M$ with typical fibre $H$. Further, the Riemannian metric $g$ induces a Riemannian bundle metric $\overline{g}$ on $\overline{TM}$ (the distinction here is that $\overline{TM}$ is not the tangent bundle of $M$).
We work locally on a chart domain $U$ (but suppress the chart in the notation, as well as the identification $TU \subseteq \overline{TU}$). For every point $x \in U$, $\overline{g}_U(x, \cdot)$ induces the musical isomorphisms between the Hilbert space $H$ and its dual. Hence, formula (14) yields a well-defined quadratic form $\Gamma_U(x, \cdot) : H \to H$ which depends smoothly on $x \in U$. Using the polarization identity
$$B_U(x, v, w) := \tfrac{1}{2}\big( \Gamma_U(x, v + w) - \Gamma_U(x, v) - \Gamma_U(x, w) \big)$$
we obtain a bilinear map. Now, as in (15), we obtain a (linear) connection (see [17, VII.3] or [20, 1.5]; neither of [23, 37] defines connections on vector bundles) on $\overline{TM}$,
$$\overline{\nabla}_U : \Gamma(TU) \times \Gamma(\overline{TU}) \to \Gamma(\overline{TU}), \quad \overline{\nabla}_U(\xi, \sigma)(x) := d\sigma(x; \xi(x)) - B_U(x, \xi(x), \sigma(x)), \tag{11}$$
i.e. $\overline{\nabla}_U$ is tensorial in $\xi$ and a derivation in $\sigma$. As in the proof of [23, VIII §4, Theorem 4.2], a direct calculation shows that $\overline{\nabla}_U$ is a metric connection (cf. [19, Definition 4.2.1]) in the sense that it satisfies the product rule
$$\xi . \overline{g}_U(\sigma, \tau) = \overline{g}_U\big(\overline{\nabla}_U(\xi, \sigma), \tau\big) + \overline{g}_U\big(\sigma, \overline{\nabla}_U(\xi, \tau)\big), \quad \xi \in \Gamma(TU),\ \sigma, \tau \in \Gamma(\overline{TU}). \tag{12}$$
By property (2) of a robust Riemannian manifold, the metric derivative $\nabla$ for $g$ exists on $TM$. The covariant derivative $\nabla$ will be a metric derivative if on every chart domain $U$ the product rule (16) holds (for $g_U$ and $\nabla_U$). As $TU \to \overline{TU}$ pulls back the Riemannian bundle metric $\overline{g}_U$ to $g_U$, the pullback of the metric connection $\overline{\nabla}_U$ becomes the (representative of the) metric derivative $\nabla_U$ (see [24, Proposition 5.6 (a) and Exercise 5.4]). In particular, $\nabla_U$ is given by the formula (11). However, rearranging (11) with $\Gamma_U(x, v) = B_U(x, v, v)$ for $\xi, \sigma \in \Gamma(TU)$ implies that
$$S_U : TU \to T\overline{TU}, \quad S_U(x, \xi) := (x, \xi, \xi, \Gamma_U(x, \xi))$$
factors through a spray $S_U : TU \to T(TU)$ ($\subseteq T\overline{TU}$ via the tangent of $TU \to \overline{TU}$). We conclude that $\nabla_U$ is induced by $S_U$. Thus (cf.
[23, VIII §4, Theorem 4.2]) $S_U$ is a metric spray for $g_U$. The $S_U$ are compatible under change of trivialisation as in [23, VIII §4, Theorem 4.2], whence they induce a metric spray for $g$. □

Remark 6.4. The proof of Proposition 6.3 shows that one can construct Christoffel-symbol-like objects on the completion which restrict to the metric spray. A subtle point is nevertheless the interplay between spray and metric derivative. As $M$ is not even a Banach manifold, the connection (11) needs to avoid a definition via (sections of) the cotangent bundle. Fortunately, the calculations in [23] we needed to appeal to do not need duality or cotangent bundle arguments.

Example 6.5. Every finite-dimensional Riemannian manifold is automatically a robust Riemannian manifold. In [28, p. 9], the authors point out (but do not give details) that the space $\operatorname{Emb}(M, N)$ of smooth embeddings with the Sobolev $H^s$-metric (for $s$ above the critical Sobolev exponent) is a robust Riemannian manifold.

Further, the following was proved in [30, Theorem 5.1] and yields another main class of examples:

Example 6.6. Let $G$ be a possibly infinite-dimensional Lie group. Recall from [37, Chapter 3] that an infinite-dimensional Lie group is called regular (in the sense of Milnor) if the so-called Lie-type differential equations can be solved on $G$ (every Banach Lie group is regular). If $g$ is a right-invariant weak Riemannian metric on the regular Lie group $G$ which admits a metric derivative, then $(G, g)$ is already a robust Riemannian manifold.

The following proposition yields another class of examples which is elementary and at the same time of interest in applications. To our knowledge, the following result has not appeared with a detailed exposition in the literature before:

Proposition 6.7. Let $(H, \langle\cdot,\cdot\rangle)$ be a Hilbert space and $\Omega \subseteq H$ open. For every compact manifold $K$, the $L^2$-metric is a robust Riemannian metric on $C^\infty(K, \Omega)$.
Proof. Note that we endow $\Omega$ with the Riemannian metric induced by the inclusion $\Omega \subseteq H$, and that the function space $K\Omega := C^\infty(K, \Omega)$ is an open subset of the Fréchet space $C^\infty(K, H)$, whence an infinite-dimensional manifold. Moreover (citation), the tangent bundle is trivial: $TK\Omega \cong C^\infty(K, T\Omega) \cong K\Omega \times C^\infty(K, H)$. Now, due to [37, Proposition 5.8], the metric derivative of the $L^2$-metric exists. The Hilbert space completion of $C^\infty(K, H)$ is the space $L^2(K, H)$ of all (equivalence classes of) $L^2$-functions from $K$ to $H$ (cf. e.g. [34]). Since the bundle $TK\Omega$ is trivial, the (fibre-wise) completion $\overline{TK\Omega}^{\,L^2} \cong K\Omega \times L^2(K, H)$ is a bundle over $K\Omega$ which extends $TK\Omega$. □

Remark 6.8. An important special case of Proposition 6.7 is the case where $K = S^1$ and $\Omega = \mathbb{R}^2 \setminus \{0\} \subseteq \mathbb{R}^2$. Then the robust Riemannian manifold $C^\infty(S^1, \mathbb{R}^2 \setminus \{0\})$ with the $L^2$-metric is isometrically isomorphic to the manifold
$$\operatorname{Imm}_0(S^1, \mathbb{R}^2) := \big\{ f : S^1 \to \mathbb{R}^2 \text{ an immersion} : f(e^{i0}) = 0 \big\}$$
with a so-called elastic metric. The isometry is the so-called square-root-velocity transform (SRVT), cf. [6], and we remark that the elastic metric is invariant under the canonical action of $\operatorname{Diff}(S^1)$. For this reason, the elastic metric is used in shape analysis; see e.g. [37, Chapter 5] for an overview. We note that Proposition 6.7 immediately implies that the elastic metric is a robust Riemannian metric. As discussed in [6], the square-root-velocity transform is just a special case of a more general family of transformations turning elastic metrics for other choices of the elastic parameters into (variants of) the $L^2$-metric. A similar analysis as in Proposition 6.7 should show that these metrics are also robust, but we will not explore this in the current paper.
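A basic property of the square-root-velocity transform can be checked numerically: with $q(t) = \dot c(t)/\sqrt{|\dot c(t)|}$, one has $\|q\|_{L^2}^2 = \int_{S^1} |\dot c|\, d\mu$, the length of $c$, so the flat $L^2$-norm on the transformed side measures length. A minimal discrete sketch follows; the curve is an arbitrary choice, and the SRVT conventions of [6] include normalisations we gloss over here.

```python
import numpy as np

# Discrete square-root-velocity transform q = c'/sqrt(|c'|) of a closed curve.
# Sanity check: ||q||_{L^2}^2 equals the length integral of |c'|; here for an
# ellipse with semi-axes a = 1, b = 0.5 (perimeter approx. 4.844).
theta = np.linspace(0.0, 2.0 * np.pi, 2000, endpoint=False)
c = np.stack([np.cos(theta), 0.5 * np.sin(theta)], axis=1)

dc = np.gradient(c, theta, axis=0)            # c'(theta)
speed = np.linalg.norm(dc, axis=1)
q = dc / np.sqrt(speed)[:, None]              # SRVT of c

dtheta = theta[1] - theta[0]
length = np.sum(speed) * dtheta               # int |c'| dmu
l2_norm_sq = np.sum(q * q) * dtheta           # ||q||_{L^2}^2
```

The two quantities coincide (up to discretisation), which is the elementary mechanism behind the isometry in Remark 6.8.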
Recall that due to the Nash embedding theorem, every finite-dimensional smooth Riemannian manifold $(M, g)$ admits an isometric embedding $\theta : (M, g) \to (\mathbb{R}^N, \langle\cdot,\cdot\rangle)$ for some $N$. As the pushforward $\theta_* : C^\infty(K, M) \to C^\infty(K, \mathbb{R}^N)$, $\theta_*(f) = \theta \circ f$, is smooth by [37, Corollary 2.19], together with the identification $TC^\infty(K, M) \cong C^\infty(K, TM)$ the map $\theta_*$ induces a Riemannian embedding into $C^\infty(K, \mathbb{R}^N)$. Thus the following is now an immediate consequence of Proposition 6.7:

Corollary 6.9. For every finite-dimensional Riemannian manifold $M$ and every compact manifold $K$, the $L^2$-metric turns $C^\infty(K, M)$ into a robust Riemannian manifold.

In general we lack a global isometric embedding for infinite-dimensional strong Riemannian manifolds (albeit many infinite-dimensional manifolds embed as open subsets of Hilbert spaces, cf. [18]). One could argue using localisation arguments in charts to obtain a similar result for mapping spaces into strong Riemannian manifolds. We shall not give a detailed account of this. A first step towards this is the following lemma, which is of interest in its own right.

Lemma 6.10. Let $\Omega \subseteq H$ be an open subset of the Hilbert space $(H, \langle\cdot,\cdot\rangle)$ endowed with a strong Riemannian metric $g$. For a compact manifold $K$, write $K\Omega := C^\infty(K, \Omega)$ for the manifold endowed with $G$, the $L^2$-metric with respect to $g$.
(1) There is a bundle trivialisation $\Theta : TK\Omega \to K\Omega \times C^\infty(K, H)$ which takes the $G$-inner product fibre-wise to the $L^2$-metric with respect to $\langle\cdot,\cdot\rangle$.
(2) $(C^\infty(K, \Omega), L^2_g)$ is a robust Riemannian manifold.

Proof. Identify $TC^\infty(K, \Omega) \cong C^\infty(K, T\Omega) \cong K\Omega \times C^\infty(K, H)$.
(1) Recall from [23, VII, Theorem 3.1] that, since $g$ is a strong Riemannian metric, there is a smooth map $B : \Omega \times H \to H$, $B_p := B(p, \cdot)$, such that for every $p \in \Omega$, $B_p$ is a positive definite invertible operator with
$$g_p(u, v) = \langle B_p u, B_p v \rangle, \quad u, v \in H.$$
We define
$$\theta : K\Omega \times C^\infty(K, H) \to C^\infty(K, H), \quad (f, \varphi) \mapsto B \circ (f, \varphi).$$
By construction $\theta_f := \theta(f, \cdot)$ is bijective, linear and fibre-wise an isometry, as
$$\int_K \langle \theta_f(\varphi), \theta_f(\psi) \rangle\, d\mu(p) = \int_K \langle B_{f(p)}(\varphi(p)), B_{f(p)}(\psi(p)) \rangle\, d\mu(p) = \int_K g_{f(p)}(\varphi(p), \psi(p))\, d\mu(p) = G_f(\varphi, \psi).$$
If $\theta$ is smooth, then $\Theta = (\operatorname{id}_{K\Omega}, \theta)$ satisfies the conditions in (1). To see that $\theta$ is smooth, recall that by the exponential law [37, Theorem 2.12], $\theta$ is smooth if and only if the adjoint map $\theta^\wedge : K\Omega \times C^\infty(K, H) \times K \to H$ is smooth. But this map can be written as
$$\theta^\wedge(f, \varphi, k) = B\big(\operatorname{ev}(f, k), \operatorname{ev}(\varphi, k)\big),$$
and since $B$ is smooth and the evaluation maps of the spaces $K\Omega$ and $C^\infty(K, H)$ are smooth [37, Lemma 2.16 (a)], we deduce that $\theta$ is smooth.
(2) By part (1), $\Theta$ is a bundle isomorphism over the identity onto a trivial bundle. By Proposition 6.7, $K\Omega$ with the $L^2$-metric is a robust Riemannian manifold. We note that, as $\Theta$ induces fibre-wise an isometry, it extends in every fibre to an isometry of the Hilbert space completions (see [36, Lemma 4.16]). Hence, taking fibre-wise the continuous linear extensions to the completions of the fibre maps of $\Theta$, we obtain a fibre-wise isometry
$$\overline{\Theta} : \bigsqcup_{f \in K\Omega} \overline{T_f K\Omega}^{\,g_f} \to K\Omega \times L^2(K, H).$$
Thus there is a unique vector bundle structure on the union of the completed spaces making $\overline{\Theta}$ a bundle isomorphism, and by construction this bundle extends $TK\Omega$. The metric derivative exists again in this setting by [37, Theorem 5.8]. We conclude that $L^2_g$ is a robust Riemannian metric. □

In general, the construction in part (2) of Lemma 6.10 already hints at permanence properties of various objects connected to Riemannian metrics which are hardly surprising. However, we state them here and supply the necessary details of the proofs for the reader's convenience.
In particular, while it is somewhat obvious that these constructions should work, the added details should convince the reader that the constructions do not depend on the manifolds being finite-dimensional or strong.

Proposition 6.11. Let $(M, g)$, $(N, \tilde{g})$ be weak Riemannian manifolds together with a Riemannian isometry $F : M \to N$ (i.e. a diffeomorphism such that $F^*\tilde{g} = g$). Then $(M, g)$ is a robust Riemannian manifold if and only if $(N, \tilde{g})$ is a robust Riemannian manifold.

Proof. Since $F$ is a Riemannian isometry, the same holds for $F^{-1}$; the situation is thus symmetric, and it suffices to assume that $(N, \tilde{g})$ is a robust Riemannian manifold and prove that $(M, g)$ is robust. For the completion of the bundle $TM$, we just note that the isometries $TF : TM \to TN$ and $TF^{-1} : TN \to TM$ extend fibre-wise to isometries of the Hilbert completions with respect to the inner products induced by the Riemannian metrics (see [36, Lemma 4.16]). As $F$ is a diffeomorphism, every vector field $X$ on $M$ is $F$-related to the pushforward $\tilde{X} = F_*X := TF \circ X \circ F^{-1}$ on $N$. Now $(N, \tilde{g})$ admits a metric derivative $\tilde{\nabla}$, and we use it to define a mapping $\nabla : \mathcal{V}(M)^2 \to \mathcal{V}(M)$ via the formula
$$\nabla_Y Z = (F^{-1})_*\big( \tilde{\nabla}_{\tilde{Y}} \tilde{Z} \big) = TF^{-1} \circ \tilde{\nabla}_{TF \circ Y \circ F^{-1}}\big( TF \circ Z \circ F^{-1} \big) \circ F.$$
Now the usual finite-dimensional proof, see [24, Proposition 5.6 (a) and Exercise 5.4], shows that $\nabla$ is a connection compatible with the metric, i.e. a metric derivative. Note that $\nabla$ is even the Levi-Civita derivative if $\tilde{\nabla}$ is the Levi-Civita derivative. □

6.2. Strong Riemannian manifolds. We now turn to strong Riemannian manifolds, which are well established both in geometric theory and in optimization. Their underlying Hilbert space structure, extending to the tangent bundles, enables the direct transfer of many results from finite-dimensional optimization.
However, it should be pointed out that there are also significant differences already on the level of Riemannian geometry.

Example 6.12. Every Hilbert space is a strong Riemannian manifold, as are embedded submanifolds like the unit sphere. Moreover, in the Hilbert space $\ell^2$ of square-summable sequences, if we define $a_1 = 1$ and $a_n = 1 + 2^{-n}$ for $n\geq 2$, then the set
\[ E := \Bigl\{ (x_n)_{n\in\mathbb N}\in\ell^2 : \sum_{n\in\mathbb N} \frac{x_n^2}{a_n^2} = 1 \Bigr\} \]
is a strong Riemannian manifold with the pullback metric. It is known as Grossmann's ellipsoid, and one can prove that while it is geodesically complete, there are points which do not admit a minimal geodesic path between them (in other words: the Hopf–Rinow theorem fails on strong Riemannian manifolds); see [37, 4.43] for details.

In the following we briefly illustrate this in our setting, together with the corresponding results. By [37, 4.5], a strong Riemannian manifold can equivalently be described as follows:

Lemma 6.13. Let $(M,g)$ be a weak Riemannian manifold. If $M$ is a Hilbert manifold, i.e. modelled on Hilbert spaces, and the injective linear map $\flat\colon TM\to T^*M$, $T_pM\ni v\mapsto g_p(v,\cdot)$ is a vector bundle isomorphism, then $(M,g)$ is a strong Riemannian manifold.

The usual sources [20, 23] for Riemannian geometry in infinite-dimensional spaces deal with strong Riemannian manifolds. In particular, they show that the Levi-Civita derivative and the metric spray (cf. Section A) exist for these manifolds. Summing up, this shows the following.

Lemma 6.14. Every strong Riemannian manifold is a robust Riemannian manifold and thus a Hesse manifold. In particular (Example 6.5), every finite-dimensional Riemannian manifold is a strong Riemannian manifold.

The geometric structure of a strong Riemannian manifold guarantees the existence and the continuity of the Riemannian gradient through its unique representation.

Lemma 6.15.
Let $(M,g)$ be a strong Riemannian $C^1$-manifold and $f\colon M\to\mathbb R$ a $C^1$-function. Then the Riemannian gradient $\nabla f$ exists and is sequentially continuous.

Proof. As $(M,g)$ is a strong Riemannian manifold, $\flat\colon TM\to T^*M$ is an isomorphism. Hence the Riemannian gradient of any $C^1$-function $f\colon M\to\mathbb R$ is given by $\nabla f(p) = \flat^{-1}(df(p;\cdot))$. By [37, 4.4], $\flat$ is a bounded linear operator and thus continuous. This implies for every sequence $(p_n)_{n\in\mathbb N}\subset M$ with $\lim_{n\to\infty} p_n = p\in M$ that
\[ \lim_{n\to\infty}\nabla f(p_n) = \lim_{n\to\infty}\flat^{-1}(df(p_n;\cdot)) = \flat^{-1}\bigl(\lim_{n\to\infty} df(p_n;\cdot)\bigr) = \flat^{-1}(df(p;\cdot)) = \nabla f(p). \] □

Consequently, on strong Riemannian manifolds every $C^1$-function is gradient-admitting, and Assumption 5.3 holds automatically. Thus Proposition 5.8 simplifies to:

Corollary 6.16. Let $(M,g)$ be a strong Riemannian $C^1$-manifold and $f$ a $C^1$-function on $M$ satisfying 5.1. Let $p_0, p_1, p_2, \ldots$ be iterates satisfying 5.2 with constant $c$. Then $\lim_{n\to\infty}\|\nabla f(p_n)\| = 0$. In particular, all accumulation points are critical points. Furthermore, for all $K\geq 1$ there exists $k\in\{0,\ldots,K-1\}$ such that
\[ \|\nabla f(p_k)\|_{p_k} \leq \sqrt{\frac{f(p_0)-f_{\mathrm{low}}}{c}}\;\frac{1}{\sqrt K}. \]

Thus, combined with Lemma 6.15, this implies that on strong Riemannian $C^\infty$-manifolds the Riemannian Hessian exists for every $C^2$-function and is moreover continuous. Although many concepts from finite-dimensional Riemannian optimization extend in an essentially analogous way to strong Riemannian manifolds, this analogy breaks down at the level of second-order optimality conditions, since even on strong Riemannian manifolds positive definiteness does not imply a coercivity condition.

7. Computation of the Riemannian gradient and the Riemannian Hessian

In this chapter we examine the computation of the Riemannian gradient and the Riemannian Hessian.
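In finite dimensions the content of Lemma 6.15 is easy to make concrete: in a chart, the musical isomorphism $\flat$ is represented by the positive definite metric matrix $G(p)$, and inverting it amounts to a linear solve. The following sketch is purely illustrative; the metric $G$ and the function $f$ are assumptions chosen for the example, not objects from the paper.

```python
import numpy as np

# Finite-dimensional sketch of the proof of Lemma 6.15: in a chart, "flat"
# is the metric matrix G(p), so the Riemannian gradient solves
# G(p) grad = df(p).  G and f below are illustrative assumptions.

def riemannian_gradient(G, euclidean_grad):
    """Invert flat: solve G v = df(p) for the Riemannian gradient v."""
    return np.linalg.solve(G, euclidean_grad)

# Example: metric G = diag(1, 4) on R^2 and f(x, y) = x^2 + y^2.
G = np.diag([1.0, 4.0])
p = np.array([1.0, 2.0])
df = 2.0 * p                        # Euclidean differential of f at p
grad = riemannian_gradient(G, df)   # differs from df in the second slot

# Sanity check of the defining property: g(grad, v) = df(v) for all v.
v = np.array([0.3, -0.7])
assert np.isclose(grad @ G @ v, df @ v)
```

Note that the Riemannian gradient scales the Euclidean one by $G^{-1}$, so descent directions depend on the chosen metric even in this two-dimensional toy case.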
We first establish the extension property of the Riemannian gradient and the Riemannian Hessian. We then compute these objects explicitly for concrete examples. Note first that the constructions are stable under restriction to open subsets.

Lemma 7.1. Let $(E, \langle\cdot,\cdot\rangle)$ be a locally convex space with a continuous inner product, and consider any open subset $M\subseteq E$. Equipped with the induced metric $g$, $(M,g)$ is a weak Riemannian manifold. Let $f\colon M\to\mathbb R$ be a $C^1$-function and assume that $f$ extends to a gaf $\overline f\colon E\to\mathbb R$. Then $f$ is a gaf, $\operatorname{grad}\overline f|_M = \nabla f$, and $\nabla f$ is sequentially continuous.

The proof follows immediately from untangling the identifications, and it extends to the Riemannian Hessian:

Lemma 7.2. In the setting of Lemma 7.1, assume that $(E,\langle\cdot,\cdot\rangle)$ admits a spray-induced Levi-Civita connection $\nabla$. Then the Riemannian Hessian of $f$ on $(M,\langle\cdot,\cdot\rangle)$ coincides with its ambient extension: $\operatorname{Hess} f(p) = \operatorname{Hess}\overline f(p)$ for $p\in M$.

Proof. Since the Levi-Civita connection on $(M,\langle\cdot,\cdot\rangle)$ is the restriction of that on $(E,\langle\cdot,\cdot\rangle)$, the definition of the Riemannian Hessian yields
\[ \operatorname{Hess} f(p)[v] = \nabla_v\nabla f = \nabla_v\operatorname{grad}\overline f = \operatorname{Hess}\overline f(p)[v] \]
for all $p\in M$ and $v\in T_pM$. □

Remark 7.3. Observe that, since the Riemannian gradient $\nabla f$ is continuous in this setting, so is the Riemannian Hessian $\operatorname{Hess} f(p)$, owing to the continuity of the Levi-Civita connection.

These results transfer to open subsets of weak Riemannian manifolds, modulo the respective continuity arguments for the Riemannian gradient and Hessian.

Lemma 7.4. Let $(M,g)$ be a weak Riemannian $C^1$-manifold and $U\subset M$ an open subset. Restricting the metric $g$ to $U$ yields a weak Riemannian manifold $(U,g)$. Let $f\colon U\to\mathbb R$ be $C^1$ with a $C^1$-extension $\overline f\colon M\to\mathbb R$ such that $\overline f$ is a gaf. Then the Riemannian gradient on $U$ coincides with that of the extension: $\nabla f(p) = \nabla\overline f(p)$ for all $p\in U$.
Moreover, if $(M,g)$ is a Hesse manifold, so is $(U,g)$, and $\operatorname{Hess} f(p) = \operatorname{Hess}\overline f(p)$ for all $p\in U$.

In the following we present two illustrative examples of weak Riemannian manifolds. For each example we derive the corresponding Riemannian gradient, and for the second example we additionally compute the Riemannian Hessian.

Example 7.5. We recall from [37, Example 4.6] that the space $\operatorname{Imm}(S^1,\mathbb R^2)$ of all smooth immersions is a weak Riemannian manifold with the invariant $L^2$-metric
\[ g_{\mathrm{inv},c}(u,w) = \int_{S^1}\langle u, w\rangle\, |\dot c|\, d\mu, \qquad c\in\operatorname{Imm}(S^1,\mathbb R^2), \]
where we used the identification $T_c\operatorname{Imm}(S^1,\mathbb R^2)\cong C^\infty(S^1,\mathbb R^2)$ and the inner product is the Euclidean inner product of $\mathbb R^2$. We consider the length functional
\[ \mathcal L\colon \operatorname{Imm}(S^1,\mathbb R^2)\to\mathbb R, \qquad \mathcal L(c) := \int_{S^1} |\dot c|\, d\mu. \]
As in [38], an easy computation shows that the derivative of the length functional is
\[ d\mathcal L(c;u) = \int_{S^1} -k_c\,\langle N_c, u\rangle\, |\dot c|\, d\mu = g_{\mathrm{inv},c}(-k_c N_c, u), \tag{13} \]
where $N_c(z) = (-y_z(z), x_z(z))^\top$ is the normal vector to the curve $c(z) = (x(z), y(z))$ and $k_c$ is the signed scalar curvature of $c$. Thus $\nabla\mathcal L(c) = -k_c N_c\in C^\infty(S^1,\mathbb R^2)$.

The following example showcases a classical application of the Hessian of an energy functional, which was originally considered to study geodesic loops in Riemannian manifolds; see e.g. [20].

Example 7.6. Let $M$ be a strong Riemannian manifold and denote by $H^1(S^1,M)$ the space of all Sobolev $H^1$-loops with values in $M$; cf. [20, Sections 2.3 and 2.4] for the construction and more information on these manifolds. In [12] the energy functional
\[ E\colon H^1(S^1,M)\to\mathbb R, \qquad E(x) = \frac12\int_0^1 \|\partial x(s)\|^2\, ds = \frac12\,\|\partial x\|^2_{L^2} \]
is defined, where $\partial x$ is the $L^2$-tangent field induced by the loop $x$. The energy functional is of interest as its critical points are geodesics.
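The first variation formula (13) can be checked numerically on a discretized curve: the $g_{\mathrm{inv}}$-pairing of $-k_cN_c$ (taken with the unit normal) against a variation $u$ should match a finite-difference directional derivative of the length functional. The curve, the variation, and the periodic finite-difference discretization below are our own illustrative choices, not taken from the paper.

```python
import numpy as np

# Numerical check of (13) on a sampled immersed curve (an ellipse).
N = 512
t = 2 * np.pi * np.arange(N) / N
dt = 2 * np.pi / N

def d(f):  # periodic central difference
    return (np.roll(f, -1) - np.roll(f, 1)) / (2 * dt)

def length(c):
    return np.sum(np.sqrt(d(c[0])**2 + d(c[1])**2)) * dt

c = np.stack([2 * np.cos(t), np.sin(t)])       # an ellipse
u = np.stack([np.cos(2 * t), np.sin(3 * t)])   # a smooth variation

# dL(c; u) by symmetric finite differences in the direction u
eps = 1e-6
dL = (length(c + eps * u) - length(c - eps * u)) / (2 * eps)

# g_inv-pairing of -k_c N_c with u, using the unit normal n = (-y', x')/|c'|
xp, yp = d(c[0]), d(c[1])
xpp, ypp = d(xp), d(yp)
speed = np.sqrt(xp**2 + yp**2)
k = (xp * ypp - yp * xpp) / speed**3           # signed curvature
n = np.stack([-yp, xp]) / speed
pairing = np.sum(np.sum(-k * n * u, axis=0) * speed) * dt

assert np.isclose(dL, pairing, rtol=1e-2, atol=1e-3)
```

Both sides agree up to the $O(h^2)$ discretization error of the central differences, which is one way to catch sign or normalization mistakes before running a gradient flow on curves.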
The gradient of $E$ with respect to the Sobolev $H^1$-metric is computed in [12] as
\[ \nabla E(x) = -(1-\Delta)^{-1}\nabla\partial x, \]
where the $\nabla$ on the right is the covariant derivative induced by the metric on $M$, $\Delta$ is the Laplace–Beltrami operator (mapping $H^1$-loops to $H^{-1}$-loops), and one exploits that $(1-\Delta)$ is an invertible operator with compact inverse. The Hessian at $\xi_x\in T_x H^1(S^1,M)$ is then given by
\[ \operatorname{Hess} E(\xi_x) = \xi_x + (1-\Delta)^{-1}\Bigl( R(\partial x,\xi_x)(1-\Delta)^{-1}(\partial x) - \nabla\bigl(R(\partial x,\xi_x)\nabla E(x)\bigr) - \xi_x \Bigr), \]
where $R$ is the curvature tensor of $M$. As remarked in [12, p. 114], the Hessian is the identity plus a compact operator, and at a critical point the nullspace of the Hessian consists of all closed Jacobi fields along the critical point (which is an $M$-valued loop!). Note that the tangent field $\partial x$ is such a Jacobi field; this corresponds to the fact that there is a whole circle of critical points in $H^1(S^1,M)$, obtained by rotating the geodesic $x$. While the structure of critical points is more complicated than in the finite-dimensional matrix case (critical points piling up), the Hessian can nevertheless be used to study convergence of gradients towards the critical point; see e.g. [12, Theorem B].

8. Numerical Experiments

In this chapter we apply the developed optimization methods to specific examples. Employing first- and second-order optimality conditions, we locate critical points, ascertain their nature as extrema where applicable, and implement RGD. The examples satisfy all assumptions of Proposition 5.8 and therefore exhibit the anticipated convergence of $\|\nabla f(c_k)\|_{c_k}$ to zero and of the iterates to a minimizer.

Example 8.1. We consider the locally convex space $C^\infty(S^1,\mathbb R^2)$ endowed with the $L^2$-metric $g(h,k) = \int_{S^1}\langle h(\theta), k(\theta)\rangle\, d\theta$.
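In the special case of a flat ambient manifold, the operator $(1-\Delta)^{-1}$ in the gradient formula becomes a simple Fourier multiplier. The sketch below makes this concrete under the assumption $M=\mathbb R^2$ (so the covariant derivative reduces to the second derivative $x''$ and the curvature terms vanish); the spectral discretization is our own illustrative choice.

```python
import numpy as np

# Flat-case sketch of the H^1-gradient of the energy functional:
# nabla E(x) = -(1 - Delta)^{-1} x''  for loops x: R/Z -> R^2.
# In Fourier modes e^{2*pi*i*k*s}, Delta acts as -(2*pi*k)^2, so the
# gradient is the multiplier (2*pi*k)^2 / (1 + (2*pi*k)^2) applied to x.

N = 256
s = np.arange(N) / N                                     # period-1 grid
k = np.fft.fftfreq(N, d=1 / N)                           # integer modes

def H1_gradient(x):
    xhat = np.fft.fft(x, axis=1)
    lap = -(2 * np.pi * k) ** 2                          # symbol of Delta
    ghat = -(lap * xhat) / (1 - lap)                     # -(1-Delta)^{-1} x''
    return np.real(np.fft.ifft(ghat, axis=1))

# The round loop is NOT a closed geodesic in flat R^2, so its gradient is
# nonzero: only the k = +/-1 modes are present, giving the scalar factor
# (2*pi)^2 / (1 + (2*pi)^2) times the loop itself.
x = np.stack([np.cos(2 * np.pi * s), np.sin(2 * np.pi * s)])
g1 = H1_gradient(x)
factor = (2 * np.pi) ** 2 / (1 + (2 * np.pi) ** 2)
assert np.allclose(g1, factor * x, atol=1e-8)
```

The damping $k^2/(1+k^2)\to 1$ of high frequencies is exactly the "identity plus compact operator" structure mentioned above, here visible mode by mode.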
Since $\operatorname{Emb}(S^1,\mathbb R^2)$ is an open subset of $C^\infty(S^1,\mathbb R^2)$, the pair $(\operatorname{Emb}(S^1,\mathbb R^2), g)$ constitutes a weak Riemannian manifold. We aim to minimize
\[ f\colon \operatorname{Emb}(S^1,\mathbb R^2)\to\mathbb R, \qquad c\mapsto \int_{S^1}\|c(\theta)-\theta\|^2\, d\theta, \]
using the Riemannian gradient descent introduced in Section 5. The function $f$ admits a smooth extension to $C^\infty(S^1,\mathbb R^2)$ given by the same expression. A direct computation shows that the gradient of this extension is given pointwise by $\operatorname{grad} f(c)(\theta) = 2(c(\theta)-\theta)$. By the extension result for Riemannian gradients, Lemma 7.1, the Riemannian gradient of $f$ on $\operatorname{Emb}(S^1,\mathbb R^2)$ is therefore $\nabla f(c) = 2(c-\operatorname{id}_{S^1})$. Consequently, a point $c\in\operatorname{Emb}(S^1,\mathbb R^2)$ is a critical point of $f$ if and only if $c = \operatorname{id}_{S^1}$. Since $f(c)\geq 0$ for all $c\in\operatorname{Emb}(S^1,\mathbb R^2)$ and $f(\operatorname{id}_{S^1}) = 0$, the identity embedding is the unique global minimizer of $f$.

To apply the Riemannian gradient descent, consider step sizes $\alpha_k > 0$ for $k\in\mathbb N$. Since the weak Riemannian manifold under consideration is an open subset of a locally convex space, the tangent space at any point $c$ is isomorphic to the space $C^\infty(S^1,\mathbb R^2)$ itself. Therefore we assume that for sufficiently small step sizes the iterates remain within this open subset, and consequently no retraction needs to be defined. For the resulting sequence of iterates $(c_k)_{k\in\mathbb N}$, a direct computation shows that
\[ f(c_k) - f(c_{k+1}) = \alpha_k(1-\alpha_k)\,\|\nabla f(c_k)\|^2_{c_k} \qquad \text{for all } k\in\mathbb N. \]
Hence, if there exists a constant $c>0$ such that the step sizes $\alpha_k$ satisfy $c\leq\alpha_k(1-\alpha_k)$ for all $k\in\mathbb N$, the sufficient decrease condition stated in Assumption 5.2 is fulfilled. In particular, for a constant step size $0<\alpha<1$, this is satisfied with $c = \alpha(1-\alpha)$. Since $f$ attains a global minimum and $\nabla f$ is sequentially continuous, all assumptions of the general convergence result, Proposition 5.8, are fulfilled.
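The iteration above can be reproduced with a few lines of code. The following is an illustrative discretization, not the authors' implementation: the circle is sampled at $N$ points, an embedding $c$ is an $(N,2)$ array, $\operatorname{id}_{S^1}$ is realized as the unit circle $(\cos\theta,\sin\theta)$, and the $L^2$-metric becomes a quadrature-weighted Euclidean product.

```python
import numpy as np

# Discretized Riemannian gradient descent for the functional of Example 8.1.
N = 200
theta = 2 * np.pi * np.arange(N) / N
w = 2 * np.pi / N                                            # quadrature weight
target = np.stack([np.cos(theta), np.sin(theta)], axis=1)    # id_{S^1}

def f(c):
    return w * np.sum((c - target) ** 2)

def grad_f(c):
    return 2.0 * (c - target)            # nabla f(c) = 2(c - id_{S^1})

x, y = target[:, 0], target[:, 1]
c = np.stack([x ** 3, x + y], axis=1)    # initial embedding c_0(x,y)=(x^3,x+y)

alpha = 0.1
for _ in range(20):
    g = grad_f(c)
    c_next = c - alpha * g
    # exact sufficient-decrease identity for this quadratic functional:
    # f(c_k) - f(c_{k+1}) = alpha (1 - alpha) ||grad f(c_k)||^2
    assert np.isclose(f(c) - f(c_next),
                      alpha * (1 - alpha) * w * np.sum(g ** 2))
    c = c_next
```

Since $f(c_{k+1}) = (1-2\alpha)^2 f(c_k)$, twenty steps at $\alpha=0.1$ shrink the function value by roughly four orders of magnitude, matching the linear convergence visible in Figure 1.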
Consequently, every accumulation point of the sequence of iterates $(c_k)_{k\in\mathbb N}$ is a critical point of $f$, and the gradient norms $\|\nabla f(c_k)\|_{c_k}$ converge to zero. Moreover, for every $K\geq 1$ there exists an index $k\in\{0,\ldots,K-1\}$ such that
\[ \|\nabla f(c_k)\|_{c_k} \leq \sqrt{\frac{f(c_0)}{c}}\;\frac{1}{\sqrt K}. \]
We conclude with a numerical illustration of the above convergence behavior. Figure 1 shows twenty iterations of the Riemannian gradient descent with constant step size $\alpha = 0.1$, starting from the initial embedding $c_0(x,y) = (x^3, x+y)$. The left panel depicts the evolution of the iterates, while the right panel displays the decrease of the function values and the norms of the Riemannian gradients, in agreement with the theoretical convergence results.

Figure 1. Riemannian gradient descent for $f$. Left: evolution of the iterates. Right: function values and gradient norms over twenty iterations.

Example 8.2. As in Example 8.1, we consider the weak Riemannian manifold $(\operatorname{Emb}(S^1,\mathbb R^2), g)$. Using the Riemannian gradient descent, we now aim to minimize the functional
\[ f_g\colon \operatorname{Emb}(S^1,\mathbb R^2)\to\mathbb R, \qquad c\mapsto \int_{S^1}\|c(\theta)-g(\theta)\|^2\, d\theta + \lambda\int_{S^1}\|c(\theta)\|^2\, d\theta \]
for some $g\in C^\infty(S^1,\mathbb R^2)$ and $\lambda\geq 0$. Proceeding as in the previous example, we obtain the following expression for the Riemannian gradient of $f_g$:
\[ \nabla f_g(c) = 2\bigl((1+\lambda)c - g\bigr). \]
Thus $f_g$ admits a unique critical point, given by $c = \frac{g}{1+\lambda}$.

In order to verify that this critical point is indeed a minimizer of $f_g$, we investigate the Riemannian Hessian. To this end, we first introduce a Levi-Civita connection on $\operatorname{Emb}(S^1,\mathbb R^2)$. We identify vector fields on $\operatorname{Emb}(S^1,\mathbb R^2)$ with mappings $X\colon \operatorname{Emb}(S^1,\mathbb R^2)\to C^\infty(S^1,\mathbb R^2)$.
Following the construction of Schmeding in [37, 5.7], which is based on the use of connectors, the Levi-Civita connection on $\operatorname{Emb}(S^1,\mathbb R^2)$ is defined as follows:
\[ (\nabla_h Y)(c) = dY(c;h), \qquad c\in\operatorname{Emb}(S^1,\mathbb R^2),\ Y\in\mathcal V(\operatorname{Emb}(S^1,\mathbb R^2)),\ h\in C^\infty(S^1,\mathbb R^2). \]
Throughout, we suppress the notation associated with these identifications for simplicity. Consequently, the Riemannian Hessian of $f_g$ at $c\in\operatorname{Emb}(S^1,\mathbb R^2)$ is given by
\[ \operatorname{Hess} f_g(c)[h] = (\nabla_h\nabla f_g)(c) = d\nabla f_g(c;h) = 2(1+\lambda)h, \qquad h\in C^\infty(S^1,\mathbb R^2). \]
Thus the Riemannian Hessian is positive definite for all $c\in\operatorname{Emb}(S^1,\mathbb R^2)$ provided that $\lambda > -1$. Moreover, $\operatorname{Hess} f_g(c)$ is coercive, as
\[ g_c\bigl(\operatorname{Hess} f_g(c)[h], h\bigr) = 2(1+\lambda)\,\|h\|_c^2 \]
for all $h\in C^\infty(S^1,\mathbb R^2)$. Then, by 4.10, the second-order critical point $c = \frac{g}{1+\lambda}$ is indeed a minimizer of $f_g$.

To apply the Riemannian gradient descent from Section 5, let $(\alpha_k)_{k\in\mathbb N}\subset(0,\infty)$ denote a sequence of step sizes. For sufficiently small step sizes we again assume that the iterates remain within the open set $\operatorname{Emb}(S^1,\mathbb R^2)$, which allows us to avoid defining a retraction. For the resulting sequence of iterates $(c_k)_{k\in\mathbb N}$, a straightforward computation yields
\[ f_g(c_k) - f_g(c_{k+1}) = \alpha_k\bigl(1-(1+\lambda)\alpha_k\bigr)\,\|\nabla f_g(c_k)\|^2_{c_k} \qquad \text{for all } k\in\mathbb N. \]
Hence the sufficient decrease Assumption 5.2 is satisfied provided there exists a constant $c>0$ such that $c\leq\alpha_k\bigl(1-(1+\lambda)\alpha_k\bigr)$ for all step sizes $\alpha_k$. For a constant step size $0<\alpha<\frac{1}{1+\lambda}$, the choice $c = \alpha\bigl(1-(1+\lambda)\alpha\bigr)$ satisfies this condition. As $f_g$ admits a global minimizer and the Riemannian gradient $\nabla f_g$ is sequentially continuous, the decrease of the Riemannian gradient norm stated in Proposition 5.8 follows.
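This example can be discretized in the same way as the previous one; the sketch below (again an illustrative discretization, not the authors' code) verifies the decrease identity at every step and the convergence of the iterates to the unique critical point $g/(1+\lambda)$.

```python
import numpy as np

# Discretized Riemannian gradient descent for Example 8.2, with the
# parameters lambda = 0.7 and alpha = 0.04 used in Figure 2.
N = 200
t = 2 * np.pi * np.arange(N) / N
w = 2 * np.pi / N                          # quadrature weight
x, y = np.cos(t), np.sin(t)
g = np.stack([x, 1.5 * y], axis=1)         # g(x, y) = (x, 3/2 y)
lam, alpha = 0.7, 0.04

def f_g(c):
    return w * np.sum((c - g) ** 2) + lam * w * np.sum(c ** 2)

def grad(c):
    return 2.0 * ((1 + lam) * c - g)       # nabla f_g(c) = 2((1+lam)c - g)

c = np.stack([x ** 3, x + y], axis=1)      # initial iterate c_0(x,y)=(x^3,x+y)
for _ in range(300):
    gk = grad(c)
    c_next = c - alpha * gk
    # decrease identity: f_g(c_k)-f_g(c_{k+1}) = a(1-(1+lam)a)||grad||^2
    assert np.isclose(f_g(c) - f_g(c_next),
                      alpha * (1 - (1 + lam) * alpha) * w * np.sum(gk ** 2))
    c = c_next

# the iterates contract towards the unique critical point g/(1+lam)
assert np.allclose(c, g / (1 + lam), atol=1e-6)
```

Each step contracts the error $c_k - g/(1+\lambda)$ by the factor $|1-2\alpha(1+\lambda)| = 0.864$, so the iterates converge linearly, which is the behavior reported in Figure 2.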
Furthermore, all accumulation points of the resulting iterative sequence are critical points, and for every $K\geq 1$ there exists an index $k\in\{0,\ldots,K-1\}$ such that
\[ \|\nabla f_g(c_k)\|_{c_k} \leq \sqrt{\frac{f_g(c_0)}{c}}\;\frac{1}{\sqrt K}. \]
Consider the smooth map $g\colon S^1\to\mathbb R^2$, $(x,y)\mapsto (x, \tfrac32 y)$, and the smooth embedding chosen as the initial iterate, $c_0\colon S^1\to\mathbb R^2$, $(x,y)\mapsto (x^3, x+y)$. Figure 2 illustrates the behavior of the Riemannian gradient descent with constant step size $\alpha = 0.04$ and parameter $\lambda = 0.7$. The left panel shows the evolution of the iterates $c_k$ under the Riemannian gradient descent. The right panel depicts the decrease of the function-value gap $f_g(c_k) - f_g(c_{\min})$, together with the norm of the Riemannian gradient $\|\nabla f_g(c_k)\|_{c_k}$, over twenty iterations.

Figure 2. Riemannian gradient descent for $f_g$. Left: evolution of the iterates. Right: function values and gradient norms over twenty iterations.

Appendix A. Sprays, connections and metrics

In this section we recall some standard material. For Banach manifolds this can be found e.g. in [22, 23]. First, we need the following for a tangent bundle $TM$ of a smooth manifold: for every $\lambda\in\mathbb R$ we let $h_\lambda\colon TM\to TM$ be the vector bundle morphism which in every fibre $T_xM$ is given by multiplication with $\lambda$.

Definition A.1. Let $M$ be a smooth manifold. A spray is a vector field $S\in\mathcal V(TM)$ on $TM$, i.e. a map $S\colon TM\to T(TM)$ such that $T\pi_M\circ S = \operatorname{id}_{TM}$ and, for all $\lambda\in\mathbb R$, we have $S\circ h_\lambda = Dh_\lambda(\lambda S)$. In local coordinates $(U,\varphi)$ for $M$, a spray $S\colon TM\to T^2M$ can be expressed as $S_U(x,v) = (x, v, v, S_{U,2}(x,v))$, where $S_{U,2}(x,\lambda v) = \lambda^2 S_{U,2}(x,v)$. It is easy to see (cf.
[37, 4.3]) that in every chart $(U,\varphi)$ a spray has an associated quadratic form and an associated bilinear form, given by the formulae
\[ \Gamma_U(x,v) := \frac12\, d_2^2 S_{U,2}(x,0;(v,v)) = S_{U,2}(x,v), \qquad B_U(x,v,w) = \frac12\, d_2^2 S_{U,2}(x,0;(v,w)). \]
Sprays provide the vector fields formalizing second-order differential equations on manifolds.

Definition A.2. Let $(M,g)$ be a weak Riemannian manifold. The spray $S$ is called a metric spray (or geodesic spray) if, locally in every chart domain $U$, the associated quadratic form $\Gamma_U$ satisfies for all $v,w\in T_xU$ the relation
\[ g_U(x, \Gamma_U(x,v), w) = \frac12\, d_1 g_U(x,v,v;w) - d_1 g_U(x,v,w;v), \tag{14} \]
where we view $g$ locally as a map of three variables and $d_1$ denotes the partial derivative with respect to the first component.

For a strong Riemannian metric, (14) can be used to define the quadratic form $\Gamma_U$. Note that the spray is a coordinate-independent way to describe the quadratic object usually described by the metric's Christoffel symbols. There are examples ([37, Example 4.22]) of weak Riemannian metrics without an associated metric spray. Unsurprisingly, metric sprays are stable under isometric isomorphism. We provide the proof here for the reader's convenience, as it showcases how sprays transform under diffeomorphisms.

Lemma A.3. Let $F\colon (M,g)\to(N,h)$ be a Riemannian isometry between weak Riemannian manifolds. Then $(N,h)$ admits a metric spray if and only if $(M,g)$ admits one.

Proof. The situation is symmetric, whence it suffices to assume that $(N,h)$ admits the metric spray $S_h$. Observe that $S_g := T^2(F^{-1})\circ S_h\circ TF$ is a spray, cf. [22, Lemma 3.9].
To check that $S_g$ is a metric spray, one simply has to observe that the relation (14) for the quadratic form of $S_h$ directly yields the desired relation for the quadratic form of $S_g$ in suitable charts. For the reader's convenience we spell this out explicitly: Fix a chart $(U,\varphi)$ of $N$ and obtain the chart $(F^{-1}(U), \varphi\circ F)$ of $M$. Since $F$ is a diffeomorphism, it suffices to compute in charts of this type that $S_g$ is the metric spray. Note that by the construction $S_g = T^2F^{-1}\circ S_h\circ TF$, the local representative
\[ T^2(\varphi\circ F)\circ S_g\circ T(\varphi\circ F)^{-1} = T^2\varphi\circ S_h\circ T\varphi^{-1} \]
of $S_g$ in the chart $\varphi\circ F$ coincides with the local representative of $S_h$ in the chart $\varphi$. We deduce that the quadratic forms $\Gamma_U$ for $S_h$ on $\varphi(U)$ and $\Gamma^g_U$ for $S_g$ on $\varphi(U)$ coincide. Now pick $x\in\varphi(U)$ and $v,w\in T_x\varphi(U)$; since $F$ is a Riemannian isometry,
\[ g_{F^{-1}(U)}(x,v,w) = g_{(\varphi\circ F)^{-1}(x)}\bigl(T_x(\varphi\circ F)^{-1}(v), T_x(\varphi\circ F)^{-1}(w)\bigr) = h_{\varphi^{-1}(x)}\bigl(T_x\varphi^{-1}(v), T_x\varphi^{-1}(w)\bigr) = h_U(x,v,w). \]
We compute locally in the pair of charts $(U,\varphi)$ and $(F^{-1}(U),\varphi\circ F)$; since the local representatives of the metrics coincide and (14) holds for $h_U$ and $\Gamma_U$, we deduce from the fact that the quadratic forms coincide that $\Gamma^g_U$ satisfies (14). □

Every spray induces a covariant derivative (see e.g. [37, Proposition 4.3.9]).

Definition A.4. Let $S\colon TM\to T(TM)$ be a spray. Then there exists a unique covariant derivative $\nabla\colon \mathcal V(M)\times\mathcal V(M)\to\mathcal V(M)$ such that in a chart $(\varphi, U)$ the local formula
\[ \nabla_U(u,Y)(x) = dY(x; u(x)) - B_U(x, u(x), Y(x)) \tag{15} \]
holds. We call $\nabla$ the covariant derivative associated to the spray $S$.
A covariant derivative on a weak Riemannian manifold $(M,g)$ is called a metric derivative if it is compatible with $g$ in the sense that
\[ X.g(Y,Z) = g(\nabla_X Y, Z) + g(Y, \nabla_X Z), \qquad X,Y,Z\in\mathcal V(M), \tag{16} \]
where we use the shorthand $X.f := Df\circ X$. Note that a spray is the metric spray for a Riemannian metric if and only if the associated covariant derivative is a metric derivative. The second-order differential equations described by a spray are variants of geodesic equations. As for a Riemannian metric, if one can solve these differential equations, they give rise to an exponential map associated to the spray. We recall from [23]:

Example A.5. If $M$ is a paracompact Banach manifold with a spray $S\colon TM\to T(TM)$, then the spray exponential $\exp_S\colon TM\supseteq\Omega\to M$ is a normalised local addition on $M$.

References

[1] P.-A. Absil, R. Mahony, and R. Sepulchre. Optimization algorithms on matrix manifolds. Princeton University Press, 2008.
[2] J. Altschuler, S. Chewi, P. R. Gerber, and A. Stromme. Averaging on the Bures–Wasserstein manifold: dimension-free convergence of gradient descent. In M. Ranzato, A. Beygelzimer, Y. Dauphin, P. Liang, and J. W. Vaughan, editors, Advances in Neural Information Processing Systems, volume 34, pages 22132–22145. Curran Associates, Inc., 2021.
[3] H. Amiri, H. Glöckner, and A. Schmeding. Lie groupoids of mappings taking values in a Lie groupoid. Arch. Math. (Brno), 56(5):307–356, 2020.
[4] T. Balehowsky, C.-J. Karlsson, and K. Modin. Shape analysis via gradient flows on diffeomorphism groups. Nonlinearity, 36(2):862–877, 2023.
[5] M. Bauer, M. Bruveris, and P. W. Michor. Overview of the geometries of shape spaces and diffeomorphism groups. J. Math. Imaging Vis., 50(1-2):60–97, 2014.
[6] M. Bauer, N. Charon, E. Klassen, S. Kurtek, T. Needham, and T. Pierron.
Elastic metrics on spaces of Euclidean curves: theory and algorithms. J. Nonlinear Sci., 34(3):38, 2024. Id/No 56.
[7] N. Borchard and G. Wachsmuth. Characterization of Hilbertizable spaces via convex functions. Preprint, arXiv:2506.04686 [math.FA] (2025), 2025.
[8] N. Boumal. An introduction to optimization on smooth manifolds. Cambridge University Press, 2023.
[9] S. Chen, S. Ma, A. Man-Cho So, and T. Zhang. Proximal gradient method for nonsmooth optimization over the Stiefel manifold. SIAM Journal on Optimization, 30(1):210–239, 2020.
[10] E. Döhrer and N. Freches. Convergence of gradient flows on knotted curves. Preprint, [math.CA] (2025), 2025.
[11] H. I. Elíasson. Condition (C) and geodesics on Sobolev manifolds. Bull. Am. Math. Soc., 77:1002–1005, 1971.
[12] H. I. Eliasson. Convergence of gradient curves on Hilbert manifolds. Math. Z., 136:107–116, 1974.
[13] P. M. N. Feehan. On the Morse–Bott property of analytic functions on Banach spaces with Łojasiewicz exponent one half. Calc. Var. Partial Differ. Equ., 59(2):50, 2020. Id/No 87.
[14] P. M. N. Feehan and M. Maridakis. Łojasiewicz–Simon gradient inequalities for analytic and Morse–Bott functions on Banach spaces. J. Reine Angew. Math., 765:35–67, 2020.
[15] M. Gage and R. S. Hamilton. The heat equation shrinking convex plane curves. J. Differ. Geom., 23:69–96, 1986.
[16] gerw (https://math.stackexchange.com/users/58577/gerw). What is something (non-trivial) that can be done in Hilbert space but not Banach spaces for optimization problems? Mathematics Stack Exchange. URL: https://math.stackexchange.com/q/3279480 (version: 2019-07-01).
[17] W. Greub, S. Halperin, and R. Vanstone. Connections, curvature, and cohomology. Vol. II: Lie groups, principal bundles, and characteristic classes, volume 47 of Pure Appl. Math. Academic Press, New York, NY, 1973.
[18] D. W. Henderson.
Infinite-dimensional manifolds are open subsets of Hilbert space. Topology, 9:25–33, 1970.
[19] J. Jost. Riemannian geometry and geometric analysis. Universitext. Cham: Springer, 7th edition, 2017.
[20] W. P. A. Klingenberg. Riemannian geometry, volume 1 of De Gruyter Stud. Math. Berlin: Walter de Gruyter, 2nd ed. edition, 1995.
[21] D. Kressner, M. Steinlechner, and B. Vandereycken. Low-rank tensor completion by Riemannian optimization. BIT, 54(2):447–468, June 2014.
[22] P. Kristel and A. Schmeding. The Stacey–Roberts lemma for Banach manifolds. SIGMA, Symmetry Integrability Geom. Methods Appl., 21: paper 037, 20, 2025.
[23] S. Lang. Fundamentals of differential geometry, volume 191 of Grad. Texts Math. New York, NY: Springer, corr. 2nd printing edition, 2001.
[24] J. M. Lee. Riemannian manifolds: an introduction to curvature, volume 176 of Grad. Texts Math. New York, NY: Springer, 1997.
[25] E. Loayza-Romero, L. Pryymak, and K. Welker. A Riemannian approach for PDE constrained shape optimization over the diffeomorphism group using outer metrics. Preprint, [math.OC] (2025), 2025.
[26] E. Loayza-Romero and K. Welker. Numerical techniques for geodesic approximation in Riemannian shape optimization. Preprint, arXiv:2504.01564 [math.OC] (2025), 2025.
[27] J. Lott. Some geometric calculations on Wasserstein space. Commun. Math. Phys., 277(2):423–437, 2008.
[28] M. Micheli, P. W. Michor, and D. Mumford. Sobolev metrics on diffeomorphism groups and the derived geometry of spaces of submanifolds. Izv. Ross. Akad. Nauk Ser. Mat., 77(3):109–138, 2013.
[29] P. W. Michor. Manifolds of differentiable mappings, volume 3 of Shiva Math. Ser. Shiva Publishing Limited, Nantwich, Cheshire, 1980.
[30] P. W. Michor. Manifolds of mappings and shapes. Preprint, arXiv:1505.02359 [math.DG] (2015), 2015.
[31] P. W. Michor and D. Mumford.
An overview of the Riemannian metrics on spaces of curves using the Hamiltonian approach. Appl. Comput. Harmon. Anal., 23(1):74–113, 2007.
[32] F. Otto. The geometry of dissipative evolution equations: The porous medium equation. Commun. Partial Differ. Equations, 26(1-2):101–174, 2001.
[33] R. S. Palais. Morse theory on Hilbert manifolds. Topology, 2:299–340, 1963.
[34] R. S. Palais. Foundations of global non-linear analysis. Math. Lect. Note Ser. The Benjamin/Cummings Publishing Company, Reading, MA, 1968.
[35] R. S. Palais and S. Smale. A generalized Morse theory. Bull. Am. Math. Soc., 70:165–172, 1964.
[36] W. Rudin. Real and complex analysis. New York, NY: McGraw-Hill, 3rd ed. edition, 1987.
[37] A. Schmeding. An introduction to infinite-dimensional differential geometry, volume 202 of Camb. Stud. Adv. Math. Cambridge: Cambridge University Press, 2023.
[38] P. Schrader, G. Wheeler, and V.-M. Wheeler. On the $H^1(ds_\gamma)$-gradient flow for the length functional. J. Geom. Anal., 33(9):49, 2023. Id/No 297.
[39] W. Si, P.-A. Absil, W. Huang, R. Jiang, and S. Vary. A Riemannian proximal Newton method. SIAM Journal on Optimization, 34(1):654–681, 2024.
[40] G. Smyrlis and V. Zisis. Local convergence of the steepest descent method in Hilbert spaces. J. Math. Anal. Appl., 300(2):436–453, 2004.
[41] A. Trouvé. Diffeomorphisms groups and pattern matching in image analysis. Commun. Partial Differ. Equations, 28(3):213–221, 1998.
[42] T. T. Truong. Some iterative algorithms on Riemannian manifolds and Banach spaces with good global convergence guarantee. Preprint, arXiv:2505.22180 [math.OC] (2025), 2025.
[43] L. Younes. Shapes and diffeomorphisms, volume 171 of Appl. Math. Sci. Berlin: Springer, 2nd updated edition, 2019.

Georg-August-University Göttingen, Institute for Applied and Numerical Mathematics, Lotzestr.
16-18, 37083 Göttingen
Email address: v.zalbertus@stud.uni-goettingen.de

Georg-August-University Göttingen, Institute for Applied and Numerical Mathematics, Lotzestr. 16-18, 37083 Göttingen
Email address: m.pfeffer@math.uni-goettingen.de

Norwegian University of Science and Technology, Department of Mathematical Sciences, Alfred Getz' vei 1, Trondheim
Email address: alexander.schmeding@ntnu.no
