Local Fréchet regression with toroidal predictors

Chang Jun Im¹ and Jeong Min Jeon²

¹The Institute for Data Innovation in Science, Seoul National University, South Korea
²Department of Statistics and School of Transdisciplinary Innovations, Seoul National University, South Korea

Abstract

We provide the first regression framework that simultaneously accommodates responses taking values in a general metric space and predictors lying on a general torus. We propose intrinsic local constant and local linear estimators that respect the underlying geometries of both the response and predictor spaces. Our local linear estimator is novel even in the case of scalar responses. We further establish their asymptotic properties, including consistency and convergence rates. Simulation studies, together with an application to real data, illustrate the superior performance of the proposed methodology.

Keywords: Fréchet regression, local linear regression, metric space, toroidal data

1 Introduction

Modern data often reside in spaces that lack a vector space structure or possess geometric features, such as dimensions, distances and angles, that are fundamentally different from those of Euclidean spaces. These characteristics pose substantial challenges for statistical analysis and motivate the development of intrinsic approaches that respect the underlying data geometry. For an introduction to non-Euclidean data analysis, we refer to Sanborn et al. (2024).

We study nonparametric regression for toroidal predictors taking values in the d-dimensional torus T^d := S^1 × ⋯ × S^1, the Cartesian product of d ≥ 1 unit circles S^1 := {(cos θ, sin θ)^⊤ ∈ R^2 : θ ∈ [−π, π)}. Data exhibiting natural periodicity can be viewed as lying on S^1, and data lying on T^d naturally arise in many scientific disciplines when observations exhibit multiple periodic components.
For instance, in Earth science, analyzing multiple wind and wave directions is an important task, and such joint directional measurements give rise to data lying on a torus (e.g., Xu and Wang (2023); Biswas and Banerjee (2025)). In molecular biology, protein backbone conformations are commonly characterized by paired dihedral angles that lie on a torus (e.g., Di Marzio et al. (2011); Eltzner et al. (2018); Jung et al. (2021); Xu and Wang (2023)). In many real-world datasets, events of interest depend simultaneously on the time of day and the day of the year, which together define observations on T^2. Despite the abundance of such data, statistical methodologies for toroidal predictors remain limited. For example, Di Marzio et al. (2009) studied local linear regression for toroidal predictors with scalar responses, and Biswas and Banerjee (2025) investigated semi-parametric regression for predictors and responses both taking values in T^2. More generally, regression methods that accommodate manifold-valued predictors, including toroidal predictors, have been developed in Pelletier (2006), Jeon et al. (2021) and Jeon et al. (2022), among others. However, the latter works focus on scalar or Hilbert-space-valued responses and thus exclude manifold-valued responses or, more generally, metric-space-valued responses.

The Fréchet regression framework, introduced by Petersen and Müller (2019), provides a powerful tool for modeling responses that take values in a general metric space, including Wasserstein spaces, network spaces and manifolds. Subsequent studies, such as Zhou and Müller (2022), Bhattacharjee and Müller (2023), Tucker et al. (2023) and Steyer et al. (2025), have extended this framework in various directions. However, most existing Fréchet regression methods focus on Euclidean predictors.
Recently, Tucker and Wu (2025) considered predictors taking values in a general metric space, but their approach is based on local constant estimation and relies on a Hölder continuity assumption on the regression function, which yields suboptimal convergence rates when the true regression function is sufficiently smooth. More recently, Im et al. (2025) developed local constant and local linear Fréchet regression for spherical predictors taking values in S^p := {v ∈ R^{p+1} : ∥v∥₂ = 1}.

In this work, we develop both local constant and local linear regression methods for toroidal predictors with metric-space-valued responses, fully respecting the underlying data geometry. To the best of our knowledge, local linear estimation in this setting has not been studied. Our local linear estimator is novel even in the case of scalar responses, as it differs from that of Di Marzio et al. (2009). Both the proposed local constant and local linear estimators achieve the optimal error rate. We demonstrate the practical utility of the proposed methods through an analysis of New York taxi network data, where the response is a graph Laplacian that is neither vector-space-valued nor manifold-valued.

The rest of this paper is organized as follows. Section 2 introduces the problem setting and the proposed estimation methods. Section 3 presents the asymptotic theory, including consistency and convergence rates. Sections 4 and 5 evaluate the finite-sample performance of the proposed methods through simulation studies and a real data application, respectively. All technical proofs are provided in the Supplementary Material.

2 Problem setting and estimators

2.1 Problem setting

Let (Ω, F, P) be an underlying probability space and M be a general totally bounded metric space equipped with a metric d_M. Also, let Y : Ω → M be the response variable and X : Ω → T^d be the predictor.
For any x ∈ T^d, we write x = (x₁^⊤, …, x_d^⊤)^⊤, where x_ℓ ∈ S^1. We define a function M_⊕ : T^d × M → R as

\[ M_\oplus(x, y) := E\left[ d_M^2(Y, y) \mid X = x \right]. \]

In this paper, we aim to estimate a function m_⊕ : T^d → M, defined as

\[ m_\oplus(x) := \operatorname*{argmin}_{y \in M} M_\oplus(x, y). \]

When Y ∈ R, the usual regression function is characterized by argmin_{y∈R} E[(Y − y)² | X = x]. Hence, the function m_⊕ can be understood as a natural extension of the usual regression function and is often called the Fréchet regression function. We propose two estimators for m_⊕: a local constant estimator and a local linear estimator. Before we introduce the estimators, we present some examples of M.

Example 2.1. For [a, b] ⊂ R with −∞ < a < b < ∞, let W([a, b]) denote the space of all distribution functions F : R → [0, 1] such that F⁻¹(t) = inf{x ∈ R : F(x) ≥ t} ∈ [a, b] for all t ∈ (0, 1]. Also, let d_W denote the 2-Wasserstein distance, defined as

\[ d_W(F, G) := \left( \int_0^1 \left( F^{-1}(t) - G^{-1}(t) \right)^2 dt \right)^{1/2}, \qquad F, G \in W([a, b]). \]

Then, W([a, b]) equipped with d_W, called the 2-Wasserstein space on [a, b], is totally bounded; see Corollary 2.2.5 and Proposition 2.2.8 of Panaretos and Zemel (2020).

Example 2.2. Let G = (V, E) be a graph network with nodes V = {v₁, …, v_k} and edge weights E = {w_ij : 1 ≤ i, j ≤ k} satisfying 0 ≤ w_ij ≤ C_w for some constant C_w > 0, where w_ij = 0 indicates that nodes v_i and v_j are not connected. We assume that G has no self-loops or multi-edges and is undirected, that is, w_ij = w_ji for all 1 ≤ i, j ≤ k. Such a graph network can be uniquely represented by its graph Laplacian L = (L_ij) ∈ R^{k×k}, whose entries are defined as

\[ L_{ij} := \begin{cases} -w_{ij}, & i \neq j, \\ \sum_{k' \neq i} w_{ik'}, & i = j. \end{cases} \]
We consider the space of graph Laplacians

\[ \mathcal{L}_k := \left\{ L : L = L^\top,\ L \mathbf{1}_k = \mathbf{0}_k,\ -C_w \le L_{ij} \le 0 \text{ for } i \neq j \right\}, \]

where 1_k and 0_k denote the k-dimensional vectors of ones and zeros, respectively. We equip L_k with the Frobenius distance

\[ d_F(L, M) := \left( \sum_{i=1}^k \sum_{j=1}^k (L_{ij} - M_{ij})^2 \right)^{1/2}. \]

Then, the metric space (L_k, d_F) is totally bounded; see Proposition 1 of Zhou and Müller (2022).

Example 2.3. Any finite-dimensional compact Riemannian manifold without boundary, equipped with the geodesic distance, is totally bounded. Such Riemannian manifolds include the spheres, toruses, planar shape spaces (Le and Kendall (1993)) and special orthogonal groups SO(k) = {A ∈ R^{k×k} : A is orthogonal and det(A) = 1}.

2.2 Local constant estimator

We introduce our local constant estimation. For a vector h = (h₁, …, h_d)^⊤ of bandwidths h_ℓ > 0, we define the toroidal product kernel L_{x,h} : T^d → [0, ∞) as

\[ L_{x,h}(z) := \prod_{\ell=1}^d L\!\left( \frac{1 - z_\ell^\top x_\ell}{h_\ell^2} \right), \qquad z = (z_1^\top, \ldots, z_d^\top)^\top \in T^d, \tag{2.1} \]

where L : [0, ∞) → [0, ∞) is a decreasing function. Note that the angle ϑ_ℓ ∈ [0, π] between z_ℓ and x_ℓ equals arccos(z_ℓ^⊤ x_ℓ), which implies that 1 − z_ℓ^⊤ x_ℓ = 2 sin²(ϑ_ℓ/2). Hence, 1 − z_ℓ^⊤ x_ℓ becomes larger as z_ℓ moves further away from x_ℓ.

Let {(X^{(i)}, Y^{(i)})}_{i=1}^n be a random sample of (X, Y). We estimate M_⊕(x, y) by the local constant estimator

\[ \hat{M}_{h,0}(x, y) := \frac{ n^{-1} \sum_{i=1}^n L_{x,h}(X^{(i)})\, d_M^2(Y^{(i)}, y) }{ n^{-1} \sum_{i=1}^n L_{x,h}(X^{(i)}) } \]

and estimate m_⊕(x) by

\[ \hat{m}_{h,0}(x) := \operatorname*{argmin}_{y \in M} \hat{M}_{h,0}(x, y). \]

We call m̂_{h,0}(x) a local constant Fréchet regression estimator at x.

2.3 Local linear estimator

We introduce our local linear estimation. We first consider the case where Y is scalar and then extend the construction to the general metric space. Our estimation method is new even in this setting.
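Before turning to the local linear construction, the local constant estimator of Section 2.2 can be sketched numerically. The sketch below is our illustration, not the authors' implementation: it uses the von Mises profile L(r) = e^{−r} in (2.1) and takes M = R^k with the Euclidean distance, for which the weighted Fréchet mean reduces to a weighted average.

```python
import numpy as np

def toroidal_kernel(Z, x, h):
    """Toroidal product kernel (2.1) with the von Mises profile L(r) = exp(-r).

    Z : (n, 2d) sample on T^d, each circle stored as a unit vector in R^2
    x : (2d,) evaluation point on T^d
    h : (d,) bandwidth vector
    """
    w = np.ones(Z.shape[0])
    for l in range(len(h)):
        r = (1.0 - Z[:, 2 * l:2 * l + 2] @ x[2 * l:2 * l + 2]) / h[l] ** 2
        w *= np.exp(-r)  # von Mises profile, one factor per circle
    return w

def local_constant_frechet(Z, Y, x, h):
    """Local constant estimator m_hat_{h,0}(x) when M = R^k with the
    Euclidean distance: the weighted Frechet mean is the weighted average."""
    w = toroidal_kernel(Z, x, h)
    return (w[:, None] * Y).sum(axis=0) / w.sum()
```

For a genuinely metric-space-valued response, the last line would instead minimize Σᵢ wᵢ d_M²(Y^{(i)}, y) over y ∈ M numerically.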
Let g : T^d → R be a sufficiently smooth function. We begin by introducing a notion of the gradient of g, which will be used in the construction of the local linear estimator. We define

\[ N_\ell := \left\{ (z_1^\top, \ldots, z_d^\top)^\top \in R^{2d} : z_\ell = 0_2 \right\}, \qquad N := \bigcup_{\ell=1}^d N_\ell, \]

where 0_p denotes the zero vector in R^p for p ∈ N. We also define the homogeneous extension ḡ : R^{2d} \ N → R of g as

\[ \bar{g}(z) := g\!\left( \left( \frac{z_1^\top}{\|z_1\|_2}, \ldots, \frac{z_d^\top}{\|z_d\|_2} \right)^{\!\top} \right), \qquad z = (z_1^\top, \ldots, z_d^\top)^\top \in R^{2d} \setminus N. \tag{2.2} \]

We define the block-diagonal matrix D_z ∈ R^{2d×d} as

\[ D_z := \begin{pmatrix} z_1 & \cdots & 0_2 \\ \vdots & \ddots & \vdots \\ 0_2 & \cdots & z_d \end{pmatrix}. \]

Note that ḡ is homogeneous of degree 0, that is, ḡ(D_z t) = ḡ(t₁ z₁^⊤, …, t_d z_d^⊤) = ḡ(z) for all t = (t₁, …, t_d)^⊤ ∈ (0, ∞)^d. Since the function t ↦ ḡ(D_z t) is constant with respect to t, the chain rule implies that

\[ 0 = \frac{\partial}{\partial t_\ell} \bar{g}(D_z t) = \nabla \bar{g}(D_z t)^\top D_z e_\ell, \qquad \ell = 1, \ldots, d, \tag{2.3} \]

where ∇ḡ(D_z t) ∈ R^{2d} denotes the gradient of ḡ at D_z t, and e_ℓ = (0, …, 0, 1, 0, …, 0)^⊤ denotes the ℓ-th standard basis vector in R^d. Evaluating (2.3) at t = (1, …, 1)^⊤ ∈ (0, ∞)^d yields

\[ D_z^\top \nabla \bar{g}(z) = 0_d. \tag{2.4} \]

This imposes a natural constraint on the gradient.

Now, we introduce a first-order approximation of g. For x ∈ T^d, we define a function Φ_{x,ℓ} : [−π, π) → S^1 by

\[ \Phi_{x,\ell}(\theta_\ell) := x_\ell \cos\theta_\ell + (R x_\ell) \sin\theta_\ell, \qquad \theta_\ell \in [-\pi, \pi), \]

where

\[ R = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \]

denotes the counterclockwise rotation matrix by π/2 in R². This function Φ_{x,ℓ} is a bijection due to the tangent-normal decomposition (e.g., Equation (9.1.2) of Mardia and Jupp (2000)). We also define Φ_x : [−π, π)^d → T^d as

\[ \Phi_x(\theta) := \left( \Phi_{x,1}(\theta_1)^\top, \ldots, \Phi_{x,d}(\theta_d)^\top \right)^\top, \qquad \theta = (\theta_1, \ldots, \theta_d)^\top \in [-\pi, \pi)^d, \tag{2.5} \]

which is also a bijection.
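The parametrization Φ_{x,ℓ} and its inverse are straightforward to realize numerically; the following is a minimal sketch (our illustration) that recovers θ from z = Φ_{x,ℓ}(θ) using the orthonormal frame (x_ℓ, R x_ℓ) of R².

```python
import numpy as np

R = np.array([[0.0, -1.0], [1.0, 0.0]])  # counterclockwise rotation by pi/2

def phi(x_l, theta_l):
    """Phi_{x,l}(theta) = x_l cos(theta) + (R x_l) sin(theta), a point on S^1."""
    return x_l * np.cos(theta_l) + (R @ x_l) * np.sin(theta_l)

def phi_inv(x_l, z_l):
    """Inverse map: since (x_l, R x_l) is an orthonormal frame of R^2,
    z . x_l = cos(theta) and z . (R x_l) = sin(theta)."""
    return float(np.arctan2(z_l @ (R @ x_l), z_l @ x_l))
```

The same construction applied circle by circle realizes Φ_x and its inverse on T^d.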
We define the block-diagonal matrix R_x ∈ R^{2d×d} as

\[ R_x := \begin{pmatrix} R x_1 & \cdots & 0_2 \\ \vdots & \ddots & \vdots \\ 0_2 & \cdots & R x_d \end{pmatrix}. \]

If z ∈ T^d and x are close enough, then Lemma A.9 in the Appendix implies that the following first-order approximation holds:

\[ g(z) \approx g(x) + (R_x \theta_z)^\top \nabla \bar{g}(x). \tag{2.6} \]

Here, θ_z ∈ [−π, π)^d is the vector satisfying z = Φ_x(θ_z).

Now, we establish the local linear estimator. For each observation X^{(i)}, let θ^{(i)} ∈ [−π, π)^d denote the vector satisfying X^{(i)} = Φ_x(θ^{(i)}). We define

\[ (\hat\alpha_h(x), \hat\beta_h(x)) := \operatorname*{argmin}_{\alpha \in R,\, \beta \in R^{2d}}\ n^{-1} \sum_{i=1}^n L_{x,h}(X^{(i)}) \left( Y^{(i)} - \alpha - (R_x \theta^{(i)})^\top \beta \right)^2 \quad \text{subject to } D_x^\top \beta = 0_d. \tag{2.7} \]

Motivated by (2.4) and (2.6), we may take α̂_h(x) as an estimator of E(Y | X = x).

To remove the constraint in (2.7), suppose that β = (β₁^⊤, …, β_d^⊤)^⊤ satisfies D_x^⊤ β = 0_d. Since x_ℓ^⊤ β_ℓ = 0 and x_ℓ^⊤ R x_ℓ = 0 for each ℓ, we can write β_ℓ = γ_ℓ R x_ℓ for some γ_ℓ ∈ R. This implies that β = R_x γ for some γ = (γ₁, …, γ_d)^⊤ ∈ R^d. Let I_p denote the p × p identity matrix for p ∈ N. Since

\[ R_x^\top R_x = \begin{pmatrix} (R x_1)^\top & \cdots & 0_2^\top \\ \vdots & \ddots & \vdots \\ 0_2^\top & \cdots & (R x_d)^\top \end{pmatrix} \begin{pmatrix} R x_1 & \cdots & 0_2 \\ \vdots & \ddots & \vdots \\ 0_2 & \cdots & R x_d \end{pmatrix} = I_d, \]

(2.7) reduces to the unconstrained optimization problem

\[ (\hat\alpha_h(x), \hat\gamma_h(x)) = \operatorname*{argmin}_{\alpha \in R,\, \gamma \in R^{d}}\ n^{-1} \sum_{i=1}^n L_{x,h}(X^{(i)}) \left( Y^{(i)} - \alpha - \gamma^\top \theta^{(i)} \right)^2. \]
(2.8)

To get an explicit expression for α̂_h(x), we define

\[ \hat\mu_{h,0}(x) := n^{-1} \sum_{i=1}^n L_{x,h}(X^{(i)}), \qquad \hat\mu_{h,1}(x) := n^{-1} \sum_{i=1}^n L_{x,h}(X^{(i)})\, \theta^{(i)}, \]
\[ \hat\mu_{h,2}(x) := n^{-1} \sum_{i=1}^n L_{x,h}(X^{(i)})\, \theta^{(i)} (\theta^{(i)})^\top, \qquad \hat\sigma_h(x) := \hat\mu_{h,0}(x) - \hat\mu_{h,1}(x)^\top \hat\mu_{h,2}(x)^{-1} \hat\mu_{h,1}(x), \]
\[ \hat{W}_{x,h}(X) := \frac{L_{x,h}(X)}{\hat\sigma_h(x)} \left( 1 - \hat\mu_{h,1}(x)^\top \hat\mu_{h,2}(x)^{-1} \Phi_x^{-1}(X) \right), \]

given that μ̂_{h,2}(x) is invertible and that σ̂_h(x) ≠ 0. According to Lemma B.2 and Lemma B.3 in the Appendix, μ̂_{h,2}(x) is invertible and σ̂_h(x) > 0 with probability tending to one, under mild conditions. By solving (2.8), we get

\[ \hat\alpha_h(x) = n^{-1} \sum_{i=1}^n \hat{W}_{x,h}(X^{(i)})\, Y^{(i)}. \]

In other words,

\[ \hat\alpha_h(x) = \operatorname*{argmin}_{y \in R}\ n^{-1} \sum_{i=1}^n \hat{W}_{x,h}(X^{(i)}) \left( Y^{(i)} - y \right)^2. \]

We extend this formulation to the general metric space by replacing the Euclidean distance with d_M. Specifically, we define

\[ \hat{M}_{h,1}(x, y) := n^{-1} \sum_{i=1}^n \hat{W}_{x,h}(X^{(i)})\, d_M^2(Y^{(i)}, y), \qquad \hat{m}_{h,1}(x) := \operatorname*{argmin}_{y \in M} \hat{M}_{h,1}(x, y). \]

We call m̂_{h,1}(x) a local linear Fréchet regression estimator at x.

Remark 2.1. In the case of scalar responses, the local linear method of Di Marzio et al. (2009) is based on locally approximating the target regression function m : [−π, π)^d → R at a given vector θ = (θ₁, …, θ_d)^⊤ ∈ [−π, π)^d by

\[ m(\theta) \approx \beta_0 + \sum_{\ell=1}^d \beta_\ell \sin(\theta_\ell - \psi_\ell), \]

where β₀, β_ℓ ∈ R and θ ≈ ψ = (ψ₁, …, ψ_d)^⊤ ∈ [−π, π)^d. This approach is fundamentally different from ours, which relies on the tangent-normal decomposition in the embedded submanifold.

3 Asymptotic theory

3.1 Consistency

In this section, we derive the consistency of our estimators at a given point x ∈ T^d. We first introduce a condition on the function L.

Condition L.
The function L satisfies

\[ 0 < \int_0^\infty L^k(r^2)\, r^j \, dr < \infty \qquad \text{for all } k \in \{1, 2\} \text{ and nonnegative integers } j. \]

Similar conditions were adopted in Hall et al. (1987), Bai et al. (1988) and García-Portugués et al. (2013). Examples of L satisfying Condition L include the von Mises kernel L(r) = e^{−r}, the exponential kernel L(r) = e^{−√r} and the uniform kernel L(r) = 1_{[0,1]}(r), where 1_{[0,1]} denotes the indicator function of [0, 1].

We also make a typical condition on the bandwidth vector h.

Condition B. The bandwidth vector h satisfies lim_{n→∞} ∥h∥₂ = 0 and lim_{n→∞} (n ∏_{ℓ=1}^d h_ℓ) = ∞.

Note that Ruppert and Wand (1994) assumed that the ratio of the largest to the smallest bandwidth components remains bounded as the sample size grows. We do not impose this restriction, allowing for more flexible bandwidth vectors.

Now, we make some conditions on the distributions of X and Y. Let B(S^1) and B(T^d) denote the Borel σ-fields of S^1 and T^d, respectively. Note that B(T^d) equals the product σ-field of d Borel σ-fields B(S^1). Also, let ω₁^d denote the scaled toroidal measure on T^d, defined as the product measure of d circular measures on S^1. Note that

\[ \omega_1^d(A_1 \times \cdots \times A_d) = 2^d \prod_{\ell=1}^d \mathrm{Leb}_2\left( \{ t \cdot x : t \in [0, 1],\ x \in A_\ell \} \right), \qquad A_1, \ldots, A_d \in B(S^1), \]

where Leb₂ denotes the Lebesgue measure on R². This measure satisfies ω₁^d(T^d) = (2π)^d. Let f : T^d → [0, ∞) denote the density of X with respect to ω₁^d.

Condition D1. The density f satisfies f(x) > 0.

Condition D2. The density f is continuous at x and is bounded on T^d.

Most parametric distributions on T^d, including the uniform distribution on T^d and the von Mises-Fisher distributions on T^d, satisfy Conditions D1 and D2. Let P_{(X,Y)} denote the joint distribution of (X, Y), and let P_X and P_Y denote the distributions of X and Y, respectively.
Also, let P_{Y|X=x} denote the conditional distribution of Y given X = x. We assume that P_{Y|X=x} is absolutely continuous with respect to P_Y for each x ∈ T^d, so that the Radon-Nikodym derivative dP_{Y|X=x}/dP_Y of P_{Y|X=x} with respect to P_Y exists. For each y ∈ M, we define g_y : T^d → [0, ∞) as

\[ g_y(x) := \frac{dP_{Y|X=x}}{dP_Y}(y). \tag{3.1} \]

Condition D3. The family {g_y : y ∈ M} of functions is equicontinuous at x, and sup_{y∈M} sup_{z∈T^d} g_y(z) < ∞.

Similar conditions were used in Di Marzio et al. (2014) and Petersen and Müller (2019). Now, we introduce a condition on M. Recall the definitions of μ̂_{h,0}(x), μ̂_{h,1}(x), μ̂_{h,2}(x), σ̂_h(x) and Ŵ_{x,h}(X) given in Section 2.3. We define their population versions as

\[ \tilde\mu_{h,0}(x) := E[L_{x,h}(X)], \qquad \tilde\mu_{h,1}(x) := E\left[ L_{x,h}(X)\, \Phi_x^{-1}(X) \right], \]
\[ \tilde\mu_{h,2}(x) := E\left[ L_{x,h}(X)\, \Phi_x^{-1}(X)\, \Phi_x^{-1}(X)^\top \right], \qquad \tilde\sigma_h(x) := \tilde\mu_{h,0}(x) - \tilde\mu_{h,1}(x)^\top \tilde\mu_{h,2}(x)^{-1} \tilde\mu_{h,1}(x), \]
\[ \tilde{W}_{x,h}(X) := \frac{L_{x,h}(X)}{\tilde\sigma_h(x)} \left( 1 - \tilde\mu_{h,1}(x)^\top \tilde\mu_{h,2}(x)^{-1} \Phi_x^{-1}(X) \right). \]

We also define the population versions of M̂_{h,0}(x, y), m̂_{h,0}(x), M̂_{h,1}(x, y) and m̂_{h,1}(x) as

\[ \tilde{M}_{h,0}(x, y) := \frac{E[L_{x,h}(X)\, d_M^2(Y, y)]}{\tilde\mu_{h,0}(x)}, \qquad \tilde{m}_{h,0}(x) := \operatorname*{argmin}_{y \in M} \tilde{M}_{h,0}(x, y), \]
\[ \tilde{M}_{h,1}(x, y) := E\left[ \tilde{W}_{x,h}(X)\, d_M^2(Y, y) \right], \qquad \tilde{m}_{h,1}(x) := \operatorname*{argmin}_{y \in M} \tilde{M}_{h,1}(x, y). \]

Condition M1. For each s ∈ {0, 1}, (i) m_⊕(x), m̃_{h,s}(x) and m̂_{h,s}(x) uniquely exist, the latter almost surely; (ii) for any ε > 0,

\[ \liminf_n \inf_{y \in M :\ d_M(y, \tilde{m}_{h,s}(x)) > \epsilon} \left[ \tilde{M}_{h,s}(x, y) - \tilde{M}_{h,s}(x, \tilde{m}_{h,s}(x)) \right] > 0, \]
\[ \inf_{y \in M :\ d_M(y, m_\oplus(x)) > \epsilon} \left[ M_\oplus(x, y) - M_\oplus(x, m_\oplus(x)) \right] > 0. \]

Condition M1 is satisfied for various metric spaces.
For example, the spaces in Examples 2.1 and 2.2 satisfy this condition by Proposition 1 of Petersen and Müller (2019) and Section B.3 of Zhou and Müller (2022), respectively. For the space in Example 2.3, Condition M1-(i) is satisfied under various manifold conditions; see Afsari (2011) and Charlier (2013), for example.

Now, we present the consistency.

Theorem 3.1. Assume that Conditions L, B, D1-D3 and M1 hold. For each s ∈ {0, 1}, it holds that

\[ d_M\left( \hat{m}_{h,s}(x), m_\oplus(x) \right) = o_P(1). \]

3.2 Rate of convergence

In this section, we establish the convergence rates of the proposed estimators. To this end, we introduce additional conditions. Let f̄ : R^{2d} \ N → [0, ∞) denote an extension of the density f, defined as

\[ \bar{f}(z) := f\!\left( \left( \frac{z_1^\top}{\|z_1\|_2}, \ldots, \frac{z_d^\top}{\|z_d\|_2} \right)^{\!\top} \right), \qquad z = (z_1^\top, \ldots, z_d^\top)^\top \in R^{2d} \setminus N. \]

Additionally, let ∇²f̄ denote the Hessian of f̄. With a slight abuse of notation, we again use ∥·∥₂ to denote the matrix operator norm.

Condition D4. The homogeneous extension f̄ is twice differentiable on R^{2d} \ N and sup_{z∈T^d} ∥∇²f̄(z)∥₂ < ∞.

Condition D4 implies Condition D2. For g_y defined in (3.1), let ḡ_y : R^{2d} \ N → [0, ∞) denote the homogeneous extension of g_y, defined as

\[ \bar{g}_y(z) := g_y\!\left( \left( \frac{z_1^\top}{\|z_1\|_2}, \ldots, \frac{z_d^\top}{\|z_d\|_2} \right)^{\!\top} \right), \qquad z = (z_1^\top, \ldots, z_d^\top)^\top \in R^{2d} \setminus N. \]

Additionally, let ∇ḡ_y and ∇²ḡ_y denote the gradient and Hessian of ḡ_y, respectively.

Condition D5. The homogeneous extension ḡ_y is twice differentiable on R^{2d} \ N for each y ∈ M, and

\[ \sup_{y \in M} \sup_{z \in T^d} g_y(z) < \infty, \qquad \sup_{y \in M} \sup_{z \in T^d} \|\nabla \bar{g}_y(z)\|_2 < \infty, \qquad \sup_{y \in M} \sup_{z \in T^d} \|\nabla^2 \bar{g}_y(z)\|_2 < \infty. \]

Conditions of this type were employed in earlier works, including Di Marzio et al. (2014) and Petersen and Müller (2019). Condition D5 implies Condition D3.
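The degree-0 homogeneity behind the extensions f̄ and ḡ_y, and the gradient constraint (2.4) it induces, are easy to verify numerically. The following is a small sketch of ours for d = 1, with a hypothetical density-like function f of our choosing:

```python
import numpy as np

def fbar(z, f):
    """Homogeneous degree-0 extension for d = 1: evaluate f at z / ||z||_2,
    defined for z in R^2 away from the origin."""
    return f(z / np.linalg.norm(z))

def num_grad(z, f, eps=1e-6):
    """Central finite-difference gradient of fbar at z."""
    g = np.zeros(2)
    for i in range(2):
        e = np.zeros(2)
        e[i] = eps
        g[i] = (fbar(z + e, f) - fbar(z - e, f)) / (2.0 * eps)
    return g
```

Scaling invariance f̄(t z) = f̄(z) and the Euler-type identity z^⊤ ∇f̄(z) = 0, the d = 1 case of (2.4), both hold by construction.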
Now, we introduce two additional conditions on the metric space M.

Condition M2. For each s ∈ {0, 1}, there exist constants h_⊕ > 0, η_⊕ > 0, C_⊕ > 0 and β_⊕ ∈ (1, ∞) such that

\[ \tilde{M}_{h,s}(x, y) - \tilde{M}_{h,s}(x, \tilde{m}_{h,s}(x)) \ge C_\oplus \cdot d_M(y, \tilde{m}_{h,s}(x))^{\beta_\oplus} \]

whenever ∥h∥₂ < h_⊕ and d_M(y, m̃_{h,s}(x)) < η_⊕, and

\[ M_\oplus(x, y) - M_\oplus(x, m_\oplus(x)) \ge C_\oplus \cdot d_M(y, m_\oplus(x))^{\beta_\oplus} \]

whenever d_M(y, m_⊕(x)) < η_⊕.

Conditions analogous to Condition M2 appeared in earlier works (e.g., Petersen and Müller (2019)). When η_⊕ is sufficiently large, Condition M2 implies Condition M1-(ii). Condition M2 with β_⊕ = 2 holds for the spaces in Examples 2.1 and 2.2 by Proposition 1 of Petersen and Müller (2019) and Section B.3 of Zhou and Müller (2022), respectively. An analogous result holds with β_⊕ = 2 for the space in Example 2.3, under the additional assumptions in Proposition 3 of Petersen and Müller (2019).

To introduce the next condition, let B_M(y, δ) denote the open ball in M centered at y ∈ M with radius δ > 0, and let N(r, B_M(y, δ), d_M) denote the r-covering number of B_M(y, δ) with respect to the metric d_M.

Condition M3. There exist constants r_M > 0 and α_M ∈ (0, 1] such that

\[ \sup_{y \in M :\ d_M(y, m_\oplus(x)) < r_M} \int_0^1 \sqrt{ \log N\left( r\delta, B_M(y, \delta), d_M \right) }\, dr = O(\delta^{-\alpha_M}) \qquad \text{as } \delta \to 0+. \]

Theorem 3.2. Assume that Conditions L, B, D1, D4, D5, M2 and M3 hold. For each s ∈ {0, 1}, it holds that

\[ d_M\left( \hat{m}_{h,s}(x), m_\oplus(x) \right) = O_P\!\left( \left( \|h\|_2^2 + \Big( n \prod_{\ell=1}^d h_\ell \Big)^{-1/2} \right)^{1/(\beta_\oplus - 1)} \right). \]

Now, consider bandwidths of the form h_ℓ = C_ℓ n^{−γ} for constants C_ℓ > 0 and some γ > 0. The error rate in Theorem 3.2 is then optimized by choosing γ = 1/(d + 4), which yields

\[ d_M\left( \hat{m}_{h,s}(x), m_\oplus(x) \right) = O_P\!\left( n^{-2/\{(d+4)(\beta_\oplus - 1)\}} \right). \]

When β_⊕ = 2, which is achieved by various (M, d_M), this rate simplifies to

\[ d_M\left( \hat{m}_{h,s}(x), m_\oplus(x) \right) = O_P\!\left( n^{-2/(d+4)} \right), \]

which coincides with the optimal error rate in nonparametric regression with a d-dimensional predictor. It is worth noting that the local constant estimator m̂_{h,0}(x) attains the same error rate as the local linear estimator m̂_{h,1}(x) for any x ∈ T^d.
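The choice γ = 1/(d + 4) arises from balancing the squared-bias and variance contributions to the error bound; a short derivation, written under the standard form of such bounds in local Fréchet regression, reads:

```latex
% With h_\ell = C_\ell n^{-\gamma}, the two contributions scale as
%   \|h\|_2^2 \asymp n^{-2\gamma}                                    (bias)
%   \bigl(n \textstyle\prod_{\ell=1}^d h_\ell\bigr)^{-1/2} \asymp n^{-(1 - d\gamma)/2}  (variance)
% Balancing the exponents, 2\gamma = (1 - d\gamma)/2, gives \gamma = 1/(d+4),
% so both terms are of order n^{-2/(d+4)} and, after raising to the power
% 1/(\beta_\oplus - 1),
\[
  d_M\bigl(\hat m_{h,s}(x),\, m_\oplus(x)\bigr)
  = O_P\bigl(n^{-2/\{(d+4)(\beta_\oplus - 1)\}}\bigr).
\]
```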
This contrasts with classical nonparametric smoothing results, in which boundary effects typically lead to slower error rates for local constant estimators. Since the torus T^d has no boundary, such effects do not arise in the present setting. Nevertheless, the numerical studies presented in the following sections reveal that the local linear estimator exhibits superior finite-sample performance compared with the local constant estimator.

4 Simulation study

We conducted a simulation study with d = 2 and a spherical response (M = S²). Note that S² is a two-dimensional compact Riemannian manifold without boundary, equipped with the geodesic distance

\[ d_{S^2}(v, u) := \arccos(v^\top u). \]

We compared the performance of our estimators m̂_{h,0} and m̂_{h,1} with the local constant Fréchet regression estimator m̂^{TW}_{h,0} of Tucker and Wu (2025) and the local linear Fréchet regression estimator m̂^{PM}_{h,1} of Petersen and Müller (2019). Although m̂^{PM}_{h,1} is designed for Euclidean predictors, we applied it by interpreting the angles of the toroidal predictors as Euclidean predictors and using a product kernel.

Bandwidth parameters for these methods were selected using 5-fold cross-validation. For the estimators employing multiple bandwidths (m̂_{h,0}, m̂_{h,1} and m̂^{PM}_{h,1}), we adopted a two-stage grid search to reduce the computational burden while maintaining accuracy. In the first stage, we conducted a coarse search over a 10 × 10 grid. For our estimators, the initial grid was set to {0.1 × k : k = 1, …, 10}². For m̂^{PM}_{h,1}, we considered a wider initial grid, namely {0.5 × k : k = 1, …, 20}², reflecting differences in the underlying kernel schemes. In the second stage, we performed a finer search over a 5 × 5 grid centered at the bandwidth vector h_opt^{(1)} selected in the first stage.
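The two-stage search can be sketched as follows; this is our illustration, with `cv_score` standing in for the 5-fold cross-validation error as a function of the bandwidth vector, and a five-point-per-axis refinement at quarter spacing around the first-stage winner assumed:

```python
import numpy as np
from itertools import product

def two_stage_grid_search(cv_score, coarse_axes, refine_step):
    """Two-stage bandwidth selection: coarse product-grid search, then a
    finer 5 x 5 search centered at the first-stage winner h1."""
    # Stage 1: coarse search over the product grid
    h1 = min(product(*coarse_axes), key=lambda h: cv_score(np.array(h)))
    # Stage 2: refinement at the finer spacing, keeping bandwidths positive
    offsets = refine_step * np.arange(-2, 3)
    fine = [np.array(h1) + np.array(o) for o in product(offsets, repeat=len(h1))]
    fine = [h for h in fine if np.all(h > 0)]
    return min(fine, key=cv_score)
```

In practice `cv_score` would refit the estimator on each fold; here it is only a placeholder for that computation.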
The grid spacing in this stage was set to one-quarter of that used in the first stage, namely 0.025 for our estimators and 0.125 for m̂^{PM}_{h,1}. For m̂^{TW}_{h,0}, we searched its optimal bandwidth over the one-dimensional grid {0.01 × k : k = 1, …, 50}. We employed the von Mises kernel for our estimators, and the Epanechnikov kernel for m̂^{TW}_{h,0} and m̂^{PM}_{h,1}.

We specified the regression function m_⊕ : T² → S² as

\[ m_\oplus(x) = \frac{(\cos\psi, \sin\phi, \sin\psi \cos\phi)^\top}{\|(\cos\psi, \sin\phi, \sin\psi \cos\phi)^\top\|_2}, \qquad x = (\cos\psi, \sin\psi, \cos\phi, \sin\phi)^\top \in T^2, \]

where ψ, ϕ ∈ [−π, π). For sample sizes n ∈ {50, 100, 200} and noise levels σ ∈ {0.1, 0.25}, we generated i.i.d. random samples {(X^{(i)}, Y^{(i)})}_{i=1}^n over R = 100 Monte Carlo replications according to

\[ X^{(i)} \sim U(T^2), \qquad Y^{(i)} \mid X^{(i)} \sim \mathrm{vMF}\left( m_\oplus(X^{(i)}), 1/\sigma \right), \]

where U(T²) denotes the uniform distribution on T² and vMF(μ, κ) denotes the von Mises-Fisher distribution with mean direction μ ∈ S² and concentration parameter κ > 0. Note that vMF(μ, κ) exhibits larger variability for smaller values of κ. We evaluated the performance using the mean integrated squared error (MISE), defined as

\[ \frac{1}{R} \sum_{r=1}^R \int_{T^2} d_{S^2}^2\!\left( \hat{m}_\oplus^{[r]}(x), m_\oplus(x) \right) d\omega_1^2(x) = \frac{1}{R} \sum_{r=1}^R \int_{-\pi}^{\pi} \int_{-\pi}^{\pi} d_{S^2}^2\!\left( \hat{m}_\oplus^{[r]}(\cos\psi, \sin\psi, \cos\phi, \sin\phi),\ m_\oplus(\cos\psi, \sin\psi, \cos\phi, \sin\phi) \right) d\phi\, d\psi, \]

where m̂_⊕^{[r]}(x) denotes the estimator evaluated at x and computed from the r-th Monte Carlo replication.

Table 1 summarizes the simulation results. They show that our local linear estimator m̂_{h,1} attains the smallest MISE across all scenarios. In contrast, the estimator m̂^{PM}_{h,1} exhibits markedly inferior performance, as it fails to account for the intrinsic geometry of T².
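For concreteness, the regression function and geodesic distance of this simulation can be coded directly (a sketch of the displayed formulas, not the full simulation):

```python
import numpy as np

def m_oplus(psi, phi):
    """Regression function of the simulation: v / ||v||_2 with
    v = (cos psi, sin phi, sin psi * cos phi), for
    x = (cos psi, sin psi, cos phi, sin phi) on T^2."""
    v = np.array([np.cos(psi), np.sin(phi), np.sin(psi) * np.cos(phi)])
    return v / np.linalg.norm(v)

def geodesic_s2(v, u):
    """Geodesic distance on S^2; the clip guards against round-off
    pushing the inner product slightly outside [-1, 1]."""
    return float(np.arccos(np.clip(v @ u, -1.0, 1.0)))
```

The normalization is well defined here since (cos ψ, sin ϕ, sin ψ cos ϕ) never vanishes on T².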
Additionally, our local constant estimator m̂_{h,0} outperforms m̂^{TW}_{h,0} at the lower noise level, whereas the opposite pattern is observed at the higher noise level. Overall, these results indicate that the proposed methods are promising options for Fréchet regression with toroidal predictors.

Table 1: MISE comparison in spherical-toroidal regression.

σ      n     m̂_{h,0}   m̂_{h,1}   m̂^{TW}_{h,0}   m̂^{PM}_{h,1}
0.1    50    4.672     3.026     4.861          30.321
       100   2.057     1.273     2.237          15.662
       200   1.034     0.635     1.148           6.949
0.25   50    9.523     8.373     8.448          33.968
       100   4.065     3.435     3.930          18.083
       200   2.056     1.623     2.031           8.235

5 Real data analysis

We analyzed the New York taxi network data obtained from https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page. Following Zhou and Müller (2022), we focused on yellow taxi trips within Manhattan, which were aggregated into 13 regions. To investigate the temporal dynamics of taxi demand, we constructed a 13 × 13 graph Laplacian for each hour on Sundays from January 1, 2021 to December 31, 2024. Specifically, for each hour, we formed a 13 × 13 adjacency matrix whose (k, ℓ)-th entry represents the number of yellow taxi trips between the k-th and ℓ-th regions. Each adjacency matrix was then transformed into a graph Laplacian.

We modeled the predictor space as the two-dimensional torus T² = S¹ × S¹ to capture the dual periodic structure induced by daily and annual cycles. In particular, the hour i₁ ∈ {0, …, 23} of the day and the i₂-th day of the year were encoded as

\[ X = \left( \cos\frac{2\pi(i_1 + 0.5)}{24},\ \sin\frac{2\pi(i_1 + 0.5)}{24},\ \cos\frac{2\pi(i_2 - 0.5)}{D},\ \sin\frac{2\pi(i_2 - 0.5)}{D} \right)^{\!\top} \in T^2, \]

where D = 365 for non-leap years and D = 366 for leap years. The response variable Y ∈ L₁₃ corresponds to the associated graph Laplacian. Data from 2021 and 2022 were used for training, while data from 2023 were used for bandwidth validation.
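The time encoding and the adjacency-to-Laplacian step can be sketched as follows (our illustration, with hypothetical function names):

```python
import numpy as np

def encode_time(hour, day_of_year, days_in_year=365):
    """Map (hour of day, day of year) to a point on T^2 = S^1 x S^1,
    following the encoding displayed above."""
    a = 2.0 * np.pi * (hour + 0.5) / 24.0
    b = 2.0 * np.pi * (day_of_year - 0.5) / days_in_year
    return np.array([np.cos(a), np.sin(a), np.cos(b), np.sin(b)])

def graph_laplacian(W):
    """Graph Laplacian of Example 2.2: L_ii = sum_k w_ik and
    L_ij = -w_ij for i != j; W is symmetric with zero diagonal."""
    return np.diag(W.sum(axis=1)) - W
```

The half-unit shifts place each hour (or day) at the midpoint of its interval on the circle, so that, e.g., hour 23 and hour 0 are adjacent on S¹.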
We compared the predictive performance of the four estimators m̂_{h,0}, m̂_{h,1}, m̂^{TW}_{h,0} and m̂^{PM}_{h,1} considered in Section 4 on data from 2024. For bandwidth selection of m̂_{h,0}, m̂_{h,1} and m̂^{PM}_{h,1}, we again employed a two-stage grid search. In the first stage, we used the grid {0.02 × k : k = 1, …, 10} × {0.01 × k : k = 1, …, 10} for our estimators, and {0.02 × k : k = 1, …, 20} × {0.01 × k : k = 1, …, 20} for m̂^{PM}_{h,1}. The second stage was performed in the same manner as described in Section 4. For m̂^{TW}_{h,0}, we searched its bandwidth over the grid {0.004 × k : k = 1, …, 50}.

Figure 1: Fitted yellow taxi networks in Manhattan at selected hours and calendar dates. Edge thickness is proportional to absolute ridership, while color indicates relative ridership compared with the overall data average.

The resulting average squared prediction errors, computed using the Frobenius distance, were 8.218 × 10⁵ for m̂_{h,0}, 8.100 × 10⁵ for m̂_{h,1}, 8.937 × 10⁵ for m̂^{TW}_{h,0} and 8.393 × 10⁵ for m̂^{PM}_{h,1}. These results confirm that our local linear approach outperforms the local constant approaches and highlight the importance of respecting the underlying toroidal geometry.

To interpret the effect of X on Y, we refitted the model to the full dataset using the best-performing estimator m̂_{h,1}. Figure 1 displays the fitted taxi networks at selected hours and calendar dates when the corresponding day is a Sunday. Across all dates, taxi ridership is high during late-night and evening hours and low in the early morning, likely reflecting increased demand during periods of social activity and reduced public transportation service.
In addition, compared with other periods, taxi ridership in late December appears to be relatively lower, which may be attributed to colder weather and a greater tendency for people to stay at home during the holiday season. Overall, this figure illustrates that taxi demand is strongly influenced by both the time of day and the calendar date.

Acknowledgements

Chang Jun Im was supported by the National Research Foundation of Korea grant funded by the Korea government (MSIT) (No. RS-2025-00515381). Jeong Min Jeon was supported by the National Research Foundation of Korea grant funded by the Korea government (MSIT) (No. RS-2023-00211910).

References

Afsari, B. (2011). Riemannian $L^p$ center of mass: Existence, uniqueness, and convexity. Proceedings of the American Mathematical Society, 139, 655-673.

Bai, Z. D., Radhakrishna Rao, C. and Zhao, L. C. (1988). Kernel estimators of density function of directional data. Journal of Multivariate Analysis, 27, 24-39.

Bhattacharjee, S. and Müller, H.-G. (2023). Single index Fréchet regression. Annals of Statistics, 51, 1770-1798.

Biswas, S. and Banerjee, B. (2025). A semi-parametric torus-to-torus regression model with geometric loss: Application to cyclone data.

Charlier, B. (2013). Necessary and sufficient condition for the existence of a Fréchet mean on the circle. ESAIM: Probability and Statistics, 17, 635-649.

Di Marzio, M., Panzera, A. and Taylor, C. C. (2009). Local polynomial regression for circular predictors. Statistics and Probability Letters, 79, 2066-2075.

Di Marzio, M., Panzera, A. and Taylor, C. C. (2011). Kernel density estimation on the torus. Journal of Statistical Planning and Inference, 141, 2156-2173.

Di Marzio, M., Panzera, A. and Taylor, C. C. (2014). Nonparametric regression for spherical data. Journal of the American Statistical Association, 109, 748-763.

Eltzner, B., Huckemann, S. and Mardia, K. V. (2018). Torus principal component analysis with applications to RNA structure. Annals of Applied Statistics, 12, 1332-1359.

García-Portugués, E., Crujeiras, R. M. and González-Manteiga, W. (2013). Kernel density estimation for directional-linear data. Journal of Multivariate Analysis, 121, 152-175.

Hall, P., Watson, G. S. and Cabrera, J. (1987). Kernel density estimation with spherical data. Biometrika, 74, 751-762.

Im, C. J., Jeon, J. M. and Park, B. U. (2025). Local Fréchet regression with spherical predictors. Electronic Journal of Statistics.

Jeon, J. M., Park, B. U. and Van Keilegom, I. (2021). Additive regression for non-Euclidean responses and predictors. Annals of Statistics, 49, 2611-2641.

Jeon, J. M., Park, B. U. and Van Keilegom, I. (2022). Nonparametric regression on Lie groups with measurement errors. Annals of Statistics, 50, 2973-3008.

Jung, S., Park, K. and Kim, B. (2021). Clustering on the torus by conformal prediction. Annals of Applied Statistics, 15, 1583-1603.

Le, H. and Kendall, D. G. (1993). The Riemannian structure of Euclidean shape spaces: A novel environment for statistics. Annals of Statistics, 21, 1225-1271.

Lin, Z. and Müller, H.-G. (2021). Total variation regularized Fréchet regression for metric-space valued data. Annals of Statistics, 49, 3510-3533.

Mardia, K. V. and Jupp, P. E. (2000). Directional Statistics. John Wiley & Sons.

Panaretos, V. M. and Zemel, Y. (2020). An Invitation to Statistics in Wasserstein Space. Springer.

Pelletier, B. (2006). Non-parametric regression estimation on closed Riemannian manifolds. Journal of Nonparametric Statistics, 18, 57-67.

Petersen, A. and Müller, H.-G. (2019). Fréchet regression for random objects with Euclidean predictors. Annals of Statistics, 47, 691-719.

Ruppert, D. and Wand, M. P. (1994). Multivariate locally weighted least squares regression. The Annals of Statistics, 22, 1346-1370.

Sanborn, S., Mathe, J., Papillon, M., Buracas, D., Lillemark, H. J., Shewmake, C., Bertics, A., Pennec, X. and Miolane, N. (2024). Beyond Euclid: An illustrated guide to modern machine learning with geometric, topological, and algebraic structures.

Steyer, L., Stöcker, A., Greven, S. and Alzheimer's Disease Neuroimaging Initiative. (2025). Model-based Fréchet regression in (quotient) metric spaces with a focus on elastic curves. Journal of Multivariate Analysis, 211, 105515.

Tucker, D. C. and Wu, Y. (2025). Partially-global Fréchet regression. Statistica Sinica, 35, 713-736.

Tucker, D. C., Wu, Y. and Müller, H.-G. (2023). Variable selection for global Fréchet regression. Journal of the American Statistical Association, 118, 1023-1037.

Xu, D. and Wang, Y. (2023). Density estimation for toroidal data using semiparametric mixtures. Statistics and Computing, 33, 140.

Zhou, Y. and Müller, H.-G. (2022). Network regression with graph Laplacians. Journal of Machine Learning Research, 23, 1-41.

Corresponding Author

Jeong Min Jeon
Department of Statistics and School of Transdisciplinary Innovations, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul 08826, South Korea
E-mail address: jeongmin.jeon.stat@gmail.com

Supplementary Material to "Local Fréchet regression with toroidal predictors"

Chang Jun Im and Jeong Min Jeon
Seoul National University, South Korea

The Supplementary Material consists of two parts. In the first part, we provide the proofs of Theorem 3.1 and Theorem 3.2 for the local constant estimator. In the second part, we prove them for the local linear estimator.

A Proofs of Theorem 3.1 and Theorem 3.2 for $\hat m_{h,0}(x)$

To derive the consistency and error rates for $d_{\mathcal M}(\hat m_{h,0}(x), m_\oplus(x))$, we establish the properties of $d_{\mathcal M}(\tilde m_{h,0}(x), m_\oplus(x))$ and $d_{\mathcal M}(\hat m_{h,0}(x), \tilde m_{h,0}(x))$.
The latter properties are obtained by investigating $\tilde M_{h,0}(x,y) - M_\oplus(x,y)$ and $\hat M_{h,0}(x,y) - \tilde M_{h,0}(x,y)$, respectively.

First, we collect some notation. Let $\mathbb{N}_{\geq 0}$ denote the set of all non-negative integers. We write $j = (j_1,\ldots,j_d)^\top$ for $j_\ell \in \mathbb{N}_{\geq 0}$. For $k \in \{1,2\}$, we define
\[
c_{h,j,k}(L) := 2^d \prod_{\ell=1}^d \int_0^\pi L^k\Big(\frac{1-\cos\theta_\ell}{h_\ell^2}\Big)\,\theta_\ell^{j_\ell}\, d\theta_\ell. \tag{A.1}
\]
To simplify (A.1), we define a function $L_h : \mathbb{R}^d \to [0,\infty)$ as
\[
L_h(u) := \prod_{\ell=1}^d L\Big(\frac{1-\cos u_\ell}{h_\ell^2}\Big), \quad u = (u_1,\ldots,u_d)^\top.
\]
Then, we get the expression
\[
c_{h,j,k}(L) = 2^d \int_{[0,\pi)^d} L_h^k(\theta)\,\theta^j\, d\theta,
\]
where $u^j = \prod_{\ell=1}^d u_\ell^{j_\ell}$. Note that
\[
\int_{[-\pi,\pi)^d} L_h^k(\theta)\,\theta^j\, d\theta =
\begin{cases}
0 & \text{some } j_\ell \text{ is an odd number}, \\
c_{h,j,k}(L) & \text{otherwise}.
\end{cases} \tag{A.2}
\]
Also, it holds that
\[
L_{x,h}(\Phi_x(\theta)) = \prod_{\ell=1}^d L\Big(\frac{1 - (x_\ell \cos\theta_\ell + (Rx_\ell)\sin\theta_\ell)^\top x_\ell}{h_\ell^2}\Big) = L_h(\theta) \tag{A.3}
\]
for every $\theta \in [-\pi,\pi)^d$, where $L_{x,h}$ and $\Phi_x$ are defined in (2.1) and (2.5), respectively. We write $\rho(h) := \prod_{\ell=1}^d h_\ell$ and $|j| := \sum_{\ell=1}^d j_\ell$.

A.1 Lemmas for the proof of Theorem 3.1

Lemma A.1. Let $h : \mathbb{T}^d \to \mathbb{R}$ be an integrable function. Then, for any $x \in \mathbb{T}^d$, it holds that
\[
\int_{\mathbb{T}^d} h(z)\,\omega_1^d(dz) = \int_{[-\pi,\pi)^d} h(\Phi_x(\theta))\, d\theta.
\]

Proof. By Lemma 2 of García-Portugués et al. (2013), we get
\[
\int_{S^1} h_\ell(z_\ell)\,\omega_1(dz_\ell) = \int_{-\pi}^{\pi} h_\ell(\Phi_{x,\ell}(\theta_\ell))\, d\theta_\ell, \quad \ell = 1,\ldots,d,
\]
for any integrable function $h_\ell : S^1 \to \mathbb{R}$. The lemma follows from Fubini's theorem. □

Lemma A.2. Assume that the condition L holds. Then, for any $j = (j_1,\ldots,j_d)$ with $j_\ell \in \mathbb{N}_{\geq 0}$ and $k \in \{1,2\}$, it holds that
\[
\lim_{h \to 0_d} \frac{c_{h,j,k}(L)}{h^j \rho(h)} = 2^{\frac{3d+|j|}{2}} \prod_{\ell=1}^d \int_0^\infty L^k(r^2)\, r^{j_\ell}\, dr.
\]

Proof of Lemma A.2. By Lemma B.2 of Im et al. (2025), we get
\[
\lim_{h_\ell \to 0} 2\Big(\int_0^\pi L^k\Big(\frac{1-\cos\theta_\ell}{h_\ell^2}\Big)\theta_\ell^{j_\ell}\, d\theta_\ell\Big)\, h_\ell^{-(j_\ell+1)} = 2^{\frac{j_\ell+3}{2}} \int_0^\infty L^k(r^2)\, r^{j_\ell}\, dr, \quad \ell = 1,\ldots,d. \tag{A.4}
\]
By taking $\prod_{\ell=1}^d$ on both sides of (A.4), we get the desired result. □

Lemma A.3. Assume that the conditions L and D2 hold and that $\lim_{n\to\infty}\|h\|_2 = 0$. Then, for any $k \in \{1,2\}$, it holds that
\[
\mathbb{E}\big[L_{x,h}^k(X)\big] - c_{h,0_d,k}(L)\, f(x) = o(\rho(h)).
\]

Proof of Lemma A.3. Combining (A.3) and Lemma A.1, we get
\[
\mathbb{E}\big[L_{x,h}^k(X)\big] = \int_{\mathbb{T}^d} L_{x,h}^k(z)\, f(z)\,\omega_1^d(dz) = \int_{[-\pi,\pi)^d} L_{x,h}^k(\Phi_x(\theta))\, f(\Phi_x(\theta))\, d\theta = \int_{[-\pi,\pi)^d} L_h^k(\theta)\, f(\Phi_x(\theta))\, d\theta.
\]
Also, (A.2) implies that
\[
\mathbb{E}\big[L_{x,h}^k(X)\big] - c_{h,0_d,k}(L)\, f(x) = \int_{[-\pi,\pi)^d} L_h^k(\theta)\big(f(\Phi_x(\theta)) - f(x)\big)\, d\theta. \tag{A.5}
\]
Let $\epsilon > 0$ be any given constant. By the condition D2, there exists a constant $\delta_\epsilon \in (0,\pi)$, depending on $\epsilon$, such that
\[
\|z - x\|_2 < \delta_\epsilon \;\Rightarrow\; |f(z) - f(x)| < \epsilon, \quad z \in \mathbb{T}^d. \tag{A.6}
\]
For any $\theta \in [-\pi,\pi)^d$, it holds that
\[
\|\Phi_x(\theta) - x\|_2^2 = 2\sum_{\ell=1}^d (1-\cos\theta_\ell) = 4\sum_{\ell=1}^d \sin^2\Big(\frac{\theta_\ell}{2}\Big) \leq \|\theta\|_2^2. \tag{A.7}
\]
We define sets $E_0, E_1, \ldots, E_d \subset [-\pi,\pi)^d$ as
\[
E_0 := \Big(-\frac{\delta_\epsilon}{\sqrt d}, \frac{\delta_\epsilon}{\sqrt d}\Big)^d, \quad E_m := \Big\{(z_1,\ldots,z_d)^\top \in [-\pi,\pi)^d : |z_m| \geq \frac{\delta_\epsilon}{\sqrt d}\Big\}, \quad m = 1,\ldots,d.
\]
Note that $[-\pi,\pi)^d = \bigcup_{m=0}^d E_m$. By (A.5), we get
\[
\big|\mathbb{E}\big[L_{x,h}^k(X)\big] - c_{h,0_d,k}(L)\, f(x)\big| \leq \sum_{m=0}^d \int_{E_m} L_h^k(\theta)\, |f(\Phi_x(\theta)) - f(x)|\, d\theta. \tag{A.8}
\]
By (A.7), we also get
\[
\|\Phi_x(\theta) - x\|_2 \leq \|\theta\|_2 \leq \delta_\epsilon \tag{A.9}
\]
for any $\theta \in E_0$. Combining (A.6) and (A.9), we get
\[
|f(\Phi_x(\theta)) - f(x)| \leq
\begin{cases}
\epsilon & \text{if } \theta \in E_0, \\
2\sup_{z \in \mathbb{T}^d} f(z) & \text{if } \theta \notin E_0.
\end{cases} \tag{A.10}
\]
Combining (A.2), (A.8) and (A.10), we also get
\[
\big|\mathbb{E}\big[L_{x,h}^k(X)\big] - c_{h,0_d,k}(L)\, f(x)\big|
\leq \epsilon \int_{E_0} L_h^k(\theta)\, d\theta + 2\sup_{z \in \mathbb{T}^d} f(z) \sum_{m=1}^d \int_{E_m} L_h^k(\theta)\, d\theta
\leq \epsilon \int_{[-\pi,\pi)^d} L_h^k(\theta)\, d\theta + 2\sup_{z \in \mathbb{T}^d} f(z) \sum_{m=1}^d \int_{E_m} L_h^k(\theta)\, d\theta
= \epsilon\, c_{h,0_d,k}(L) + 2\sup_{z \in \mathbb{T}^d} f(z) \sum_{m=1}^d \int_{E_m} L_h^k(\theta)\, d\theta. \tag{A.11}
\]
We write $\theta_{-m} = (\theta_1,\ldots,\theta_{m-1},\theta_{m+1},\ldots,\theta_d)^\top \in \mathbb{R}^{d-1}$. It holds that
\[
\int_{E_m} L_h^k(\theta)\, d\theta = \Big(\int_{[-\pi,\pi)^{d-1}} \prod_{\ell \neq m} L^k\Big(\frac{1-\cos\theta_\ell}{h_\ell^2}\Big)\, d\theta_{-m}\Big)\Big(\int_{[-\pi,\pi)\setminus(-\delta_\epsilon/\sqrt d,\,\delta_\epsilon/\sqrt d)} L^k\Big(\frac{1-\cos\theta_m}{h_m^2}\Big)\, d\theta_m\Big)
= 2^d\Big(\int_{[0,\pi)^{d-1}} \prod_{\ell \neq m} L^k\Big(\frac{1-\cos\theta_\ell}{h_\ell^2}\Big)\, d\theta_{-m}\Big)\Big(\int_{\delta_\epsilon/\sqrt d}^{\pi} L^k\Big(\frac{1-\cos\theta_m}{h_m^2}\Big)\, d\theta_m\Big). \tag{A.12}
\]
By a modification of Lemma A.2, we get
\[
2^{d-1}\int_{[0,\pi)^{d-1}} \prod_{\ell \neq m} L^k\Big(\frac{1-\cos\theta_\ell}{h_\ell^2}\Big)\, d\theta_{-m} = O\Big(\prod_{1 \leq \ell \neq m \leq d} h_\ell\Big). \tag{A.13}
\]
Also, Lemma B.2 of Im et al. (2025) implies that
\[
2\int_{\delta_\epsilon/\sqrt d}^{\pi} L^k\Big(\frac{1-\cos\theta_m}{h_m^2}\Big)\, d\theta_m = o(h_m). \tag{A.14}
\]
Combining (A.12), (A.13) and (A.14), we get
\[
\int_{E_m} L_h^k(\theta)\, d\theta = o(\rho(h)). \tag{A.15}
\]
Combining (A.11), (A.15) and Lemma A.2, we also get
\[
\big|\mathbb{E}\big[L_{x,h}^k(X)\big] - c_{h,0_d,k}(L)\, f(x)\big| \leq \epsilon \cdot O(\rho(h)) + 2\sup_{z \in \mathbb{T}^d} f(z) \sum_{m=1}^d o(\rho(h)). \tag{A.16}
\]
Since (A.16) holds for any $\epsilon \in (0,\infty)$, we get the desired result. □

Lemma A.4. Assume that the conditions L, D2 and D3 hold and that $\lim_{n\to\infty}\|h\|_2=0$. Then, for any $k \in \{1,2\}$, it holds that
\[
\sup_{y \in \mathcal M}\big|\mathbb{E}\big[L_{x,h}^k(X)\, g_y(X)\big] - c_{h,0_d,k}(L)\,(f \cdot g_y)(x)\big| = o(\rho(h)).
\]

Proof of Lemma A.4. The lemma follows by arguing as in the proof of Lemma A.3. □

Lemma A.5. Assume that the conditions L and D1–D3 hold and that $\lim_{n\to\infty}\|h\|_2=0$. Then, it holds that
\[
\sup_{y \in \mathcal M}\big|\tilde M_{h,0}(x,y) - M_\oplus(x,y)\big| = o(1).
\]

Proof of Lemma A.5. Combining Lemma A.3 and Lemma A.4, we get
\[
\sup_{y \in \mathcal M}\big|\mathbb{E}[L_{x,h}(X)\, g_y(X)] - g_y(x)\,\mathbb{E}[L_{x,h}(X)]\big| = o(\rho(h)). \tag{A.17}
\]
Lemma A.2 and Lemma A.3 imply that
\[
\big(c_{h,0_d,1}(L)\, f(x)\big)^{-1} = O\big(\rho^{-1}(h)\big), \quad c_{h,0_d,1}(L)\, f(x)\,\big(\mathbb{E}[L_{x,h}(X)]\big)^{-1} = O(1), \quad \big(\mathbb{E}[L_{x,h}(X)]\big)^{-1} = O\big(\rho^{-1}(h)\big). \tag{A.18}
\]
Combining (A.17) and (A.18), we get
\[
\sup_{y \in \mathcal M}\Big|\frac{\mathbb{E}[L_{x,h}(X)\, g_y(X)]}{\mathbb{E}[L_{x,h}(X)]} - g_y(x)\Big| = o(1). \tag{A.19}
\]
Note that
\[
M_\oplus(x,y) = \int_{\mathcal M} d_{\mathcal M}^2(y,w)\, dP_{Y|X=x}(w) = \int_{\mathcal M} d_{\mathcal M}^2(y,w)\, g_w(x)\, dP_Y(w),
\]
\[
\tilde M_{h,0}(x,y) = \frac{\mathbb{E}[L_{x,h}(X)\, M_\oplus(X,y)]}{\mathbb{E}[L_{x,h}(X)]} = \int_{\mathcal M} d_{\mathcal M}^2(y,w)\,\frac{\mathbb{E}[L_{x,h}(X)\, g_w(X)]}{\mathbb{E}[L_{x,h}(X)]}\, dP_Y(w), \tag{A.20}
\]
where the last equality in (A.20) follows from Fubini's theorem. Since $d_{\mathcal M}^2(y,w) \leq \operatorname{diam}(\mathcal M)^2 < \infty$, the lemma follows from (A.19) and (A.20). □

Lemma A.6. Assume that the conditions L, D1–D3 and M1 hold and that $\lim_{n\to\infty}\|h\|_2=0$. Then, it holds that
\[
d_{\mathcal M}(\tilde m_{h,0}(x), m_\oplus(x)) = o(1).
\]

Proof of Lemma A.6. The lemma follows by arguing as in the proof of Lemma B.7 of Im et al. (2025) and using Lemma A.5. □

Lemma A.7. Assume that the conditions L, B, D1 and D2 hold. Then, it holds that
\[
\sup_{y \in \mathcal M}\big|\hat M_{h,0}(x,y) - \tilde M_{h,0}(x,y)\big| = o_P(1).
\]

Proof of Lemma A.7. Note that
\[
\big|\hat M_{h,0}(x,y) - \tilde M_{h,0}(x,y)\big| \leq \frac{\operatorname{diam}(\mathcal M)^2}{\mathbb{E}[L_{x,h}(X)]}\Big|n^{-1}\sum_{i=1}^n L_{x,h}\big(X^{(i)}\big) - \mathbb{E}[L_{x,h}(X)]\Big| + \frac{1}{\mathbb{E}[L_{x,h}(X)]}\Big|n^{-1}\sum_{i=1}^n L_{x,h}\big(X^{(i)}\big)\, d_{\mathcal M}^2\big(Y^{(i)},y\big) - \mathbb{E}\big[L_{x,h}(X)\, d_{\mathcal M}^2(Y,y)\big]\Big|. \tag{A.21}
\]
Lemma A.2 and Lemma A.3 imply that
\[
n^{-1}\sum_{i=1}^n L_{x,h}\big(X^{(i)}\big) - \mathbb{E}[L_{x,h}(X)] = O_P\Big(\sqrt{n^{-1}\operatorname{Var}[L_{x,h}(X)]}\Big) = O_P\big(n^{-\frac12}\rho^{\frac12}(h)\big),
\]
\[
n^{-1}\sum_{i=1}^n L_{x,h}\big(X^{(i)}\big)\, d_{\mathcal M}^2\big(Y^{(i)},y\big) - \mathbb{E}\big[L_{x,h}(X)\, d_{\mathcal M}^2(Y,y)\big] = O_P\Big(\sqrt{n^{-1}\operatorname{Var}\big[L_{x,h}(X)\, d_{\mathcal M}^2(Y,y)\big]}\Big) = O_P\big(n^{-\frac12}\rho^{\frac12}(h)\big). \tag{A.22}
\]
Combining (A.18), (A.21) and (A.22), we get
\[
\big|\hat M_{h,0}(x,y) - \tilde M_{h,0}(x,y)\big| = O_P\big(n^{-\frac12}\rho^{-\frac12}(h)\big).
\]
The rest of the proof proceeds as in the proof of Lemma B.8 of Im et al. (2025). □

Lemma A.8. Assume that the conditions L, B, D1, D2 and M1 hold. Then, it holds that
\[
d_{\mathcal M}(\hat m_{h,0}(x), \tilde m_{h,0}(x)) = o_P(1).
\]

Proof of Lemma A.8. The lemma follows by arguing as in the proof of Lemma B.9 of Im et al. (2025). □

A.2 Proof of Theorem 3.1 for $\hat m_{h,0}(x)$

The theorem follows from Lemma A.6 and Lemma A.8.

A.3 Lemmas for the proof of Theorem 3.2

Let $g : \mathbb{T}^d \to \mathbb{R}$ be a sufficiently smooth function and $\bar g : \mathbb{R}^{2d}\setminus N \to \mathbb{R}$ be its homogeneous extension, as defined in (2.2). Differentiating once more in (2.3) with respect to $t_k$ gives
\[
\frac{\partial^2}{\partial t_\ell\,\partial t_k}\,\bar g(D_z t) = (D_z e_\ell)^\top \nabla^2 \bar g(D_z t)\, D_z e_k = 0, \quad k,\ell = 1,\ldots,d,
\]
where $\nabla^2 \bar g(D_z t)$ denotes the Hessian of $\bar g$ at $D_z t$. Let $O_p$ denote the $p \times p$ zero matrix. Collecting these terms into matrix form yields
\[
D_z^\top \nabla^2 \bar g(D_z t)\, D_z = O_d.
\]
By substituting $t = (1,\ldots,1)^\top \in (0,\infty)^d$, we get
\[
D_z^\top \nabla^2 \bar g(z)\, D_z = O_d. \tag{A.23}
\]
From (2.4) and (A.23), we get
\[
D_z^\top \nabla \bar f(z) = 0_d, \quad D_z^\top \nabla^2 \bar f(z)\, D_z = O_d, \quad D_z^\top \nabla\big(\bar f \cdot \bar g_y\big)(z) = 0_d, \quad D_z^\top \nabla^2\big(\bar f \cdot \bar g_y\big)(z)\, D_z = O_d, \tag{A.24}
\]
where $\nabla^2 \bar f$ and $\nabla^2(\bar f \cdot \bar g_y)$ denote the Hessians of $\bar f$ and $\bar f \cdot \bar g_y$, respectively. Lemma A.9 below provides a Taylor-like expansion for the density function $f$ of $X$.

Lemma A.9. Assume that $\bar f$ is differentiable on $\mathbb{R}^{2d}\setminus N$. Then, for any $\theta \in [-\pi,\pi)^d$, it holds that
\[
|f(\Phi_x(\theta)) - f(x)| \leq \|\theta\|_2 \sup_{t \in [0,1]} \big\|\nabla \bar f(\Phi_x(t\theta))\big\|_2. \tag{A.25}
\]
If $\bar f$ is twice differentiable on $\mathbb{R}^{2d}\setminus N$, then it holds that
\[
\big|f(\Phi_x(\theta)) - f(x) - (R_x\theta)^\top \nabla \bar f(x)\big| \leq \frac{1}{2}\|\theta\|_2^2 \sup_{t \in [0,1]} \big\|\nabla^2 \bar f(\Phi_x(t\theta))\big\|_2. \tag{A.26}
\]

Proof of Lemma A.9. We first prove (A.25). Note that the Jacobian matrix of $\Phi_x : [-\pi,\pi)^d \to \mathbb{T}^d$ is evaluated as
\[
J_{\Phi_x}(\theta) =
\begin{pmatrix}
-x_1\sin\theta_1 + (Rx_1)\cos\theta_1 & \cdots & 0_2 \\
\vdots & \ddots & \vdots \\
0_2 & \cdots & -x_d\sin\theta_d + (Rx_d)\cos\theta_d
\end{pmatrix}
= R_{\Phi_x(\theta)}. \tag{A.27}
\]
By (A.27), the gradient $\nabla(\bar f \circ \Phi_x)$ of $\bar f \circ \Phi_x$ is evaluated as
\[
\nabla\big(\bar f \circ \Phi_x\big)(\theta) = J_{\Phi_x}(\theta)^\top \nabla \bar f(\Phi_x(\theta)) = R_{\Phi_x(\theta)}^\top \nabla \bar f(\Phi_x(\theta)). \tag{A.28}
\]
Taylor's remainder theorem and (A.28) imply that there exists $t^*_\theta \in [0,1]$ such that
\[
f(\Phi_x(\theta)) - f(x) = \big(R_{\Phi_x(t^*_\theta\theta)}\,\theta\big)^\top \nabla \bar f(\Phi_x(t^*_\theta\theta)). \tag{A.29}
\]
Note that $\|R_{\Phi_x(t^*_\theta\theta)}\,\theta\|_2 = \|\theta\|_2$. This and (A.29) imply that
\[
|f(\Phi_x(\theta)) - f(x)| \leq \big\|R_{\Phi_x(t^*_\theta\theta)}\,\theta\big\|_2\,\big\|\nabla \bar f(\Phi_x(t^*_\theta\theta))\big\|_2 \leq \|\theta\|_2 \sup_{t \in [0,1]} \big\|\nabla \bar f(\Phi_x(t\theta))\big\|_2,
\]
which gives (A.25).

Now, we prove (A.26). Since $\Phi_x(0_d) = x$, (A.28) implies that
\[
\nabla\big(\bar f \circ \Phi_x\big)(0_d) = R_{\Phi_x(0_d)}^\top \nabla \bar f(\Phi_x(0_d)) = R_x^\top \nabla \bar f(x).
\]
By the second-order chain rule for vector-valued compositions, the Hessian of $\bar f \circ \Phi_x$ can be written as
\[
\nabla^2\big(\bar f \circ \Phi_x\big)(\theta) = J_{\Phi_x}(\theta)^\top \nabla^2 \bar f(\Phi_x(\theta))\, J_{\Phi_x}(\theta) + \sum_{k=1}^{2d} \frac{\partial \bar f}{\partial z_k}(\Phi_x(\theta))\,\nabla^2(\Phi_x)_k(\theta). \tag{A.30}
\]
We show that the second term in (A.30) vanishes. Note that
\[
\frac{\partial^2}{\partial\theta_\ell^2}\Phi_{x,\ell}(\theta_\ell) = -\Phi_{x,\ell}(\theta_\ell), \quad \frac{\partial^2}{\partial\theta_\ell\,\partial\theta_k}\Phi_x(\theta) = 0_{2d}, \quad \ell \neq k.
\]
Hence, all nonzero second derivatives of $\Phi_x$ lie in the radial directions $\Phi_{x,\ell}(\theta_\ell)$. On the other hand, (A.24) implies that
\[
z_\ell^\top \frac{\partial \bar f(z)}{\partial z_\ell} = 0, \quad \ell = 1,\ldots,d.
\]
Evaluating at $z = \Phi_x(\theta)$, each term in the summation of (A.30) reduces to a blockwise inner product between $\Phi_{x,\ell}(\theta_\ell)$ and $\partial\bar f/\partial z_\ell$, which vanishes by the above constraint. Hence, the second term in (A.30) is identically zero. Consequently,
\[
\nabla^2\big(\bar f \circ \Phi_x\big)(\theta) = R_{\Phi_x(\theta)}^\top \nabla^2 \bar f(\Phi_x(\theta))\, R_{\Phi_x(\theta)}.
\]
By Taylor's remainder theorem, there exists $t^{**}_\theta \in [0,1]$ such that
\[
f(\Phi_x(\theta)) - f(x) - (R_x\theta)^\top \nabla \bar f(x) = \frac{1}{2}\,\theta^\top R_{\Phi_x(t^{**}_\theta\theta)}^\top \nabla^2 \bar f(\Phi_x(t^{**}_\theta\theta))\, R_{\Phi_x(t^{**}_\theta\theta)}\,\theta. \tag{A.31}
\]
Note that $\|R_{\Phi_x(t^{**}_\theta\theta)}\,\theta\|_2 = \|\theta\|_2$. This and (A.31) imply that
\[
\big|f(\Phi_x(\theta)) - f(x) - (R_x\theta)^\top \nabla \bar f(x)\big| \leq \frac{1}{2}\big\|R_{\Phi_x(t^{**}_\theta\theta)}\,\theta\big\|_2^2\,\big\|\nabla^2 \bar f(\Phi_x(t^{**}_\theta\theta))\big\|_2 \leq \frac{1}{2}\|\theta\|_2^2 \sup_{t \in [0,1]} \big\|\nabla^2 \bar f(\Phi_x(t\theta))\big\|_2.
\]
This completes the proof. □

Lemma A.10. Assume that $\bar f$ and $\bar g_y$ are differentiable on $\mathbb{R}^{2d}\setminus N$. Then, for any $\theta \in [-\pi,\pi)^d$, it holds that
\[
\big|(f \cdot g_y)(\Phi_x(\theta)) - (f \cdot g_y)(x)\big| \leq \|\theta\|_2 \sup_{t \in [0,1]} \big\|\nabla\big(\bar f \cdot \bar g_y\big)(\Phi_x(t\theta))\big\|_2.
\]
If $\bar f$ and $\bar g_y$ are twice differentiable on $\mathbb{R}^{2d}\setminus N$, then it holds that
\[
\big|(f \cdot g_y)(\Phi_x(\theta)) - (f \cdot g_y)(x) - (R_x\theta)^\top \nabla\big(\bar f \cdot \bar g_y\big)(x)\big| \leq \frac{1}{2}\|\theta\|_2^2 \sup_{t \in [0,1]} \big\|\nabla^2\big(\bar f \cdot \bar g_y\big)(\Phi_x(t\theta))\big\|_2.
\]

Proof of Lemma A.10. The lemma follows by arguing as in the proof of Lemma A.9. □

Lemma A.11. Assume that the conditions L and D4 hold and that $\lim_{n\to\infty}\|h\|_2=0$. Then, for any $k \in \{1,2\}$, it holds that
\[
\mathbb{E}\big[L_{x,h}^k(X)\big] - c_{h,0_d,k}(L)\, f(x) = O\big(\rho(h)\|h\|_2^2\big).
\]

Proof of Lemma A.11. For any $v = (v_1,\ldots,v_d)^\top \in \mathbb{R}^d$, (A.2) implies that
\[
\int_{[-\pi,\pi)^d} L_h^k(\theta)\,\theta^\top v\, d\theta = \sum_{\ell=1}^d v_\ell \int_{[-\pi,\pi)^d} L_h^k(\theta)\,\theta_\ell\, d\theta = 0. \tag{A.32}
\]
Combining Lemma A.2 and (A.2), we get
\[
\int_{[-\pi,\pi)^d} L_h^k(\theta)\,\|\theta\|_2^2\, d\theta = \sum_{\ell=1}^d \int_{[-\pi,\pi)^d} L_h^k(\theta)\,\theta_\ell^2\, d\theta = \sum_{\ell=1}^d c_{h,2e_\ell,k}(L) = O\big(\rho(h)\|h\|_2^2\big). \tag{A.33}
\]
Combining (A.5), (A.32), (A.33) and Lemma A.9, we also get
\[
\big|\mathbb{E}\big[L_{x,h}^k(X)\big] - c_{h,0_d,k}(L)\, f(x)\big|
= \Big|\mathbb{E}\big[L_{x,h}^k(X)\big] - c_{h,0_d,k}(L)\, f(x) - \int_{[-\pi,\pi)^d} L_h^k(\theta)\,\theta^\top R_x^\top \nabla\bar f(x)\, d\theta\Big|
= \Big|\int_{[-\pi,\pi)^d} L_h^k(\theta)\big(f(\Phi_x(\theta)) - f(x) - (R_x\theta)^\top \nabla\bar f(x)\big)\, d\theta\Big|
\leq \frac{1}{2}\sup_{z\in\mathbb{T}^d}\big\|\nabla^2\bar f(z)\big\|_2 \int_{[-\pi,\pi)^d} L_h^k(\theta)\,\|\theta\|_2^2\, d\theta
= \frac{1}{2}\sup_{z\in\mathbb{T}^d}\big\|\nabla^2\bar f(z)\big\|_2\, O\big(\rho(h)\|h\|_2^2\big).
\]
This completes the proof. □

Lemma A.12. Assume that the conditions L, D4 and D5 hold and that $\lim_{n\to\infty}\|h\|_2=0$. Then, for any $k\in\{1,2\}$, it holds that
\[
\sup_{y\in\mathcal M}\big|\mathbb{E}\big[L_{x,h}^k(X)\, g_y(X)\big] - c_{h,0_d,k}(L)\,(f\cdot g_y)(x)\big| = O\big(\rho(h)\|h\|_2^2\big).
\]

Proof of Lemma A.12. The lemma follows by arguing as in the proof of Lemma A.11 and using Lemma A.10. □

Lemma A.13. Assume that the conditions L, D1, D4, D5, M1 and M2 hold and that $\lim_{n\to\infty}\|h\|_2=0$. Then, it holds that
\[
d_{\mathcal M}(\tilde m_{h,0}(x), m_\oplus(x))^{\beta_\oplus - 1} = O\big(\|h\|_2^2\big).
\]

Proof of Lemma A.13. Combining (A.20) and the fact that $\tilde M_{h,0}(x, m_\oplus(x)) \geq \tilde M_{h,0}(x, \tilde m_{h,0}(x))$, we get
\[
M_\oplus(x,\tilde m_{h,0}(x)) - M_\oplus(x, m_\oplus(x)) \leq \big(M_\oplus(x,\tilde m_{h,0}(x)) - M_\oplus(x, m_\oplus(x))\big) + \big(\tilde M_{h,0}(x, m_\oplus(x)) - \tilde M_{h,0}(x,\tilde m_{h,0}(x))\big)
\]
\[
= \int_{\mathcal M}\big(d_{\mathcal M}^2(m_\oplus(x),w) - d_{\mathcal M}^2(\tilde m_{h,0}(x),w)\big)\Big(\frac{\mathbb{E}[L_{x,h}(X)\, g_w(X)]}{\mathbb{E}[L_{x,h}(X)]} - g_w(x)\Big)\, dP_Y(w)
\]
\[
\leq \int_{\mathcal M}\big|d_{\mathcal M}^2(m_\oplus(x),w) - d_{\mathcal M}^2(\tilde m_{h,0}(x),w)\big|\,\Big|\frac{\mathbb{E}[L_{x,h}(X)\, g_w(X)]}{\mathbb{E}[L_{x,h}(X)]} - g_w(x)\Big|\, dP_Y(w)
\]
\[
\leq 2\operatorname{diam}(\mathcal M)\, d_{\mathcal M}(\tilde m_{h,0}(x), m_\oplus(x)) \sup_{y\in\mathcal M}\Big|\frac{\mathbb{E}[L_{x,h}(X)\, g_y(X)]}{\mathbb{E}[L_{x,h}(X)]} - g_y(x)\Big|. \tag{A.34}
\]
Lemma A.11 and Lemma A.12 imply that
\[
\sup_{y\in\mathcal M}\big|\mathbb{E}[L_{x,h}(X)\, g_y(X)] - g_y(x)\,\mathbb{E}[L_{x,h}(X)]\big| = O\big(\rho(h)\|h\|_2^2\big). \tag{A.35}
\]
Combining (A.18) and (A.35), we get
\[
\sup_{y\in\mathcal M}\Big|\frac{\mathbb{E}[L_{x,h}(X)\, g_y(X)]}{\mathbb{E}[L_{x,h}(X)]} - g_y(x)\Big| = O\big(\|h\|_2^2\big). \tag{A.36}
\]
By Lemma A.6, there exists a constant $N \in \mathbb{N}$ such that $d_{\mathcal M}(\tilde m_{h,0}(x), m_\oplus(x)) < \eta_\oplus$ whenever $n \geq N$, where $\eta_\oplus > 0$ is the constant defined in the condition M2. By the condition M2, we get
\[
C_\oplus\, d_{\mathcal M}(\tilde m_{h,0}(x), m_\oplus(x))^{\beta_\oplus} \leq M_\oplus(x,\tilde m_{h,0}(x)) - M_\oplus(x, m_\oplus(x)) \tag{A.37}
\]
whenever $n \geq N$, where $C_\oplus > 0$ is the constant defined in the condition M2. Combining (A.34), (A.36) and (A.37), we get the desired result. □

In the proof of Lemma A.14 below, for any metric space $(T, d_T)$ and constant $\delta > 0$, let $N_{[\,]}(\delta, T, d_T)$ and $N(\delta, T, d_T)$ denote the $\delta$-bracketing number and $\delta$-covering number of $(T, d_T)$, respectively.

Lemma A.14. Assume that the conditions L, B, D1–D3 and M1–M3 hold. Then, it holds that
\[
d_{\mathcal M}(\hat m_{h,0}(x), \tilde m_{h,0}(x))^{\beta_\oplus - \alpha_{\mathcal M}} = O_P\big(n^{-\frac12}\rho^{-\frac12}(h)\big).
\]

Proof of Lemma A.14. We define functions $\hat T_{h,0} : \mathbb{T}^d \times \mathcal M \to \mathbb{R}$, $U_{h,0} : \mathbb{T}^d \times \mathcal M \times \mathcal M \to \mathbb{R}$ and $\hat S_{h,0} : \mathbb{T}^d \times \mathcal M \to \mathbb{R}$ as
\[
\hat T_{h,0}(x,y) := \hat M_{h,0}(x,y) - \tilde M_{h,0}(x,y), \quad U_{h,0}(z,w,y) := D_{h,0}(x,w,y)\, L_{x,h}(z),
\]
\[
\hat S_{h,0}(x,y) := n^{-1}\sum_{i=1}^n U_{h,0}\big(X^{(i)}, Y^{(i)}, y\big) - \mathbb{E}[U_{h,0}(X,Y,y)],
\]
where $D_{h,0}(x,w,y) := d_{\mathcal M}^2(w,y) - d_{\mathcal M}^2(w, \tilde m_{h,0}(x))$.
Note that
\[
\mathbb{E}[L_{x,h}(X)]\,\big|\hat T_{h,0}(x,y) - \hat T_{h,0}(x,\tilde m_{h,0}(x))\big|
= \mathbb{E}[L_{x,h}(X)]\,\Big|\frac{n^{-1}\sum_{i=1}^n U_{h,0}(X^{(i)},Y^{(i)},y)}{n^{-1}\sum_{i=1}^n L_{x,h}(X^{(i)})} - \frac{\mathbb{E}[U_{h,0}(X,Y,y)]}{\mathbb{E}[L_{x,h}(X)]}\Big|
\]
\[
\leq \mathbb{E}[L_{x,h}(X)]\,\Big|\frac{n^{-1}\sum_{i=1}^n U_{h,0}(X^{(i)},Y^{(i)},y)}{n^{-1}\sum_{i=1}^n L_{x,h}(X^{(i)})} - \frac{n^{-1}\sum_{i=1}^n U_{h,0}(X^{(i)},Y^{(i)},y)}{\mathbb{E}[L_{x,h}(X)]}\Big| + \Big|n^{-1}\sum_{i=1}^n U_{h,0}(X^{(i)},Y^{(i)},y) - \mathbb{E}[U_{h,0}(X,Y,y)]\Big|
\]
\[
= \Big|n^{-1}\sum_{i=1}^n U_{h,0}(X^{(i)},Y^{(i)},y)\Big|\cdot\Big|\frac{\mathbb{E}[L_{x,h}(X)]}{n^{-1}\sum_{i=1}^n L_{x,h}(X^{(i)})} - 1\Big| + \big|\hat S_{h,0}(x,y)\big|
\]
\[
\leq \frac{\sum_{i=1}^n \big|U_{h,0}(X^{(i)},Y^{(i)},y)\big|}{\sum_{i=1}^n L_{x,h}(X^{(i)})}\cdot\Big|\mathbb{E}[L_{x,h}(X)] - n^{-1}\sum_{i=1}^n L_{x,h}\big(X^{(i)}\big)\Big| + \big|\hat S_{h,0}(x,y)\big|
\]
\[
\leq 2\operatorname{diam}(\mathcal M)\, d_{\mathcal M}(y, \tilde m_{h,0}(x))\cdot\Big|\mathbb{E}[L_{x,h}(X)] - n^{-1}\sum_{i=1}^n L_{x,h}\big(X^{(i)}\big)\Big| + \big|\hat S_{h,0}(x,y)\big|, \tag{A.38}
\]
where the last inequality follows from the facts that
\[
|D_{h,0}(x,w,y)| \leq 2\operatorname{diam}(\mathcal M)\, d_{\mathcal M}(y, \tilde m_{h,0}(x)), \quad |U_{h,0}(z,w,y)| \leq 2\operatorname{diam}(\mathcal M)\, d_{\mathcal M}(y, \tilde m_{h,0}(x))\, L_{x,h}(z).
\]
Regarding the first term in (A.38), we get
\[
\mathbb{E}\Big[\Big|n^{-1}\sum_{i=1}^n L_{x,h}\big(X^{(i)}\big) - \mathbb{E}[L_{x,h}(X)]\Big|\Big] \leq \sqrt{\operatorname{Var}\Big[n^{-1}\sum_{i=1}^n L_{x,h}\big(X^{(i)}\big)\Big]} = O\big(n^{-\frac12}\rho^{\frac12}(h)\big)
\]
by (A.22). Hence, there exist constants $L_1 > 0$ and $N_1 \in \mathbb{N}$ such that
\[
\mathbb{E}\Big[\Big|n^{-1}\sum_{i=1}^n L_{x,h}\big(X^{(i)}\big) - \mathbb{E}[L_{x,h}(X)]\Big|\Big] \leq L_1\, n^{-\frac12}\rho^{\frac12}(h) \tag{A.39}
\]
whenever $n \geq N_1$. Regarding the second term in (A.38), we apply Theorems 2.7.11 and 2.14.2 of van der Vaart and Wellner (1996).
To do so, we define a set $\mathcal H_{h,\delta}$ and a function $H_{h,\delta} : \mathbb{T}^d \times \mathcal M \to \mathbb{R}$ as
\[
\mathcal H_{h,\delta} := \big\{U_{h,0}(\cdot,\cdot,y) : y \in B_{\mathcal M}(\tilde m_{h,0}(x), \delta)\big\}, \quad H_{h,\delta}(z,w) := 2\delta\operatorname{diam}(\mathcal M)\, L_{x,h}(z)
\]
for $\delta > 0$. By Theorem 2.14.2 of van der Vaart and Wellner (1996), there exists a constant $C > 0$ such that
\[
\mathbb{E}\Big[\sup_{y \in B_{\mathcal M}(\tilde m_{h,0}(x),\delta)} \big|\hat S_{h,0}(x,y)\big|\Big] \leq \frac{C}{\sqrt n}\,\|H_{h,\delta}\|_{L_2(P)} \int_0^1 \sqrt{1 + \log N_{[\,]}\big(\|H_{h,\delta}\|_{L_2(P)}\,\epsilon,\ \mathcal H_{h,\delta},\ d_{\mathcal M}\big)}\; d\epsilon, \tag{A.40}
\]
where $\|H_{h,\delta}\|_{L_2(P)} := \big(\mathbb{E}\big[H_{h,\delta}^2(X,Y)\big]\big)^{1/2} = 2\delta\operatorname{diam}(\mathcal M)\big(\mathbb{E}\big[L_{x,h}^2(X)\big]\big)^{1/2}$. Note that
\[
|U_{h,0}(z,w,y_1) - U_{h,0}(z,w,y_2)| \leq d_{\mathcal M}(y_1,y_2)\,\frac{H_{h,\delta}(z,w)}{\delta} \tag{A.41}
\]
for any $y_1, y_2 \in B_{\mathcal M}(\tilde m_{h,0}(x),\delta)$. Combining Theorem 2.7.11 of van der Vaart and Wellner (1996), (A.40) and (A.41), we get
\[
\mathbb{E}\Big[\sup_{y \in B_{\mathcal M}(\tilde m_{h,0}(x),\delta)} \big|\hat S_{h,0}(x,y)\big|\Big] \leq \frac{C}{\sqrt n}\,\|H_{h,\delta}\|_{L_2(P)} \int_0^1 \sqrt{1 + \log N\Big(\frac{\delta\epsilon}{2},\ B_{\mathcal M}(\tilde m_{h,0}(x),\delta),\ d_{\mathcal M}\Big)}\; d\epsilon
= \frac{2C}{\sqrt n}\,\|H_{h,\delta}\|_{L_2(P)} \int_0^{1/2} \sqrt{1 + \log N\big(\delta\epsilon,\ B_{\mathcal M}(\tilde m_{h,0}(x),\delta),\ d_{\mathcal M}\big)}\; d\epsilon. \tag{A.42}
\]
The condition M3 implies that there exist constants $L_2 > 0$ and $\delta_{\mathcal M} \in (0,1)$ such that
\[
\sup_{y \in \mathcal M :\ d_{\mathcal M}(y, m_\oplus(x)) < r_{\mathcal M}} \int_0^{1/2} \sqrt{1 + \log N\big(\delta\epsilon,\ B_{\mathcal M}(y,\delta),\ d_{\mathcal M}\big)}\; d\epsilon \leq L_2\,\delta^{\alpha_{\mathcal M}-1} \tag{A.43}
\]
for $\delta \in (0,\delta_{\mathcal M}]$, where $r_{\mathcal M} > 0$ and $\alpha_{\mathcal M} \in (0,1]$ are the constants defined in the condition M3. Also, Lemma A.6 implies that there exists a constant $N_2 \in \mathbb{N}$ such that
\[
d_{\mathcal M}(\tilde m_{h,0}(x), m_\oplus(x)) < r_{\mathcal M} \tag{A.44}
\]
whenever $n \geq N_2$. Combining (A.43) and (A.44), we get
\[
\int_0^{1/2} \sqrt{1 + \log N\big(\delta\epsilon,\ B_{\mathcal M}(\tilde m_{h,0}(x),\delta),\ d_{\mathcal M}\big)}\; d\epsilon \leq L_2\,\delta^{\alpha_{\mathcal M}-1} \tag{A.45}
\]
whenever $\delta \in (0,\delta_{\mathcal M}]$ and $n \geq N_2$. Lemma A.2, Lemma A.3 and (A.18) imply that there exist constants $L_3 > 0$ and $N_3 \in \mathbb{N}$ such that
\[
\mathbb{E}\big[L_{x,h}^2(X)\big] \leq L_3\,\rho(h), \quad \big(\mathbb{E}[L_{x,h}(X)]\big)^{-1} \leq L_3\,\rho^{-1}(h) \tag{A.46}
\]
whenever $n \geq N_3$.
Combining (A.42), (A.45) and (A.46), we get
\[
\mathbb{E}\Big[\sup_{y\in B_{\mathcal M}(\tilde m_{h,0}(x),\delta)}\big|\hat S_{h,0}(x,y)\big|\Big] \leq \frac{2C}{\sqrt n}\cdot\big(2\delta\operatorname{diam}(\mathcal M)\, L_3^{1/2}\rho^{1/2}(h)\big)\cdot\big(L_2\,\delta^{\alpha_{\mathcal M}-1}\big) = 4C L_2 L_3^{1/2}\operatorname{diam}(\mathcal M)\,\delta^{\alpha_{\mathcal M}}\, n^{-\frac12}\rho^{\frac12}(h) \tag{A.47}
\]
whenever $\delta \in (0,\delta_{\mathcal M}]$ and $n \geq \max\{N_2, N_3\}$.

Let $N_4 := \max\{N_1, N_2, N_3\}$ and $L_4 := 2L_3\operatorname{diam}(\mathcal M)\big(L_1 + 2CL_2 L_3^{1/2}\big)$. Combining (A.38), (A.39), (A.46) and (A.47), we get
\[
\mathbb{E}\Big[\sup_{y\in B_{\mathcal M}(\tilde m_{h,0}(x),\delta)}\big|\hat T_{h,0}(x,y) - \hat T_{h,0}(x,\tilde m_{h,0}(x))\big|\Big]
\leq \big(\mathbb{E}[L_{x,h}(X)]\big)^{-1}\cdot 2\operatorname{diam}(\mathcal M)\cdot\delta\cdot\mathbb{E}\Big[\Big|\mathbb{E}[L_{x,h}(X)] - n^{-1}\sum_{i=1}^n L_{x,h}\big(X^{(i)}\big)\Big|\Big] + \big(\mathbb{E}[L_{x,h}(X)]\big)^{-1}\,\mathbb{E}\Big[\sup_{y\in B_{\mathcal M}(\tilde m_{h,0}(x),\delta)}\big|\hat S_{h,0}(x,y)\big|\Big]
\]
\[
\leq 2L_1 L_3\operatorname{diam}(\mathcal M)\,\delta\, n^{-\frac12}\rho^{-\frac12}(h) + 4CL_2 L_3^{3/2}\operatorname{diam}(\mathcal M)\,\delta^{\alpha_{\mathcal M}}\, n^{-\frac12}\rho^{-\frac12}(h)
\leq L_4\,\delta^{\alpha_{\mathcal M}}\, n^{-\frac12}\rho^{-\frac12}(h) \tag{A.48}
\]
whenever $\delta \in (0,\delta_{\mathcal M}]$ and $n \geq N_4$.

Let $\epsilon > 0$ be given. Recall the constants $\eta_\oplus > 0$ and $C_\oplus > 0$ defined in the condition M2. Since $\sum_{k=1}^\infty 4^{-k(\beta_\oplus-\alpha_{\mathcal M})/\beta_\oplus} < \infty$, there exists a constant $M_\epsilon \in \mathbb{N}$, depending on $\epsilon$, such that
\[
\frac{L_4}{C_\oplus}\sum_{k=M_\epsilon+1}^\infty 4^{-\frac{k(\beta_\oplus-\alpha_{\mathcal M})}{\beta_\oplus}} < \frac{\epsilon}{8}. \tag{A.49}
\]
By Lemma A.8, there exists a constant $N_\epsilon \in \mathbb{N}$, depending on $\epsilon$, such that
\[
P\big(d_{\mathcal M}(\hat m_{h,0}(x), \tilde m_{h,0}(x)) \geq \eta_\oplus\big) < \frac{\epsilon}{2} \tag{A.50}
\]
whenever $n \geq N_\epsilon$. Let $t_n := (n\rho(h))^{\beta_\oplus/(4\beta_\oplus - 4\alpha_{\mathcal M})}$. We define events $E_{n,\epsilon}$, $A_{n,k}$ and $B_n$ as
\[
E_{n,\epsilon} := \Big\{d_{\mathcal M}(\hat m_{h,0}(x), \tilde m_{h,0}(x)) > 2^{\frac{2M_\epsilon}{\beta_\oplus}}(n\rho(h))^{-\frac{1}{2\beta_\oplus - 2\alpha_{\mathcal M}}}\Big\} = \Big\{t_n\, d_{\mathcal M}(\hat m_{h,0}(x), \tilde m_{h,0}(x))^{\frac{\beta_\oplus}{2}} > 2^{M_\epsilon}\Big\},
\]
\[
A_{n,k} := \Big\{2^{k-1} < t_n\, d_{\mathcal M}(\hat m_{h,0}(x), \tilde m_{h,0}(x))^{\frac{\beta_\oplus}{2}} \leq 2^k\Big\}, \quad B_n := \big\{d_{\mathcal M}(\hat m_{h,0}(x), \tilde m_{h,0}(x)) < \eta_\oplus\big\}.
\]
Since $E_{n,\epsilon} = \bigcup_{k=M_\epsilon+1}^\infty A_{n,k}$, it holds that
\[
P(E_{n,\epsilon}) \leq \sum_{k=M_\epsilon+1}^\infty P(A_{n,k}\cap B_n) + P(B_n^c). \tag{A.51}
\]
Let $r_{n,k} := \min\big\{(4^k t_n^{-2})^{1/\beta_\oplus},\ \delta_{\mathcal M}\big\}$.
Since $\hat M_{h,0}(x,\hat m_{h,0}(x)) \leq \hat M_{h,0}(x,\tilde m_{h,0}(x))$, it holds that
\[
\sup_{y\in B_{\mathcal M}(\tilde m_{h,0}(x),\, r_{n,k})}\big|\hat T_{h,0}(x,y) - \hat T_{h,0}(x,\tilde m_{h,0}(x))\big| \geq -\hat T_{h,0}(x,\hat m_{h,0}(x)) + \hat T_{h,0}(x,\tilde m_{h,0}(x))
\geq \tilde M_{h,0}(x,\hat m_{h,0}(x)) - \tilde M_{h,0}(x,\tilde m_{h,0}(x))
\geq C_\oplus\, d_{\mathcal M}(\hat m_{h,0}(x),\tilde m_{h,0}(x))^{\beta_\oplus} \geq C_\oplus\, 4^{k-1} t_n^{-2} \tag{A.52}
\]
on the event $A_{n,k}\cap B_n$. Using a version of (A.48) with $\delta$ being replaced by $r_{n,k}$ and applying Markov's inequality to the probability of (A.52), we get
\[
P(A_{n,k}\cap B_n) \leq P\Big(\sup_{y\in B_{\mathcal M}(\tilde m_{h,0}(x),\, r_{n,k})}\big|\hat T_{h,0}(x,y) - \hat T_{h,0}(x,\tilde m_{h,0}(x))\big| \geq C_\oplus\, 4^{k-1} t_n^{-2}\Big)
\leq \frac{t_n^2}{C_\oplus 4^{k-1}}\,\mathbb{E}\Big[\sup_{y\in B_{\mathcal M}(\tilde m_{h,0}(x),\, r_{n,k})}\big|\hat T_{h,0}(x,y) - \hat T_{h,0}(x,\tilde m_{h,0}(x))\big|\Big]
\leq \frac{L_4\, t_n^2}{C_\oplus 4^{k-1}}\,(r_{n,k})^{\alpha_{\mathcal M}}\, n^{-\frac12}\rho^{-\frac12}(h)
\leq \frac{4L_4}{C_\oplus}\, 4^{-\frac{k(\beta_\oplus-\alpha_{\mathcal M})}{\beta_\oplus}} \tag{A.53}
\]
whenever $n \geq N_4$. Combining (A.49), (A.50), (A.51) and (A.53), we get
\[
P(E_{n,\epsilon}) \leq \frac{4L_4}{C_\oplus}\sum_{k=M_\epsilon+1}^\infty 4^{-\frac{k(\beta_\oplus-\alpha_{\mathcal M})}{\beta_\oplus}} + P(B_n^c) < \epsilon
\]
whenever $n \geq \max\{N_4, N_\epsilon\}$. Since $\epsilon > 0$ is arbitrary, we get the desired result. □

A.4 Proof of Theorem 3.2 for $\hat m_{h,0}(x)$

The theorem follows from Lemma A.13 and Lemma A.14.

B Proofs of Theorem 3.1 and Theorem 3.2 for $\hat m_{h,1}(x)$

First, we collect some notation. We define
\[
\tilde\tau_{h,0}(x,y) := \mathbb{E}[L_{x,h}(X)\, g_y(X)], \quad \tilde\tau_{h,1}(x,y) := \mathbb{E}\big[L_{x,h}(X)\,\Phi_x^{-1}(X)\, g_y(X)\big],
\]
\[
\tilde\nu_{h,0}(x,y) := \mathbb{E}\big[L_{x,h}(X)\, d_{\mathcal M}^2(Y,y)\big], \quad \tilde\nu_{h,1}(x,y) := \mathbb{E}\big[L_{x,h}(X)\,\Phi_x^{-1}(X)\, d_{\mathcal M}^2(Y,y)\big],
\]
\[
\hat\nu_{h,0}(x,y) := n^{-1}\sum_{i=1}^n L_{x,h}\big(X^{(i)}\big)\, d_{\mathcal M}^2\big(Y^{(i)},y\big), \quad \hat\nu_{h,1}(x,y) := n^{-1}\sum_{i=1}^n L_{x,h}\big(X^{(i)}\big)\,\Phi_x^{-1}\big(X^{(i)}\big)\, d_{\mathcal M}^2\big(Y^{(i)},y\big).
\]
Then, $\tilde M_{h,1}$ and $\hat M_{h,1}$ can be written as
\[
\tilde M_{h,1}(x,y) = \frac{1}{\tilde\sigma_h(x)}\big(\tilde\nu_{h,0}(x,y) - \tilde\mu_{h,1}(x)^\top\tilde\mu_{h,2}(x)^{-1}\tilde\nu_{h,1}(x,y)\big), \quad
\hat M_{h,1}(x,y) = \frac{1}{\hat\sigma_h(x)}\big(\hat\nu_{h,0}(x,y) - \hat\mu_{h,1}(x)^\top\hat\mu_{h,2}(x)^{-1}\hat\nu_{h,1}(x,y)\big), \tag{B.1}
\]
respectively. We write
\[
a_{j,k}(L) := \int_0^\infty L^k(r^2)\, r^j\, dr
\]
for $j \in \mathbb{N}_{\geq 0}$ and $k \in \{1,2\}$. We define the diagonal matrices
\[
\Lambda_h = \operatorname{diag}(h_1, \ldots, h_d), \quad C_h = \operatorname{diag}\big(c_{h,2e_1,1}(L), \ldots, c_{h,2e_d,1}(L)\big).
\]

B.1 Lemmas for the proof of Theorem 3.1

We first prove that $\tilde\mu_{h,2}(x)$ is invertible for sufficiently large $n$ and that $\hat\mu_{h,2}(x)$ is invertible with probability tending to one.

Lemma B.1. Assume that the conditions L and D2 hold and that $\lim_{n\to\infty}\|h\|_2=0$. Then, it holds that
\[
\big\|\Lambda_h^{-1}\tilde\mu_{h,1}(x)\big\|_2 = o(\rho(h)), \quad \big\|\Lambda_h^{-1}\tilde\mu_{h,2}(x)\Lambda_h^{-1} - \Lambda_h^{-2} C_h\, f(x)\big\|_2 = o(\rho(h)). \tag{B.2}
\]
Moreover, $C_h$ is invertible for sufficiently large $n$ and it holds that
\[
\Big\|\frac{1}{c_{h,0_d,1}(L)}\Lambda_h^{-2} C_h - \frac{2a_{2,1}(L)}{a_{0,1}(L)} I_d\Big\|_2 = o(1), \quad \Big\|c_{h,0_d,1}(L)\,\Lambda_h^2 C_h^{-1} - \frac{a_{0,1}(L)}{2a_{2,1}(L)} I_d\Big\|_2 = o(1). \tag{B.3}
\]
If the condition D1 further holds, then $\tilde\mu_{h,2}(x)$ is invertible for sufficiently large $n$ and it holds that
\[
\Big\|\Lambda_h\tilde\mu_{h,2}(x)^{-1}\Lambda_h - \frac{1}{f(x)}\Lambda_h^2 C_h^{-1}\Big\|_2 = o\big(\rho^{-1}(h)\big), \quad \Big\|c_{h,0_d,1}(L)\,\Lambda_h\tilde\mu_{h,2}(x)^{-1}\Lambda_h - \frac{a_{0,1}(L)}{2a_{2,1}(L)\, f(x)} I_d\Big\|_2 = o(1). \tag{B.4}
\]

Proof of Lemma B.1. We first prove (B.2).
Com bining ( A.2 ), ( A.3 ) and Lemma A.1 , we get Z T d L x , h ( z ) Φ − 1 x ( z ) ω d 1 (d z ) = Z [ − π ,π ) d L x , h ( Φ x ( θ )) θ d θ = Z [ − π ,π ) d L h ( θ ) θ d θ = 0 d , Z T d L x , h ( z ) Φ − 1 x ( z ) Φ − 1 x ( z ) ⊤ ω d 1 (d z ) = Z [ − π ,π ) d L x , h ( Φ x ( θ )) θ θ ⊤ d θ = Z [ − π ,π ) d L h ( θ ) θ θ ⊤ d θ = C h . (B.5) The Cauch y-Sc h warz inequalit y and ( B.5 ) imply that   Λ − 1 h ˜ µ h , 1 ( x )   2 2 =     Λ − 1 h Z T d L x , h ( z ) Φ − 1 x ( z )  f ( z ) − f ( x )  ω d 1 (d z )     2 2 ≤  Z T d L x , h ( z )   Λ − 1 h Φ − 1 x ( z )   2   f ( z ) − f ( x )   ω d 1 (d z )  2 =  Z [ − π ,π ) d L h ( θ )   Λ − 1 h θ   2 | f ( Φ x ( θ )) − f ( x ) | d θ  2 ≤  Z [ − π ,π ) d L h ( θ ) | f ( Φ x ( θ )) − f ( x ) | d θ  ·  Z [ − π ,π ) d L h ( θ )   Λ − 1 h θ   2 2 | f ( Φ x ( θ )) − f ( x ) | d θ  ≤  Z [ − π ,π ) d L h ( θ ) | f ( Φ x ( θ )) − f ( x ) | d θ  · 2 sup z ∈ T d f ( z ) ·  Z [ − π ,π ) d L h ( θ )   Λ − 1 h θ   2 2 d θ  . (B.6) By arguing as in the pro of of Lemma A.3 , we get Z [ − π ,π ) d L h ( θ ) | f ( Φ x ( θ )) − f ( x ) | d θ = o ( ρ ( h )) . (B.7) 34 Also, ( A.2 ) and Lemma A.2 imply that Z [ − π ,π ) d L k h ( θ ) ∥ Λ − 1 h θ ∥ 2 2 d θ = d X ℓ =1 1 h 2 ℓ Z [ − π ,π ) d L k h ( θ ) θ 2 ℓ d θ = d X ℓ =1 c h , 2 e ℓ ,k ( L ) h 2 ℓ = O ( ρ ( h )) (B.8) for any k ∈ { 1 , 2 } . Combining ( B.6 ), ( B.7 ) and ( B.8 ), w e get   Λ − 1 h ˜ µ h , 1 ( x )   2 2 = o  ρ 2 ( h )  . 
(B.9) Similarly , the Cauc h y-Sch warz inequality and ( B.5 ) imply that   Λ − 1 h ˜ µ h , 2 ( x ) Λ − 1 h − Λ − 2 h C h f ( x )   2 2 =     Λ − 1 h  Z T d L x , h ( z ) Φ − 1 x ( z ) Φ − 1 x ( z ) ⊤ ( f ( z ) − f ( x )) ω d 1 (d z )  Λ − 1 h     2 2 =     Z T d L x , h ( z ) Λ − 1 h Φ − 1 x ( z ) Φ − 1 x ( z ) ⊤ Λ − 1 h ( f ( z ) − f ( x )) ω d 1 (d z )     2 2 ≤  Z T d L x , h ( z )   Λ − 1 h Φ − 1 x ( z )   2 2   f ( z ) − f ( x )   ω d 1 (d z )  2 =  Z [ − π ,π ) d L h ( θ )   Λ − 1 h θ   2 2 | f ( Φ x ( θ )) − f ( x ) | d θ  2 ≤  Z [ − π ,π ) d L h ( θ ) | f ( Φ x ( θ )) − f ( x ) | d θ  ·  Z [ − π ,π ) d L h ( θ )   Λ − 1 h θ   4 2 | f ( Φ x ( θ )) − f ( x ) | d θ  ≤  Z [ − π ,π ) d L h ( θ ) | f ( Φ x ( θ )) − f ( x ) | d θ  · 2 sup z ∈ T d f ( z ) ·  Z [ − π ,π ) d L h ( θ )   Λ − 1 h θ   4 2 d θ  . (B.10) Also, ( A.2 ) and Lemma A.2 imply that Z [ − π ,π ) d L k h ( θ )   Λ − 1 h θ   4 2 d θ = d X ℓ =1 d X m =1 1 h 2 ℓ h 2 m Z [ − π ,π ) d L k h ( θ ) θ 2 ℓ θ 2 m d θ = d X ℓ =1 d X m =1 c h , 2( e ℓ + e m ) ,k ( L ) h 2 ℓ h 2 m = O ( ρ ( h )) (B.11) for any k ∈ { 1 , 2 } . Combining ( B.7 ), ( B.10 ) and ( B.11 ), w e get   Λ − 1 h ˜ µ h , 2 ( x ) Λ − 1 h − Λ − 2 h C h f ( x )   2 2 = o  ρ 2 ( h )  . (B.12) Then, ( B.2 ) follo ws from ( B.9 ) and ( B.12 ). 35 No w, we prov e the inv ertibilit y of C h and ( B.3 ). Note that 1 c h , 0 d , 1 ( L ) Λ − 2 h C h = diag  c h , 2 e 1 , 1 ( L ) c h , 0 d , 1 ( L ) h 2 1 , . . . , c h , 2 e d , 1 ( L ) c h , 0 d , 1 ( L ) h 2 d  . (B.13) By Lemma A.2 , we get lim n →∞ c h , 2 e ℓ , 1 ( L ) c h , 0 d , 1 ( L ) h 2 ℓ = 2 a 2 , 1 ( L ) a 0 , 1 ( L ) > 0 , ℓ = 1 , . . . , d. (B.14) Com bining ( B.13 ) and ( B.14 ), C h is inv ertible for sufficiently large n and it holds that     1 c h , 0 d , 1 ( L ) Λ − 2 h C h − 2 a 2 , 1 ( L ) a 0 , 1 ( L ) I d     2 = o (1) . 
(B.15)

Combining (B.14) and (B.15), we get
\[
\begin{aligned}
\Big\|c_{h,0_d,1}(L)\Lambda_h^2 C_h^{-1} - \frac{a_{0,1}(L)}{2a_{2,1}(L)}\, I_d\Big\|_2
&= \Big\|\Big(\frac{2a_{2,1}(L)}{a_{0,1}(L)}\, I_d - \frac{1}{c_{h,0_d,1}(L)}\Lambda_h^{-2}C_h\Big)\cdot\Big(\frac{a_{0,1}(L)\,c_{h,0_d,1}(L)}{2a_{2,1}(L)}\Lambda_h^2 C_h^{-1}\Big)\Big\|_2\\
&\le \Big\|\frac{2a_{2,1}(L)}{a_{0,1}(L)}\, I_d - \frac{1}{c_{h,0_d,1}(L)}\Lambda_h^{-2}C_h\Big\|_2\cdot\frac{a_{0,1}(L)}{2a_{2,1}(L)}\,\big\|c_{h,0_d,1}(L)\Lambda_h^2 C_h^{-1}\big\|_2 = o(1).
\end{aligned}
\]
Now, we prove the invertibility of $\tilde{\mu}_{h,2}(x)$ and (B.4). Note that
\[
\begin{aligned}
\inf_{u\in S^d} u^\top\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}u
&\ge f(x)\Big(\inf_{u\in S^d} u^\top\Lambda_h^{-2}C_h u\Big) - \sup_{u\in S^d} u^\top\big(\Lambda_h^{-2}C_h f(x)-\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)u\\
&= f(x)\min\Big(\frac{c_{h,2e_1,1}(L)}{h_1^2},\ldots,\frac{c_{h,2e_d,1}(L)}{h_d^2}\Big) - \big\|\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}-\Lambda_h^{-2}C_h f(x)\big\|_2.
\end{aligned}
\tag{B.16}
\]
Combining Lemma A.2, (B.2) and (B.16), we get
\[
\begin{aligned}
\liminf_{n\to\infty}\rho^{-1}(h)\Big(\inf_{u\in S^d} u^\top\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}u\Big)
&\ge f(x)\liminf_{n\to\infty}\Big(\min\Big(\frac{c_{h,2e_1,1}(L)}{\rho(h)\,h_1^2},\ldots,\frac{c_{h,2e_d,1}(L)}{\rho(h)\,h_d^2}\Big)\Big)\\
&\quad - \limsup_{n\to\infty}\rho^{-1}(h)\big\|\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}-\Lambda_h^{-2}C_h f(x)\big\|_2\\
&= 2^{\frac{3d+2}{2}}a_{0,1}^{d-1}(L)\,a_{2,1}(L)\,f(x) > 0.
\end{aligned}
\tag{B.17}
\]
This implies that $\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}$ is positive-definite for sufficiently large $n$, and thus $\tilde{\mu}_{h,2}(x)$ is invertible for sufficiently large $n$. By (B.17), we get
\[
\begin{aligned}
\limsup_{n\to\infty}\rho(h)\,\big\|\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big\|_2
&= \limsup_{n\to\infty}\rho(h)\Big(\inf_{u\in S^d} u^\top\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}u\Big)^{-1}\\
&= \Big(\liminf_{n\to\infty}\rho^{-1}(h)\Big(\inf_{u\in S^d} u^\top\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}u\Big)\Big)^{-1}
\le \frac{2^{-\frac{3d+2}{2}}}{a_{0,1}^{d-1}(L)\,a_{2,1}(L)\,f(x)}.
\end{aligned}
\]
(B.18)

Combining Lemma A.2, (B.2), (B.14) and (B.18), we get
\[
\begin{aligned}
\Big\|\Lambda_h\tilde{\mu}_{h,2}(x)^{-1}\Lambda_h - \frac{1}{f(x)}\Lambda_h^2 C_h^{-1}\Big\|_2
&= \Big\|\big(\Lambda_h\tilde{\mu}_{h,2}(x)^{-1}\Lambda_h\big)\cdot\big(\Lambda_h^{-2}C_h f(x)-\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)\cdot\Big(\frac{1}{f(x)}\Lambda_h^2 C_h^{-1}\Big)\Big\|_2\\
&\le \big\|\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big\|_2\cdot\big\|\Lambda_h^{-2}C_h f(x)-\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big\|_2\cdot\Big\|\frac{1}{f(x)}\Lambda_h^2 C_h^{-1}\Big\|_2\\
&= \Big(\rho(h)\,\big\|\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big\|_2\Big)\cdot\big\|\Lambda_h^{-2}C_h f(x)-\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big\|_2\\
&\quad\cdot\max\Big(\frac{\rho(h)\,h_1^2}{f(x)\,c_{h,2e_1,1}(L)},\ldots,\frac{\rho(h)\,h_d^2}{f(x)\,c_{h,2e_d,1}(L)}\Big)\cdot\rho^{-2}(h)\\
&\le O(1)\cdot o(\rho(h))\cdot O(1)\cdot\rho^{-2}(h) = o\big(\rho^{-1}(h)\big).
\end{aligned}
\]
This with Lemma A.2 and (B.3) gives (B.4). □

Lemma B.2. Assume that the conditions L, B and D2 hold. Then, it holds that
\[
\big\|\Lambda_h^{-1}\hat{\mu}_{h,1}(x)-\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big\|_2 = O_P\big(n^{-\frac{1}{2}}\rho^{\frac{1}{2}}(h)\big),\qquad
\big\|\Lambda_h^{-1}\hat{\mu}_{h,2}(x)\Lambda_h^{-1}-\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big\|_2 = O_P\big(n^{-\frac{1}{2}}\rho^{\frac{1}{2}}(h)\big).
\tag{B.19}
\]
If the condition D1 further holds, then $\hat{\mu}_{h,2}(x)$ is invertible with probability tending to one and it holds that
\[
\big\|\big(\Lambda_h^{-1}\hat{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}-\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big\|_2 = O_P\big(n^{-\frac{1}{2}}\rho^{-\frac{3}{2}}(h)\big).
\tag{B.20}
\]

Proof of Lemma B.2. We first prove (B.19).
By Lemma A.1, we get
\[
\begin{aligned}
\mathrm{E}\Big[L_{x,h}^2(X)\big(\Lambda_h^{-1}\Phi_x^{-1}(X)\big)^\top\big(\Lambda_h^{-1}\Phi_x^{-1}(X)\big)\Big]
&= \int_{\mathbb{T}^d} L_{x,h}^2(z)\big(\Lambda_h^{-1}\Phi_x^{-1}(z)\big)^\top\big(\Lambda_h^{-1}\Phi_x^{-1}(z)\big)f(z)\,\omega_d(\mathrm{d}z)\\
&= \int_{[-\pi,\pi)^d} L_h^2(\theta)\,\big\|\Lambda_h^{-1}\theta\big\|_2^2\,f(\Phi_x(\theta))\,\mathrm{d}\theta
\le \sup_{z\in\mathbb{T}^d} f(z)\cdot\int_{[-\pi,\pi)^d} L_h^2(\theta)\,\big\|\Lambda_h^{-1}\theta\big\|_2^2\,\mathrm{d}\theta,\\
\Big\|\mathrm{E}\Big[L_{x,h}^2(X)\big(\Lambda_h^{-1}\Phi_x^{-1}(X)\Phi_x^{-1}(X)^\top\Lambda_h^{-1}\big)^2\Big]\Big\|_2
&= \Big\|\int_{\mathbb{T}^d} L_{x,h}^2(z)\big(\Lambda_h^{-1}\Phi_x^{-1}(z)\Phi_x^{-1}(z)^\top\Lambda_h^{-1}\big)^2 f(z)\,\omega_d(\mathrm{d}z)\Big\|_2\\
&= \Big\|\int_{[-\pi,\pi)^d} L_h^2(\theta)\big(\Lambda_h^{-1}\theta\theta^\top\Lambda_h^{-1}\big)^2 f(\Phi_x(\theta))\,\mathrm{d}\theta\Big\|_2\\
&\le \int_{[-\pi,\pi)^d} L_h^2(\theta)\,\big\|\Lambda_h^{-1}\theta\theta^\top\Lambda_h^{-1}\big\|_2^2\,f(\Phi_x(\theta))\,\mathrm{d}\theta
\le \sup_{z\in\mathbb{T}^d} f(z)\cdot\int_{[-\pi,\pi)^d} L_h^2(\theta)\,\big\|\Lambda_h^{-1}\theta\big\|_2^4\,\mathrm{d}\theta.
\end{aligned}
\tag{B.21}
\]
Combining (B.8), (B.11), (B.21) and Lemma C.2 of Im et al. (2025), we get (B.19).

Now, we prove the invertibility of $\hat{\mu}_{h,2}(x)$. By (B.19), we get
\[
\big\|\Lambda_h^{-1}\hat{\mu}_{h,2}(x)\Lambda_h^{-1}-\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big\|_2 = o_P(\rho(h)).
\tag{B.22}
\]
We define an event $F_n$ as
\[
F_n := \Big\{\big\|\Lambda_h^{-1}\hat{\mu}_{h,2}(x)\Lambda_h^{-1}-\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big\|_2 \le \rho(h)\,2^{\frac{3d-2}{2}}a_{0,1}^{d-1}(L)\,a_{2,1}(L)\,f(x)\Big\}.
\]
Let $\epsilon > 0$ be any given constant. By (B.22), we can choose $N_{1,\epsilon}\in\mathbb{N}$ such that $P(F_n) > 1-\epsilon$ whenever $n\ge N_{1,\epsilon}$. By (B.17), we can also choose $N_{2,\epsilon}\in\mathbb{N}$ such that
\[
\rho^{-1}(h)\Big(\inf_{u\in S^d} u^\top\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}u\Big) \ge 2^{\frac{3d}{2}}a_{0,1}^{d-1}(L)\,a_{2,1}(L)\,f(x)
\]
whenever $n\ge N_{2,\epsilon}$.
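The event $F_n$ reduces the invertibility of the random matrix $\Lambda_h^{-1}\hat{\mu}_{h,2}(x)\Lambda_h^{-1}$ to a deterministic perturbation fact used in (B.23) and (B.25): if the smallest eigenvalue of a symmetric positive-definite matrix exceeds the spectral norm of a symmetric perturbation, the perturbed matrix stays positive-definite and the norm of its inverse is controlled by the eigenvalue gap. A minimal numerical sketch of this fact, with randomly generated matrices of our own construction (not objects from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
for _ in range(50):
    B = rng.standard_normal((d, d))
    A = B @ B.T + np.eye(d)                    # symmetric positive-definite
    lam_min = np.linalg.eigvalsh(A)[0]         # smallest eigenvalue of A
    E = rng.standard_normal((d, d))
    E = (E + E.T) / 2                          # symmetric perturbation
    E *= 0.5 * lam_min / np.linalg.norm(E, 2)  # scale so ||E||_2 = lam_min / 2
    Ahat = A + E
    # The perturbed matrix remains positive-definite ...
    assert np.linalg.eigvalsh(Ahat)[0] > 0
    # ... and its inverse is bounded by the eigenvalue gap (Weyl's inequality)
    bound = 1.0 / (lam_min - np.linalg.norm(E, 2))
    assert np.linalg.norm(np.linalg.inv(Ahat), 2) <= bound + 1e-12
```

In the proof, $A$ plays the role of $\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}$, whose smallest eigenvalue is of order $\rho(h)$ by (B.17), and $E$ the role of the $o_P(\rho(h))$ estimation error in (B.22).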
On the event $F_n$ with $n\ge\max\{N_{1,\epsilon},N_{2,\epsilon}\}$, it holds that
\[
\begin{aligned}
\inf_{u\in S^d} u^\top\Lambda_h^{-1}\hat{\mu}_{h,2}(x)\Lambda_h^{-1}u
&\ge \inf_{u\in S^d} u^\top\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}u - \sup_{u\in S^d} u^\top\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}-\Lambda_h^{-1}\hat{\mu}_{h,2}(x)\Lambda_h^{-1}\big)u\\
&= \inf_{u\in S^d} u^\top\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}u - \big\|\Lambda_h^{-1}\hat{\mu}_{h,2}(x)\Lambda_h^{-1}-\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big\|_2\\
&\ge \rho(h)\,2^{\frac{3d}{2}}a_{0,1}^{d-1}(L)\,a_{2,1}(L)\,f(x) - \rho(h)\,2^{\frac{3d-2}{2}}a_{0,1}^{d-1}(L)\,a_{2,1}(L)\,f(x)\\
&= \rho(h)\,2^{\frac{3d-2}{2}}a_{0,1}^{d-1}(L)\,a_{2,1}(L)\,f(x) > 0.
\end{aligned}
\tag{B.23}
\]
Hence, for $n\ge\max\{N_{1,\epsilon},N_{2,\epsilon}\}$, it holds that
\[
P\Big(\inf_{u\in S^d} u^\top\Lambda_h^{-1}\hat{\mu}_{h,2}(x)\Lambda_h^{-1}u > 0\Big) \ge P(F_n) > 1-\epsilon.
\]
Since $\epsilon > 0$ is arbitrary, this shows that $\Lambda_h^{-1}\hat{\mu}_{h,2}(x)\Lambda_h^{-1}$ is positive-definite with probability tending to one. Thus, $\hat{\mu}_{h,2}(x)$ is invertible with probability tending to one.

Now, we prove (B.20). By (B.18), we get
\[
\big\|\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big\|_2 = O\big(\rho^{-1}(h)\big).
\tag{B.24}
\]
By (B.23), we also get
\[
\rho(h)\,\big\|\big(\Lambda_h^{-1}\hat{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big\|_2
= \rho(h)\Big(\inf_{u\in S^d} u^\top\Lambda_h^{-1}\hat{\mu}_{h,2}(x)\Lambda_h^{-1}u\Big)^{-1}
= \Big(\rho^{-1}(h)\Big(\inf_{u\in S^d} u^\top\Lambda_h^{-1}\hat{\mu}_{h,2}(x)\Lambda_h^{-1}u\Big)\Big)^{-1}
\le \frac{2^{-\frac{3d-2}{2}}}{a_{0,1}^{d-1}(L)\,a_{2,1}(L)\,f(x)}
\]
on the event $F_n$ with $n\ge\max\{N_{1,\epsilon},N_{2,\epsilon}\}$. This implies that
\[
\liminf_{n\to\infty} P\bigg(\rho(h)\,\big\|\big(\Lambda_h^{-1}\hat{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big\|_2 \le \frac{2^{-\frac{3d-2}{2}}}{a_{0,1}^{d-1}(L)\,a_{2,1}(L)\,f(x)}\bigg) \ge \liminf_{n\to\infty} P(F_n) \ge 1-\epsilon.
\]
Since $\epsilon > 0$ is arbitrary, this shows that
\[
\big\|\big(\Lambda_h^{-1}\hat{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big\|_2 = O_P\big(\rho^{-1}(h)\big).
\]
(B.25)

Combining (B.19), (B.24) and (B.25), we get
\[
\begin{aligned}
\big\|\big(\Lambda_h^{-1}\hat{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}-\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big\|_2
&\le \big\|\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big\|_2\cdot\big\|\Lambda_h^{-1}\hat{\mu}_{h,2}(x)\Lambda_h^{-1}-\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big\|_2\cdot\big\|\big(\Lambda_h^{-1}\hat{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big\|_2\\
&\le O\big(\rho^{-1}(h)\big)\cdot O_P\big(n^{-\frac{1}{2}}\rho^{\frac{1}{2}}(h)\big)\cdot O_P\big(\rho^{-1}(h)\big) = O_P\big(n^{-\frac{1}{2}}\rho^{-\frac{3}{2}}(h)\big).
\end{aligned}
\]
This completes the proof. □

Hereafter, we assume that $\tilde{\mu}_{h,2}(x)$ and $\hat{\mu}_{h,2}(x)$ are invertible without loss of generality. Below, we prove that $\tilde{\sigma}_h(x) > 0$ for sufficiently large $n$ and that $\hat{\sigma}_h(x) > 0$ with probability tending to one.

Lemma B.3. Assume that the conditions L, D1 and D2 hold and that $\lim_{n\to\infty}\|h\|_2 = 0$. Then,
\[
\frac{1}{c_{h,0_d,1}(L)}\tilde{\sigma}_h(x) - f(x) = o(1).
\tag{B.26}
\]
If the condition $\lim_{n\to\infty} n\rho(h) = \infty$ further holds, then it holds that
\[
\frac{1}{c_{h,0_d,1}(L)}\big(\hat{\sigma}_h(x)-\tilde{\sigma}_h(x)\big) = O_P\big(n^{-\frac{1}{2}}\rho^{-\frac{1}{2}}(h)\big).
\tag{B.27}
\]

Proof of Lemma B.3. Lemma A.2 and Lemma A.3 imply that
\[
\frac{1}{c_{h,0_d,1}(L)}\tilde{\mu}_{h,0}(x) - f(x) = o(1).
\tag{B.28}
\]
Combining Lemma A.2, Lemma B.1 and (B.24), we get
\[
\begin{aligned}
\Big|\frac{1}{c_{h,0_d,1}(L)}\tilde{\mu}_{h,1}(x)^\top\tilde{\mu}_{h,2}(x)^{-1}\tilde{\mu}_{h,1}(x)\Big|
&= \Big|\frac{1}{c_{h,0_d,1}(L)}\big(\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big)^\top\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big(\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big)\Big|\\
&\le \frac{1}{c_{h,0_d,1}(L)}\cdot\big\|\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big\|_2^2\cdot\big\|\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big\|_2\\
&= O\big(\rho^{-1}(h)\big)\cdot o(\rho(h))^2\cdot O\big(\rho^{-1}(h)\big) = o(1).
\end{aligned}
\tag{B.29}
\]
Combining (B.28) and (B.29), we get (B.26).

Now, we prove (B.27). Combining Lemma A.2 and (A.22), we get
\[
\Big|\frac{1}{c_{h,0_d,1}(L)}\big(\hat{\mu}_{h,0}(x)-\tilde{\mu}_{h,0}(x)\big)\Big| = O_P\big(n^{-\frac{1}{2}}\rho^{-\frac{1}{2}}(h)\big).
\]
(B.30)

Note that
\[
\begin{aligned}
&\hat{\mu}_{h,1}(x)^\top\hat{\mu}_{h,2}(x)^{-1}\hat{\mu}_{h,1}(x) - \tilde{\mu}_{h,1}(x)^\top\tilde{\mu}_{h,2}(x)^{-1}\tilde{\mu}_{h,1}(x)\\
&\quad= \big(\Lambda_h^{-1}\hat{\mu}_{h,1}(x)\big)^\top\big(\Lambda_h^{-1}\hat{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big(\Lambda_h^{-1}\hat{\mu}_{h,1}(x)\big) - \big(\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big)^\top\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big(\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big)\\
&\quad= \big(\Lambda_h^{-1}\hat{\mu}_{h,1}(x)\big)^\top\Big(\big(\Lambda_h^{-1}\hat{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}-\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\Big)\big(\Lambda_h^{-1}\hat{\mu}_{h,1}(x)\big)\\
&\qquad + \big(\Lambda_h^{-1}\hat{\mu}_{h,1}(x)-\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big)^\top\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big(\Lambda_h^{-1}\hat{\mu}_{h,1}(x)+\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big).
\end{aligned}
\tag{B.31}
\]
Lemma B.1 and Lemma B.2 imply that
\[
\big\|\Lambda_h^{-1}\hat{\mu}_{h,1}(x)\big\|_2 \le \big\|\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big\|_2 + \big\|\Lambda_h^{-1}\hat{\mu}_{h,1}(x)-\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big\|_2 = o(\rho(h)) + O_P\big(n^{-\frac{1}{2}}\rho^{\frac{1}{2}}(h)\big) = o_P(\rho(h)).
\tag{B.32}
\]
Combining Lemma B.1, Lemma B.2, (B.24), (B.31) and (B.32), we get
\[
\begin{aligned}
\big|\hat{\mu}_{h,1}(x)^\top\hat{\mu}_{h,2}(x)^{-1}\hat{\mu}_{h,1}(x) - \tilde{\mu}_{h,1}(x)^\top\tilde{\mu}_{h,2}(x)^{-1}\tilde{\mu}_{h,1}(x)\big|
&\le \big\|\Lambda_h^{-1}\hat{\mu}_{h,1}(x)\big\|_2^2\cdot\big\|\big(\Lambda_h^{-1}\hat{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}-\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big\|_2\\
&\quad + \big\|\Lambda_h^{-1}\hat{\mu}_{h,1}(x)-\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big\|_2\cdot\big\|\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big\|_2\cdot\Big(\big\|\Lambda_h^{-1}\hat{\mu}_{h,1}(x)\big\|_2+\big\|\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big\|_2\Big)\\
&= o_P(\rho(h))^2\cdot O_P\big(n^{-\frac{1}{2}}\rho^{-\frac{3}{2}}(h)\big) + O_P\big(n^{-\frac{1}{2}}\rho^{\frac{1}{2}}(h)\big)\cdot O\big(\rho^{-1}(h)\big)\cdot\big(o_P(\rho(h))+o(\rho(h))\big)\\
&= o_P\big(n^{-\frac{1}{2}}\rho^{\frac{1}{2}}(h)\big).
\end{aligned}
\]
This with Lemma A.2 and (B.30) gives (B.27). □

Hereafter, we assume that $\tilde{\sigma}_h(x) > 0$ and $\hat{\sigma}_h(x) > 0$ without loss of generality.

Lemma B.4. Assume that the conditions L and D1–D3 hold and that $\lim_{n\to\infty}\|h\|_2 = 0$. Then, it holds that
\[
\sup_{y\in\mathcal{M}}\Big|\frac{\tilde{\tau}_{h,0}(x,y)-\tilde{\mu}_{h,1}(x)^\top\tilde{\mu}_{h,2}(x)^{-1}\tilde{\tau}_{h,1}(x,y)}{\tilde{\sigma}_h(x)} - g_y(x)\Big| = o(1).
\]

Proof of Lemma B.4.
We first prove that
\[
\sup_{y\in\mathcal{M}}\big\|\Lambda_h^{-1}\tilde{\tau}_{h,1}(x,y)\big\|_2 = o(\rho(h)).
\tag{B.33}
\]
The Cauchy–Schwarz inequality and (B.5) imply that
\[
\begin{aligned}
\big\|\Lambda_h^{-1}\tilde{\tau}_{h,1}(x,y)\big\|_2^2
&= \Big\|\Lambda_h^{-1}\int_{\mathbb{T}^d} L_{x,h}(z)\,\Phi_x^{-1}(z)\big((f\cdot g_y)(z)-(f\cdot g_y)(x)\big)\,\omega_d(\mathrm{d}z)\Big\|_2^2\\
&\le \Big(\int_{[-\pi,\pi)^d} L_h(\theta)\,\big|(f\cdot g_y)(\Phi_x(\theta))-(f\cdot g_y)(x)\big|\,\mathrm{d}\theta\Big)\cdot 2\sup_{z\in\mathbb{T}^d}(f\cdot g_y)(z)\cdot\Big(\int_{[-\pi,\pi)^d} L_h(\theta)\,\big\|\Lambda_h^{-1}\theta\big\|_2^2\,\mathrm{d}\theta\Big)
\end{aligned}
\tag{B.34}
\]
for any $y\in\mathcal{M}$. By arguing as in the proofs of Lemma A.3 and Lemma A.4, we get
\[
\int_{[-\pi,\pi)^d} L_h(\theta)\,\sup_{y\in\mathcal{M}}\big|(f\cdot g_y)(\Phi_x(\theta))-(f\cdot g_y)(x)\big|\,\mathrm{d}\theta = o(\rho(h)).
\tag{B.35}
\]
Combining (B.8), (B.34) and (B.35), we get (B.33). Note that
\[
\begin{aligned}
\tilde{\tau}_{h,0}(x,y)-\tilde{\mu}_{h,1}(x)^\top\tilde{\mu}_{h,2}(x)^{-1}\tilde{\tau}_{h,1}(x,y)-g_y(x)\tilde{\sigma}_h(x)
&= \big(\tilde{\tau}_{h,0}(x,y)-g_y(x)\tilde{\mu}_{h,0}(x)\big) - \tilde{\mu}_{h,1}(x)^\top\tilde{\mu}_{h,2}(x)^{-1}\big(\tilde{\tau}_{h,1}(x,y)-g_y(x)\tilde{\mu}_{h,1}(x)\big)\\
&= \big(\tilde{\tau}_{h,0}(x,y)-g_y(x)\tilde{\mu}_{h,0}(x)\big)\\
&\quad - \big(\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big)^\top\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big(\Lambda_h^{-1}\tilde{\tau}_{h,1}(x,y)-g_y(x)\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big)
\end{aligned}
\tag{B.36}
\]
for any $y\in\mathcal{M}$. Combining Lemma B.1, (A.17), (B.24), (B.33) and (B.36), we get
\[
\begin{aligned}
\sup_{y\in\mathcal{M}}\big|\tilde{\tau}_{h,0}(x,y)-\tilde{\mu}_{h,1}(x)^\top\tilde{\mu}_{h,2}(x)^{-1}\tilde{\tau}_{h,1}(x,y)-g_y(x)\tilde{\sigma}_h(x)\big|
&\le \sup_{y\in\mathcal{M}}\big|\tilde{\tau}_{h,0}(x,y)-g_y(x)\tilde{\mu}_{h,0}(x)\big|\\
&\quad + \big\|\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big\|_2\cdot\big\|\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big\|_2\cdot\Big(\sup_{y\in\mathcal{M}}\big\|\Lambda_h^{-1}\tilde{\tau}_{h,1}(x,y)\big\|_2 + \sup_{y\in\mathcal{M}} g_y(x)\cdot\big\|\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big\|_2\Big)\\
&= o(\rho(h)) + o(\rho(h))\cdot O\big(\rho^{-1}(h)\big)\cdot\big(o(\rho(h))+o(\rho(h))\big) = o(\rho(h)).
\end{aligned}
\tag{B.37}
\]
Also, Lemma A.2 and (B.26) imply that
\[
\lim_{n\to\infty}\frac{\rho(h)}{\tilde{\sigma}_h(x)} = 2^{-\frac{3d}{2}}\frac{1}{a_{0,1}^d(L)\,f(x)} > 0.
\tag{B.38}
\]
Combining (B.37) and (B.38), we get the desired result. □

Lemma B.5.
Assume that the conditions L and D1–D3 hold and that $\lim_{n\to\infty}\|h\|_2 = 0$. Then, it holds that
\[
\sup_{y\in\mathcal{M}}\big|\tilde{M}_{h,1}(x,y)-M_\oplus(x,y)\big| = o(1).
\]

Proof of Lemma B.5. The lemma follows by arguing as in the proof of Lemma C.6 of Im et al. (2025). □

Lemma B.6. Assume that the conditions L, D1–D3 and M1 hold and that $\lim_{n\to\infty}\|h\|_2 = 0$. Then, it holds that
\[
d_{\mathcal{M}}\big(\tilde{m}_{h,1}(x), m_\oplus(x)\big) = o(1).
\]

Proof of Lemma B.6. The lemma follows by arguing as in the proof of Lemma C.7 of Im et al. (2025) and using Lemma B.5. □

Lemma B.7. Assume that the conditions L, B, D1 and D2 hold. Then, it holds that
\[
\sup_{y\in\mathcal{M}}\big|\hat{M}_{h,1}(x,y)-\tilde{M}_{h,1}(x,y)\big| = o_P(1).
\]

Proof of Lemma B.7. We first prove that
\[
\big|\hat{M}_{h,1}(x,y)-\tilde{M}_{h,1}(x,y)\big| = o_P(1)
\tag{B.39}
\]
for any $y\in\mathcal{M}$. By (B.1), we get
\[
\begin{aligned}
\big|\hat{M}_{h,1}(x,y)-\tilde{M}_{h,1}(x,y)\big|
&\le \Big|\frac{1}{\hat{\sigma}_h(x)}\Big|\cdot\big|\hat{\nu}_{h,0}(x,y)-\tilde{\nu}_{h,0}(x,y)\big|\\
&\quad + \Big|\frac{1}{\hat{\sigma}_h(x)}\Big|\cdot\big|\hat{\mu}_{h,1}(x)^\top\hat{\mu}_{h,2}(x)^{-1}\hat{\nu}_{h,1}(x,y)-\tilde{\mu}_{h,1}(x)^\top\tilde{\mu}_{h,2}(x)^{-1}\tilde{\nu}_{h,1}(x,y)\big|\\
&\quad + \Big|\frac{1}{\tilde{\sigma}_h(x)}\Big|\cdot\Big|\frac{1}{\hat{\sigma}_h(x)}\Big|\cdot\big|\hat{\sigma}_h(x)-\tilde{\sigma}_h(x)\big|\cdot\big|\tilde{\nu}_{h,0}(x,y)-\tilde{\mu}_{h,1}(x)^\top\tilde{\mu}_{h,2}(x)^{-1}\tilde{\nu}_{h,1}(x,y)\big|.
\end{aligned}
\tag{B.40}
\]
By (A.22), we also get
\[
\big|\hat{\nu}_{h,0}(x,y)-\tilde{\nu}_{h,0}(x,y)\big| = O_P\big(n^{-\frac{1}{2}}\rho^{\frac{1}{2}}(h)\big).
\tag{B.41}
\]
Similarly to (B.21), we get
\[
\begin{aligned}
\big\|\Lambda_h^{-1}\tilde{\nu}_{h,1}(x,y)\big\|_2^2
&\le \big\|\mathrm{E}\big[L_{x,h}(X)\Lambda_h^{-1}\Phi_x^{-1}(X)\,d_{\mathcal{M}}^2(Y,y)\big]\big\|_2^2
\le \mathrm{E}\Big[L_{x,h}^2(X)\,\big\|\Lambda_h^{-1}\Phi_x^{-1}(X)\big\|_2^2\,d_{\mathcal{M}}^4(Y,y)\Big]\\
&\le \mathrm{diam}(\mathcal{M})^4\cdot\mathrm{E}\Big[L_{x,h}^2(X)\,\big\|\Lambda_h^{-1}\Phi_x^{-1}(X)\big\|_2^2\Big]
\le \mathrm{diam}(\mathcal{M})^4\cdot\sup_{z\in\mathbb{T}^d} f(z)\cdot\int_{[-\pi,\pi)^d} L_h^2(\theta)\,\big\|\Lambda_h^{-1}\theta\big\|_2^2\,\mathrm{d}\theta.
\end{aligned}
\tag{B.42}
\]
Combining Lemma C.2 of Im et al.
(2025), (B.8) and (B.42), we get
\[
\big\|\Lambda_h^{-1}\tilde{\nu}_{h,1}(x,y)\big\|_2 = O(\rho(h)),\qquad
\big\|\Lambda_h^{-1}\hat{\nu}_{h,1}(x,y)-\Lambda_h^{-1}\tilde{\nu}_{h,1}(x,y)\big\|_2 = O_P\big(n^{-\frac{1}{2}}\rho^{\frac{1}{2}}(h)\big).
\tag{B.43}
\]
Note that
\[
\begin{aligned}
&\hat{\mu}_{h,1}(x)^\top\hat{\mu}_{h,2}(x)^{-1}\hat{\nu}_{h,1}(x,y)-\tilde{\mu}_{h,1}(x)^\top\tilde{\mu}_{h,2}(x)^{-1}\tilde{\nu}_{h,1}(x,y)\\
&\quad= \big(\Lambda_h^{-1}\hat{\mu}_{h,1}(x)\big)^\top\big(\Lambda_h^{-1}\hat{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big(\Lambda_h^{-1}\hat{\nu}_{h,1}(x,y)\big) - \big(\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big)^\top\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big(\Lambda_h^{-1}\tilde{\nu}_{h,1}(x,y)\big)\\
&\quad= \big(\Lambda_h^{-1}\hat{\mu}_{h,1}(x)\big)^\top\big(\Lambda_h^{-1}\hat{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big(\Lambda_h^{-1}\hat{\nu}_{h,1}(x,y)-\Lambda_h^{-1}\tilde{\nu}_{h,1}(x,y)\big)\\
&\qquad + \big(\Lambda_h^{-1}\hat{\mu}_{h,1}(x)\big)^\top\Big(\big(\Lambda_h^{-1}\hat{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}-\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\Big)\big(\Lambda_h^{-1}\tilde{\nu}_{h,1}(x,y)\big)\\
&\qquad + \big(\Lambda_h^{-1}\hat{\mu}_{h,1}(x)-\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big)^\top\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big(\Lambda_h^{-1}\tilde{\nu}_{h,1}(x,y)\big).
\end{aligned}
\tag{B.44}
\]
Combining (B.19), (B.20), (B.24), (B.25), (B.32), (B.43) and (B.44), we get
\[
\begin{aligned}
\big|\hat{\mu}_{h,1}(x)^\top\hat{\mu}_{h,2}(x)^{-1}\hat{\nu}_{h,1}(x,y)-\tilde{\mu}_{h,1}(x)^\top\tilde{\mu}_{h,2}(x)^{-1}\tilde{\nu}_{h,1}(x,y)\big|
&\le \big\|\Lambda_h^{-1}\hat{\mu}_{h,1}(x)\big\|_2\cdot\big\|\big(\Lambda_h^{-1}\hat{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big\|_2\cdot\big\|\Lambda_h^{-1}\hat{\nu}_{h,1}(x,y)-\Lambda_h^{-1}\tilde{\nu}_{h,1}(x,y)\big\|_2\\
&\quad + \big\|\Lambda_h^{-1}\hat{\mu}_{h,1}(x)\big\|_2\cdot\big\|\big(\Lambda_h^{-1}\hat{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}-\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big\|_2\cdot\big\|\Lambda_h^{-1}\tilde{\nu}_{h,1}(x,y)\big\|_2\\
&\quad + \big\|\Lambda_h^{-1}\hat{\mu}_{h,1}(x)-\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big\|_2\cdot\big\|\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big\|_2\cdot\big\|\Lambda_h^{-1}\tilde{\nu}_{h,1}(x,y)\big\|_2\\
&= o_P(\rho(h))\cdot O_P\big(\rho^{-1}(h)\big)\cdot O_P\big(n^{-\frac{1}{2}}\rho^{\frac{1}{2}}(h)\big) + o_P(\rho(h))\cdot O_P\big(n^{-\frac{1}{2}}\rho^{-\frac{3}{2}}(h)\big)\cdot O(\rho(h))\\
&\quad + O_P\big(n^{-\frac{1}{2}}\rho^{\frac{1}{2}}(h)\big)\cdot O\big(\rho^{-1}(h)\big)\cdot O(\rho(h)) = O_P\big(n^{-\frac{1}{2}}\rho^{\frac{1}{2}}(h)\big).
\end{aligned}
\]
(B.45)

Lemma A.2 and Lemma A.3 imply that
\[
0 \le \tilde{\nu}_{h,0}(x,y) \le \mathrm{diam}(\mathcal{M})^2\,\mathrm{E}[L_{x,h}(X)] = O(\rho(h)).
\tag{B.46}
\]
Combining (B.2), (B.24), (B.43) and (B.46), we get
\[
\begin{aligned}
\big|\tilde{\nu}_{h,0}(x,y)-\tilde{\mu}_{h,1}(x)^\top\tilde{\mu}_{h,2}(x)^{-1}\tilde{\nu}_{h,1}(x,y)\big|
&= \big|\tilde{\nu}_{h,0}(x,y)-\big(\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big)^\top\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big(\Lambda_h^{-1}\tilde{\nu}_{h,1}(x,y)\big)\big|\\
&\le \big|\tilde{\nu}_{h,0}(x,y)\big| + \big\|\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big\|_2\cdot\big\|\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big\|_2\cdot\big\|\Lambda_h^{-1}\tilde{\nu}_{h,1}(x,y)\big\|_2\\
&= O(\rho(h)) + o(\rho(h))\cdot O\big(\rho^{-1}(h)\big)\cdot O(\rho(h)) = O(\rho(h)).
\end{aligned}
\tag{B.47}
\]
Lemma B.3 implies that
\[
\Big|\frac{1}{c_{h,0_d,1}(L)}\hat{\sigma}_h(x)-f(x)\Big| \le \Big|\frac{1}{c_{h,0_d,1}(L)}\tilde{\sigma}_h(x)-f(x)\Big| + \Big|\frac{1}{c_{h,0_d,1}(L)}\big(\hat{\sigma}_h(x)-\tilde{\sigma}_h(x)\big)\Big| = o(1) + O_P\big(n^{-\frac{1}{2}}\rho^{-\frac{1}{2}}(h)\big) = o_P(1).
\tag{B.48}
\]
Combining Lemma A.2 and (B.48), we get
\[
\Big|\frac{1}{\hat{\sigma}_h(x)}\Big| = \frac{1}{c_{h,0_d,1}(L)}\Big|\frac{1}{c_{h,0_d,1}(L)}\hat{\sigma}_h(x)\Big|^{-1} = O\big(\rho^{-1}(h)\big)\,O_P(1) = O_P\big(\rho^{-1}(h)\big).
\tag{B.49}
\]
Combining Lemma A.2, Lemma B.3, (B.38), (B.40), (B.41), (B.45), (B.47) and (B.49), we also get
\[
\big|\hat{M}_{h,1}(x,y)-\tilde{M}_{h,1}(x,y)\big|
= O_P\big(\rho^{-1}(h)\big)\cdot O_P\big(n^{-\frac{1}{2}}\rho^{\frac{1}{2}}(h)\big) + O_P\big(\rho^{-1}(h)\big)\cdot O_P\big(n^{-\frac{1}{2}}\rho^{\frac{1}{2}}(h)\big) + O\big(\rho^{-1}(h)\big)\cdot O_P\big(\rho^{-1}(h)\big)\cdot O_P\big(n^{-\frac{1}{2}}\rho^{\frac{1}{2}}(h)\big)\cdot O(\rho(h)) = O_P\big(n^{-\frac{1}{2}}\rho^{-\frac{1}{2}}(h)\big),
\]
which gives (B.39). Now, we prove the uniform version of (B.39) over $y\in\mathcal{M}$.
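The uniform extension over $y\in\mathcal{M}$ rests on the Lipschitz property of squared distances on a bounded metric space: $|d_{\mathcal{M}}^2(w,y_1)-d_{\mathcal{M}}^2(w,y_2)| \le 2\,\mathrm{diam}(\mathcal{M})\,d_{\mathcal{M}}(y_1,y_2)$, which follows from $a^2-b^2=(a+b)(a-b)$ and the reverse triangle inequality. A quick numerical check on the unit circle with its geodesic (arc-length) metric, an example space chosen here purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def d_circ(a, b):
    """Geodesic (arc-length) distance on the unit circle; angles in [0, 2*pi)."""
    gap = np.abs(a - b) % (2 * np.pi)
    return np.minimum(gap, 2 * np.pi - gap)

diam = np.pi  # diameter of the circle under the geodesic metric
w, y1, y2 = rng.uniform(0, 2 * np.pi, size=(3, 100000))
lhs = np.abs(d_circ(w, y1) ** 2 - d_circ(w, y2) ** 2)   # |d^2(w,y1) - d^2(w,y2)|
rhs = 2 * diam * d_circ(y1, y2)                          # 2 diam(M) d(y1, y2)
assert np.all(lhs <= rhs + 1e-12)
```

The inequality holds exactly, so the small tolerance only guards against floating-point error; it is this bound that turns the pointwise rate (B.39) into a rate that is uniform in $y$ via the chaining argument below.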
First, we define functions $\tilde{R}_h, \hat{R}_h : \mathbb{T}^d \to \mathbb{R}$ as
\[
\begin{aligned}
\tilde{R}_h(x) &:= \tilde{\mu}_{h,0}(x) + \big\|\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big\|_2\cdot\big\|\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big\|_2\cdot\mathrm{E}\big[L_{x,h}(X)\big\|\Lambda_h^{-1}\Phi_x^{-1}(X)\big\|_2\big],\\
\hat{R}_h(x) &:= \hat{\mu}_{h,0}(x) + \big\|\Lambda_h^{-1}\hat{\mu}_{h,1}(x)\big\|_2\cdot\big\|\big(\Lambda_h^{-1}\hat{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big\|_2\cdot n^{-1}\sum_{i=1}^n L_{x,h}\big(X^{(i)}\big)\big\|\Lambda_h^{-1}\theta^{(i)}\big\|_2.
\end{aligned}
\]
Note that
\[
\begin{aligned}
\big|\tilde{\nu}_{h,0}(x,y_1)-\tilde{\nu}_{h,0}(x,y_2)\big|
&= \big|\mathrm{E}\big[L_{x,h}(X)\big(d_{\mathcal{M}}^2(Y,y_1)-d_{\mathcal{M}}^2(Y,y_2)\big)\big]\big|
\le \mathrm{E}\big[L_{x,h}(X)\big|d_{\mathcal{M}}^2(Y,y_1)-d_{\mathcal{M}}^2(Y,y_2)\big|\big]\\
&\le 2\,\mathrm{diam}(\mathcal{M})\,d_{\mathcal{M}}(y_1,y_2)\,\tilde{\mu}_{h,0}(x),\\
\big\|\Lambda_h^{-1}\big(\tilde{\nu}_{h,1}(x,y_1)-\tilde{\nu}_{h,1}(x,y_2)\big)\big\|_2
&= \big\|\mathrm{E}\big[L_{x,h}(X)\Lambda_h^{-1}\Phi_x^{-1}(X)\big(d_{\mathcal{M}}^2(Y,y_1)-d_{\mathcal{M}}^2(Y,y_2)\big)\big]\big\|_2\\
&\le \mathrm{E}\big[L_{x,h}(X)\big\|\Lambda_h^{-1}\Phi_x^{-1}(X)\big\|_2\big|d_{\mathcal{M}}^2(Y,y_1)-d_{\mathcal{M}}^2(Y,y_2)\big|\big]\\
&\le 2\,\mathrm{diam}(\mathcal{M})\,d_{\mathcal{M}}(y_1,y_2)\,\mathrm{E}\big[L_{x,h}(X)\big\|\Lambda_h^{-1}\Phi_x^{-1}(X)\big\|_2\big]
\end{aligned}
\tag{B.50}
\]
for any $y_1, y_2\in\mathcal{M}$. Combining (B.1) and (B.50), we get
\[
\begin{aligned}
\big|\tilde{\sigma}_h(x)\big|\cdot\big|\tilde{M}_{h,1}(x,y_1)-\tilde{M}_{h,1}(x,y_2)\big|
&\le \big|\tilde{\nu}_{h,0}(x,y_1)-\tilde{\nu}_{h,0}(x,y_2)\big| + \big|\tilde{\mu}_{h,1}(x)^\top\tilde{\mu}_{h,2}(x)^{-1}\big(\tilde{\nu}_{h,1}(x,y_1)-\tilde{\nu}_{h,1}(x,y_2)\big)\big|\\
&= \big|\tilde{\nu}_{h,0}(x,y_1)-\tilde{\nu}_{h,0}(x,y_2)\big| + \big|\big(\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big)^\top\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big(\Lambda_h^{-1}\tilde{\nu}_{h,1}(x,y_1)-\Lambda_h^{-1}\tilde{\nu}_{h,1}(x,y_2)\big)\big|\\
&\le \big|\tilde{\nu}_{h,0}(x,y_1)-\tilde{\nu}_{h,0}(x,y_2)\big| + \big\|\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big\|_2\cdot\big\|\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big\|_2\cdot\big\|\Lambda_h^{-1}\tilde{\nu}_{h,1}(x,y_1)-\Lambda_h^{-1}\tilde{\nu}_{h,1}(x,y_2)\big\|_2\\
&\le 2\,\mathrm{diam}(\mathcal{M})\,d_{\mathcal{M}}(y_1,y_2)\,\tilde{R}_h(x).
\end{aligned}
\]
Similarly, we get
\[
\big|\hat{\nu}_{h,0}(x,y_1)-\hat{\nu}_{h,0}(x,y_2)\big| \le 2\,\mathrm{diam}(\mathcal{M})\,d_{\mathcal{M}}(y_1,y_2)\,\hat{\mu}_{h,0}(x),\qquad
\big\|\Lambda_h^{-1}\big(\hat{\nu}_{h,1}(x,y_1)-\hat{\nu}_{h,1}(x,y_2)\big)\big\|_2 \le 2\,\mathrm{diam}(\mathcal{M})\,d_{\mathcal{M}}(y_1,y_2)\cdot n^{-1}\sum_{i=1}^n L_{x,h}\big(X^{(i)}\big)\big\|\Lambda_h^{-1}\theta^{(i)}\big\|_2.
\tag{B.51}
\]
Also, (B.1) and (B.51) imply that
\[
\big|\hat{\sigma}_h(x)\big|\cdot\big|\hat{M}_{h,1}(x,y_1)-\hat{M}_{h,1}(x,y_2)\big| \le 2\,\mathrm{diam}(\mathcal{M})\,d_{\mathcal{M}}(y_1,y_2)\,\hat{R}_h(x).
\]
Similarly to (B.42), we get
\[
\Big(\mathrm{E}\big[L_{x,h}(X)\big\|\Lambda_h^{-1}\Phi_x^{-1}(X)\big\|_2\big]\Big)^2 \le \mathrm{E}\Big[L_{x,h}^2(X)\big\|\Lambda_h^{-1}\Phi_x^{-1}(X)\big\|_2^2\Big] \le \sup_{z\in\mathbb{T}^d} f(z)\cdot\int_{[-\pi,\pi)^d} L_h^2(\theta)\,\big\|\Lambda_h^{-1}\theta\big\|_2^2\,\mathrm{d}\theta.
\tag{B.52}
\]
Combining Lemma C.2 of Im et al. (2025), (B.8) and (B.52), we get
\[
\mathrm{E}\big[L_{x,h}(X)\big\|\Lambda_h^{-1}\Phi_x^{-1}(X)\big\|_2\big] = O(\rho(h)),\qquad
n^{-1}\sum_{i=1}^n L_{x,h}\big(X^{(i)}\big)\big\|\Lambda_h^{-1}\theta^{(i)}\big\|_2 - \mathrm{E}\big[L_{x,h}(X)\big\|\Lambda_h^{-1}\Phi_x^{-1}(X)\big\|_2\big] = O_P\big(n^{-\frac{1}{2}}\rho^{\frac{1}{2}}(h)\big).
\tag{B.53}
\]
Combining Lemma A.2, (B.28), (B.30) and (B.53), we also get
\[
\hat{\mu}_{h,0}(x) = O(\rho(h)) + O_P\big(n^{-\frac{1}{2}}\rho^{\frac{1}{2}}(h)\big) = O_P(\rho(h)),\qquad
n^{-1}\sum_{i=1}^n L_{x,h}\big(X^{(i)}\big)\big\|\Lambda_h^{-1}\theta^{(i)}\big\|_2 = O(\rho(h)) + O_P\big(n^{-\frac{1}{2}}\rho^{\frac{1}{2}}(h)\big) = O_P(\rho(h)).
\tag{B.54}
\]
Combining (B.2), (B.24), (B.25), (B.32), (B.38), (B.49), (B.53) and (B.54), we get
\[
\begin{aligned}
\Big|\frac{1}{\tilde{\sigma}_h(x)}\Big|\cdot\tilde{R}_h(x) &= O\big(\rho^{-1}(h)\big)\cdot\Big(O(\rho(h)) + O(\rho(h))\cdot O\big(\rho^{-1}(h)\big)\cdot O(\rho(h))\Big) = O(1),\\
\Big|\frac{1}{\hat{\sigma}_h(x)}\Big|\cdot\hat{R}_h(x) &= O_P\big(\rho^{-1}(h)\big)\cdot\Big(O_P(\rho(h)) + O_P(\rho(h))\cdot O_P\big(\rho^{-1}(h)\big)\cdot O_P(\rho(h))\Big) = O_P(1).
\end{aligned}
\tag{B.55}
\]
The rest of the proof proceeds as in the proof of Lemma C.8 of Im et al. (2025). □

Lemma B.8. Assume that the conditions L, B, D1, D2 and M1 hold. Then, it holds that
\[
d_{\mathcal{M}}\big(\hat{m}_{h,1}(x), \tilde{m}_{h,1}(x)\big) = o_P(1).
\]

Proof of Lemma B.8. The lemma follows by arguing as in the proof of Lemma A.8 and using Lemma B.7.
□

B.2 Proof of Theorem 3.1 for $\hat{m}_{h,1}(x)$

The theorem follows from Lemma B.6 and Lemma B.8.

B.3 Lemmas for proof of Theorem 3.2

Lemma B.9. Assume that the conditions L and D4 hold and that $\lim_{n\to\infty}\|h\|_2 = 0$. Then, it holds that
\[
\big\|\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big\|_2 = O\big(\rho(h)\|h\|_2\big).
\tag{B.56}
\]
If the condition D5 further holds, then it holds that
\[
\sup_{y\in\mathcal{M}}\big\|\Lambda_h^{-1}\tilde{\tau}_{h,1}(x,y)\big\|_2 = O\big(\rho(h)\|h\|_2\big).
\tag{B.57}
\]

Proof of Lemma B.9. We first prove (B.56). Combining Lemma A.9, (A.33), (B.6), (B.8) and the Cauchy–Schwarz inequality, we get
\[
\begin{aligned}
\big\|\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big\|_2^2
&\le \Big(\int_{[-\pi,\pi)^d} L_h(\theta)\,\big\|\Lambda_h^{-1}\theta\big\|_2\,\big|f(\Phi_x(\theta))-f(x)\big|\,\mathrm{d}\theta\Big)^2
\le \Big(\int_{[-\pi,\pi)^d} L_h(\theta)\,\big\|\Lambda_h^{-1}\theta\big\|_2\,\|\theta\|_2\Big(\sup_{z\in\mathbb{T}^d}\big\|\nabla\bar{f}(z)\big\|_2\Big)\mathrm{d}\theta\Big)^2\\
&\le \Big(\sup_{z\in\mathbb{T}^d}\big\|\nabla\bar{f}(z)\big\|_2\Big)^2\cdot\Big(\int_{[-\pi,\pi)^d} L_h(\theta)\,\|\theta\|_2^2\,\mathrm{d}\theta\Big)\cdot\Big(\int_{[-\pi,\pi)^d} L_h(\theta)\,\big\|\Lambda_h^{-1}\theta\big\|_2^2\,\mathrm{d}\theta\Big)\\
&\le O\big(\rho(h)\|h\|_2^2\big)\cdot O(\rho(h)) = O\big(\rho^2(h)\|h\|_2^2\big).
\end{aligned}
\]
Using Lemma A.10, we similarly get
\[
\sup_{y\in\mathcal{M}}\big\|\Lambda_h^{-1}\tilde{\tau}_{h,1}(x,y)\big\|_2
\le \sup_{y\in\mathcal{M}}\int_{[-\pi,\pi)^d} L_h(\theta)\,\big\|\Lambda_h^{-1}\theta\big\|_2\,\big|(f\cdot g_y)(\Phi_x(\theta))-(f\cdot g_y)(x)\big|\,\mathrm{d}\theta
\le \sup_{y\in\mathcal{M}}\sup_{z\in\mathbb{T}^d}\big\|\nabla\big(\bar{f}\cdot\bar{g}_y\big)(z)\big\|_2\cdot\int_{[-\pi,\pi)^d} L_h(\theta)\,\big\|\Lambda_h^{-1}\theta\big\|_2\,\|\theta\|_2\,\mathrm{d}\theta
= O\big(\rho(h)\|h\|_2\big).
\]
This completes the proof. □

Lemma B.10. Assume that the conditions L, D1, D4, D5, M1 and M2 hold and that $\lim_{n\to\infty}\|h\|_2 = 0$. Then, it holds that
\[
d_{\mathcal{M}}\big(\tilde{m}_{h,1}(x), m_\oplus(x)\big)^{\beta_\oplus-1} = O\big(\|h\|_2^2\big).
\]

Proof of Lemma B.10. By arguing as in the proof of Lemma C.14 of Im et al. (2025), we get
\[
M_\oplus\big(x,\tilde{m}_{h,1}(x)\big) - M_\oplus\big(x,m_\oplus(x)\big)
\le 2\,\mathrm{diam}(\mathcal{M})\cdot d_{\mathcal{M}}\big(\tilde{m}_{h,1}(x), m_\oplus(x)\big)\cdot\sup_{y\in\mathcal{M}}\Big|\frac{\tilde{\tau}_{h,0}(x,y)-\tilde{\mu}_{h,1}(x)^\top\tilde{\mu}_{h,2}(x)^{-1}\tilde{\tau}_{h,1}(x,y)}{\tilde{\sigma}_h(x)} - g_y(x)\Big|.
\]
(B.58)

Combining (A.35), (B.24), (B.56) and (B.57), we get
\[
\begin{aligned}
\sup_{y\in\mathcal{M}}\big|\tilde{\tau}_{h,0}(x,y)-\tilde{\mu}_{h,1}(x)^\top\tilde{\mu}_{h,2}(x)^{-1}\tilde{\tau}_{h,1}(x,y)-g_y(x)\tilde{\sigma}_h(x)\big|
&\le \sup_{y\in\mathcal{M}}\big|\tilde{\tau}_{h,0}(x,y)-g_y(x)\tilde{\mu}_{h,0}(x)\big| + \sup_{y\in\mathcal{M}}\big|\tilde{\mu}_{h,1}(x)^\top\tilde{\mu}_{h,2}(x)^{-1}\big(\tilde{\tau}_{h,1}(x,y)-g_y(x)\tilde{\mu}_{h,1}(x)\big)\big|\\
&= \sup_{y\in\mathcal{M}}\big|\tilde{\tau}_{h,0}(x,y)-g_y(x)\tilde{\mu}_{h,0}(x)\big|\\
&\quad + \sup_{y\in\mathcal{M}}\big|\big(\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big)^\top\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big(\Lambda_h^{-1}\tilde{\tau}_{h,1}(x,y)-g_y(x)\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big)\big|\\
&\le \sup_{y\in\mathcal{M}}\big|\tilde{\tau}_{h,0}(x,y)-g_y(x)\tilde{\mu}_{h,0}(x)\big| + \sup_{y\in\mathcal{M}} g_y(x)\cdot\big\|\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big\|_2^2\cdot\big\|\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big\|_2\\
&\quad + \big\|\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big\|_2\cdot\big\|\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big\|_2\cdot\sup_{y\in\mathcal{M}}\big\|\Lambda_h^{-1}\tilde{\tau}_{h,1}(x,y)\big\|_2\\
&= O\big(\rho(h)\|h\|_2^2\big) + O\big(\rho^2(h)\|h\|_2^2\big)\cdot O\big(\rho^{-1}(h)\big) + O(\rho(h)\|h\|_2)\cdot O\big(\rho^{-1}(h)\big)\cdot O(\rho(h)\|h\|_2) = O\big(\rho(h)\|h\|_2^2\big).
\end{aligned}
\tag{B.59}
\]
Combining (B.38) and (B.59), we also get
\[
\sup_{y\in\mathcal{M}}\Big|\frac{\tilde{\tau}_{h,0}(x,y)-\tilde{\mu}_{h,1}(x)^\top\tilde{\mu}_{h,2}(x)^{-1}\tilde{\tau}_{h,1}(x,y)}{\tilde{\sigma}_h(x)} - g_y(x)\Big| = O\big(\|h\|_2^2\big).
\tag{B.60}
\]
Lemma B.6 and the condition M2 imply that there exists $N\in\mathbb{N}$ such that
\[
C_\oplus\cdot d_{\mathcal{M}}\big(\tilde{m}_{h,1}(x), m_\oplus(x)\big)^{\beta_\oplus} \le M_\oplus\big(x,\tilde{m}_{h,1}(x)\big) - M_\oplus\big(x,m_\oplus(x)\big)
\tag{B.61}
\]
whenever $n\ge N$, where $C_\oplus > 0$ is the constant defined in the condition M2. Combining (B.58), (B.60) and (B.61), we get the desired result. □

Lemma B.11. Assume that the conditions L, B, D1–D3 and M1–M3 hold. Then, it holds that
\[
d_{\mathcal{M}}\big(\hat{m}_{h,1}(x), \tilde{m}_{h,1}(x)\big)^{\beta_\oplus-\alpha_{\mathcal{M}}} = O_P\big(n^{-\frac{1}{2}}\rho^{-\frac{1}{2}}(h)\big).
\]

Proof of Lemma B.11.
We define functions $\hat{T}_{h,1}:\mathbb{T}^d\times\mathcal{M}\to\mathbb{R}$, $U_{h,1}:\mathbb{T}^d\times\mathcal{M}\times\mathcal{M}\to\mathbb{R}$ and $\hat{S}_{h,1}:\mathbb{T}^d\times\mathcal{M}\to\mathbb{R}$ as
\[
\hat{T}_{h,1}(x,y) := \hat{M}_{h,1}(x,y)-\tilde{M}_{h,1}(x,y),\qquad
U_{h,1}(z,w,y) := D_{h,1}(x,w,y)\,\tilde{W}_{x,h}(z),\qquad
\hat{S}_{h,1}(x,y) := n^{-1}\sum_{i=1}^n U_{h,1}\big(X^{(i)},Y^{(i)},y\big) - \mathrm{E}[U_{h,1}(X,Y,y)],
\]
where $D_{h,1}(x,w,y) := d_{\mathcal{M}}^2(w,y) - d_{\mathcal{M}}^2\big(w,\tilde{m}_{h,1}(x)\big)$. By arguing as in (A.38), we get
\[
\big|\hat{T}_{h,1}(x,y)-\hat{T}_{h,1}\big(x,\tilde{m}_{h,1}(x)\big)\big|
\le 2\,\mathrm{diam}(\mathcal{M})\cdot d_{\mathcal{M}}\big(y,\tilde{m}_{h,1}(x)\big)\cdot n^{-1}\sum_{i=1}^n\big|\hat{W}_{x,h}\big(X^{(i)}\big)-\tilde{W}_{x,h}\big(X^{(i)}\big)\big| + \big|\hat{S}_{h,1}(x,y)\big|.
\tag{B.62}
\]
Regarding the first term in (B.62), note that
\[
\begin{aligned}
\tilde{\sigma}_h(x)\hat{\sigma}_h(x)\big(\hat{W}_{x,h}(X^{(i)})-\tilde{W}_{x,h}(X^{(i)})\big)
&= \big(\tilde{\sigma}_h(x)-\hat{\sigma}_h(x)\big)L_{x,h}\big(X^{(i)}\big) - \big(\tilde{\sigma}_h(x)\hat{\mu}_{h,1}(x)^\top\hat{\mu}_{h,2}(x)^{-1} - \hat{\sigma}_h(x)\tilde{\mu}_{h,1}(x)^\top\tilde{\mu}_{h,2}(x)^{-1}\big)\theta^{(i)}L_{x,h}\big(X^{(i)}\big)\\
&= \big(\tilde{\sigma}_h(x)-\hat{\sigma}_h(x)\big)L_{x,h}\big(X^{(i)}\big) - \big(\tilde{\sigma}_h(x)\hat{\mu}_{h,1}(x)^\top\hat{\mu}_{h,2}(x)^{-1}\Lambda_h - \hat{\sigma}_h(x)\tilde{\mu}_{h,1}(x)^\top\tilde{\mu}_{h,2}(x)^{-1}\Lambda_h\big)\Lambda_h^{-1}\theta^{(i)}L_{x,h}\big(X^{(i)}\big).
\end{aligned}
\]
This implies that
\[
\begin{aligned}
\big|\tilde{\sigma}_h(x)\big|\cdot\big|\hat{\sigma}_h(x)\big|\cdot n^{-1}\sum_{i=1}^n\big|\hat{W}_{x,h}\big(X^{(i)}\big)-\tilde{W}_{x,h}\big(X^{(i)}\big)\big|
&\le \big|\tilde{\sigma}_h(x)-\hat{\sigma}_h(x)\big|\cdot\hat{\mu}_{h,0}(x)\\
&\quad + \big\|\tilde{\sigma}_h(x)\hat{\mu}_{h,1}(x)^\top\hat{\mu}_{h,2}(x)^{-1}\Lambda_h - \hat{\sigma}_h(x)\tilde{\mu}_{h,1}(x)^\top\tilde{\mu}_{h,2}(x)^{-1}\Lambda_h\big\|_2\cdot n^{-1}\sum_{i=1}^n\big\|\Lambda_h^{-1}\theta^{(i)}\big\|_2\, L_{x,h}\big(X^{(i)}\big).
\end{aligned}
\]
(B.63)

Note that
\[
\begin{aligned}
\tilde{\sigma}_h(x)\hat{\mu}_{h,1}(x)^\top\hat{\mu}_{h,2}(x)^{-1}\Lambda_h - \hat{\sigma}_h(x)\tilde{\mu}_{h,1}(x)^\top\tilde{\mu}_{h,2}(x)^{-1}\Lambda_h
&= \tilde{\sigma}_h(x)\big(\Lambda_h^{-1}\hat{\mu}_{h,1}(x)\big)^\top\big(\Lambda_h^{-1}\hat{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1} - \hat{\sigma}_h(x)\big(\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big)^\top\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\\
&= \tilde{\sigma}_h(x)\big(\Lambda_h^{-1}\hat{\mu}_{h,1}(x)\big)^\top\Big(\big(\Lambda_h^{-1}\hat{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}-\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\Big)\\
&\quad + \tilde{\sigma}_h(x)\big(\Lambda_h^{-1}\hat{\mu}_{h,1}(x)-\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big)^\top\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\\
&\quad + \big(\tilde{\sigma}_h(x)-\hat{\sigma}_h(x)\big)\big(\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big)^\top\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}.
\end{aligned}
\tag{B.64}
\]
Combining Lemma A.2, Lemma B.1, Lemma B.2, Lemma B.3, (B.24), (B.32) and (B.64), we get
\[
\begin{aligned}
\big\|\tilde{\sigma}_h(x)\hat{\mu}_{h,1}(x)^\top\hat{\mu}_{h,2}(x)^{-1}\Lambda_h - \hat{\sigma}_h(x)\tilde{\mu}_{h,1}(x)^\top\tilde{\mu}_{h,2}(x)^{-1}\Lambda_h\big\|_2
&\le \big|\tilde{\sigma}_h(x)\big|\cdot\big\|\Lambda_h^{-1}\hat{\mu}_{h,1}(x)\big\|_2\cdot\big\|\big(\Lambda_h^{-1}\hat{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}-\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big\|_2\\
&\quad + \big|\tilde{\sigma}_h(x)\big|\cdot\big\|\Lambda_h^{-1}\hat{\mu}_{h,1}(x)-\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big\|_2\cdot\big\|\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big\|_2\\
&\quad + \big|\tilde{\sigma}_h(x)-\hat{\sigma}_h(x)\big|\cdot\big\|\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big\|_2\cdot\big\|\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big\|_2\\
&\le O(\rho(h))\cdot o_P(\rho(h))\cdot O_P\big(n^{-\frac{1}{2}}\rho^{-\frac{3}{2}}(h)\big) + O(\rho(h))\cdot O_P\big(n^{-\frac{1}{2}}\rho^{\frac{1}{2}}(h)\big)\cdot O\big(\rho^{-1}(h)\big)\\
&\quad + O_P\big(n^{-\frac{1}{2}}\rho^{\frac{1}{2}}(h)\big)\cdot o(\rho(h))\cdot O\big(\rho^{-1}(h)\big) = O_P\big(n^{-\frac{1}{2}}\rho^{\frac{1}{2}}(h)\big).
\end{aligned}
\tag{B.65}
\]
Combining Lemma A.2, Lemma B.3, (B.38), (B.49), (B.54), (B.63) and (B.65), we also get
\[
n^{-1}\sum_{i=1}^n\big|\hat{W}_{x,h}\big(X^{(i)}\big)-\tilde{W}_{x,h}\big(X^{(i)}\big)\big|
= O\big(\rho^{-1}(h)\big)\cdot O_P\big(\rho^{-1}(h)\big)\cdot\Big(O_P\big(n^{-\frac{1}{2}}\rho^{\frac{1}{2}}(h)\big)\cdot O_P(\rho(h)) + O_P\big(n^{-\frac{1}{2}}\rho^{\frac{1}{2}}(h)\big)\cdot O_P(\rho(h))\Big)
= O_P\big(n^{-\frac{1}{2}}\rho^{-\frac{1}{2}}(h)\big).
\]
(B.66)

Regarding the second term in (B.62), by arguing as in the proof of (A.43), we can show that there exist constants $L_2 > 0$, $N_2\in\mathbb{N}$ and $\delta_{\mathcal{M}}\in(0,1)$ such that
\[
\int_0^{1/2}\sqrt{1+\log N\big(\delta\epsilon,\, B_{\mathcal{M}}\big(\tilde{m}_{h,1}(x),\delta\big),\, d_{\mathcal{M}}\big)}\,\mathrm{d}\epsilon \le L_2\,\delta^{\alpha_{\mathcal{M}}-1}
\]
whenever $\delta\in(0,\delta_{\mathcal{M}}]$ and $n\ge N_2$, where $r_{\mathcal{M}} > 0$ and $\alpha_{\mathcal{M}}\in(0,1]$ are constants defined in the condition M3. The Cauchy–Schwarz inequality implies that
\[
\begin{aligned}
\big|\tilde{\sigma}_h(x)\big|^2\cdot\mathrm{E}\big[\big|\tilde{W}_{x,h}(X)\big|^2\big]
&= \mathrm{E}\Big[L_{x,h}^2(X)\big(1-\tilde{\mu}_{h,1}(x)^\top\tilde{\mu}_{h,2}(x)^{-1}\Phi_x^{-1}(X)\big)^2\Big]\\
&\le 2\,\mathrm{E}\big[L_{x,h}^2(X)\big] + 2\,\mathrm{E}\Big[L_{x,h}^2(X)\big(\tilde{\mu}_{h,1}(x)^\top\tilde{\mu}_{h,2}(x)^{-1}\Phi_x^{-1}(X)\big)^2\Big]\\
&= 2\,\mathrm{E}\big[L_{x,h}^2(X)\big] + 2\,\mathrm{E}\Big[L_{x,h}^2(X)\Big(\big(\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big)^\top\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big(\Lambda_h^{-1}\Phi_x^{-1}(X)\big)\Big)^2\Big]\\
&\le 2\,\mathrm{E}\big[L_{x,h}^2(X)\big] + 2\,\big\|\Lambda_h^{-1}\tilde{\mu}_{h,1}(x)\big\|_2^2\cdot\big\|\big(\Lambda_h^{-1}\tilde{\mu}_{h,2}(x)\Lambda_h^{-1}\big)^{-1}\big\|_2^2\cdot\mathrm{E}\Big[L_{x,h}^2(X)\big\|\Lambda_h^{-1}\Phi_x^{-1}(X)\big\|_2^2\Big].
\end{aligned}
\tag{B.67}
\]
Combining Lemma A.2, Lemma A.3, Lemma B.1, (B.8), (B.21), (B.24), (B.38) and (B.67), we get
\[
\mathrm{E}\big[\big|\tilde{W}_{x,h}(X)\big|^2\big] = O\big(\rho^{-1}(h)\big)^2\cdot\Big(O(\rho(h)) + o(\rho(h))^2\cdot O\big(\rho^{-1}(h)\big)^2\cdot O(\rho(h))\Big) = O\big(\rho^{-1}(h)\big),
\]
which implies that there exist constants $L_3 > 0$ and $N_3\in\mathbb{N}$ such that $\mathrm{E}\big[\big|\tilde{W}_{x,h}(X)\big|^2\big] \le L_3\,\rho^{-1}(h)$ whenever $n\ge N_3$. By arguing as in the proof of (A.47), we get
\[
\mathrm{E}\Big[\sup_{y\in B_{\mathcal{M}}(\tilde{m}_{h,1}(x),\delta)}\big|\hat{S}_{h,1}(x,y)\big|\Big] \le 4\,C\,L_2\,L_3^{\frac{1}{2}}\,\mathrm{diam}(\mathcal{M})\,\delta^{\alpha_{\mathcal{M}}}\,n^{-\frac{1}{2}}\rho^{-\frac{1}{2}}(h)
\tag{B.68}
\]
whenever $\delta\in(0,\delta_{\mathcal{M}}]$ and $n\ge\max\{N_2,N_3\}$. Let $\epsilon > 0$ be any given constant. By (B.66), there exist constants $L_{1,\epsilon} > 0$ and $N_{1,\epsilon}\in\mathbb{N}$, depending on $\epsilon$, such that
\[
P\bigg(n^{-1}\sum_{i=1}^n\big|\hat{W}_{x,h}\big(X^{(i)}\big)-\tilde{W}_{x,h}\big(X^{(i)}\big)\big| > L_{1,\epsilon}\,n^{-\frac{1}{2}}\rho^{-\frac{1}{2}}(h)\bigg) < \frac{\epsilon}{4}
\]
whenever $n\ge N_{1,\epsilon}$.
We define an event $F_{n,\epsilon}$ as
\[
F_{n,\epsilon} := \bigg\{n^{-1}\sum_{i=1}^n\big|\hat{W}_{x,h}\big(X^{(i)}\big)-\tilde{W}_{x,h}\big(X^{(i)}\big)\big| \le L_{1,\epsilon}\,n^{-\frac{1}{2}}\rho^{-\frac{1}{2}}(h)\bigg\}.
\]
Let $N_{4,\epsilon} := \max\{N_{1,\epsilon},N_2,N_3\}$ and $L_{4,\epsilon} := 2\,\mathrm{diam}(\mathcal{M})\big(L_{1,\epsilon}+2CL_2L_3^{1/2}\big)$. Combining (B.62) and (B.68), we get
\[
\begin{aligned}
\mathrm{E}\Big[\sup_{y\in B_{\mathcal{M}}(\tilde{m}_{h,1}(x),\delta)}\big|\hat{T}_{h,1}(x,y)-\hat{T}_{h,1}\big(x,\tilde{m}_{h,1}(x)\big)\big|\cdot 1_{F_{n,\epsilon}}\Big]
&\le 2\,\mathrm{diam}(\mathcal{M})\cdot\delta\cdot\mathrm{E}\bigg[\Big(n^{-1}\sum_{i=1}^n\big|\hat{W}_{x,h}\big(X^{(i)}\big)-\tilde{W}_{x,h}\big(X^{(i)}\big)\big|\Big)\cdot 1_{F_{n,\epsilon}}\bigg] + \mathrm{E}\Big[\sup_{y\in B_{\mathcal{M}}(\tilde{m}_{h,1}(x),\delta)}\big|\hat{S}_{h,1}(x,y)\big|\Big]\\
&\le L_{4,\epsilon}\,\delta^{\alpha_{\mathcal{M}}}\,n^{-\frac{1}{2}}\rho^{-\frac{1}{2}}(h)
\end{aligned}
\]
whenever $\delta\in(0,\delta_{\mathcal{M}}]$ and $n\ge N_{4,\epsilon}$, where $1_{F_{n,\epsilon}}$ is the indicator function of $F_{n,\epsilon}$. The rest of the proof proceeds as in the proof of Lemma A.14. □

B.4 Proof of Theorem 3.2 for $\hat{m}_{h,1}(x)$

The theorem follows from Lemma B.10 and Lemma B.11.

References for Supplementary Material

García-Portugués, E., Crujeiras, R. M. and González-Manteiga, W. (2013). Kernel density estimation for directional-linear data. Journal of Multivariate Analysis, 121, 152-175.

Im, C. J., Jeon, J. M. and Park, B. U. (2025). Local Fréchet regression with spherical predictors. Electronic Journal of Statistics, 74, 751-762.

van der Vaart, A. and Wellner, J. A. (1996). Weak Convergence and Empirical Processes: With Applications to Statistics. Springer.