A Computationally Stable Approach to Gaussian Process Interpolation of Deterministic Computer Simulation Data
For many expensive deterministic computer simulators, the outputs do not have replication error and the desired metamodel (or statistical emulator) is an interpolator of the observed data. Realizations of Gaussian spatial processes (GP) are commonly …
Authors: Pritam Ranjan, Ronald Haynes, Richard Karsten
A Computational ly Stable Approac h to Gauss ian Pro cess In terp olation of Determinist ic Computer Sim ulatio n Data Pritam Ranjan 1 , Ronald Ha ynes 2 and Ric hard Kar sten 1 1 Department of Mathematics and Statistics, Acadia University , NS, Canada 2 Department of Mathematics and Statistics, Memorial Universit y , NL, Canada (pritam.ranjan@a cadiau.ca, rhaynes@mun.ca, richard.k arsten@a cadiau.ca) Abstract F or man y exp ensive deterministic computer sim ulators, the o utputs do not have replica tion error and the desired metamo del (or statistical em ulator) is an interp olator of the obser ved data. Realizations o f Gaussia n spatial pro cesses (GP) a re commonly used to mo del such sim- ulator outputs. Fitting a GP mo del to n da ta points r equires the co mputatio n of the inv er se and determina nt of n × n corr elation matrices , R , that a re so metimes computationally unstable due to near- s ingularity o f R . This happens if an y pair of design points are v ery close together in the input space. The p opular approa ch to ov ercome nea r-singular ity is to in tro duce a s mall nu gget (or jitter) parameter in the mo del that is estimated along with other mo del pa r ameters. The inclusion o f a nu gget in the mo del o ften ca uses unnecessa r y ov er-smo othing of the data. In this pap er, w e pro po se a low er b ound o n the n ug get that minimizes the over-smoo thing and an iterative regular iz ation approa ch t o co ns truct a predictor that f urther impr ov e s the interpolation accuracy . W e also show that the prop o s ed predictor con v erges to the GP interpolato r. KEY WORDS: Computer exp er iment ; Ma trix inv erse approximation; Regular iz a tion. 1 In tro duction Computer sim ulators are ofte n used t o mod el c omplex ph ysical and engineering pro cesses that are either to o exp ensiv e or time consuming to observ e. A sim ulator is said to b e d etermin istic if the replicate runs of the same inputs will yield identica l r esp onses. F or the last few decades, deterministic sim ulators hav e b een widely used to mo d el p h ysical pr o cesses. F or instance, Kumar and Da vidson (1978) u sed deterministic sim ulation mo dels for comparing the p erformance of highly concurrent computers; Su et al. (1996 ) used generalize d linear regression mo d els to design a lamp fi lamen t via a d eterministic fi nite-elemen t compu ter co de; Aslett et al. (1998 ) discuss an optimization problem for a deterministic circuit simulator; sev eral deterministic sim ulators are b eing used for analyzing bio c hemical net w orks (see Bergmann and Sauro 2008 for references). On the other hand, there are cases where sto chastic (non-deterministic) sim ulators are preferred du e to un a voidable biases (e.g., Poole and Raftery 2000). In spite of the recen t in terest in sto c hastic 1 sim ulators, d eterministic sim ulators are still b eing activ ely used. F or instance, Medina, Moreno and Roy o (2 005) demonstrate the preference of deterministic traffic sim ulators o v er their sto chastic coun terparts. In this pap er, w e assume that the sim ulator u nder consideration is deterministic up to w orking pr ecision and the scienti st is confiden t ab out the v alidity of the simulator. Sac k s , W elc h , Mitc hell and Wynn (1989) prop osed mo d eling (or em ulating) such an exp ensiv e deterministic sim ulator as a realization of a Gaussian sto c hastic p r o cess (GP). An em ulator of a deterministic simulat or is desired to b e an in terp olator of the observ ed data (e.g., Sac ks et al. 1989; V an Beers and Kleijnen 2004). F or the problem that motiv ated this wo rk, the ob jectiv e is to em ulate the a v erage extracta ble tidal p o w er as a function of the turbine lo cations in the Ba y of F und y , Nov a Scotia, Canada. The deterministic computer sim ulator for the tid al p o wer m o del is a n umerical solv er of a complex system of partial differential equatio ns, and we accept the sim ulator as a v alid r epresent ation of the tidal p o w er. In this pap er, we discuss a computational issue in building the GP based emulator for a determin- istic sim ulator. Fitting a GP mo del to n data p oin ts using eit her a maxim um lik eliho o d tec hnique or a Ba ye sian approac h requires the computation of the determinant and inv erse of seve ral n × n correlation matrices, R . Although the correlation matrices are p ositiv e d efinite by definition, near- singularit y (also referred to as ill-conditioning) of these matrices is a common problem in fitting GP mo dels. Abab ou, Bagtzoglou and W o o d (1994 ) study the relationship of a uniform grid to the ill-conditioning and qu alit y of mo del fit for v arious co v ariance mo dels. Barton and S alagame (1997 ) study the effect of exp eriment al design on the ill-conditioning of kriging m o dels. Jones, Sc honlau and W elc h (19 98) us ed th e singular v alue decomp osition to o v ercome the near-singularit y of R . Booker (2000) used the s um of indep enden t GPs to o verco me near-singularit y for m ulti-stage adaptiv e designs in kr iging mo dels. A m ore p opu lar solution to o v ercome n ear-singularit y is to in tro duce a nugge t or jitter p arameter, δ , in the mo d el (e.g., Sacks et al. 1989; Neal 1997 ; Booker et al. 1999; S an tner, Williams and Notz 2003; Gramacy and Lee 2008) that is estimated along with other mo d el parameters. How ev er, adding a n u gget to the mod el introdu ces add itional smo othin g in the p redictor and as a r esu lt the pred ictor is no longer an interp olator. Here, w e first prop ose a lo wer b ound on the n ugget ( δ lb ) that minimizes the additional o v er - smo othing. Second, an iterativ e approac h is deve lop ed to enable the construction of a new p redictor that further impro v es the interp olation as well as the prediction (at unsampled design p oints) accuracy . W e also show that th e p rop osed p redictor con verges to an interp olator. Alt hough an arbitrary nugget (0 < δ < 1) can b e used in the iterativ e approac h, the rate of con v ergence (i.e., the n um b er of iteratio ns required to reac h certa in tolerance) dep ends on the magnitude of the n ugget. 2 T o this effect, the p r op osed lo w er b ound δ lb significan tly r educes the num b er of ite rations r equired. This feature is particularly desirable for implementat ion. The p ap er is organized as follo ws. Section 2 present s the tidal p ow er mo deling example. In Section 3, w e review the GP mo del, a compu tational issue in fi tting the m o del, and the p opular approac h to ov ercome near-singularity . Section 4 pr esents the new low er b ound for the nugg et that is required to ac hiev e well- conditioned correlation matrices and min imize unnecessary o v er-sm o othing. In Section 5, we dev elop the iterati v e approac h for constructing a more acc urate predictor. Several examples are present ed in S ection 6 to illustrate the p erformance of our prop osed predictor ov er the one obtained us ing the p opular approac h. Finally , w e conclude the pap er with s ome remarks on the numerical issues and recommend ations for practitioners in Section 7 . 2 Motiv ating example The Ba y of F un d y , lo cated b et ween New Bru nswic k and No v a Scotia, Canada, w ith a sm all p ortion touc h ing Maine, US A, is w orld famous for its high tides. In the upp er p ortion of the Ba y of F undy (see Figure 1(a)), the difference in wat er lev el b et w een h igh tide and lo w tide can b e as muc h as 17 meters. The high tides in this region are a result of a resonance, with the natural p erio d of the Ba y of F un dy very close to the p erio d of the principal lunar tide. This results in ve ry regular tides in the Ba y of F un dy with a h igh tide ev ery 12.42 hour s. Th e incredible energy in these tides has mean t that the region has significant p oten tial for extracting tidal p o w er (Green b erg 1979; Karsten, McMillan, Lic kley and Ha ynes 200 8 (h er eafter KMLH)). Though the notion of h arnessing tidal p o w er from the Ba y of F und y is not new, earlier prop osed metho ds of harvesti ng the muc h needed green electrica l energy in v olv ed building a barr age or dam. Th is metho d w as considered infeasible for a v ariet y of economic and en vironment al reasons. Recen tly , there h as b een rapid tec hnological deve lopmen t of in-steam tidal turbines. These devices act m uc h like wind turbines, with individual turbines placed in regions of strong tidal currents. Individu al tur bines can b e up to 20 m in diameter and can pro duce o v er 1 MW of p o wer. Ideally , these turbines w ould pro duce a p redictable and renew able s ource of p o w er w ith less of an impact on the environmen t than a d am. KMLH examined the p o w er p otent ial of f arms of such turbines across the Minas P assage (Figure 1(b)) wh ere the tidal cur r en ts are strongest. They found that the p oten tial extractable p o w er is muc h higher than pr evious estimates and that the environmen tal impacts of extracting p o w er can b e greatly reduced b y extracting only a p ortion of the m axim u m p o w er av ailable. The simulatio ns in K MLH did not represen t ind ividual turb ines and left open the 3 question of h o w to optimally place turb in es. In this pap er, we em ulate the KMLH numerical mo del to examine the placemen t of turbines to maximize the p ow er output. 66 o W 30’ 65 o W 30’ 64 o W 30’ 45’ 45 o N 15’ 30’ 45’ 46 o N Bay of Fundy Nova Scotia Minas Passage New Brunswick Minas Basin Nova Scotia (a) The upp er Bay of F und y 30’ 28’ 64 o W 26.00’ 24’ 22’ 20’ 18’ 19’ 20’ 45 o N 21.00’ 22’ 23’ 24’ (b) The Minas Passage Figure 1: Figure (a) shows the tr iangular grid used in the FV COM mo del for sim u lating tides in the upp er Ba y of F u ndy . The small b ox in the cen ter surroun ds the Minas P assage s h o wn in (b). The shaded triangles in the cen ter of (b) represen t a p ossible turb ine lo cation. W e numericall y sim ulate th e tides as in KMLH b y solving the 2D shallo w w ater equations using the Finite-V olume Coastal Ocean Mo del (FV COM) with a triangular grid on the upp er Ba y of F und y (see Figure 1(a)). Since the grid triangle s differ in size and orien tation, the i -t h turbine was mo deled on the set of all triangular elemen ts whose cen ters lie within 250 m of ( x i , y i ). A p ossible turbine lo cation is shown in Figure 1(b). T he triangular grid wa s devel op ed by Da vid Green b erg and colleagues at the Bedford Institute of Oceanograph y , NS, Canada. The details of FV COM can b e foun d in Chen et al. (2006). Using this set up , the estimate of the electric p o w er that can b e harnessed th rough a turbine at a particular lo cation ( x, y ) o v er a tidal cycle T = 12 . 42 hour s is obtained by the simulator in KMLH. The a v erage tidal p o w er at the lo cation ( x, y ) is given by ¯ P ( x, y ) = 1 T Z T 0 P ( t ; x, y ) dt , where P ( t ; x, y ) is the extractable p o wer ou tp ut at time t and lo cation ( x, y ). The pro cess is deterministic up to the mac hin e precision, and the main ob jectiv e is to em ulate ¯ P ( x, y ). It tu rns out th at the GP mo del fitted to the sim ulator output at n = 100 p oints (c h osen usin g a space-filling d esign criterion) is not an interp olator an d results in an o ver-smoothed emulato r (see 4 Example 4 f or details). Th is is un desirable as the o cean mo d elers are in terested in an em ulator th at in terp olates their sim ulator. This em ulator will b e used to obtain estimates of b oth the maximizer of the p o w er fun ction (i.e., the lo cation wh er e to pu t the tu r bine) and th e extractable p o w er at this location. Th e man u facturing and installation cost of the initial protot yp e turbine is v ery h igh (roughly 20 million dollars). Since th e o v er-smoothed em u lator can un derestimate the maxim um extractable p o w er, a go o d approximat ion of th e att ainable p o wer function can b e h elpful in sa ving the cost of a few turbin es. Example 4 sh o ws that the prop osed approac h leads to a more accurate estimate of the m axim u m extractable p o w er . 3 Bac kgroun d review 3.1 Gaussian pro cess mo del Let th e i -th input and output of the computer sim ulator b e denoted b y a d -dimensional v ector, x i = ( x i 1 , ..., x id ), and the univ ariate resp onse, y i = y ( x i ), resp ectiv ely . The exp eriment design D 0 = { x 1 , ..., x n } is the set of n input trials. The outpu ts of the sim ulation trials are held in the n -dimensional v ector Y = y ( D 0 ) = ( y 1 , y 2 , . . . , y n ) ′ . The simula tor output, y ( x i ), is m o deled as y ( x i ) = µ + z ( x i ); i = 1 , ..., n, (1) where µ is the o v er all m ean, and z ( x i ) is a GP with E ( z ( x i )) = 0, V ar ( z ( x i )) = σ 2 z , and Co v( z ( x i ) , z ( x j )) = σ 2 z R ij . In general, y ( D 0 ) has a multiv ariate normal distribu tion, N n ( 1 n µ, Σ), where Σ = V ( D 0 | y ( D 0 )) = σ 2 z R , and 1 n is a n × 1 v ector of all ones (see Sac ks et al. 1989, and Jones et al. 1998 f or details). Although there are several c hoices f or the correlation function, w e fo cus on the Gaussian correlation b ecause of its p rop erties lik e smo othness (or differentiabilit y in mean square sense) and p opularity in other areas lik e mac h in e learning (radial basis k ernels) and geostat istics (kriging). F or a d etailed discussion on correlation functions see Stein (19 99), San tner, Williams and Notz (2003) , and Rasm ussen and Williams (2006) . The Gaussian correlation fun ction is a sp ecial case ( p k = 2 for all k ) of the p o w er exp onen tial correlation family R ij = corr( z ( x i ) , z ( x j )) = d Y k =1 exp {− θ k | x ik − x j k | p k } , for all i, j, (2) where θ = ( θ 1 , ..., θ d ) is the vect or of h yp er-parameters, and p k ∈ (0 , 2] is the smo othn ess param- eter. As discussed in Section 7, the results dev elop ed in th is pap er ma y v ary sligh tly when other correlation structur es are used instead of the Gaussian correlation. 5 W e use the GP mo d el with Gaussian correlation function to pr edict r esp onses at any u nsampled design p oin t x ∗ , ho wev er, th e theory deve lop ed here is also v alid for other corr elation structur es in the p o w er exp onent ial family (see Section 7 for more details). F ollo wing the maxim um lik eliho od approac h, the b est linear unbiase d predictor (BLUP) at x ∗ is ˆ y ( x ∗ ) = ˆ µ + r ′ R − 1 ( Y − 1 n ˆ µ ) = (1 − r ′ R − 1 1 n ) 1 ′ n R − 1 1 n 1 ′ n + r ′ R − 1 Y , (3) with mean squared error s 2 ( x ∗ ) = σ 2 z (1 − 2 C ′ r + C ′ RC ) (4) = σ 2 z 1 − r ′ R − 1 r + (1 − 1 ′ n R − 1 r ) 2 1 ′ n R − 1 1 n , where r = ( r 1 ( x ∗ ) , ..., r n ( x ∗ )) ′ , r i ( x ∗ ) = corr( z ( x ∗ ) , z ( x i )), an d C is such that ˆ y ( x ∗ ) = C ′ Y . In practice, the p arameters µ , σ 2 z and θ are replaced with estimates (see Sac ks et al. 1989, S an tner, Williams and Notz 2003, for details). 3.2 A computational issue in mo del fitting Fitting a GP mod el (1)–(4) to a d ata set with n observ ations in d -dimensional inpu t space requires n umerous ev aluatio ns of the log- lik elihoo d function for sev eral r ealizations of the parameter vect or ( θ 1 , ..., θ d ; µ, σ 2 z ). The closed form estimators of µ and σ 2 z , giv en by ˆ µ ( θ ) = ( 1 n ′ R − 1 1 n ) − 1 ( 1 n ′ R − 1 Y ) and ˆ σ 2 z ( θ ) = ( Y − 1 n ˆ µ ( θ )) ′ R − 1 ( Y − 1 n ˆ µ ( θ )) n , (5) are often u sed to obtain the profile log-lik eliho o d − 2 log L p ∝ log ( | R | ) + n log[( Y − 1 n ˆ µ ( θ )) ′ R − 1 ( Y − 1 n ˆ µ ( θ ))] , (6) for estimating the h yp er-parameters θ = ( θ 1 , ..., θ d ), wh ere | R | denotes the d eterminan t of R . Recall from (2), that the correlatio n matrix R d ep ends on θ and the design p oints. An n × n matrix R is said to b e near-singular (or, ill-conditioned) if its condition n um b er κ ( R ) = k R k · k R − 1 k is to o large (see Section 4 for details on “ho w large is large?”), where k · k denotes a matrix norm (w e will u se the L 2 –norm). Although these correlation matrices are p ositiv e definite b y defin ition, computation of | R | an d R − 1 can sometimes b e unstable d ue to ill-conditioning. This pr oh ib its precise computation of th e lik eliho o d and h ence the parameter estimat es. Ill-conditioning of R often o ccurs if an y pair of design p oin ts are v ery close in the inpu t space, or θ k ’s are close to zero, i.e., P d k =1 θ k | x ik − x j k | p k ≈ 0. The d istances b et wee n neigh b oring p oin ts in space- filling designs with large n (sample size) and small d (input dimension) can b e v ery small. 6 Near-singularit y is more common in the s equ en tial design setup (e.g., expected impro v ement based designs, see J ones et al. 1998; Schonlau et al. 1998; Oakley 2004; Hu ang et al. 2006; Ranjan et al. 2008; T addy et al. 2009), where the follo w-up p oin ts tend to “pile u p ” near th e pre-sp ecified features of interest lik e the global maxim um , con tours, quan tiles, and so on. 3.3 The p opular approac h A p opular appr oac h to ov ercome th e ill-conditioning of R is to in tro duce a nugget, 0 < δ < 1 in the m o del, and replace the ill-conditioned R w ith a well- conditioned R δ = R + δ I that h as a smaller condition n um b er (see Section 4 for details) as compared to that of R . Equ iv alen tly , one can introdu ce an in dep end en t w hite noise p ro cess in the mo del y ( x i ) = µ + z ( x i ) + ǫ i , i = 1 , ..., n, where ǫ i are i.i.d. N (0 , σ 2 ǫ ). That is, V ar ( Y ) = V ( D 0 | y ( D 0 )) = σ 2 z R + σ 2 ǫ I = σ 2 z ( R + δI ) for δ = σ 2 ǫ /σ 2 z . The v alue of the nugg et is b ounded ab ov e, δ < 1, to ensu re that the n umerical uncertain t y is smaller th an the pr o cess uncertain t y . Th e resulting BLUP is give n b y ˆ y δ ( x ) = (1 − r ′ ( R + δ I ) − 1 1 n ) 1 ′ n ( R + δ I ) − 1 1 n 1 ′ n + r ′ ( R + δ I ) − 1 Y , (7) and the asso ciated mean squared error s 2 δ ( x ) is s 2 δ ( x ) = σ 2 z 1 − 2 C ′ δ r + C ′ δ RC δ , (8) where C δ is suc h that ˆ y δ ( x ) = C ′ δ Y . Theoretically , it is s tr aigh tforward to s ee that th e u se of a p ositiv e nugg et in th e GP mo d el pro du ces a non-inte rp olator. Jones et al. (1998) show that the GP fi t giv en by (3) and (4) is an in ter- p olator b ecause for 1 ≤ j ≤ n , r ′ R − 1 = e ′ j , where e j is the j -th unit v ector, r = ( r 1 ( x j ) , ..., r n ( x j )) ′ and r i ( x j ) = corr( z ( x i ) , z ( x j )). If w e use a δ ( > 0) in the mo del (i.e., replace R with R δ ), then r ′ R − 1 δ 6 = e ′ j and thus ˆ y ( x j ) 6 = y j and ˆ s 2 ( x j ) 6 = 0. F rom a practitioner’s viewp oin t, one could sacrifice exact interp olation if the interpolation accuracy of the fit is within the desired tole rance, but it is not alw a ys ac hiev able (see Sectio n 6 for illustrations). The n ugget parameter δ is often estimated along with the other mo del parameters. How ev er, one of the ma jor concerns in the optimization is that the likeli ho o d (mo dified b y replacing R with R δ ) computation fails if the ca ndidate nugget δ ∈ (0 , 1) is n ot large enough to ov ercome ill-co nditioning of R δ . T o a v oid th is problem in the optimizat ion, it is common to fix an ad-ho c b oundary v alue on the nugget parameter. Th e resulting maximum lik eliho o d estimate is often close to this b oundary 7 v alue and th e fi t is not an int erp olator of the observ ed d ata (i.e., th e interp olation err or is more than the desired tolerance). Even if the estimated nugg et is n ot near the b oundary , the us e of a n ugget in th e mo d el in this manner may in trod uce un necessary o v er-sm o othing from a p r actical standp oint (Section 6 p resen ts sev eral illustrations). In the next s ection, we p r op ose a low er b ound on the nugget that minimizes the unnecessary o ve r-smo othing. 4 Cho osing the Nugget Recall from Section 3.2 that an n × n matrix R is said to b e ill-conditioned or near-singular if its condition num b er κ ( R ) is to o large. Thus, we in tend to find δ suc h that κ ( R δ ) is smaller than a certain thresh old. Ou r main ob jective s here are to compu te the condition num b er of R δ and the threshold that classifies R δ as wel l-b eha v ed. Let λ 1 ≤ λ 2 ≤ · · · ≤ λ n b e the eigen v alues of R . Th en, in the L 2 –norm, κ ( R ) = λ n /λ 1 (Golub and V an Loan 1996) . The addition of δ along the main d iagonal of R shifts all of the eigen v alues of R by δ . That is, the eig en v alues of R δ = R + δ I are λ i + δ , i = 1 , ..., n , where λ i is the i -th smallest eigen v alue of R . T hus, R δ is w ell-conditioned if log( κ ( R δ )) / a λ n + δ λ 1 + δ / e a δ ' λ n ( κ ( R ) − e a ) κ ( R )( e a − 1) = δ lb , where κ ( R ) = λ n /λ 1 and e a is the d esired threshold for κ ( R δ ). Note th at δ lb is a fu nction of the design p oints and the hyper -p arameter θ . The closed form expressions for the eigenv alues and hence the condition num b er of a Gaussian correlation matrix R , in (2), for arb itrary θ and d esign { x 1 , ..., x n } is, to our kno w ledge, yet u n- kno wn. If x ∈ ( −∞ , ∞ ) d and x k ∼ N (0 , σ 2 x ), closed form expressions of the exp ected eigen v alues of R are kno wn (see Section 4.3 in Rasm u s sen and Williams 2006). In ou r case, x ∈ [0 , 1] d , and the design p oin ts are often c hosen u s ing a space-filling cr iterion (e.g., Latin h y p ercub e with prop erties lik e maximin distance, minimum correlation, O A; un iform d esigns, and so on). In suc h cases, one ma y assum e, at m ost, x k ∼ U (0 , 1) for k = 1 , ..., d . In fact, the ob jectiv es of b uilding efficient em- ulators for computer simulators often include estimating pre-sp ecified pro cess features of interest, and s equ en tial d esigns (e.g., exp ected improv ement based designs) are preferred to ac hiev e such goals. In such designs, the follo w-u p p oin ts tend to “pile up” near the feature of interest. The distributions of su c h d esign p oint s are not u niform and can b e non-trivial to represen t in analyti cal 8 expressions. Hence, it is almost surely infeasible to obtain closed form expressions for the eigen v al- ues of su ch R in general. O f course, one can compute these quant ities numerically . W e u se Matlab’s built-in function eig to compute the maximum eigen v alue of R and c ond to calculate the condition n um b er κ ( R ) = λ n /λ 1 in the expr ession of δ lb . Another imp ortan t comp onent of the p rop osed lo wer b oun d is the threshold for getting w ell- b ehav ed n on-singular correlation matrices. As one wo uld susp ect, the near-singularit y of suc h a correlation matrix dep ends on n , d , th e distribu tion of { x 1 , ..., x n } ∈ [0 , 1] d and θ ∈ (0 , ∞ ) d . W e no w present the k ey steps of th e sim ulation algorithm used for estimating the threshold und er a sp ecific design f ramew ork. Th e resu lts are a v eraged o v er the distribution of { x 1 , ..., x n } and θ , and th us it is su fficien t to find the threshold of κ ( R ). F or several com bin ations of n and d , we generate 5000 correlation matrices where the design p oints { x 1 , ..., x n } follo w the maximin Latin h yp ercub e s ampling sc h eme (Stein 1987) and θ k ’s are c hosen fr om an exp onentia l d istribution with mean 1. Recall from Section 3.2 that a near- singular (or ill-conditioned) correlation matrix has a large cond ition num b er, and κ ( R ) is in versely prop ortional to θ . Consequently , w e fo cuss ed on small v alues of θ in sim ulating R . These correlation matrices are u sed to compu te the pr op ortion of matrices that are near-singular (see the contours in the left panel of Figure 2). W e used Matla b’s built-in fun ction lhsdesign to generate the design p oints and chol (wh ic h computes the Cholesky factorizatio n) to chec k whether or n ot a matrix R w as near-singular un der the working precision. F or a p ositive definite wel l-b ehave d matrix R , “ [ U, p ] = chol ( R ) ” pr o duc es an upp er triangular matrix U satisfying U ′ U = R and p is zer o. If R is not p ositive definite, then p is a p ositive inte ger. W e also computed the condition num b ers of these 5000 correlation matrices (u sing Matlab’s b uilt-in function c ond ). Th e r ight panel of Figure 2 presents th e conto urs of the a verage of log( κ ( R )) for differen t com binations of n and d . F rom Figur e 2, it is clear that a ≈ 25 can b e us ed as the threshold for log ( κ ( R δ )) of a well- b ehav ed correlation matrix R δ . Also note that the p rop ortion of near-singular cases, denoted b y the contours in the left panel of Figure 2 , d ecreases rapid ly with the increment in the inpu t dimension. This is somewhat intuitiv e b ecause the volume of the voi d (or un explored region) increases exp onentia lly with the d imension, and a really large space-filling design is needed to jeopardize th e conditioning of the correlation matrices in high d imensional inpu t space. F or other design schemes (e.g. , sequential designs), one can follo w these steps to estimate the th reshold for the condition num b er of w ell-b eha v ed correlation m atrices. 9 1 1.5 2 2.5 3 3.5 4 4.5 5 5.5 6 5 10 15 20 25 30 35 40 45 50 0.05 0.1 0.2 0.4 0.6 0.8 0.9 d n/d 10 15 20 25 30 35 40 d n/d 1 2 3 4 5 6 5 10 15 20 25 30 35 40 45 50 Figure 2: The con tour s in the left panel sh o w the prop ortion of correlation matrices flagged as near-singular. T he cont ours in the right panel displa y a v erage log ( κ ( R )) v alues. The shaded region in the left panel corresp onds to log ( κ ( R )) > 25. The lo w er b oun d on the nugge t is only a sufficien t condition and not a necessary on e f or R δ to b e well- conditioned. F or instance, a correlation matrix with 100 design p oin ts in (0 , 1) 2 c h osen using a space-filling criterion ma y lead to a well-beh av ed R if θ is very large. If the correlation matrix is well- conditioned, R should b e used instead of R δ , i.e., δ lb = max λ n ( κ ( R ) − e a ) κ ( R )( e a − 1) , 0 . (9) That is, when R is well-behav ed our approac h allo ws δ lb to b e zero and h ence a m ore accurate surrogate can b e obtained as compared to the p opular approac h (Section 3.3) wh ere a non-zero n ugget is forced in the mo del which ma y lead to undesirable ov er-smo othing. This could b e of concern in h igh d imensional in put space, b ecause the prop ortion of near-singular cases d ecreases with th e increment in the inp ut dimension. Example 3 demons tr ates the p erformance of the prop osed metho dology o ver the p opular approac h for an eigh t-dimensional simulato r. Although the u se of δ lb in the GP mo del minimizes the o v er-smo othing, δ lb ma y not b e small enough to achiev e the desired in terp olation accuracy (see Examples 1 and 2 f or illustrations), and c h o osing δ < δ lb ma y lead to ill-conditioned R . This ma y not b e a big issue if one b eliev es th at th e sim ulator is somewhat noisy and/or the statistical em u lator is b iased due to m is-sp ecification in the correlation structure or m o del assumptions. In such cases, a little smo othing m igh t b e a go o d idea. Ho wev er, con trolling the amount of sm o othing is a n on-trivial task and requires more atten tion. On the other h and, o v er-smo othing is u ndesirable if the exp erim enter b eliev es that the compu ter 10 sim ulator is deterministic and the statistician is confiden t ab out th e c h oice of the em ulator (w e consider the GP mo del w ith Gaussian correlation structur e). Under these assumptions, we no w prop ose a new p redictor that can ac hiev e the desired lev el of interp olation accuracy . 5 New Iterativ e App r oac h In this section, we pr op ose a p redictor th at is based on the iterativ e use of a nugge t δ ∈ (0 , 1). This approac h do es not dep end sev erely on the m agnitude of the nugg et, and the results dev elop ed here are based on an arbitrary 0 < δ < 1, large enough to ensure R δ w ell-b eha ved. Ho w ev er, c ho osing δ > δ lb ma y requir e more iterations to attain th e desired interp olation accuracy , and w e recommend using δ lb . W e also sho w that the prop osed predictor con verges to th e interp olator (3) and (4). Recall that the k ey p roblem here is the inaccurate computation of | R | and R − 1 due to ill- conditioning of R . The main idea of the new approac h is to r ewrite the profi le log-li k elihoo d as − 2 log L p ∝ − log ( | R − 1 | ) + n log [( Y − 1 n ˆ µ ( θ )) ′ R − 1 ( Y − 1 n ˆ µ ( θ ))] , (10) and r eplace the ill-conditioned R − 1 with a w ell-b eha ved quan tity . This mo d ified profi le log- lik eliho od can then b e optimized to get the parameter estimates. Next , w e describ e how to fin d the appr opriate w ell-behav ed substitute for R − 1 . In the same spirit as the p opular app roac h, w e attempt to ev aluate R − 1 w by solving Rt = w , under the assumption th at R cannot b e in v erted accurately (i.e., R is near-singular) and there exists a δ ∈ (0 , 1) such that R δ = ( R + δ I ) is well-co nditioned. In an attempt to find an interp olator of the simulato r (up to certain accuracy), our ob jectiv e is to fin d t ∗ = f ( R δ , w ) that is a b etter appro ximation of t = R − 1 w as compared to ˜ t = R − 1 δ w , suggested by the p opular approac h. T o ac hieve this goal, w e p rop ose to us e iter ative r e gularization (e.g., Tikhono v 1963, Neumaier 1998), a tec h nique for solving ill-conditioned systems of equations. Let s 0 = w and s i , i = 1 , ..., M , b e a sequence of vec tors obtained by recursiv ely solving the system of equations giv en b y ( R + δ I ) s i = δ s i − 1 . (11) Then, the estimate of t = R − 1 w after the i -th iteration (1 ≤ i ≤ M ) of regularization is giv en b y t i = t i − 1 + s i δ , (12) where t 0 is a vecto r of zeros. The final solution with M iterations of regularizatio n, t M = M X k =1 δ k − 1 ( R + δ I ) − k w, 11 requires only one direct inv ersion (or one Ch olesky decomp osition) of R δ = R + δ I , follo w ed by M forward and b ac kward substitutions. The pr op osed approximat ion of t = R − 1 w is t M , with M ≥ 1 c hosen to satisfy the in terp olatio n acc uracy requiremen t. Lemma 1 sho ws that the iterativ e regularization approac h in (11) an d (12 ) leads to a solution that is a generalizati on of the p opu lar approac h outlined in S ection 3.3. Lemma 1 L et R b e a n × n p ositive definite c orr elation matrix, I b e the n × n i dentity matrix, and 0 < δ < 1 b e a c onstant, then R − 1 = ∞ X k =1 δ k − 1 ( R + δ I ) − k . The con v ergence of this infi nite series follo ws from the v on Neumann series (the matrix version of the T a ylor series, Leb edev 1997) exp ansion of g ( u ) = ( R + uI ) − 1 around u = δ : g ( u ) = g ( δ ) + ( u − δ ) g ′ ( δ ) + ( u − δ ) 2 2! g ′′ ( δ ) + ( u − δ ) 3 3! g ′′′ ( δ ) + · · · , i.e., ( R + uI ) − 1 = ( R + δ I ) − 1 + ( u − δ )( − 1)( R + δ I ) − 2 + ( u − δ ) 2 2! ( − 1)( − 2)( R + δ I ) − 3 + · · · = ( R + δ I ) − 1 − ( u − δ )( R + δ I ) − 2 + ( u − δ ) 2 ( R + δ I ) − 3 − · · · . Setting u = 0, we get R − 1 = P ∞ k =1 δ k − 1 ( R + δ I ) − k and thus the prop osed solution obtained using the iterativ e regularization is th e M -th ord er v on Neumann appro ximation of R − 1 . The predictor ˆ y δ in the p opular approac h (7) uses t 1 , the fi rst order vo n Neumann appro ximation, and hence our prop osed appr oac h is a generaliza tion of the p opular approac h. The prop osed regulariza tion is implemen ted by optimizi ng the mo dified profile log-lik eliho o d − 2 log L p ∝ − log( | R − 1 δ,M | ) + n log [( Y − 1 n ˆ µ ( θ )) ′ R − 1 δ,M ( Y − 1 n ˆ µ ( θ ))] , (13) where R − 1 δ,M = P M k =1 δ k − 1 ( R + δ I ) − k . Closed form expressions for ˆ µ ( θ ) and ˆ σ 2 z ( θ ) are the same as in (5) su b ject to R − 1 replaced by R − 1 δ,M . Th e new regularized p redictor ˆ y δ,M ( x ) at x ∈ χ is ˆ y δ,M ( x ) = 1 − r ′ R − 1 δ,M 1 n 1 ′ n R − 1 δ,M 1 n 1 ′ n + r ′ R − 1 δ,M Y , (14) and the corresp onding MSE s 2 δ,M ( x ) is giv en by s 2 δ,M ( x ) = σ 2 z (1 − 2 C ′ δ,M r + C ′ δ,M RC δ,M ) , (15) where C δ,M is suc h that ˆ y δ,M ( x ) = C ′ δ,M Y . Lemmas 2 and 3 establish the con vergence results for an arbitrary 0 < δ < 1. 12 Lemma 2 L et R b e a ne ar-singu lar c orr elation matrix as define d in ( 2 ), and 0 < δ < 1 b e a nugget such that R + δ I i s wel l-b ehave d. Then, for every x ∗ ∈ χ = [0 , 1] d , lim M → ∞ ˆ y δ,M ( x ∗ ) = ˆ y ( x ∗ ) , wher e ˆ y ( x ∗ ) and ˆ y δ,M ( x ∗ ) ar e define d in ( 3 ) and ( 14 ) r esp e ctively. The pro of f ollo ws from Lemma 1 and using lim M → ∞ R − 1 δ,M = R − 1 in (14). It is straigh tforward to sho w that C δ,M in (15) con v erges to C in (4) as M → ∞ . This also p ro v es the next result on the con vergence of the mean squared error for th e prop osed predictor. Lemma 3 L et R b e a ne ar-singu lar c orr elation matrix as define d in ( 2 ), and 0 < δ < 1 b e a nugget such that R + δ I i s wel l-b ehave d. Then, for every x ∗ ∈ χ = [0 , 1] d , lim M → ∞ s 2 δ,M ( x ∗ ) = s 2 ( x ∗ ) , wher e s ( x ∗ ) and s δ,M ( x ∗ ) ar e define d in ( 4 ) and ( 15 ) r esp e c tively. Lemmas 2 and 3 p ro v e that ev en if a few pairs of p oint s are to o close together in the input space, or θ k ’s are close to zero to cause n ear-singularity of R , the prop osed iterativ e predictor con v erges to an interp olator as M increases (i.e., for 1 ≤ i ≤ n , ˆ y δ,M ( x i ) → y i and s 2 δ,M ( x i ) → 0 as M → ∞ ). Remark: In practice, w hen a pre-sp ecified interpolation accuracy is desired, the pr op osed itera- tiv e appr oac h suggests refitting the GP mo d el (i.e., optimization of (13)) for different c h oices of M ≥ 1. Note that the parameter estimate s c hange with M whic h allo ws for th e extra flexibilit y in the mod el that adju sts the ov er-smo othed p ortion of the surr ogate. First of all, the computational cost of fitting this mo d el increases with M . S econdly , the com bin ed cost of refi tting the mo del for differen t v alues of M can b e quite large. Although the n umerical stabilit y in compu ting R − 1 δ,M do es not c hange with M , computation of | R − 1 δ,M | can b ecome less numerically stable w ith increasing M . This is b ecause R − 1 δ,M → R − 1 as M → ∞ and the computation of | R − 1 | is assumed to b e unstable. Considerin g these issues, w e recommend optimizing th e profi le log-lik eliho o d (13) w ith M = 1 to obtain ˆ θ mle and δ lb ( ˆ θ mle ), and then u se it to co mpute ˆ y δ,M ( x ) and ˆ s δ,M ( x ) for any M ≥ 1 b y follo win g the iterativ e regularization steps outlined ab o v e. The conv ergence results in Lemmas 1, 2 and 3 do not d ep end on the choice of θ and δ in R δ = R + δ I , and so the pr edictor obtained is still an in terp olator. T he k ey steps required for the implemen tation of th e prop osed approac h are as follo ws: 13 1. C omp utation of the p rofile log-lik eliho o d (10) for the estimation of θ . (a) Ch o ose a candidate θ in Θ d and compute R . (b) Comp u te the lo w er b ound of nugge t δ lb in (9). Note that δ lb is a f unction of the hyper- parameters θ , the design matrix and th e threshold. (c) Replace R − 1 with R − 1 δ lb , 1 in the likel iho o d (10). 2. O b tain th e p arameter estimates ˆ θ and δ lb ( ˆ θ ) b y optimizing the profi le log-lik eliho o d . T hen compute ˆ µ ( ˆ θ ) and ˆ σ 2 z ( ˆ θ ). 3. Use the parameter estimates ˆ θ , δ lb ( ˆ θ ), ˆ µ ( ˆ θ ) and ˆ σ 2 z ( ˆ θ ) to compute the regularized em ulator giv en by ˆ y δ lb ,M ( x ) and ˆ s 2 δ lb ,M ( x ) in (14 ) an d (15) resp ectiv ely . The num b er of iterations ( M ) in ˆ y δ lb ,M ( x ) and ˆ s 2 δ lb ,M ( x ) dep ends on the d esir ed in terp olation accuracy , and one ca n build s topp ing rules for attaining the p re-sp ecified accuracy in (14). W e use Mahalanobis distance (Ba stos and O’Hagan 2009) to compute th e accuracy of the predictor. Th e in terp olation accuracy is measured b y ξ 0 I ,k = log 10 ( y ( D 0 ) − ˆ y δ lb ,k ( D 0 )) ′ { V ( D 0 | y ( D 0 )) } − 1 ( y ( D 0 ) − ˆ y δ lb ,k ( D 0 )) , where ˆ y δ lb ,k ( D 0 ) = ( ˆ y δ lb ,k ( x 1 ) , ..., ˆ y δ lb ,k ( x n )) ′ , and V ( D 0 | y ( D 0 )) = σ 2 z ( R + δ lb I ). Similarly , ξ I ,k = log 10 ( ˆ y δ lb ,k ( D 0 ) − ˆ y δ lb ,k − 1 ( D 0 )) ′ { V ( D 0 | y ( D 0 )) } − 1 ( ˆ y δ lb ,k − 1 ( D 0 ) − ˆ y δ lb ,k ( D 0 )) , measures the impr o vemen t of the predictor ˆ y δ,k in in terp olati ng the data by increasing the num b er of terms in the von Neum ann app ro ximation. Lemmas 2 and 3 s ho w th at b oth ξ I ,k and ξ 0 I ,k tend to −∞ as k in creases. As ξ I ,k tends to −∞ , the predictor ˆ y δ,k is stabilizing. While as ξ 0 I ,k tends to −∞ , the p redictor in (14) is conv erging to the BLUP in (3). As we w ill see in Example 1, the rates of con ve rgence of ξ I ,k and ξ 0 I ,k ma y differ. Th at is, b oth of these measur es ( ξ 0 I ,k and ξ I ,k ) can b e used in practice to c h o ose appropriate M for ac hieving the d esired interp olation accuracy . F or measuring the p rediction accuracy (at out-of-sample p oint s), we define an analogo us quan tit y ξ 0 P ,k = log 10 ( y ( D new ) − ˆ y δ lb ,k ( D new )) ′ { V ( D new | y ( D 0 )) } − 1 ( y ( D new ) − ˆ y δ lb ,k ( D new )) , where D new is a set of n new unsampled p oint s in the inp u t space. Th e parameters σ 2 z and θ in the co v ariance matrix V ( D new | y ( D 0 )) = σ 2 z ( R + δ lb I ) are estimated fr om the original data D 0 , bu t δ lb w as recomputed for D new . F or the simulated examples considered in this pap er, w e used maximin Latin h yp ercub e designs of size n new = 1000 · d as D new , whereas the tidal p o wer applicatio n used a holdout set for D new . The next section illustrates that even the b est c h oice of δ can lead to o ver-smooth em ulators, and the iterativ e approac h is adv antag eous. 14 6 Examples T o illustrate the pr op osed approac h we fi rst present a few simulate d examples. The p erformance of the new iterativ e predictor is also compared w ith the p opular approac h . T hen, we revisit th e tidal p ow er mo d eling example. Example 1 Let x 1 , x 2 ∈ [0 , 1], and the u nderlying deterministic simula tor output b e generated using the GoldPrice function (Andre, Siarry an d Dognon 2000), f ( x 1 , x 2 ) = " 1 + x 1 4 + 2 + x 2 4 2 ( 5 − 7 x 1 2 + 3 x 1 4 + 1 2 2 − 7 x 2 2 + 3 x 1 2 + 3 x 2 4 + 1 2 + 3 x 2 4 + 1 2 2 )# ∗ " 30 + x 1 2 − 1 2 − 3 x 2 4 2 ( 26 − 8 x 1 + 12 x 1 4 + 1 2 2 + 12 x 2 − (9 x 1 + 18) x 2 4 + 1 2 + 27 x 2 4 + 1 2 2 )# . F or illustration pu rp oses, w e in ten tionally select a maximin Latin h yp ercub e design (Stein 1987) with n = 70 p oints that leads to an ill-conditioned correlation matrix for small θ ∈ (0 , ∞ ) 2 . It turns out that for this particular d esign (see Figure 3), the correlation matrix R is ill-co nditioned if θ 1 · θ 2 / 3. Figure 3(a) presents the cont ours (at heigh ts y = 120 , 500 , 1000 and 10000) of the true sim ulator (solid curve) and the GP surrogate fit (obtained using the metho dology outlined in Sections 3.1 and 3.3). F or successful implementat ion of the p opu lar appr oac h, we optimized the lik eliho od in the p arameter space δ ∈ (10 − 5 , 1) and θ ∈ (0 , ∞ ) 2 . The parameter estimates for the GP fit are ˆ δ mle = 1 . 0 6 · 10 − 5 and ˆ θ mle = (5 . 01 , 7 . 33) . Note that ˆ δ mle is close to the b oun d ary and the fitted su rrogate is significan tly d ifferen t than realit y in the cen tral part of the input space. The p arameter estimates for the GP mo del fit obtained fr om the prop osed method are ˆ θ mle = (2 . 26 , 2 . 7 5) and δ lb ( ˆ θ mle ) = 5 . 26 · 10 − 10 . The GP sur r ogate for M = 1, in Figure 3(b), shows a muc h b etter fit, wh ic h is further impr o v ed by the iterativ e appr oac h (see Figure 3(c) and 3(d)). Figure 4 sho ws that ξ I ,k go es to −∞ at a faster rate as compared to ξ 0 I ,k . T able 1 sum m arizes the results from a detailed sim u lation stu dy b ased on sev eral com b inations of the design sizes and th e b oun dary v alues of δ in th e lik eliho o d optimizat ion. F or fair comparison b et w een the tw o metho dologies, fi rst we use the prop osed mo del with only one term in the v on Neumann app ro ximation (i.e., M = 1) and the p op u lar metho d with ˆ δ mle . W e then increase the n um b er of iterations to m easur e the impro v ement in the int erp olation accuracy . The results in T able 1 are summarized o ver mo del fits with 50 random maximin Latin h yp ercub e d esigns. Th ese designs are generated u sing Matlab’s built-in f unction lhsdesign w hic h tak es rand om starting p oint s and h en ce the output designs are random. The table en tries are P 50 ( P 5 , P 95 ), where P r denotes the r -th p ercen tile of ξ 0 I ,M v alues obtained fr om the mo d el fits. 15 0 0.2 0.4 0.6 0.8 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 X 1 X 2 (a) Po pular fit with MLE 0 0.2 0.4 0.6 0.8 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 X 1 X 2 (b) M = 1 (with low er b oun d) 0 0.2 0.4 0.6 0.8 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 X 1 X 2 (c) M = 5 (with low er b ound) 0 0.2 0.4 0.6 0.8 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 X 1 X 2 (d) M = 20 (with lo w er b oun d ) Figure 3: The dots d enote the design p oin ts, the solid curves denote the con tour s of the true GoldPrice function, and the dashed curves represen t the con tours of the predicted surfaces. 0 20 40 60 80 100 −2 0 2 k ξ I,k 0 0 20 40 60 80 100 −5 0 ξ I,k Figure 4: Con v ergence of ξ I ,k (dashed curve - righ t axis) and ξ 0 I ,k (solid curve - left axis). F rom T able 1, it is clear th at the in terp olation err or of the surrogates fitted using the p opu lar approac h decreases by lo wering the b oundary v alue of the nugg et parameter (from 10 − 5 to 10 − 10 ) in the optimizatio n problem. I t turn s out that the correlation matrices are w ell-b eh av ed for small 16 T able 1: Median and ( P 5 , P 95 ) of ξ 0 I ,M v alues for the pr op osed approac h (denoted b y “ lb ”) and the p opular appr oac h (den oted by “ mle ”) applied to the GoldPrice fun ction. n = 25 n = 50 n = 75 n = 100 ˆ δ mle ≥ 10 − 5 0.71 (- 3.60, 3.02) 1.36 (0.61, 1.74) 2.12 (1.91, 2.4 9) 2.43 (2.26, 2.60) ˆ δ mle ≥ 10 − 10 -0.89 (-7.08, 2.14) 1.07 (- 3.28, 2.03) 1.48 (0.46, 2.10) 1.21 (0.41, 1.70) δ lb ( M = 1) -25.71 (-28.13, -20.50) -16.68 (-20.49, -14.40) 0.85 (0.63, 1.17) 1.09 (0.90, 1.26) δ lb ( M = 5) 0.19 (- 0.29, 0.70) 0.43 (0.27, 0.72) δ lb ( M = 20) -0.48 (-0.91, -0.09) -0.0 7 (-0.35, 0.14) designs (i.e., n = 25 and n = 50) and non zero nugg ets are not required for a numerically stable mo del fitting p ro cess. Th is is captured by the prop osed approac h, as δ lb ( ˆ θ mle ) = 0 and the in terp o- lation error is muc h smaller than in the p opular approac h where a non-zero nugget is forced in the mo del. C onsequen tly , the iterati v e approac h is not used f or these cases. F or n = 75 and n = 100, the correlation matrices turn out to b e near-singular for θ near ˆ θ mle , and n on-zero δ had to b e u s ed for n umerically stable computation. It is clea r from the last thr ee ro ws that the prop osed ite rativ e approac h leads to impro vemen t in the inte rp olation accuracy . T able 2 summarizes the corresp onding prediction accuracy v alues, ξ 0 P ,M , w h ere the test set of unsampled p oints, D new , is a rand omly c hosen 2000 -p oint maximin Latin h yp ercub e d esign. The sim ulation results are based on 50 realizatio ns. As b efore, the maximin L atin hyp ercub e designs w ere generated using the built-in fu nction lhsdesign in Mat lab and the output designs are random. The resu lts illustrate that the p r op osed iterativ e approac h also impro v es the prediction at out- of-sample p oin ts. Note that the impr ov ement in prediction accuracy is not as significant as the impro v emen t in inte rp olation accuracy . This is exp ected as the prop osed metho dology is geared to wards improving the appro ximation of the in terp olator. T able 2: Median and ( P 5 , P 95 ) of ξ 0 P ,M v alues f or the predictors in the pr op osed approac h (denoted b y “ lb ”) and the p opular approac h (denoted by “ ml e ”) applied to the GoldPrice function. n = 25 n = 50 n = 75 n = 100 ˆ δ mle ≥ 10 − 5 4.59 (3.71, 6.17) 3.45 (3.25, 3.75) 3.34 (3.18, 3.46) 3.23 ( 3.12, 3.43) ˆ δ mle ≥ 10 − 10 4.49 (3.68, 6.01) 3.40 (3.12, 3.85) 2.76 (2.50, 3.03) 2.30 ( 1.85, 2.52) δ lb ( M = 1) 4.28 (3.59, 5.72) 3.22 (3.02, 3.54) 2.29 (2.07, 2.52) 1.94 (1.68, 2.18) δ lb ( M = 5) 2.15 (1.90, 2.42) 1.77 (1.55, 2.02) δ lb ( M = 20) 2.08 (1.71, 2.50) 1.69 (1.47, 1.98) Example 2 Supp ose the deterministic simulato r ou tp uts are generated using the three d imensional 17 P erm function (Y ang 2010) giv en by f ( x ) = 3 X k =1 " 3 X i =1 ( i k + β ) ( x i /i ) k − 1 # 2 , where x = ( x 1 , x 2 , x 3 ) and the i -th inpu t v ariable x i ∈ [ − 3 , 3] for i = 1 , ..., 3. F or con v enience, w e re-scale the input v ariables in [0 , 1]. As in th e previous example, we fit the GP mo d el us in g b oth the p opular method (Section 3.3) and the prop osed approac h (Sections 4 and 5) to the data generated b y ev aluating the P erm function at n design p oin ts. T able 3 compares the median and the tw o tail p ercenti les ( P 5 , P 95 ) of ξ 0 I ,M v alues obtained from fitting GP m o dels to 50 d ata sets (maximin Latin h yp ercub e designs generated using Matlab’s built-in function lhsdesign ) of different r u n-sizes. Here also, we p re-sp ecified the b oundary v alues f or estimati ng δ in the p opu lar approac h. T able 3: Median and ( P 5 , P 95 ) of ξ 0 I ,M v alues for the pr op osed approac h (denoted b y “ lb ”) and the p opular appr oac h (den oted by “ mle ”) applied to the p erm function. n = 25 n = 50 n = 75 n = 100 ˆ δ mle ≥ 10 − 5 2.42 (- 2.78, 4.29) 1.70 (0.00, 2.17) 2.79 (1.56, 3.16) 3.12 (2.22, 3.63) ˆ δ mle ≥ 10 − 10 2.24 (- 4.17, 3.85) 1.69 (- 0.36, 2.33) 2.90 (1.39, 3.28) 3.07 (1.72, 3.68) δ lb ( M = 1) -26.46 (-27.53, -24.97) -21.37 (-22.98, -18.99) -18.76 (-20.07, -17.35) 1.76 (-20.35, 1.81) δ lb ( M = 5) -16.71 (-20.22, 1.73) As in Example 1, the int erp olation err ors of the GP fits obtained through th e p opular app roac h are sligh tly r educed b y lo w ering the b oundary v alue of δ (from 10 − 5 to 10 − 10 ) in the optimizati on pro cess. Th e small v alues of th e p ercen tiles of ξ 0 I ,M in the row lab elled “ δ lb ( M = 1)” of T able 3 indicate that the correlat ion matrices are wel l-b eha v ed (i.e., δ lb ( ˆ θ mle ) = 0) for most of the designs with runs-sizes n = 25, 50 and 75. It turns out that approxi mately 46% of the correlation matrices are well-behav ed for designs of size n = 100. In “ δ lb ( M = 5)” case, the interp olation accuracy has increased and 53% of the designs of size n = 100 sho w δ lb ( ˆ θ mle ) = 0. The p rop osed approac h facilitat es the inclusion of a non-zero nugge t only wh en requ ir ed for fixing the ill-conditioning problem. Clearly , the n umber of realizations that requ ired a non-zero n ugget in the GP mo dels here is muc h smaller than in the GoldPr ice example. Th is is exp ected b ecause getting n ear-singular correlation matrices b ecomes less lik ely as the dimensionalit y of the inpu t space increases. The corresp onding prediction accuracy measures for 50 sim u lations are summarized in T able 4 . The test set, D new , r equired for computing ξ 0 P ,M , is a r andomly chosen 3000-p oin t maximin Latin h yp ercub e design. As in E xample 1, the prediction accuracy increases with M , the n umber of iterations, and by lo wering the b oundary v alue of the n u gget in the optimization problem. Example 3 The b orehole mo del is a more r ealistic deterministic sim u lator, that mo dels the flo w 18 T able 4: Median and ( P 5 , P 95 ) of ξ 0 P ,M v alues f or the predictors in the pr op osed approac h (denoted b y “ lb ”) and the p opular approac h (denoted by “ ml e ”) applied to the P erm fu nction. n = 25 n = 50 n = 75 n = 100 ˆ δ mle ≥ 10 − 5 6.78 (4.94, 7.43) 4.24 (4.06, 4.44) 4.01 (3.91, 4.28) 4.02 ( 3.80, 4.16) ˆ δ mle ≥ 10 − 10 6.68 (5.21, 7.22) 4.23 (4.12, 4.40) 4.10 (3.90, 4.27) 3.90 ( 3.52, 4.24) δ lb ( M = 1) 5.87 (4.70, 6.88) 4.08 (3.91, 4.28) 3.80 (3.60, 4.00) 2.60 (2.21, 3.89) δ lb ( M = 5) 2.49 (2.17, 3.86) rate thr ough a b orehole whic h is d r illed from the grou n d s urface through tw o aquifers, and is commonly used in compu ter exp eriments (e.g., Joseph, Hung and Sud j ian to 2008) to compare differen t metho ds . The fl o w rate is giv en by f ( x ) = 2 π T u ( H u − H l ) log( r /r w ) h 1 + 2 LT u log( r /r w ) r 2 w K w + T u T l i , where x = ( r w , r , T u , T l , H u , H l , L, K w ), and the input r w ∈ [0 . 05 , 0 . 15 ] is the r adius of the b orehole, r ∈ [100 , 50000 ] is th e radiu s of the influence, T u ∈ [63070 , 11 5600] is the transmiss ivity of the upp er aquifer, T l ∈ [63 . 1 , 116 ] is the transmiss ivit y of the lo wer aquifer, H u ∈ [990 , 111 0] is the p oten tiometric head of the upp er aquifer, H l ∈ [700 , 820] is the p oten tiometric h ead of the lo wer aquifer, L ∈ [1120 , 1680] is the length of the b orehole and K w ∈ [9855 , 12045 ] is the hydraulic conductivit y of the b orehole. F or con v enience, w e re-scale the input v ariables to [0 , 1]. T able 5 compares the median and tw o tail p ercen tiles ( P 5 , P 95 ) of ξ 0 I ,M v alues obtained f rom the GP mo d el surr ogates fitted to 50 random maximin Latin hyp ercub e d esigns via the t w o metho d s. Since the sim ulator is 8-dimensional, w e considered slightly larger ru n-sizes n = 50 , 75 , 100 and 125, h o wev er, th e num b er of sim ulations and the candid ates for the b oundary v alues of δ in the lik eliho od optimization were k ept the same as in Examples 1 and 2. T able 5: Median and ( P 5 , P 95 ) of ξ 0 I ,M v alues for the pr op osed approac h (denoted b y “ lb ”) and the p opular appr oac h (den oted by “ mle ”) applied to the b orehole mo d el. n = 50 n = 75 n = 100 n = 125 ˆ δ mle ≥ 10 − 5 0.62 (0.21, 1.26) 1.27 ( 0.69, 1.58) 1.55 (1.11, 1.91) 1.8 6 (1.44, 2.30) ˆ δ mle ≥ 10 − 10 0.48 (- 1.59, 1.12) 0.65 (- 1.05, 1.56) 0.83 (-1.47, 1.66) 1.33 (-0.41, 2.15) δ lb ( M = 1) - 18.47 (-19.45, -16.66) -16.18 (-17.06, -14.27) -13.93 (-15.19, -12.46) -14.74 (-16.00, - 13.47) As exp ected, the interpolation error of the GP fi ts obtained u sing the p opular approac h slight ly decreases by lo wering the b oundary v alue of the nugget parameter (from 10 − 5 to 10 − 10 ) in the optimization p roblem. Th e prop osed m etho d leads to predictors with significan tly higher in terp o- lation accuracy . The p ercen tiles of ξ 0 I ,M in the last ro w of T able 5 also suggest that most of the 19 correlation matrices are well-c onditioned for the θ v alues near ˆ θ mle , and δ lb ( ˆ θ mle ) = 0. That is, the iterativ e approac h is n ot needed to f urther improv e the in terp olation accuracy . T able 6 s ummarizes the prediction accuracy v alues (i.e., ξ P ,M ) for 50 simulations. Th e test set f or computing ξ 0 P ,M is a randomly chosen 8000-point maximin Latin hypercub e d esign. It is clear f rom T able 6 that more accurate prediction can b e ac hieved b y low ering the δ v alue in the optimization pro cess of the p opular approac h , and certainly the prop osed approac h results in the b est pr ed iction at unsamp led p oin ts among the three cases considered here. T able 6: Median and ( P 5 , P 95 ) of ξ 0 P ,M v alues f or the predictors in the pr op osed approac h (denoted b y “ lb ”) and the p opular approac h (denoted by “ ml e ”) applied to the b oreh ole mo del. n = 50 n = 75 n = 100 n = 125 ˆ δ mle ≥ 10 − 5 4.01 (3.73, 4.37) 4.04 (3.68, 4.34) 4.13 (3.77, 4.53) 4.23 ( 3.93, 4.56) ˆ δ mle ≥ 10 − 10 3.90 (3.55, 4.32) 3.81 (3.48, 4.16) 3.74 (3.27, 4.04) 3.77 ( 3.43, 4.07) δ lb ( M = 1) 3.73 (3.37, 4.08) 3.57 (3.22, 3.86) 3.37 (2.94, 3.82) 3.64 (3.25, 4.03) Example 4 W e no w r evisit the tidal p o wer example in Section 2. The computer simulator (a v ersion of FVCOM) is exp ensive and cannot b e ev aluated at numerous co ordin ates. Eac h of th e runs present ed here required ap p ro ximately one hour to run on 4 pro cessors in p arallel on the A tlan tic Computational Excellence net w ork (A C Enet) mahone cluster. While this is n ot particularly on er ou s on a large cluster, the grid resolutio n us ed in KMLH is ab out 200 m (length of a s ide in a triangle). A realistic mo del of 20 m sided triangular grid and with 10 vertica l la yers to mo del 3D flo w w ould increase th e compu tational exp ense by a factor of 5120 , making eac h individual simula tor run rough ly 10 times more costly than the generation of the en tire data set examined h ere. The o cean mo d elers b eliev e that the sim ulator is deterministic up to the mac hine precision and they are inte rested in an em ulator that int erp olates the sim u lator. A total of 533 runs (on a 13 × 41 grid) were used to obtain the data displa yed in Figure 5. W e use this data to compare our results. The goal is to bu ild an emulator of the computer mo del usin g a fraction of the budget (533 run s) that pro vides the b est appro ximation of the simulato r. W e used a maximin based cov erage design (Johnson, Mo ore and Ylvisak er 1990) to choose a subset of n = 100 p oints from these 533 p oint s to constitute a sp ace-filling design. The con tours from b oth the pred icted sur face and the tru e simulat or (based on the 13 × 41 grid) are s ho wn in Figure 6. F or successful implementa tion of the p opular approac h outlined in Sections 3.1 and 3.3, the lik eliho od optimization to ok place in the parameter space δ ∈ (10 − 5 , 1) and θ ∈ (0 , ∞ ) 2 , and the parameter estimates for the GP fit are ˆ θ mle = (163 . 1 8 , 50 . 66) and ˆ δ mle = 0 . 0462 (see Figure 6(a)). The parameter estimates for the GP mo del fitted u sing the p rop osed app roac h with M = 1 are 20 −64.5 −64.48 −64.46 −64.44 −64.42 −64.4 −64.38 −64.36 −64.34 45.3 45.31 45.32 45.33 45.34 45.35 45.36 45.37 45.38 45.39 45.4 0 5 10 15 x 10 7 Figure 5: FV CO M outp u ts (a v erage extractable p o wer) o v er a coarse grid in the Minas Pa ssage. A colored version of the figur e is a v ailable online. ˆ θ mle ≈ (788 . 54 , 221 . 18) and δ lb ( ˆ θ mle ) ≈ 0 (see Figure 6(b)). 0.75 0.8 0.85 0.9 0.95 0.2 0.3 0.4 0.5 0.6 0.7 0.8 X 1 X 2 (a) Po pular fit with MLE 0.75 0.8 0.85 0.9 0.95 0.2 0.3 0.4 0.5 0.6 0.7 0.8 X 1 X 2 (b) M = 1 (with low er bound ) Figure 6: Dots denote the design p oints, the solid curv es denote the cont ours for the true simulato r based on the 13 × 41 grid, the dashed curves sho w th e con tours of the GP fits. Figure 6 sho ws that the GP based emulator obtained using the pr op osed approac h (Figure 6(b)) is less sm o oth as compared to the emulat or obtained via the p opular app roac h (Figure 6(a)). The interp olation errors for the GP fits obtained with the p opular metho d and the pr op osed 21 approac h are ξ 0 I ,M = 6 . 63 and − 26 . 58 resp ectiv ely . That is, the prop osed approac h is b etter at appro ximating the interp olator. In terms of predicting the p o wer surface at un sampled p oints (i.e., the rest of 433 p oin ts), the prediction error v alues for b oth the p opular and p r op osed app roac h es are somewhat close, ξ 0 P ,M ≈ 10. Moreo v er, when usin g the p opular approac h, the maxim u m p redicted p o w er obtained by ev aluating max { ˆ y ( x ) , x ∈ χ } is appr oximate ly 1 . 4 · 10 8 W with the maximizer b eing (0.7850, 0.4500 ), whereas if w e use the prop osed approac h th e maxim u m p redicted p o wer is appro ximately 1 . 6 · 10 8 W observed at (0.7900 , 0.450 0). 7 Discussion Assuming that the und erlying compu ter sim ulator is deterministic up to th e mac h in e precision and th e s tatisticia n is certain ab out th e s uitabilit y of a GP m o del with Gaussian correlation as the em ulator, fitting the mo del to a data set with n p oin ts in d -dimensional in put space requ ir es computation of the determinan t and in v erse of n × n correlation matrices for sev eral θ v alues. In Section 4, we conducted a sim u lation study to exp lore sp ace-filling designs (sp ecifically maximin Latin h yp ercub e designs) for differen t com bin ations of ( n, d ) that can lead to near-singular corre- lation m atrices. In Section 5, we pr op osed an iterativ e appr oac h, that is also a generalization of the p opu lar approac h, to construct a new predictor ˆ y δ,M that has higher in terp olatio n accuracy as compared to ˆ y δ — the predictor fr om the p opular approac h. Lemmas 1, 2 and 3 sho w that ˆ y δ,M con verges to the BLUP as the n um b er of iteratio ns ( M ) increases. The lo we r b ound δ lb , prop osed in Section 4, also allo ws u s to use a n on-zero n ugget only when needed, and in this case minimizes the num b er of iteratio ns required to reac h th e desired inte rp olation accuracy . There are a few imp ortant remarks w orth noting. First, the metho dology d evelo p ed here can also b e adapted to the Ba y esian fr amework. F or computing th e p osterior of the parameters and the predictor, | R | and R − 1 need to b e computed for s everal r ealizati ons of θ , and a nugget is often used to o verco me the near-singularit y of R (e.g., T addy et al. 2009). T he prop osed low er b ound δ lb can b e u sed for defin ing a prior for δ , i.e., the s earc h should b e limite d to [ δ lb , 1). One can also use the iterativ e app roac h to furth er impr o ve the inte rp olation and/or prediction accuracy . Second, we used the squared exp onential correlation ( p k = p = 2 for all k ) in the GP mo del b ecause of its p op u larit y and go o d theoretical prop erties. It tur ns out the GP mo d el with other p o w er exp onen tial correlation (i.e., p k = p < 2) ma y lead to predictors with larger MSE and sometimes worse fits as compared to that of th e GP m o dels with the Gaussian correlation. Recall that the near-singularit y of R o ccurs b ecause (a) at least tw o of the design p oin ts (sa y x i and x j ) 22 are close together in the input sp ace, and /or (b) the hyper -p arameters θ k , k = 1 , ..., d are ve ry close to zero, i.e., P d k =1 θ k | x k ,i − x k ,j | p k ≈ 0. T his make s a few of the ro ws of R v ery similar, and will happ en ev en if p k < 2. Th at is, the ill-conditioning pr oblem ma y also o ccur when other p o w er exp onential correlatio n functions (i.e., p k ’s are same and less than 2 or p k ’s are different and less than 2) are used. A close r in v estigation r ev eals that with p k = p < 2, near-singular cases o ccur very frequently in the sequen tial design setup. Ho w ever, for the sp ace-filling designs, it is rather f ascinating that the o ccurr ence of near-singular cases is substantia lly reduced b y ev en a small reduction in the p o w er from p = 2 to p = 1 . 99. W e su sp ect this is due to the limiting b eha viour of the Gaussian correlation in the family of p o w er exp onen tial correlation fu nctions p ∈ (0 , 2]. In conclusion, wh en fitting a GP mo del to a data set obtained from a deterministic computer mo del with nearly–singular correlation matrices, w e recommend using δ lb - the lo w er b ound on the n ugget, along with the iterativ e approac h with the n um b er of iterations, M , c hosen according to the desired inte rp olation accuracy . Ac k n o wledgmen ts W e would lik e to thank the AE and t wo anonymous referees for man y useful comments and s ug- gestions that lead to significan t improv ement of the pap er . This w ork was partially supp orted b y Disco ve ry gran ts from th e Natural Sciences and Engineering Researc h Council of C anada. REFERENCES Ababou, R., Bagtzoglou, A. C. and W ood, E. F. (199 4). On th e condition n um b er of co v ariance matrices in kriging, estimation, and simulatio n of random fields. Mathematic al Ge olo gy , 26, 99 - 133. Andre, J., Siarr y, P. and Dognon, T. (200 0). An impr o vemen t of the standard genetic algorithm fighting premature con vergence. A dvanc es in Engine ering Softwar e , 32, 49 - 60. Aslett, R., Buck, R. J., Duv al l, S. G., Sacks, J . and Welch , W. J. (1998 ). Circuit opti- mization via sequentia l computer exp erimen ts: d esign of an outpu t bu ffer. Applie d Statistics , 47(1), 31 – 48. Bar ton, R. R. and Salagame, R. R. (1997). F actorial hyp ercub e d esigns for spatial correlation regression. Journal of Applie d Statistics , 24, 453 - 473 . 23 Bastos, L. and O’Ha gan, A. (2009 ). Diagnostics for Gaussian P r o cess E mulators. T e chnomet- rics , 51, 4, 425 - 438. Bergmann F. T. a nd Saur o H. M. (2008 ). Comparing simulat ion results of SBML capable sim ulators. Bioinformatics , 24, 196 3 – 196 5. Booker, A. J ., Dennis Jr., J. E., Frank , P. D., Serafini, D. B., Tor czon, V. and Tr osset, M. W. (1999) . A rigorous framew ork for optimization of exp ensive fun ctions b y surrogates. Structur al and Multidisciplinary Optimization , 17, 1 - 13. Booker, A. (20 00). W ell-conditioned Kriging mo dels for op timization of compu ter sim ulations. Mathematics and Computing T e chnolo gy P hantom Works - The Bo eing Comp any , M&CT- TECH-00-002 . Branin, F. K . (1972) . A widely con v ergent metho d for find ing m ultiple solutions of sim ultaneous nonlinear equations. IBM J. R es. Develop. , 504 - 522. Chen, C., Beardsley, R. C. and Cowles, G. (2006). An un structured grid, fi n ite-v olume coastal o cean mod el (FV COM) system. Oc e ano gr aphy , 19, 78 - 89. Golub, G. H. and V an Loan, C. F. (199 6). Matrix Computations . J ohns Hopkin s Univ ers it y Press, Baltimore, MD. Gramacy, R. B. and Lee, H. K. H. (2008). Bay esian treed Gaussian p ro cess mo dels with an application to compu ter mo deling. J. Amer. Statis. Ass. , 103(483), 111 9-113 0. Greenberg, D. (1979). A n umerical mo del inv estigation of tidal phenomena in the Ba y of F undy and Gulf of Maine. Marine Ge o desy , 2, 161 - 187. Huang, D., Al len, T.T., Notz, W.I. and Mille r, R.A. (2006). Sequen tial Kriging op timiza- tion using multiple fidelit y ev aluations, Struct. Multidisc Optim. , 32, 369 - 382 . Johnso n, M. E., Moore, L. M., and Yl visaker, D. (1990). Minimax and maximin distance designs. Journal of Statistic al Planning and Infer enc e , 26, 131 - 148. Jones, D., S chonlau, M ., and W elch, W. (1998) . Efficient Global O ptimization of Exp ensiv e Blac k-Bo x F unctions. Journal of Glob al Optimization , 13, 455 - 492. Joseph , V. R., Hung, Y., and A. Sudjianto. (2008) . Blind Kriging: A New Metho d for Dev eloping Metamo dels. Journal of Me chanic al Desi g n , 130(3 ), 31 – 102. 24 Lebedev, V. I. (1997). An intr o duction to functional analysis in c omputational mathematics , Birkh¨ auser, Boston. Karsten , R., McMill an, J ., Lickl ey, M. and Ha ynes , R. (2008) . Assessment of tidal current energy for the Minas P assage, Ba y of F undy . P r o c e e dings of the Institution of Me chanic al Engine ers, Part A: Journal of Power and Ener g y , 222, 493 - 507 . Kumar, B. and D a vidson, E. S . (1978). Performance ev aluation of highly concurr ent computers b y deterministic sim ulation, Comm unic ations of the ACM , 21 (11), 904 – 913. Medina, J .S., Moreno, M.G. and Ro yo, E.R. (2005). Sto c hastic vs d eterministic traffic sim ulator. Comparativ e study for its u se w ithin a traffic light cycles optimization arc hitecture, In pr o c. IWIN AC (2), 622 – 631. Neal, R. M. ( 1997). Mon te Carlo imp lementati on of Gaussian pro cess mo dels f or Ba ye sian regression and classification. T ec h Rep. 9702, Dept. of Statistics, Univ. of T or onto, Canada . Neumaier, A. (19 98). Solving ill- conditioned and singular linear systems: A tutorial on regular- ization. SIAM R eview , 40, 636 - 666. O akley, J. (2004 ). Es timating p ercentil es of computer code outputs. Appl. Statist. , 53, 83 - 93. Poole, D. and A. E. Rafte r y. (2000 ). I nference f or d eterministic sim ulation mo d els: the Ba yesia n melding approac h. J. Amer. Statis. Ass. , 95 (452), 1244 -1255 . Ranjan, P., Bingham, D. and Micha ilidis, G. (2008). S equen tial exp eriment design for con tour estimation from complex computer co des. T e chnometr ics , 50, 527 - 541. Rasmuss en, C. E. and Williams , C. K . I. (2006). Gaussian Pr o c esses for Machine L e arning . The MIT Press. Sacks, J., Welch, W. J ., Mitchell, T. J . and Wynn, H. P. (1989). Design and analysis of computer exp erimen ts. Statistic al Scie nc e , 4, 409 - 423. Santner , T. J., Williams, B. J. and Notz, W. (2003) The Design and Analysis of Computer Exp eriments . Spr inger-V erlag Inc., New Y ork, NY. Schonlau, M., Welch, W. and Jones, D. (1998). Global versus lo cal search in constrained op- timizatio n of computer mo dels. Ne w Developments and A pplic ations in E xp erimental Design, Institute of Mathematic al Statistics L e ctur e N otes, H aywar d, California , 34, 11 - 25 25 Stein, M. (1987 ). Large Sample Prop erties of Sim ulations Using Latin Hyp ercub e Sampling. T e chnometr ics , 29, 143 – 151. Stein, M. L. (1999). Interp olation of Sp atial Data: Some The ory for Kriging . Springer, NY. Su, H, Nelder, JA, W olber t, P and S pence, R (1996 ). Application of generalized linear mo dels to design imp ro v ement of an engineering artefact. Q u al R eliab E ngng Int , 12, 101-112. T addy, M. A., Lee , H. K. H., Gra y, G. A. and Griffin, J. (2009). Ba ye sian guidance for robust pattern search optimizatio n. T e chnometr ics (to app ear). Tikhonov, A. N. (1993). Solution of in correctly f ormulated problems and the r egularization metho d. Soviet Math. Doklady , 4, 1035 - 1038. V an Beers, W. C. M., and Kleijnen, J. P.C. (2004) . Kriging inte rp olation in sim ulation: a surve y . In Pr o c e e dings of the 2004 Winter Simulation Confer enc e , ed. R.G. Ingalls, M.D. Rossetti, J.S . Sm ith, and B.A. Peter, 113 - 121. Piscata wa y , New Jersey: Institute of Electrical and Electronics Engineers. Y ang, X. -S. (2010). T est problems in optimiza tion, Engine ering Optimizatio n: An Intr o duction with Metaheuristic Applic ations (Eds Xin-She Y ang), John Wiley & Sons . 26
Original Paper
Loading high-quality paper...
Comments & Academic Discussion
Loading comments...
Leave a Comment