A Computationally Stable Approach to Gaussian Process Interpolation of Deterministic Computer Simulation Data

A Computational ly Stable Approac h to Gauss ian Pro cess In terp olation of Determinist ic Computer Sim ulatio n Data Pritam Ranjan 1 , Ronald Ha ynes 2 and Ric hard Kar sten 1 1 Department of Mathematics and Statistics, Acadia University , NS, Canada 2 Department of Mathematics and Statistics, Memorial Universit y , NL, Canada (pritam.ranjan@a cadiau.ca, rhaynes@mun.ca, richard.k arsten@a cadiau.ca) Abstract F or man y exp ensive deterministic computer sim ulators, the o utputs do not have replica tion error and the desired metamo del (or statistical em ulator) is an interp olator of the obser ved data. Realizations o f Gaussia n spatial pro cesses (GP) a re commonly used to mo del such sim- ulator outputs. Fitting a GP mo del to n da ta points r equires the co mputatio n of the inv er se and determina nt of n × n corr elation matrices , R , that a re so metimes computationally unstable due to near- s ingularity o f R . This happens if an y pair of design points are v ery close together in the input space. The p opular approa ch to ov ercome nea r-singular ity is to in tro duce a s mall nu gget (or jitter) parameter in the mo del that is estimated along with other mo del pa r ameters. The inclusion o f a nu gget in the mo del o ften ca uses unnecessa r y ov er-smo othing of the data. In this pap er, w e pro po se a low er b ound o n the n ug get that minimizes the over-smoo thing and an iterative regular iz ation approa ch t o co ns truct a predictor that f urther impr ov e s the interpolation accuracy . W e also show that the prop o s ed predictor con v erges to the GP interpolato r. KEY WORDS: Computer exp er iment ; Ma trix inv erse approximation; Regular iz a tion. 1 In tro duction Computer sim ulators are ofte n used t o mod el c omplex ph ysical and engineering pro cesses that are either to o exp ensiv e or time consuming to observ e. A sim ulator is said to b e d etermin istic if the replicate runs of the same inputs will yield identica l r esp onses. F or the last few decades, deterministic sim ulators hav e b een widely used to mo d el p h ysical pr o cesses. F or instance, Kumar and Da vidson (1978) u sed deterministic sim ulation mo dels for comparing the p erformance of highly concurrent computers; Su et al. (1996 ) used generalize d linear regression mo d els to design a lamp ﬁ lamen t via a d eterministic ﬁ nite-elemen t compu ter co de; Aslett et al. (1998 ) discuss an optimization problem for a deterministic circuit simulator; sev eral deterministic sim ulators are b eing used for analyzing bio c hemical net w orks (see Bergmann and Sauro 2008 for references). On the other hand, there are cases where sto chastic (non-deterministic) sim ulators are preferred du e to un a voidable biases (e.g., Poole and Raftery 2000). In spite of the recen t in terest in sto c hastic 1 sim ulators, d eterministic sim ulators are still b eing activ ely used. F or instance, Medina, Moreno and Roy o (2 005) demonstrate the preference of deterministic traﬃc sim ulators o v er their sto chastic coun terparts. In this pap er, w e assume that the sim ulator u nder consideration is deterministic up to w orking pr ecision and the scienti st is conﬁden t ab out the v alidity of the simulator. Sac k s , W elc h , Mitc hell and Wynn (1989) prop osed mo d eling (or em ulating) such an exp ensiv e deterministic sim ulator as a realization of a Gaussian sto c hastic p r o cess (GP). An em ulator of a deterministic simulat or is desired to b e an in terp olator of the observ ed data (e.g., Sac ks et al. 1989; V an Beers and Kleijnen 2004). F or the problem that motiv ated this wo rk, the ob jectiv e is to em ulate the a v erage extracta ble tidal p o w er as a function of the turbine lo cations in the Ba y of F und y , Nov a Scotia, Canada. The deterministic computer sim ulator for the tid al p o wer m o del is a n umerical solv er of a complex system of partial diﬀerential equatio ns, and we accept the sim ulator as a v alid r epresent ation of the tidal p o w er. In this pap er, we discuss a computational issue in building the GP based emulator for a determin- istic sim ulator. Fitting a GP mo del to n data p oin ts using eit her a maxim um lik eliho o d tec hnique or a Ba ye sian approac h requires the computation of the determinant and inv erse of seve ral n × n correlation matrices, R . Although the correlation matrices are p ositiv e d eﬁnite by deﬁnition, near- singularit y (also referred to as ill-conditioning) of these matrices is a common problem in ﬁtting GP mo dels. Abab ou, Bagtzoglou and W o o d (1994 ) study the relationship of a uniform grid to the ill-conditioning and qu alit y of mo del ﬁt for v arious co v ariance mo dels. Barton and S alagame (1997 ) study the eﬀect of exp eriment al design on the ill-conditioning of kriging m o dels. Jones, Sc honlau and W elc h (19 98) us ed th e singular v alue decomp osition to o v ercome the near-singularit y of R . Booker (2000) used the s um of indep enden t GPs to o verco me near-singularit y for m ulti-stage adaptiv e designs in kr iging mo dels. A m ore p opu lar solution to o v ercome n ear-singularit y is to in tro duce a nugge t or jitter p arameter, δ , in the mo d el (e.g., Sacks et al. 1989; Neal 1997 ; Booker et al. 1999; S an tner, Williams and Notz 2003; Gramacy and Lee 2008) that is estimated along with other mo d el parameters. How ev er, adding a n u gget to the mod el introdu ces add itional smo othin g in the p redictor and as a r esu lt the pred ictor is no longer an interp olator. Here, w e ﬁrst prop ose a lo wer b ound on the n ugget ( δ lb ) that minimizes the additional o v er - smo othing. Second, an iterativ e approac h is deve lop ed to enable the construction of a new p redictor that further impro v es the interp olation as well as the prediction (at unsampled design p oints) accuracy . W e also show that th e p rop osed p redictor con verges to an interp olator. Alt hough an arbitrary nugget (0 < δ < 1) can b e used in the iterativ e approac h, the rate of con v ergence (i.e., the n um b er of iteratio ns required to reac h certa in tolerance) dep ends on the magnitude of the n ugget. 2 T o this eﬀect, the p r op osed lo w er b ound δ lb signiﬁcan tly r educes the num b er of ite rations r equired. This feature is particularly desirable for implementat ion. The p ap er is organized as follo ws. Section 2 present s the tidal p ow er mo deling example. In Section 3, w e review the GP mo del, a compu tational issue in ﬁ tting the m o del, and the p opular approac h to ov ercome near-singularity . Section 4 pr esents the new low er b ound for the nugg et that is required to ac hiev e well- conditioned correlation matrices and min imize unnecessary o v er-sm o othing. In Section 5, we dev elop the iterati v e approac h for constructing a more acc urate predictor. Several examples are present ed in S ection 6 to illustrate the p erformance of our prop osed predictor ov er the one obtained us ing the p opular approac h. Finally , w e conclude the pap er with s ome remarks on the numerical issues and recommend ations for practitioners in Section 7 . 2 Motiv ating example The Ba y of F un d y , lo cated b et ween New Bru nswic k and No v a Scotia, Canada, w ith a sm all p ortion touc h ing Maine, US A, is w orld famous for its high tides. In the upp er p ortion of the Ba y of F undy (see Figure 1(a)), the diﬀerence in wat er lev el b et w een h igh tide and lo w tide can b e as muc h as 17 meters. The high tides in this region are a result of a resonance, with the natural p erio d of the Ba y of F un dy very close to the p erio d of the principal lunar tide. This results in ve ry regular tides in the Ba y of F un dy with a h igh tide ev ery 12.42 hour s. Th e incredible energy in these tides has mean t that the region has signiﬁcant p oten tial for extracting tidal p o w er (Green b erg 1979; Karsten, McMillan, Lic kley and Ha ynes 200 8 (h er eafter KMLH)). Though the notion of h arnessing tidal p o w er from the Ba y of F und y is not new, earlier prop osed metho ds of harvesti ng the muc h needed green electrica l energy in v olv ed building a barr age or dam. Th is metho d w as considered infeasible for a v ariet y of economic and en vironment al reasons. Recen tly , there h as b een rapid tec hnological deve lopmen t of in-steam tidal turbines. These devices act m uc h like wind turbines, with individual turbines placed in regions of strong tidal currents. Individu al tur bines can b e up to 20 m in diameter and can pro duce o v er 1 MW of p o wer. Ideally , these turbines w ould pro duce a p redictable and renew able s ource of p o w er w ith less of an impact on the environmen t than a d am. KMLH examined the p o w er p otent ial of f arms of such turbines across the Minas P assage (Figure 1(b)) wh ere the tidal cur r en ts are strongest. They found that the p oten tial extractable p o w er is muc h higher than pr evious estimates and that the environmen tal impacts of extracting p o w er can b e greatly reduced b y extracting only a p ortion of the m axim u m p o w er av ailable. The simulatio ns in K MLH did not represen t ind ividual turb ines and left open the 3 question of h o w to optimally place turb in es. In this pap er, we em ulate the KMLH numerical mo del to examine the placemen t of turbines to maximize the p ow er output. 66 o W 30’ 65 o W 30’ 64 o W 30’ 45’ 45 o N 15’ 30’ 45’ 46 o N Bay of Fundy Nova Scotia Minas Passage New Brunswick Minas Basin Nova Scotia (a) The upp er Bay of F und y 30’ 28’ 64 o W 26.00’ 24’ 22’ 20’ 18’ 19’ 20’ 45 o N 21.00’ 22’ 23’ 24’ (b) The Minas Passage Figure 1: Figure (a) shows the tr iangular grid used in the FV COM mo del for sim u lating tides in the upp er Ba y of F u ndy . The small b ox in the cen ter surroun ds the Minas P assage s h o wn in (b). The shaded triangles in the cen ter of (b) represen t a p ossible turb ine lo cation. W e numericall y sim ulate th e tides as in KMLH b y solving the 2D shallo w w ater equations using the Finite-V olume Coastal Ocean Mo del (FV COM) with a triangular grid on the upp er Ba y of F und y (see Figure 1(a)). Since the grid triangle s diﬀer in size and orien tation, the i -t h turbine was mo deled on the set of all triangular elemen ts whose cen ters lie within 250 m of ( x i , y i ). A p ossible turbine lo cation is shown in Figure 1(b). T he triangular grid wa s devel op ed by Da vid Green b erg and colleagues at the Bedford Institute of Oceanograph y , NS, Canada. The details of FV COM can b e foun d in Chen et al. (2006). Using this set up , the estimate of the electric p o w er that can b e harnessed th rough a turbine at a particular lo cation ( x, y ) o v er a tidal cycle T = 12 . 42 hour s is obtained by the simulator in KMLH. The a v erage tidal p o w er at the lo cation ( x, y ) is given by ¯ P ( x, y ) = 1 T Z T 0 P ( t ; x, y ) dt , where P ( t ; x, y ) is the extractable p o wer ou tp ut at time t and lo cation ( x, y ). The pro cess is deterministic up to the mac hin e precision, and the main ob jectiv e is to em ulate ¯ P ( x, y ). It tu rns out th at the GP mo del ﬁtted to the sim ulator output at n = 100 p oints (c h osen usin g a space-ﬁlling d esign criterion) is not an interp olator an d results in an o ver-smoothed emulato r (see 4 Example 4 f or details). Th is is un desirable as the o cean mo d elers are in terested in an em ulator th at in terp olates their sim ulator. This em ulator will b e used to obtain estimates of b oth the maximizer of the p o w er fun ction (i.e., the lo cation wh er e to pu t the tu r bine) and th e extractable p o w er at this location. Th e man u facturing and installation cost of the initial protot yp e turbine is v ery h igh (roughly 20 million dollars). Since th e o v er-smoothed em u lator can un derestimate the maxim um extractable p o w er, a go o d approximat ion of th e att ainable p o wer function can b e h elpful in sa ving the cost of a few turbin es. Example 4 sh o ws that the prop osed approac h leads to a more accurate estimate of the m axim u m extractable p o w er . 3 Bac kgroun d review 3.1 Gaussian pro cess mo del Let th e i -th input and output of the computer sim ulator b e denoted b y a d -dimensional v ector, x i = ( x i 1 , ..., x id ), and the univ ariate resp onse, y i = y ( x i ), resp ectiv ely . The exp eriment design D 0 = { x 1 , ..., x n } is the set of n input trials. The outpu ts of the sim ulation trials are held in the n -dimensional v ector Y = y ( D 0 ) = ( y 1 , y 2 , . . . , y n ) ′ . The simula tor output, y ( x i ), is m o deled as y ( x i ) = µ + z ( x i ); i = 1 , ..., n, (1) where µ is the o v er all m ean, and z ( x i ) is a GP with E ( z ( x i )) = 0, V ar ( z ( x i )) = σ 2 z , and Co v( z ( x i ) , z ( x j )) = σ 2 z R ij . In general, y ( D 0 ) has a multiv ariate normal distribu tion, N n ( 1 n µ, Σ), where Σ = V ( D 0 | y ( D 0 )) = σ 2 z R , and 1 n is a n × 1 v ector of all ones (see Sac ks et al. 1989, and Jones et al. 1998 f or details). Although there are several c hoices f or the correlation function, w e fo cus on the Gaussian correlation b ecause of its p rop erties lik e smo othness (or diﬀerentiabilit y in mean square sense) and p opularity in other areas lik e mac h in e learning (radial basis k ernels) and geostat istics (kriging). F or a d etailed discussion on correlation functions see Stein (19 99), San tner, Williams and Notz (2003) , and Rasm ussen and Williams (2006) . The Gaussian correlation fun ction is a sp ecial case ( p k = 2 for all k ) of the p o w er exp onen tial correlation family R ij = corr( z ( x i ) , z ( x j )) = d Y k =1 exp {− θ k | x ik − x j k | p k } , for all i, j, (2) where θ = ( θ 1 , ..., θ d ) is the vect or of h yp er-parameters, and p k ∈ (0 , 2] is the smo othn ess param- eter. As discussed in Section 7, the results dev elop ed in th is pap er ma y v ary sligh tly when other correlation structur es are used instead of the Gaussian correlation. 5 W e use the GP mo d el with Gaussian correlation function to pr edict r esp onses at any u nsampled design p oin t x ∗ , ho wev er, th e theory deve lop ed here is also v alid for other corr elation structur es in the p o w er exp onent ial family (see Section 7 for more details). F ollo wing the maxim um lik eliho od approac h, the b est linear unbiase d predictor (BLUP) at x ∗ is ˆ y ( x ∗ ) = ˆ µ + r ′ R − 1 ( Y − 1 n ˆ µ ) =  (1 − r ′ R − 1 1 n ) 1 ′ n R − 1 1 n 1 ′ n + r ′  R − 1 Y , (3) with mean squared error s 2 ( x ∗ ) = σ 2 z (1 − 2 C ′ r + C ′ RC ) (4) = σ 2 z  1 − r ′ R − 1 r + (1 − 1 ′ n R − 1 r ) 2 1 ′ n R − 1 1 n  , where r = ( r 1 ( x ∗ ) , ..., r n ( x ∗ )) ′ , r i ( x ∗ ) = corr( z ( x ∗ ) , z ( x i )), an d C is such that ˆ y ( x ∗ ) = C ′ Y . In practice, the p arameters µ , σ 2 z and θ are replaced with estimates (see Sac ks et al. 1989, S an tner, Williams and Notz 2003, for details). 3.2 A computational issue in mo del ﬁtting Fitting a GP mod el (1)–(4) to a d ata set with n observ ations in d -dimensional inpu t space requires n umerous ev aluatio ns of the log- lik elihoo d function for sev eral r ealizations of the parameter vect or ( θ 1 , ..., θ d ; µ, σ 2 z ). The closed form estimators of µ and σ 2 z , giv en by ˆ µ ( θ ) = ( 1 n ′ R − 1 1 n ) − 1 ( 1 n ′ R − 1 Y ) and ˆ σ 2 z ( θ ) = ( Y − 1 n ˆ µ ( θ )) ′ R − 1 ( Y − 1 n ˆ µ ( θ )) n , (5) are often u sed to obtain the proﬁle log-lik eliho o d − 2 log L p ∝ log ( | R | ) + n log[( Y − 1 n ˆ µ ( θ )) ′ R − 1 ( Y − 1 n ˆ µ ( θ ))] , (6) for estimating the h yp er-parameters θ = ( θ 1 , ..., θ d ), wh ere | R | denotes the d eterminan t of R . Recall from (2), that the correlatio n matrix R d ep ends on θ and the design p oints. An n × n matrix R is said to b e near-singular (or, ill-conditioned) if its condition n um b er κ ( R ) = k R k · k R − 1 k is to o large (see Section 4 for details on “ho w large is large?”), where k · k denotes a matrix norm (w e will u se the L 2 –norm). Although these correlation matrices are p ositiv e deﬁnite b y deﬁn ition, computation of | R | an d R − 1 can sometimes b e unstable d ue to ill-conditioning. This pr oh ib its precise computation of th e lik eliho o d and h ence the parameter estimat es. Ill-conditioning of R often o ccurs if an y pair of design p oin ts are v ery close in the inpu t space, or θ k ’s are close to zero, i.e., P d k =1 θ k | x ik − x j k | p k ≈ 0. The d istances b et wee n neigh b oring p oin ts in space- ﬁlling designs with large n (sample size) and small d (input dimension) can b e v ery small. 6 Near-singularit y is more common in the s equ en tial design setup (e.g., expected impro v ement based designs, see J ones et al. 1998; Schonlau et al. 1998; Oakley 2004; Hu ang et al. 2006; Ranjan et al. 2008; T addy et al. 2009), where the follo w-up p oin ts tend to “pile u p ” near th e pre-sp eciﬁed features of interest lik e the global maxim um , con tours, quan tiles, and so on. 3.3 The p opular approac h A p opular appr oac h to ov ercome th e ill-conditioning of R is to in tro duce a nugget, 0 < δ < 1 in the m o del, and replace the ill-conditioned R w ith a well- conditioned R δ = R + δ I that h as a smaller condition n um b er (see Section 4 for details) as compared to that of R . Equ iv alen tly , one can introdu ce an in dep end en t w hite noise p ro cess in the mo del y ( x i ) = µ + z ( x i ) + ǫ i , i = 1 , ..., n, where ǫ i are i.i.d. N (0 , σ 2 ǫ ). That is, V ar ( Y ) = V ( D 0 | y ( D 0 )) = σ 2 z R + σ 2 ǫ I = σ 2 z ( R + δI ) for δ = σ 2 ǫ /σ 2 z . The v alue of the nugg et is b ounded ab ov e, δ < 1, to ensu re that the n umerical uncertain t y is smaller th an the pr o cess uncertain t y . Th e resulting BLUP is give n b y ˆ y δ ( x ) =  (1 − r ′ ( R + δ I ) − 1 1 n ) 1 ′ n ( R + δ I ) − 1 1 n 1 ′ n + r ′  ( R + δ I ) − 1 Y , (7) and the asso ciated mean squared error s 2 δ ( x ) is s 2 δ ( x ) = σ 2 z  1 − 2 C ′ δ r + C ′ δ RC δ  , (8) where C δ is suc h that ˆ y δ ( x ) = C ′ δ Y . Theoretically , it is s tr aigh tforward to s ee that th e u se of a p ositiv e nugg et in th e GP mo d el pro du ces a non-inte rp olator. Jones et al. (1998) show that the GP ﬁ t giv en by (3) and (4) is an in ter- p olator b ecause for 1 ≤ j ≤ n , r ′ R − 1 = e ′ j , where e j is the j -th unit v ector, r = ( r 1 ( x j ) , ..., r n ( x j )) ′ and r i ( x j ) = corr( z ( x i ) , z ( x j )). If w e use a δ ( > 0) in the mo del (i.e., replace R with R δ ), then r ′ R − 1 δ 6 = e ′ j and thus ˆ y ( x j ) 6 = y j and ˆ s 2 ( x j ) 6 = 0. F rom a practitioner’s viewp oin t, one could sacriﬁce exact interp olation if the interpolation accuracy of the ﬁt is within the desired tole rance, but it is not alw a ys ac hiev able (see Sectio n 6 for illustrations). The n ugget parameter δ is often estimated along with the other mo del parameters. How ev er, one of the ma jor concerns in the optimization is that the likeli ho o d (mo diﬁed b y replacing R with R δ ) computation fails if the ca ndidate nugget δ ∈ (0 , 1) is n ot large enough to ov ercome ill-co nditioning of R δ . T o a v oid th is problem in the optimizat ion, it is common to ﬁx an ad-ho c b oundary v alue on the nugget parameter. Th e resulting maximum lik eliho o d estimate is often close to this b oundary 7 v alue and th e ﬁ t is not an int erp olator of the observ ed d ata (i.e., th e interp olation err or is more than the desired tolerance). Even if the estimated nugg et is n ot near the b oundary , the us e of a n ugget in th e mo d el in this manner may in trod uce un necessary o v er-sm o othing from a p r actical standp oint (Section 6 p resen ts sev eral illustrations). In the next s ection, we p r op ose a low er b ound on the nugget that minimizes the unnecessary o ve r-smo othing. 4 Cho osing the Nugget Recall from Section 3.2 that an n × n matrix R is said to b e ill-conditioned or near-singular if its condition num b er κ ( R ) is to o large. Thus, we in tend to ﬁnd δ suc h that κ ( R δ ) is smaller than a certain thresh old. Ou r main ob jective s here are to compu te the condition num b er of R δ and the threshold that classiﬁes R δ as wel l-b eha v ed. Let λ 1 ≤ λ 2 ≤ · · · ≤ λ n b e the eigen v alues of R . Th en, in the L 2 –norm, κ ( R ) = λ n /λ 1 (Golub and V an Loan 1996) . The addition of δ along the main d iagonal of R shifts all of the eigen v alues of R by δ . That is, the eig en v alues of R δ = R + δ I are λ i + δ , i = 1 , ..., n , where λ i is the i -th smallest eigen v alue of R . T hus, R δ is w ell-conditioned if log( κ ( R δ )) / a λ n + δ λ 1 + δ / e a δ ' λ n ( κ ( R ) − e a ) κ ( R )( e a − 1) = δ lb , where κ ( R ) = λ n /λ 1 and e a is the d esired threshold for κ ( R δ ). Note th at δ lb is a fu nction of the design p oints and the hyper -p arameter θ . The closed form expressions for the eigenv alues and hence the condition num b er of a Gaussian correlation matrix R , in (2), for arb itrary θ and d esign { x 1 , ..., x n } is, to our kno w ledge, yet u n- kno wn. If x ∈ ( −∞ , ∞ ) d and x k ∼ N (0 , σ 2 x ), closed form expressions of the exp ected eigen v alues of R are kno wn (see Section 4.3 in Rasm u s sen and Williams 2006). In ou r case, x ∈ [0 , 1] d , and the design p oin ts are often c hosen u s ing a space-ﬁlling cr iterion (e.g., Latin h y p ercub e with prop erties lik e maximin distance, minimum correlation, O A; un iform d esigns, and so on). In suc h cases, one ma y assum e, at m ost, x k ∼ U (0 , 1) for k = 1 , ..., d . In fact, the ob jectiv es of b uilding eﬃcient em- ulators for computer simulators often include estimating pre-sp eciﬁed pro cess features of interest, and s equ en tial d esigns (e.g., exp ected improv ement based designs) are preferred to ac hiev e such goals. In such designs, the follo w-u p p oin ts tend to “pile up” near the feature of interest. The distributions of su c h d esign p oint s are not u niform and can b e non-trivial to represen t in analyti cal 8 expressions. Hence, it is almost surely infeasible to obtain closed form expressions for the eigen v al- ues of su ch R in general. O f course, one can compute these quant ities numerically . W e u se Matlab’s built-in function eig to compute the maximum eigen v alue of R and c ond to calculate the condition n um b er κ ( R ) = λ n /λ 1 in the expr ession of δ lb . Another imp ortan t comp onent of the p rop osed lo wer b oun d is the threshold for getting w ell- b ehav ed n on-singular correlation matrices. As one wo uld susp ect, the near-singularit y of suc h a correlation matrix dep ends on n , d , th e distribu tion of { x 1 , ..., x n } ∈ [0 , 1] d and θ ∈ (0 , ∞ ) d . W e no w present the k ey steps of th e sim ulation algorithm used for estimating the threshold und er a sp eciﬁc design f ramew ork. Th e resu lts are a v eraged o v er the distribution of { x 1 , ..., x n } and θ , and th us it is su ﬃcien t to ﬁnd the threshold of κ ( R ). F or several com bin ations of n and d , we generate 5000 correlation matrices where the design p oints { x 1 , ..., x n } follo w the maximin Latin h yp ercub e s ampling sc h eme (Stein 1987) and θ k ’s are c hosen fr om an exp onentia l d istribution with mean 1. Recall from Section 3.2 that a near- singular (or ill-conditioned) correlation matrix has a large cond ition num b er, and κ ( R ) is in versely prop ortional to θ . Consequently , w e fo cuss ed on small v alues of θ in sim ulating R . These correlation matrices are u sed to compu te the pr op ortion of matrices that are near-singular (see the contours in the left panel of Figure 2). W e used Matla b’s built-in fun ction lhsdesign to generate the design p oints and chol (wh ic h computes the Cholesky factorizatio n) to chec k whether or n ot a matrix R w as near-singular un der the working precision. F or a p ositive deﬁnite wel l-b ehave d matrix R , “ [ U, p ] = chol ( R ) ” pr o duc es an upp er triangular matrix U satisfying U ′ U = R and p is zer o. If R is not p ositive deﬁnite, then p is a p ositive inte ger. W e also computed the condition num b ers of these 5000 correlation matrices (u sing Matlab’s b uilt-in function c ond ). Th e r ight panel of Figure 2 presents th e conto urs of the a verage of log( κ ( R )) for diﬀeren t com binations of n and d . F rom Figur e 2, it is clear that a ≈ 25 can b e us ed as the threshold for log ( κ ( R δ )) of a well- b ehav ed correlation matrix R δ . Also note that the p rop ortion of near-singular cases, denoted b y the contours in the left panel of Figure 2 , d ecreases rapid ly with the increment in the inpu t dimension. This is somewhat intuitiv e b ecause the volume of the voi d (or un explored region) increases exp onentia lly with the d imension, and a really large space-ﬁlling design is needed to jeopardize th e conditioning of the correlation matrices in high d imensional inpu t space. F or other design schemes (e.g. , sequential designs), one can follo w these steps to estimate the th reshold for the condition num b er of w ell-b eha v ed correlation m atrices. 9 1 1.5 2 2.5 3 3.5 4 4.5 5 5.5 6 5 10 15 20 25 30 35 40 45 50 0.05 0.1 0.2 0.4 0.6 0.8 0.9 d n/d 10 15 20 25 30 35 40 d n/d 1 2 3 4 5 6 5 10 15 20 25 30 35 40 45 50 Figure 2: The con tour s in the left panel sh o w the prop ortion of correlation matrices ﬂagged as near-singular. T he cont ours in the right panel displa y a v erage log ( κ ( R )) v alues. The shaded region in the left panel corresp onds to log ( κ ( R )) > 25. The lo w er b oun d on the nugge t is only a suﬃcien t condition and not a necessary on e f or R δ to b e well- conditioned. F or instance, a correlation matrix with 100 design p oin ts in (0 , 1) 2 c h osen using a space-ﬁlling criterion ma y lead to a well-beh av ed R if θ is very large. If the correlation matrix is well- conditioned, R should b e used instead of R δ , i.e., δ lb = max  λ n ( κ ( R ) − e a ) κ ( R )( e a − 1) , 0  . (9) That is, when R is well-behav ed our approac h allo ws δ lb to b e zero and h ence a m ore accurate surrogate can b e obtained as compared to the p opular approac h (Section 3.3) wh ere a non-zero n ugget is forced in the mo del which ma y lead to undesirable ov er-smo othing. This could b e of concern in h igh d imensional in put space, b ecause the prop ortion of near-singular cases d ecreases with th e increment in the inp ut dimension. Example 3 demons tr ates the p erformance of the prop osed metho dology o ver the p opular approac h for an eigh t-dimensional simulato r. Although the u se of δ lb in the GP mo del minimizes the o v er-smo othing, δ lb ma y not b e small enough to achiev e the desired in terp olation accuracy (see Examples 1 and 2 f or illustrations), and c h o osing δ < δ lb ma y lead to ill-conditioned R . This ma y not b e a big issue if one b eliev es th at th e sim ulator is somewhat noisy and/or the statistical em u lator is b iased due to m is-sp eciﬁcation in the correlation structure or m o del assumptions. In such cases, a little smo othing m igh t b e a go o d idea. Ho wev er, con trolling the amount of sm o othing is a n on-trivial task and requires more atten tion. On the other h and, o v er-smo othing is u ndesirable if the exp erim enter b eliev es that the compu ter 10 sim ulator is deterministic and the statistician is conﬁden t ab out th e c h oice of the em ulator (w e consider the GP mo del w ith Gaussian correlation structur e). Under these assumptions, we no w prop ose a new p redictor that can ac hiev e the desired lev el of interp olation accuracy . 5 New Iterativ e App r oac h In this section, we pr op ose a p redictor th at is based on the iterativ e use of a nugge t δ ∈ (0 , 1). This approac h do es not dep end sev erely on the m agnitude of the nugg et, and the results dev elop ed here are based on an arbitrary 0 < δ < 1, large enough to ensure R δ w ell-b eha ved. Ho w ev er, c ho osing δ > δ lb ma y requir e more iterations to attain th e desired interp olation accuracy , and w e recommend using δ lb . W e also sho w that the prop osed predictor con verges to th e interp olator (3) and (4). Recall that the k ey p roblem here is the inaccurate computation of | R | and R − 1 due to ill- conditioning of R . The main idea of the new approac h is to r ewrite the proﬁ le log-li k elihoo d as − 2 log L p ∝ − log ( | R − 1 | ) + n log [( Y − 1 n ˆ µ ( θ )) ′ R − 1 ( Y − 1 n ˆ µ ( θ ))] , (10) and r eplace the ill-conditioned R − 1 with a w ell-b eha ved quan tity . This mo d iﬁed proﬁ le log- lik eliho od can then b e optimized to get the parameter estimates. Next , w e describ e how to ﬁn d the appr opriate w ell-behav ed substitute for R − 1 . In the same spirit as the p opular app roac h, w e attempt to ev aluate R − 1 w by solving Rt = w , under the assumption th at R cannot b e in v erted accurately (i.e., R is near-singular) and there exists a δ ∈ (0 , 1) such that R δ = ( R + δ I ) is well-co nditioned. In an attempt to ﬁnd an interp olator of the simulato r (up to certain accuracy), our ob jectiv e is to ﬁn d t ∗ = f ( R δ , w ) that is a b etter appro ximation of t = R − 1 w as compared to ˜ t = R − 1 δ w , suggested by the p opular approac h. T o ac hieve this goal, w e p rop ose to us e iter ative r e gularization (e.g., Tikhono v 1963, Neumaier 1998), a tec h nique for solving ill-conditioned systems of equations. Let s 0 = w and s i , i = 1 , ..., M , b e a sequence of vec tors obtained by recursiv ely solving the system of equations giv en b y ( R + δ I ) s i = δ s i − 1 . (11) Then, the estimate of t = R − 1 w after the i -th iteration (1 ≤ i ≤ M ) of regularization is giv en b y t i = t i − 1 + s i δ , (12) where t 0 is a vecto r of zeros. The ﬁnal solution with M iterations of regularizatio n, t M = M X k =1 δ k − 1 ( R + δ I ) − k w, 11 requires only one direct inv ersion (or one Ch olesky decomp osition) of R δ = R + δ I , follo w ed by M forward and b ac kward substitutions. The pr op osed approximat ion of t = R − 1 w is t M , with M ≥ 1 c hosen to satisfy the in terp olatio n acc uracy requiremen t. Lemma 1 sho ws that the iterativ e regularization approac h in (11) an d (12 ) leads to a solution that is a generalizati on of the p opu lar approac h outlined in S ection 3.3. Lemma 1 L et R b e a n × n p ositive deﬁnite c orr elation matrix, I b e the n × n i dentity matrix, and 0 < δ < 1 b e a c onstant, then R − 1 = ∞ X k =1 δ k − 1 ( R + δ I ) − k . The con v ergence of this inﬁ nite series follo ws from the v on Neumann series (the matrix version of the T a ylor series, Leb edev 1997) exp ansion of g ( u ) = ( R + uI ) − 1 around u = δ : g ( u ) = g ( δ ) + ( u − δ ) g ′ ( δ ) + ( u − δ ) 2 2! g ′′ ( δ ) + ( u − δ ) 3 3! g ′′′ ( δ ) + · · · , i.e., ( R + uI ) − 1 = ( R + δ I ) − 1 + ( u − δ )( − 1)( R + δ I ) − 2 + ( u − δ ) 2 2! ( − 1)( − 2)( R + δ I ) − 3 + · · · = ( R + δ I ) − 1 − ( u − δ )( R + δ I ) − 2 + ( u − δ ) 2 ( R + δ I ) − 3 − · · · . Setting u = 0, we get R − 1 = P ∞ k =1 δ k − 1 ( R + δ I ) − k and thus the prop osed solution obtained using the iterativ e regularization is th e M -th ord er v on Neumann appro ximation of R − 1 . The predictor ˆ y δ in the p opular approac h (7) uses t 1 , the ﬁ rst order vo n Neumann appro ximation, and hence our prop osed appr oac h is a generaliza tion of the p opular approac h. The prop osed regulariza tion is implemen ted by optimizi ng the mo diﬁed proﬁle log-lik eliho o d − 2 log L p ∝ − log( | R − 1 δ,M | ) + n log [( Y − 1 n ˆ µ ( θ )) ′ R − 1 δ,M ( Y − 1 n ˆ µ ( θ ))] , (13) where R − 1 δ,M = P M k =1 δ k − 1 ( R + δ I ) − k . Closed form expressions for ˆ µ ( θ ) and ˆ σ 2 z ( θ ) are the same as in (5) su b ject to R − 1 replaced by R − 1 δ,M . Th e new regularized p redictor ˆ y δ,M ( x ) at x ∈ χ is ˆ y δ,M ( x ) =    1 − r ′ R − 1 δ,M 1 n   1 ′ n R − 1 δ,M 1 n  1 ′ n + r ′   R − 1 δ,M Y , (14) and the corresp onding MSE s 2 δ,M ( x ) is giv en by s 2 δ,M ( x ) = σ 2 z (1 − 2 C ′ δ,M r + C ′ δ,M RC δ,M ) , (15) where C δ,M is suc h that ˆ y δ,M ( x ) = C ′ δ,M Y . Lemmas 2 and 3 establish the con vergence results for an arbitrary 0 < δ < 1. 12 Lemma 2 L et R b e a ne ar-singu lar c orr elation matrix as deﬁne d in ( 2 ), and 0 < δ < 1 b e a nugget such that R + δ I i s wel l-b ehave d. Then, for every x ∗ ∈ χ = [0 , 1] d , lim M → ∞ ˆ y δ,M ( x ∗ ) = ˆ y ( x ∗ ) , wher e ˆ y ( x ∗ ) and ˆ y δ,M ( x ∗ ) ar e deﬁne d in ( 3 ) and ( 14 ) r esp e ctively. The pro of f ollo ws from Lemma 1 and using lim M → ∞ R − 1 δ,M = R − 1 in (14). It is straigh tforward to sho w that C δ,M in (15) con v erges to C in (4) as M → ∞ . This also p ro v es the next result on the con vergence of the mean squared error for th e prop osed predictor. Lemma 3 L et R b e a ne ar-singu lar c orr elation matrix as deﬁne d in ( 2 ), and 0 < δ < 1 b e a nugget such that R + δ I i s wel l-b ehave d. Then, for every x ∗ ∈ χ = [0 , 1] d , lim M → ∞ s 2 δ,M ( x ∗ ) = s 2 ( x ∗ ) , wher e s ( x ∗ ) and s δ,M ( x ∗ ) ar e deﬁne d in ( 4 ) and ( 15 ) r esp e c tively. Lemmas 2 and 3 p ro v e that ev en if a few pairs of p oint s are to o close together in the input space, or θ k ’s are close to zero to cause n ear-singularity of R , the prop osed iterativ e predictor con v erges to an interp olator as M increases (i.e., for 1 ≤ i ≤ n , ˆ y δ,M ( x i ) → y i and s 2 δ,M ( x i ) → 0 as M → ∞ ). Remark: In practice, w hen a pre-sp eciﬁed interpolation accuracy is desired, the pr op osed itera- tiv e appr oac h suggests reﬁtting the GP mo d el (i.e., optimization of (13)) for diﬀerent c h oices of M ≥ 1. Note that the parameter estimate s c hange with M whic h allo ws for th e extra ﬂexibilit y in the mod el that adju sts the ov er-smo othed p ortion of the surr ogate. First of all, the computational cost of ﬁtting this mo d el increases with M . S econdly , the com bin ed cost of reﬁ tting the mo del for diﬀeren t v alues of M can b e quite large. Although the n umerical stabilit y in compu ting R − 1 δ,M do es not c hange with M , computation of | R − 1 δ,M | can b ecome less numerically stable w ith increasing M . This is b ecause R − 1 δ,M → R − 1 as M → ∞ and the computation of | R − 1 | is assumed to b e unstable. Considerin g these issues, w e recommend optimizing th e proﬁ le log-lik eliho o d (13) w ith M = 1 to obtain ˆ θ mle and δ lb ( ˆ θ mle ), and then u se it to co mpute ˆ y δ,M ( x ) and ˆ s δ,M ( x ) for any M ≥ 1 b y follo win g the iterativ e regularization steps outlined ab o v e. The conv ergence results in Lemmas 1, 2 and 3 do not d ep end on the choice of θ and δ in R δ = R + δ I , and so the pr edictor obtained is still an in terp olator. T he k ey steps required for the implemen tation of th e prop osed approac h are as follo ws: 13 1. C omp utation of the p roﬁle log-lik eliho o d (10) for the estimation of θ . (a) Ch o ose a candidate θ in Θ d and compute R . (b) Comp u te the lo w er b ound of nugge t δ lb in (9). Note that δ lb is a f unction of the hyper- parameters θ , the design matrix and th e threshold. (c) Replace R − 1 with R − 1 δ lb , 1 in the likel iho o d (10). 2. O b tain th e p arameter estimates ˆ θ and δ lb ( ˆ θ ) b y optimizing the proﬁ le log-lik eliho o d . T hen compute ˆ µ ( ˆ θ ) and ˆ σ 2 z ( ˆ θ ). 3. Use the parameter estimates ˆ θ , δ lb ( ˆ θ ), ˆ µ ( ˆ θ ) and ˆ σ 2 z ( ˆ θ ) to compute the regularized em ulator giv en by ˆ y δ lb ,M ( x ) and ˆ s 2 δ lb ,M ( x ) in (14 ) an d (15) resp ectiv ely . The num b er of iterations ( M ) in ˆ y δ lb ,M ( x ) and ˆ s 2 δ lb ,M ( x ) dep ends on the d esir ed in terp olation accuracy , and one ca n build s topp ing rules for attaining the p re-sp eciﬁed accuracy in (14). W e use Mahalanobis distance (Ba stos and O’Hagan 2009) to compute th e accuracy of the predictor. Th e in terp olation accuracy is measured b y ξ 0 I ,k = log 10  ( y ( D 0 ) − ˆ y δ lb ,k ( D 0 )) ′ { V ( D 0 | y ( D 0 )) } − 1 ( y ( D 0 ) − ˆ y δ lb ,k ( D 0 ))  , where ˆ y δ lb ,k ( D 0 ) = ( ˆ y δ lb ,k ( x 1 ) , ..., ˆ y δ lb ,k ( x n )) ′ , and V ( D 0 | y ( D 0 )) = σ 2 z ( R + δ lb I ). Similarly , ξ I ,k = log 10  ( ˆ y δ lb ,k ( D 0 ) − ˆ y δ lb ,k − 1 ( D 0 )) ′ { V ( D 0 | y ( D 0 )) } − 1 ( ˆ y δ lb ,k − 1 ( D 0 ) − ˆ y δ lb ,k ( D 0 ))  , measures the impr o vemen t of the predictor ˆ y δ,k in in terp olati ng the data by increasing the num b er of terms in the von Neum ann app ro ximation. Lemmas 2 and 3 s ho w th at b oth ξ I ,k and ξ 0 I ,k tend to −∞ as k in creases. As ξ I ,k tends to −∞ , the predictor ˆ y δ,k is stabilizing. While as ξ 0 I ,k tends to −∞ , the p redictor in (14) is conv erging to the BLUP in (3). As we w ill see in Example 1, the rates of con ve rgence of ξ I ,k and ξ 0 I ,k ma y diﬀer. Th at is, b oth of these measur es ( ξ 0 I ,k and ξ I ,k ) can b e used in practice to c h o ose appropriate M for ac hieving the d esired interp olation accuracy . F or measuring the p rediction accuracy (at out-of-sample p oint s), we deﬁne an analogo us quan tit y ξ 0 P ,k = log 10  ( y ( D new ) − ˆ y δ lb ,k ( D new )) ′ { V ( D new | y ( D 0 )) } − 1 ( y ( D new ) − ˆ y δ lb ,k ( D new ))  , where D new is a set of n new unsampled p oint s in the inp u t space. Th e parameters σ 2 z and θ in the co v ariance matrix V ( D new | y ( D 0 )) = σ 2 z ( R + δ lb I ) are estimated fr om the original data D 0 , bu t δ lb w as recomputed for D new . F or the simulated examples considered in this pap er, w e used maximin Latin h yp ercub e designs of size n new = 1000 · d as D new , whereas the tidal p o wer applicatio n used a holdout set for D new . The next section illustrates that even the b est c h oice of δ can lead to o ver-smooth em ulators, and the iterativ e approac h is adv antag eous. 14 6 Examples T o illustrate the pr op osed approac h we ﬁ rst present a few simulate d examples. The p erformance of the new iterativ e predictor is also compared w ith the p opular approac h . T hen, we revisit th e tidal p ow er mo d eling example. Example 1 Let x 1 , x 2 ∈ [0 , 1], and the u nderlying deterministic simula tor output b e generated using the GoldPrice function (Andre, Siarry an d Dognon 2000), f ( x 1 , x 2 ) = " 1 +  x 1 4 + 2 + x 2 4  2 ( 5 − 7 x 1 2 + 3  x 1 4 + 1 2  2 − 7 x 2 2 +  3 x 1 2 + 3   x 2 4 + 1 2  + 3  x 2 4 + 1 2  2 )# ∗ " 30 +  x 1 2 − 1 2 − 3 x 2 4  2 ( 26 − 8 x 1 + 12  x 1 4 + 1 2  2 + 12 x 2 − (9 x 1 + 18)  x 2 4 + 1 2  + 27  x 2 4 + 1 2  2 )# . F or illustration pu rp oses, w e in ten tionally select a maximin Latin h yp ercub e design (Stein 1987) with n = 70 p oints that leads to an ill-conditioned correlation matrix for small θ ∈ (0 , ∞ ) 2 . It turns out that for this particular d esign (see Figure 3), the correlation matrix R is ill-co nditioned if θ 1 · θ 2 / 3. Figure 3(a) presents the cont ours (at heigh ts y = 120 , 500 , 1000 and 10000) of the true sim ulator (solid curve) and the GP surrogate ﬁt (obtained using the metho dology outlined in Sections 3.1 and 3.3). F or successful implementat ion of the p opu lar appr oac h, we optimized the lik eliho od in the p arameter space δ ∈ (10 − 5 , 1) and θ ∈ (0 , ∞ ) 2 . The parameter estimates for the GP ﬁt are ˆ δ mle = 1 . 0 6 · 10 − 5 and ˆ θ mle = (5 . 01 , 7 . 33) . Note that ˆ δ mle is close to the b oun d ary and the ﬁtted su rrogate is signiﬁcan tly d iﬀeren t than realit y in the cen tral part of the input space. The p arameter estimates for the GP mo del ﬁt obtained fr om the prop osed method are ˆ θ mle = (2 . 26 , 2 . 7 5) and δ lb ( ˆ θ mle ) = 5 . 26 · 10 − 10 . The GP sur r ogate for M = 1, in Figure 3(b), shows a muc h b etter ﬁt, wh ic h is further impr o v ed by the iterativ e appr oac h (see Figure 3(c) and 3(d)). Figure 4 sho ws that ξ I ,k go es to −∞ at a faster rate as compared to ξ 0 I ,k . T able 1 sum m arizes the results from a detailed sim u lation stu dy b ased on sev eral com b inations of the design sizes and th e b oun dary v alues of δ in th e lik eliho o d optimizat ion. F or fair comparison b et w een the tw o metho dologies, ﬁ rst we use the prop osed mo del with only one term in the v on Neumann app ro ximation (i.e., M = 1) and the p op u lar metho d with ˆ δ mle . W e then increase the n um b er of iterations to m easur e the impro v ement in the int erp olation accuracy . The results in T able 1 are summarized o ver mo del ﬁts with 50 random maximin Latin h yp ercub e d esigns. Th ese designs are generated u sing Matlab’s built-in f unction lhsdesign w hic h tak es rand om starting p oint s and h en ce the output designs are random. The table en tries are P 50 ( P 5 , P 95 ), where P r denotes the r -th p ercen tile of ξ 0 I ,M v alues obtained fr om the mo d el ﬁts. 15 0 0.2 0.4 0.6 0.8 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 X 1 X 2 (a) Po pular ﬁt with MLE 0 0.2 0.4 0.6 0.8 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 X 1 X 2 (b) M = 1 (with low er b oun d) 0 0.2 0.4 0.6 0.8 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 X 1 X 2 (c) M = 5 (with low er b ound) 0 0.2 0.4 0.6 0.8 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 X 1 X 2 (d) M = 20 (with lo w er b oun d ) Figure 3: The dots d enote the design p oin ts, the solid curves denote the con tour s of the true GoldPrice function, and the dashed curves represen t the con tours of the predicted surfaces. 0 20 40 60 80 100 −2 0 2 k ξ I,k 0 0 20 40 60 80 100 −5 0 ξ I,k Figure 4: Con v ergence of ξ I ,k (dashed curve - righ t axis) and ξ 0 I ,k (solid curve - left axis). F rom T able 1, it is clear th at the in terp olation err or of the surrogates ﬁtted using the p opu lar approac h decreases by lo wering the b oundary v alue of the nugg et parameter (from 10 − 5 to 10 − 10 ) in the optimizatio n problem. I t turn s out that the correlation matrices are w ell-b eh av ed for small 16 T able 1: Median and ( P 5 , P 95 ) of ξ 0 I ,M v alues for the pr op osed approac h (denoted b y “ lb ”) and the p opular appr oac h (den oted by “ mle ”) applied to the GoldPrice fun ction. n = 25 n = 50 n = 75 n = 100 ˆ δ mle ≥ 10 − 5 0.71 (- 3.60, 3.02) 1.36 (0.61, 1.74) 2.12 (1.91, 2.4 9) 2.43 (2.26, 2.60) ˆ δ mle ≥ 10 − 10 -0.89 (-7.08, 2.14) 1.07 (- 3.28, 2.03) 1.48 (0.46, 2.10) 1.21 (0.41, 1.70) δ lb ( M = 1) -25.71 (-28.13, -20.50) -16.68 (-20.49, -14.40) 0.85 (0.63, 1.17) 1.09 (0.90, 1.26) δ lb ( M = 5) 0.19 (- 0.29, 0.70) 0.43 (0.27, 0.72) δ lb ( M = 20) -0.48 (-0.91, -0.09) -0.0 7 (-0.35, 0.14) designs (i.e., n = 25 and n = 50) and non zero nugg ets are not required for a numerically stable mo del ﬁtting p ro cess. Th is is captured by the prop osed approac h, as δ lb ( ˆ θ mle ) = 0 and the in terp o- lation error is muc h smaller than in the p opular approac h where a non-zero nugget is forced in the mo del. C onsequen tly , the iterati v e approac h is not used f or these cases. F or n = 75 and n = 100, the correlation matrices turn out to b e near-singular for θ near ˆ θ mle , and n on-zero δ had to b e u s ed for n umerically stable computation. It is clea r from the last thr ee ro ws that the prop osed ite rativ e approac h leads to impro vemen t in the inte rp olation accuracy . T able 2 summarizes the corresp onding prediction accuracy v alues, ξ 0 P ,M , w h ere the test set of unsampled p oints, D new , is a rand omly c hosen 2000 -p oint maximin Latin h yp ercub e d esign. The sim ulation results are based on 50 realizatio ns. As b efore, the maximin L atin hyp ercub e designs w ere generated using the built-in fu nction lhsdesign in Mat lab and the output designs are random. The resu lts illustrate that the p r op osed iterativ e approac h also impro v es the prediction at out- of-sample p oin ts. Note that the impr ov ement in prediction accuracy is not as signiﬁcant as the impro v emen t in inte rp olation accuracy . This is exp ected as the prop osed metho dology is geared to wards improving the appro ximation of the in terp olator. T able 2: Median and ( P 5 , P 95 ) of ξ 0 P ,M v alues f or the predictors in the pr op osed approac h (denoted b y “ lb ”) and the p opular approac h (denoted by “ ml e ”) applied to the GoldPrice function. n = 25 n = 50 n = 75 n = 100 ˆ δ mle ≥ 10 − 5 4.59 (3.71, 6.17) 3.45 (3.25, 3.75) 3.34 (3.18, 3.46) 3.23 ( 3.12, 3.43) ˆ δ mle ≥ 10 − 10 4.49 (3.68, 6.01) 3.40 (3.12, 3.85) 2.76 (2.50, 3.03) 2.30 ( 1.85, 2.52) δ lb ( M = 1) 4.28 (3.59, 5.72) 3.22 (3.02, 3.54) 2.29 (2.07, 2.52) 1.94 (1.68, 2.18) δ lb ( M = 5) 2.15 (1.90, 2.42) 1.77 (1.55, 2.02) δ lb ( M = 20) 2.08 (1.71, 2.50) 1.69 (1.47, 1.98) Example 2 Supp ose the deterministic simulato r ou tp uts are generated using the three d imensional 17 P erm function (Y ang 2010) giv en by f ( x ) = 3 X k =1 " 3 X i =1 ( i k + β )  ( x i /i ) k − 1  # 2 , where x = ( x 1 , x 2 , x 3 ) and the i -th inpu t v ariable x i ∈ [ − 3 , 3] for i = 1 , ..., 3. F or con v enience, w e re-scale the input v ariables in [0 , 1]. As in th e previous example, we ﬁt the GP mo d el us in g b oth the p opular method (Section 3.3) and the prop osed approac h (Sections 4 and 5) to the data generated b y ev aluating the P erm function at n design p oin ts. T able 3 compares the median and the tw o tail p ercenti les ( P 5 , P 95 ) of ξ 0 I ,M v alues obtained from ﬁtting GP m o dels to 50 d ata sets (maximin Latin h yp ercub e designs generated using Matlab’s built-in function lhsdesign ) of diﬀerent r u n-sizes. Here also, we p re-sp eciﬁed the b oundary v alues f or estimati ng δ in the p opu lar approac h. T able 3: Median and ( P 5 , P 95 ) of ξ 0 I ,M v alues for the pr op osed approac h (denoted b y “ lb ”) and the p opular appr oac h (den oted by “ mle ”) applied to the p erm function. n = 25 n = 50 n = 75 n = 100 ˆ δ mle ≥ 10 − 5 2.42 (- 2.78, 4.29) 1.70 (0.00, 2.17) 2.79 (1.56, 3.16) 3.12 (2.22, 3.63) ˆ δ mle ≥ 10 − 10 2.24 (- 4.17, 3.85) 1.69 (- 0.36, 2.33) 2.90 (1.39, 3.28) 3.07 (1.72, 3.68) δ lb ( M = 1) -26.46 (-27.53, -24.97) -21.37 (-22.98, -18.99) -18.76 (-20.07, -17.35) 1.76 (-20.35, 1.81) δ lb ( M = 5) -16.71 (-20.22, 1.73) As in Example 1, the int erp olation err ors of the GP ﬁts obtained through th e p opular app roac h are sligh tly r educed b y lo w ering the b oundary v alue of δ (from 10 − 5 to 10 − 10 ) in the optimizati on pro cess. Th e small v alues of th e p ercen tiles of ξ 0 I ,M in the row lab elled “ δ lb ( M = 1)” of T able 3 indicate that the correlat ion matrices are wel l-b eha v ed (i.e., δ lb ( ˆ θ mle ) = 0) for most of the designs with runs-sizes n = 25, 50 and 75. It turns out that approxi mately 46% of the correlation matrices are well-behav ed for designs of size n = 100. In “ δ lb ( M = 5)” case, the interp olation accuracy has increased and 53% of the designs of size n = 100 sho w δ lb ( ˆ θ mle ) = 0. The p rop osed approac h facilitat es the inclusion of a non-zero nugge t only wh en requ ir ed for ﬁxing the ill-conditioning problem. Clearly , the n umber of realizations that requ ired a non-zero n ugget in the GP mo dels here is muc h smaller than in the GoldPr ice example. Th is is exp ected b ecause getting n ear-singular correlation matrices b ecomes less lik ely as the dimensionalit y of the inpu t space increases. The corresp onding prediction accuracy measures for 50 sim u lations are summarized in T able 4 . The test set, D new , r equired for computing ξ 0 P ,M , is a r andomly chosen 3000-p oin t maximin Latin h yp ercub e design. As in E xample 1, the prediction accuracy increases with M , the n umber of iterations, and by lo wering the b oundary v alue of the n u gget in the optimization problem. Example 3 The b orehole mo del is a more r ealistic deterministic sim u lator, that mo dels the ﬂo w 18 T able 4: Median and ( P 5 , P 95 ) of ξ 0 P ,M v alues f or the predictors in the pr op osed approac h (denoted b y “ lb ”) and the p opular approac h (denoted by “ ml e ”) applied to the P erm fu nction. n = 25 n = 50 n = 75 n = 100 ˆ δ mle ≥ 10 − 5 6.78 (4.94, 7.43) 4.24 (4.06, 4.44) 4.01 (3.91, 4.28) 4.02 ( 3.80, 4.16) ˆ δ mle ≥ 10 − 10 6.68 (5.21, 7.22) 4.23 (4.12, 4.40) 4.10 (3.90, 4.27) 3.90 ( 3.52, 4.24) δ lb ( M = 1) 5.87 (4.70, 6.88) 4.08 (3.91, 4.28) 3.80 (3.60, 4.00) 2.60 (2.21, 3.89) δ lb ( M = 5) 2.49 (2.17, 3.86) rate thr ough a b orehole whic h is d r illed from the grou n d s urface through tw o aquifers, and is commonly used in compu ter exp eriments (e.g., Joseph, Hung and Sud j ian to 2008) to compare diﬀeren t metho ds . The ﬂ o w rate is giv en by f ( x ) = 2 π T u ( H u − H l ) log( r /r w ) h 1 + 2 LT u log( r /r w ) r 2 w K w + T u T l i , where x = ( r w , r , T u , T l , H u , H l , L, K w ), and the input r w ∈ [0 . 05 , 0 . 15 ] is the r adius of the b orehole, r ∈ [100 , 50000 ] is th e radiu s of the inﬂuence, T u ∈ [63070 , 11 5600] is the transmiss ivity of the upp er aquifer, T l ∈ [63 . 1 , 116 ] is the transmiss ivit y of the lo wer aquifer, H u ∈ [990 , 111 0] is the p oten tiometric head of the upp er aquifer, H l ∈ [700 , 820] is the p oten tiometric h ead of the lo wer aquifer, L ∈ [1120 , 1680] is the length of the b orehole and K w ∈ [9855 , 12045 ] is the hydraulic conductivit y of the b orehole. F or con v enience, w e re-scale the input v ariables to [0 , 1]. T able 5 compares the median and tw o tail p ercen tiles ( P 5 , P 95 ) of ξ 0 I ,M v alues obtained f rom the GP mo d el surr ogates ﬁtted to 50 random maximin Latin hyp ercub e d esigns via the t w o metho d s. Since the sim ulator is 8-dimensional, w e considered slightly larger ru n-sizes n = 50 , 75 , 100 and 125, h o wev er, th e num b er of sim ulations and the candid ates for the b oundary v alues of δ in the lik eliho od optimization were k ept the same as in Examples 1 and 2. T able 5: Median and ( P 5 , P 95 ) of ξ 0 I ,M v alues for the pr op osed approac h (denoted b y “ lb ”) and the p opular appr oac h (den oted by “ mle ”) applied to the b orehole mo d el. n = 50 n = 75 n = 100 n = 125 ˆ δ mle ≥ 10 − 5 0.62 (0.21, 1.26) 1.27 ( 0.69, 1.58) 1.55 (1.11, 1.91) 1.8 6 (1.44, 2.30) ˆ δ mle ≥ 10 − 10 0.48 (- 1.59, 1.12) 0.65 (- 1.05, 1.56) 0.83 (-1.47, 1.66) 1.33 (-0.41, 2.15) δ lb ( M = 1) - 18.47 (-19.45, -16.66) -16.18 (-17.06, -14.27) -13.93 (-15.19, -12.46) -14.74 (-16.00, - 13.47) As exp ected, the interpolation error of the GP ﬁ ts obtained u sing the p opular approac h slight ly decreases by lo wering the b oundary v alue of the nugget parameter (from 10 − 5 to 10 − 10 ) in the optimization p roblem. Th e prop osed m etho d leads to predictors with signiﬁcan tly higher in terp o- lation accuracy . The p ercen tiles of ξ 0 I ,M in the last ro w of T able 5 also suggest that most of the 19 correlation matrices are well-c onditioned for the θ v alues near ˆ θ mle , and δ lb ( ˆ θ mle ) = 0. That is, the iterativ e approac h is n ot needed to f urther improv e the in terp olation accuracy . T able 6 s ummarizes the prediction accuracy v alues (i.e., ξ P ,M ) for 50 simulations. Th e test set f or computing ξ 0 P ,M is a randomly chosen 8000-point maximin Latin hypercub e d esign. It is clear f rom T able 6 that more accurate prediction can b e ac hieved b y low ering the δ v alue in the optimization pro cess of the p opular approac h , and certainly the prop osed approac h results in the b est pr ed iction at unsamp led p oin ts among the three cases considered here. T able 6: Median and ( P 5 , P 95 ) of ξ 0 P ,M v alues f or the predictors in the pr op osed approac h (denoted b y “ lb ”) and the p opular approac h (denoted by “ ml e ”) applied to the b oreh ole mo del. n = 50 n = 75 n = 100 n = 125 ˆ δ mle ≥ 10 − 5 4.01 (3.73, 4.37) 4.04 (3.68, 4.34) 4.13 (3.77, 4.53) 4.23 ( 3.93, 4.56) ˆ δ mle ≥ 10 − 10 3.90 (3.55, 4.32) 3.81 (3.48, 4.16) 3.74 (3.27, 4.04) 3.77 ( 3.43, 4.07) δ lb ( M = 1) 3.73 (3.37, 4.08) 3.57 (3.22, 3.86) 3.37 (2.94, 3.82) 3.64 (3.25, 4.03) Example 4 W e no w r evisit the tidal p o wer example in Section 2. The computer simulator (a v ersion of FVCOM) is exp ensive and cannot b e ev aluated at numerous co ordin ates. Eac h of th e runs present ed here required ap p ro ximately one hour to run on 4 pro cessors in p arallel on the A tlan tic Computational Excellence net w ork (A C Enet) mahone cluster. While this is n ot particularly on er ou s on a large cluster, the grid resolutio n us ed in KMLH is ab out 200 m (length of a s ide in a triangle). A realistic mo del of 20 m sided triangular grid and with 10 vertica l la yers to mo del 3D ﬂo w w ould increase th e compu tational exp ense by a factor of 5120 , making eac h individual simula tor run rough ly 10 times more costly than the generation of the en tire data set examined h ere. The o cean mo d elers b eliev e that the sim ulator is deterministic up to the mac hine precision and they are inte rested in an em ulator that int erp olates the sim u lator. A total of 533 runs (on a 13 × 41 grid) were used to obtain the data displa yed in Figure 5. W e use this data to compare our results. The goal is to bu ild an emulator of the computer mo del usin g a fraction of the budget (533 run s) that pro vides the b est appro ximation of the simulato r. W e used a maximin based cov erage design (Johnson, Mo ore and Ylvisak er 1990) to choose a subset of n = 100 p oints from these 533 p oint s to constitute a sp ace-ﬁlling design. The con tours from b oth the pred icted sur face and the tru e simulat or (based on the 13 × 41 grid) are s ho wn in Figure 6. F or successful implementa tion of the p opular approac h outlined in Sections 3.1 and 3.3, the lik eliho od optimization to ok place in the parameter space δ ∈ (10 − 5 , 1) and θ ∈ (0 , ∞ ) 2 , and the parameter estimates for the GP ﬁt are ˆ θ mle = (163 . 1 8 , 50 . 66) and ˆ δ mle = 0 . 0462 (see Figure 6(a)). The parameter estimates for the GP mo del ﬁtted u sing the p rop osed app roac h with M = 1 are 20 −64.5 −64.48 −64.46 −64.44 −64.42 −64.4 −64.38 −64.36 −64.34 45.3 45.31 45.32 45.33 45.34 45.35 45.36 45.37 45.38 45.39 45.4 0 5 10 15 x 10 7 Figure 5: FV CO M outp u ts (a v erage extractable p o wer) o v er a coarse grid in the Minas Pa ssage. A colored version of the ﬁgur e is a v ailable online. ˆ θ mle ≈ (788 . 54 , 221 . 18) and δ lb ( ˆ θ mle ) ≈ 0 (see Figure 6(b)). 0.75 0.8 0.85 0.9 0.95 0.2 0.3 0.4 0.5 0.6 0.7 0.8 X 1 X 2 (a) Po pular ﬁt with MLE 0.75 0.8 0.85 0.9 0.95 0.2 0.3 0.4 0.5 0.6 0.7 0.8 X 1 X 2 (b) M = 1 (with low er bound ) Figure 6: Dots denote the design p oints, the solid curv es denote the cont ours for the true simulato r based on the 13 × 41 grid, the dashed curves sho w th e con tours of the GP ﬁts. Figure 6 sho ws that the GP based emulator obtained using the pr op osed approac h (Figure 6(b)) is less sm o oth as compared to the emulat or obtained via the p opular app roac h (Figure 6(a)). The interp olation errors for the GP ﬁts obtained with the p opular metho d and the pr op osed 21 approac h are ξ 0 I ,M = 6 . 63 and − 26 . 58 resp ectiv ely . That is, the prop osed approac h is b etter at appro ximating the interp olator. In terms of predicting the p o wer surface at un sampled p oints (i.e., the rest of 433 p oin ts), the prediction error v alues for b oth the p opular and p r op osed app roac h es are somewhat close, ξ 0 P ,M ≈ 10. Moreo v er, when usin g the p opular approac h, the maxim u m p redicted p o w er obtained by ev aluating max { ˆ y ( x ) , x ∈ χ } is appr oximate ly 1 . 4 · 10 8 W with the maximizer b eing (0.7850, 0.4500 ), whereas if w e use the prop osed approac h th e maxim u m p redicted p o wer is appro ximately 1 . 6 · 10 8 W observed at (0.7900 , 0.450 0). 7 Discussion Assuming that the und erlying compu ter sim ulator is deterministic up to th e mac h in e precision and th e s tatisticia n is certain ab out th e s uitabilit y of a GP m o del with Gaussian correlation as the em ulator, ﬁtting the mo del to a data set with n p oin ts in d -dimensional in put space requ ir es computation of the determinan t and in v erse of n × n correlation matrices for sev eral θ v alues. In Section 4, we conducted a sim u lation study to exp lore sp ace-ﬁlling designs (sp eciﬁcally maximin Latin h yp ercub e designs) for diﬀeren t com bin ations of ( n, d ) that can lead to near-singular corre- lation m atrices. In Section 5, we pr op osed an iterativ e appr oac h, that is also a generalization of the p opu lar approac h, to construct a new predictor ˆ y δ,M that has higher in terp olatio n accuracy as compared to ˆ y δ — the predictor fr om the p opular approac h. Lemmas 1, 2 and 3 sho w that ˆ y δ,M con verges to the BLUP as the n um b er of iteratio ns ( M ) increases. The lo we r b ound δ lb , prop osed in Section 4, also allo ws u s to use a n on-zero n ugget only when needed, and in this case minimizes the num b er of iteratio ns required to reac h th e desired inte rp olation accuracy . There are a few imp ortant remarks w orth noting. First, the metho dology d evelo p ed here can also b e adapted to the Ba y esian fr amework. F or computing th e p osterior of the parameters and the predictor, | R | and R − 1 need to b e computed for s everal r ealizati ons of θ , and a nugget is often used to o verco me the near-singularit y of R (e.g., T addy et al. 2009). T he prop osed low er b ound δ lb can b e u sed for deﬁn ing a prior for δ , i.e., the s earc h should b e limite d to [ δ lb , 1). One can also use the iterativ e app roac h to furth er impr o ve the inte rp olation and/or prediction accuracy . Second, we used the squared exp onential correlation ( p k = p = 2 for all k ) in the GP mo del b ecause of its p op u larit y and go o d theoretical prop erties. It tur ns out the GP mo d el with other p o w er exp onen tial correlation (i.e., p k = p < 2) ma y lead to predictors with larger MSE and sometimes worse ﬁts as compared to that of th e GP m o dels with the Gaussian correlation. Recall that the near-singularit y of R o ccurs b ecause (a) at least tw o of the design p oin ts (sa y x i and x j ) 22 are close together in the input sp ace, and /or (b) the hyper -p arameters θ k , k = 1 , ..., d are ve ry close to zero, i.e., P d k =1 θ k | x k ,i − x k ,j | p k ≈ 0. T his make s a few of the ro ws of R v ery similar, and will happ en ev en if p k < 2. Th at is, the ill-conditioning pr oblem ma y also o ccur when other p o w er exp onential correlatio n functions (i.e., p k ’s are same and less than 2 or p k ’s are diﬀerent and less than 2) are used. A close r in v estigation r ev eals that with p k = p < 2, near-singular cases o ccur very frequently in the sequen tial design setup. Ho w ever, for the sp ace-ﬁlling designs, it is rather f ascinating that the o ccurr ence of near-singular cases is substantia lly reduced b y ev en a small reduction in the p o w er from p = 2 to p = 1 . 99. W e su sp ect this is due to the limiting b eha viour of the Gaussian correlation in the family of p o w er exp onen tial correlation fu nctions p ∈ (0 , 2]. In conclusion, wh en ﬁtting a GP mo del to a data set obtained from a deterministic computer mo del with nearly–singular correlation matrices, w e recommend using δ lb - the lo w er b ound on the n ugget, along with the iterativ e approac h with the n um b er of iterations, M , c hosen according to the desired inte rp olation accuracy . Ac k n o wledgmen ts W e would lik e to thank the AE and t wo anonymous referees for man y useful comments and s ug- gestions that lead to signiﬁcan t improv ement of the pap er . This w ork was partially supp orted b y Disco ve ry gran ts from th e Natural Sciences and Engineering Researc h Council of C anada. REFERENCES Ababou, R., Bagtzoglou, A. C. and W ood, E. F. (199 4). On th e condition n um b er of co v ariance matrices in kriging, estimation, and simulatio n of random ﬁelds. Mathematic al Ge olo gy , 26, 99 - 133. Andre, J., Siarr y, P. and Dognon, T. (200 0). An impr o vemen t of the standard genetic algorithm ﬁghting premature con vergence. A dvanc es in Engine ering Softwar e , 32, 49 - 60. Aslett, R., Buck, R. J., Duv al l, S. G., Sacks, J . and Welch , W. J. (1998 ). Circuit opti- mization via sequentia l computer exp erimen ts: d esign of an outpu t bu ﬀer. Applie d Statistics , 47(1), 31 – 48. Bar ton, R. R. and Salagame, R. R. (1997). F actorial hyp ercub e d esigns for spatial correlation regression. Journal of Applie d Statistics , 24, 453 - 473 . 23 Bastos, L. and O’Ha gan, A. (2009 ). Diagnostics for Gaussian P r o cess E mulators. T e chnomet- rics , 51, 4, 425 - 438. Bergmann F. T. a nd Saur o H. M. (2008 ). Comparing simulat ion results of SBML capable sim ulators. Bioinformatics , 24, 196 3 – 196 5. Booker, A. J ., Dennis Jr., J. E., Frank , P. D., Serafini, D. B., Tor czon, V. and Tr osset, M. W. (1999) . A rigorous framew ork for optimization of exp ensive fun ctions b y surrogates. Structur al and Multidisciplinary Optimization , 17, 1 - 13. Booker, A. (20 00). W ell-conditioned Kriging mo dels for op timization of compu ter sim ulations. Mathematics and Computing T e chnolo gy P hantom Works - The Bo eing Comp any , M&CT- TECH-00-002 . Branin, F. K . (1972) . A widely con v ergent metho d for ﬁnd ing m ultiple solutions of sim ultaneous nonlinear equations. IBM J. R es. Develop. , 504 - 522. Chen, C., Beardsley, R. C. and Cowles, G. (2006). An un structured grid, ﬁ n ite-v olume coastal o cean mod el (FV COM) system. Oc e ano gr aphy , 19, 78 - 89. Golub, G. H. and V an Loan, C. F. (199 6). Matrix Computations . J ohns Hopkin s Univ ers it y Press, Baltimore, MD. Gramacy, R. B. and Lee, H. K. H. (2008). Bay esian treed Gaussian p ro cess mo dels with an application to compu ter mo deling. J. Amer. Statis. Ass. , 103(483), 111 9-113 0. Greenberg, D. (1979). A n umerical mo del inv estigation of tidal phenomena in the Ba y of F undy and Gulf of Maine. Marine Ge o desy , 2, 161 - 187. Huang, D., Al len, T.T., Notz, W.I. and Mille r, R.A. (2006). Sequen tial Kriging op timiza- tion using multiple ﬁdelit y ev aluations, Struct. Multidisc Optim. , 32, 369 - 382 . Johnso n, M. E., Moore, L. M., and Yl visaker, D. (1990). Minimax and maximin distance designs. Journal of Statistic al Planning and Infer enc e , 26, 131 - 148. Jones, D., S chonlau, M ., and W elch, W. (1998) . Eﬃcient Global O ptimization of Exp ensiv e Blac k-Bo x F unctions. Journal of Glob al Optimization , 13, 455 - 492. Joseph , V. R., Hung, Y., and A. Sudjianto. (2008) . Blind Kriging: A New Metho d for Dev eloping Metamo dels. Journal of Me chanic al Desi g n , 130(3 ), 31 – 102. 24 Lebedev, V. I. (1997). An intr o duction to functional analysis in c omputational mathematics , Birkh¨ auser, Boston. Karsten , R., McMill an, J ., Lickl ey, M. and Ha ynes , R. (2008) . Assessment of tidal current energy for the Minas P assage, Ba y of F undy . P r o c e e dings of the Institution of Me chanic al Engine ers, Part A: Journal of Power and Ener g y , 222, 493 - 507 . Kumar, B. and D a vidson, E. S . (1978). Performance ev aluation of highly concurr ent computers b y deterministic sim ulation, Comm unic ations of the ACM , 21 (11), 904 – 913. Medina, J .S., Moreno, M.G. and Ro yo, E.R. (2005). Sto c hastic vs d eterministic traﬃc sim ulator. Comparativ e study for its u se w ithin a traﬃc light cycles optimization arc hitecture, In pr o c. IWIN AC (2), 622 – 631. Neal, R. M. ( 1997). Mon te Carlo imp lementati on of Gaussian pro cess mo dels f or Ba ye sian regression and classiﬁcation. T ec h Rep. 9702, Dept. of Statistics, Univ. of T or onto, Canada . Neumaier, A. (19 98). Solving ill- conditioned and singular linear systems: A tutorial on regular- ization. SIAM R eview , 40, 636 - 666. O akley, J. (2004 ). Es timating p ercentil es of computer code outputs. Appl. Statist. , 53, 83 - 93. Poole, D. and A. E. Rafte r y. (2000 ). I nference f or d eterministic sim ulation mo d els: the Ba yesia n melding approac h. J. Amer. Statis. Ass. , 95 (452), 1244 -1255 . Ranjan, P., Bingham, D. and Micha ilidis, G. (2008). S equen tial exp eriment design for con tour estimation from complex computer co des. T e chnometr ics , 50, 527 - 541. Rasmuss en, C. E. and Williams , C. K . I. (2006). Gaussian Pr o c esses for Machine L e arning . The MIT Press. Sacks, J., Welch, W. J ., Mitchell, T. J . and Wynn, H. P. (1989). Design and analysis of computer exp erimen ts. Statistic al Scie nc e , 4, 409 - 423. Santner , T. J., Williams, B. J. and Notz, W. (2003) The Design and Analysis of Computer Exp eriments . Spr inger-V erlag Inc., New Y ork, NY. Schonlau, M., Welch, W. and Jones, D. (1998). Global versus lo cal search in constrained op- timizatio n of computer mo dels. Ne w Developments and A pplic ations in E xp erimental Design, Institute of Mathematic al Statistics L e ctur e N otes, H aywar d, California , 34, 11 - 25 25 Stein, M. (1987 ). Large Sample Prop erties of Sim ulations Using Latin Hyp ercub e Sampling. T e chnometr ics , 29, 143 – 151. Stein, M. L. (1999). Interp olation of Sp atial Data: Some The ory for Kriging . Springer, NY. Su, H, Nelder, JA, W olber t, P and S pence, R (1996 ). Application of generalized linear mo dels to design imp ro v ement of an engineering artefact. Q u al R eliab E ngng Int , 12, 101-112. T addy, M. A., Lee , H. K. H., Gra y, G. A. and Griffin, J. (2009). Ba ye sian guidance for robust pattern search optimizatio n. T e chnometr ics (to app ear). Tikhonov, A. N. (1993). Solution of in correctly f ormulated problems and the r egularization metho d. Soviet Math. Doklady , 4, 1035 - 1038. V an Beers, W. C. M., and Kleijnen, J. P.C. (2004) . Kriging inte rp olation in sim ulation: a surve y . In Pr o c e e dings of the 2004 Winter Simulation Confer enc e , ed. R.G. Ingalls, M.D. Rossetti, J.S . Sm ith, and B.A. Peter, 113 - 121. Piscata wa y , New Jersey: Institute of Electrical and Electronics Engineers. Y ang, X. -S. (2010). T est problems in optimiza tion, Engine ering Optimizatio n: An Intr o duction with Metaheuristic Applic ations (Eds Xin-She Y ang), John Wiley & Sons . 26

A Computationally Stable Approach to Gaussian Process Interpolation of Deterministic Computer Simulation Data

Original Paper

Comments & Academic Discussion

Leave a Comment

Original Paper

Related Papers

Comments & Academic Discussion

Leave a Comment