Restoring a smooth function from its noisy integrals

Restoring a smo oth function from its noisy in tegrals Olga Goulk o Dep artment of Physics, University of Massachusetts, A mherst, MA 01003, USA and Pr esent addr ess: R aymond and Beverly Sackler Scho ol of Chemistry and School of Physics and Astr onomy, T el Aviv University, T el Aviv 6997801, Isr ael Nik olay Prok of ’ev Dep artment of Physics, University of Massachusetts, Amherst, MA 01003, USA and National R ese ar ch Center “Kur chatov Institute,” 123182 Mosc ow, Russia Boris Svistunov Dep artment of Physics, University of Massachusetts, Amherst, MA 01003, USA National Rese ar ch Center “Kur chatov Institute,” 123182 Mosc ow, Russia and Wilczek Quantum Center, Scho ol of Physics and Astr onomy and T. D. L e e Institute, Shanghai Jiao T ong University, Shanghai 200240, China (Dated: May 14, 2018) Numerical (and exp erimen tal) data analysis often requires the restoration of a smo oth function from a set of sampled integrals ov er ﬁnite bins. W e presen t the bin hierarch y metho d that eﬃciently computes the maximally smooth function from the sampled integrals using essentially all the infor- mation con tained in the data. W e p erform extensiv e tests with diﬀerent classes of functions and lev els of data qualit y , including Monte Carlo data suﬀering from a sev ere sign problem and physical data for the Green’s function of the F r¨ ohlic h p olaron. P A CS n umbers: 02.60.-x, 02.70.-c, 07.05.Rm, 87.57.np I. INTR ODUCTION Sampling is at the heart of many Monte Carlo com- putations and exp erimental measuremen ts. Data p oin ts are generated or measured and w e ev entually wan t to restore the underlying smo oth probabilit y densit y dis- tribution f ( x ) b ehind them. Man y density estimation proto cols for lists of statistical data exist, 1–3 but for large-scale Mon te Carlo calculations storing the individ- ual data p oin ts would require p entab ytes of memory and lead to a substantial slowing down of the sim ulation. More importantly , the individual data p oints are not needed, provided that we know that f ( x ) is structure- less below a certain scale. Hence, without losing an y es- sen tial information, w e can collect in tegrals of f ( x ) o ver ﬁnite-size bins. The problem then is ho w to extract all the information from these integrals without systematic bias, augmenting it with our a priori kno wledge of the smo othness of the function. In this paper, we propose a simple and eﬃcient solu- tion, whic h we call the bin hierarch y metho d (BHM). 4 W e note that the problem is closely related to the prob- lem of numerical analytic con tinuation under consisten t constrain ts, 5 but with a crucial simpliﬁcation. Unlik e for the numerical restoration of sp ectral functions, we do not ha ve to accoun t for p ossible sharp features, suc h as δ - function p eaks or kinks, which cannot b e resolved within the sampled error bars. Assuming that f ( x ) is smo oth, w e are allow ed to parametrize its optimal appro ximation ˜ f ( x ) as a p olynomial spline (or a spline with a general functional ansatz). The protocol of constructing ˜ f ( x ) has some similarities with the smo othing spline approach. 6 Ho wev er, there is a fundamental diﬀerence in the data structure. Rather than ﬁtting to appro ximate function v alues on a given set of p oints, we ﬁt the spline directly to the sampled in tegrals. The central requiremen t is that ˜ f ( x ) m ust b e consisten t not only with the sampled in tegrals ov er the elemen tary histogram bins, but also with the integrals o ver an y combination of these. In tegrals o ver large bins are kno wn with higher precision than those o ver smaller bins, since the former con tain more data points. Hence, accurately reco vering a large in tegral is a priorit y o v er re- co vering a smaller one. In particular, bins below a certain scale con tain noise rather than information about f ( x ), and thus can b e safely omitted in the ﬁtting. W e formal- ize this idea b y introducing a hierarc h y of bins with one bin o ver the en tire domain of f ( x ) on the top level, and doubling the num b er of bins on each subsequent lev el. The go o dness of ﬁt is ev aluated on each level n sepa- rately , with χ 2 n / ˜ n as the measure, where ˜ n is the num b er of bins on the giv en hierarch y lev el. Out of all p ossible approximations ˜ f ( x ) consisten t with the sampled integrals we select the one with the least features, i.e. the one with the least ﬁtting parameters. This is ac hiev ed by ﬁtting a spline of order m (t ypically m = 3) with the minimal num b er of knots that yields an acceptable ﬁt. The n umber and the p ositions of the knots are determined automatically b y the ﬁtting algo- rithm. Kno wn b oundary conditions on f ( x ) can also b e accoun ted for. If desired, additional optimization terms, for instance for the jump in the highest spline deriv ative, can b e included in to the ﬁtting procedure, with iterative impro vemen t in the spirit of the metho d of consistent 2 constrain ts. 7 The pap er is organized as follows. In Sec. I I, we re- view alternative metho ds to construct ˜ f ( x ) and argue that the BHM is superior in accuracy , eﬃciency , and sim- plicit y . The details of the BHM are presented in Sec. I I I. In Sec. IV, we presen t several tests of the BHM. W e con- clude in Sec. V. I I. REVIEW OF AL TERNA TIVE METHODS A. Naiv e histograms The most straigh tforw ard w ay to approximate a distri- bution f ( x ) is the histogram: 1–3 divide the domain into sev eral bins and count how many times the sampled v al- ues of x , whic h were generated with probabilities giv en b y f ( x ), fall into each of the bins. F or each generated x ∈ bin i , the counter c i for the bin i is increased b y one. The distribution is then appro ximated by ˜ f ( x ) = X i c i N ∆ i Π i ( x ) , (1) where N is the total num b er of sampled p oints, ∆ i is the width of the bin i , and Π i ( x ) is the boxcar function, whic h is one if x ∈ bin i and zero otherwise. In what fol- lo ws, w e refer to this metho d as “naiv e histogramming”— to emphasize the contrast with the BHM that also in- v olves histogramming as a part of the proto col. The end result of the naiv e histogram metho d is a staircase func- tion, which is not smo oth. This drawbac k can b e amelio- rated by taking c i / ( N ∆ i ) as the v alue of ˜ f ( x ) at the bin cen ters and ﬁtting a smo othing spline to the data with their statistical error bars. 6 A more imp ortant issue is that the naive histogram metho d suﬀers from an inher- en t compromise b etw een the resolution of features and statistical noise. Suﬃcien tly narrow bins are necessary to resolve f ( x ) without in tro ducing a signiﬁcan t system- atic bias, while reducing the size of the bins increases the noise in the sampled coun ters. B. Basis pro jection method A generalization of the naiv e histogram metho d—the basis pro jection metho d—can signiﬁcantly reduce this dra wback. This method is based on a generalized F ourier series expansion of f ( x ). F or each bin i , we choose a ba- sis { e ( j ) i ( x ) } of m i real-v alued functions with e ( j ) i ( x ) = 0 if x / ∈ bin i that satisfy the orthonormalit y condition h e ( j ) i | e ( j 0 ) i i ≡ Z bin i w i ( x ) e ( j ) i ( x ) e ( j 0 ) i ( x ) dx = δ j j 0 . (2) The non-negative weigh t function w i ( x ) of the inner pro duct should be chosen such that all in tegrals are con- v ergent. If no divergences are presen t w i ( x ) can b e set to unit y . The basis and weigh t functions ma y b e diﬀeren t for each bin, and can be deﬁned also for bins of inﬁnite size. In general, the basis functions can b e c hosen to re- ﬂect the properties of f ( x ) that are known in adv ance, for instance kno wn divergences or asymptotic b eha vior. Otherwise, in the ma jorit y of the cases, shifted Legendre p olynomials are a reasonable c hoice. With the basis pro jection metho d, in eac h bin i w e k eep track of sev eral counters c ( j ) i estimating the m i gen- eralized F ourier co eﬃcients h e ( j ) i | f i . Sp eciﬁcally , eac h time w e generate x ∈ bin i w e add the v alue w i ( x ) e ( j ) i ( x ) (rather than one) to eac h of the c ( j ) i coun ters. The func- tion ˜ f ( x ) is then given b y ˜ f ( x ) = X i m i X j =1 c ( j ) i N e ( j ) i ( x ) . (3) Note that Eq. (1) is a sp ecial case of the ab o ve form ula with m i = 1, w i ( x ) = 1, and e (1) i ( x ) = Π i ( x ) / √ ∆ i . It can be easily seen that this pro cedure corresp onds to the b est p ossible represen tation of f ( x ) in terms of orthogonal basis functions on each bin i , since minimizing Z bin i w i ( x )( f ( x ) − ˜ f ( x )) 2 dx = = Z bin i w i ( x )   f ( x ) − m i X j =1 c ( j ) i N e ( j ) i ( x )   2 dx (4) with respect to the c ( j ) i implies Z bin i w i ( x ) e ( j ) i ( x )   f ( x ) − m i X j 0 =1 c ( j 0 ) i N e ( j 0 ) i ( x )   dx = 0 , (5) and consequen tly c ( j ) i / N = h e ( j ) i | f i . Using basis pro jections we collect more information in eac h sampling step than with the naive histogram, since some information ab out the p osition of x inside the bin is preserv ed. Therefore few er bins are necessary to resolve the function. The bin widths and basis sizes m i can b e c hosen in a wa y that the systematic error is negligible compared to the statistical error, without a substan tial increase in the statistical noise of the ˜ f ( x ) v alues. While the basis pro jection method is a signiﬁcan t im- pro vemen t o ver the naive histogram, it still has several disadv antages. The binning and the basis functions m ust b e chosen at the b eginning of the sampling pro cess and cannot be altered retrosp ectively . Basis pro jections can also be computationally exp ensiv e, especially for a large basis, or a basis built of complicated functions. Moreov er, the sampled pro jections generally exhibit large jumps at the bin b oundaries b ecause the basis functions ma y b e un b ounded. As b efore, w e can eliminate these jumps by taking the v alues of ˜ f ( x ) at several points inside each bin 3 and then constructing a smoothing spline. This pro ce- dure, how ever, in volv es an unnecessary loss of informa- tion. Rather than reducing the ﬁt to speciﬁc individual p oin ts, it is suggestive to ﬁt a spline to the sampled inte- grals h e ( j ) i | f i directly , resulting in smaller errors. In addi- tion, we can extract more exhaustive information ab out f ( x ) by using ov erlapping bins of diﬀeren t size that co ver a range of relev ant scales of the function. This is ac hieved b y the BHM describ ed in Sec. I II. C. Rew eighting Rew eighting is a Monte Carlo-sp eciﬁc tec hnique that allo ws one to con v ert the statistics generated for a given v ariable x to statistics for other v alues of this v ariable. 8,9 This is ac hieved in the following wa y . Prior to sampling, w e choose a ﬁxed set of p oints x i and asso ciate a bin with each of the x i . The bins can b e large and o v erlap- ping. Whenever the random v ariable x falls into the bin i associated with the p oin t x i , w e up date the x i coun ter b y the reweigh ted v alue vp ( x i ) /p ( x ). Here v is the v alue that w ould hav e b een sampled without rew eighting, and p ( x i ) /p ( x ) is the ratio of the probability w eights at x i and x , where p ( x ) is the probability distribution used to sample the physical function f ( x ). This can b e seen as a deterministic “update” from the generated point x to the ﬁxed p oint x i . F or suﬃcien tly large bins, sev eral coun ters ma y be up dated simultaneously for eac h gener- ated x . The reweigh ting technique provides an unbiased estimator for the v alue f ( x i ) at any ﬁxed x i . The reweigh ting metho d is very eﬃcient in the case when one is interested in a single special v alue of a certain con tinuous v ariable. A c haracteristic example is the sam- pling of the single-particle density matrix from the statis- tics for the Green’s function by reweigh ting the latter to the imaginary-time v alue τ = − 0. In all other cases, rew eighting is computationally expensive compared to the gain in accuracy . It also requires a case-sp eciﬁc imple- men tation that assumes knowledge of the analytical form of the rew eigh ting factors. T o b e maximally eﬃcien t, one has to adjust the size of bin i for each x i of interest. The bin size m ust b e optimized in such a wa y that, on the one hand, it is large enough to ensure suﬃciently large statistics, and on the other hand small enough that the rew eighting factors remain of the order of unit y (to av oid n umerous v anishingly small contributions that consume sim ulation time). Bins that are too large can also result in a strong increase in the auto correlation time, since rew eighting from a rare even t to a frequen t one implies a large p ( x i ) /p ( x ) ratio, whic h will result in o ccasional anomalously large contributions to the sampled coun ters. Moreo ver, to pro duce a smooth outcome, the rew eighting proto col requires an appropriately dense mesh of p oints with strongly ov erlapping bins, which is computationally exp ensiv e since a large num b er of counters needs to b e up dated for eac h generated x . F or a contin uous set of function v alues, a smo othing spline needs to b e ﬁtted at additional computational cost, comparable to the cost of the BHM ﬁt describ ed in Sec. II I. I II. BIN HIERAR CHY METHOD The BHM restores smooth distributions in a universal, un biased, and eﬃcient manner. The metho d satisﬁes the follo wing requiremen ts: • All the information con tained in the sampled data should b e used to ﬁt the distribution: the accuracy should b e essen tially the same as if w e had stored the full list of the generated v alues of x . This is ac hieved by introducing a hierarch y of ov erlapping histogram bins of diﬀerent size, where large bins giv e precise estimates of the broad features of the distribution, while s mall bins resolv e its ﬁne struc- ture once enough data has accumulated. Note that w e assume that the sampled distribution is struc- tureless below a certain scale, and hence there al- w ays exists a binning that is suﬃcien tly ﬁne to re- solv e all of its features. • The resulting function should b e exactly smo oth, meaning that the function and all its deriv atives up to a given order must ha ve no jumps. This is ac hieved by ﬁtting a p olynomial (or generalized) spline. • Out of all smooth functions consistent with the sampled data within its error bars, the selected re- sult should ha ve the least features (p eaks, oscilla- tions, etc.). This is achiev ed by c ho osing the spline with the minimal n umber of free parameters that ﬁts the data. • The sampling pro cess and the restoration of the smo oth function should b e computationally and memory eﬃcient. In particular, they should b e more eﬃcien t than the basis pro jection metho d. • The algorithm should w ork well regardless of the qualit y and the n umber of sampled data. As more data are collected and statistics improv es, so should the ﬁtted distribution without needing an y external adjustmen ts. • The algorithm should b e highly automated. No h uman input or con trol should b e necessary other than a limited c hoice of initial parameters. In par- ticular, the metho d should b e applicable to all rea- sonably o ccurring functions and should require no adjustmen t for diﬀeren t classes of functions. Belo w w e describe the setup in detail. A. Sampling stage W e divide the domain into 2 K non-o verlapping elemen- tary bins that do not need to b e of equal width. Because 4 in the end we form com binations of sev eral bins, the ele- men tary bins ma y b e arbitrarily small. Bins with to o few data p oin ts are automatically excluded from the analy- sis. As the n umber of sampled points increases, bins of smaller size will b ecome usable. The num b er of elemen- tary bins is thus limited only b y memory . The K → ∞ limit w ould formally corresp ond to k eeping the full list of the generated v alues of x . In practice, the distributions of interest are structureless beyond a certain scale that sets a natural limit on the required elemen tary bin width. W e generate v alues of x according to a probabil- it y distribution given by | f ( x ) | (imp ortance sampling) and sample the v alue v = sign[ f ( x )] in the elementary bin i that contains x . W e can also include additional w eighting co eﬃcien ts into the sampled v alues. F or each bin we store the num b er of sampled v alues in the bin, N i , as well as the av erage ¯ v i and the scaled v ariance M 2 ( v i ) = ( N i − 1)V ar( v i ) of the sampled v alues o v er the bin. W e obtain the s ampled in tegral I i o ver eac h elemen- tary bin via I i = ¯ v i N i / N , where N is the total n umber of sampled points. The v ariance of the sampled in tegral is calculated via V ar( I i ) = M 2 ( I i ) N − 1 (6) M 2 ( I i ) = M 2 ( v i ) + ¯ v 2 i N i ( N − N i ) N (7) and its error is giv en by δ I i = p V ar( I i ) / N . B. Bin hierarch y After sampling has ﬁnished, w e construct com binations of elementary bins. These make up a hierarc hy with 2 n bins at each level n ∈ { 0 , . . . , K } . At the top level we ha ve one large bin containing the en tire in tegral; on the next level we hav e tw o bins con taining the in tegrals o v er half of the in terv al each, etc. The ﬁt is constructed to minimize K X n =0 χ 2 n 2 n = χ 2 0 1 + χ 2 1 2 + . . . + χ 2 K 2 K , (8) where χ 2 n is the goo dness of the ﬁt considering only the bins on the level n . Note that the construction of the bin hierarc hy happ ens in the p ost pro cessing. During the sampling stage, only the v alues N i , ¯ v i and M 2 ( v i ) in the elemen tary bins (corresp onding to the hierarch y level K ) need to b e collected and stored. W e tested several mo diﬁcations of this setup, which pro duced consisten t results. Including more than 2 n bins on eac h lev el b y using ov erlapping bins of equal size w as found to bring no signiﬁcan t impro vemen t. Likewise, us- ing weigh ts other than 1 / 2 n for χ 2 n did not impro ve the result. In general, ﬁner bins at higher levels do not sub- stan tially con tribute to the form of the ﬁt, as their error bars are large and the shape of the distribution is usually already determined by the lo wer-lev el in tegrals. But they are still included into the ﬁtting pro cedure (provided that they contain enough data p oin ts for statistics) to resolve p oten tial ﬁne structures of the distribution. F or example, a perio dically oscillating function lik e the sine can ha v e zero a verage ov er large parts of the domain, while b eing nonzero on a ﬁner scale. The structure of suc h a function will b e resolved from the integrals ov er small bins if they con tain enough data. Due to the 1 / 2 n w eight factors, the small bins cannot ov erp o wer the large bin con tributions. Note that while for minimization we use the sum o ver all lev els, to decide whether a ﬁt is accepted w e c heck the go odness of the ﬁt at ev ery level separately . A ﬁt is ac- cepted only if e ach of the χ 2 n / ˜ n is approximately 1, within a given num b er T of standard deviations σ = p 2 / ˜ n ( ˜ n ≤ 2 n is the num b er of bins on level n which con- tain suﬃcient data to be used for ﬁtting). The ﬁt accep- tance threshold T is an external input parameter, typi- cally T = 2. Low er v alues of the threshold are lik ely to result in the failure to pro duce an acceptable ﬁt. Instead of a ﬁxed v alue, a range of T can also b e sp eciﬁed, for example betw een tw o and four. If there is no acceptable ﬁt at the low est threshold v alue, it is gradually increased un til either the maxim um is reached or an acceptable ﬁt is found. F or a large n um b er of hierarc h y lev els, the sta- tistical probabilit y for a level to exceed a 2 σ threshold on χ 2 n / ˜ n b ecomes non-negligible, even if the ov erall ﬁt is go od. Sp ecifying a threshold range allows one to attempt ﬁts with low er threshold v alues, without risking a failed ﬁt. C. P olynomial ﬁt The sampled in tegrals are ﬁtted to a spline. In what follo ws w e discuss p olynomial (typically , cubic) splines, but a generalization to other functions is straigh tforward. Here we brieﬂy review the theory of ﬁtting a single poly- nomial p ( x ) = P m k =0 a k x k , before we turn to the full spline in the next section. There exists a lot of standard softw are for ﬁtting a set of p oints to a function. Here we wan t to match the p oly- nomial to a set of integrals, which requires a sligh t adjust- men t. The deriv ation follo ws the steps of general linear least squares ﬁtting. 10 The b est polynomial ﬁt minimizes χ 2 = X i I i − I ( p ) i δ I i ! 2 , (9) where I ( p ) i is the in tegral of the polynomial ov er the bin i . F or simplicity we omit the sum o ver bin levels and the corresp onding w eighting factors in this section, but the generalization is straightforw ard. Let us lab el the b oundaries of bin i b y ¯ x i and ¯ x i +1 . The in tegral of the p olynomial ov er the bin equals Z ¯ x i +1 ¯ x i p ( x ) dx = m X k =0 a k k + 1 ( ¯ x k +1 i +1 − ¯ x k +1 i ) ≡ m X k =0 a k X ik 5 (10) and Eq. (9) b ecomes χ 2 = X i I i δ I i − m X k =0 a k X ik δ I i ! 2 ≡ X i b i − X k A ik a k ! 2 = | A · a − b | 2 . (11) The v ector b con tains the sampled in tegrals divided by the error bar (its length equals the num ber of bins ˜ n ), the vector a of length m + 1 contains the free parameters of the polynomial, and the ˜ n × ( m + 1) design matrix A is deﬁned by A ik = X ik /δ I i . The design matrix has more ro ws than columns since there must b e more data points than ﬁt parameters. W e ﬁnd the vector a that minimizes | A · a − b | 2 using singular v alue decomp osition of the design matrix. 10 The error on the ﬁtted p olynomial at a given p oin t x is obtained from the co v ariance matrix C j k = Co v( a j , a k ) via δ p ( x ) = v u u t m X i,j =0 C ij x i + j . (12) D. Spline A t eac h knot b etw een t w o spline pieces, the v alues of the corresp onding p olynomials and their deriv atives up to order ( m − 1) are equal. This giv es m constraints at eac h knot. Since eac h polynomial has m + 1 co eﬃcients, the total n umber of free parameters is n p + m , where n p is the num b er of spline pieces. The extra m parameters arise since there is one more p olynomial piece than knots. Kno wn b oundary conditions, if any , further reduce the n umber of free parameters. The algorithm determines the p ositions of the knots using the following pro cedure. Begin by attempting to ﬁt one p olynomial on the whole in terv al. If this ﬁt is not acceptable, split the in terv al into tw o equal parts and attempt ﬁtting a t wo-piece spline. If this ﬁt is not acceptable either, c heck the go o dness of the ﬁt on each in terv al separately . In terv als where the ﬁt is not accept- able hav e to b e split in the next iteration, while all other in terv als remain unc hanged. Rep eat this pro cedure until either an acceptable ﬁt is found, or the interv als can no longer b e split. W e require that the set of equations for eac h spline piece be ov erdetermined, i.e. that the n umber of bins inside eac h in terv al is larger than m + 1. This sets the limit on the maximal num b er of spline pieces. T o chec k the go o dness of the ﬁt on an individual in- terv al, w e compute the v alues of χ 2 n / ˜ n , where χ 2 n is ev al- uated only on bins of level n that are fully inside the giv en interv al (and that contain enough statistical data), and ˜ n is the num b er of such bins. Larger bins that reach o ver sev eral interv als cannot be used for this c heck, but they are used to chec k the go odness of the ov erall ﬁt. In all test cases, if the polynomials on the individual in- terv als passed the go odness-of-ﬁt test, the ov erall spline ev aluated o ver the entire domain passed also. Note that as before, the c hecks on the individual in terv als are per- formed for each hierarc hy lev el separately . If more than half of the level n bins inside a given in terv al do not co n- tain enough data, the chec k for this in terv al terminates without proceeding to subsequen t lev els. E. Constraining jumps in the highest deriv ative An additional constraint on the jumps in the m th deriv ative of the spline can be imp osed on the ﬁnal ﬁt, if desired. Deriv atives up to order m − 1 are matched at the spline knots, while all deriv atives of order greater than m are zero. The piecewise constant m th deriv ative, whic h is proportional to the v alue of the parameter a ( j ) m on each interv al j , is the only one that exhibits jumps at the knots. Imp osing an additional constraint that aims to minimize these jumps ensures a con tin uous ev olution of the ﬁtted spline as a function of the n um b er of sam- pled points N . The num b er of spline pieces for the best ﬁt generally increases with growing N , as b etter statistics p ermits a better resolution of the sampled function. Ad- ditional interv al divisions are triggered at certain discrete v alues of N , namely when one of the χ 2 n v alues on the af- fected in terv al crosses the goo dness-of-ﬁt threshold. This creates an additional jump in the m th deriv ative at the p osition of the new knot. This jump can be closed with only a small sacriﬁce in χ 2 n since the previous spline with one few er knot w as almost acceptable. In this sense, the additional constrain t on the jump acts as a comp ensation mec hanism for a p oten tially to o vigorous interv al split- ting. In most cases, constraining the jump has almost no eﬀect of the ﬁt. The constrain t term has the form λ n p n p − 1 X j =1 λ j a ( j +1) m − a ( j ) m ¯ a ( j +1) m − ¯ a ( j ) m ! 2 , (13) where ¯ a ( j ) m are the p olynomial parameters of the opti- mal ﬁt without the constrain t and the w eights λ and λ j con trol the strength of the constraint. After comput- ing the optimal spline through minimization of Eq. (8), a new spline ﬁt is attempted on the same interv al di- vision, this time b y minimizing the sum of (8) and the constrain t (13). The lo cal w eights λ j are determined b y the go o dness of the original ﬁt (without constraint) on the interv als j − 1, j , j + 1 and j + 2, whic h are adjacent or next-to-adjacent to the respective knot. Sp eciﬁcally , w e determine (on each hierarc hy level of each of these in terv als) the diﬀerence b etw een the maximally accept- able χ 2 n / ˜ n = 1 + T p 2 / ˜ n and the actual χ 2 n / ˜ n . W e take λ j to be the minimum of all these diﬀerences. This en- sures that the constrain t does not aﬀect spline in terv als 6 where already the original ﬁt was barely acceptable. The global weigh t λ is iteratively adjusted to the maximal v alue that is still compatible with the imp osed thresh- old on the χ 2 n / ˜ n . Additional iterations can b e p erformed b y replacing the ¯ a ( j ) m with the v alues calculated with the constrain t. F. Errors on the spline co eﬃcients An estimate on the spline co eﬃcient errors can b e ob- tained from Eq. (12) for each spline piece. This equation is accurate for parametric ﬁts to a set of statistically in- dep enden t normally distributed data p oin ts. This is not the case for the BHM, due to the o v erlapping bins and the unconv entional χ 2 -minimization proto col. Because of this, one cannot attach Gaussian standard deviation probabilities to error bars. A robust wa y of deﬁning the error at a given p oint x is to lo ok at the histogram of ˜ f ( x ) v alues ov er a large set of indep endent runs with the same sample size N . The smallest interv al around the histogram mean that con tains 68 . 27% of the ˜ f ( x ) v alues corresp onds to the robust estimate for tw o standard er- rors ( ± 1 σ ). This is the most rigorous, alb eit impractical, w ay of deﬁning the error under these circumstances. A suitable alternativ e proto col should yield errors that are close to the robust v alue. W e p erformed extensiv e tests of sev eral error estima- tion proto cols by comparing them to the robust estimate for diﬀeren t types of distribution f ( x ). These tests con- ﬁrmed that Eq. (12) provides a go od error estimate in most cases. In some cases the error w as found to b e o verestimated by a factor of ab out tw o compared to the robust error. This o ccurred in particular at the domain b oundaries and for BHM splines with a large num b er of knots. A concrete example is presented in Sec. IV C. The error w as nev er observ ed to be to o small and hence Eq. (12) gives a con venien t and eﬃcient wa y to obtain a conserv ative error estimate on the BHM ﬁt. A tigh ter error estimate can b e obtained using b ootstrap 11 with the follo wing proto col. Pro duce an ar- ra y of M histograms eac h of which contains a fraction N / M of the N sampled data points, so that each p oint is sampled in exactly one of the histograms. Generate ˜ M ≥ M b o otstrap histograms by taking ˜ M random linear combinations (with positive in teger coeﬃcients) of the M histograms. Each of these b o otstrapped his- tograms then contains N sampled points, but with p os- sible rep etitions. Run the BHM ﬁt on eac h of the bo ot- strap histograms ﬁxing the p ositions of the knots to b e the same as for the BHM ﬁt of the regular histogram (the sum of all bo otstrap histograms with unit weigh t) and without ev aluating the go o dness of ﬁt. The error on the spline co eﬃcients is then determined from the statis- tics on the b o otstrapp ed histogram ﬁts in the usual w ay . The b o otstrap error was found to b e very close to the robust error estimate in all our tests. This sc heme uses more resources, esp ecially memory , since M & 100 his- tograms need to b e stored instead of one. The increase in computational time is less signiﬁcant, since p erforming ﬁts with ﬁxed knot positions is v ery fast. Another error estimation proto col is based on the anal - ysis of the evolution of the BHM ﬁt as a function of sam- ple size N . F or this, the BHM spline is pro duced and sa ved perio dically in in terv als of ∆ sampling steps. The idea is that the ﬁt results ˜ f ( x ) will approach the exact v alue with the disp ersion σ ∝ 1 / √ N . The interv al size ∆ m ust be large enough to guaran tee that the con tributions A k from diﬀeren t in terv als are statistically independent, allo wing one to inv oke the central limit theorem: ˜ f k = 1 k k X i =1 A i ⇒ A k = k ˜ f k − ( k − 1) ˜ f k − 1 . (14) The disp ersion of ˜ f k can then b e obtained from the dis- p ersion σ ∗ of the indep endent A k via σ 2 k = σ 2 ∗ /k . An additional cutoﬀ k ≥ k 0 should also b e introduced to re- duce the eﬀect of the initial p ortion of the statistics that can b e strongly aﬀected by transien t pro cesses. A simple consistency criterion is that there should b e a range of ∆ and k 0 suc h that σ ∗ is essentially indep endent of b oth parameters. Our tests sho w ed that the errors calculated with this method are almost indistinguishable from the ones obtained with b o otstrap, and hence also agree with the robust error. This method is somewhat slo wer, since the full BHM ﬁt needs to b e ev aluated rep eatedly , and requires a suﬃciently large N . On the other hand, the A k sequence pro vides explicit insigh t in to the statistics of the data and can help to identify statistical ﬂuctuations. G. Sign problem Mon te Carlo sampling often suﬀers from a sign prob- lem, when p ositive and negativ e terms o ccur with almost equal frequency in the sampling pro cess. Before a suﬃ- cien t amount of data has been collected, the error may th us substantially exceed the absolute v alue of the sam- pled integrals. The same applies to the error of the ﬁtted spline. Fits where the noise substan tially exceeds the sig- nal are unsuitable for further data analysis. T o iden tify this problem, the algorithm chec ks—at all bin levels— whether the sampled data are consisten t with zero before the start of the ﬁtting process. The data is deemed c er- tainly inc onsistent with zero only if at least one of the follo wing conditions is met: (i) at any individual level n , the v alue of χ 2 n / ˜ n for the zero function exceeds 1 by at least 4 standard deviations (4 p 2 / ˜ n ), or (ii) at any t wo individual levels, χ 2 n / ˜ n exceeds 1 b y at least 3 p 2 / ˜ n , resp ectiv ely , or (iii) χ 2 n / ˜ n exceeds 1 by at least 3 p 2 / ˜ n at one lev el, and by at least 2 p 2 / ˜ n on t wo other levels, resp ectiv ely , or (iv) at an y four individual levels, χ 2 n / ˜ n exceeds 1 by at least 2 p 2 / ˜ n , resp ectiv ely . The probabil- it y for this to o ccur as a statistical ﬂuctuation is lo wer than 10 − 4 . Unless one of these c onditions is met, the al- 7 1 . 0 1 . 6 2 . 2 2 . 8 x 0 . 3 0 . 4 0 . 5 0 . 6 f ( x ) ∝ 1 − 3 x/ 2 + 2 x 2 − x 3 / 2 (a) BHM spline basis pro jection f ( x ) 1 . 0 1 . 6 2 . 2 2 . 8 x − 0 . 04 − 0 . 02 0 . 00 0 . 02 ˜ f ( x ) − f ( x ) (b) BHM spline basis pro jection FIG. 1. Cubic polynomial test function. The BHM ﬁt (red dashed line with error band) in comparison with the basis pro jection metho d (blue dotted line with error band). The left panel (a) shows the function and the ﬁts, while the righ t panel (b) sho ws the diﬀerence b etw een the ﬁtted and the true function for b oth metho ds. gorithm can be set to not con tinue with the ﬁtting proce- dure un til more data has b een collected. The conditions are c hosen to be strict, to minimize the risk of con tinuing the analysis with a noisy ﬁt. An alternativ e to this pro cedure is to consider the evo- lution of the ﬁtted spline with accumulated statistics, accepting the spline only when it ceases to change sub- stan tially o ver Monte Carlo time, Z    ˜ f N ( x ) − ˜ f 2 N ( x )    dx < α Z | ˜ f 2 N ( x ) | dx, (15) where ˜ f N ( x ) is the BHM spline obtained from N sam- pled data p oin ts, α is the acceptance threshold and the in tegration goes ov er the en tire domain. Typical v alues for α lie approximately b et ween 0 . 2 and 0 . 5. IV. NUMERICAL TESTS W e presen t tests of the BHM for sev eral t yp es of distri- bution. Unless otherwise stated, for eac h test 10 4 random samples were tak en, to demonstrate the ability to restore the smo oth distribution from a mo derate sample. W e al- w ays ﬁt cubic splines. The display ed error bands on the splines w ere obtained with Eq. (12). F or comparison, we sampled the same data p oints also with the basis pro jection metho d using a p olynomial ba- sis up to cubic order. The binning for basis pro jection sampling was retrosp ectively c hosen to b e the same as the in terv al division for the spline that was determined b y the bin hierarch y algorithm. This is a strict test, since in practice the optimal binning for the pro jection metho d is not known in adv ance. F or eac h of the tests w e observ ed that the accuracy of the bin hierarch y ﬁt was at least as go o d as with the basis pro jection metho d, pro ving the sup eriority of BHM due to at le ast its eﬃ- ciency and the smoothness of ˜ f ( x ). W e displa y the basis pro jection results for the ﬁrst example of a p olynomial distribution and omit them in the following examples to a void ov ercrowding the plots. A. P olynomial The distribution sampled on the interv al [1 , 2 . 8] is f ( x ) ∝ 1 − 3 x/ 2 + 2 x 2 − x 3 / 2 . (16) Since the distribution is a cubic p olynomial, the algo- rithm should stop after ﬁtting one p olynomial on the en tire domain. This is indeed the case. Figure 1 sho ws the BHM ﬁt in comparison with the result obtained using the basis pro jection method. In this particular case, the basis pro jection metho d provides the most accurate esti- mate on the function by deﬁnition, since we are pro ject- ing on a function that has the same form as f ( x ) and th us co ver the whole in terv al [1 , 2 . 8] without introducing any systematic error. Nev ertheless, BHM pro duces a result comparable in accuracy . This example demonstrates that the BHM ﬁt comp etes with the basis pro jection method ev en when the latter is kno wn to be optimal. W e also p erform tests with a higher order polynomial. The function sampled on the interv al [ − 1 , 1] is f ( x ) ∝ x 4 − 0 . 8 x 2 . (17) Since f ( x ) c hanges sign, we use | f ( x ) | as the probabilit y to generate x and then sample sign[ f ( x )]. The BHM pro vides an accurate smo oth ﬁt of this function. The result in comparison with the basis pro jection metho d is sho wn in Fig. 2. F our spline pieces were needed to resolve the function in this example. As in the previous example, 8 − 1 . 0 − 0 . 5 0 . 0 0 . 5 1 . 0 x − 1 . 0 − 0 . 5 0 . 0 0 . 5 1 . 0 f ( x ) ∝ x 4 − 0 . 8 x 2 (a) BHM spline basis pro jection f ( x ) − 1 . 0 − 0 . 5 0 . 0 0 . 5 1 . 0 x − 0 . 10 − 0 . 05 0 . 00 0 . 05 0 . 10 ˜ f ( x ) − f ( x ) (b) BHM spline basis pro jection FIG. 2. Quartic p olynomial test function. The BHM ﬁt (red dashed line with error band) in comparison with the basis pro jection metho d (blue dotted line with error band). The left panel (a) sho ws the function and the ﬁts, while the right panel (b) shows the diﬀerence b etw een the ﬁtted and the true function for b oth metho ds. 1 . 0 1 . 6 2 . 2 2 . 8 x 0 . 0 0 . 5 1 . 0 1 . 5 2 . 0 2 . 5 3 . 0 f ( x ) ∝ exp( − 3 x ) (a) BHM spline f ( x ) 1 . 0 1 . 6 2 . 2 2 . 8 x − 0 . 10 − 0 . 05 0 . 00 0 . 05 0 . 10 ˜ f ( x ) − f ( x ) (b) BHM spline FIG. 3. BHM ﬁt with error band of an exp onen tial test function. The left panel (a) sho ws the function and the ﬁt, while the righ t panel (b) shows the diﬀerence b etw een the ﬁtted and the true function. the errors on the BHM spline and the basis pro jection are comparable in size. B. Deca ying exp onential The distribution sampled on the interv al [1 , 2 . 8] is f ( x ) ∝ exp( − 3 x ) . (18) An exp onentially decaying distribution is typical for man y physical processes. In this example, the BHM pro- duced an acceptable ﬁt with t wo spline pieces on the in terv als [1 , 1 . 9] and [1 . 9 , 2 . 8]. Figure 3 shows the b est ﬁt, whic h agrees w ell with the original function. T o demonstrate the imp ortance of including com bina- tions of bins into the ﬁt, we also compare our result with the b est spline obtained b y ﬁtting elemen tary bins only (those elementary bins that did not contain enough data p oin ts for sensible statistics were coarse grained un til the minimal size that can b e used for statistical analysis was reac hed). Figure 4 compares the tw o ﬁts. The ﬁt us- ing only elemen tary bins deviates strongly from f ( x ) on the ﬁrst in terv al [1 , 1 . 9]. In particular, it is apparen t that the in tegrals o v er this in terv al and ov er the entire domain w ere not properly captured. C. Oscillating function The distribution sampled on the interv al [1 , π + 0 . 6] is f ( x ) ∝ 10 + cos(10 x ) . (19) 9 1 . 0 1 . 6 2 . 2 2 . 8 x 0 . 0 0 . 5 1 . 0 1 . 5 2 . 0 2 . 5 3 . 0 f ( x ) ∝ exp( − 3 x ) BHM spline spline using only elemen tary bins f ( x ) FIG. 4. Exp onential test function. The BHM ﬁt (red dashed line with error band) in comparison with the ﬁt using only elemen tary bins (blue dotted line with error band). The latter do es not prop erly resolve all sampled integrals of the test function. The c hallenge is to resolve the p eriodic oscillations within the accuracy of the sampled data. F or a small sample, w e exp ect a ﬂat ﬁt repro ducing the a verage of the func- tion. As we collect more data the oscillations should b e resolv ed. W e sho w the bin hierarc hy ﬁt for a small sam- ple of 10 4 p oin ts and for a larger sample of 10 6 p oin ts in Fig. 5. The ﬁt repro duces the original distribution w ell within error bars. Man y spline pieces are needed to resolv e the structure (in the examples shown: 15–16 in terv als) and Eq. (12) o verestimates the error on the spline co eﬃcien ts. In Fig. 5 we sho w the error band ob- tained from Eq. (12) as well as the one obtained using b ootstrap. In Fig. 6 we explicitly compare the error estimates ob- tained with Eq. (12), b o otstrap, and ﬁt ev olution. The b ootstrap and ﬁt evolution errors are nearly the same for all x . In this example, for b ootstrap and ﬁt evolution errors, the deviation ˜ f ( x ) − f ( x ) exceeds 1 σ on roughly 30% of the domain and 2 σ on roughly 7% of the do- main, which is comparable to the Gaussian case. F or the Eq. (12) error, the deviation almost never exceeds 1 σ . W e also presen t a robust error analysis based on 100 in- dep enden t samples with 10 6 p oin ts each. The histogram of ˜ f ( x ) v alues is shown in Fig. 7 for several v alues of x . F or reference we indicate the sp eciﬁc v alue of ˜ f ( x ) from the example sho wn in Fig. 5, to illustrate for which x the ﬁt v alues in this example are statistical outliers, and for which they are t ypical. W e also sho w the median of the Eq. (12) errors and a representativ e b o otstrap error (from the example shown in Fig. 5). Indeed, the size of the bo otstrap and ﬁt ev olution errors is v ery close to the robust estimate, while the error from Eq. (12) is to o large. D. Div ergent function on a semi-inﬁnite domain The distribution sampled on the interv al [0 , ∞ ) is f ( x ) ∝ 1 √ x (1 + x ) . (20) The basis pro jection metho d has the adv antage that the 1 / √ x div ergence at x = 0 can b e incorporated directly into the basis, and that the asymptotic b e- ha vior at large x can b e resolved using a semi-inﬁnite bin. W e present a setup that achiev es the same in the BHM framework. The div ergence can b e a voided by w eighting all sampled v alues with √ x and recov ering the original distribution at the end by dividing the BHM spline by √ x . The semi-inﬁnite domain can be mapped on to a ﬁnite in terv al for instance through a transform suc h as y ( x ) = 2 arctan( x ) /π (used in this example) or y ( x ) = 1 − exp( − x ). F or each x generated according to the distribution f ( x ), the v alue y ( x ) ∈ [0 , 1] is calculated and sampling is then performed in to the corresp onding y - histogram. T o recov er the correct function, the sampled v alues need to b e scaled by 1 /x 0 ( y ). Figure 8 shows the BHM ﬁt transformed back in to the original domain. Due to the large domain, 10 5 sampling p oin ts were used for this example. The ﬁt agrees well with the original distribution within error bars. In this example, the BHM pro duced an acceptable ﬁt on one spline in terv al. In order to comp ensate for the divergence, it is as- sumed that its lo cation and type are known in adv ance. Often it is suﬃcient to know the approximate prop er- ties of the divergence. T o demonstrate this, Fig. 9 shows BHM ﬁts of data sampled with a weigh ting factor that sligh tly ov erestimates or underestimates the p o wer of the div ergence (the mapping of the semi-inﬁnite domain on to a ﬁnite interv al is the same as before). Despite the wrong scaling, the ﬁts still agree with the original distribution, but the errors on the ﬁts are larger. It is b etter to o ver- estimate the p o w er, since in this case the divergence is still fully comp ensated. E. Sampling with a severe sign problem W e demonstrate the p erformance of the algorithm in the presence of a sev ere sign problem by generating data on the interv al [0 , 3] with the distribution giv en by f ( x ) ∝ exp( − 0 . 99 x ) − exp( − x ) ≡ f 1 ( x ) − f − 1 ( x ) . (21) The v alues of x are generated via a Monte Carlo Marko v c hain pro cess with tw o t yp es of up dates: switc hing b e- t ween tw o “sectors” corresp onding to f 1 and f − 1 , and v arying the v alue of x on the interv al [0 , 3] within the same sector. The up date acceptance probabilities are giv en b y the detailed balance equations, P f − σ → f σ = f σ ( x ) /f − σ ( x ) = exp( ± 0 . 01 x ) , (22) P x → x 0 = f σ ( x 0 − x ) . (23) 10 1 . 0 1 . 5 2 . 0 2 . 5 3 . 0 3 . 5 x 0 . 20 0 . 25 0 . 30 0 . 35 0 . 40 0 . 45 f ( x ) = 1 /π + cos(10 x ) / 10 π (a) BHM spline f ( x ) 1 . 0 1 . 5 2 . 0 2 . 5 3 . 0 3 . 5 x − 0 . 10 − 0 . 05 0 . 00 0 . 05 0 . 10 ˜ f ( x ) − f ( x ) (b) BHM spline 1 . 0 1 . 5 2 . 0 2 . 5 3 . 0 3 . 5 x 0 . 28 0 . 30 0 . 32 0 . 34 0 . 36 0 . 38 f ( x ) = 1 /π + cos(10 x ) / 10 π (c) BHM spline f ( x ) 1 . 0 1 . 5 2 . 0 2 . 5 3 . 0 3 . 5 x − 0 . 010 − 0 . 005 0 . 000 0 . 005 0 . 010 ˜ f ( x ) − f ( x ) (d) BHM spline FIG. 5. Oscillating test function using 10 4 sampled p oints (top panels (a) and (b)) and 10 6 sampled p oints (b ottom panels (c) and (d)). The BHM ﬁt (red dashed line) is sho wn with t wo error estimates on the spline co eﬃcien ts: errors calculated using Eq. (12) (red band) and errors calculated using b o otstrap (blue band). Bo otstrap provides a tighter bound on the errors in this example. The left panels, (a) and (c), sho w the function and the ﬁt, while the right panels, (b) and (d), sho w the diﬀerence b et w een the ﬁtted and the true function. The sign problem is due to the functions f 1 ( x ) and f − 1 ( x ) ha ving almost the same magnitude but b eing sampled with opp osite sign. This setup is a toy version of the t ypical situation arising in diagrammatic Mon te Carlo for fermions. Figure 10 sho ws the BHM spline for 10 5 and 10 7 sam- pled p oin ts. In the former case, the data is compatible with zero, whic h is correctly captured b y the algorithm presen ted in Sec. I II G. In this case, the ﬁt should not b e used, despite having acceptable χ 2 (the ﬁt is consis- ten t with the true function within errors). As more data are gathered the algorithm b egins to correctly repro duce the features of the function. In the examples shown, one spline piece was suﬃcient for an acceptable ﬁt. F. Green’s function of the F r¨ ohlich polaron W e now apply the BHM to a ph ysical problem—the diagrammatic Monte Carlo calculation of the zero mo- men tum imaginary time Green’s function, 9 G ( k = 0 , τ ) = h v ac | a 0 ( τ ) a † 0 (0) | v ac i , (24) a k ( τ ) = e H τ a k e − H τ . (25) of the F r¨ ohlich p olaron with the dimensionless coupling constan t α = 2 and the c hemical potential µ = − 2 . 07 ω 0 , where ω 0 is the (momentum-independent) phonon fre- quency . Here | v ac i is the v acuum state and a k is the annihilation op erator for an electron with momen tum k . The F r¨ ohlich Hamiltonian describ es an electron coupled 11 1 . 0 1 . 5 2 . 0 2 . 5 3 . 0 3 . 5 x − 3 − 2 − 1 0 1 2 3 ( ˜ f ( x ) − f ( x )) /σ ( x ) Eq. (9) error b o otstrap error ﬁt ev olution error FIG. 6. Oscillating test function. Diﬀerence b etw een the ﬁtted and the true function normalized by one standard error obtained with diﬀerent metho ds. to a bath of phonons, 12 H = H e + H ph + H e − ph , (26) H e = X k k 2 2 a † k a k , H ph = X q ω 0 b † q b q , (27) H e − ph = X k , q i (2 √ 2 απ ) 1 / 2 q ( b † q − b − q ) a † k − q a k , (28) where b q is the annihilation operator for a phonon with momen tum q . The Green’s function is a central quantit y for the dia- grammatic technique, from which other prop erties of the system can b e obtained with appropriate analysis. As ref- erence w e use a v ery long Monte Carlo run with rew eight- ing. This pro vides a reliable estimate of the Green’s func- tion with negligible errors (several orders of magnitude smaller than the errors of the sample used for the BHM and for basis pro jection sampling). Figure 11 sho ws the BHM ﬁt and the result obtained using the basis pro jection metho d for approximately 2 · 10 6 sampled points. The BHM ﬁt pro duced four spline pieces in this example and, as in the previous examples, the basis w as retrosp ectiv ely chosen to hav e the same in terv al division. It can b e clearly seen that the BHM pro vides an accurate smo oth ﬁt of the F r¨ ohlich p olaron Green’s function, which agrees well with the reference. While the error bars on the BHM ﬁt and the basis pro- jection are comparable, the basis pro jection sampling is less eﬃcient, since it requires O ( m 2 ) op erations during the sampling stage, where m is the num b er of basis func- tions used. In this particular example with a polynomial basis up to cubic order, this corresp onds to at least 16 times as many op erations as needed during the sampling stage for BHM. V. CONCLUSIONS W e hav e argued and demonstrated that the BHM yields an eﬃcien t, ﬂexible, and fully automatized algo- rithm to restore smo oth functions from their noisy inte- grals using all av ailable information. The resulting ﬁts are at least as accurate as the ones obtained using basis pro jection sampling, but with guaranteed smo othness at the knots. Sampling with the BHM is also computation- ally less exp ensiv e than using basis pro jections, which require man y op erations in eac h sampling step. Similar to the pro jection metho d onto a p olynomial basis, the BHM spline is a piecewise p olynomial function on sev- eral large in terv als. The crucial technical adv antage is that these interv als do not need to b e ﬁxed beforehand. A suitable division into interv als is found automatically b y the algorithm and is adjusted ov er time, as more data p oin ts are collected. In the future w e plan to extend the bin hierarc hy al- gorithm to multiv ariate functions, since many relev ant ph ysical observ ables dep end on several v ariables, such as time and momen tum. Tw o, three and four dimensions are most relev an t for ph ysical applications. A CKNOWLEDGMENTS W e thank Chris Amey for helpful discussions. This w ork w as supported b y the Simons Collaboration on the Man y Electron Problem and the National Science F oun- dation under the grant DMR-1720465. O.G. also ac- kno wledges supp ort b y the US-Israel Binational Science F oundation (Gran ts 2014262 and 2016087). 1 I. Narsky and F. C. Porter, Statistic al Analysis T echniques in Particle Physics (John Wiley & Sons, 2013). 2 D. W. Scott, Multivariate Density Estimation (John Wiley & Sons, 2015). 3 B. W. Silverman, Density estimation for statistics and data analysis (London: Chapman and Hall, 1986). 4 O. Goulko, A. Gaenk o, E. Gull, N. Prokof ’ev, and B. Svistuno v, ArXiv e-prints (2017), co de a v ailable at https://github.com/olgagoulko/BHM/tree/ GPLv3 , arXiv:1711.04316 [stat.OT]. 5 O. Goulko, A. S. Mishchenk o, L. Pollet, N. Prokof ’ev, and B. Svistunov, Phys. Rev. B 95 , 014102 (2017), arXiv:1609.01260 [cond-mat.other]. 6 C. De Boor, A Pr actic al Guide to Splines (R evise d Edition) (Springer, 2001). 7 N. V. Prokof ’ev and B. V. Svistunov, JETP Lett. 97 , 649 (2013), arXiv:1304.5198 [cond-mat.stat-mech]. 8 A. M. F errenberg and R. H. Swendsen, Phys. Rev. Lett. 61 , 2635 (1988), ; Erratum: Ph ys. Rev. Lett. 63, 1658 (1989). 9 A. S. Mishc henko, N. V. Prokof ’ev, A. Sak amoto, and 12 0.28 0.286 0.292 0.298 0.304 0.327 0.329 0.331 0.333 0.335 0.290 0.292 0.294 0.296 0.298 0.33 0.34 0.35 0.36 0.37 (a) (b) (c) (d) FIG. 7. Smo othed histogram of ˜ f ( x ) v alues for the oscillating test function at x = 1 , 2 , 2 . 9 , 3 . 74 (from left to righ t). The v ertical lines are: histogram mean (red solid line); true function v alue (blue dotted line); the particular ˜ f ( x ) v alue from the sim ulation sho wn in the bottom panel of Fig. 5 (blac k dashed line). The latter is shown for reference. The horizon tal lines are: the ± 1 σ in terv al around the histogram mean, where σ is the median standard error from Eq. (12) (red solid line); the ± 1 σ interv al around the histogram mean, where σ is the robust standard error (blue dotted line); the ± 1 σ in terv al around the histogram mean, where σ is the typical bo otstrap error, taken from the example shown in the bottom panel of Fig. 5 (blac k dashed line). 0 . 0 0 . 2 0 . 4 0 . 6 0 . 8 1 . 0 x 0 . 0 0 . 5 1 . 0 1 . 5 2 . 0 2 . 5 3 . 0 3 . 5 4 . 0 4 . 5 f ( x ) ∝ 1 √ x (1+ x ) (a) 1 2 3 4 5 6 7 8 9 10 x 0 . 00 0 . 04 0 . 08 0 . 12 0 . 16 BHM spline f ( x ) 0 . 0 0 . 2 0 . 4 0 . 6 0 . 8 1 . 0 x − 0 . 04 − 0 . 02 0 . 00 0 . 02 0 . 04 ˜ f ( x ) − f ( x ) (b) BHM spline 1 2 3 4 5 6 7 8 9 10 x − 0 . 001 0 . 000 0 . 001 ˜ f ( x ) − f ( x ) FIG. 8. Sampling a divergen t function deﬁned on a semi-inﬁnite domain using the BHM with appropriate transforms. The left panel (a) shows the function and the ﬁt, while the right panel (b) shows the diﬀerence b etw een the ﬁtted and the true function. 0 . 0 0 . 2 0 . 4 0 . 6 0 . 8 1 . 0 x − 0 . 2 0 . 0 0 . 2 ˜ f ( x ) − f ( x ) (a) BHM spline 1 2 3 4 5 6 7 8 9 10 x − 0 . 002 0 . 000 0 . 002 ˜ f ( x ) − f ( x ) 0 . 0 0 . 2 0 . 4 0 . 6 0 . 8 1 . 0 x − 0 . 8 − 0 . 4 0 . 0 0 . 4 0 . 8 ˜ f ( x ) − f ( x ) (b) BHM spline 1 2 3 4 5 6 7 8 9 10 x − 0 . 002 0 . 000 0 . 002 ˜ f ( x ) − f ( x ) FIG. 9. Divergen t test function f ( x ) ∝ x − 0 . 5 (1 + x ) − 1 with an x − 0 . 5 div ergence at x = 0. Data is sampled with a comp ensation for the div ergence that ov erestimates (left panel (a)) and underestimates (righ t panel (b)) the p o wer of the divergence by using w eighting factors of x 0 . 8 and x 0 . 2 , resp ectiv ely . The plots sho w the diﬀerence betw een the BHM ﬁt and the true function. 13 0 . 0 0 . 5 1 . 0 1 . 5 2 . 0 2 . 5 3 . 0 x − 0 . 005 0 . 000 0 . 005 0 . 010 0 . 015 0 . 020 f ( x ) ∝ exp( − 0 . 99 x ) − exp( − x ) (a) BHM spline f ( x ) 0 . 0 0 . 5 1 . 0 1 . 5 2 . 0 2 . 5 3 . 0 x 0 . 0000 0 . 0005 0 . 0010 0 . 0015 0 . 0020 0 . 0025 f ( x ) ∝ exp( − 0 . 99 x ) − exp( − x ) (b) BHM spline f ( x ) FIG. 10. Sampling with a sev ere sign problem using 10 5 sampled points (left panel (a)) and 10 7 sampled points (righ t panel (b)). The sampled in tegrals used for the ﬁt in the left panel are consistent with zero, which is correctly captured by the chec k presen ted in Sec. I I I G. In this case, the display ed BHM spline w ould not b e used for further data analysis. 0 . 0 0 . 5 1 . 0 1 . 5 2 . 0 ω 0 τ 0 . 0 0 . 2 0 . 4 0 . 6 0 . 8 1 . 0 G ( τ ) (a) 0 . 0 0 . 5 1 . 0 1 . 5 2 . 0 ω 0 τ − 0 . 010 − 0 . 005 0 . 000 0 . 005 0 . 010 ˜ G ( τ ) − G ( τ ) (b) BHM spline basis pro jection FIG. 11. Diagrammatic Mon te Carlo sampling of the F r¨ ohlich p olaron Green’s function. The left panel (a) shows a very precise numerical calculation of the function, while the right panel (b) shows the diﬀerence b etw een the ﬁtted and the reference function for the BHM and the basis pro jection metho d. B. V. Svistunov, Phys. Rev. B 62 , 6317 (2000), cond- mat/9910025. 10 W. H. Press, S. A. T eukolsky , W. T. V etterling, and B. P . Flannery , Numerical Re cipes in C++ : The Art of Sci- entiﬁc Computing , 2nd ed. (Cambridge Univ ersit y Press, 2002). 11 B. Efron, Ann. Statist. 7 , 1 (1979). 12 H. F r¨ ohlich, H. P elzer, and S. Zienau, Philos. Mag. 41 , 221 (1950).

Restoring a smooth function from its noisy integrals

Original Paper

Comments & Academic Discussion

Leave a Comment

Original Paper

Related Papers

Comments & Academic Discussion

Leave a Comment