Priors for New Physics

Maurizio Pierini (1), Harrison B. Prosper (2), Sezen Sekmen (2), and Maria Spiropulu (1,3)

(1) CERN, CH-1211, Geneva 23, Switzerland
(2) Department of Physics, Florida State University, Tallahassee, Florida 32306, USA
(3) Department of Physics, Caltech, Pasadena, California 91125, USA

Version 1.8 (Dated: November 17, 2021)

Abstract: The interpretation of data in terms of multi-parameter models of new physics, using the Bayesian approach, requires the construction of multi-parameter priors. We propose a construction that uses elements of Bayesian reference analysis. Our idea is to initiate the chain of inference with the reference prior for a likelihood function that depends on a single parameter of interest that is a function of the parameters of the physics model. The reference posterior density of the parameter of interest induces on the parameter space of the physics model a class of posterior densities. We propose to continue the chain of inference with a particular density from this class, namely, the one for which indistinguishable models are equiprobable, and to use it as the prior for subsequent analysis. We illustrate our method by applying it to the constrained minimal supersymmetric Standard Model and two non-universal variants of it.

I. INTRODUCTION

With the start of the Large Hadron Collider (LHC) [1], we have entered an era in which speculation about new physics has given way to detailed experimental study. This has had the welcome consequence of focusing attention on a difficult practical question: given the plethora of models of potential new physics, many depending on multiple unknown parameters, what is the best practical way to navigate the landscape of possibilities? This is a multi-faceted problem, of which undoubtedly the most challenging aspect is devising reliable background estimates for all the final states that are being scrutinized.
Another challenge is the construction of very fast, accurate simulations [2] of new physics models at hundreds of thousands, even millions, of parameter points. This is necessary because, in general, the effective cross section ε(θ)σ(θ), that is, the signal efficiency ε(θ) times the cross section σ(θ), is a function of the parameters θ of the model under investigation.

In this Paper, we shall assume that both of these difficult tasks have been accomplished. Instead, we address another important facet of the problem, namely, that of extracting information about a given new physics model once LHC data become sufficiently abundant to test it. We propose a new method that is applicable to any multi-parameter model that yields a prediction of the expected signal count. We illustrate the method using three supersymmetric (SUSY) models [3]: the constrained minimal supersymmetric Standard Model (CMSSM) [4] and two non-universal variants of it.

The availability of increasingly powerful computers has made it possible to study multi-parameter models in a holistic manner. Indeed, it has become routine to use techniques such as Markov Chain Monte Carlo (MCMC) [5] to explore the multi-dimensional parameter spaces of models such as SUSY [6]. This is another welcome development. Recent work on SUSY models [7] has shown that a holistic approach can yield qualitatively different conclusions from those arrived at using the traditional approach based on benchmarks [8].

SUSY models have been studied using both frequentist [9] and Bayesian [10] methods. The frequentist studies typically construct confidence regions and obtain the best-fit point. Sometimes, information about individual parameters or pairs of parameters is obtained by projecting the likelihood function onto the parameters of interest.
This procedure is actually a frequentist/Bayesian hybrid, which amounts to using a flat prior on the parameters. A conceptually more consistent, albeit approximate, frequentist approach is to construct a profile likelihood [12-14] for the parameter of interest. For example, if the parameter of interest is m0 and l(m0, ω) ∝ p(x | m0, ω) is the likelihood function for observations x, where ω denotes the remaining parameters, the profile likelihood for m0 is l_P(m0) ∝ p(x | m0, ω̂(m0)), where ω̂(m0) is the best-fit value of the parameters ω for a given value of m0. The profile likelihood l_P(m0) is then used as if it were a true likelihood.

We propose to use the Bayesian approach [15] because of its strong theoretical foundations, its generality, and the fact that it is conceptually straightforward: given a prior π(θ) defined on the parameter space Θ of the model, where in general θ is multi-dimensional, and a likelihood p(x | θ), one computes the posterior density p(θ | x) ∝ p(x | θ) π(θ), from which a myriad of details can be extracted, such as point estimates or credible regions. It is also possible to make predictions about which data would be most useful to take next, and one can rank models according to their concordance with observations. Moreover, all manner of uncertainties, irrespective of their provenance and how we choose to label them (statistical, systematic, theoretical, best guess, etc.), can be accounted for in a conceptually coherent and unified manner.

Every fully Bayesian analysis, however, must contend with the problem of constructing a prior π(θ) on the parameter space of the model under investigation. This task is especially difficult in circumstances in which intuition provides little guidance, as is invariably the case for multi-parameter models.
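As a concrete illustration of the profiling step described above, consider a toy Gaussian likelihood in which the nuisance parameter (the variance ω) has a closed-form conditional best fit. The data and parameter values here are invented; this is a sketch of the general construction, not of any specific analysis in this Paper.

```python
import math

def log_likelihood(x, m0, omega):
    # Gaussian log-likelihood with mean m0 and variance omega (the nuisance)
    n = len(x)
    return (-0.5 * n * math.log(2.0 * math.pi * omega)
            - sum((xi - m0) ** 2 for xi in x) / (2.0 * omega))

def profile_log_likelihood(x, m0):
    # Profile out omega: replace it by its conditional MLE omega_hat(m0)
    omega_hat = sum((xi - m0) ** 2 for xi in x) / len(x)
    return log_likelihood(x, m0, omega_hat)

# Invented toy data; the profile likelihood l_P(m0) peaks at the sample mean
data = [1.2, 0.8, 1.1, 0.9, 1.0]
lp = {m0: profile_log_likelihood(data, m0) for m0 in (0.5, 1.0, 1.5)}
```

l_P(m0) is then treated as if it were a one-parameter likelihood for m0; in realistic fits the conditional best fit ω̂(m0) must be found numerically.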
Current studies, which place flat or logarithmic priors on the parameters of new physics models, are sensitive to the choice of prior [10]; therefore, the choice of prior is a critical issue that must be squarely faced. This is the main purpose of this Paper.

The current sensitivity of results to the prior is sometimes construed as an intrinsic difficulty with the Bayesian approach. In fact, the correct conclusion to be drawn is that it is not yet possible to place robust constraints on all the parameters of a typical multi-parameter model of new physics, a conclusion that is independent of the method used to extract information about the model, be it frequentist or Bayesian. The difficulty is not that results are sensitive to the prior; this fact tells us something obvious and important: we need more data and better analyses. Rather, the difficulty is that flat priors on multi-dimensional parameter spaces can lead to pathological results, which may not be apparent without a careful study. Flat priors have been used successfully, witness the recent discovery of single top quark production by DØ [16] and CDF [17]. But these results were obtained with a flat prior applied to a single carefully chosen parameter, namely, the cross section [18].

Given that our multi-dimensional intuition may be unreliable, we are faced with a choice: either abandon the Bayesian approach, and, in our view, abandon an extremely powerful set of ideas, or, as we propose, put intuition aside and use a formal procedure with mathematically verifiable properties to place priors on the parameter spaces. We propose a solution inspired by a set of Bayesian methods called reference analysis [19-21], whose key construct is the reference prior.
We advocate the use of reference priors because they lead to inferences with useful properties, including invariance under one-to-one transformations of the parameters and excellent frequentist coverage. The latter property means that the (Bayesian) credible regions are also approximate (frequentist) confidence regions. Moreover, the reference prior can be perturbed in a controlled way to check the robustness of conclusions. Having initiated the inference chain with a reference prior, we can use Bayesian methods to

• quantify the statistical significance of a signal,
• rank models according to their concordance with observations,
• estimate model parameters, and
• design an optimal analysis for a given model and a given integrated luminosity.

In this Paper, in addition to the main task of constructing multi-parameter priors, we address the first two points, the statistical significance of a signal and model ranking, and defer consideration of the last two to a future publication.

Bayesian reference analysis [19-21] provides a principled way to approach the problem of multi-parameter priors. However, while the solution it proposes is computationally feasible for one-parameter problems, it rapidly becomes computationally prohibitive for multi-parameter problems using current algorithms. Since the one-parameter problem is a well-understood, solved problem, our proposed solution begins with the solution of a one-parameter problem and proceeds to the multi-parameter problem by imposing two requirements on the multi-parameter prior: consistency and equiprobability, both of which are described in detail below. Our solution proceeds in four steps:

1. first, we compute the marginal likelihood by integrating the likelihood function with respect to an evidence-based prior over all parameters except the parameter of interest;
2. next, we compute the reference prior associated with the marginal likelihood;

3. then, we compute the reference posterior density for the parameter of interest; and

4. finally, we map the reference posterior density to a posterior density on the parameter space of each multi-parameter model under study.

Clearly, these steps can be applied to any experiment that has a single parameter of interest. In this Paper, we apply them to a single count experiment because it yields the simplest possible analysis and the key calculations can be done exactly. In the following sections, we describe the single count model, its reference prior, and our method for mapping the signal posterior density to the parameter space of a given multi-parameter model.

The Paper is organized as follows. In Sec. II, we give a detailed description of the single count model and its associated reference prior. Our construction of multi-parameter priors is described in Sec. III. In Sec. IV, we illustrate the method using three SUSY models, a 2-parameter CMSSM and two 5-parameter non-universal generalizations. We end with a summary and concluding remarks.

II. THE SINGLE COUNT MODEL

In the context of the LHC, the single count model describes the results of a "cut and count" analysis in which N proton-proton collision events are found to pass a given set of selection criteria, that is, cuts. The expected number of events, n, is given by

    n = μ + s,  (1)

where μ is the expected number of Standard Model background events and s ≥ 0, assumed to be purely additive, is the expected number of signal events due to (unknown) new physics. The observed count is denoted by N and the expected (that is, mean) count is denoted by n. We shall use upper case letters for observed values and lower case letters for expected values.
The result of any experiment can be encoded in its likelihood function, the probability density function (pdf) of the observations (sometimes called the probability mass function if the data are discrete) evaluated at the actual observations. From the likelihood function and the prior density for the expected signal and background, we can compute the posterior probability Pr(s | N) = p(s | N) ds of the signal, that is, the probability that the expected signal lies in the interval δ = (s, s + ds), given the observed count N.

We choose to parametrize the likelihood in terms of the expected signal s rather than the cross section σ, as is done in Ref. [21], so that the results of the counting experiment remain independent of the new physics model. The cuts may have been motivated by a specific model of new physics; however, the signal posterior density can be interpreted using any physics model that makes predictions for the expected signal in the final states considered. Moreover, as we shall see, we can devise a purely Bayesian measure of the degree to which the observation of N events favors the hypothesis s > 0 rather than the background-only hypothesis s = 0, independently of any presumed model of new physics. This measure can be readily generalized to a multi-count analysis.

For a counting experiment that yields N events, we make the usual assumption that the likelihood function is given by a Poisson distribution,

    p(N | μ, s) = Poisson(N | μ + s),  (2)

with mean μ + s. The associated 2-parameter prior, π(μ, s), can be factorized in two ways,

    π(μ, s) = π(s | μ) π(μ),  Method 1,  (3)
    π(μ, s) = π(μ | s) π(s),  Method 2,  (4)

both of which were considered in Ref. [21]. Here, we consider Method 2 only.
We do so because we can reduce the likelihood function p(N | μ, s) to a function of the single parameter s through marginalization,

    p(N | s) = ∫₀^∞ p(N | μ, s) π(μ | s) dμ,  (5)

which permits the application of the 1-parameter reference prior algorithm [21] to compute the reference prior for the expected signal, while avoiding the technical issue of nested compact sets [21]. Following Ref. [21], we model the evidence-based prior π(μ | s) for the expected background by a gamma density,

    π(μ | s) = π(μ) = b (bμ)^(Y − 1/2) e^(−bμ) / Γ(Y + 1/2),  (6)

where b and Y are known constants. We further assume that the prior is independent of the expected signal, s. (See Appendix A for its derivation.) Then, we integrate over μ to arrive at the 1-parameter marginal likelihood,

    p(N | s) = ∫ p(N | μ, s) π(μ) dμ
             = ∫ [(μ + s)^N / N!] e^(−μ−s) [b (bμ)^(Y − 1/2) / Γ(Y + 1/2)] e^(−bμ) dμ
             = [b/(b + 1)]^(Y + 1/2) Σ_{k=0}^{N} v_{Nk} Poisson(k | s),

where

    v_{ik} ≡ [Γ(Y + 1/2 + i − k) / (Γ(Y + 1/2) (i − k)!)] [1/(b + 1)]^(i − k),  (7)

for the expected signal, s, whose reference prior, π(s), is calculated in the next section.

A. Reference Priors

When we know almost nothing about a potential signal, it seems prudent to use a prior for the expected signal that is as noncommittal as possible. The approach in high energy physics has been to use a flat prior [18] for a parameter about which little is known, or for which one wishes to act as if that is the case. But, for multi-parameter models, our intuition is ill-equipped to choose the parameterization in terms of which the prior should be flat. We therefore propose a different approach. Our idea is to construct a prior for each new physics model starting with the reference prior for an experiment with a single parameter of interest, here the expected signal s for a single count experiment.
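The single-parameter likelihood in question is the marginal likelihood of Eq. (7), which is easy to evaluate directly. A minimal numerical sketch (the prior constants Y and b below are invented for illustration):

```python
import math

def v(i, k, Y, b):
    # Coefficients v_ik defined in Eq. (7)
    j = i - k
    return (math.gamma(Y + 0.5 + j)
            / (math.gamma(Y + 0.5) * math.factorial(j))) * (b + 1.0) ** (-j)

def poisson(k, mean):
    return mean ** k * math.exp(-mean) / math.factorial(k)

def marginal_likelihood(N, s, Y, b):
    # Closed form of Eq. (7): p(N | s) after integrating out the gamma
    # background prior of Eq. (6)
    return (b / (b + 1.0)) ** (Y + 0.5) * sum(
        v(N, k, Y, b) * poisson(k, s) for k in range(N + 1))

# Invented background prior constants: Y = 2.5, b = 1 (prior background mean 3)
pN = [marginal_likelihood(N, 1.0, 2.5, 1.0) for N in range(60)]
```

Summed over N, p(N | s) is a properly normalized probability mass function, which provides a quick check of the algebra.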
By construction, a reference prior [19-21], on average and given unlimited data, maximizes the influence of the data relative to the prior. The intuition underlying the construction of such priors is that the influence of the observations will be greatest if the "separation" between the posterior density and the prior is as large as possible. Reference analysis [23] quantifies the separation between the two densities p(s | N) and π(s) using the Kullback-Leibler (KL) divergence, which for the particular problem we address is given by

    D[π, p] ≡ ∫ p(s | N) ln [p(s | N) / π(s)] ds.  (8)

This non-negative quantity, which is invariant under one-to-one transformations of s and zero if and only if the densities p(s | N) and π(s) are identical, may also be interpreted as a measure of the information gained from the (single count) experiment.

Since we wish to maximize the influence of the observations, we might be tempted to maximize Eq. (8) with respect to the prior, π(s). This, however, would be unsatisfactory because the prior would then depend on the specific observations, which would enter the posterior density twice: once in the prior and once in the likelihood. It is more satisfactory to use the average of D[π, p] over all possible observations. Integration over the space of observations, standard practice in the frequentist approach, may seem a decidedly un-Bayesian thing to do. However, the likelihood principle [26], the idea that inferences should be based on the observed data only, makes sense only if we actually have observations. Obviously, before we perform the analysis, we do not know the value of the count N; therefore, since the count is unknown, we should average over all possible realizations of N. Once we know the count, our inferences should be based on N only.
For completeness, we give the key details of the reference prior algorithm in Appendix B. The calculation of reference priors simplifies considerably for posterior densities that are asymptotically normal, that is, that become Gaussian as more and more data are included. In this case, the reference prior coincides with the Jeffreys prior [23],

    π(s) = √( E[ −d² ln p(N | s) / ds² ] ),  (9)

where for the single count model the expectation is with respect to the (marginal) likelihood p(N | s), given in Eq. (7). For a counting experiment, the asymptotic form of the posterior density p(s | N) is indeed Gaussian. Therefore, the reference prior for p(N | s) can be computed using Eq. (9). Adapting the results of Ref. [21], we find

    π(s) = √( Σ_{i=0}^{∞} [T⁰_i(s) − T¹_i(s)/s]² / T⁰_i(s) ),

where

    Tᵐ_i(s) ≡ Σ_{k=0}^{i} kᵐ v_{ik} Poisson(k | s)  for m = 0, 1,  (10)

and v_{ik} are the coefficients defined in Eq. (7). The complete reference prior, π(μ, s), is the product of Eqs. (6) and (10), while the complete reference posterior density is

    p(μ, s | N) = p(N | μ, s) π(μ, s) / ∫₀^∞ ds ∫₀^∞ dμ p(N | μ, s) π(μ, s).  (11)

The reference posterior density for the expected signal is obtained by integrating over μ,

    p(s | N) = ∫₀^∞ p(μ, s | N) dμ
             = p(N | s) π(s) / ∫₀^∞ p(N | s) π(s) ds,  (12)

where p(N | s) and π(s) are given by Eqs. (7) and (10), respectively. (See Appendix C for more technical details.)

B. A Measure of Signal Significance

Assessing the statistical significance of a signal is a standard analysis task in high energy physics [11], one which traditionally has been done with a p-value [12]. Here we propose an alternative measure that uses the reference posterior density p(μ, s | N).
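Equation (10) above involves only the coefficients v_ik and Poisson probabilities, so the reference prior can be evaluated numerically. A minimal sketch, with an invented truncation and invented prior constants, using logarithms of gamma functions for stability; as a sanity check, for a very sharp background prior with mean μ0 = (Y + 1/2)/b, the result should scale like the known-background Jeffreys prior 1/√(s + μ0).

```python
import math

def log_v(i, k, Y, b):
    # log of the coefficients v_ik of Eq. (7), via lgamma for numerical stability
    j = i - k
    return (math.lgamma(Y + 0.5 + j) - math.lgamma(Y + 0.5)
            - math.lgamma(j + 1.0) - j * math.log(b + 1.0))

def T(m, i, s, Y, b):
    # T^m_i(s) = sum_k k^m v_ik Poisson(k | s), as in Eq. (10)
    return sum((k ** m) * math.exp(log_v(i, k, Y, b)
                                   + k * math.log(s) - s - math.lgamma(k + 1.0))
               for k in range(i + 1))

def reference_prior(s, Y, b, imax=60):
    # Eq. (10), truncated at i = imax; defined up to an overall constant
    total = 0.0
    for i in range(imax + 1):
        t0 = T(0, i, s, Y, b)
        total += (t0 - T(1, i, s, Y, b) / s) ** 2 / t0
    return math.sqrt(total)

# Sharp background prior with mean mu0 = 3: pi(s) should scale like 1/sqrt(s + 3),
# so pi(1)/pi(6) should be close to sqrt(9/4) = 1.5
Y, b = 2999.5, 1000.0
ratio = reference_prior(1.0, Y, b) / reference_prior(6.0, Y, b)
```

For broad background priors the prior is flatter, but it remains a decreasing function of s, as the Fisher information of a Poisson-like model falls with the expected count.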
Suppose we are given some function δ(μ, s) that measures the separation between the (composite) background plus signal hypothesis, H₁: μ > 0, s > 0, and the (composite) background-only hypothesis, H₀: μ > 0, s = 0. If the separation between the hypotheses were large enough, then presumably we would reject the background-only hypothesis in favor of the alternative. But, since we know neither the expected background μ nor the expected signal s, the natural Bayesian thing to do is to average δ(μ, s) with respect to all possible hypotheses about the values of μ and s,

    d(N) ≡ E[δ(μ, s)] = ∫₀^∞ ds ∫₀^∞ dμ δ(μ, s) p(μ, s | N)
                      = ∫₀^∞ ds ∫₀^∞ dμ δ(μ, s) p(N | μ, s) π(μ, s) / p(N),  (13)

where p(N) is the normalization constant p(N) = ∫₀^∞ ds ∫₀^∞ dμ p(N | μ, s) π(μ, s). If δ(μ, s) is interpreted as a loss function, then d(N) is a measure of the loss incurred, on average, if one were to stubbornly adhere to the background-only hypothesis regardless of the outcome of the experiment. A signal is declared to be statistically significant if d(N) > d*, where d* is some agreed-upon threshold. Moreover, the decision to accept or reject H₀, and thereby reject or accept the alternative H₁, may be taken independently of any model of new physics.

There are many possible choices for the function δ(μ, s). We propose to use the Kullback-Leibler divergence [19, 20],

    δ(μ, s) = Σ_{k=0}^{∞} p(k | μ + s) ln [p(k | μ + s) / p(k | μ)]
            = −s + (μ + s) ln(1 + s/μ),  (14)

between the densities p(k | μ + s) and p(k | μ) associated with hypotheses H₁ and H₀, respectively. For fully specified models, Eq. (14) is simply the expected log-likelihood ratio. We can gain some insight into δ(μ, s) by considering a counting experiment for which s ≪ μ, which characterizes early searches for new physics.
In this limit [40],

    δ(μ, s) = −s + (s + μ) ln(1 + s/μ)
            ≈ −s + (s + μ) [ s/μ − (1/2)(s/μ)² + ⋯ ]
            ≈ s²/(2μ),  (15)

that is, √(2 δ(μ, s)) ≈ s/√μ. This suggests taking the quantity

    q ≡ √(2 d(N)),  (16)

as a Bayesian analog of the well-known (and oft-abused) measure of "signal significance," q = s/√μ. As such, it is an analog of an "n-sigma," that is, the standard re-scaling of a p-value using the single tail area of a normal density [12]. This approximate correspondence provides a simple calibration of d(N).

1. Generalization to Multiple Counts

For an experiment that yields K independent counts, N_k, k = 1, ⋯, K, with expected background and signal counts μ_k and s_k, respectively, the KL divergence is simply the sum

    δ(μ₁, s₁, ⋯) = Σ_{k=1}^{K} δ(μ_k, s_k),  (17)

over terms δ(μ_k, s_k), each of which is given by Eq. (14), while the signal significance measure generalizes to

    d(N₁, ⋯) ≡ E[δ(μ₁, s₁, ⋯)]
             = ∫₀^∞ ds₁ ∫₀^∞ dμ₁ ⋯ ∫₀^∞ ds_K ∫₀^∞ dμ_K δ(μ₁, s₁, ⋯) p(μ₁, s₁, ⋯ | N₁, ⋯)
             = Σ_{k=1}^{K} ∫₀^∞ ds₁ ∫₀^∞ dμ₁ ⋯ ∫₀^∞ ds_K ∫₀^∞ dμ_K δ(μ_k, s_k)
               × [p(N₁ | μ₁, s₁) π(μ₁, s₁)/p(N₁)] ⋯ [p(N_K | μ_K, s_K) π(μ_K, s_K)/p(N_K)]
             = Σ_{k=1}^{K} d(N_k),  (18)

where we have used the fact that the posterior density p(μ₁, s₁, ⋯ | N₁, ⋯) factorizes into a product of K terms, one for each count N_k, each of which integrates to one.

III. MULTI-PARAMETER PRIORS AND MODEL RANKING

A. Multi-Parameter Priors

We have a well-defined reference posterior density for the signal, p(s | N), which satisfies

    ∫₀^∞ p(s | N) ds = 1.  (19)

Our task now is to map it to a density p(θ) on the parameter space Θ of a given physics model. By assumption, the model predicts the expected signal s via a predictor function s = f(θ).
Consequently, the reference posterior density p(s | N) induces, or is consistent with, posterior densities on Θ that satisfy [22]

    p(s | N) = ∫_Θ δ[s − f(θ)] p(θ) dθ.  (20)

Equation (20) is the consistency requirement we alluded to. Note that Eqs. (19) and (20) imply ∫ p(θ) dθ = 1.

Equation (20) determines p(θ) only to within a class. Therefore, we need a plausible way to choose a specific function from that class that would serve as a suitable posterior density, and hence a prior for subsequent analysis. To that end, we note that every point θ ∈ ∆, where ∆ is the image of δ = (s, s + ds) ⊂ ℝ, is associated with the same expected signal s ∈ δ. In that sense, the points in ∆ are indistinguishable; that is, ∆ defines a set of "look-alike" (LL) models. We therefore propose that p(θ) be chosen so that

    every point within ∆ is equiprobable,  (21)

that is, that the density p(θ) be constant over ∆. This choice yields the following expression for p(θ),

    p(θ) = p(s(θ) | N) / A(s(θ)),  (22)

where

    A(s) = ∫_Θ δ[s − f(θ)] dθ  (23)

is the area of the hyper-surface defined by s − f(θ) = 0. This choice is arguably the simplest for p(θ), given that the only information at hand is the reference posterior density for the signal. If, however, one has cogent information about how p(θ) should vary on these hyper-surfaces, then our simple choice can be replaced with something consistent with this information and Eq. (20).

There are two technical challenges in our proposed method. The first is that, in general, we do not have explicit functional forms for the mapping s = f(θ). In practice, in order to calculate the expected signal, we simulate a large number of signal events for a given parameter point θ, apply cuts to these events, and determine what fraction of them survive the cuts; that is, we calculate the signal efficiency ε(θ).
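Because f is tabulated pointwise, Eq. (22) is applied in practice on a grid: parameter points are binned by their predicted signal, and the posterior weight of each narrow signal bin is shared equally among the points that fall in it, a discrete stand-in for dividing by the surface term of Eq. (23). A toy sketch, in which the predictor f and the signal posterior are invented placeholders, not this Paper's simulated yields:

```python
import math
from collections import Counter

def f(theta):
    # Hypothetical predictor s = f(theta); a stand-in for the simulated yield
    m0, m12 = theta
    return 50.0 * math.exp(-(m0 + m12) / 400.0)

def posterior_s(s):
    # Toy stand-in for the reference posterior p(s | N), unnormalized
    return math.exp(-0.5 * (s - 5.0) ** 2 / 4.0)

grid = [(m0, m12) for m0 in range(0, 1000, 20) for m12 in range(100, 600, 20)]

def s_bin(theta, width=0.5):
    # Index of the narrow signal bin containing f(theta)
    return round(f(theta) / width)

counts = Counter(s_bin(th) for th in grid)      # discrete analog of A(s), Eq. (23)
weight = {th: posterior_s(f(th)) / counts[s_bin(th)] for th in grid}  # Eq. (22)

Z = sum(weight.values())
prior = {th: w / Z for th, w in weight.items()}  # proper prior over the grid
```

Points in the same bin (the "look-alike" models) receive approximately equal weight, while the total weight of each bin follows the signal posterior.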
Then, for a given integrated luminosity L, we compute the expected signal using s = ε(θ)σ(θ)L ≡ f(θ), where σ(θ) is the cross section. The second challenge is the calculation of the surface term, Eq. (23). We discuss both of these calculations in Sec. IV, in which we illustrate the practical application of our method. But first, we briefly review the standard Bayesian approach to model ranking.

B. Model Ranking

If Nature is kind to us, we shall eventually start to see signals of new physics at the LHC. Then, the most important tasks will be to characterize the observations experimentally and determine which candidate model best describes them.

Suppose we wish to rank M = 1, ⋯, J candidate models of new physics according to their concordance with the observations. In general, each model will have its own set of parameters θ_M, perhaps differing in meaning and/or dimensionality. The standard Bayesian approach to model ranking is, as usual, direct: calculate the probability of each model M [27] given the observations. The model with the highest probability wins. Given the likelihood function p(data | θ_M, M) and prior π(θ_M, M) = π(θ_M | M) π(M), we first compute the evidence [27],

    p(data | M) = ∫ dθ_M p(data | θ_M, M) π(θ_M | M),  (24)

and then the probability of each model,

    P(M | data) = p(data | M) π(M) / Σ_{M=1}^{J} p(data | M) π(M),  (25)

where π(M) is a discrete prior probability distribution over the space of models. The polemical aspect of Eq. (25) is the need to specify the values of π(M), on which there seems little chance of agreement. If, however, the models are judged to be equally implausible, or if the LHC experiments were to reach an accord to that effect, it would be appropriate to set π(M) = 1/J, in which case Eq. (25) reduces to

    P(M | data) = p(data | M) / Σ_{M=1}^{J} p(data | M).  (26)
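Equations (24)-(26) can be sketched numerically for two invented one-parameter models, each predicting the expected signal s with a proper prior; the count, background, and priors below are illustrative assumptions only, not values used elsewhere in this Paper:

```python
import math

def poisson(N, mean):
    return mean ** N * math.exp(-mean) / math.factorial(N)

def evidence(N, mu, prior, s_max=60.0, n=6000):
    # Eq. (24): integrate the Poisson likelihood against the model's proper prior
    h = s_max / n
    return h * sum(poisson(N, mu + (i + 0.5) * h) * prior((i + 0.5) * h)
                   for i in range(n))

# Two hypothetical models, each with a proper (normalized) prior on s
prior_A = lambda s: math.exp(-s / 5.0) / 5.0     # exponential prior, mean 5
prior_B = lambda s: math.exp(-s / 20.0) / 20.0   # exponential prior, mean 20

N_obs, mu = 4, 3.0                               # invented count and background
zA, zB = evidence(N_obs, mu, prior_A), evidence(N_obs, mu, prior_B)

# Eq. (26): posterior model probabilities under equal model priors
pA, pB = zA / (zA + zB), zB / (zA + zB)
```

Here the small observed excess sits where model A's prior concentrates its mass, so A obtains the larger evidence; with improper priors the two evidences could not even be compared, which is the caveat discussed next.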
Absent such an accord, it is still possible to rank models using their evidences: the larger the evidence, the more favored the model. But there is an important caveat: it is necessary to use proper priors for π(θ_M | M), that is, priors that integrate to one. An improper prior is defined only to within an arbitrary scale factor. Consequently, were such a prior to be used to compute the evidence, the latter would be defined only to within the same arbitrary scale factor. Therefore, in order for the evidences to be well-defined, the priors must be proper. By construction, this is the case for the multi-dimensional priors introduced above.

Models can also be ranked using Bayesian reference analysis. However, we defer that discussion to a future publication.

IV. ILLUSTRATIVE EXAMPLES

Our proposed method for constructing multi-parameter priors is quite general. It can be applied, in principle, to any physics model of any dimensionality, provided that the model makes a prediction for the parameter of interest, which in our case is the expected signal in a counting experiment. For simplicity, however, we illustrate the application of the method using a SUSY model with only two free parameters, for which the results are easily visualized. We then consider two 5-parameter models.

A. 2-D Model

The first model we consider is the sub-model of the CMSSM [4] defined by the free parameters m0 and m1/2, and the fixed parameters tan β = 10, A0 = 0, and μ > 0. We take the CMS benchmark point LM1 [8], defined by the fixed parameters m0 = 60, m1/2 = 250, tan β = 10, A0 = 0, and μ > 0, as our true state of Nature (TSN), which provides the "observed" count N [28]. For each point in a grid of points in the m0-m1/2 plane, including the point LM1, the SUSY spectrum is calculated using SOFTSUSY 3.1 [29] and sparticle decays using SUSYHIT [30].
We generate 1000 LHC events at 7 TeV using PYTHIA 6.4 [31] and approximate the response of the CMS detector [32] to these events using a modified version of the fast detector simulation program PGS [33]. We apply a CMS multijet plus missing transverse energy event selection [34] to the events simulated at each point θ = (m0, m1/2), and we take the background estimates from the CMS analysis in Ref. [34].

Three hypothetical results are considered: i) N = 3 events observed in L = 1 pb⁻¹ of data; ii) N = 270 events observed in 100 pb⁻¹; and iii) N = 1335 events observed in 500 pb⁻¹. In each case, we compute the posterior density p(s | N) for the expected signal count at each point in the m0-m1/2 plane and map it to the posterior density p(m0, m1/2), which we take as the prior π(m0, m1/2). The value of the surface term in this case is simply the length of the curve s − f(m0, m1/2) = 0.

The plots in Fig. 1 show the induced posterior density p(m0, m1/2), and hence prior π(m0, m1/2), for the three integrated luminosities. The plots show several nice features. For low statistics, the prior is featureless in the region to which the experiment has no sensitivity, while the low mass region is disfavored.
FIG. 1: Induced posterior densities on the m0-m1/2 plane for 1 pb⁻¹ (left), 100 pb⁻¹ (center), and 500 pb⁻¹ (right). The TSN is indicated by the black dot.

At moderate luminosity, the prior peaks at the right value, favoring the correct model and, with the same probability, all its LL models. At large luminosity, the prior converges to the correct LL sub-space ∆, which, as noted, is a curve. The fact that the sub-space is not a single point shows that an infinite amount of data does not necessarily guarantee the irrelevance of the prior that initiated the chain of inference. This is why choosing the prior carefully is important. Since the LL sub-space ∆ is extended, it remains sensitive to the initiating prior, which, because of the manner in which we choose to map p(s | N) to p(m0, m1/2), is constant across the LL sub-space. The upshot is that we should expect the initiating prior to become irrelevant only if an analysis is able to break the model degeneracy, so that with an infinite amount of data the LL sub-space collapses to a point or, more realistically, to a very small sub-space over which the variation of the initiating prior is negligible.
The degeneracy between models with the same expected signal count, which we argue is a desirable property, is intrinsic to the approach we propose. However, having defined a prior over the parameter space of the model under study, we can move well beyond a simple counting experiment. SUSY models have the virtue of making numerous predictions that can be tested in a variety of ways. We argue that the interpretation of data at the LHC should be done in a manner that is consistent with all the tested predictions of the model under consideration. To do otherwise risks reaching scientifically untenable conclusions: for example, that a region of parameter space is still allowed when a more complete analysis might say quite the opposite. If we have access to results from different analyses, perhaps from different experiments, we argue that a consistent analysis should incorporate these results whenever possible. The ability to do this in a systematic manner is one of our motivations for addressing the problem of multi-parameter priors. In order to break the model degeneracy, we can incorporate the likelihood associated with a set of additional observables \vec{x} and compute the posterior density p(m_0, m_1/2 | \vec{x}) using the prior π(m_0, m_1/2) computed from the single count analysis. An example is given in Fig. 2, where the function

p(m_0, m_{1/2} \,|\, \vec{x}) \propto p(\vec{x} \,|\, m_0, m_{1/2})\, \pi(m_0, m_{1/2}), \qquad (27)

is shown as a function of m_0 and m_1/2.
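On a finite grid of model points, Eq. (27) is just a reweighting of the prior followed by renormalization. A minimal sketch, with toy numbers standing in for π(m_0, m_1/2) and for the likelihood of the additional observables:

```python
# Toy prior pi(m0, m1/2) on three grid points and a hypothetical likelihood
# p(x | m0, m1/2) for some new set of observables x; all numbers are illustrative.
prior = {(200.0, 300.0): 0.5, (500.0, 300.0): 0.3, (800.0, 450.0): 0.2}
likelihood = {(200.0, 300.0): 0.9, (500.0, 300.0): 0.4, (800.0, 450.0): 0.1}

# Eq. (27): posterior proportional to likelihood times prior, normalized over the grid.
posterior = {t: likelihood[t] * prior[t] for t in prior}
Z = sum(posterior.values())
posterior = {t: p / Z for t, p in posterior.items()}
```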
We consider the set of measured electroweak observables g − 2, BR(b → sγ), BR(B → τν), BR(B → Dτν)/BR(B → Deν), R_ℓ23, D_s → τν, D_s → µν, and ∆ρ, for which the likelihood is

p(\vec{X} \,|\, m_0, m_{1/2}) \propto \prod_i \mathrm{Gaussian}(X_i \,|\, \alpha_i, \sigma_i), \qquad (28)

where α_i = α_i(m_0, m_1/2) is the predicted value of observable i for the model (m_0, m_1/2), computed for each of the observables above using SuperIso [35] and micrOMEGAs 2.4 [36], and X_i ± σ_i is the associated experimental measurement, in which the central value X_i is taken to be the prediction at our TSN and the uncertainty σ_i is taken from the actual measurements quoted by the Particle Data Group [37].

FIG. 2: Posterior density induced on the m_0–m_1/2 plane, after the inclusion of the electroweak observables, for 1 pb⁻¹ (left), 100 pb⁻¹ (center), and 500 pb⁻¹ (right). The TSN is indicated by the black dot. The central values of the electroweak observables are computed at the TSN point, but we use the experimental uncertainties from Ref. [37].

The plots in Fig. 2 show that the electroweak results are helpful in breaking the model degeneracy. We expect this conclusion to remain true for realistic analyses and models.

B.
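The likelihood of Eq. (28) is a product of independent Gaussian terms, one per observable; in practice it is safer to accumulate it in log space. A sketch with hypothetical predictions α_i(θ) and toy measurements (the numbers are placeholders, not the PDG values):

```python
import math

def log_gaussian(x, mu, sigma):
    """Log of a normal density; summing logs avoids underflow for many observables."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))

def log_ew_likelihood(predictions, measurements):
    """Eq. (28): sum over observables of log Gaussian(X_i | alpha_i, sigma_i)."""
    return sum(log_gaussian(X, alpha, sigma)
               for alpha, (X, sigma) in zip(predictions, measurements))

# Hypothetical measurements (X_i, sigma_i) and predictions alpha_i(theta)
# at two model points; only their relative agreement matters here.
measurements = [(0.5, 0.1), (3.2, 0.4), (1.1, 0.2)]
theta_good = [0.52, 3.1, 1.05]  # predictions close to the measured values
theta_bad = [0.9, 1.8, 0.4]     # predictions far from the measured values
```

A model point whose predictions track the measurements receives a much larger likelihood, which is what breaks the signal-count degeneracy.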
5-D Models

We now consider two 5-parameter models that illustrate the more realistic situation in which the use of a regular grid of parameter points in the space Θ rapidly becomes unfeasible due to the well-known "curse of dimensionality". The standard way to circumvent this problem is to sample points using Markov Chain Monte Carlo, which is what we propose to do in order to approximate the posterior density p(θ), where θ now represents a parameter point in the 5-dimensional model space.

1. Models

We define two non-universal extensions of the CMSSM, which we call NUm_0 and NUm_1/2, having non-universal m_0 and non-universal m_1/2, respectively. We choose our TSN from NUm_0, and therefore also refer to it as the "TSN model". We refer to the other model as the "wrong model". Note that the wrong model cannot be used to parametrize the TSN point, due to its universal m_0. The free parameters of the two models, and the parameter values at the TSN, are as follows:

• TSN model: NUm_0 (CMSSM with non-universal m_0):
  – m_0(1,2): 250 GeV at TSN
  – m_0(3) = m_{Hu,d}: 1.5 TeV at TSN
  – m_1/2, where m_1/2 = m_1/2(1,2) = m_1/2(3): 300 GeV at TSN
  – A_0: 0 GeV at TSN
  – tan β: 10 at TSN

• Wrong model: NUm_1/2 (CMSSM with non-universal m_1/2):
  – m_0, where m_0 = m_0(1,2) = m_0(3) = m_{Hu,d}
  – m_1/2(1,2)
  – m_1/2(3)
  – A_0
  – tan β

In both cases, we take the sign of µ to be positive.

2. Priors

Our method follows the common Bayesian strategy of "sacrificing" a small fraction of the data to generate what we have referred to as an initiating prior, that is, a prior that permits the inference chain to proceed. In this example, the multi-parameter priors for the TSN and wrong models are constructed assuming a 100 pb⁻¹ data-set.
We again use the SOFTSUSY [29], SUSYHIT [30], PYTHIA [31] sequence to generate events, but use Delphes [2] to simulate the CMS detector [32], and we apply the same CMS jets plus E/T analysis [34]. For simplicity, we assume that the subsequent analysis is again that of a counting experiment identical to the one used to construct the priors, except that the integrated luminosity is larger. In practice, one would work hard to adapt, improve, and change the analyses as more and more data are accumulated. However, our purpose here is not to do a realistic analysis but simply to illustrate our method. The quantities pertaining to the TSN point, assuming 100 pb⁻¹, are:

  cross section σ = 1.35 pb,
  signal efficiency = 0.412,
  "observed" count N = 169 events,
  background estimate µ̂ = 113 ± 11.3 events,
  sideband yield Y = 100 events,
  sideband/signal region scale factor b = 0.889.    (29)

The reference prior computed using the above values of Y and b is shown in Fig. 3. The reference posterior density p(s|N) is computed using the numbers at the TSN point. However, since it is no longer realistic to use a uniform grid of points, we generate a sample of points θ_i from the reference posterior density p(s|N), with s = f(θ), for each model, using the Metropolis-Hastings algorithm [38] and multiple MCMC chains. Asymptotically, this sampling procedure will produce a density that satisfies Eq. (20). Moreover, to the degree that the chains can thoroughly explore the surfaces s − f(θ) = 0, the generated points will also satisfy Eq. (22); that is, the surface term will be automatically incorporated. The mapping from one to multiple dimensions is discussed further in Appendix D using a 2-dimensional toy model. Figure 4 shows the 1-dimensional marginal densities of the induced prior for the TSN
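The Metropolis-Hastings step used to draw θ from the induced density can be sketched as follows. The target here is a toy stand-in for p(s|N) evaluated at s = f(θ): both the two-parameter mapping f and the Poisson counting likelihood are hypothetical, while the real chains run over the 5-dimensional model space through the full simulation chain:

```python
import math
import random

random.seed(2)

def log_target(theta):
    """Log of the posterior density evaluated at s = f(theta).
    f and p(s|N) are toy stand-ins: f maps a 2-vector to a signal yield,
    and the target is a Poisson likelihood with a fixed toy background."""
    x, y = theta
    s = 5.0 * math.exp(-((x - 1.0) ** 2 + (y - 2.0) ** 2))  # hypothetical f(theta)
    mu = s + 1.0  # fixed toy background
    N = 4         # toy observed count; the N! term is constant and dropped
    return -mu + N * math.log(mu)

def metropolis(n_steps, step=0.5, start=(0.0, 0.0)):
    theta, lt = start, log_target(start)
    chain = []
    for _ in range(n_steps):
        prop = (theta[0] + random.gauss(0.0, step), theta[1] + random.gauss(0.0, step))
        lp = log_target(prop)
        # Metropolis accept/reject; min(0, .) keeps exp() from overflowing.
        if random.random() < math.exp(min(0.0, lp - lt)):
            theta, lt = prop, lp
        chain.append(theta)
    return chain

chain = metropolis(5000)
```

Points accumulated this way populate each iso-f surface in proportion to the target, which is the property needed for Eq. (20).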
model, on which are superimposed the posterior densities.

FIG. 3: The reference prior, π(s), for the single count model, computed using Eq. (10) (line), compared with the same prior computed numerically using Eq. (9) (points).

The 1-dimensional marginals for the wrong model are shown in Fig. 5. In both figures, the location of the TSN point is indicated by the vertical dashed line. Note that in each figure two of the plots are degenerate: the m_1/2(1,2) and m_1/2(3) plots in Fig. 4 for the TSN model, and the m_0(1,2) and m_0(3) plots in Fig. 5 for the wrong model. For the TSN model, most of the peaks of the 1-dimensional densities are near the TSN point, while for the wrong model this is not the case. We can get a better idea of the shape of the posterior densities from their 2-dimensional marginals, which are shown in Fig. 6. The black point in each plot is the TSN point. One feature that seems puzzling at first is that the TSN point does not always lie at the peak of the densities. But the following should be noted. If the hyper-surface s − f(θ) = 0 on which the TSN point lies is larger than another hyper-surface associated with a smaller value of the reference posterior density p(s|N), then it can happen that the value of p(θ) on the TSN hyper-surface is actually smaller than its value on the other hyper-surface, even though the total probability of the TSN hyper-surface is greater than the total probability of the other hyper-surfaces. Figure 7 shows what happens to the prior after multiplication by the likelihood for the electroweak results. As expected, these results make a noticeable change to the prior, in sharp contrast to the result of the counting experiment.
This is, perhaps, not surprising, since the observed count constrains only the signal strength, whereas the electroweak results constrain multiple observables that help break the model degeneracy.

FIG. 4: Induced marginal densities for the TSN model, assuming a 100 pb⁻¹ data-set. The shaded histograms are the priors. The posterior densities, obtained by weighting the sampled points by the likelihood for the counting experiment (dark line) and by the combined likelihood for the electroweak experiments (light line), are superimposed on the priors. The vertical dashed line indicates the position of the TSN point. From these projections, one would conclude that the influence of the result of the counting experiment is negligible, while the influence of the electroweak results is quite evident.

3. Signal Significance

Table I shows how the signal significance, as defined in Eq. (13), increases as a function of integrated luminosity. We expect this number to scale like ∼√L, which indeed it does.
FIG. 5: Induced marginal densities for the wrong model. See Fig. 4 for details.

TABLE I: Signal significance as a function of integrated luminosity for the TSN model.

  Integrated luminosity (fb⁻¹) | "Observed" count N (TSN) | d(N) | Significance √(2 d(N))
  0.5                          | 331                      | 12.2 | 4.9
  1.0                          | 387                      | 13.6 | 5.2
  2.0                          | 660                      | 19.2 | 6.2
  5.0                          | 1754                     | 33.2 | 8.2

4. Model Ranking

As we noted, the purpose of this example is to illustrate the prior construction method. However, it is interesting to see what happens if we try to rank the TSN and wrong models on the basis of the signal strength only. The results are shown in Table II.
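The evidence numbers used for this ranking are averages of the counting-experiment likelihood over each model's prior sample. A minimal sketch, with Gaussian toy samples standing in for the two MCMC chains and toy counts; note that the N! normalization, which the paper's evidence values include, cancels in the ratio and is omitted here:

```python
import math
import random

random.seed(1)

def poisson_log_like(N, mu):
    """Log Poisson likelihood without the constant -log(N!) term."""
    return -mu + N * math.log(mu)

# Prior samples of the expected signal s = f(theta) for two hypothetical models,
# standing in for the MCMC chains drawn from each model's induced prior.
model_A = [random.gauss(50.0, 10.0) for _ in range(20000)]
model_B = [random.gauss(20.0, 10.0) for _ in range(20000)]

N_obs, bkg = 160, 113.0  # toy observed count and background estimate

def log_evidence(samples):
    """Log of Z = (1/n) sum_i p(N | s_i + bkg), via log-sum-exp for stability."""
    logs = [poisson_log_like(N_obs, max(s, 0.0) + bkg) for s in samples]
    m = max(logs)
    return m + math.log(sum(math.exp(l - m) for l in logs) / len(logs))

log_bayes_factor = log_evidence(model_A) - log_evidence(model_B)
```

Model A, whose prior puts more mass near the signal yield preferred by the count, receives the larger evidence, mirroring the weak but consistent ranking in Table II.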
FIG. 6: Induced 2-dimensional marginal posterior densities for the TSN model. The TSN is indicated by the black dot. See text for details.
FIG. 7: Induced 2-dimensional marginal posterior densities for the TSN model, including the effect of the electroweak results. The TSN is indicated by the black dot. See text for details.

We find that, even with the relatively weak constraint afforded by merely counting events, we are able to rank these models consistently, albeit weakly.

TABLE II: Ranking of the TSN and wrong models as a function of integrated luminosity.

  Integrated luminosity | Evidence for TSN model | Evidence for wrong model | Evidence ratio, TSN over wrong
  0.5 fb⁻¹              | 0.00253                | 0.00205                  | 1.233
  1.0 fb⁻¹              | 0.00203                | 0.00164                  | 1.235
  2.0 fb⁻¹              | 0.00102                | 0.00083                  | 1.238
  5.0 fb⁻¹              | 0.00034                | 0.00028                  | 1.245

V. SUMMARY AND CONCLUSIONS

We have proposed a method for building multi-parameter priors that follows the general strategy of constructing a proper prior from a small portion of the data and analyzing the rest using that prior. Since the direct construction of multi-parameter priors with mathematically well-defined properties is a difficult task, we have proposed a method that begins with a simpler task, namely, the construction of a reference prior for an analysis having a single parameter of interest. Together with the likelihood function, the reference prior yields a proper posterior density that is consistent with a class of posterior densities on the parameter space of the physics model under study. We proposed choosing a particular member of this class to serve as the multi-parameter prior for subsequent analyses.
That prior has the property that its density is constant on every hyper-surface indexed by the parameter of interest. Moreover, because it is built from a reference prior, the multi-parameter prior is expected to yield credible regions with excellent frequentist properties. Finally, the robustness of inferences can be assessed by weighting the multi-parameter prior π(θ) by, for example, w(s) = [A(s)/p(s|N)]^r and studying the sensitivity of inferences to the exponent 0 ≤ r ≤ 1. The exponent r permits a smooth interpolation between the reference prior (r = 0) and a flat prior (r = 1).

Our proposed construction must surmount a technical hurdle: generating a sample of points in the parameter space of the physics model with the properties that 1) the number of points on each hyper-surface is proportional to the reference posterior density associated with that hyper-surface, and 2) the points on each hyper-surface are uniformly distributed. We showed, using three illustrative examples, how one might address this question in general. For high-dimensional models, the use of MCMC seems feasible. However, we have found that convergence may be an issue because of the severe degeneracies present when relatively little information is used to create the multi-parameter prior. In a realistic application, it will be necessary to tune the MCMC algorithm to ensure convergence of the Markov chains. It would be useful to explore different sampling methods, such as MultiNest [39], that may be better suited to problems with severe degeneracies. In spite of these challenges, however, we have shown that our method yields priors that give consistent results as more and more data are accumulated. What remains to be done is to apply the method to a real analysis at the LHC. Our expectation is that the method would fare well.
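The interpolation controlled by the exponent r can be checked on a toy sample. Here the "posterior" p(s|N) is a triangular density 2s on [0, 1], and the "hyper-surface area" A(s) is taken to be constant, so r = 1 reweights the sample to a flat prior; everything in this sketch is a toy illustration of the weighting, not the paper's actual densities:

```python
import random

random.seed(0)

# Sample s from the triangular density p(s) = 2s on [0, 1]
# (the maximum of two uniforms has exactly this density).
samples = [max(random.random(), random.random()) for _ in range(10000)]

def reweight(samples, r, post=lambda s: 2.0 * s, area=lambda s: 1.0):
    """Normalized weights w(s) = [A(s)/p(s|N)]**r: r = 0 leaves the sample
    untouched, r = 1 reweights it to a flat prior (A(s) constant here)."""
    w = [(area(s) / post(s)) ** r for s in samples]
    Z = sum(w)
    return [wi / Z for wi in w]

# The weighted mean interpolates from E[s] = 2/3 (r = 0) toward E[s] = 1/2 (r = 1).
mean_r0 = sum(w * s for w, s in zip(reweight(samples, 0.0), samples))
mean_r1 = sum(w * s for w, s in zip(reweight(samples, 1.0), samples))
```

Intermediate values of r trace out a smooth path between the two means, which is the sensitivity scan proposed above.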
Acknowledgments

We thank Jim Berger and José Bernardo for discussions on reference priors and Bayesian methods in general, and Sabine Kraml for discussions on the SUSY models. We also thank Luc Demortier, Bob Cousins, and Kyle Cranmer for several discussions that helped clarify our thoughts. This work was supported in part by the U.S. Department of Energy under grant no. DE-FG02-97ER41022.

[1] The Large Hadron Collider, http://lhc.web.cern.ch/lhc.
[2] Delphes, S. Ovyn, X. Rouby, and V. Lemaitre, [arXiv:0903.2225 [hep-ph]].
[3] J. Wess and B. Zumino, Nucl. Phys. B70, 39 (1974); H. P. Nilles, Phys. Rept. 110, 1 (1984); H. Baer and X. Tata, Weak Scale Supersymmetry: From Superfields to Scattering Events (Cambridge University Press, Cambridge, 2006).
[4] See for example, A. H. Chamseddine, R. L. Arnowitt, and P. Nath, Phys. Rev. Lett. 49, 970 (1982); G. L. Kane, C. F. Kolda, L. Roszkowski, and J. D. Wells, Phys. Rev. D49, 6173 (1994) [hep-ph/9312272].
[5] A. A. Markov, Izvestiya Fiziko-matematicheskogo obschestva pri Kazanskom universitete, 2-ya seriya, tom 15, 135 (1906); A. A. Markov, reprinted in Appendix B of R. Howard, Dynamic Probabilistic Systems, Vol. 1: Markov Chains (John Wiley and Sons, 1971). For a modern textbook introduction see, for example, B. A. Berg, Markov Chain Monte Carlo Simulations And Their Statistical Analysis (World Scientific, Singapore, 2004).
[6] See for example, E. A. Baltz and P. Gondolo, JHEP 0410, 052 (2004) [arXiv:hep-ph/0407039]; C. G. Lester, M. A. Parker, and M. J. White, JHEP 0601, 080 (2006) [hep-ph/0508143]; R. R. de Austri, R. Trotta, and L. Roszkowski, JHEP 0605, 002 (2006) [hep-ph/0602028]; E. A. Baltz, M. Battaglia, M. E. Peskin, and T. Wizansky, Phys. Rev. D74, 103521 (2006) [hep-ph/0602187]; B. C. Allanach, C. G. Lester, and A. M. Weber, JHEP 0612, 065 (2006) [hep-ph/0609295]; B. C. Allanach and C. G. Lester, Comput. Phys. Commun.
179, 256 (2008) [arXiv:0705.0486 [hep-ph]]; L. M. H. Hall and H. V. Peiris, JCAP 0801, 027 (2008) [arXiv:0709.2912 [astro-ph]]; S. Davidson, J. Garayoa, F. Palorini, and N. Rius, JHEP 0809, 053 (2008) [arXiv:0806.2832 [hep-ph]]; H. Baer, S. Kraml, S. Sekmen, and H. Summy, JHEP 0803, 056 (2008) [arXiv:0801.1831 [hep-ph]]; O. Buchmueller, R. Cavanaugh, A. De Roeck, J. R. Ellis, H. Flacher, S. Heinemeyer, G. Isidori, K. A. Olive et al., JHEP 0809, 117 (2008) [arXiv:0808.4128 [hep-ph]]; F. Brummer, S. Fichet, S. Kraml, and R. K. Singh, JHEP 1008, 096 (2010) [arXiv:1007.0321 [hep-ph]]; H. Baer, S. Kraml, A. Lessa, S. Sekmen, and X. Tata, JHEP 1010, 018 (2010) [arXiv:1007.3897 [hep-ph]].
[7] C. F. Berger, J. S. Gainer, J. L. Hewett, and T. G. Rizzo, JHEP 0902, 023 (2009).
[8] G. L. Bayatian et al. [CMS Collaboration], J. Phys. G G34, 995 (2007).
[9] See for example, O. Buchmueller, R. Cavanaugh, A. De Roeck, J. R. Ellis, H. Flacher, S. Heinemeyer, G. Isidori, K. A. Olive et al., Eur. Phys. J. C64, 391 (2009) [arXiv:0907.5568 [hep-ph]]; O. Buchmueller, R. Cavanaugh, D. Colling, A. De Roeck, M. J. Dolan, J. R. Ellis, H. Flacher, S. Heinemeyer et al., Eur. Phys. J. C71, 1583 (2011) [arXiv:1011.6118 [hep-ph]].
[10] See for example, D. E. Lopez-Fogliani, L. Roszkowski, R. R. de Austri, and T. A. Varley, Phys. Rev. D80, 095013 (2009) [arXiv:0906.4911 [hep-ph]]; R. Trotta, F. Feroz, M. P. Hobson, L. Roszkowski, and R. Ruiz de Austri, JHEP 0812, 024 (2008) [arXiv:0809.3792 [hep-ph]]; B. C. Allanach, K. Cranmer, C. G. Lester, and A. M. Weber, JHEP 08, 023 (2007).
[11] R. D. Cousins, J. T. Linnemann, and J. Tucker, Nucl. Instrum. Meth. A595, 480 (2008).
[12] G. Cowan, K. Cranmer, E. Gross, and O. Vitells, Eur. Phys. J. C71, 1554 (2011) [arXiv:1007.1727 [physics.data-an]].
[13] F. Feroz, K. Cranmer, M. Hobson, R. Ruiz de Austri, and R. Trotta, JHEP 1106, 042 (2011) [arXiv:1101.3296 [hep-ph]].
[14] Y. Akrami, P. Scott, J.
Edsjo, J. Conrad, and L. Bergstrom, JHEP 1004, 057 (2010) [arXiv:0910.3950 [hep-ph]].
[15] C. P. Robert, The Bayesian Choice: from Decision-Theoretic Foundations to Computational Implementation (Springer, New York, 2007), 2nd ed.; E. T. Jaynes, Probability Theory: The Logic of Science, edited by G. L. Bretthorst (Cambridge University Press, Cambridge, 2003); A. O'Hagan, Kendall's Advanced Theory of Statistics, Volume 2B: Bayesian Inference (Edward Arnold, London, 1994); H. Jeffreys, Theory of Probability (Oxford University Press, Oxford, 1961), 3rd ed.
[16] V. M. Abazov et al. (D0 Collaboration), Phys. Rev. Lett. 103, 092001 (2009).
[17] T. Aaltonen et al. (CDF Collaboration), Phys. Rev. Lett. 103, 092002 (2009).
[18] I. Bertram, G. Landsberg, J. Linnemann, R. Partridge, M. Paterno, and H. B. Prosper, Fermilab preprint FERMILAB-TM-2104 (2000).
[19] J. M. Bernardo, J. R. Statist. Soc. B 41, 113 (1979); J. O. Berger and J. M. Bernardo, J. Amer. Statist. Assoc. 84, 200 (1989); J. O. Berger and J. M. Bernardo, Biometrika 79, 25 (1992); J. O. Berger and J. M. Bernardo, in Bayesian Statistics 4, edited by J. M. Bernardo, J. O. Berger, A. P. Dawid, and A. F. M. Smith (Oxford University Press, Oxford, 1992), pp. 35-60, http://www.uv.es/~bernardo/1992Valencia4Ref.pdf; J. M. Bernardo, in Handbook of Statistics 25, edited by D. K. Dey and C. R. Rao (Elsevier, Amsterdam, 2005), pp. 17-90, http://www.uv.es/~bernardo/RefAna.pdf.
[20] L. Demortier, in Statistical Problems in Particle Physics, Astrophysics, and Cosmology: Proceedings of PHYSTAT05, Eds. L. Lyons and M. K. Ünel (Imperial College Press, London, 2006), pp. 11-14.
[21] L. Demortier, S. Jain, and H. B. Prosper, Phys. Rev. D 82, 034002 (2010).
[22] D. T. Gillespie, Am. J. Phys. 51, 520 (1983).
[23] J. O. Berger, J. M. Bernardo, and D. Sun, Ann. Statist. 37, 905 (2009), http://www.uv.es/~bernardo/2009Annals.pdf.
[24] F. Feroz, K.
Cranmer, M. Hobson, R. Ruiz de Austri, and R. Trotta, JHEP 1106, 042 (2011) [arXiv:1101.3296 [hep-ph]].
[25] I. J. Myung, V. Balasubramanian, and M. A. Pitt, Proc. Natl. Acad. Sci. USA 97, 11170 (2000); http://www.ncbi.nlm.nih.gov/pmc/articles/PMC17172.
[26] J. O. Berger and R. L. Wolpert, The Likelihood Principle, Lecture Notes–Monograph Series, Vol. 6, Ed. S. S. Gupta (Institute of Mathematical Statistics, Hayward, 1984).
[27] See, for example, D. J. C. MacKay, Bayesian Methods for Adaptive Models, PhD Thesis, Caltech (1992), http://www.inference.phy.cam.ac.uk/mackay/PhD.html.
[28] M. Pierini, H. Prosper, S. Sekmen, and M. Spiropulu, [arXiv:1107.2877 [hep-ph]].
[29] SOFTSUSY, B. C. Allanach, Comput. Phys. Commun. 143, 305 (2002) [hep-ph/0104145].
[30] SUSYHIT, A. Djouadi, M. M. Muhlleitner, and M. Spira, Acta Phys. Polon. B38, 635 (2007) [hep-ph/0609292].
[31] PYTHIA, T. Sjostrand, S. Mrenna, and P. Z. Skands, JHEP 0605, 026 (2006) [hep-ph/0603175].
[32] R. Adolphi et al. [CMS Collaboration], JINST 3, S08004 (2008).
[33] PGS, J. Conway et al., http://physics.ucdavis.edu/~conway/research/software/pgs/pgs4-general.htm.
[34] S. Sekmen, Ph.D. Thesis, CMS TS-2009/025.
[35] SuperIso, F. Mahmoudi, Comput. Phys. Commun. 178, 745 (2008) [arXiv:0710.2067 [hep-ph]]; F. Mahmoudi, Comput. Phys. Commun. 180, 1579 (2009) [arXiv:0808.3144 [hep-ph]].
[36] micrOMEGAs, G. Belanger, F. Boudjema, A. Pukhov, and A. Semenov, Comput. Phys. Commun. 176, 367 (2007) [hep-ph/0607059].
[37] K. Nakamura et al. [Particle Data Group Collaboration], J. Phys. G G37, 075021 (2010).
[38] N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller, J. Chem. Phys. 21, 1087 (1953); W. K. Hastings, Biometrika 57, 97 (1970).
[39] MultiNest, F. Feroz, M. P. Hobson, and M. Bridges, [arXiv:0809.3437 [astro-ph]].
[40] In this limit (essentially, when the two hypotheses H1 and H0 are nearly degenerate), the KL divergence can be interpreted as twice the square of the distance between the associated densities in the space of functions [25].

Appendix A: Derivation of Background Prior

This form for the prior π(µ) can be motivated [21] by considering an experiment comprising two data-sets, S and B. Data-set S is modeled as a mixture of signal and background events with expected background count µ. Data-set B, perhaps a sideband, is presumed to be overwhelmingly dominated by background events, with expected background bµ. Although we do not know µ, we assume that we know the ratio b of the expected background in data-set B to that in data-set S. The expected background bµ for data-set B is estimated by the number of events Y in that data-set. The likelihood for the observed count Y in data-set B is taken to be Poisson(Y | bµ), which, together with its reference prior, ∝ 1/√µ, yields the posterior density p(µ|Y) ∝ exp(−bµ)(bµ)^{Y−1/2}. This posterior density serves as the evidence-based prior π(µ) for the expected background in data-set S.

Appendix B: Definition of the Reference Prior for the Single Count Model

One begins with the information gained from K repetitions of the single count experiment,

I_K[\pi] \equiv \sum_{N_1=0}^{\infty} \cdots \sum_{N_K=0}^{\infty} m(N^{(K)})\, D[\pi, p(s\,|\,N^{(K)})], \qquad (B1)

where

m(N^{(K)}) = \int p(N^{(K)}|s)\,\pi(s)\,ds, \quad \text{with} \quad p(N^{(K)}|s) = \prod_{i=1}^{K} p(N_i|s), \qquad (B2)

is the marginal density for K experiments. The maximization of the expected information gain, I_K[π], with respect to the prior yields the function π_K(s).
By definition [23], the reference prior π(s) is the limit

\pi(s) = \lim_{K\to\infty} \frac{\pi_K(s)}{\pi_K(s_0)}, \quad \text{with} \quad \pi_K(s) = \exp\left\{ \sum_{N_1=0}^{\infty} \cdots \sum_{N_K=0}^{\infty} p(N^{(K)}|s)\, \ln \frac{p(N^{(K)}|s)\, h(s)}{\int p(N^{(K)}|s)\, h(s)\, ds} \right\}, \qquad (B3)

where s_0 is any fixed point in the space of expected signal counts and h(s) is any positive function, such as h(s) = 1. However, since the posterior density for the single count model is asymptotically normal, the reference prior computed using the above algorithm coincides with the Jeffreys prior, Eq. (9).

Appendix C: Calculation of the Marginal Likelihood

Defining the recursive functions

W_0(s,z) = 1, \quad W_k(s,z) = \frac{zs}{k}\, W_{k-1}, \quad k = 1,\ldots,n,
Y_0(z) = 1, \quad Y_k(z) = z\, \frac{y - \tfrac{1}{2} + k}{k}\, \frac{1}{b+1}\, Y_{k-1}, \quad k = 1,\ldots,n, \qquad (C1)

we can write p(n|s) and T_n^m(s) as

p(n|s) = \left( \frac{b}{b+1} \right)^{y+1/2} \sum_{k=0}^{n} W_k(s,z)\, Y_{n-k}(z), \qquad (C2)

T_n^m = \sum_{k=0}^{n} k^m\, W_k(s,z)\, Y_{n-k}(z), \qquad (C3)

with z = 1 for n = 0 and z = e^{−s/n} for n > 0.

Appendix D: Mapping Procedure for a 2D Toy Model

To further illustrate how the mapping from a 1-D posterior density to an n-D parameter space works in practice, we consider a model described by two unknown parameters, x and y. An experimental measurement is available for the quantity ρ = √(x² + y²). One builds the reference prior corresponding to all possible outcomes of the measurement of ρ and derives a reference posterior p(ρ). We now want to find a function π(x, y) that is consistent with the 1-D reference posterior density p(ρ). To solve this problem, we impose two conditions:

• π(x, y) is constant for all points (x, y) corresponding to the same value of ρ. This implies that π(x, y) = π(ρ(x, y)). This makes perfect sense because the only available information on x and y is the measurement of ρ, which cannot break the degeneracy along an iso-ρ contour.
Without any loss of generality, we can then write π(x, y) = p(ρ(x, y))/A(ρ(x, y));

FIG. 8: (left) Induced posterior density p′(x, y) = p(ρ(x, y)), where p(ρ) is the 1-D reference posterior density. (right) Ratio of p′(x, y), marginalized back to ρ via Eq. (20), to the reference posterior density p(ρ). Clearly the two 1-D densities are not the same, as they would have to be if the density p′(x, y) were consistent with p(ρ).

• when marginalized to ρ, through Eq. (20), π(x, y) should recover p(ρ). This consistency requirement, together with the first, is what permits identifying A(ρ(x, y)) with the "area" of the iso-ρ contour.

The first requirement is quite natural if one thinks of the Bayesian analysis as an update of our knowledge about the parameters x and y. The second requirement may need further explanation. Suppose for the moment that the function A(ρ(x, y)) did not enter the problem. Enforcing the first condition would then imply that π(x, y) = p(ρ(x, y)). Consider a measurement of ρ with a Gaussian likelihood. This measurement would translate into a 2-D function of x and y, as shown in the left plot of Fig. 8. Once marginalized, this function gives a function g(ρ) that differs from p(ρ) by a factor linear in ρ, coming from the Jacobian of the (x, y) → ρ marginalization. This is shown in the right plot of Fig. 8, which displays the ratio g(ρ)/p(ρ) as a function of ρ. However, in this specific case we know the form of the function A(ρ); it is simply A(ρ) = 2πρ. Therefore, the correct mapping from 1-D to 2-D yields π(x, y) = p(ρ(x, y))/2πρ, shown in the left plot of Fig. 9, which gives a constant value for the ratio g(ρ)/p(ρ) (see the right plot of Fig. 9), as one would expect for a density π(x, y) that is consistent with p(ρ).

FIG. 9: (left) Induced posterior density p′(x, y) = p(ρ(x, y))/2πρ, where p(ρ) is the 1-D reference posterior density.
(right) Ratio of p′(x, y), marginalized back to ρ via Eq. (20), to the reference posterior density p(ρ). The two 1-D densities are identical, as they should be since, by construction, the density p′(x, y) is consistent with p(ρ).

In the absence of an analytical solution for A(x, y), one can follow a simple numerical procedure that takes full advantage of the fact that A(x, y) = A(ρ(x, y)). This fact implies that, by incorrectly using p′(x, y) = p(ρ(x, y)), one is wrong by a factor that is constant over each iso-ρ contour. This factor is nothing other than the ratio g(ρ)/p(ρ), mapped onto the (x, y) plane (see the left plot of Fig. 10). This simple construction allows one to solve for the integral, Eq. (23), defining A(x, y) without having to perform the integral explicitly; one simply weights each point by g(ρ)/p(ρ), which is shown in the right-hand plot of Fig. 10. When the corrected function π(x, y) is marginalized, the function p(ρ) is recovered by construction. The use of MCMC to sample the (x, y) space makes the procedure even simpler. Rather than scanning the (x, y) plane and associating with each point the value of p(ρ), one samples (x, y) according to p(ρ) directly. This implies that g(ρ) = p(ρ) by construction, as one can easily verify.

FIG. 10: (left) Correction map in the (x, y) plane and (right) the same map in the ρ space.
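The recursion of Eqs. (C1)–(C2) in Appendix C can be cross-checked against a direct numerical integration of the Poisson likelihood over the Gamma background posterior of Appendix A, which is a Gamma density with shape y + 1/2 and rate b. A sketch with small toy values of n, s, y, and b; note that z distributes the factor e^{−s} across the product, since z^k z^{n−k} = e^{−s}, and that, as we read Eq. (C1), the n = 0 case needs the factor e^{−s} attached explicitly for the probabilities to sum to one:

```python
import math

def p_n_given_s(n, s, y, b):
    """Marginal likelihood p(n|s) from the recursion of Eqs. (C1)-(C2)."""
    z = math.exp(-s / n) if n > 0 else 1.0
    W = [1.0]   # W_k = (z*s/k) * W_{k-1}
    Yr = [1.0]  # Y_k = z * (y - 1/2 + k)/k * 1/(b+1) * Y_{k-1}
    for k in range(1, n + 1):
        W.append(z * s / k * W[-1])
        Yr.append(z * (y - 0.5 + k) / (k * (b + 1.0)) * Yr[-1])
    total = (b / (b + 1.0)) ** (y + 0.5) * sum(W[k] * Yr[n - k] for k in range(n + 1))
    return total * math.exp(-s) if n == 0 else total

def p_n_direct(n, s, y, b, dmu=0.01, mu_max=200.0):
    """Cross-check: integrate Poisson(n | s + mu) against the Gamma(y + 1/2, b)
    background posterior of Appendix A, by the midpoint rule."""
    log_norm = (y + 0.5) * math.log(b) - math.lgamma(y + 0.5)
    total, mu = 0.0, 0.5 * dmu
    while mu < mu_max:
        log_pois = -(s + mu) + n * math.log(s + mu) - math.lgamma(n + 1)
        total += math.exp(log_pois + log_norm + (y - 0.5) * math.log(mu) - b * mu) * dmu
        mu += dmu
    return total
```

The two computations agree to numerical precision, and the recursion avoids the overflow that a naive evaluation of s^k e^{−s} would produce for large counts.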
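The consistency condition of Appendix D is also easy to verify numerically: with π(x, y) = p(ρ(x, y))/2πρ on a grid in the (x, y) plane, binning the probability in ρ recovers p(ρ). A sketch with a toy Gaussian reference posterior p(ρ), chosen purely for illustration:

```python
import math

def p_rho(rho):
    """Toy 1-D reference posterior: a Gaussian in rho centered far enough
    from rho = 0 that the truncation to rho > 0 is negligible."""
    return math.exp(-0.5 * ((rho - 1.0) / 0.2) ** 2) / (0.2 * math.sqrt(2.0 * math.pi))

# Grid over the (x, y) plane; pi(x, y) = p(rho)/(2*pi*rho) per Appendix D.
h = 0.005
n = int(2.0 / h)
nbins, wbin = 25, 0.1  # rho histogram on [0, 2.5]
bins = [0.0] * nbins
for i in range(-n, n):
    for j in range(-n, n):
        x, y = (i + 0.5) * h, (j + 0.5) * h
        rho = math.hypot(x, y)
        if rho < nbins * wbin:
            # pi(x, y) * dA accumulated into the rho bin of this cell
            bins[int(rho / wbin)] += p_rho(rho) / (2.0 * math.pi * rho) * h * h

marginal = [mass / wbin for mass in bins]  # histogram estimate of the rho density
```

Without the 1/2πρ Jacobian factor, the binned marginal would be tilted by a factor linear in ρ, which is exactly the discrepancy shown in the right plot of Fig. 8.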