Estimating the null distribution for conditional inference and genome-scale screening

David R. Bickel
July 12, 2021

Ottawa Institute of Systems Biology
Department of Biochemistry, Microbiology, and Immunology
Department of Mathematics and Statistics
University of Ottawa
451 Smyth Road
Ottawa, Ontario, K1H 8M5

Abstract

In a novel approach to the multiple testing problem, Efron (2004; 2007) formulated estimators of the distribution of test statistics or nominal $p$-values under a null distribution suitable for modeling the data of thousands of unaffected genes, non-associated single-nucleotide polymorphisms, or other biological features. Estimators of the null distribution can improve not only the empirical Bayes procedure for which they were originally intended, but also many other multiple comparison procedures. Such estimators serve as the groundwork for the proposed multiple comparison procedure based on a recent frequentist method of minimizing posterior expected loss, exemplified with a non-additive loss function designed for genomic screening rather than for validation.

The merit of estimating the null distribution is examined from the vantage point of conditional inference in the remainder of the paper. In a simulation study of genome-scale multiple testing, conditioning the observed confidence level on the estimated null distribution as an approximate ancillary statistic markedly improved conditional inference. To enable researchers to determine whether to rely on a particular estimated null distribution for inference or decision making, an information-theoretic score is provided that quantifies the benefit of conditioning. As the sum of the degree of ancillarity and the degree of inferential relevance, the score reflects the balance conditioning would strike between the two conflicting terms.
Applications to gene expression microarray data illustrate the methods introduced.

Keywords: ancillarity; attained confidence level; composite hypothesis testing; conditional inference; empirical null distribution; GWA; multiple comparison procedures; observed confidence level; simultaneous inference; simultaneous significance testing; SNP

1 Introduction

1.1 Multiple comparison procedures

1.1.1 Aims of multiple-comparison adjustments

Controversy surrounding whether and how to adjust analysis results for multiple comparisons can be partly resolved by recognizing that a procedure that works well for one purpose is often poorly suited for another since different types of procedures solve distinct statistical problems. Methods of adjustment have been developed to attain three goals, the first two of which optimize some measure of sample space performance:

1. Adjustment for selection. The most common concern leading to multiple-comparison adjustments stems from the observation that results can achieve nominal statistical significance because they were selected to do so rather than because of a reproducible effect. Adjustments of this type are usually based on control of a Type I error rate such as a family-wise error rate or a false discovery rate as defined by Benjamini and Hochberg (1995). Dudoit et al. (2003) reviewed several options in the context of gene expression microarray data.

2. Minimization of a risk function. Stein (1956) proved that the maximum likelihood estimator (MLE) is inadmissible for estimation of a multivariate normal mean under squared error loss, even in the absence of correlation. Efron and Morris (1973) extended the result by establishing that the MLE is dominated by a wide class of estimators derived via an empirical Bayes approach in which the mean is random.
More recently, Ghosh (2006) adjusted $p$-values for multiple comparisons by minimizing their risk as estimators of a posterior probability. In the presence of genome-scale numbers of comparisons, adjustments based on hierarchical models are often much less extreme than those needed to adjust for selection. For two examples from microarray data analysis, Efron (2008) found that posterior intervals based on a local false discovery rate (LFDR) estimate tend to be substantially narrower than those needed to control the false coverage rate introduced by Benjamini et al. (2005) to account for selection, and an LFDR-based posterior mean has insufficient shrinkage toward the null to adequately correct selection bias (Bickel, 2008a).

3. Estimation of null or alternative distributions. Measurements over thousands of biological features available from studies of genome-scale expression and genome-wide association studies have recently enabled estimation of distributions of $p$-values. Early empirical Bayes methods of estimating the LFDR associated with each null hypothesis employ estimates of the distribution of test statistics or $p$-values under the alternative hypothesis (e.g., Efron et al., 2001). Efron (2004; 2007a) went further, demonstrating the value of also estimating the distribution of $p$-values under the null hypothesis provided a sufficiently large number of hypotheses under simultaneous consideration.

While all three aims are relevant to Neyman-Pearson testing, they differ as much in their relevance to Fisherian significance testing as in the procedures they motivate.
Mayo and Cox (2006) pointed out that Type I error rate control is appropriate for making series of decisions but not for inductive reasoning, where the inferential evaluation of evidence is of concern apart from loss functions that depend on how that evidence will be used, which, as Fisher (1973, pp. 95-96, 103-106) stressed, might not even be known at the time of data analysis. Likewise, Hill (1990) and Gleser (1990) found optimization over the sample space helpful for making series of decisions rather than for drawing scientific inferences from a particular observed sample. Cox (1958; 2006) noted that selection of a function to optimize is inherently subjective to the extent that different decision makers have different interests. Further, sample space optimality is often achieved at the expense of induction about the parameter given the data at hand; for example, optimal confidence intervals result from systematically stretching them in samples of low variance and reducing them in samples of high variance relative to their conditional counterparts (Cox, 1958; Barnard, 1976; Fraser and Reid, 1990; Fraser, 2004a,b).

The suitability of the methods of both of the first two goals for decision rules as opposed to inductive reasoning is consistent with the observation that control of Type I error rates may be formulated as a minimax problem (e.g., Lehmann 1950; Wald 1961, §1.5), indicating that the second of the above aims generalizes the first.
Although corrections in order to account for selection are often applied when it is believed that only a small fraction of null hypotheses are false (Cox, 2006), the methods of controlling a Type I error rate used to make such corrections are framed in terms of rejection decisions and thus may depend on the number of tests conducted, which would not be the case were the degree of correction a function only of prior beliefs.

By contrast with the first two aims, the third aim, improved specification of the alternative or null distribution of test statistics, is clearly as important in significance testing as in fixed-level Neyman-Pearson testing. In short, while the first two motivations for multiple comparison procedures address decision-theoretic problems, only the third pertains to significance testing in the sense of impartially weighing evidence without regard to possible consequences of actions that might be taken as a result of the findings.

1.1.2 Estimating the null distribution

Because of its novelty and its potential importance for many frequentist procedures of multiple comparisons, the effect of relying on the following method due to Efron (2004; 2007a; 2007b) of estimating the null distribution will be examined herein. The method rests on the assumption that about 90% or more of a large number of $p$-values correspond to unaffected features and thus have a common distribution called the true null distribution. If that distribution is uniform, then the assumed null distribution of test statistics with respect to which the $p$-values were computed is correct. In order to model the null distribution as a member of the normal family, the $p$-values are transformed by $\Phi^{-1} : [0, 1] \to \mathbb{R}^1$, the standard normal quantile function.
The parameters of that distribution are estimated either by fitting a curve to the central region of a histogram of the transformed $p$-values (Efron, 2004) or, as used below, by applying a maximum likelihood procedure to a truncated normal distribution (Efron, 2007b). The main justification for both algorithms is that since nearly all $p$-values are modeled as variates from the true null distribution and since the remaining $p$-values are considered drawn from a distribution with wider tails, the less extreme $p$-values better resemble the true null distribution than do those that are more extreme. Since the theoretical null distribution is standard normal in the transformed domain, deviations from the standard normal distribution reflect departures in the less extreme $p$-values from uniformity in the original domain.

For use in multiple testing, all of the transformed $p$-values of the data set are treated as test statistics for the derivation of new $p$-values with respect to the null distribution estimated as described above instead of the assumed null distribution. Such adjusted $p$-values would be suitable for inductive inference or for decision-theoretic analyses such as those controlling error rates, provided that the true null distribution tends to be closer to the estimated null distribution than it is to the assumed null distribution.

1.2 Overview

The next section presents a confidence-based distribution of a vector parameter in order to unify the present study of null distribution estimation within a single framework. The general framework is then applied to the problem of estimating the null distribution in Section 3.1. Section 3.2 introduces a multiple comparisons procedure for coherent decisions made possible by the confidence-based posterior without recourse to Bayesian or empirical Bayesian models.
Adjusting $p$-values by the estimated null distribution is interpreted as inference conditional on that estimate in Section 4. The simulation study of Section 4.1 demonstrates that estimation of the null distribution can substantially improve conditional inference even when the assumed null distribution is correct marginal over a precision statistic. Section 4.2 provides a method for determining whether the estimated null distribution is sufficiently ancillary and relevant for effective conditional inference or decision making on the basis of a given data set. Section 5 concludes with a discussion of the new findings and methods.

2 Statistical framework

2.1 Confidence levels as posterior probabilities

The observed data vector $x \in \Omega$ is modeled as a realization of a random quantity $X$ of distribution $P_\xi$, a probability distribution on the measurable space $(\Omega, \Sigma)$ that is specified by the full parameter $\xi \in \Xi \subseteq \mathbb{R}^d$. Let $\theta = \theta(\xi)$ denote a parameter of interest in $\Theta$ and $\gamma = \gamma(\xi)$ a nuisance parameter.

Definition 1. In addition to the above family of probability measures $\{P_\xi : \xi \in \Xi\}$, consider a family of probability measures $\{P^x : x \in \Omega\}$, each on the space $(\Theta, \mathcal{A})$, and a set $\mathcal{R}(S) = \{\hat{\Theta}_{\rho,s(\rho)} : \rho \in [0, 1], s \in S\}$ of region estimators corresponding to a set $S$ of shape functions, where $\hat{\Theta}_{\rho,s(\rho)} : \Omega \to \mathcal{A}$ for all $\rho \in [0, 1]$ and $s \in S$. If, for every $\Theta' \in \mathcal{A}$, $x \in \Omega$, and $\xi \in \Xi$, there exist a coverage rate $\rho$ and shape $s(\rho)$ such that

$$P^x(\Theta') = \rho = P_\xi\left(\theta(\xi) \in \hat{\Theta}_{\rho,s(\rho)}(X)\right) \tag{1}$$

and $\hat{\Theta}_{\rho,s(\rho)}(x) = \Theta'$, then the probability $P^x(\Theta')$ is the confidence level of the hypothesis $\theta(\xi) \in \Theta'$ according to $P^x$, the confidence measure over $\Theta$ corresponding to $\mathcal{R}(S)$.

Remark 2. Unless the $\sigma$-field $\mathcal{A}$ is Borel, the confidence level of the hypothesis of interest will not necessarily be defined; cf. McCullagh (2004).
Building on work of Efron and Tibshirani (1998) and others, Polansky (2007) used the equivalent of $P^x$ to concisely communicate a distribution of observed confidence or attained confidence levels for each hypothesis that $\theta$ lies in some region $\Theta'$. The decision-theoretic certainty interpretation of $P^x$ as a non-Bayesian posterior (Bickel, 2009) serves the same purpose but also ensures the coherence of actions that minimize expected posterior loss. Robinson (1979) also considered interpreting the ratio $\rho/(1 - \rho)$ from equation (1) as odds for betting on the hypothesis that $\theta \in \Theta'$.

The posterior distribution need not conform to the Bayes update rule (Bickel, 2009) since decisions that minimize posterior expected loss, or, equivalently, maximize expected utility, are coherent as long as the posterior distribution is some finitely additive probability distribution over parameter space (see, e.g., Anscombe and Aumann, 1963). It follows that an intelligent agent that acts as if $\rho/(1 - \rho)$ are fair betting odds for the hypothesis that $\theta$ lies in a level-$\rho$ confidence region estimated by some region estimator of exact coverage rate $\rho$ is coherent if and only if its actions minimize expected loss with the expectation value over a confidence measure as the probability distribution defining the expectation value (cf. Bickel, 2009). Minimizing expected loss over the parameter space, whether based on a confidence posterior or on a Bayesian posterior, differs fundamentally from the decision-theoretic approach of Section 1.1 in that the former is optimal given the single sample actually observed whereas the latter is optimal over repeated sampling. Section 3.2 illustrates the minimization of confidence-measure expected loss with an application to screening on the basis of genomics data.
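
As a concrete illustration of reading a confidence level as a posterior probability and as betting odds, the following minimal sketch computes the confidence level of a one-sided hypothesis about a scalar mean. It assumes a normally distributed sample and the confidence measure induced by one-sided Student $t$ intervals; the function names and the sample values are hypothetical and are not taken from the paper.

```python
import numpy as np
from scipy import stats


def confidence_level_below(x, theta_prime):
    """Confidence level P^x((-inf, theta')) for the mean of a normal sample.

    Assumption for illustration: the confidence measure is the one induced
    by one-sided Student t intervals.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    standard_error = x.std(ddof=1) / np.sqrt(n)
    return stats.t.cdf((theta_prime - x.mean()) / standard_error, df=n - 1)


def betting_odds(rho):
    """Odds rho / (1 - rho) for betting on a hypothesis of confidence rho."""
    return rho / (1.0 - rho)


# Hypothetical sample of n = 6 log expression ratios for one gene.
sample = np.array([0.8, 1.3, 0.4, 1.1, 0.9, 1.5])
rho_negative = confidence_level_below(sample, 0.0)  # confidence that theta < 0
rho_positive = 1.0 - rho_negative                   # confidence that theta > 0
print(rho_negative, betting_odds(rho_positive))
```

Under these assumptions the two confidence levels sum to one because the point hypothesis $\theta = 0$ receives zero confidence in a continuous parameter space.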
2.2 Confidence levels versus $p$-values

Whether confidence levels agree with $p$-values depends on the parameter of interest and on the chosen hypotheses. If $\theta$ is a scalar and the null hypothesis is $\theta = \theta'$, the $p$-values associated with the alternative hypotheses $\theta > \theta'$ and $\theta < \theta'$ are $P^x((-\infty, \theta'))$ and $P^x((\theta', \infty))$, respectively; cf. Schweder and Hjort (2002). On the other hand, a $p$-value associated with a two-sided alternative is not typically equal to the confidence level $P^x(\{\theta'\})$. Polansky (2007, pp. 126-128, 216) discusses the tendency of the attained confidence level of a point or simple hypothesis such as $\theta = \theta'$ to vanish in a continuous parameter space. That only a finite number of points in hypothesis space have nonzero confidence is required of any evidence scale that is fractional in the sense that the total strength of evidence over $\Theta$ is finite. (Fractional scales enable statements of the form, "the negative, null, and positive hypotheses are 80%, 15%, and 5% supported by the data, respectively.") While the usual two-sided $p$-value vanishes only for sufficiently large samples, the confidence level $P^x(\{\theta'\})$ typically is 0% even for the smallest samples and thus does not lead to the appearance of a paradox of "over-powered" studies. As a remedy, Hodges and Lehmann (1954) proposed testing an interval hypothesis $\theta \in \Theta'$ defined in terms of scientific significance; in this situation, as with composite hypothesis-testing in general, $P^x(\Theta')$ converges in probability to $1_{\Theta'}(\theta)$ even though the two-sided $p$-value does not (Bickel, 2009). (Testing a simple null hypothesis against a composite alternative hypothesis yields a similar discrepancy between a two-sided $p$-value and methods that respect the likelihood principle (Levine, 1996; Bickel, 2008b).)
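
The contrast between an interval hypothesis and a two-sided $p$-value can be made tangible with a small numerical sketch. It assumes a normal mean with known standard deviation, an interval hypothesis $\Theta' = (-0.1, 0.1)$ chosen arbitrarily for illustration, and the idealization that the sample mean equals the true value; none of these numbers come from the paper.

```python
import numpy as np
from scipy import stats

theta_true = 0.05         # hypothetical true effect, inside the interval
lower, upper = -0.1, 0.1  # hypothetical interval hypothesis Theta'
sigma = 1.0               # known standard deviation (assumption)

sample_sizes = np.array([10, 1_000, 100_000])
standard_errors = sigma / np.sqrt(sample_sizes)
x_bar = theta_true        # idealization: the sample mean equals the truth

# Confidence level of Theta' under the confidence distribution N(x_bar, se^2).
interval_confidence = (stats.norm.cdf((upper - x_bar) / standard_errors)
                       - stats.norm.cdf((lower - x_bar) / standard_errors))
# Two-sided p-value for the point null hypothesis theta = 0.
two_sided_p = 2.0 * stats.norm.sf(abs(x_bar) / standard_errors)

for n, conf, p in zip(sample_sizes, interval_confidence, two_sided_p):
    print(f"n={int(n):>6d}  P^x(Theta')={conf:.3f}  two-sided p={p:.3g}")
```

In this idealized setting the confidence level of the interval hypothesis rises toward 1 as the sample grows, while the two-sided $p$-value for the point null falls toward 0 even though the effect is scientifically negligible by the chosen interval.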
There are nonetheless situations that, when using $p$-values for statistical significance, necessitate testing a hypothesis known to be false for all practical purposes. Cox (1977) called a null hypothesis $\theta = \theta'$ dividing if it is not considered because it could possibly be approximately true but rather because it lies on the boundary between $\theta < \theta'$ and $\theta > \theta'$, the two hypotheses of genuine interest. For example, a test of equality of means and its associated two-sided $p$-value often serve the purpose of determining whether there are enough data to determine the direction of the difference when it is known that there is some appreciable difference (Cox, 1977). That goal can be more directly attained by comparing the confidence levels $P^x((-\infty, \theta'))$ and $P^x((\theta', \infty))$. While reporting the ratio or maximum of $P^x((-\infty, \theta'))$ and $P^x((\theta', \infty))$ would summarize the confidence level of each of two regions in a single number, such a number may be more susceptible to misinterpretation than a report of the pair of confidence levels.

2.3 Simultaneous inference

In the typical genome-scale problem, there are $d$ scalar parameters $\theta_1, \theta_2, \ldots, \theta_d$ and $d$ corresponding observables $X_1, X_2, \ldots, X_d$, such that $d \geq 1000$ and $\theta_i = \theta_i(\xi)$ is a subparameter of the distribution of $X_i$, the random quantity of which the observation $x_i \in \Omega_i$ is a realized vector. The $i$th of the $d$ hypotheses to be simultaneously tested is $\theta_i \in \Theta'_i$ for some $\Theta'_i$ in $\Theta_i$, a subset of $\mathbb{R}^1$. Representing numeric tuples under the angular bracket convention to distinguish the open interval $(x, y)$ from the ordered pair $\langle x, y \rangle$, $\theta = \theta(\xi) = \langle \theta_1, \theta_2, \ldots, \theta_d \rangle$ is the $d$-dimensional subparameter of interest and the joint hypothesis is $\theta(\xi) \in \Theta'$, where $\Theta' = \Theta'_1 \times \Theta'_2 \times \cdots \times \Theta'_d$.
For any $\delta \in \{1, 2, \ldots, d - 1\}$, inference may focus on $\delta$ of the scalar parameters as opposed to the entire vector $\theta$. For example, separate consideration of the confidence levels of hypotheses such as $\theta_1 \in \Theta'_1$ or of $\langle \theta_1, \theta_2 \rangle \in \Theta'_1 \times \Theta'_2$ can be informative, especially if $d$ is high. Each of the components of the focus index $\iota = \langle i(1), i(2), \ldots, i(\delta) \rangle$ is in $\{1, \ldots, d\}$ and is unequal to each of its other components. The proper subset $\tilde{\Theta}'_\iota = \Theta'_{i(1)} \times \Theta'_{i(2)} \times \cdots \times \Theta'_{i(\delta)}$ of $\tilde{\Theta}_\iota = \Theta_{i(1)} \times \Theta_{i(2)} \times \cdots \times \Theta_{i(\delta)}$ is defined in order to weigh the evidence for the hypothesis that $\tilde{\theta}_\iota = \langle \theta_{i(1)}, \theta_{i(2)}, \ldots, \theta_{i(\delta)} \rangle \in \tilde{\Theta}'_\iota$. Setting $\Theta'_\iota = \Theta'_1 \times \Theta'_2 \times \cdots \times \Theta'_d$ such that $\Theta'_j = \Theta_j$ for all $j \notin \{i(1), i(2), \ldots, i(\delta)\}$, define the marginal distribution $P^x_\iota$ such that $P^x_\iota(\tilde{\Theta}'_\iota)$ is equal to the confidence level $P^x(\Theta'_\iota)$. Thus, $P^x_\iota$ is a probability measure marginal over all $\theta_j$ with $j \notin \{i(1), i(2), \ldots, i(\delta)\}$.

The following lemma streamlines inference focused on whether $\tilde{\theta}_\iota \in \tilde{\Theta}'_\iota$, or, equivalently, $\theta(\xi) \in \Theta'_\iota$, by establishing sufficient conditions for the confidence level marginal over some of the $d$ components of $\theta$ to be equal to the parameter coverage probability marginal over the data corresponding to those components.

Lemma 3. Considering a focus index $\iota$ and $\tilde{X}_\iota = \langle X_{i(1)}, X_{i(2)}, \ldots, X_{i(\delta)} \rangle$, let $\hat{\Theta}^\iota_{\rho,s(\rho)} : \Omega \to \tilde{\mathcal{A}}_\iota$ be the corresponding level-$\rho$ set estimator of some shape parameter $s(\rho)$ defined such that for every $x \in \Omega$, $\hat{\Theta}^\iota_{\rho,s(\rho)}(x)$ is the canonical projection of $\hat{\Theta}_{\rho,s(\rho)}(x)$ from $\mathcal{A}$ to $\tilde{\mathcal{A}}_\iota$, the $\sigma$-field of the marginal distribution $P^x_\iota$.
If there is a map $\tilde{\Theta}^\iota_{\rho,s(\rho)} : \tilde{\Omega}_\iota \to \tilde{\mathcal{A}}_\iota$ such that $\tilde{\Theta}^\iota_{\rho,s(\rho)}(\tilde{X}_\iota)$ and $\hat{\Theta}^\iota_{\rho,s(\rho)}(X)$ are identically distributed, then $P^x_\iota$ is the confidence measure over $\tilde{\Theta}_\iota$ corresponding to $\{\tilde{\Theta}^\iota_{\rho,s(\rho)} : \rho \in [0, 1], s \in S\}$.

Proof. By the general definition of confidence level (1),

$$P^x_\iota(\tilde{\Theta}'_\iota) = P^x(\Theta'_\iota) = P_\xi\left(\theta \in \hat{\Theta}_{\rho,s(\rho)}(X)\right),$$

where the coverage rate $\rho$ and shape parameter $s(\rho)$ are constrained such that $\hat{\Theta}_{\rho,s(\rho)}(x) = \Theta'_\iota$ for the observed value $x$ of random element $X$. Hence, using $A_\iota$ to denote the event that $\theta_j \in \Theta'_j$,

$$P^x_\iota(\tilde{\Theta}'_\iota) = P_\xi\left(\tilde{\theta}_\iota \in \hat{\Theta}^\iota_{\rho,s(\rho)}(X), A_\iota\right) \tag{2}$$

with the coverage rate $\rho$ and shape parameter $s(\rho)$ restricted such that $\hat{\Theta}^\iota_{\rho,s(\rho)}(x) = \tilde{\Theta}'_\iota$. Considering $j \notin \{i(1), i(2), \ldots, i(\delta)\}$, the event $A_\iota$ satisfies $P_\xi(A_\iota) = 1$ since $\Theta'_j = \Theta_j$, thereby eliminating $A_\iota$ from equation (2). Because $\tilde{\Theta}^\iota_{\rho,s(\rho)}$ exists by assumption, $\tilde{\Theta}^\iota_{\rho,s(\rho)}(\tilde{x}_\iota) = \tilde{\Theta}'_\iota$ results and $\tilde{\Theta}^\iota_{\rho,s(\rho)}(\tilde{X}_\iota)$ replaces $\hat{\Theta}^\iota_{\rho,s(\rho)}(X)$ in equation (2) since they are identically distributed. Therefore,

$$P^x_\iota(\tilde{\Theta}'_\iota) = \rho = P_\xi\left(\tilde{\theta}_\iota \in \tilde{\Theta}^\iota_{\rho,s(\rho)}(\tilde{X}_\iota)\right),$$

where the coverage rate $\rho$ and shape parameter $s(\rho)$ are constrained such that $\tilde{\Theta}^\iota_{\rho,s(\rho)}(\tilde{x}_\iota) = \tilde{\Theta}'_\iota$ for the observed value $\tilde{x}_\iota = \langle x_{i(1)}, x_{i(2)}, \ldots, x_{i(\delta)} \rangle$ of $\tilde{X}_\iota$.

Conditional independence is sufficient to satisfy the lemma's condition of identically distributed region estimators:

Theorem 4. If $X_i$ is conditionally independent of $X_j$ and $\theta_j$ given $\theta_i$ for all $i \neq j$, then, for any focus index $\iota$, there is a map $\tilde{\Theta}^\iota_{\rho,s(\rho)} : \tilde{\Omega}_\iota \to \tilde{\mathcal{A}}_\iota$ such that $\tilde{\Theta}^\iota_{\rho,s(\rho)}(\tilde{x}_\iota) = \hat{\Theta}^\iota_{\rho,s(\rho)}(x)$ with $\tilde{x}_\iota = \langle x_{i(1)}, x_{i(2)}, \ldots, x_{i(\delta)} \rangle$ for every $x \in \Omega$, and the marginal distribution $P^x_\iota$ is the confidence measure over $\tilde{\Theta}_\iota$ corresponding to $\{\tilde{\Theta}^\iota_{\rho,s(\rho)} : \rho \in [0, 1], s \in S\}$.

Proof. By the conditional independence assumption, $\hat{\Theta}^\iota_{\rho,s(\rho)}(X)$ is conditionally independent of $\theta_j$ and $X_j$ for all $j \notin \{i(1), i(2), \ldots, i(\delta)\}$ given $\tilde{\theta}_\iota$, entailing the existence of a map $\tilde{\Theta}^\iota_{\rho,s(\rho)} : \tilde{\Omega}_\iota \to \tilde{\mathcal{A}}_\iota$ such that $\tilde{\Theta}^\iota_{\rho,s(\rho)}(\tilde{X}_\iota)$ and $\hat{\Theta}^\iota_{\rho,s(\rho)}(X)$ are identically distributed. Then the above lemma yields the consequent.

The theorem facilitates inference separately focused on each scalar subparameter $\theta_i$ on the basis of the observation that $X_i = x_i \in \Omega_i$:

Corollary 5. If $X_i$ is conditionally independent of $X_j$ and $\theta_j$ given $\theta_i$ for all $i \neq j$, then, for any $i \in \{1, 2, \ldots, d\}$, the marginal distribution $P^x_{\langle i \rangle}$ is the confidence measure over $\Theta_i$ corresponding to some set $\{\tilde{\Theta}^{\langle i \rangle}_{\rho,s(\rho)} : \rho \in [0, 1], s \in S\}$ of interval estimators, each a map $\tilde{\Theta}^{\langle i \rangle}_{\rho,s(\rho)} : \Omega_i \to \tilde{\mathcal{A}}_{\langle i \rangle}$.

Proof. Under the stated conditions, the theorem entails the existence of a map $\tilde{\Theta}^{\langle i \rangle}_{\rho,s(\rho)} : \tilde{\Omega}_{\langle i \rangle} \to \tilde{\mathcal{A}}_{\langle i \rangle}$ such that $\tilde{\Theta}^{\langle i \rangle}_{\rho,s(\rho)}(\tilde{x}_{\langle i \rangle}) = \hat{\Theta}^{\langle i \rangle}_{\rho,s(\rho)}(x)$ with $\tilde{x}_{\langle i \rangle} = x_i$ for every $x \in \Omega$ and entails that the marginal distribution $P^x_{\langle i \rangle}$ is the confidence measure over $\tilde{\Theta}_{\langle i \rangle}$ corresponding to $\{\tilde{\Theta}^{\langle i \rangle}_{\rho,s(\rho)} : \rho \in [0, 1], s \in S\}$.

Remark 6. The applications of Sections 3 and 4 exploit this property in order to draw inferences from the confidence levels $P^x_{\langle 1 \rangle}((\inf \Theta_1, \theta')), P^x_{\langle 2 \rangle}((\inf \Theta_2, \theta')), \ldots, P^x_{\langle d \rangle}((\inf \Theta_d, \theta'))$ of the hypotheses $\theta_1 < \theta', \theta_2 < \theta', \ldots, \theta_d < \theta'$, respectively, for very large $d$. Here, $\delta = 1$, each subscript $\langle j \rangle$ is the 1-tuple representation of the vector $\iota$ with $j$ as its only component, and $\theta'$ is the scalar supremum shared by all $d$ hypotheses.
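
Under the independence conditions of Corollary 5, each of the $d$ marginal confidence levels can be computed from its own row of data alone. The sketch below does this for a features-by-replicates array, assuming normally distributed observations and Student $t$ confidence intervals; the function name, the handling of missing values as NaN, and the simulated data are illustrative assumptions rather than part of the paper.

```python
import numpy as np
from scipy import stats


def marginal_confidence_levels(x, theta_prime=0.0):
    """Confidence level of theta_i < theta' for each row i of x.

    x is a (d, n) array with NaN marking missing observations. Rows with
    fewer than two observations get NaN because the t statistic is undefined.
    """
    x = np.asarray(x, dtype=float)
    observed = ~np.isnan(x)
    n = observed.sum(axis=1)
    mean = np.where(observed, x, 0.0).sum(axis=1) / np.maximum(n, 1)
    squared_dev = np.where(observed, (x - mean[:, None]) ** 2, 0.0).sum(axis=1)
    levels = np.full(x.shape[0], np.nan)
    ok = n >= 2
    standard_error = np.sqrt(squared_dev[ok] / (n[ok] - 1) / n[ok])
    levels[ok] = stats.t.cdf((theta_prime - mean[ok]) / standard_error,
                             df=n[ok] - 1)
    return levels


# Hypothetical data: d = 4 features, n = 6 replicates, some values missing.
rng = np.random.default_rng(1)
log_ratios = rng.normal(size=(4, 6))
log_ratios[1, :5] = np.nan  # only one observation left for feature 1
log_ratios[2, 0] = np.nan   # one missing value for feature 2
levels = marginal_confidence_levels(log_ratios)
print(levels)
```

Each entry depends only on the observations of its own feature, which is what allows the computation to scale to thousands of features without estimating a joint distribution.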
3 Null estimation for genome-scale screening

3.1 Estimation of the null posterior

In the presence of hundreds or thousands of hypotheses, the novel methodology of Efron (2007a) can improve evidential inference by estimation of the null distribution. While Efron (2007a) originally applied the estimator to effectively condition the LFDR on an estimated distribution of null $p$-values, he noted that its applications potentially encompass any procedure that depends on the distribution of statistics under the null hypothesis. Indeed, the stochasticity of parameters that enables estimation of the LFDR by the empirical Bayes machinery need not be assumed for the pre-decision purpose of deriving the level of confidence that each gene is differentially expressed. Thus, the methodology of Efron (2007a) outlined in Section 1.1.2 in terms of $p$-values can be appropriated to adjust confidence levels (2) since $P^x_{\langle i \rangle}((-\infty, \theta'))$, the level of confidence that a scalar subparameter $\theta_i$ is less than a given scalar $\theta'$, is numerically equal to $p^x_{\langle i \rangle}(\theta')$, the upper-tailed $p$-value for the test of the hypothesis that $\theta_i = \theta'$. Specifically, confidence levels are adjusted in this paper according to the estimated confidence measure under the null hypothesis rather than according to an assumed confidence measure under the null hypothesis.

Treating the parameters indicating differential expression as fixed rather than as exchangeable random quantities arguably provides a closer fit to the biological system in the sense that certain genes remain differentially expressed and other genes remain by comparison equivalently expressed across controlled conditions under repeated sampling.
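
The adjustment of confidence levels by an estimated null distribution can be sketched as follows. This is a simplified stand-in for the maximum likelihood approach attributed above to Efron, not a reproduction of his implementation: it fits a normal distribution truncated to a central window of the $\Phi^{-1}$-transformed levels and then recomputes each level against the fitted null. The window rule (median plus or minus 1.5 robust standard deviations), the function names, and the simulated data are all assumptions made for illustration.

```python
import numpy as np
from scipy import optimize, stats


def estimate_null(levels, window=1.5):
    """Fit N(mu, sigma^2) to the central z-values by truncated-normal MLE."""
    levels = np.clip(np.asarray(levels, dtype=float), 1e-12, 1 - 1e-12)
    z = stats.norm.ppf(levels)
    center = np.median(z)
    scale = stats.iqr(z) / 1.349  # robust scale; the window is an assumption
    a, b = center - window * scale, center + window * scale
    z0 = z[(z >= a) & (z <= b)]   # values treated as coming from the null

    def negative_log_likelihood(params):
        mu, log_sigma = params
        sigma = np.exp(log_sigma)
        log_mass = np.log(stats.norm.cdf((b - mu) / sigma)
                          - stats.norm.cdf((a - mu) / sigma))
        return -(stats.norm.logpdf(z0, loc=mu, scale=sigma) - log_mass).sum()

    result = optimize.minimize(negative_log_likelihood,
                               x0=[center, np.log(scale)],
                               method="Nelder-Mead")
    return result.x[0], float(np.exp(result.x[1]))


def adjust_levels(levels, mu_hat, sigma_hat):
    """Recompute confidence levels with respect to the estimated null."""
    levels = np.clip(np.asarray(levels, dtype=float), 1e-12, 1 - 1e-12)
    return stats.norm.cdf((stats.norm.ppf(levels) - mu_hat) / sigma_hat)


# Simulated example: 95% null features with a non-standard null, 5% affected.
rng = np.random.default_rng(0)
z_sim = np.concatenate([rng.normal(-0.2, 1.5, 9500),
                        rng.normal(6.0, 1.0, 500)])
raw_levels = stats.norm.cdf(z_sim)
mu_hat, sigma_hat = estimate_null(raw_levels)
adjusted = adjust_levels(raw_levels, mu_hat, sigma_hat)
print(mu_hat, sigma_hat)
```

When the fitted standard deviation exceeds 1, levels near 0 or 1 are pulled toward 1/2, so fewer features appear highly certain than under the assumed null.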
While the confidence measure is a probability measure on parameter space, its probabilities are interpreted as degrees of confidence suitable for coherent decision making (§3.2), not as physical probabilities modeling a frequency of events in the system. The interpretation of parameter randomness in LFDR methods is less clear except when the LFDR is seen as an approximation to a Bayesian posterior probability under a hierarchical model.

Example 7. A tomato development experiment of Alba et al. (2005) yielded $n = 6$ observed ratios of mutant expression to wild-type expression in most of the $d = 13{,}340$ genes on the microarray with missing data for many genes. For the $i$th gene, the interest parameter $\theta_i$ is the expectation value of $X_i$, the logarithm of the expression ratio. The hypothesis $\theta_i < 0$ corresponds to downregulation of gene $i$ in the mutant, whereas $\theta_i > 0$ corresponds to upregulation. To obviate estimation of a joint distribution of $d$ parameters, the independence conditions of Corollary 5 are assumed to hold. Also assuming normally distributed $X_i$, the one-sample $t$-test gave the upper-tail $p$-value equal to the confidence level $P^{x_i}_{\langle i \rangle}(\mathbb{R}^-)$ for each gene. The notation is that of Remark 6, except that each $x$ is replaced with $x_i$ to emphasize that only the $i$th observed vector influences the confidence level corresponding to the $i$th parameter. Efron's (2007b) maximum-likelihood method of estimating the null distribution from a vector of $p$-values provided the estimated null confidence measure that is very close to the empirical distribution of the data (Fig. 1), which is consistent with but does not imply the truth of all null hypotheses of equivalent expression ($\theta_i = 0$).
Using that estimate of the null distribution in place of the uniform distribution corresponding to the Student $t$ distribution of test statistics has the effect of adjusting each confidence level. Since extreme confidence levels are adjusted toward 1/2, the estimated null reduces the confidence level both of genes with large values of $P^{x_i}_{\langle i \rangle}(\mathbb{R}^-)$ (confidence of the hypothesis $\theta_i < 0$) and of those with large values of $P^{x_i}_{\langle i \rangle}(\mathbb{R}^+)$ (confidence of the hypothesis $\theta_i > 0$). Fig. 2 displays the effect of this confidence-level adjustment in more detail.

3.2 Genome-scale screening loss

Carlin and Louis (2000, §B.5.2) observed that with a suitable non-additive loss function, optimal decisions in the presence of multiple comparisons can be made on the basis of minimizing posterior expected loss. A simple non-additive loss function is

$$L_{a,c}(M, m) = c M^{1+a} + m, \tag{3}$$

where $M$ and $m$ are respectively the number of incorrect decisions and the number of non-decisions concerning the $d$ components of $\theta$; $M + m \leq d$. The scalars $a \in \mathbb{R}^1$ and $c > 0$ reflect different aspects of risk aversion: $a$ is an acceleration in the sense of quantifying the interactive compounding effect of multiple errors, whereas if $a = 0$, then $c$ is the ratio of the cost of making an incorrect decision to the cost of not making any decision or, equivalently, the benefit of making a correct decision.

Figure 1: The black curve is the estimated cumulative distribution function (CDF) of the confidence levels under the null distribution, which corresponds to equivalently expressed or unaffected genes; the gray curve is the empirical CDF of all confidence levels, including those of differentially expressed or affected genes. Here, observed confidence coefficients corresponding to hypotheses are interpreted as levels of certainty (§§2.1, 3.2).
Departure of the black curve from the diagonal line reflects violation of independence or of the lognormal assumption used to compute the confidence levels. As one-sided $p$-values, these confidence levels would be uniform under the hypothesis of equivalent expression given the assumptions; i.e., the $\Phi^{-1}$-transformed confidence levels of unaffected genes are assumed to be $N(0, 1^2)$, where $\Phi^{-1}$ is the standard normal quantile function. The distribution of $\Phi^{-1}$-transformed confidence levels under that null hypothesis was estimated to instead be $N(-0.21, (1.55)^2)$. The data set, model, and null distribution estimator are those of Example 7.

Figure 2: Impact of null estimation on the confidence level as the measure of certainty or statistical significance. The data set, model, and null distribution estimator are those of Example 7 and Fig. 1. Left panel: The transformed confidence level $\Phi^{-1}\left(P^{x_i}_{\langle i \rangle}(\mathbb{R}^-)\right)$ for gene $i$ versus the expression ratio estimated as the geometric sample mean of the observed expression ratio for the same gene. Here, the confidence level $P^{x_i}_{\langle i \rangle}(\mathbb{R}^-)$ is the degree of certainty of the hypothesis that the mean log-transformed expression ratio is negative or, equivalently, of the hypothesis that the true expression ratio is less than 1. The horizontal lines are drawn at $P^{x_i}_{\langle i \rangle}(\mathbb{R}^-) = 99\%$ and at $P^{x_i}_{\langle i \rangle}(\mathbb{R}^+) = 1 - P^{x_i}_{\langle i \rangle}(\mathbb{R}^-) = 99\%$. Of the original 13,340 genes, 1062 genes have fewer than the two observations needed for the test statistic and 2 genes have infinite normal-transformed confidence levels and thus are not displayed. Each circle corresponds to a gene, with black for $P^{x_i}_{\langle i \rangle}(\mathbb{R}^-; \hat{F}_0)$, the confidence level of $\theta_i \in \mathbb{R}^-$ using the estimated null distribution $\hat{F}_0$ and with gray for $P^{x_i}_{\langle i \rangle}(\mathbb{R}^-; \tilde{F}_0)$, the same except using the assumed null distribution $\tilde{F}_0$.
Right panel: The difference between $P^{x_{\langle i \rangle}}(\mathbb{R}^{-}; \hat{F}_0)$ and $P^{x_{\langle i \rangle}}(\mathbb{R}^{-}; \tilde{F}_0)$ versus the estimated expression ratio. Orange circles represent genes satisfying $P^{x_{\langle i \rangle}}(\mathbb{R}^{-}; \tilde{F}_0) > 99\%$ but $P^{x_{\langle i \rangle}}(\mathbb{R}^{-}; \hat{F}_0) \le 99\%$; black circles represent genes satisfying $P^{x_{\langle i \rangle}}(\mathbb{R}^{+}; \tilde{F}_0) > 99\%$ but $P^{x_{\langle i \rangle}}(\mathbb{R}^{+}; \hat{F}_0) \le 99\%$.

Figure 3: Number $d - m$ of decisions on whether the $i$th gene is overexpressed ($\theta_i > 0$) or underexpressed ($\theta_i < 0$) plotted against $1 + a$, where $a$ is the degree to which the loss per incorrect decision increases with the number of incorrect decisions (3). The sign call or decision on the direction of regulation for each gene was either made or not made such that the following Monte Carlo approximation to the expected loss $E^{x}(L_{a,9}(M, m)) = \int L_{a,9}(M, m)\, dP^{x}$ was minimized based alternately on the assumed null distribution $\tilde{F}_0$ and on the estimated null distribution $\hat{F}_0$. The $k$th of the $10^4$ values of $\theta_i$ was drawn from the frequentist posterior (1) independently for each gene $i$ to compute the correct sign decisions according to the $k$th realization; such correct decisions yielded $M_k$ and $m_k$, the number of incorrect sign decisions and the number of non-decisions. The independence of $\sigma$-fields corresponding to each gene's scalar component of $\theta$ guaranteed by Corollary 5 implies $E^{x}(L_{a,9}(M, m)) \doteq 10^{-4} \sum_{k=1}^{10^4} L_{a,9}(M_k, m_k)$. The data set, model, and null distribution estimator are those of Example 7 and Figs. 1 and 2.

Bickel (2004) and Müller et al. (2004) applied additive loss ($a = 0$) to decisions of whether or not a biological feature is affected. That special case, however, does not accurately represent the screening purpose of most genome-scale studies, which is to formulate a reasonable number of hypotheses about features for confirmation in a follow-up experiment.
More suitable for that goal, $a > 0$ allows generation of hypotheses for at least a few features even on slight evidence without leading to unmanageably high numbers of features even in the presence of decisive evidence. Fig. 3 displays the result of minimizing such an expected loss with respect to the confidence posterior (1) under the above class of loss functions (3) for decisions on the direction of differential gene expression (Example 7). (Taking the expectation value over the confidence measure rather than over a Bayesian posterior measure was justified in Section 2.1.)

4 Null estimation as conditional inference

4.1 Simulation study

To record the effect of null distribution estimation on inductive inference, a simulation study was conducted with $K = 500$ independent samples each of $d = 10{,}000$ independent observable vectors, of which 95% correspond to unaffected and 5% to affected features such as genes or single-nucleotide polymorphisms (SNPs). In Example 7, an affected gene is one for which there is differential gene expression between mutant and wild type. Assuming that each scalar parameter $\theta_i$ is constrained to lie in the same set $\Theta_1$, the one-sided p-value of each observable is equal to $P^{x_{k,i}}((\inf \Theta_1, \theta_0))$, the $k$th confidence level of $\theta_i < \theta_0$, the hypothesis that the parameter of interest for the $i$th observable vector or feature is less than some value $\theta_0$ dividing two meaningful hypotheses, as discussed in Section 2.2 and illustrated in Fig. 2. (This notation differs from that of Remark 6 in adapting the superscript of the confidence level and from that of Example 7 in dropping the subscript of $x_{k,i}$ for ease of reading.) As $\theta_i = \theta_0$ is treated as a null hypothesis for the purpose of estimating or assuming the null distribution, it naturally corresponds to an unaffected feature.
Each confidence level was generated from $\Phi$, the standard normal CDF, of $Z_{k,i} \sim N(0, \varsigma_k^2)$ for $i \in \{1, \ldots, 9500\}$ or of $Z_{k,i} \sim N(5\varsigma_k/2, (5\varsigma_k/4)^2)$ for $i \in \{9501, \ldots, 10^4\}$. Rather than fixing $\varsigma_k$ at 1 for all $k$ (Efron, 2007a, Fig. 5), $\varsigma_k$ was instead allowed to vary across samples in order to model sample-specific variation that influences the distribution of p-values. For every $k$ in $\{1, \ldots, K\}$, $\varsigma_k$ is independent and equal to 2/3 with probability 30%, 1 with probability 40%, or 3/2 with probability 30%. Each simulated sample was analyzed with the same maximum-likelihood method of estimating the null distribution used in the above gene expression example, in which the realized value of $\varsigma_k$ was predicted to be about 3/2 (Fig. 1).

Because $\varsigma_k$ is an ancillary statistic in the sense that its distribution is not a function of the parameter and since estimation of the null distribution approximates conditioning the p-values and equivalent confidence levels on the estimated value of $\varsigma_k$, null estimation is required by the conditionality principle (Cox, 1958), in agreement with the analogy with conditioning on observed row or column totals in contingency tables (Efron, 2007a). See Shi (2008) for further explanation of the relevance of the principle to estimation of the null distribution. Accordingly, performance of each method of computing confidence levels, whether under the assumed null distribution $\tilde{F}_0$ or estimated null distribution $\hat{F}_0$, was evaluated in terms of the proximity of $P^{x_{k,i}}((\inf \Theta_1, \theta_0); F_0)$, the confidence level of $\theta_i < \theta_0$ for trial $k$ and feature $i$ based on the null hypothesis of distribution $F_0 \in \{\hat{F}_0, \tilde{F}_0\}$, to $P^{x_{k,i}}((\inf \Theta_1, \theta_0) \mid \varsigma_k = \sigma_k)$, the corresponding true confidence level conditional on the realized value $\sigma_k$ of $\varsigma_k$ used to generate the simulated data of trial $k$.
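The data-generating step of the simulation can be sketched as follows. This is a minimal illustration rather than the code used for the study: the function name `simulate_trial` and its arguments are hypothetical, and the conditional confidence level is taken to be $\Phi(z/\sigma_k)$ on the assumption that, given $\varsigma_k = \sigma_k$, the null statistic is $N(0, \sigma_k^2)$. The null-estimation step (maximum likelihood, as in the gene expression example) is not reproduced here.

```python
import random
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def simulate_trial(rng, d=10_000, d_affected=500):
    """Simulate one trial k: draw the precision statistic, then the d statistics."""
    # varsigma_k takes the value 2/3, 1, or 3/2 with probability 30%, 40%, 30%.
    sigma_k = rng.choices([2 / 3, 1.0, 3 / 2], weights=[0.3, 0.4, 0.3])[0]

    # Unaffected features: Z ~ N(0, sigma_k^2).
    z = [rng.gauss(0.0, sigma_k) for _ in range(d - d_affected)]
    # Affected features: Z ~ N(5 sigma_k / 2, (5 sigma_k / 4)^2).
    z += [rng.gauss(2.5 * sigma_k, 1.25 * sigma_k) for _ in range(d_affected)]

    # Confidence levels under the assumed N(0, 1) null.
    conf_assumed = [PHI(v) for v in z]
    # Confidence levels conditional on the realized sigma_k (assumption: Phi(z / sigma_k)).
    conf_conditional = [PHI(v / sigma_k) for v in z]
    return sigma_k, z, conf_assumed, conf_conditional


sigma_k, z, conf_assumed, conf_conditional = simulate_trial(random.Random(1))
```

Repeating the call for $K = 500$ independent trials would give the full set of simulated samples; the last `d_affected` entries of each list correspond to the affected features.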
For some $\alpha \in [0, 1]$, the conservative error of relying on $F_0$ as the distribution under the null hypothesis for the $k$th trial is the average difference in the number of confidence levels incorrectly included in $B = [\alpha, 1 - \alpha]$ and the number incorrectly included in $\bar{B} = [0, 1] \setminus B$:

$$\frac{\sum_{i \in \mathcal{I}} \left[ 1_{B}\left(P^{x_{k,i}}(\Theta_1'; F_0)\right) 1_{\bar{B}}\left(P^{x_{k,i}}(\Theta_1' \mid \sigma_k)\right) - 1_{B}\left(P^{x_{k,i}}(\Theta_1' \mid \sigma_k)\right) 1_{\bar{B}}\left(P^{x_{k,i}}(\Theta_1'; F_0)\right) \right]}{|\mathcal{I}|}, \tag{4}$$

where $\Theta_1' = (\inf \Theta_1, \theta_0)$ and where $\mathcal{I} = \{1, \ldots, 9500\}$ for the unaffected features or $\mathcal{I} = \{9501, \ldots, 10^4\}$ for the affected features. Here, $\alpha = 1\%$ to quantify performance near confidence values relevant to the inference problem of interpreting the value of $P^{x_{k,i}}((\inf \Theta_1, \theta_0); F_0)$ as a degree of evidential support for $\theta_i < \theta_0$. Values of the conservatism (4) for the simulation study described above appear in Fig. 4.

To determine the effect of analyzing confidence levels that are valid marginal (unconditional) p-values for the mixture distribution, the confidence levels valid given $\varsigma_k = 1$ were transformed such that those corresponding to unaffected features are tail-area probabilities under the marginal null distribution:

$$P_{\theta_0}(Z_{k,i} < z_{k,i}) = \sum_{\sigma \in \{2/3,\, 1,\, 3/2\}} P(\varsigma_k = \sigma)\, P_{\theta_0}(Z_{k,i} < z_{k,i} \mid \varsigma_k = \sigma),$$

where $\Phi(z_{k,i})$ or $P_{\theta_0}(Z_{k,i} < z_{k,i})$ is the observed confidence level of $\theta_{k,i} < \theta_0$ before or after transformation, respectively. Fig. 5 displays the results.

4.2 Merit of estimating the null distribution

While the degree of undesirable conservatism illustrates the potential benefit of null estimation (§4.1), it does not provide case-specific guidance on whether to estimate the null distribution for a given data set generated by an unknown distribution.
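For concreteness, the conservative error (4) and the marginal transformation of Section 4.1 can be sketched as below. The function names `conservative_error` and `marginal_null_cdf` are hypothetical, the inputs are plain lists of confidence levels for one trial and one feature set $\mathcal{I}$, and the conditional null CDF is again assumed to be $\Phi(z/\sigma)$.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def conservative_error(conf_f0, conf_true, alpha=0.01):
    """Equation (4): conf_f0 are levels based on F_0, conf_true those given sigma_k."""

    def in_b(p):
        # Indicator of B = [alpha, 1 - alpha]; its complement is B-bar.
        return alpha <= p <= 1.0 - alpha

    total = 0
    for p_f0, p_true in zip(conf_f0, conf_true):
        # Level placed in B under F_0 although the conditional level lies in B-bar.
        total += int(in_b(p_f0) and not in_b(p_true))
        # Level placed in B-bar under F_0 although the conditional level lies in B.
        total -= int(in_b(p_true) and not in_b(p_f0))
    return total / len(conf_f0)


def marginal_null_cdf(z, sigmas=(2 / 3, 1.0, 3 / 2), probs=(0.3, 0.4, 0.3)):
    """Tail-area probability of z under the three-component marginal null."""
    return sum(p * PHI(z / s) for s, p in zip(sigmas, probs))
```

A positive value of `conservative_error` indicates that more confidence levels were wrongly kept inside $[\alpha, 1 - \alpha]$ than were wrongly pushed outside it, which is the sense in which the error is conservative.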
Framing the estimated null distribution as a conditioning statistic makes such guidance available from an adaptation of a general measure (Lloyd, 1992) that quantifies the benefit of conditioning inference on a given statistic. Since an approximately ancillary statistic can be much more relevant for inference than an exactly ancillary statistic, Lloyd (1992) quantified the benefit of conditioning on a statistic by the sum of its degree of ancillarity and its degree of relevance, each degree defined in terms of observed Fisher information.

Figure 4: Conservative error (4) when the assumed null distribution is equal to the true null distribution conditional on the most common value of the precision statistic ($\varsigma_k = 1$). The null distribution $F_0$ is the estimated distribution $\hat{F}_0$ in the top two plots and the assumed distribution $\tilde{F}_0$ in the bottom two plots. The two plots on the left and right give the errors averaged over the 500 false and the 9500 true null hypotheses, respectively.

Figure 5: Conservative error (4) when the assumed null distribution is equal to the true null distribution marginal over the distribution of precision statistic $\varsigma_k$. The four plots have the same arrangement as those of Fig. 4.

To assess the benefit of conditioning inference on the estimated null distribution, the ancillarity and relevance are instead measured in terms of some nonnegative divergence or relative information $I(F \| G)$ between distributions $F$ and $G$ as follows. The ancillarity of the estimated distribution $\hat{F}_0$ for $d_1$ affected features is the extent to which the parameter of interest is independent of the estimate:

$$A(d_1) = -I\left(\hat{F}_0^{d_1} \,\|\, \hat{F}_0\right). \tag{5}$$

Here, $\hat{F}_0^{d_1}$ represents the estimated null distribution with its $d_1$ affected features replaced with unaffected features.
More precisely, $\hat{F}_0^{d_1}$ is the estimate of the null distribution obtained by replacing each of the $d_1$ confidence levels farthest from 0.5 with $(r - 1/2)/d$, the expected order statistic under the assumed null distribution, where $r$ is the rank of the distance of the replaced confidence level from 0.5. Exact ancillarity, $A(d_1) = 0$, thus results only when $\hat{F}_0^{d_1} = \hat{F}_0$, which holds approximately for all $d_1$ if $\hat{F}_0$ is close to the assumed null distribution. Conditioning on a null distribution estimate is effective to the extent that its relevance,

$$R = I\left(\hat{F}_0 \,\|\, \tilde{F}_0\right), \tag{6}$$

is higher than its nonancillarity, $I\left(\hat{F}_0^{d_1} \,\|\, \hat{F}_0\right)$.

The importance of tail probabilities in statistical inference calls for a measure of divergence $I(F \| G)$ between distributions $F$ and $G$ with more tail dependence than the Kullback-Leibler divergence. The Rényi divergence $I_q(F \| G)$ of order $q \in (0, 1)$ satisfies this requirement, and $I_{1/2}(F \| G)$ has proved effective in signal processing as a compromise between the divergence with the most extreme dependence on improbable events ($\lim_{q \to 0} I_q(F \| G)$) and the Kullback-Leibler divergence ($\lim_{q \to 1} I_q(F \| G)$). Another advantage of $q = 1/2$ is that the commutativity property $I_q(F \| G) = I_q(G \| F)$ holds only for that order. The notation presents $I_q(F \| G)$ as the order-$q$ information gained by replacing $G$ with $F$ (Rényi, 1970, §9.8). Since the random variables of the assumed and estimated null distributions are p-values or confidence levels transformed by $\Phi^{-1}$ (Fig. 1) and since both distributions are normal, the relative information of order 1/2 is simply

$$I_{1/2}(F \| G) = \frac{2}{\ln 2} \left[ \frac{(\mu_F - \mu_G)^2}{4\left(\sigma_F^2 + \sigma_G^2\right)} + \frac{1}{2} \ln\left( \frac{\sigma_F^2 + \sigma_G^2}{2 \sigma_F \sigma_G} \right) \right]$$

with $F = N(\mu_F, \sigma_F^2)$ and $G = N(\mu_G, \sigma_G^2)$.
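The comparison of relevance with nonancillarity can be illustrated numerically. The sketch below assumes that the order-1/2 relative information between two normal distributions equals twice their Bhattacharyya distance, converted from nats to bits; that closed form is a standard result and is stated here as an assumption rather than quoted from the text. The function name `renyi_half_normal` is hypothetical, and the parameters used for $\hat{F}_0^{d_1}$ are invented for illustration, since obtaining them requires re-estimating the null distribution.

```python
from math import log


def renyi_half_normal(mu_f, sd_f, mu_g, sd_g):
    """Order-1/2 relative information, in bits, between N(mu_f, sd_f^2) and N(mu_g, sd_g^2)."""
    var_sum = sd_f ** 2 + sd_g ** 2
    mean_term = (mu_f - mu_g) ** 2 / (4.0 * var_sum)
    scale_term = 0.5 * log(var_sum / (2.0 * sd_f * sd_g))
    # Twice the Bhattacharyya distance, divided by ln 2 to convert nats to bits.
    return 2.0 / log(2.0) * (mean_term + scale_term)


# Relevance (6): estimated null N(-0.21, 1.55^2) of Fig. 1 versus the assumed N(0, 1).
relevance = renyi_half_normal(-0.21, 1.55, 0.0, 1.0)

# Nonancillarity: HYPOTHETICAL parameters for the re-estimated null with d_1 features replaced.
nonancillarity = renyi_half_normal(-0.20, 1.50, -0.21, 1.55)

# Conditioning on the estimate is effective to the extent that this is True.
conditioning_helps = relevance > nonancillarity
```

The function is symmetric in its two distributions and returns zero when they coincide, in line with the properties of the order-1/2 divergence described above.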
Assembling the above elements, the net inferential benefit of estimating the null distribution is

$$B(d_1) = A(d_1) + R = I_{1/2}\left(\hat{F}_0 \,\|\, \tilde{F}_0\right) - I_{1/2}\left(\hat{F}_0^{d_1} \,\|\, \hat{F}_0\right) \tag{7}$$

if there are $d_1$ affected features, where $\tilde{F}_0 = N(0, 1)$ and where the ancillarity $A(d_1)$ and relevance $R$ are given by equations (5) and (6) with $I = I_{1/2}$. Basing inference on the estimated null distribution is effective to the extent that $B(d_1) > 0$. Fig. 6 uses the gene expression data to illustrate the use of $B(d_1)$ to determine whether to rely on the estimated null distribution $\hat{F}_0$ or on the assumed null distribution $\tilde{F}_0$ for inference.

5 Discussion

Whereas most adjustments for multiple comparisons are aimed at minimizing net loss incurred over a series of decisions optimized over the sample space rather than at weighing evidence in a particular data set for a hypothesis, adjustments resulting from estimation of the distribution of test statistics under the null hypothesis are appropriate for all forms of frequentist hypothesis testing (§1.1). A form seldom considered in non-Bayesian contexts is that of making coherent decisions by minimizing loss averaged over the parameter space. Taking a step toward filling this gap, Section 3.2 provides a loss function suitable for genome-scale screening rather than for confirmatory testing and illustrates its application to detecting evidence of gene upregulation or downregulation in microarray data.

Figure 6: The nonancillarity $-A(d_1)$ versus the hypothetical number $d_1$ of affected features. The gray horizontal line is the relevance $R$ of null estimation and thus indicates the point at which conditioning on the estimate goes from beneficial ($|A(d_1)| < R$) to deleterious ($|A(d_1)| > R$) according to equation (7). The data set, model, and null distribution estimator are those of Example 7 and Figs. 1, 2, and 3.

Simulations measured the extent to which estimating the null distribution improves conditional inference in an extreme multiple-comparisons setting such as that of finding evidence for differential gene expression in microarray measurements (§4.1). While confidence levels of evidence tended to err on the conservative side under both the estimated and assumed null distributions, conservative error quantified by numbers of confidence levels in [1%, 99%] compared to the confidence levels conditional on the precision statistic $\varsigma_k$ was excessive under the assumed null but negligible under the estimated null (Fig. 4). (Since the same pattern of relative conditional performance was obtained by more realistically setting $\log \varsigma_k$ equal to a variate that is independent and uniformly distributed between $\log(1/2)$ and $\log(2)$, those results were not displayed.)

Due to the heavy tails of the marginal distribution of pre-transformed confidence levels under the null hypothesis, transforming them to satisfy that distribution under the assumed null increased their conditional conservatism, resulting in about the same performance of estimated and assumed null distributions with respect to the affected features. The case of the unaffected features is more interesting: the assumed null distribution, which after the transformation is marginally exact and hence valid for Neyman-Pearson hypothesis testing, incurs 35% more conservative error than the estimated null distribution (Fig. 5). Thus, the use of the marginal null distribution in place of $N(0, 1)$, the distribution conditional on the central component of the mixture, substantially increases conservative error irrespective of whether the null is assumed or estimated.
These results suggest that confidence levels better serve inductive inference when derived from a plausible conditional null distribution than from the marginal distribution even though the latter conforms to the Neyman-Pearson standard. This recommendation reinforces the conditionality principle, which is appropriate for the inferential goal of significance testing as opposed to the various decision-theoretic motivations behind Neyman-Pearson testing (§1.1).

Since the findings of the simulation study do not guarantee the effectiveness of an estimated null distribution $\hat{F}_0$ over the assumed null distribution $\tilde{F}_0$, Section 4.2 gave an information-theoretic score for determining whether to depend on $\hat{F}_0$ in place of $\tilde{F}_0$ for inference on the basis of a particular data set. The score serves as a tool for discovering whether the ancillarity and inferential relevance of $\hat{F}_0$ call for its use in inference and decision making.

6 Acknowledgments

This research was partially supported by the Faculty of Medicine of the University of Ottawa and by Agriculture and Agri-Food Canada. I thank Xuemei Tang for providing the fruit development microarray data. The Biobase (Gentleman et al., 2004) and locfdr (Efron, 2007b) packages of R (R Development Core Team, 2008) facilitated the computational work.

References

Alba, R., Payton, P., Fei, Z., McQuinn, R., Debbie, P., Martin, G. B., Tanksley, S. D., Giovannoni, J. J., 2005. Transcriptome and selected metabolite analyses reveal multiple points of ethylene control during tomato fruit development. Plant Cell 17 (11), 2954-2965.

Anscombe, F. J., Aumann, R. J., Mar. 1963. A definition of subjective probability. The Annals of Mathematical Statistics 34 (1), 199-205.

Barnard, G. A., 1976. Conditional inference is not inefficient. Scandinavian Journal of Statistics 3 (3), 132-134.

Benjamini, Y., Hochberg, Y., 1995. Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society B 57, 289-300.

Benjamini, Y., Yekutieli, D., Edwards, D., Shaffer, J. P., Tamhane, A. C., Westfall, P. H., Holland, B., Benjamini, Y., Yekutieli, D., 2005. False discovery rate-adjusted multiple confidence intervals for selected parameters. Journal of the American Statistical Association 100 (469), 71-93.

Bickel, D. R., 2004. Error-rate and decision-theoretic methods of multiple testing: Which genes have high objective probabilities of differential expression? Statistical Applications in Genetics and Molecular Biology 3 (1), 8.

Bickel, D. R., 2008a. Correcting the estimated level of differential expression for gene selection bias: Application to a microarray study. Statistical Applications in Genetics and Molecular Biology 7 (1), 10.

Bickel, D. R., 2008b. The strength of statistical evidence for composite hypotheses with an application to multiple comparisons. Technical Report, Ottawa Institute of Systems Biology, COBRA Preprint Series, Article 49, available at tinyurl.com/7yaysp.

Bickel, D. R., 2009. Coherent frequentism. Technical Report, Ottawa Institute of Systems Biology, e-print 0907.0139.

Carlin, B. P., Louis, T. A., June 2000. Bayes and Empirical Bayes Methods for Data Analysis, Second Edition, 2nd Edition. Chapman & Hall/CRC, New York.

Cox, D. R., 1958. Some problems connected with statistical inference. The Annals of Mathematical Statistics 29 (2), 357-372.

Cox, D. R., 1977. The role of significance tests. Scandinavian Journal of Statistics 4, 49-70.

Cox, D. R., 2006. Principles of Statistical Inference. Cambridge University Press, Cambridge.

Dudoit, S., Shaffer, J. P., Boldrick, J. C., 2003. Multiple hypothesis testing in microarray experiments. Statistical Science 18 (1), 71-103.

Efron, B., 2004. Large-scale simultaneous hypothesis testing: The choice of a null hypothesis. Journal of the American Statistical Association 99 (465), 96-104.

Efron, B., 2007a. Correlation and large-scale simultaneous significance testing. Journal of the American Statistical Association 102 (477), 93-103.

Efron, B., 2007b. Size, power and false discovery rates. Annals of Statistics 35, 1351-1377.

Efron, B., 2008. Microarrays, empirical Bayes and the two-groups model. Statistical Science 23 (1), 1-22.

Efron, B., Morris, C., 1973. Stein's estimation rule and its competitors - an empirical Bayes approach. Journal of the American Statistical Association 68 (341), 117-130.

Efron, B., Tibshirani, R., 1998. The problem of regions. Annals of Statistics 26 (5), 1687-1718.

Efron, B., Tibshirani, R., Storey, J. D., Tusher, V., 2001. Empirical Bayes analysis of a microarray experiment. J. Am. Stat. Assoc. 96 (456), 1151-1160.

Fisher, R. A., 1973. Statistical Methods and Scientific Inference. Hafner Press, New York.

Fraser, D. A. S., 2004a. Ancillaries and conditional inference. Statistical Science 19 (2), 333-351.

Fraser, D. A. S., 2004b. [Ancillaries and conditional inference]: Rejoinder. Statistical Science 19 (2), 363-369.

Fraser, D. A. S., Reid, N., Jun. 1990. Discussion: An ancillarity paradox which appears in multiple linear regression. The Annals of Statistics 18 (2), 503-507.

Gentleman, R. C., Carey, V. J., and, D. M. B., 2004. Bioconductor: Open software development for computational biology and bioinformatics. Genome Biology 5, R80.

Ghosh, D., 2006. Shrunken p-values for assessing differential expression with applications to genomic data analysis. Biometrics 62 (4), 1099-1106.

Gleser, L. J., Jun. 1990. Discussion: An ancillarity paradox which appears in multiple linear regression. The Annals of Statistics 18 (2), 507-513.

Hill, B. M., Jun. 1990. Discussion: An ancillarity paradox which appears in multiple linear regression. The Annals of Statistics 18 (2), 513-523.

Hodges, J. L., J., Lehmann, E. L., 1954. Testing the approximate validity of statistical hypotheses. Journal of the Royal Statistical Society. Series B (Methodological) 16 (2), 261-268.

Lehmann, E. L., Mar. 1950. Some principles of the theory of testing hypotheses. The Annals of Mathematical Statistics 21 (1), 1-26.

Levine, R.A., C. G., 1996. Convergence of posterior odds. Journal of Statistical Planning and Inference 55 (3), 331-344.

Lloyd, C., 1992. Effective conditioning. Austral. J. Statist. 34, 241-260.

Mayo, D. G., Cox, D. R., 2006. Frequentist statistics as a theory of inductive inference. IMS Lecture Notes - Monograph Series, The Second Erich L. Lehmann Symposium - Optimality.

McCullagh, P., 2004. Fiducial prediction. Technical Report, University of Chicago.

Müller, P., Parmigiani, G., Robert, C., Rousseau, J., 2004. Optimal sample size for multiple testing: the case of gene expression microarrays. Journal of the American Statistical Association 99, 990-1001.

Polansky, A. M., 2007. Observed Confidence Levels: Theory and Application. Chapman and Hall, New York.

R Development Core Team, 2008. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.

Rényi, A., 1970. Probability Theory. North-Holland, Amsterdam.

Robinson, G. K., 1979. Conditional properties of statistical procedures. The Annals of Statistics 7 (4), 742-755.

Schweder, T., Hjort, N. L., 2002. Confidence and likelihood. Scandinavian Journal of Statistics 29 (2), 309-332.

Shi, J., L. D. W. A., 2008. Significance levels for studies with correlated test statistics. Biostatistics 9 (3), 458-466.

Stein, C., 1956. Inadmissibility of the usual estimator for the mean of a multivariate normal distribution. Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability 1, 197-206.

Wald, A., 1961. Statistical Decision Functions. John Wiley and Sons, New York.
