Exchangeable lower previsions

GERT DE COOMAN, ERIK QUAEGHEBEUR, AND ENRIQUE MIRANDA

ABSTRACT. We extend de Finetti's (1937) notion of exchangeability to finite and countable sequences of variables, when a subject's beliefs about them are modelled using coherent lower previsions rather than (linear) previsions. We prove representation theorems in both the finite and the countable case, in terms of sampling without and with replacement, respectively. We also establish a convergence result for sample means of exchangeable sequences. Finally, we study and solve the problem of exchangeable natural extension: how to find the most conservative (point-wise smallest) coherent and exchangeable lower prevision that dominates a given lower prevision.

1. INTRODUCTION

This paper deals with belief models for both finite and countable sequences of exchangeable random variables taking a finite number of values. When such sequences of random variables are assumed to be exchangeable, this more or less means that the specific order in which they are observed is deemed irrelevant.

The first detailed study of exchangeability was made by de Finetti (1937) (with the terminology of 'equivalent' events). He proved the now famous Representation Theorem, which is often interpreted as stating that a sequence of random variables is exchangeable if it is conditionally independent and identically distributed (IID). [1] Other important work on exchangeability was done by, amongst many others, Hewitt and Savage (1955), Heath and Sudderth (1976), Diaconis and Freedman (1980) and, in the context of the behavioural theory of imprecise probabilities that we are going to consider here, by Walley (1991). We refer to Kallenberg (2002, 2005) for modern, measure-theoretic discussions of exchangeability.

One of the reasons why exchangeability is deemed important, especially by Bayesians, is that, by virtue of de Finetti's Representation Theorem, an exchangeable model can be seen as a convex mixture of multinomial models. This has given some ground (de Finetti, 1937, 1975; Dawid, 1985) to the claim that aleatory probabilities and IID processes can be eliminated from statistics, and that we can restrict ourselves to considering exchangeable sequences instead. [2]

[1] See de Finetti (1975, Section 11.4); and Cifarelli and Regazzini (1996) for an overview of de Finetti's work.
[2] For a critical discussion of this claim, see Walley (1991, Section 9.5.6).

Key words and phrases. Exchangeability, lower prevision, Representation Theorem, Bernstein polynomials, convergence in distribution, exchangeable natural extension, sampling without replacement, multinomial sampling, imprecise probability, coherence.

De Finetti presented his study of exchangeability in terms of the behavioural notion of previsions, or fair prices. The central assumption underlying his approach is that a subject should be able to specify a fair price $P(f)$ for any risky transaction (which we shall call a gamble) $f$ (de Finetti, 1974, Chapter 3). This is tantamount to requiring that he should always be willing and able to decide, for any real number $r$, between selling the gamble $f$ for $r$, or buying it for that price. This may not always be realistic, and for this reason, it has been suggested that we should explicitly allow for a subject's indecision, by distinguishing between his lower prevision $\underline{P}(f)$, which is the supremum price for which he is willing to buy the gamble $f$, and his upper prevision $\overline{P}(f)$, which is the infimum price for which he is willing to sell $f$.
For any real number $r$ strictly between $\underline{P}(f)$ and $\overline{P}(f)$, the subject is then not specifying a choice between selling or buying the gamble $f$ for $r$. Such lower and upper previsions are also subject to certain rationality or coherence criteria, in very much the same way as (precise) previsions are on de Finetti's account. The resulting theory of coherent lower previsions, sometimes also called the behavioural theory of imprecise probabilities, and brilliantly defended by Walley (1991), generalises de Finetti's behavioural treatment of subjective, epistemic probability, and tries to make it more realistic by allowing for a subject's indecision. We give a brief overview of this theory in Section 2.

Also in this theory, it is interesting to consider what the consequences are of a subject's exchangeability assessment, i.e., the assessment that the order in which we consider a number of random variables is of no consequence. This is our motivation for studying exchangeable lower previsions in this paper. An assessment of exchangeability will have a clear impact on the structure of so-called exchangeable coherent lower previsions. We shall show they can be written as a combination of (i) a coherent (linear) prevision expressing that permutations of realisations of such sequences are considered equally likely, and (ii) a coherent lower prevision for the 'frequency' of occurrence of the different values the random variables can take. Of course, this is the essence of representation in de Finetti's sense: we generalise his results to coherent lower previsions.

A subject's probability assessments may be local, in the sense that they concern the probabilities or previsions of specific events or random variables.
Assessments may on the other hand also be structural (see Walley, 1991, Chapter 9), in which case they specify relationships that should hold between the probabilities or previsions of a number of events or random variables. One may wonder if (and how) it is possible to combine local with structural assessments, such as exchangeability. We show that this is indeed the case, and give a surprisingly simple procedure, called exchangeable natural extension, for finding the point-wise smallest (most conservative) coherent and exchangeable lower prevision that dominates the local assessments. As an example, we use our conclusions to take a fresh look at the old question whether a given exchangeable model for $n$ variables can be extended to an exchangeable model for $n + k$ variables.

Before we go on, we want to draw attention to a number of distinctive features of our approach. First of all, the usual proofs of the Representation Theorem, such as the ones given by de Finetti (1937), Heath and Sudderth (1976), or Kallenberg (2005), do not lend themselves very easily to a generalisation in terms of coherent lower previsions. In principle it would be possible, at least in some cases, to start with the versions already known for (precise) previsions, and to derive their counterparts for lower previsions using so-called lower envelope theorems (see Section 2 for more details). This is the method that Walley (1991, Sections 9.5.3 and 9.5.4) suggests. But we have decided to follow a different route: we derive our results directly for lower previsions, using an approach based on Bernstein polynomials, and we obtain the ones for previsions as special cases.
We believe this method to be more elegant and self-contained, and it certainly has the additional benefit of drawing attention to what we feel is the essence of de Finetti's Representation Theorem: specifying a coherent belief model for a countable exchangeable sequence is tantamount to specifying a coherent (lower) prevision on the linear space of polynomials on some simplex, and nothing more.

Secondly, we shall focus on, and use the language of, (lower and upper) previsions for gambles, rather than (lower and upper) probabilities for events. Our emphasis on prevision or expectation, rather than probability, is in keeping with de Finetti's (1974) and Whittle's (2000) approach to probabilistic modelling. But it is not merely a matter of aesthetic preference: as we shall see, in the behavioural theory of imprecise probabilities, the language of gambles is much more expressive than that of events, and we need its full expressive power to derive our results.

The plan of the paper is as follows. In Section 2, we introduce a number of results from the theory of coherent lower previsions necessary to understand the rest of the paper. In Section 3, we define exchangeability for finite sequences of random variables, and establish a representation of coherent exchangeable lower previsions in terms of sampling without replacement. In Section 4, we extend the notion of exchangeability to countable sequences of random variables, and in Section 5 we generalise de Finetti's Representation Theorem (in terms of multinomial sampling) to exchangeable coherent lower previsions. The results we obtain allow us to develop a limit law for sample means in Section 6. Section 7 deals with exchangeable natural extension: combining local assessments with exchangeability.
In an appendix, we have gathered a few useful results about multivariate Bernstein polynomials.

2. LOWER PREVISIONS, RANDOM VARIABLES AND THEIR DISTRIBUTIONS

In this section, we want to provide a brief summary of ideas, and known as well as new results from the theory of coherent lower previsions (Walley, 1991). This should lead to a better understanding of the developments in the sections that follow. For results that are mentioned without proof, proofs can be found in Walley (1991).

2.1. Epistemic uncertainty models. Consider a random variable $X$ that may assume values $x$ in some non-empty set $\mathcal{X}$. By 'random', we mean that a subject is uncertain about the actual value of the variable $X$, i.e., does not know what this actual value is. But we do assume that the actual value of $X$ can be determined, at least in principle. Thus we may for instance consider tossing a coin, where $X$ is the outcome of the coin toss, and $\mathcal{X} = \{\text{heads}, \text{tails}\}$. It does not really matter here to distinguish between a subject's belief before tossing the coin, or after the toss where, say, the outcome has been kept hidden from the subject. All that matters for us here is that our subject is in a state of (partial) ignorance because of a lack of knowledge. The uncertainty models that we are going to describe here are therefore epistemic, rather than physical, probability models.

Our subject may be uncertain about the value of $X$, but he may entertain certain beliefs about it. These beliefs may lead him to engage in certain risky transactions whose outcome depends on the actual value of $X$. We are going to try and model his beliefs mathematically by zooming in on such risky transactions. They are captured by the mathematical concept of a gamble on $\mathcal{X}$, which is a bounded map $f$ from $\mathcal{X}$ to the set $\mathbb{R}$ of real numbers.
A gamble $f$ represents a random reward: if the subject accepts $f$, this means that he is willing to engage in the following transaction: we determine the actual value $x$ that $X$ assumes in $\mathcal{X}$, and then the subject receives the (possibly negative) reward $f(x)$, expressed in units of some predetermined linear utility. Let us denote by $\mathcal{L}(\mathcal{X})$ the set of all gambles on $\mathcal{X}$.

De Finetti (1974) has proposed to model a subject's beliefs by eliciting his fair price, or prevision, $P(f)$ for certain gambles $f$. This $P(f)$ can be defined as the unique real number $p$ such that, for all $s < p < t$, the subject is willing to buy the gamble $f$ for the price $s$ (i.e., accept the gamble $f - s$) and to sell $f$ for the price $t$ (i.e., accept the gamble $t - f$). The problem with this approach is that it presupposes that there is such a real number, or, in other words, that the subject, whatever his beliefs about $X$ are, is willing, for (almost) every real $r$, to make a choice between buying $f$ for the price $r$, or selling it for that price.

2.2. Coherent lower previsions and natural extension. A way to address this problem is to consider a model which allows our subject to be undecided for some prices $r$. This is done in Walley's (1991) theory of lower and upper previsions. The lower prevision of the gamble $f$, $\underline{P}(f)$, is our subject's supremum acceptable buying price for $f$; similarly, our subject's upper prevision, $\overline{P}(f)$, is his infimum acceptable selling price for $f$. Hence, he is willing to buy the gamble $f$ for all prices $t < \underline{P}(f)$ and sell $f$ for all prices $s > \overline{P}(f)$, but he may be undecided for prices $\underline{P}(f) \le p \le \overline{P}(f)$.
Since buying the gamble $f$ for a price $t$ is the same as selling the gamble $-f$ for the price $-t$ [in both cases we accept the gamble $f - t$], the lower and upper previsions are conjugate functions: $\underline{P}(f) = -\overline{P}(-f)$ for any gamble $f$. This allows us to concentrate on one of these functions, since we can immediately derive results for the other. In this paper, we focus mainly on lower previsions.

If a subject has made assessments about the supremum buying price (lower prevision) for all gambles in some domain $\mathcal{K}$, we have to check that these assessments are consistent with each other. First of all, we say that the lower prevision $\underline{P}$ avoids sure loss when
$$\sup_{x \in \mathcal{X}} \left[ \sum_{k=1}^{n} \lambda_k \bigl[ f_k(x) - \underline{P}(f_k) \bigr] \right] \ge 0 \tag{1}$$
for any natural number $n$, any gambles $f_1, \dots, f_n$ in $\mathcal{K}$ and any non-negative real numbers $\lambda_1, \dots, \lambda_n$. When the inequality (1) is not satisfied, there is some non-negative combination of acceptable transactions that results in a transaction that makes our subject lose utiles, no matter the outcome, and we then say that his lower prevision $\underline{P}$ incurs sure loss.

More generally, we say that the lower prevision $\underline{P}$ is coherent when
$$\sup_{x \in \mathcal{X}} \left[ \sum_{k=1}^{n} \lambda_k \bigl[ f_k(x) - \underline{P}(f_k) \bigr] - \lambda_0 \bigl[ f_0(x) - \underline{P}(f_0) \bigr] \right] \ge 0 \tag{2}$$
for any natural number $n$, any gambles $f_0, \dots, f_n$ in $\mathcal{K}$ and any non-negative real numbers $\lambda_0, \dots, \lambda_n$. Coherence means that our subject's supremum acceptable buying price for a gamble $f$ in the domain cannot be raised by considering the acceptable transactions implicit in other gambles. In particular, it means that $\underline{P}$ avoids sure loss. We call an upper prevision coherent if its conjugate lower prevision is.
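
To make condition (1) concrete, the following Python sketch evaluates its left-hand side for one fixed choice of gambles and weights on the coin space {heads, tails}. The two sets of assessments (lower probability 3/5 on each of heads and tails, then 2/5 on each) and the helper names `indicator` and `best_case_net_gain` are illustrative assumptions, not taken from the paper. A single negative value is enough to exhibit a sure loss; a non-negative value for one combination does not by itself establish that sure loss is avoided, since (1) quantifies over all combinations.

```python
from fractions import Fraction

OUTCOMES = ("heads", "tails")

def indicator(event):
    """Return the {0,1}-valued gamble I_A for the event A (a set of outcomes)."""
    return lambda x: Fraction(1) if x in event else Fraction(0)

def best_case_net_gain(gambles, lower_previsions, weights, outcomes=OUTCOMES):
    """Left-hand side of (1): sup over x of sum_k lambda_k * [f_k(x) - P(f_k)]."""
    return max(
        sum(lam * (f(x) - p) for f, p, lam in zip(gambles, lower_previsions, weights))
        for x in outcomes
    )

gambles = [indicator({"heads"}), indicator({"tails"})]
weights = [Fraction(1), Fraction(1)]

# Hypothetical assessment 1: lower probability 3/5 for both heads and tails.
too_greedy = best_case_net_gain(gambles, [Fraction(3, 5), Fraction(3, 5)], weights)
# Hypothetical assessment 2: lower probability 2/5 for both heads and tails.
cautious = best_case_net_gain(gambles, [Fraction(2, 5), Fraction(2, 5)], weights)

print(too_greedy)  # -1/5: buying both bets loses 1/5 whatever the outcome
print(cautious)    # 1/5: no sure loss from this particular combination
```

Exact rational arithmetic (`Fraction`) is used so that the sign of the result is not blurred by floating-point rounding.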
If a lower prevision $\underline{P}$ is defined on a linear space of gambles $\mathcal{K}$, then the coherence requirement (2) is equivalent to the following conditions: for any gambles $f$ and $g$ in $\mathcal{K}$ and any non-negative real number $\lambda$, it should hold that:
(P1) $\underline{P}(f) \ge \inf f$ [accepting sure gains];
(P2) $\underline{P}(\lambda f) = \lambda \underline{P}(f)$ [non-negative homogeneity];
(P3) $\underline{P}(f + g) \ge \underline{P}(f) + \underline{P}(g)$ [super-additivity].
Moreover, a lower prevision on a general domain is coherent if and only if it can be extended to a coherent lower prevision on some linear space.

A coherent lower prevision that is defined on indicators of events only is called a coherent lower probability. The indicator $I_A$ of an event $A$ is the $\{0, 1\}$-valued gamble given by $I_A(x) := 1$ if $x \in A$ and $I_A(x) := 0$ otherwise.

On the other hand, a lower prevision $\underline{P}$ on some set of gambles $\mathcal{K}$ that avoids sure loss can always be 'corrected' and extended to a coherent lower prevision on $\mathcal{L}(\mathcal{X})$, in a least-committal manner: the (point-wise) smallest, and therefore most conservative, coherent lower prevision on $\mathcal{L}(\mathcal{X})$ that (point-wise) dominates $\underline{P}$ on $\mathcal{K}$, is called the natural extension of $\underline{P}$, and it is given for all $f$ in $\mathcal{L}(\mathcal{X})$ by
$$\underline{E}(f) := \sup \left\{ \inf_{x \in \mathcal{X}} \left[ f(x) - \sum_{k=1}^{n} \lambda_k \bigl[ f_k(x) - \underline{P}(f_k) \bigr] \right] : n \ge 0,\ \lambda_k \ge 0,\ f_k \in \mathcal{K} \right\}. \tag{3}$$
The natural extension of $\underline{P}$ provides the supremum acceptable buying prices that we can derive for any gamble $f$ taking into account only the buying prices for the gambles in $\mathcal{K}$ and the notion of coherence. Interestingly, $\underline{P}$ is coherent if and only if it coincides with its natural extension $\underline{E}$ on its domain $\mathcal{K}$, and in that case $\underline{E}$ is the point-wise smallest coherent lower prevision that extends $\underline{P}$ to $\mathcal{L}(\mathcal{X})$.

2.3. Linear previsions.
If the lower prevision $\underline{P}(f)$ and the upper prevision $\overline{P}(f)$ for a gamble $f$ happen to coincide, then the common value $P(f) = \underline{P}(f) = \overline{P}(f)$ is called the subject's (precise) prevision for $f$. Previsions are fair prices in de Finetti's (1974) sense. We shall call them precise probability models, and lower previsions will be called imprecise. Specifying a prevision $P$ on a domain $\mathcal{K}$ is tantamount to specifying both a lower prevision $\underline{P}$ and an upper prevision $\overline{P}$ on $\mathcal{K}$ such that $\underline{P}(f) = \overline{P}(f) = P(f)$. Since then, by conjugacy, $\underline{P}(f) = -\overline{P}(-f) = -P(-f)$, it is also equivalent to specifying a lower prevision $\underline{P}$ on the larger and negation invariant domain $\mathcal{K}' := \mathcal{K} \cup -\mathcal{K}$, by letting $\underline{P}(f) := P(f)$ if $f \in \mathcal{K}$ and $\underline{P}(f) := -P(-f)$ if $f \in -\mathcal{K}$. This prevision $P$ is then called coherent, or linear, if and only if the associated lower prevision $\underline{P}$ is coherent, and this is equivalent to the following condition:
$$\sup_{x \in \mathcal{X}} \left[ \sum_{k=1}^{n} \lambda_k \bigl[ f_k(x) - P(f_k) \bigr] - \sum_{\ell=1}^{m} \mu_\ell \bigl[ g_\ell(x) - P(g_\ell) \bigr] \right] \ge 0$$
for any natural numbers $n$ and $m$, any gambles $f_1, \dots, f_n$ and $g_1, \dots, g_m$ in $\mathcal{K}$ and any non-negative real numbers $\lambda_1, \dots, \lambda_n$ and $\mu_1, \dots, \mu_m$.

A prevision on the set $\mathcal{L}(\mathcal{X})$ of all gambles is linear if and only if it is a positive ($f \ge 0 \Rightarrow P(f) \ge 0$) and normed ($P(1) = 1$) real linear functional. A prevision on a general domain is linear if and only if it can be extended to a linear prevision on all gambles. We shall denote by $\mathbb{P}(\mathcal{X})$ the set of all linear previsions on $\mathcal{L}(\mathcal{X})$.

The restriction of a linear prevision $P$ on $\mathcal{L}(\mathcal{X})$ to the set $\wp(\mathcal{X})$ of (indicators of) all events, is a finitely additive probability. Conversely, a finitely additive probability on $\wp(\mathcal{X})$ has a unique extension (namely, its natural extension as a coherent lower probability) to a linear prevision on $\mathcal{L}(\mathcal{X})$.
In this sense, such linear previsions and finitely additive probabilities can be considered equivalent: for precise probability models, the language of events is as expressive as that of gambles. A linear prevision that is defined on indicators of events only, and therefore called a coherent probability, is always the restriction of some finitely additive probability.

There is an interesting link between precise and imprecise probability models, expressed through the following so-called lower envelope theorem: A lower prevision $\underline{P}$ on some domain $\mathcal{K}$ is coherent if and only if it is the lower envelope of some set of linear previsions, and in particular of the convex set $\mathcal{M}(\underline{P})$ of all linear previsions that dominate it: for all $f$ in $\mathcal{K}$,
$$\underline{P}(f) = \inf \{ P(f) : P \in \mathcal{M}(\underline{P}) \},$$
where $\mathcal{M}(\underline{P}) := \{ P \in \mathbb{P}(\mathcal{X}) : (\forall f \in \mathcal{K})(P(f) \ge \underline{P}(f)) \}$. We can also use the set $\mathcal{M}(\underline{P})$ to calculate the natural extension of $\underline{P}$: for any gamble $f$ on $\mathcal{X}$, we have that
$$\underline{E}(f) = \inf \{ P(f) : P \in \mathcal{M}(\underline{P}) \}.$$

If we have a coherent lower probability defined on some set of events, then there will generally be many (i.e., an infinity of) coherent lower previsions that extend it to all gambles. In this sense, the language of gambles is actually more expressive than that of events when we are considering lower rather than precise previsions. As already signalled in the Introduction, this is the main reason why in the following sections, we shall formulate our study of exchangeable lower previsions in terms of gambles and lower previsions rather than events and lower probabilities.

2.4. Important consequences of coherence. Let us list a few consequences of coherence that we shall have occasion to use further on.
Besides the properties (P1)–(P3), which we have already mentioned and which hold when the domain of $\underline{P}$ is a linear space, the following properties hold for a coherent lower prevision whenever the gambles involved belong to its domain:
(i) $\underline{P}$ is monotone: if $f \le g$, then $\underline{P}(f) \le \underline{P}(g)$.
(ii) $\inf f \le \underline{P}(f) \le \overline{P}(f) \le \sup f$.
Moreover, coherent lower and upper previsions are continuous with respect to uniform convergence of gambles: if a sequence of gambles $f_n$ converges uniformly to a gamble $f$, meaning that for every $\varepsilon > 0$ there is some $n_0$ such that $|f_n(x) - f(x)| < \varepsilon$ for all $n \ge n_0$ and for all $x \in \mathcal{X}$, then $\underline{P}(f_n)$ converges to $\underline{P}(f)$ and $\overline{P}(f_n)$ converges to $\overline{P}(f)$. In particular, this implies that a coherent lower prevision defined on some domain $\mathcal{K}$ can be uniquely extended to a coherent lower prevision on the uniform closure of $\mathcal{K}$. As an immediate corollary, a coherent lower prevision on $\mathcal{L}(\mathcal{X})$ is uniquely determined by the values it assumes on simple gambles, i.e., gambles that assume only a finite number of values.

We end this section by introducing a number of new notions, which cannot be found in Walley (1991). They generalise familiar definitions in standard, measure-theoretic probability to a context where coherent lower previsions are used as belief models.

2.5. The distribution of a random variable. We shall call a subject's coherent lower prevision $\underline{P}$ on $\mathcal{L}(\mathcal{X})$, modelling his beliefs about the value that a random variable $X$ assumes in the set $\mathcal{X}$, his distribution for that random variable. Now consider another set $\mathcal{Y}$, and a map $\varphi$ from $\mathcal{X}$ to $\mathcal{Y}$; then we can consider $Y := \varphi(X)$ as a random variable assuming values in $\mathcal{Y}$. With a gamble $h$ on $\mathcal{Y}$, there corresponds a gamble $h \circ \varphi$ on $\mathcal{X}$, whose lower prevision is $\underline{P}(h \circ \varphi)$.
This leads us to define the distribution of $Y = \varphi(X)$ as the induced coherent lower prevision $\underline{Q}$ on $\mathcal{L}(\mathcal{Y})$, defined by
$$\underline{Q}(h) := \underline{P}(h \circ \varphi), \quad h \in \mathcal{L}(\mathcal{Y}).$$
For an event $A \subseteq \mathcal{Y}$, we see that $I_A \circ \varphi = I_{\varphi^{-1}(A)}$, where $\varphi^{-1}(A) := \{ x \in \mathcal{X} : \varphi(x) \in A \}$, and consequently $\underline{Q}(A) = \underline{P}(\varphi^{-1}(A))$. So we see that the notion of an induced lower prevision generalises that of an induced probability measure.

Finally, consider a sequence of random variables $X_n$, all taking values in some metric space $S$. Denote by $\mathcal{C}(S)$ the set of all continuous gambles on $S$. For each random variable $X_n$, we have a distribution in the form of a coherent lower prevision $\underline{P}_{X_n}$ on $\mathcal{L}(S)$. Then we say that the random variables converge in distribution if for all $h \in \mathcal{C}(S)$, the sequence of real numbers $\underline{P}_{X_n}(h)$ converges to some real number, which we denote by $\underline{P}(h)$. The limit lower prevision $\underline{P}$ on $\mathcal{C}(S)$ that we can define in this way is coherent, because a point-wise limit of coherent lower previsions always is.

3. EXCHANGEABLE RANDOM VARIABLES

We are now ready to recall Walley's (1991, Section 9.5) notion of exchangeability in the context of the theory of coherent lower previsions. We shall see that it generalises de Finetti's definition for linear previsions (de Finetti, 1937, 1975).

3.1. Definition and basic properties. Consider $N \ge 1$ random variables $X_1, \dots, X_N$ taking values in a non-empty and finite set $\mathcal{X}$. [3] A subject's beliefs about the values that these random variables $\mathbf{X} = (X_1, \dots, X_N)$ assume jointly in $\mathcal{X}^N$ are given by their (joint) distribution, which is a coherent lower prevision $\underline{P}^N_{\mathcal{X}}$ defined on the set $\mathcal{L}(\mathcal{X}^N)$ of all gambles on $\mathcal{X}^N$. Let us denote by $\mathcal{P}_N$ the set of all permutations of $\{1, \dots, N\}$.
With any such permutation $\pi$ we can associate, by the procedure of lifting, a permutation of $\mathcal{X}^N$, also denoted by $\pi$, that maps any $\mathbf{x} = (x_1, \dots, x_N)$ in $\mathcal{X}^N$ to $\pi\mathbf{x} := (x_{\pi(1)}, \dots, x_{\pi(N)})$. Similarly, with any gamble $f$ on $\mathcal{X}^N$, we can consider the permuted gamble $\pi f := f \circ \pi$, or in other words, $(\pi f)(\mathbf{x}) = f(\pi\mathbf{x})$ for all $\mathbf{x} \in \mathcal{X}^N$.

A subject judges the random variables $X_1, \dots, X_N$ to be exchangeable when he is disposed to exchange any gamble $f$ for the permuted gamble $\pi f$, meaning that $\underline{P}^N_{\mathcal{X}}(\pi f - f) \ge 0$, [4] for any permutation $\pi$. Taking into account the properties of coherence, this means that
$$\underline{P}^N_{\mathcal{X}}(\pi f - f) = \underline{P}^N_{\mathcal{X}}(f - \pi f) = 0$$
for all gambles $f$ on $\mathcal{X}^N$ and all permutations $\pi$ in $\mathcal{P}_N$. In this case, we shall also call the joint coherent lower prevision $\underline{P}^N_{\mathcal{X}}$ exchangeable. A subject will make an assumption of exchangeability when there is evidence that the processes generating the values of the random variables are (physically) similar (Walley, 1991, Section 9.5.2), and consequently the order in which the variables are observed is not important.

When $\underline{P}^N_{\mathcal{X}}$ is in particular a linear prevision $P^N_{\mathcal{X}}$, exchangeability is equivalent to having $P^N_{\mathcal{X}}(\pi f) = P^N_{\mathcal{X}}(f)$ for all gambles $f$ and all permutations $\pi$. Another equivalent formulation can be given in terms of the (probability) mass function $p^N_{\mathcal{X}}$ of $P^N_{\mathcal{X}}$, defined by $p^N_{\mathcal{X}}(\mathbf{x}) := P^N_{\mathcal{X}}(\{\mathbf{x}\})$. Indeed, if we apply linearity to find that $P^N_{\mathcal{X}}(f) = \sum_{\mathbf{x} \in \mathcal{X}^N} f(\mathbf{x})\, p^N_{\mathcal{X}}(\mathbf{x})$, we see that the exchangeability condition for linear previsions is equivalent to having $p^N_{\mathcal{X}}(\mathbf{x}) = p^N_{\mathcal{X}}(\pi\mathbf{x})$ for all $\mathbf{x}$ in $\mathcal{X}^N$, or in other words, the mass function $p^N_{\mathcal{X}}$ should be invariant under permutation of the indices. This is essentially de Finetti's (1937) definition for the exchangeability of a prevision.
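
The mass-function criterion for linear previsions can be checked by brute force on small examples. The Python sketch below does so for three binary variables; the function names (`lift`, `is_exchangeable`, `product_pmf`) and the particular biases are illustrative assumptions, not part of the paper. It compares an IID model, a product model with non-identical biases, and an equal-weight mixture of two IID models.

```python
from fractions import Fraction
from itertools import permutations, product

def lift(perm, x):
    """Lifted permutation: (pi x)_k = x_{pi(k)}, using 0-based indices."""
    return tuple(x[i] for i in perm)

def is_exchangeable(pmf, values, N):
    """Check p(x) == p(pi x) for every x in values^N and every permutation pi."""
    return all(
        pmf[x] == pmf[lift(perm, x)]
        for x in product(values, repeat=N)
        for perm in permutations(range(N))
    )

def product_pmf(biases):
    """Joint pmf of independent 0/1 variables; biases[k] is the chance that X_k = 1."""
    pmf = {}
    for x in product((0, 1), repeat=len(biases)):
        mass = Fraction(1)
        for x_k, theta in zip(x, biases):
            mass *= theta if x_k == 1 else 1 - theta
        pmf[x] = mass
    return pmf

third, half, two_thirds = Fraction(1, 3), Fraction(1, 2), Fraction(2, 3)
iid = product_pmf([third] * 3)                   # identical biases
non_identical = product_pmf([half, third, third])  # first variable differs
other_iid = product_pmf([two_thirds] * 3)
mixture = {x: (iid[x] + other_iid[x]) / 2 for x in iid}  # exchangeable, not independent

print(is_exchangeable(iid, (0, 1), 3))            # True
print(is_exchangeable(non_identical, (0, 1), 3))  # False
print(is_exchangeable(mixture, (0, 1), 3))        # True
```

The check enumerates all $|\mathcal{X}|^N \cdot N!$ pairs, so it is only meant for very small $N$.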
The following proposition, mentioned by Walley (1991, Section 9.5), and whose proof is immediate and therefore omitted, establishes an even stronger link between Walley's and de Finetti's notions of exchangeability.

Proposition 1. Any coherent lower prevision on $\mathcal{L}(\mathcal{X}^N)$ that dominates an exchangeable coherent lower prevision, is also exchangeable. Moreover, let $\underline{P}^N_{\mathcal{X}}$ be the lower envelope of some set of linear previsions $\mathcal{M}^N_{\mathcal{X}}$, in the sense that
$$\underline{P}^N_{\mathcal{X}}(f) = \min \left\{ P^N_{\mathcal{X}}(f) : P^N_{\mathcal{X}} \in \mathcal{M}^N_{\mathcal{X}} \right\}$$
for all gambles $f$ on $\mathcal{X}^N$. Then $\underline{P}^N_{\mathcal{X}}$ is exchangeable if and only if all the linear previsions $P^N_{\mathcal{X}}$ in $\mathcal{M}^N_{\mathcal{X}}$ are exchangeable.

[3] We could easily define exchangeability for variables that assume values in a set $\mathcal{X}$ that is not necessarily finite. But since we only prove interesting results for finite $\mathcal{X}$, we have decided to use a finitary context from the outset.
[4] This means that the subject is willing to accept the gamble $\pi f - f$, i.e., to exchange $f$ for $\pi f$, in return for any positive amount of utility $\varepsilon$, however small.

If a coherent lower prevision $\underline{P}^N_{\mathcal{X}}$ is exchangeable, it is immediately guaranteed to be also permutable [5] in the sense that
$$\underline{P}^N_{\mathcal{X}}(\pi f) = \underline{P}^N_{\mathcal{X}}(f)$$
for all gambles $f$ on $\mathcal{X}^N$ and all permutations $\pi$ in $\mathcal{P}_N$. The converse does not hold in general. For linear previsions $P^N_{\mathcal{X}}$, permutability is equivalent to exchangeability, but this equivalence is generally broken for coherent lower previsions that are not linear. [6]

Clearly, if $X_1, \dots, X_N$ are exchangeable, then any permutation $X_{\pi(1)}, \dots, X_{\pi(N)}$ is exchangeable as well, and has the same distribution $\underline{P}^N_{\mathcal{X}}$. Moreover, any selection of $1 \le n \le N$ random variables from amongst the $X_1, \dots, X_N$ is exchangeable too, and their distribution is given by $\underline{P}^n_{\mathcal{X}}$, which is the $\mathcal{X}^n$-marginal of $\underline{P}^N_{\mathcal{X}}$, given by
$$\underline{P}^n_{\mathcal{X}}(f) := \underline{P}^N_{\mathcal{X}}(\tilde{f})$$
for all gambles $f$ on $\mathcal{X}^n$, where the gamble $\tilde{f}$ on $\mathcal{X}^N$ is the cylindrical extension of $f$ to $\mathcal{X}^N$, given by $\tilde{f}(z_1, \dots, z_N) := f(z_1, \dots, z_n)$ for all $(z_1, \dots, z_N)$ in $\mathcal{X}^N$.

Running example. This is the place to introduce our running example. As we go along, we shall try to clarify our reasoning by looking at a specific special case that is as simple as possible, namely where the random variables $X_k$ we consider can assume only two values. So we might be looking at tossing coins, or thumbtacks, and consider modelling the exchangeability assessment that the order in which these coin flips are considered is of no consequence. More generally, our random variables might be the indicators of events: $X_k = I_{E_k}$, and then we consider the events $E_1, \dots, E_N$ to be exchangeable when the order in which they are observed is of no consequence.

Formally, we denote the set of possible values for such variables by $\mathbb{B} = \{0, 1\}$, where 1 and 0 could stand for heads and tails, success and failure, the occurrence or not of an event, and so on. In what follows, we shall often call 1 a success, and 0 a failure. The joint random variable $\mathbf{X} = (X_1, \dots, X_N)$ then assumes values in the space $\mathbb{B}^N$, which is made up of all $N$-tuples of zeros and ones.

As an example, in the case $N = 3$, two possible elements of $\mathbb{B}^3$ are $(1, 0, 1)$ and $(0, 1, 1)$. These elements can be related to each other by a permutation of the indices, i.e., of the order in which they occur, and therefore any exchangeable linear prevision should assign the same probability mass to them. And any exchangeable coherent lower prevision is a lower envelope of such exchangeable linear previsions. ♦

3.2. Count vectors.
Interestingly, exchangeable coherent lower previsions have a very simple representation, in terms of sampling without replacement. [7] To see how this comes about, consider any $\mathbf{x} \in \mathcal{X}^N$. Then the so-called (permutation) invariant atom
$$[\mathbf{x}] := \{ \pi\mathbf{x} : \pi \in \mathcal{P}_N \}$$
is the smallest non-empty subset of $\mathcal{X}^N$ that contains $\mathbf{x}$ and that is invariant under all permutations $\pi$ in $\mathcal{P}_N$. We shall denote the set of permutation invariant atoms of $\mathcal{X}^N$ by $\mathcal{A}^N_{\mathcal{X}}$. It constitutes a partition of the set $\mathcal{X}^N$.

[5] We use the terminology in Walley (1991, Section 9.4).
[6] This is an instance of a more general phenomenon: we can generally consider two types of invariance of a belief model (a coherent lower prevision) with respect to a semigroup of transformations: weak and strong invariance. The former, of which permutability is a special case, tells us that the model or the beliefs are symmetrical (symmetry of evidence), whereas the latter, of which exchangeability is a special case, reflects that a subject believes there is symmetry (evidence of symmetry). Strong invariance generally implies weak invariance, but the two notions in general only coincide for linear previsions. For more details, see De Cooman and Miranda (2007).
[7] Actually this is a special case of a much more general representation result for coherent lower previsions on a finite space that are strongly invariant with respect to a finite group of permutations of that space; see De Cooman and Miranda (2007) for more details. Here we give a different proof.

We can characterise these invariant atoms using the counting maps $T^N_x \colon \mathcal{X}^N \to \mathbb{N}_0$ defined for all $x$ in $\mathcal{X}$ in such a way that
$$T^N_x(\mathbf{z}) = T^N_x(z_1, \dots, z_N) := |\{ k \in \{1, \dots, N\} : z_k = x \}|$$
is the number of components of the $N$-tuple $\mathbf{z}$ that assume the value $x$.
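Here $|A|$ denotes the number of elements in a finite set $A$, and $\mathbb{N}_0$ is the set of all non-negative integers (including zero). We shall denote by $\mathbf{T}^N_{\mathcal{X}}$ the vector-valued map from $\mathcal{X}^N$ to $\mathbb{N}_0^{\mathcal{X}}$ whose component maps are the $T^N_x$, $x \in \mathcal{X}$. Observe that $\mathbf{T}^N_{\mathcal{X}}$ actually assumes values in the set of count vectors
$$\mathcal{N}^N_{\mathcal{X}} := \left\{ \mathbf{m} \in \mathbb{N}_0^{\mathcal{X}} : \sum_{x \in \mathcal{X}} m_x = N \right\}.$$
Permuting the components of a vector leaves the counts invariant, meaning that $\mathbf{T}^N_{\mathcal{X}}(\mathbf{z}) = \mathbf{T}^N_{\mathcal{X}}(\pi\mathbf{z})$ for all $\mathbf{z} \in \mathcal{X}^N$ and $\pi \in \mathcal{P}_N$, and conversely any two $N$-tuples with the same counts can be rearranged into one another. We therefore see that for all $\mathbf{y}$ and $\mathbf{z}$ in $\mathcal{X}^N$
$$\mathbf{y} \in [\mathbf{z}] \iff \mathbf{T}^N_{\mathcal{X}}(\mathbf{y}) = \mathbf{T}^N_{\mathcal{X}}(\mathbf{z}).$$
The counting map $\mathbf{T}^N_{\mathcal{X}}$ can therefore be interpreted as a bijection (one-to-one and onto) between the set of invariant atoms $\mathcal{A}^N_{\mathcal{X}}$ and the set of count vectors $\mathcal{N}^N_{\mathcal{X}}$, and we can identify any invariant atom $[\mathbf{z}]$ by the count vector $\mathbf{m} = \mathbf{T}^N_{\mathcal{X}}(\mathbf{z})$ of any (and therefore all) of its elements. We shall therefore also denote this atom by $[\mathbf{m}]$; and clearly $\mathbf{y} \in [\mathbf{m}]$ if and only if $\mathbf{T}^N_{\mathcal{X}}(\mathbf{y}) = \mathbf{m}$. The number of elements $\nu(\mathbf{m})$ in any invariant atom $[\mathbf{m}]$ is given by the number of different ways in which the components of any $\mathbf{z}$ in $[\mathbf{m}]$ can be permuted, and is therefore given by
$$\nu(\mathbf{m}) := \binom{N}{\mathbf{m}} = \frac{N!}{\prod_{x \in \mathcal{X}} m_x!}.$$

If the joint random variable $\mathbf{X} = (X_1, \dots, X_N)$ assumes the value $\mathbf{z}$ in $\mathcal{X}^N$, then the corresponding count vector assumes the value $\mathbf{T}^N_{\mathcal{X}}(\mathbf{z})$ in $\mathcal{N}^N_{\mathcal{X}}$. This means that we can see $\mathbf{T}^N_{\mathcal{X}}(\mathbf{X}) = \mathbf{T}^N_{\mathcal{X}}(X_1, \dots, X_N)$ as a random variable in $\mathcal{N}^N_{\mathcal{X}}$. If the available information about the values that $\mathbf{X}$ assumes in $\mathcal{X}^N$ is given by the coherent exchangeable lower prevision $\underline{P}^N_{\mathcal{X}}$ (the distribution of $\mathbf{X}$), then the corresponding uncertainty model for the values that $\mathbf{T}^N_{\mathcal{X}}(\mathbf{X})$ assumes in $\mathcal{N}^N_{\mathcal{X}}$ is given by the coherent induced lower prevision $\underline{Q}^N_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{N}^N_{\mathcal{X}})$ (the distribution of $\mathbf{T}^N_{\mathcal{X}}(\mathbf{X})$), given by
$$\underline{Q}^N_{\mathcal{X}}(h) := \underline{P}^N_{\mathcal{X}}(h \circ \mathbf{T}^N_{\mathcal{X}}) = \underline{P}^N_{\mathcal{X}}\left( \sum_{\mathbf{m} \in \mathcal{N}^N_{\mathcal{X}}} h(\mathbf{m})\, I_{[\mathbf{m}]} \right) \tag{4}$$
for all gambles $h$ on $\mathcal{N}^N_{\mathcal{X}}$.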
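
The correspondence between invariant atoms and count vectors can be verified numerically on a small case. The Python sketch below uses a hypothetical three-letter alphabet with $N = 4$; the function names and the uniform mass function are illustrative assumptions. It groups $\mathcal{X}^N$ by count vector, checks that each atom has $\nu(\mathbf{m})$ elements, and evaluates the induced prevision of (4) in the special case of a linear prevision. The stars-and-bars count of the number of count vectors is a standard combinatorial fact that the paper does not state.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import comb, factorial, prod

def count_vector(z, values):
    """T(z): the counts (m_x), listed in the order of `values`."""
    counts = Counter(z)
    return tuple(counts[x] for x in values)

def nu(m):
    """Multinomial coefficient N! / prod_x m_x!, with N = sum(m)."""
    return factorial(sum(m)) // prod(factorial(m_x) for m_x in m)

def invariant_atoms(values, N):
    """Partition values^N into invariant atoms, keyed by their count vector."""
    atoms = {}
    for z in product(values, repeat=N):
        atoms.setdefault(count_vector(z, values), []).append(z)
    return atoms

def induced_prevision(pmf, h, values):
    """Q(h) := P(h o T) for a linear prevision P with mass function `pmf`."""
    return sum(mass * h(count_vector(z, values)) for z, mass in pmf.items())

values, N = ("a", "b", "c"), 4
atoms = invariant_atoms(values, N)

# Each atom [m] has exactly nu(m) elements, and the atoms partition values^N.
sizes_match = all(len(atom) == nu(m) for m, atom in atoms.items())
total = sum(len(atom) for atom in atoms.values())
# Number of count vectors: C(N + |X| - 1, |X| - 1) by stars and bars.
n_count_vectors = comb(N + len(values) - 1, len(values) - 1)

# Uniform (hence exchangeable) linear prevision; h(m) = m_a, the number of 'a's.
uniform = {z: Fraction(1, len(values) ** N) for z in product(values, repeat=N)}
expected_a = induced_prevision(uniform, lambda m: m[0], values)

print(sizes_match, total, len(atoms), n_count_vectors, expected_a)
# True 81 15 15 4/3
```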
We shall now prove a theorem that shows that, conversely, any exchangeable coherent lower prevision $\underline{P}^N_{\mathcal{X}}$ is in fact completely determined by the corresponding distribution $\underline{Q}^N_{\mathcal{X}}$ of the count vectors, also called its count distribution. It also establishes a relationship between exchangeability and sampling without replacement.

To get where we want, consider an urn with $N$ balls of different types, where the different types are characterised by the elements $x$ of the set $\mathcal{X}$. Suppose the composition of the urn is given by the count vector $\mathbf{m} \in \mathcal{N}^N_{\mathcal{X}}$, meaning that $m_x$ balls are of type $x$, for $x \in \mathcal{X}$. We now successively select (in a random way) $N$ balls from the urn, without replacing them. Denote by $Y_k$ the random variable in $\mathcal{X}$ that is the type of the $k$-th ball selected. The possible outcomes of this experiment, i.e., the possible values of the joint random variable $\mathbf{Y} = (Y_1, \dots, Y_N)$, are precisely the elements $\mathbf{z}$ of the permutation invariant atom $[\mathbf{m}]$, and random selection simply means that each of these outcomes is equally likely. Since there are $\nu(\mathbf{m})$ such possible outcomes, each of them has probability $1/\nu(\mathbf{m})$. Also, any $\mathbf{z}$ not in $[\mathbf{m}]$ has zero probability of being the outcome of our sampling procedure. This means that for any gamble $f$ on $\mathcal{X}^N$, its (precise) prevision (or expectation) is given by
$$\mathrm{MuHy}^N_{\mathcal{X}}(f \mid \mathbf{m}) := \frac{1}{\nu(\mathbf{m})} \sum_{\mathbf{z} \in [\mathbf{m}]} f(\mathbf{z}).$$
The linear prevision $\mathrm{MuHy}^N_{\mathcal{X}}(\cdot \mid \mathbf{m})$ is the one associated with a multiple hyper-geometric distribution (Johnson et al., 1997, Chapter 39), whence the notation. Indeed, for any $\mathbf{x} = (x_1, \dots, x_n)$ in $\mathcal{X}^n$, where $1 \le n \le N$, the probability of drawing a sequence of balls $\mathbf{x}$ from an urn with composition $\mathbf{m}$ is given by
$$\mathrm{MuHy}^N_{\mathcal{X}}(\{\mathbf{x}\} \times \mathcal{X}^{N-n} \mid \mathbf{m}) = \frac{\nu(\mathbf{m} - \boldsymbol{\mu})}{\nu(\mathbf{m})} = \frac{1}{\nu(\boldsymbol{\mu})} \prod_{x \in \mathcal{X}} \binom{m_x}{\mu_x} \Big/ \binom{N}{n},$$
where $\boldsymbol{\mu} = \mathbf{T}^n_{\mathcal{X}}(\mathbf{x})$.
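The prevision $\mathrm{MuHy}^N_{\mathcal{X}}(\cdot \mid \mathbf{m})$ and the probability of an initial segment can be verified numerically by enumerating the atom $[\mathbf{m}]$. The sketch below assumes a hypothetical urn with composition $\mathbf{m} = (2, 1, 1)$ over the values $0, 1, 2$; the helper names are illustrative and not taken from the paper.

```python
from fractions import Fraction
from itertools import permutations
from math import comb, factorial, prod

def nu(m):
    """Number of elements of the invariant atom [m]."""
    return factorial(sum(m)) // prod(factorial(k) for k in m)

def atom(m, values):
    """All distinct orders in which an urn with composition m can be emptied."""
    balls = [x for x, k in zip(values, m) for _ in range(k)]
    return sorted(set(permutations(balls)))

def muhy(f, m, values):
    """MuHy(f | m): the uniform average of the gamble f over the atom [m]."""
    zs = atom(m, values)
    return sum(Fraction(f(z)) for z in zs) / len(zs)

values = (0, 1, 2)
m = (2, 1, 1)          # hypothetical urn: two balls of type 0, one of type 1, one of type 2
prefix = (0, 2)        # the first n = 2 draws
mu = (1, 0, 1)         # count vector of the prefix
rest = tuple(a - b for a, b in zip(m, mu))

# Probability that the first two draws are exactly `prefix`.
lhs = muhy(lambda z: int(z[:2] == prefix), m, values)
# The closed form with binomial coefficients from the text.
rhs = Fraction(prod(comb(a, b) for a, b in zip(m, mu)), nu(mu) * comb(sum(m), len(prefix)))

assert lhs == Fraction(nu(rest), nu(m)) == rhs
```

Exact rational arithmetic (`Fraction`) is used so that the two closed forms can be compared with `==` rather than up to rounding.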
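This means that the probability of drawing without replacement any sample with count vector $\boldsymbol{\mu}$ is $\nu(\boldsymbol{\mu})$ times this probability [there are that many such samples], and is therefore given by
$$\frac{\nu(\mathbf{m} - \boldsymbol{\mu})\,\nu(\boldsymbol{\mu})}{\nu(\mathbf{m})} = \prod_{x \in \mathcal{X}} \binom{m_x}{\mu_x} \Big/ \binom{N}{n},$$
which indeed gives the mass function of the multiple hyper-geometric distribution. For any permutation $\pi$ of $\{1, \dots, N\}$
$$\mathrm{MuHy}^N_{\mathcal{X}}(\pi f \mid \mathbf{m}) = \frac{1}{\nu(\mathbf{m})} \sum_{\mathbf{z} \in [\mathbf{m}]} f(\pi\mathbf{z}) = \frac{1}{\nu(\mathbf{m})} \sum_{\pi^{-1}\mathbf{z} \in [\mathbf{m}]} f(\mathbf{z}) = \mathrm{MuHy}^N_{\mathcal{X}}(f \mid \mathbf{m}), \tag{5}$$
since $\pi^{-1}\mathbf{z} \in [\mathbf{m}]$ iff $\mathbf{z} \in [\mathbf{m}]$. This means that the linear prevision $\mathrm{MuHy}^N_{\mathcal{X}}(\cdot \mid \mathbf{m})$ is exchangeable. The following theorem establishes an even stronger result.

Theorem 2 (Representation theorem for finite sequences of exchangeable variables). Let $N \ge 1$ and let $\underline{P}^N_{\mathcal{X}}$ be a coherent exchangeable lower prevision on $\mathcal{L}(\mathcal{X}^N)$. Let $f$ be any gamble on $\mathcal{X}^N$. Then the following statements hold:

1. The gamble $\hat{f}$ on $\mathcal{X}^N$ given by $\hat{f} := \frac{1}{|\mathcal{P}_N|} \sum_{\pi \in \mathcal{P}_N} \pi f$ is permutation invariant, meaning that $\pi\hat{f} = \hat{f}$ for all $\pi \in \mathcal{P}_N$. It is therefore constant on the permutation invariant atoms of $\mathcal{X}^N$, and also given by
$$\hat{f} = \sum_{\mathbf{m} \in \mathcal{N}^N_{\mathcal{X}}} I_{[\mathbf{m}]}\, \mathrm{MuHy}^N_{\mathcal{X}}(f \mid \mathbf{m}). \tag{6}$$
2. $\underline{P}^N_{\mathcal{X}}(f - \hat{f}) = \underline{P}^N_{\mathcal{X}}(\hat{f} - f) = 0$, and therefore also $\underline{P}^N_{\mathcal{X}}(f) = \underline{P}^N_{\mathcal{X}}(\hat{f})$.
3. $\underline{P}^N_{\mathcal{X}}(f) = \underline{Q}^N_{\mathcal{X}}(\mathrm{MuHy}^N_{\mathcal{X}}(f \mid \cdot))$, where $\mathrm{MuHy}^N_{\mathcal{X}}(f \mid \cdot)$ is the gamble on $\mathcal{N}^N_{\mathcal{X}}$ that assumes the value $\mathrm{MuHy}^N_{\mathcal{X}}(f \mid \mathbf{m})$ in $\mathbf{m} \in \mathcal{N}^N_{\mathcal{X}}$.

Consequently, a coherent lower prevision on $\mathcal{L}(\mathcal{X}^N)$ is exchangeable if and only if it has the form $\underline{Q}(\mathrm{MuHy}^N_{\mathcal{X}}(\cdot \mid \cdot))$, where $\underline{Q}$ is some coherent lower prevision on $\mathcal{L}(\mathcal{N}^N_{\mathcal{X}})$.

Proof. The first statement is fairly immediate. We therefore turn at once to the second statement. Observe that $f - \hat{f} = \frac{1}{|\mathcal{P}_N|} \sum_{\pi \in \mathcal{P}_N} [f - \pi f]$.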
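Now use the coherence [super-additivity and non-negative homogeneity] and the exchangeability of the lower prevision $\underline{P}^N_{\mathcal{X}}$ to find that
$$\underline{P}^N_{\mathcal{X}}(f - \hat{f}) \ge \frac{1}{|\mathcal{P}_N|} \sum_{\pi \in \mathcal{P}_N} \underline{P}^N_{\mathcal{X}}(f - \pi f) = 0.$$
In a completely similar way, we get $\underline{P}^N_{\mathcal{X}}(\hat{f} - f) \ge 0$. Since it also follows from the coherence [super-additivity] of $\underline{P}^N_{\mathcal{X}}$ that
$$\underline{P}^N_{\mathcal{X}}(f - \hat{f}) + \underline{P}^N_{\mathcal{X}}(\hat{f} - f) \le \underline{P}^N_{\mathcal{X}}(0) = 0,$$
we find that indeed $\underline{P}^N_{\mathcal{X}}(f - \hat{f}) = \underline{P}^N_{\mathcal{X}}(\hat{f} - f) = 0$. Now let $g := f - \hat{f}$, so that $f = \hat{f} + g$ and $\hat{f} = f - g$, and use the coherence [super-additivity and accepting sure gains] of $\underline{P}^N_{\mathcal{X}}$ to infer that
$$\underline{P}^N_{\mathcal{X}}(f) \ge \underline{P}^N_{\mathcal{X}}(\hat{f}) + \underline{P}^N_{\mathcal{X}}(g) = \underline{P}^N_{\mathcal{X}}(\hat{f}) \ge \underline{P}^N_{\mathcal{X}}(f) + \underline{P}^N_{\mathcal{X}}(-g) = \underline{P}^N_{\mathcal{X}}(f),$$
whence indeed $\underline{P}^N_{\mathcal{X}}(f) = \underline{P}^N_{\mathcal{X}}(\hat{f})$. To prove the third statement, use $\underline{P}^N_{\mathcal{X}}(f) = \underline{P}^N_{\mathcal{X}}(\hat{f})$ together with Equations (4) and (6) to find that
$$\underline{P}^N_{\mathcal{X}}(f) = \underline{P}^N_{\mathcal{X}}(\hat{f}) = \underline{Q}^N_{\mathcal{X}}(\mathrm{MuHy}^N_{\mathcal{X}}(f \mid \cdot)).$$
These statements imply that any exchangeable coherent lower prevision is of the form $\underline{Q}(\mathrm{MuHy}^N_{\mathcal{X}}(\cdot \mid \cdot))$, where $\underline{Q}$ is some coherent lower prevision on $\mathcal{L}(\mathcal{N}^N_{\mathcal{X}})$. Conversely, if $\underline{Q}$ is any coherent lower prevision on $\mathcal{L}(\mathcal{N}^N_{\mathcal{X}})$, then $\underline{Q}(\mathrm{MuHy}^N_{\mathcal{X}}(\cdot \mid \cdot))$ is a coherent lower prevision on $\mathcal{L}(\mathcal{X}^N)$ that is exchangeable: simply observe that for any gamble $f$ on $\mathcal{X}^N$ and any $\pi \in \mathcal{P}_N$,
$$\underline{Q}(\mathrm{MuHy}^N_{\mathcal{X}}(f - \pi f \mid \cdot)) = \underline{Q}(\mathrm{MuHy}^N_{\mathcal{X}}(f \mid \cdot) - \mathrm{MuHy}^N_{\mathcal{X}}(\pi f \mid \cdot)) = \underline{Q}(0) = 0,$$
taking into account that each $\mathrm{MuHy}^N_{\mathcal{X}}(\cdot \mid \mathbf{m})$ is an exchangeable linear prevision [Equation (5)]. □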
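The three statements of Theorem 2 can be illustrated on binary sequences of length 3. In the sketch below, the count lower prevision is taken to be the lower envelope of two mass functions on the number of successes; this credal set, the gamble `f` and all function names are assumptions chosen only for the illustration.

```python
from fractions import Fraction
from itertools import permutations, product

N = 3
space = list(product((0, 1), repeat=N))        # all binary sequences of length N
perms = list(permutations(range(N)))           # the permutation group P_N

def hy(f, s):
    """Uniform average of f over the atom of sequences with s successes."""
    zs = [z for z in space if sum(z) == s]
    return sum(Fraction(f(z)) for z in zs) / len(zs)

def symmetrise(f):
    """f_hat: the average of the permuted gambles pi f over all pi in P_N."""
    def f_hat(z):
        return sum(Fraction(f(tuple(z[i] for i in p))) for p in perms) / len(perms)
    return f_hat

# Hypothetical credal set: two mass functions on the number of successes s = 0..3.
credal_set = [
    [Fraction(1, 8), Fraction(3, 8), Fraction(3, 8), Fraction(1, 8)],
    [Fraction(1, 2), Fraction(0), Fraction(0), Fraction(1, 2)],
]

def Q(h):
    """Lower envelope of the credal set: a coherent lower prevision on counts."""
    return min(sum(p[s] * h(s) for s in range(N + 1)) for p in credal_set)

def P(f):
    """The exchangeable lower prevision P(f) = Q(Hy(f | .))."""
    return Q(lambda s: hy(f, s))

f = lambda z: z[0] - 2 * z[1] * z[2]           # an arbitrary gamble on sequences
f_hat = symmetrise(f)

# Statement 1: f_hat is constant on atoms and equals the hyper-geometric average.
assert all(f_hat(z) == hy(f, sum(z)) for z in space)
# Statement 2: P(f) = P(f_hat), and P(f - pi f) = 0 for every permutation pi.
assert P(f) == P(f_hat)
assert all(P(lambda z, p=p: f(z) - f(tuple(z[i] for i in p))) == 0 for p in perms)
```

A lower envelope of linear previsions is used here because it is a standard way to obtain a coherent lower prevision; any other coherent `Q` on the counts would serve equally well.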
This theorem implies that any exchangeable coherent lower prevision on $\mathcal{X}^N$ can be associated with, or equivalently, that any collection of $N$ exchangeable random variables in $\mathcal{X}$ can be seen as the result of, $N$ random draws without replacement from an urn with $N$ balls whose types are characterised by the elements $x$ of $\mathcal{X}$, whose composition $\mathbf{m}$ is unknown, but for which the available information about the composition is modelled by a coherent lower prevision on $\mathcal{L}(\mathcal{N}^N_{\mathcal{X}})$. [fn. 8]

That exchangeable linear previsions can be interpreted in terms of sampling without replacement from an urn with unknown composition is of course well known, and essentially goes back to de Finetti's work on exchangeability; see (de Finetti, 1937) and (Cifarelli and Regazzini, 1996). Heath and Sudderth (1976) give a simple proof for variables that may assume two values. But we believe our proof [fn. 9] for the more general case of exchangeable coherent lower previsions and random variables that may assume more than two values is conceptually even simpler than Heath and Sudderth's proof, even though it is a special case of a much more general representation result (De Cooman and Miranda, 2007, Theorem 30). The essence of the present proof in the special case of linear previsions $P$ is captured wonderfully well by Zabell's (1992, Section 3.1) succinct statement: "Thus $P$ is exchangeable if and only if two sequences having the same frequency vector have the same probability."

Running example. We come back to the simple case considered before, where $\mathcal{X} = \mathbb{B}$. Any two elements $\mathbf{x}$ and $\mathbf{y}$ of $\mathbb{B}^N$ can be related by some permutation of the indices $\{1, \dots, N\}$ iff they have the same number of successes $s = T^N_1(\mathbf{x}) = T^N_1(\mathbf{y})$ (and of course, the same number of failures $f = N - s$).
We can identify the count space $\mathcal{N}^N_{\mathbb{B}} = \{(s, f) : s + f = N\}$ with the set $\{s : s = 0, \dots, N\}$, and count vectors $\mathbf{m} = (s, N - s)$ with the corresponding number of successes $s$, which is what we shall do from now on. The $2^N$ elements of $\mathbb{B}^N$ are divided into $N + 1$ invariant atoms $[s]$ of elements with the same number of successes $s$, each of which has
$$\nu(s) = \binom{N}{s} = \frac{N!}{s!\,(N - s)!}$$
elements. We have depicted the situation for $N = 3$ in Figure 1.

Figure 1. The four invariant atoms $[s]$ in the space $\mathcal{N}^3_{\mathbb{B}}$, characterised by the number of successes $s$:
  s = 0: (0,0,0)
  s = 1: (1,0,0), (0,1,0), (0,0,1)
  s = 2: (1,1,0), (1,0,1), (0,1,1)
  s = 3: (1,1,1)

[fn. 8] When $\underline{P}^N_{\mathcal{X}}$, and therefore also $\underline{Q}^N_{\mathcal{X}}$, is a linear prevision, i.e., a precise probability model, this interpretation follows from the Theorem of Total Probability, by interpreting the $\mathrm{MuHy}^N_{\mathcal{X}}(\cdot \mid \mathbf{m})$ as conditional previsions, and $Q^N_{\mathcal{X}}$ as a marginal. For imprecise models $\underline{P}^N_{\mathcal{X}}$ and $\underline{Q}^N_{\mathcal{X}}$, the validity of this interpretation follows by analogous reasoning, using Walley's Marginal Extension Theorem; see Walley (1991, Section 6.7) and Miranda and De Cooman (2006).

[fn. 9] Walley (1991, Chapter 9) also mentions this result for exchangeable coherent lower previsions.

Exchangeability forces each of the elements within an invariant atom $[s]$ to be 'equally likely'. So each $[s]$ is to be considered as a 'lump', within which probability mass is distributed uniformly. The only freedom exchangeability leaves us with lies in assigning probabilities to the lumps $[s]$. This is the essence of Theorem 2, which tells us that any exchangeable coherent lower prevision $\underline{P}^N_{\mathbb{B}}$ on $\mathcal{L}(\mathbb{B}^N)$ can be seen as the composition of a coherent lower prevision $\underline{Q}^N_{\mathbb{B}}$ on $\mathcal{L}(\{0, 1, \dots, N\})$, representing beliefs about the number of successes $s$, and the hyper-geometric distributions on $[s]$, which guarantee that the probability is distributed uniformly over each of the $\nu(s) = \binom{N}{s}$ elements of $[s]$: for any gamble $f$ on $\mathbb{B}^N$,
$$\mathrm{Hy}^N(f \mid s) := \mathrm{MuHy}^N_{\mathbb{B}}(f \mid s, N - s) = \frac{1}{\nu(s)} \sum_{\mathbf{x} \in [s]} f(\mathbf{x}).$$
♦

For an exchangeable random variable $\mathbf{X} = (X_1, \dots, X_N)$, with (exchangeable) distribution $\underline{P}^N_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^N)$, we have seen that we can completely characterise this distribution by the corresponding distribution of the count vectors $\underline{Q}^N_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{N}^N_{\mathcal{X}})$.

We have also seen that any selection of $1 \le n \le N$ random variables from amongst the $X_1, \dots, X_N$ will be exchangeable too, and that their distribution is given by $\underline{P}^n_{\mathcal{X}}$, which is the $\mathcal{X}^n$-marginal of $\underline{P}^N_{\mathcal{X}}$. There is moreover an interesting relation between the distributions $\underline{Q}^N_{\mathcal{X}}$ and $\underline{Q}^n_{\mathcal{X}}$ of the corresponding count vectors, which we shall derive in the next section (Equation (9)). On the other hand, it is well known (see for instance Diaconis and Freedman (1980); we shall come back to this in Section 7) that if we have an exchangeable $N$-tuple $(X_1, \dots, X_N)$, it is not always possible to extend it to an exchangeable $(N+1)$-tuple. In the next section, we investigate what happens when we consider exchangeable tuples of arbitrary length.

4. EXCHANGEABLE SEQUENCES

4.1. Definitions. We now generalise the definition of exchangeability from finite to countable sequences of random variables. Consider a countable sequence $X_1, \dots, X_n, \dots$ of random variables taking values in the same non-empty set $\mathcal{X}$. This sequence is called exchangeable if any finite collection of random variables taken from this sequence is exchangeable. This is clearly equivalent to requiring that the random variables $X_1, \dots, X_n$ should be exchangeable for all $n \ge 1$.
We can also consider the exchangeable sequence as a single random variable $\mathbf{X}$ assuming values in the set $\mathcal{X}^{\mathbb{N}}$, where $\mathbb{N}$ is the set of the natural numbers (positive integers, without zero). Its possible values $\mathbf{x}$ are sequences $x_1, \dots, x_n, \dots$ of elements of $\mathcal{X}$, or in other words, maps from $\mathbb{N}$ to $\mathcal{X}$. We can model the available information about the value that $\mathbf{X}$ assumes in $\mathcal{X}^{\mathbb{N}}$ by a coherent lower prevision $\underline{P}^{\mathbb{N}}_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^{\mathbb{N}})$, called the distribution of the exchangeable random sequence $\mathbf{X}$.

The random sequence $\mathbf{X}$, or its distribution $\underline{P}^{\mathbb{N}}_{\mathcal{X}}$, is clearly exchangeable if and only if all its $\mathcal{X}^n$-marginals $\underline{P}^n_{\mathcal{X}}$ are exchangeable for $n \ge 1$. These marginals $\underline{P}^n_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^n)$ are defined as follows: for any gamble $f$ on $\mathcal{X}^n$, $\underline{P}^n_{\mathcal{X}}(f) := \underline{P}^{\mathbb{N}}_{\mathcal{X}}(\tilde{f})$, where $\tilde{f}$ is the cylindrical extension of $f$ to $\mathcal{X}^{\mathbb{N}}$, defined by $\tilde{f}(\mathbf{x}) := f(x_1, \dots, x_n)$ for all $\mathbf{x} = (x_1, \dots, x_n, x_{n+1}, \dots)$ in $\mathcal{X}^{\mathbb{N}}$. In addition, the family of exchangeable coherent lower previsions $\underline{P}^n_{\mathcal{X}}$, $n \ge 1$, satisfies the following 'time consistency' requirement:
$$\underline{P}^n_{\mathcal{X}}(f) = \underline{P}^{n+k}_{\mathcal{X}}(\tilde{f}), \tag{7}$$
for all $n \ge 1$, $k \ge 0$, and all gambles $f$ on $\mathcal{X}^n$, where now $\tilde{f}$ denotes the cylindrical extension of $f$ to $\mathcal{X}^{n+k}$: $\underline{P}^n_{\mathcal{X}}$ should be the $\mathcal{X}^n$-marginal of any $\underline{P}^{n+k}_{\mathcal{X}}$.

It follows at once that any finite collection of $n \ge 1$ random variables taken from such an exchangeable sequence has the same distribution as the first $n$ variables $X_1, \dots, X_n$, which is the exchangeable coherent lower prevision $\underline{P}^n_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^n)$.

Conversely, suppose we have a collection of exchangeable coherent lower previsions $\underline{P}^n_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^n)$, $n \ge 1$, that satisfy the time consistency requirement (7). Then any coherent lower prevision $\underline{P}^{\mathbb{N}}_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^{\mathbb{N}})$ that has $\mathcal{X}^n$-marginals $\underline{P}^n_{\mathcal{X}}$ is exchangeable.
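The smallest, or most conservative, such (exchangeable) coherent lower prevision is given by
$$\underline{E}^{\mathbb{N}}_{\mathcal{X}}(f) := \sup_{n \in \mathbb{N}} \underline{P}^n_{\mathcal{X}}(\underline{\mathrm{proj}}_n(f)) = \lim_{n \to \infty} \underline{P}^n_{\mathcal{X}}(\underline{\mathrm{proj}}_n(f)),$$
where $f$ is any gamble on $\mathcal{X}^{\mathbb{N}}$, and its lower projection $\underline{\mathrm{proj}}_n(f)$ on $\mathcal{X}^n$ is the gamble on $\mathcal{X}^n$ that is defined by
$$\underline{\mathrm{proj}}_n(f)(\mathbf{x}) := \inf_{z_k = x_k,\ k = 1, \dots, n} f(\mathbf{z})$$
for all $\mathbf{x} \in \mathcal{X}^n$, i.e., the lower projection of $f$ on $\mathbf{x}$ is the infimum of $f$ over the elements of $\mathcal{X}^{\mathbb{N}}$ whose projection on $\mathcal{X}^n$ is $\mathbf{x}$. See (De Cooman and Miranda, 2006, Section 5) for more details.

4.2. Time consistency of the count distributions. It will be of crucial interest for what follows to find out what the consequences are of the time consistency requirement (7) on the marginals $\underline{P}^n_{\mathcal{X}}$ for the corresponding family $\underline{Q}^n_{\mathcal{X}}$, $n \ge 1$, of distributions of the count vectors $\mathbf{T}^n_{\mathcal{X}}(X_1, \dots, X_n)$. Consider therefore $n \ge 1$, $k \ge 0$ and any gamble $h$ on $\mathcal{N}^n_{\mathcal{X}}$. Let $f := h \circ \mathbf{T}^n_{\mathcal{X}}$, then
$$\underline{Q}^n_{\mathcal{X}}(h) = \underline{P}^n_{\mathcal{X}}(f) = \underline{P}^{n+k}_{\mathcal{X}}(\tilde{f}) = \underline{Q}^{n+k}_{\mathcal{X}}(\mathrm{MuHy}^{n+k}_{\mathcal{X}}(\tilde{f} \mid \cdot)),$$
where the first equality follows from Equation (4), the second from Equation (7), and the last from Theorem 2. Now for any $\mathbf{m}'$ in $\mathcal{N}^{n+k}_{\mathcal{X}}$, and any $\mathbf{z}' = (\mathbf{z}, \mathbf{y})$ in $\mathcal{X}^{n+k} = \mathcal{X}^n \times \mathcal{X}^k$, we have that $\mathbf{T}^{n+k}_{\mathcal{X}}(\mathbf{z}') = \mathbf{T}^n_{\mathcal{X}}(\mathbf{z}) + \mathbf{T}^k_{\mathcal{X}}(\mathbf{y})$ and therefore
$$\begin{aligned}
\mathrm{MuHy}^{n+k}_{\mathcal{X}}(\tilde{f} \mid \mathbf{m}')
&= \frac{1}{\nu(\mathbf{m}')} \sum_{\mathbf{z}' \in [\mathbf{m}']} \tilde{f}(\mathbf{z}')
 = \frac{1}{\nu(\mathbf{m}')} \sum_{(\mathbf{z}, \mathbf{y}) \in [\mathbf{m}']} f(\mathbf{z}) \\
&= \frac{1}{\nu(\mathbf{m}')} \sum_{\substack{\mathbf{m} \in \mathcal{N}^n_{\mathcal{X}} \\ \mathbf{m} \le \mathbf{m}'}} \ \sum_{\mathbf{y} \in [\mathbf{m}' - \mathbf{m}]} \ \sum_{\mathbf{z} \in [\mathbf{m}]} f(\mathbf{z}) \\
&= \frac{1}{\nu(\mathbf{m}')} \sum_{\substack{\mathbf{m} \in \mathcal{N}^n_{\mathcal{X}} \\ \mathbf{m} \le \mathbf{m}'}} \nu(\mathbf{m}' - \mathbf{m})\, \nu(\mathbf{m})\, \mathrm{MuHy}^n_{\mathcal{X}}(f \mid \mathbf{m}) \\
&= \sum_{\mathbf{m} \in \mathcal{N}^n_{\mathcal{X}}} \frac{\nu(\mathbf{m}' - \mathbf{m})\, \nu(\mathbf{m})}{\nu(\mathbf{m}')}\, h(\mathbf{m}),
\end{aligned} \tag{8}$$
since $\mathrm{MuHy}^n_{\mathcal{X}}(f \mid \mathbf{m}) = h(\mathbf{m})$, and $\nu(\mathbf{m}' - \mathbf{m})$ is zero unless $\mathbf{m} \le \mathbf{m}'$. So we see that time consistency is equivalent to
$$\underline{Q}^n_{\mathcal{X}}(h) = \underline{Q}^{n+k}_{\mathcal{X}}\Big( \sum_{\mathbf{m} \in \mathcal{N}^n_{\mathcal{X}}} \frac{\nu(\cdot - \mathbf{m})\, \nu(\mathbf{m})}{\nu(\cdot)}\, h(\mathbf{m}) \Big) \tag{9}$$
for all $n \ge 1$, $k \ge 0$ and $h \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$.

5.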
A REPRESENTATION THEOREM FOR EXCHANGEABLE SEQUENCES

De Finetti (1937, 1975) has proven a representation result for exchangeable sequences with linear previsions that generalises Theorem 2, and where multinomial distributions take over the rôle that the multiple hyper-geometric ones play for finite collections of exchangeable variables. One simple and intuitive way (see also de Finetti, 1975, p. 218) to understand why the representation result can be thus extended from finite collections to countable sequences is based on the fact that the multinomial distribution can be seen as a limit of multiple hyper-geometric ones (Johnson et al., 1997, Chapter 39). This is also the central idea behind Heath and Sudderth's (1976) simple proof of this representation result in the case of variables that may only assume two possible values.

However, there is another, arguably even simpler, approach to proving the same results, which we present here. It also works for exchangeability in the context of coherent lower previsions. And as we shall have occasion to explain further on, it has the additional advantage of clearly indicating what the 'representation' is, and where it is uniquely defined.

We make a start at proving our representation theorem by taking a look at multinomial processes.

5.1. Multinomial processes are exchangeable. Consider a sequence of random variables $Y_1, \dots, Y_n, \dots$ that are mutually independent, and such that each random variable $Y_n$ has the same probability mass function $\boldsymbol{\theta}$: the probability that $Y_n = x$ is $\theta_x$ for $x \in \mathcal{X}$. [fn. 10] Observe that $\boldsymbol{\theta}$ is an element of the $\mathcal{X}$-simplex
$$\Sigma_{\mathcal{X}} = \Big\{ \boldsymbol{\theta} \in \mathbb{R}^{\mathcal{X}} : (\forall x \in \mathcal{X})(\theta_x \ge 0) \text{ and } \sum_{x \in \mathcal{X}} \theta_x = 1 \Big\}.$$
Then for any $n \ge 1$ and any $\mathbf{z}$ in $\mathcal{X}^n$ the probability that $(Y_1, \dots, Y_n)$ is equal to $\mathbf{z}$ is given by $\prod_{x \in \mathcal{X}} \theta_x^{T^n_x(\mathbf{z})}$, which yields the multinomial mass function (Johnson et al., 1997, Chapter 35). As a result, we have for any gamble $f$ on $\mathcal{X}^n$ that its corresponding (multinomial) prevision (expectation) is given by
$$\begin{aligned}
\mathrm{Mn}^n_{\mathcal{X}}(f \mid \boldsymbol{\theta})
&= \sum_{\mathbf{z} \in \mathcal{X}^n} f(\mathbf{z}) \prod_{x \in \mathcal{X}} \theta_x^{T^n_x(\mathbf{z})}
 = \sum_{\mathbf{m} \in \mathcal{N}^n_{\mathcal{X}}} \sum_{\mathbf{z} \in [\mathbf{m}]} f(\mathbf{z}) \prod_{x \in \mathcal{X}} \theta_x^{m_x} \\
&= \sum_{\mathbf{m} \in \mathcal{N}^n_{\mathcal{X}}} \mathrm{MuHy}^n_{\mathcal{X}}(f \mid \mathbf{m})\, \nu(\mathbf{m}) \prod_{x \in \mathcal{X}} \theta_x^{m_x}
 = \mathrm{CoMn}^n_{\mathcal{X}}(\mathrm{MuHy}^n_{\mathcal{X}}(f \mid \cdot) \mid \boldsymbol{\theta}),
\end{aligned} \tag{10}$$
where we defined the (count multinomial) linear prevision $\mathrm{CoMn}^n_{\mathcal{X}}(\cdot \mid \boldsymbol{\theta})$ on $\mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$ by
$$\mathrm{CoMn}^n_{\mathcal{X}}(g \mid \boldsymbol{\theta}) = \sum_{\mathbf{m} \in \mathcal{N}^n_{\mathcal{X}}} g(\mathbf{m})\, \nu(\mathbf{m}) \prod_{x \in \mathcal{X}} \theta_x^{m_x}, \tag{11}$$
where $g$ is any gamble on $\mathcal{N}^n_{\mathcal{X}}$. The corresponding probability mass for any count vector $\mathbf{m}$, namely [fn. 11]
$$\mathrm{CoMn}^n_{\mathcal{X}}(\{\mathbf{m}\} \mid \boldsymbol{\theta}) = \nu(\mathbf{m}) \prod_{x \in \mathcal{X}} \theta_x^{m_x} =: B_{\mathbf{m}}(\boldsymbol{\theta}), \tag{12}$$
is the probability of observing some value $\mathbf{z}$ for $(Y_1, \dots, Y_n)$ whose count vector is $\mathbf{m}$. The polynomial function $B_{\mathbf{m}}$ on the $\mathcal{X}$-simplex is called a (multivariate) Bernstein (basis) polynomial. We have listed a number of very interesting properties for these special polynomials in the Appendix. One important fact, which we shall need quite soon, is that the set $\{B_{\mathbf{m}} : \mathbf{m} \in \mathcal{N}^n_{\mathcal{X}}\}$ of all Bernstein (basis) polynomials of fixed degree $n$ forms a basis for the linear space of all (multivariate) polynomials on $\Sigma_{\mathcal{X}}$ whose degree is at most $n$; hence their name. If we have a polynomial $p$ of degree $m$, this means that for any $n \ge m$, $p$ has a unique (Bernstein) decomposition $b^n_p \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$ such that
$$p = \sum_{\mathbf{m} \in \mathcal{N}^n_{\mathcal{X}}} b^n_p(\mathbf{m})\, B_{\mathbf{m}}.$$
If we combine this with Equations (11) and (12), we find that $b^n_p$ is the unique gamble on $\mathcal{N}^n_{\mathcal{X}}$ such that $\mathrm{CoMn}^n_{\mathcal{X}}(b^n_p \mid \cdot) = p$.

We deduce from Equation (10) and Theorem 2 that the linear prevision $\mathrm{Mn}^n_{\mathcal{X}}(\cdot \mid \boldsymbol{\theta})$ on $\mathcal{L}(\mathcal{X}^n)$, the distribution of $(Y_1, \dots, Y_n)$, is exchangeable, and that $\mathrm{CoMn}^n_{\mathcal{X}}(\cdot \mid \boldsymbol{\theta})$ is the corresponding distribution for the corresponding count vectors $\mathbf{T}^n_{\mathcal{X}}(Y_1, \dots, Y_n)$. Therefore the sequence of IID random variables $Y_1, \dots, Y_n, \dots$ is exchangeable.

[fn. 10] In other words, the random variables are IID.

[fn. 11] We assume implicitly that $a^0 = 1$ for all $a \ge 0$.

Running example. Let us go back to our example, where $\mathcal{X} = \mathbb{B}$. Here the $\mathbb{B}$-simplex $\Sigma_{\mathbb{B}} = \{(\theta, 1 - \theta) : \theta \in [0, 1]\}$ can be identified with the unit interval, and every element $\boldsymbol{\theta} = (\theta, 1 - \theta)$ can be identified with the probability $\theta$ of a success. The count multinomial distribution $\mathrm{CoMn}^n_{\mathbb{B}}(\cdot \mid \boldsymbol{\theta})$ now of course turns into the (count) binomial distribution $\mathrm{CoBi}^n(\cdot \mid \theta)$ on $\mathcal{L}(\{0, \dots, n\})$, given by
$$\mathrm{CoBi}^n(g \mid \theta) := \sum_{s=0}^{n} g(s) \binom{n}{s} \theta^s (1 - \theta)^{n-s} = \sum_{s=0}^{n} g(s)\, B^n_s(\theta) \tag{13}$$
for any gamble $g$ on the set $\{0, 1, \dots, n\}$ of possible values for the number of successes $s$. In this expression, the $B^n_s(\theta) := \binom{n}{s} \theta^s (1 - \theta)^{n-s}$ are the $n + 1$ (univariate) Bernstein basis polynomials of degree $n$ (Lorentz, 1986; Prautzsch et al., 2002). For fixed $n$, they add up to one and are linearly independent, and they form a basis for the linear space of all polynomials on $[0, 1]$ of degree at most $n$. ♦

5.2. A representation theorem. Consider the following linear subspace of $\mathcal{L}(\Sigma_{\mathcal{X}})$:
$$\mathcal{V}(\Sigma_{\mathcal{X}}) := \{\mathrm{CoMn}^n_{\mathcal{X}}(g \mid \cdot) : n \ge 1,\ g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}})\} = \{\mathrm{Mn}^n_{\mathcal{X}}(f \mid \cdot) : n \ge 1,\ f \in \mathcal{L}(\mathcal{X}^n)\},$$
each of whose elements is a polynomial function on the $\mathcal{X}$-simplex:
$$\mathrm{CoMn}^n_{\mathcal{X}}(g \mid \boldsymbol{\theta}) = \sum_{\mathbf{m} \in \mathcal{N}^n_{\mathcal{X}}} g(\mathbf{m})\, \nu(\mathbf{m}) \prod_{x \in \mathcal{X}} \theta_x^{m_x} = \sum_{\mathbf{m} \in \mathcal{N}^n_{\mathcal{X}}} g(\mathbf{m})\, B_{\mathbf{m}}(\boldsymbol{\theta}),$$
and is actually a linear combination of Bernstein basis polynomials $B_{\mathbf{m}}$ with coefficients $g(\mathbf{m})$.
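Equation (10) and the fact that the Bernstein basis polynomials of a fixed degree add up to one can be checked exactly on a small example. The sketch below assumes three possible values coded as $0, 1, 2$ and a hypothetical mass function $\boldsymbol{\theta} = (1/2, 1/3, 1/6)$; the function names are illustrative only.

```python
from fractions import Fraction
from itertools import product
from math import factorial, prod

def count_vectors(n, k):
    """All count vectors with k non-negative components that add up to n."""
    if k == 1:
        return [(n,)]
    return [(i,) + rest for i in range(n + 1) for rest in count_vectors(n - i, k - 1)]

def bernstein(m, theta):
    """B_m(theta) = nu(m) * prod_x theta_x ** m_x."""
    nu = factorial(sum(m)) // prod(factorial(k) for k in m)
    return nu * prod(t ** k for t, k in zip(theta, m))

def mn(f, n, theta):
    """Mn(f | theta): expectation of f(Y_1, ..., Y_n) for IID Y_k with mass function theta."""
    return sum(f(z) * prod(theta[x] for x in z)
               for z in product(range(len(theta)), repeat=n))

def muhy(f, m):
    """MuHy(f | m): uniform average of f over the sequences with count vector m."""
    k, n = len(m), sum(m)
    zs = [z for z in product(range(k), repeat=n)
          if all(z.count(x) == m[x] for x in range(k))]
    return sum(Fraction(f(z)) for z in zs) / len(zs)

theta = (Fraction(1, 2), Fraction(1, 3), Fraction(1, 6))   # hypothetical mass function
n = 3
f = lambda z: z[0] * z[1] + z[2]                          # an arbitrary gamble

total_mass = sum(bernstein(m, theta) for m in count_vectors(n, 3))
via_counts = sum(muhy(f, m) * bernstein(m, theta) for m in count_vectors(n, 3))

assert total_mass == 1                 # the B_m of degree n add up to one
assert mn(f, n, theta) == via_counts   # Equation (10)
```

The second assertion is the decomposition of a multinomial expectation into a count multinomial expectation of hyper-geometric averages.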
So $\mathcal{V}(\Sigma_{\mathcal{X}})$ is the linear space spanned by all Bernstein basis polynomials, and is therefore the set of all polynomials on the $\mathcal{X}$-simplex $\Sigma_{\mathcal{X}}$.

Now if $\underline{R}_{\mathcal{X}}$ is any coherent lower prevision on $\mathcal{L}(\Sigma_{\mathcal{X}})$, then it is easy to see that the family of coherent lower previsions $\underline{P}^n_{\mathcal{X}}$, $n \ge 1$, defined by
$$\underline{P}^n_{\mathcal{X}}(f) = \underline{R}_{\mathcal{X}}(\mathrm{Mn}^n_{\mathcal{X}}(f \mid \cdot)), \quad f \in \mathcal{L}(\mathcal{X}^n), \tag{14}$$
is still exchangeable and time consistent, and the corresponding count distributions are given by
$$\underline{Q}^n_{\mathcal{X}}(g) = \underline{R}_{\mathcal{X}}(\mathrm{CoMn}^n_{\mathcal{X}}(g \mid \cdot)), \quad g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}}). \tag{15}$$
Here, we are going to show that a converse result also holds: for any time consistent family of exchangeable coherent lower previsions $\underline{P}^n_{\mathcal{X}}$, $n \ge 1$, there is a coherent lower prevision $\underline{R}_{\mathcal{X}}$ on $\mathcal{V}(\Sigma_{\mathcal{X}})$ such that Equation (14), or its reformulation for counts (15), holds. We shall call such an $\underline{R}_{\mathcal{X}}$ a representation, or representing coherent lower prevision, for the family $\underline{P}^n_{\mathcal{X}}$. Of course, any representing $\underline{R}_{\mathcal{X}}$, if it exists, is uniquely determined on $\mathcal{V}(\Sigma_{\mathcal{X}})$.

So consider a family of coherent lower previsions $\underline{Q}^n_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$ that are time consistent, meaning that Equation (9) is satisfied. It suffices to find an $\underline{R}_{\mathcal{X}}$ such that (15) holds, because the corresponding exchangeable lower previsions $\underline{P}^n_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^n)$ are then uniquely determined by Theorem 2, and automatically satisfy the condition (14).

Our proposal is to define the functional $\underline{R}_{\mathcal{X}}$ on the set $\mathcal{V}(\Sigma_{\mathcal{X}})$ as follows: consider any element $p$ of $\mathcal{V}(\Sigma_{\mathcal{X}})$. Then, by definition, there is some $n \ge 1$ and a corresponding unique $b^n_p \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$ such that $p = \mathrm{CoMn}^n_{\mathcal{X}}(b^n_p \mid \cdot)$. We then let $\underline{R}_{\mathcal{X}}(p) := \underline{Q}^n_{\mathcal{X}}(b^n_p)$. Of course, the first thing to check is whether this definition is consistent: any polynomial $p$ of degree $m$ has unique representations $b^n_p$ for all $n \ge m$, which means that we have to check that no inconsistencies can arise in the sense that $\underline{Q}^{n_1}_{\mathcal{X}}(b^{n_1}_p) \ne \underline{Q}^{n_2}_{\mathcal{X}}(b^{n_2}_p)$ for some $n_1, n_2 \ge m$.
It turns out that this is guaranteed by the time consistency of the $\underline{P}^n_{\mathcal{X}}$, or that of the corresponding $\underline{Q}^n_{\mathcal{X}}$, as is made apparent by the proof of the following lemma.

Lemma 3. Consider a polynomial $p$ of degree $m$, and let $n_1, n_2 \ge m$. Then $\underline{Q}^{n_1}_{\mathcal{X}}(b^{n_1}_p) = \underline{Q}^{n_2}_{\mathcal{X}}(b^{n_2}_p)$.

Proof. We may assume without loss of generality that $n_2 \ge n_1$. The Bernstein decompositions $b^{n_1}_p$ and $b^{n_2}_p$ are then related by Zhou's formula [see Equation (22) in the Appendix]:
$$b^{n_2}_p(\mathbf{m}_2) = \sum_{\mathbf{m}_1 \in \mathcal{N}^{n_1}_{\mathcal{X}}} \frac{\nu(\mathbf{m}_2 - \mathbf{m}_1)\, \nu(\mathbf{m}_1)}{\nu(\mathbf{m}_2)}\, b^{n_1}_p(\mathbf{m}_1), \quad \mathbf{m}_2 \in \mathcal{N}^{n_2}_{\mathcal{X}}.$$
Consequently, by the time consistency requirement (9), we indeed get that $\underline{Q}^{n_2}_{\mathcal{X}}(b^{n_2}_p) = \underline{Q}^{n_1}_{\mathcal{X}}(b^{n_1}_p)$. □

We also have to check whether the functional $\underline{R}_{\mathcal{X}}$ thus defined on the linear space $\mathcal{V}(\Sigma_{\mathcal{X}})$ is a coherent lower prevision. This is established in the following lemma.

Lemma 4. $\underline{R}_{\mathcal{X}}$ is a coherent lower prevision on the linear space $\mathcal{V}(\Sigma_{\mathcal{X}})$.

Proof. We show that $\underline{R}_{\mathcal{X}}$ satisfies the necessary and sufficient conditions (P1)–(P3) for coherence of a lower prevision on a linear space.

We first prove that (P1) is satisfied. Consider any $p \in \mathcal{V}(\Sigma_{\mathcal{X}})$. Let $m$ be the degree of $p$. We must show that $\underline{R}_{\mathcal{X}}(p) \ge \min p$. We find that $\underline{R}_{\mathcal{X}}(p) = \underline{Q}^n_{\mathcal{X}}(b^n_p) \ge \min b^n_p$ for all $n \ge m$, because of the coherence [accepting sure gains] of the count lower previsions $\underline{Q}^n_{\mathcal{X}}$. But Proposition 8 in the Appendix tells us that $\min b^n_p \uparrow \min p$, whence indeed $\underline{R}_{\mathcal{X}}(p) \ge \min p$.

Next, consider any $p$ in $\mathcal{V}(\Sigma_{\mathcal{X}})$ and any real $\lambda \ge 0$. Consider any $n$ that is not smaller than the degree of $p$. Since obviously $b^n_{\lambda p} = \lambda b^n_p$, we get
$$\underline{R}_{\mathcal{X}}(\lambda p) = \underline{Q}^n_{\mathcal{X}}(b^n_{\lambda p}) = \underline{Q}^n_{\mathcal{X}}(\lambda b^n_p) = \lambda\, \underline{Q}^n_{\mathcal{X}}(b^n_p) = \lambda\, \underline{R}_{\mathcal{X}}(p),$$
where the third equality follows from the coherence [non-negative homogeneity] of the count lower prevision $\underline{Q}^n_{\mathcal{X}}$.
This tells us that the lower prevision $\underline{R}_{\mathcal{X}}$ satisfies the non-negative homogeneity requirement (P2).

Finally, consider $p$ and $q$ in $\mathcal{V}(\Sigma_{\mathcal{X}})$, and any $n$ that is not smaller than the maximum of the degrees of $p$ and $q$. Since obviously $b^n_{p+q} = b^n_p + b^n_q$, we get
$$\underline{R}_{\mathcal{X}}(p + q) = \underline{Q}^n_{\mathcal{X}}(b^n_{p+q}) = \underline{Q}^n_{\mathcal{X}}(b^n_p + b^n_q) \ge \underline{Q}^n_{\mathcal{X}}(b^n_p) + \underline{Q}^n_{\mathcal{X}}(b^n_q) = \underline{R}_{\mathcal{X}}(p) + \underline{R}_{\mathcal{X}}(q),$$
where the inequality follows from the coherence [super-additivity] of the count lower prevision $\underline{Q}^n_{\mathcal{X}}$. This tells us that the lower prevision $\underline{R}_{\mathcal{X}}$ also satisfies the super-additivity requirement (P3), and as a consequence it is coherent. □

We can summarise the argument above as follows.

Theorem 5 (Representation theorem for exchangeable sequences). Given a time consistent family of exchangeable coherent lower previsions $\underline{P}^n_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^n)$, $n \ge 1$, there is a unique coherent lower prevision $\underline{R}_{\mathcal{X}}$ on the linear space $\mathcal{V}(\Sigma_{\mathcal{X}})$ of all polynomial gambles on the $\mathcal{X}$-simplex, such that for all $n \ge 1$, all $f \in \mathcal{L}(\mathcal{X}^n)$ and all $g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$:
$$\underline{P}^n_{\mathcal{X}}(f) = \underline{R}_{\mathcal{X}}(\mathrm{Mn}^n_{\mathcal{X}}(f \mid \cdot)) \quad \text{and} \quad \underline{Q}^n_{\mathcal{X}}(g) = \underline{R}_{\mathcal{X}}(\mathrm{CoMn}^n_{\mathcal{X}}(g \mid \cdot)). \tag{16}$$

Hence, the belief model governing any countable exchangeable sequence in $\mathcal{X}$ can be completely characterised by a coherent lower prevision on the linear space of polynomial gambles on $\Sigma_{\mathcal{X}}$.

In the particular case where we have a time consistent family of exchangeable linear previsions $P^n_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^n)$, $n \ge 1$, $\underline{R}_{\mathcal{X}}$ will be a linear prevision $R_{\mathcal{X}}$ on the linear space $\mathcal{V}(\Sigma_{\mathcal{X}})$ of all polynomial gambles on the $\mathcal{X}$-simplex. As such, it will be characterised by its values $R_{\mathcal{X}}(B_{\mathbf{m}})$ on the Bernstein basis polynomials $B_{\mathbf{m}}$, $\mathbf{m} \in \mathcal{N}^n_{\mathcal{X}}$, $n \ge 1$, or on any other basis of $\mathcal{V}(\Sigma_{\mathcal{X}})$.
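The consistency argument behind Lemma 3 rests on the degree-raising formula for Bernstein coefficients. The sketch below specialises the formula displayed in the proof of Lemma 3 to the binary case, where count vectors reduce to the number of successes $s$ and $\nu$ becomes a binomial coefficient; the coefficient list `b2` is a hypothetical example and the function names are assumptions.

```python
from fractions import Fraction
from math import comb

def elevate(b, n2):
    """Binary case of the degree-raising formula: coefficients b^{n2} from b^{n1}."""
    n1 = len(b) - 1
    k = n2 - n1
    out = []
    for s2 in range(n2 + 1):
        total = Fraction(0)
        # Only terms with 0 <= s2 - s1 <= k contribute (nu vanishes otherwise).
        for s1 in range(max(0, s2 - k), min(n1, s2) + 1):
            weight = Fraction(comb(k, s2 - s1) * comb(n1, s1), comb(n2, s2))
            total += weight * b[s1]
        out.append(total)
    return out

def evaluate(b, theta):
    """p(theta) = sum_s b(s) * B^n_s(theta) with n = len(b) - 1."""
    n = len(b) - 1
    return sum(b[s] * comb(n, s) * theta ** s * (1 - theta) ** (n - s)
               for s in range(n + 1))

b2 = [Fraction(1), Fraction(-2), Fraction(3)]   # hypothetical degree-2 Bernstein coefficients
b5 = elevate(b2, 5)
grid = [Fraction(i, 7) for i in range(8)]

# Both coefficient vectors describe the same polynomial on [0, 1].
assert all(evaluate(b2, t) == evaluate(b5, t) for t in grid)
# Each elevated coefficient is a convex combination of the original ones.
assert min(b2) <= min(b5) and max(b5) <= max(b2)
```

Because the weights in `elevate` form a hyper-geometric mass function, the minimum of the coefficients cannot decrease when the degree is raised, in line with the monotone behaviour of $\min b^n_p$ used in the proof of Lemma 4.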
It is a consequence of coherence that $\underline{R}_{\mathcal{X}}$ is also uniquely determined on the set $\mathcal{C}(\Sigma_{\mathcal{X}})$ of all continuous gambles on the $\mathcal{X}$-simplex $\Sigma_{\mathcal{X}}$: by the Stone-Weierstraß theorem, any such gamble is the uniform limit of some sequence of polynomial gambles, and coherence implies that the lower prevision of a uniform limit is the limit of the lower previsions.

This uniqueness result cannot be extended to more general (discontinuous) types of gambles: the coherent lower prevision $\underline{R}_{\mathcal{X}}$ is not uniquely determined on the set $\mathcal{L}(\Sigma_{\mathcal{X}})$ of all gambles on the simplex, and there may be different coherent lower previsions $\underline{R}^1_{\mathcal{X}}$ and $\underline{R}^2_{\mathcal{X}}$ on $\mathcal{L}(\Sigma_{\mathcal{X}})$ satisfying Equation (16). [fn. 12] But any such lower previsions will agree on the class $\mathcal{V}(\Sigma_{\mathcal{X}})$ of polynomial gambles, which is the class of gambles we need in order to characterise the exchangeable sequence. [fn. 13]

[fn. 12] See Miranda et al. (2007) for a study of the gambles whose prevision is determined by the prevision of the polynomials.

We now investigate the meaning of the representing lower prevision $\underline{R}_{\mathcal{X}}$ a bit further. Consider the sequence of so-called frequency random variables $\mathbf{F}_n := \mathbf{T}^n_{\mathcal{X}}(X_1, \dots, X_n)/n$ corresponding to an exchangeable sequence of random variables $X_1, \dots, X_n, \dots$, and assuming values in the $\mathcal{X}$-simplex $\Sigma_{\mathcal{X}}$. The distribution $\underline{P}_{\mathbf{F}_n}$ of $\mathbf{F}_n$, i.e., the coherent lower prevision on $\mathcal{L}(\Sigma_{\mathcal{X}})$ that models the available information about the values that $\mathbf{F}_n$ assumes in $\Sigma_{\mathcal{X}}$, is given by
$$\underline{P}_{\mathbf{F}_n}(h) := \underline{Q}^n_{\mathcal{X}}\big(h \circ \tfrac{1}{n}\big) = \underline{R}_{\mathcal{X}}\big(\mathrm{CoMn}^n_{\mathcal{X}}(h \circ \tfrac{1}{n} \mid \cdot)\big), \quad h \in \mathcal{L}(\Sigma_{\mathcal{X}}),$$
where $h \circ \tfrac{1}{n}$ denotes the gamble $\mathbf{m} \mapsto h(\mathbf{m}/n)$ on $\mathcal{N}^n_{\mathcal{X}}$. The first equality holds because we know that $\underline{Q}^n_{\mathcal{X}}$ is the distribution of $\mathbf{T}^n_{\mathcal{X}}(X_1, \dots, X_n)$, and the last one follows from Theorem 5.
Now,
$$\mathrm{CoMn}^n_{\mathcal{X}}\big(h \circ \tfrac{1}{n} \mid \boldsymbol{\theta}\big) = \sum_{\mathbf{m} \in \mathcal{N}^n_{\mathcal{X}}} h\Big(\frac{\mathbf{m}}{n}\Big) B_{\mathbf{m}}(\boldsymbol{\theta})$$
is the Bernstein approximant or approximating Bernstein polynomial of degree $n$ for the gamble $h$, and it is a known result (see (Feller, 1971, Section VII.2), (Heitzinger et al., 2003, Section 2)) that the sequence of approximating Bernstein polynomials $\mathrm{CoMn}^n_{\mathcal{X}}(h \circ \tfrac{1}{n} \mid \cdot)$ converges uniformly to $h$ for $n \to \infty$ if $h$ is continuous. So, because $\underline{R}_{\mathcal{X}}$ is defined uniquely, and is uniformly continuous, on the set $\mathcal{C}(\Sigma_{\mathcal{X}})$, we find the following result, which provides an interpretation for the representation $\underline{R}_{\mathcal{X}}$, and which can be seen as another generalisation of de Finetti's Representation Theorem: $\underline{R}_{\mathcal{X}}$ is the limit of the frequency distributions.

Theorem 6. For all continuous gambles $h$ on $\Sigma_{\mathcal{X}}$, we have that
$$\lim_{n \to \infty} \underline{P}_{\mathbf{F}_n}(h) = \underline{R}_{\mathcal{X}}(h),$$
or, in other words, the sequence of distributions $\underline{P}_{\mathbf{F}_n}$ converges point-wise to $\underline{R}_{\mathcal{X}}$ on $\mathcal{C}(\Sigma_{\mathcal{X}})$, and in this specific sense, the sample frequencies $\mathbf{F}_n$ converge in distribution.

Running example. Back to our example, where $\mathcal{X} = \mathbb{B}$. Here the Representation Theorem (Theorem 5) states that the coherent count lower previsions $\underline{Q}^n_{\mathbb{B}}$, $n \ge 1$, for any exchangeable sequence of variables in $\mathbb{B}$ have the form
$$\underline{Q}^n_{\mathbb{B}}(g) = \underline{R}_{\mathbb{B}}(\mathrm{CoBi}^n(g \mid \cdot)),$$
for all gambles $g$ on the set $\{0, 1, \dots, n\}$ of possible numbers of successes $s$, where the (count) binomial distribution $\mathrm{CoBi}^n(\cdot \mid \theta)$ is given by Equation (13), and $\underline{R}_{\mathbb{B}}$ is some coherent lower prevision defined on the set $\mathcal{V}([0, 1])$ of all polynomials on $[0, 1]$, which is the set of possible values for the probability $\theta$ of a success. This $\underline{R}_{\mathbb{B}}$ can be uniquely extended to a coherent lower prevision on the set $\mathcal{C}([0, 1])$ of all continuous gambles (functions) on $[0, 1]$. And Theorem 6 assures us that this $\underline{R}_{\mathbb{B}}$ on $\mathcal{C}([0, 1])$ is the 'limiting distribution' of the frequency of successes $F_{n,1} = T^n_1(X_1, \dots, X_n)/n$, as the number of 'trials' $n$ goes to infinity.

When all the count distributions $\underline{Q}^n_{\mathbb{B}}$ are linear previsions $Q^n_{\mathbb{B}}$, then the representation $\underline{R}_{\mathbb{B}}$ is a linear prevision $R_{\mathbb{B}}$, and vice versa. This linear prevision on $\mathcal{C}([0, 1])$, or equivalently, on $\mathcal{V}([0, 1])$, is completely determined by (and of course completely determines) its values on any basis of the set of polynomials on $[0, 1]$. If we take as a basis the set $\{\theta^n : n \ge 0\}$, then we see that $R_{\mathbb{B}}$ is completely determined by its (raw) moment sequence $m_n = R_{\mathbb{B}}(\theta^n)$, $n \ge 0$. It is well known (see for instance Feller, 1971, Section VII.3) that in the case of finitely additive probabilities, or linear previsions, a moment sequence uniquely determines a distribution function, except in its discontinuity points. And this brings us right back to de Finetti's (1937) version of the Representation Theorem: "la loi de probabilité $\Phi_n(\xi) = P(Y_n \le \xi)$ tend vers une limite pour $n \to \infty$. [...] il s'ensuit qu'il existe une loi-limite $\Phi(\xi)$ telle que $\lim_{n \to \infty} \Phi_n(\xi) = \Phi(\xi)$ sauf peut-être pour les points de discontinuité." [fn. 14] (In English: "the probability law $\Phi_n(\xi) = P(Y_n \le \xi)$ tends to a limit for $n \to \infty$. [...] it follows that there is a limit law $\Phi(\xi)$ such that $\lim_{n \to \infty} \Phi_n(\xi) = \Phi(\xi)$ except perhaps at the points of discontinuity.") ♦

[fn. 13] We refrain here from imposing conditions other than coherence (e.g., related to $\sigma$-additivity) on such extensions, which could guarantee uniqueness on the set of all measurable gambles; see Miranda et al. (2007) for related discussion.

6. LOOKING AT THE SAMPLE MEANS

Consider an exchangeable sequence $X_1, \dots, X_n, \dots$, and any gamble $f$ on $\mathcal{X}$. Then the sequence $f(X_1), \dots, f(X_n), \dots$ is again an exchangeable sequence of random variables, now taking values in the finite set $f(\mathcal{X})$. We are interested in the sample means
$$S_n(f)(X_1, \dots, X_n) := \frac{1}{n} \sum_{k=1}^{n} f(X_k),$$
which form a sequence of random variables in $[\inf f, \sup f]$.
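For binary variables and $f$ the identity on $\{0, 1\}$, the sample mean is the frequency of successes from the running example, and under an IID model with success probability $\theta$ the prevision of $h$ applied to it is the approximating Bernstein polynomial $\sum_s h(s/n) B^n_s(\theta)$. The following floating-point sketch, with the hypothetical choice $h(t) = t^2$ and illustrative function names, shows the uniform approximation error shrinking as $n$ grows.

```python
from math import comb

def bernstein_approximant(h, n, theta):
    """sum_s h(s / n) * B^n_s(theta): the prevision of h(sample mean) under IID(theta)."""
    return sum(h(s / n) * comb(n, s) * theta ** s * (1 - theta) ** (n - s)
               for s in range(n + 1))

def sup_error(h, n, grid_size=100):
    """Largest deviation between h and its degree-n approximant on a grid in [0, 1]."""
    grid = [i / grid_size for i in range(grid_size + 1)]
    return max(abs(bernstein_approximant(h, n, t) - h(t)) for t in grid)

h = lambda t: t * t   # a continuous gamble on [0, 1], chosen for illustration

# For h(t) = t^2 the approximant equals t^2 + t(1 - t)/n, so the error is 1/(4n) at t = 1/2.
assert abs(sup_error(h, 10) - 1 / 40) < 1e-9
assert abs(sup_error(h, 100) - 1 / 400) < 1e-9
```

The choice $h(t) = t^2$ is convenient because the approximation error has a closed form, which makes the convergence rate easy to verify; for other continuous $h$ only the qualitative convergence is guaranteed.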
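For any $\mathbf{m}$ in $\mathcal{N}^n_{\mathcal{X}}$ and any $z \in [\mathbf{m}]$,
\[ S_n(f)(z) = \frac{1}{n} \sum_{k=1}^{n} f(z_k) = \frac{1}{n} \sum_{x \in \mathcal{X}} m_x f(x) =: S_{\mathcal{X}}\!\left(f \,\Big|\, \frac{\mathbf{m}}{n}\right), \]
where for each $\boldsymbol{\theta} \in \Sigma_{\mathcal{X}}$, we have defined the linear prevision $S_{\mathcal{X}}(\cdot \mid \boldsymbol{\theta})$ on $\mathcal{L}(\mathcal{X})$ by $S_{\mathcal{X}}(f \mid \boldsymbol{\theta}) := \sum_{x \in \mathcal{X}} f(x)\theta_x$. Observe that $S_{\mathcal{X}}(f \mid \cdot)$ is a very special (linear) polynomial gamble on the $\mathcal{X}$-simplex. We then get
\[ \mathrm{MuHy}^n_{\mathcal{X}}(S_n(f) \mid \mathbf{m}) = \frac{1}{\nu(\mathbf{m})} \sum_{z \in [\mathbf{m}]} S_n(f)(z) = \frac{1}{\nu(\mathbf{m})} \sum_{z \in [\mathbf{m}]} S_{\mathcal{X}}\!\left(f \,\Big|\, \frac{\mathbf{m}}{n}\right) = S_{\mathcal{X}}\!\left(f \,\Big|\, \frac{\mathbf{m}}{n}\right), \]
so we find for the distribution $\underline{P}_{S_n(f)}$ of the sample mean $S_n(f)$, which is a coherent lower prevision on $\mathcal{L}([\inf f, \sup f])$, that
\[ \underline{P}_{S_n(f)}(h) = \underline{P}^n_{\mathcal{X}}(h(S_n(f))) = \underline{Q}^n_{\mathcal{X}}\!\left(h(S_{\mathcal{X}}(f \mid \cdot)) \circ \tfrac{1}{n}\right), \quad h \in \mathcal{L}([\inf f, \sup f]). \]
In terms of the representing lower prevision $\underline{R}_{\mathcal{X}}$, for which $\underline{Q}^n_{\mathcal{X}} = \underline{R}_{\mathcal{X}}(\mathrm{CoMn}^n_{\mathcal{X}}(\cdot \mid \cdot))$ by the Representation Theorem (Theorem 5), we see that
\[ \mathrm{CoMn}^n_{\mathcal{X}}\!\left(h(S_{\mathcal{X}}(f \mid \cdot)) \circ \tfrac{1}{n} \,\Big|\, \boldsymbol{\theta}\right) = \sum_{\mathbf{m} \in \mathcal{N}^n_{\mathcal{X}}} h\!\left(S_{\mathcal{X}}\!\left(f \,\Big|\, \frac{\mathbf{m}}{n}\right)\right) B_{\mathbf{m}}(\boldsymbol{\theta}) \]
is the approximating Bernstein polynomial for the gamble $h(S_{\mathcal{X}}(f \mid \cdot))$ on $\Sigma_{\mathcal{X}}$. So for all continuous gambles $h$ on $[\inf f, \sup f]$, $h(S_{\mathcal{X}}(f \mid \cdot))$ is a continuous gamble on $\Sigma_{\mathcal{X}}$, and is therefore the uniform limit of its sequence of approximating Bernstein polynomials. Since a coherent lower prevision is uniformly continuous, we see that
\[ \lim_{n \to \infty} \underline{P}_{S_n(f)}(h) = \underline{R}_{\mathcal{X}}(h(S_{\mathcal{X}}(f \mid \cdot))). \tag{17} \]
This tells us that for an exchangeable sequence $X_1, \dots, X_n, \dots$ the sequence of sample means $S_n(f)(X_1, \dots, X_n)$ converges in distribution.

$^{14}$ Our italics. In de Finetti's notation, $Y_n$ is our $F^n_1$, and $\Phi_n$ its distribution function.

7. EXCHANGEABLE NATURAL EXTENSION

Throughout this paper, we have always considered exchangeable lower previsions $\underline{P}^N_{\mathcal{X}}$ defined on the set $\mathcal{L}(\mathcal{X}^N)$ of all gambles on $\mathcal{X}^N$.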
At first sight, it seems an impossible task to specify or assess such an exchangeable lower prevision: a subject must specify an uncountable infinity of supremum acceptable prices, and at the same time keep track of all the symmetry requirements imposed by exchangeability, as well as the coherence requirement. Alternatively, a subject must specify a coherent count lower prevision $\underline{Q}^N_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{N}^N_{\mathcal{X}})$, and this means specifying an uncountable infinity of real numbers $\underline{Q}^N_{\mathcal{X}}(g)$, for all gambles $g$ on $\mathcal{N}^N_{\mathcal{X}}$.$^{15}$ Is it therefore realistic, or of any practical relevance, to consider such exchangeable coherent lower previsions? Indeed it is, and we now want to show why.

7.1. The general problem. What will usually happen in practice is that a subject makes an assessment that $N$ variables $X_1, \dots, X_N$ taking values in a finite set $\mathcal{X}$ are exchangeable,$^{16}$ and in addition specifies supremum acceptable buying prices $\underline{P}(f)$ for all gambles in some (typically finite, but not necessarily so) set of gambles $\mathcal{K} \subseteq \mathcal{L}(\mathcal{X}^N)$. The question then is: can we turn these assessments into an exchangeable coherent lower prevision $\underline{P}^N_{\mathcal{X}}$ defined on all of $\mathcal{L}(\mathcal{X}^N)$, that is furthermore as small (least-committal, conservative) as possible?

To answer this question, we begin by looking at the most conservative (i.e., point-wise smallest) exchangeable coherent lower prevision $\underline{E}_{\mathcal{P}_N}$ for $N$ variables. Since the most conservative coherent lower prevision on $\mathcal{L}(\mathcal{N}^N_{\mathcal{X}})$ is the vacuous lower prevision, given by $\underline{Q}^N_{\mathcal{X}}(g) = \min_{\mathbf{m} \in \mathcal{N}^N_{\mathcal{X}}} g(\mathbf{m})$, our Representation Theorem for finite exchangeable sequences (Theorem 2) tells us that
\[ \underline{E}_{\mathcal{P}_N}(f) = \min_{\mathbf{m} \in \mathcal{N}^N_{\mathcal{X}}} \mathrm{MuHy}^N_{\mathcal{X}}(f \mid \mathbf{m}) \tag{18} \]
for all gambles $f$ on $\mathcal{X}^N$. This is the exchangeable coherent lower prevision whose corresponding count lower prevision is vacuous.
It models a subject's beliefs about sampling without replacement from an urn with $N$ balls, where this subject is completely ignorant about the composition of the urn. Using this $\underline{E}_{\mathcal{P}_N}$, we can invoke a general theorem we have proven elsewhere, about the existence of coherent lower previsions that are (strongly) invariant under a monoid of transformations (De Cooman and Miranda, 2007, Theorem 16), to find that$^{17}$

ENE-1. there are exchangeable coherent lower previsions on $\mathcal{L}(\mathcal{X}^N)$ that dominate $\underline{P}$ on $\mathcal{K}$ if and only if
\[ \overline{E}_{\mathcal{P}_N}\!\left( \sum_{k=1}^{n} \lambda_k [f_k - \underline{P}(f_k)] \right) \ge 0 \quad \text{for all } n \ge 0,\ \lambda_k \ge 0 \text{ and } f_k \in \mathcal{K},\ k = 1, \dots, n; \tag{19} \]

ENE-2. in that case the point-wise smallest (most conservative) exchangeable coherent lower prevision $\underline{E}_{\underline{P},\mathcal{P}_N}$ on $\mathcal{L}(\mathcal{X}^N)$ that dominates $\underline{P}$ on $\mathcal{K}$ is given by
\[ \underline{E}_{\underline{P},\mathcal{P}_N}(f) := \sup\left\{ \underline{E}_{\mathcal{P}_N}\!\left( f - \sum_{k=1}^{n} \lambda_k [f_k - \underline{P}(f_k)] \right) : n \ge 0,\ \lambda_k \ge 0,\ f_k \in \mathcal{K} \right\}, \tag{20} \]
and is called the exchangeable natural extension of $\underline{P}$.

$^{15}$ When $\underline{Q}^N_{\mathcal{X}}$ is a linear prevision $Q^N_{\mathcal{X}}$, it suffices to specify a finite number of real numbers $Q^N_{\mathcal{X}}(\{\mathbf{m}\})$, for $\mathbf{m}$ in $\mathcal{N}^N_{\mathcal{X}}$, but such an extremely efficient reduction is generally not possible for coherent count lower previsions $\underline{Q}^N_{\mathcal{X}}$.

$^{16}$ This is a so-called structural assessment in Walley's (1991) terminology.

$^{17}$ Equation (19) is closely related to the avoiding sure loss condition (1), but where the supremum is replaced by the coherent upper prevision $\overline{E}_{\mathcal{P}_N}$. Similarly, Equation (20) is related to the expression (3) for natural extension, but where the infimum operator is replaced by the coherent lower prevision $\underline{E}_{\mathcal{P}_N}$. There is a small and easily correctable oversight in the formulation of Theorem 16 of De Cooman and Miranda (2007), as becomes immediately apparent when considering its proof: it is there (but should not be) formulated without the multipliers $\lambda_k \ge 0$.
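For small $N$, Equation (18) can be evaluated by brute force. The following Python sketch is only an illustration and not part of the development above: the function names `muhy`, `vacuous_exchangeable_lower` and `vacuous_exchangeable_upper` are hypothetical, an urn composition is encoded (by assumption) as a sorted tuple of category labels, and the multiple hypergeometric prevision of $f$ for a given composition is computed as the average of $f$ over the distinct orderings of that tuple.

```python
from fractions import Fraction
from itertools import combinations_with_replacement, permutations


def muhy(f, composition):
    """Average of f over all distinct orderings of an urn composition.

    `composition` is a tuple of category labels with repetitions, e.g.
    (0, 0, 1) for an urn with two balls of type 0 and one of type 1.
    """
    orderings = set(permutations(composition))
    return sum(Fraction(f(z)) for z in orderings) / len(orderings)


def vacuous_exchangeable_lower(f, categories, N):
    """Minimum of muhy(f, .) over all compositions of N balls, as in Eq. (18)."""
    return min(muhy(f, c) for c in combinations_with_replacement(categories, N))


def vacuous_exchangeable_upper(f, categories, N):
    """Conjugate upper prevision: maximum over all compositions."""
    return max(muhy(f, c) for c in combinations_with_replacement(categories, N))


def success_then_failure(z):
    """Indicator of 'first draw is a success (1), second draw is a failure (0)'."""
    return 1 if z[0] == 1 and z[1] == 0 else 0


lower = vacuous_exchangeable_lower(success_then_failure, (0, 1), 3)
upper = vacuous_exchangeable_upper(success_then_failure, (0, 1), 3)
# For this gamble and N = 3: lower == 0 and upper == Fraction(1, 3).
```

Exact rational arithmetic is used so that the minimum over compositions is not blurred by rounding. The enumeration grows factorially with $N$, so this is meant for checking small cases by hand, not as an efficient algorithm.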
If we now combine Equation (18) with Equations (19) and (20), and define the lower prevision $\underline{Q}$ on the set $\mathcal{H} := \{ \mathrm{MuHy}^N_{\mathcal{X}}(f \mid \cdot) : f \in \mathcal{K} \} \subseteq \mathcal{L}(\mathcal{N}^N_{\mathcal{X}})$ by letting$^{18}$
\[ \underline{Q}(g) := \sup\left\{ \underline{P}(f) : \mathrm{MuHy}^N_{\mathcal{X}}(f \mid \cdot) = g,\ f \in \mathcal{K} \right\} \]
for all $g \in \mathcal{H}$, then it is but a small technical step to prove the following result.

Theorem 7 (Exchangeable natural extension). There are exchangeable coherent lower previsions on $\mathcal{L}(\mathcal{X}^N)$ that dominate $\underline{P}$ on $\mathcal{K}$ if and only if $\underline{Q}$ is a lower prevision$^{19}$ on $\mathcal{H}$ that avoids sure loss. In that case $\underline{E}_{\underline{P},\mathcal{P}_N} = \underline{E}_{\underline{Q}}(\mathrm{MuHy}^N_{\mathcal{X}}(\cdot \mid \cdot))$, i.e., the count distribution for the exchangeable natural extension $\underline{E}_{\underline{P},\mathcal{P}_N}$ of $\underline{P}$ is the natural extension $\underline{E}_{\underline{Q}}$ of the lower prevision $\underline{Q}$.

Since there are quite efficient algorithms (Walley et al., 2004) for calculating the natural extension of a lower prevision based on a finite number of assessments, this theorem not only has intuitive appeal, but it provides us with an elegant and efficient manner to find the exchangeable natural extension, i.e., to combine (finitary) local assessments $\underline{P}$ with the structural assessment of exchangeability.

7.2. From n to n + k exchangeable random variables? Suppose we have $n$ random variables $X_1, \dots, X_n$, that a subject judges to be exchangeable, and whose distribution is given by the exchangeable coherent lower prevision $\underline{P}^n_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^n)$, with count distribution $\underline{Q}^n_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$. Can this model be extended to a coherent exchangeable model for $n + k$ variables? And if so, what is the most conservative such extended model?

It is well-known that when $\underline{P}^n_{\mathcal{X}}$ is a linear prevision, it cannot generally be extended (Diaconis and Freedman, 1980). In the more general case that we are considering here, we now look at our Theorem 7 to provide us with an elegant answer: the problem considered here is a special case of the one studied in Section 7.1.
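Indeed, if we denote, as before in Section 4.1, by $\tilde{f}$ the cylindrical extension to $\mathcal{X}^{n+k}$ of the gamble $f$ on $\mathcal{X}^n$, then we see that the local assessments $\underline{P}$ are defined on the set of gambles $\mathcal{K} := \{ \tilde{f} : f \in \mathcal{L}(\mathcal{X}^n) \} \subseteq \mathcal{L}(\mathcal{X}^{n+k})$ by $\underline{P}(\tilde{f}) := \underline{P}^n_{\mathcal{X}}(f)$, $f \in \mathcal{L}(\mathcal{X}^n)$. Observe that here $N = n + k$. If we recall Equation (8) in Section 4.2, then we see that the corresponding set $\mathcal{H} \subseteq \mathcal{L}(\mathcal{N}^{n+k}_{\mathcal{X}})$ is given by $\mathcal{H} := \{ \overline{g} : g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}}) \}$, where for any gamble $g$ on $\mathcal{N}^n_{\mathcal{X}}$ and all $\boldsymbol{\mu} \in \mathcal{N}^{n+k}_{\mathcal{X}}$
\[ \overline{g}(\boldsymbol{\mu}) := \sum_{\mathbf{m} \in \mathcal{N}^n_{\mathcal{X}}} \frac{\nu(\mathbf{m})\,\nu(\boldsymbol{\mu} - \mathbf{m})}{\nu(\boldsymbol{\mu})}\, g(\mathbf{m}) = P(g \mid \boldsymbol{\mu}), \]
where $P(\cdot \mid \boldsymbol{\mu})$ is the linear prevision associated with drawing $n$ balls without replacement from an urn with composition $\boldsymbol{\mu}$. Moreover, for any $h$ in $\mathcal{H}$, there is a unique gamble $g$ on $\mathcal{N}^n_{\mathcal{X}}$ such that $h = \overline{g}$.$^{20}$ This implies that the corresponding lower prevision $\underline{Q}$ on $\mathcal{H}$ is given by $\underline{Q}(\overline{g}) := \underline{Q}^n_{\mathcal{X}}(g)$, $g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$. Now observe that

(a) $\overline{\lambda} = \lambda$ for all real $\lambda$;
(b) $\overline{\lambda g} = \lambda \overline{g}$ for all $g$ in $\mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$ and all real $\lambda$;
(c) $\overline{g_1 + g_2} = \overline{g_1} + \overline{g_2}$ for all $g_1$ and $g_2$ in $\mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$.

This tells us that $\mathcal{H}$ is a linear subspace of $\mathcal{L}(\mathcal{N}^{n+k}_{\mathcal{X}})$ that contains all constant gambles. Moreover, because $\underline{Q}^n_{\mathcal{X}}$ is a coherent lower prevision, we find that

(i) $\underline{Q}(h_1 + h_2) \ge \underline{Q}(h_1) + \underline{Q}(h_2)$ for all $h_1$ and $h_2$ in $\mathcal{H}$;
(ii) $\underline{Q}(\lambda h) = \lambda \underline{Q}(h)$ for all real $\lambda \ge 0$ and all $h$ in $\mathcal{H}$;
(iii) $\underline{Q}(h + \lambda) = \underline{Q}(h) + \lambda$ for all real $\lambda$ and all $h$ in $\mathcal{H}$.

$^{18}$ Observe that it is necessary that $\underline{Q}(g)$ should be finite, in order for the condition (19) to hold.

$^{19}$ The explicit requirement that $\underline{Q}$ is a lower prevision means that $\underline{Q}$ must be nowhere infinite.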
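The map that sends a gamble $g$ on the count vectors for $n$ draws to its hypergeometric average (written $\overline{g}$ here) on the count vectors for $n + k$ balls is easy to tabulate for small cases. The Python sketch below is an illustration under stated assumptions, not the authors' implementation: count vectors are encoded as tuples of non-negative integers, gambles as dictionaries keyed by those tuples, and the helper names `nu`, `count_vectors` and `lift` are hypothetical. It lets one check properties (a) to (c) numerically.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def nu(m):
    """Multinomial coefficient (sum m)! / prod(m_x!) for a count vector m."""
    value = factorial(sum(m))
    for m_x in m:
        value //= factorial(m_x)
    return value


def count_vectors(n, d):
    """All d-tuples of non-negative integers that sum to n."""
    return [m for m in product(range(n + 1), repeat=d) if sum(m) == n]


def lift(g, n, k, d):
    """Tabulate the hypergeometric average of g on count vectors of total n + k.

    g maps each count vector of total n (a d-tuple) to a number; the result
    maps each urn composition mu of total n + k to the prevision of g when
    n balls are drawn without replacement from that urn.
    """
    lifted = {}
    for mu in count_vectors(n + k, d):
        total = Fraction(0)
        for m in count_vectors(n, d):
            rest = tuple(a - b for a, b in zip(mu, m))
            if min(rest) >= 0:  # m must fit inside the urn composition mu
                total += Fraction(nu(m) * nu(rest), nu(mu)) * g[m]
        lifted[mu] = total
    return lifted


# Two categories, n = 2 observed draws, urn of n + k = 3 balls.
g = {(2, 0): 0, (1, 1): 1, (0, 2): 4}
g_bar = lift(g, 2, 1, 2)
one_bar = lift({m: 1 for m in count_vectors(2, 2)}, 2, 1, 2)
```

With these inputs, `one_bar` is identically 1, which is property (a) for $\lambda = 1$, and `g_bar[(2, 1)]` equals 2/3, the chance of drawing one ball of each type from an urn with two balls of the first type and one of the second. Linearity (b) and (c) can be checked in the same way by lifting scaled or summed dictionaries.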
Because $\underline{Q}$ and $\mathcal{H}$ have these special properties, the condition for $\underline{P}^n_{\mathcal{X}}$ to be extendable to some coherent exchangeable model for $n + k$ variables, namely that $\underline{Q}$ avoids sure loss on $\mathcal{H}$, simplifies to $\max \overline{g} \ge \underline{Q}(\overline{g})$ for all $g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$, i.e., to
\[ \max_{\boldsymbol{\mu} \in \mathcal{N}^{n+k}_{\mathcal{X}}} \sum_{\mathbf{m} \in \mathcal{N}^n_{\mathcal{X}}} \frac{\nu(\mathbf{m})\,\nu(\boldsymbol{\mu} - \mathbf{m})}{\nu(\boldsymbol{\mu})}\, g(\mathbf{m}) \ge \underline{Q}^n_{\mathcal{X}}(g) \quad \text{for all } g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}}). \]
The expression for the natural extension $\underline{E}_{\underline{Q}}$ of $\underline{Q}$, applicable when the above condition holds, can also be simplified significantly, again because of the special properties of $\underline{Q}$ and $\mathcal{H}$:
\begin{align*}
\underline{E}_{\underline{Q}}(h)
&= \sup\left\{ \inf\left[ h - \sum_{j=1}^{r} \lambda_j [\overline{g_j} - \underline{Q}(\overline{g_j})] \right] : r \ge 0,\ \lambda_j \ge 0,\ g_j \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}}) \right\} \\
&= \sup\left\{ \inf\left[ h - \overline{g} + \underline{Q}(\overline{g}) \right] : g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}}) \right\} \\
&= \sup\left\{ \underline{Q}(\overline{g} + \inf[h - \overline{g}]) : g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}}) \right\} \\
&= \sup\left\{ \underline{Q}(\overline{g}) : \overline{g} \le h,\ g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}}) \right\} \\
&= \sup\left\{ \underline{Q}^n_{\mathcal{X}}(g) : \overline{g} \le h,\ g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}}) \right\},
\end{align*}
for all gambles $h$ on $\mathcal{N}^{n+k}_{\mathcal{X}}$. The point-wise smallest extension of $\underline{P}^n_{\mathcal{X}}$ to a coherent exchangeable model on $\mathcal{L}(\mathcal{X}^{n+k})$ is then the coherent exchangeable lower prevision with count distribution $\underline{E}_{\underline{Q}}$, because of Theorem 7.

In the well-known case that $\underline{P}^n_{\mathcal{X}}$ is a linear prevision $P^n_{\mathcal{X}}$, and therefore $\underline{Q}^n_{\mathcal{X}}$ is also a linear prevision $Q^n_{\mathcal{X}}$, the condition for extendibility can also be written as
\[ \min_{\boldsymbol{\mu} \in \mathcal{N}^{n+k}_{\mathcal{X}}} P(g \mid \boldsymbol{\mu}) \le Q^n_{\mathcal{X}}(g) \quad \text{for all } g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}}), \]
where on the left-hand side we now see the lower prevision of the gamble $g$, associated with drawing $n$ balls from an urn with $n + k$ balls, of unknown composition. When this is satisfied, the lower prevision $\underline{Q}$ will actually be a linear prevision $Q$ on the linear space $\mathcal{H}$, and $\underline{E}_{\underline{Q}}$ will be the lower envelope of all linear previsions $Q^{n+k}_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{N}^{n+k}_{\mathcal{X}})$ that extend $Q$. Similarly, the exchangeable natural extension will be the lower envelope of all the exchangeable linear previsions $P^{n+k}_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^{n+k})$ that extend $P^n_{\mathcal{X}}$.

$^{20}$ To see this, consider the polynomial $p = \sum_{\boldsymbol{\mu} \in \mathcal{N}^{n+k}_{\mathcal{X}}} h(\boldsymbol{\mu}) B_{\boldsymbol{\mu}}$. Use Zhou's formula [Equation (22) in the Appendix] to find that if $h = \overline{g}$, then also $p = \sum_{\mathbf{m} \in \mathcal{N}^n_{\mathcal{X}}} g(\mathbf{m}) B_{\mathbf{m}}$, and consider that expansions in a Bernstein basis are unique.

8. CONCLUSIONS

We have shown that the notion of exchangeability has a natural place in the theory of coherent lower previsions. Indeed, on our approach using Bernstein polynomials, and gambles rather than events, it seems fairly natural and easy to derive representation theorems directly for coherent lower previsions, and to derive the corresponding results for precise probabilities (linear previsions) as special cases.

Interesting results can also be obtained in a context of predictive inference, where a coherent exchangeable lower prevision for $n + k$ variables is updated with the information that the first $n$ variables have been observed to assume certain values. For a fairly detailed discussion of these issues, we refer to De Cooman and Miranda (2007, Section 9.3).

In Section 6, we have argued that the sample means $S_n(f)(X_1, \dots, X_n)$ converge in distribution. It is possible (and quite easy for that matter) to prove stronger results. Indeed, using an approach that is completely similar to the one originally used by de Finetti (1937), we can prove that for all $n \ge 1$ and $p \ge 0$:
\[ \overline{P}^{\mathbb{N}}_{\mathcal{X}}\!\left( [S_{n+p}(f) - S_n(f)]^2 \right) \le \frac{2p}{n(n+p)} \sup f^2. \]
In other words, for any fixed $p \ge 1$, the sequence $S_{n+p}(f) - S_n(f)$ 'converges in mean-square' to zero as $n \to \infty$. Even stronger, we find that for any $k \ge 1$ and $\ell \ge 1$
\[ \overline{P}^{\mathbb{N}}_{\mathcal{X}}\!\left( [S_k(f) - S_\ell(f)]^2 \right) \le \frac{2|k - \ell|}{k\ell} \sup f^2, \]
and therefore the sequence $S_n(f)$ 'Cauchy-converges in mean-square'.
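The mean-square bound $2|k - \ell| / (k\ell) \cdot \sup f^2$ can be sanity-checked on one very special exchangeable model. The Python sketch below assumes an IID Bernoulli sequence with success probability `theta` (a precise, linear model, chosen here purely for illustration) and takes $f$ to be the identity on $\{0, 1\}$, so that $\sup f^2 = 1$. It computes the expectation of $[S_k(f) - S_\ell(f)]^2$ exactly by enumerating all outcomes; the function names are hypothetical, and the check illustrates the inequality for this model only, it does not prove it.

```python
from fractions import Fraction
from itertools import product


def mean_square_gap(k, l, theta):
    """Exact expectation of (S_k - S_l)^2 for IID Bernoulli(theta) variables."""
    n = max(k, l)
    total = Fraction(0)
    for z in product((0, 1), repeat=n):
        ones = sum(z)
        prob = theta ** ones * (1 - theta) ** (n - ones)
        # Sample means over the first k and the first l observations.
        diff = Fraction(sum(z[:k]), k) - Fraction(sum(z[:l]), l)
        total += prob * diff ** 2
    return total


def mean_square_bound(k, l, sup_f_squared=1):
    """Right-hand side 2 |k - l| / (k l) * sup f^2 of the bound."""
    return Fraction(2 * abs(k - l), k * l) * sup_f_squared


theta = Fraction(1, 3)
gap = mean_square_gap(2, 5, theta)
bound = mean_square_bound(2, 5)
# For this IID model the gap equals |k - l| / (k l) * theta * (1 - theta).
```

For $k = 2$, $\ell = 5$ and `theta = 1/3` the exact value is 1/15, comfortably below the bound 3/5. The slack reflects the fact that the bound has to hold for every exchangeable model and every gamble with the same $\sup f^2$, not just for this one.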
These convergence results can also be used to derive the convergence in distribution of the $S_n(f)$, but we consider the approach using Bernstein polynomials to be distinctly more elegant.

ACKNOWLEDGEMENTS

We acknowledge financial support by research grant G.0139.01 of the Flemish Fund for Scientific Research (FWO), and by projects MTM2004-01269, TSI2004-06801-C04-01. Erik Quaeghebeur's research was financed by a Ph.D. grant of the Institute for the Promotion of Innovation through Science and Technology in Flanders (IWT Vlaanderen).

We would like to thank Jürgen Garloff for very helpful comments and pointers to the literature about multivariate Bernstein polynomials.

APPENDIX A. MULTIVARIATE BERNSTEIN POLYNOMIALS

With any $n \ge 0$ and $\mathbf{m} \in \mathcal{N}^n_{\mathcal{X}}$ there corresponds a Bernstein (basis) polynomial of degree $n$ on $\Sigma_{\mathcal{X}}$, given by $B_{\mathbf{m}}(\boldsymbol{\theta}) = \nu(\mathbf{m}) \prod_{x \in \mathcal{X}} \theta_x^{m_x}$, $\boldsymbol{\theta} \in \Sigma_{\mathcal{X}}$. These polynomials have a number of very interesting properties (see for instance Prautzsch et al., 2002, Chapters 10 and 11), which we list here:

B1. The set $\{ B_{\mathbf{m}} : \mathbf{m} \in \mathcal{N}^n_{\mathcal{X}} \}$ of all Bernstein polynomials of fixed degree $n$ is linearly independent: if $\sum_{\mathbf{m} \in \mathcal{N}^n_{\mathcal{X}}} \lambda_{\mathbf{m}} B_{\mathbf{m}} = 0$, then $\lambda_{\mathbf{m}} = 0$ for all $\mathbf{m}$ in $\mathcal{N}^n_{\mathcal{X}}$.
B2. The set $\{ B_{\mathbf{m}} : \mathbf{m} \in \mathcal{N}^n_{\mathcal{X}} \}$ of all Bernstein polynomials of fixed degree $n$ forms a partition of unity: $\sum_{\mathbf{m} \in \mathcal{N}^n_{\mathcal{X}}} B_{\mathbf{m}} = 1$.
B3. All Bernstein basis polynomials are non-negative, and strictly positive in the interior of $\Sigma_{\mathcal{X}}$.
B4. The set $\{ B_{\mathbf{m}} : \mathbf{m} \in \mathcal{N}^n_{\mathcal{X}} \}$ of all Bernstein polynomials of fixed degree $n$ forms a basis for the linear space of all polynomials whose degree is at most $n$.

Property B4 follows from B1 and B2. It follows from B4 that:

B5. Any polynomial $p$ of degree $m$ has a unique expansion in terms of the Bernstein basis polynomials of fixed degree $n \ge m$, or in other words, there is a unique gamble $b^n_p$ on $\mathcal{N}^n_{\mathcal{X}}$ such that
\[ p = \sum_{\mathbf{m} \in \mathcal{N}^n_{\mathcal{X}}} b^n_p(\mathbf{m}) B_{\mathbf{m}} = \mathrm{CoMn}^n_{\mathcal{X}}(b^n_p \mid \cdot). \]

This tells us [also use B2 and B3] that each $p(\boldsymbol{\theta})$ is a convex combination of the Bernstein coefficients $b^n_p(\mathbf{m})$, $\mathbf{m} \in \mathcal{N}^n_{\mathcal{X}}$, whence
\[ \min b^n_p \le \min p \le p(\boldsymbol{\theta}) \le \max p \le \max b^n_p. \tag{21} \]
It follows from a combination of B2 and B4 that for all $k \ge 0$ and all $\boldsymbol{\mu}$ in $\mathcal{N}^{n+k}_{\mathcal{X}}$,
\[ b^{n+k}_p(\boldsymbol{\mu}) = \sum_{\mathbf{m} \in \mathcal{N}^n_{\mathcal{X}}} \frac{\nu(\mathbf{m})\,\nu(\boldsymbol{\mu} - \mathbf{m})}{\nu(\boldsymbol{\mu})}\, b^n_p(\mathbf{m}). \tag{22} \]
This is Zhou's formula (see Prautzsch et al., 2002, Section 11.9). Hence [let $p = 1$ and use B2] we find that for all $k \ge 0$ and all $\boldsymbol{\mu}$ in $\mathcal{N}^{n+k}_{\mathcal{X}}$,
\[ \sum_{\mathbf{m} \in \mathcal{N}^n_{\mathcal{X}}} \frac{\nu(\mathbf{m})\,\nu(\boldsymbol{\mu} - \mathbf{m})}{\nu(\boldsymbol{\mu})} = 1. \tag{23} \]
The expressions (22) and (23) also imply that each $b^{n+k}_p(\boldsymbol{\mu})$ is a convex combination of the $b^n_p(\mathbf{m})$, and therefore $\min b^{n+k}_p \ge \min b^n_p$ and $\max b^{n+k}_p \le \max b^n_p$. Combined with the inequalities in (21), this leads to:
\[ [\min p, \max p] \subseteq [\min b^{n+k}_p, \max b^{n+k}_p] \subseteq [\min b^n_p, \max b^n_p] \tag{24} \]
for all $n \ge m$ and $k \ge 0$. This means that the non-decreasing sequence $\min b^n_p$ converges to some real number not greater than $\min p$, and, similarly, the non-increasing sequence $\max b^n_p$ converges to some real number not smaller than $\max p$. The following proposition strengthens this.

Proposition 8. For any polynomial $p$ on $\Sigma_{\mathcal{X}}$ of degree $m$,
\[ \lim_{\substack{n \to \infty \\ n \ge m}} [\min b^n_p, \max b^n_p] = [\min p, \max p] = p(\Sigma_{\mathcal{X}}). \]

Proof. This follows from the fact that the $b^n_p$ converge uniformly to the polynomial $p$ as $n \to \infty$; see for instance Trump and Prautzsch (1996). Alternatively, it can be shown (see Prautzsch et al., 2002, Section 11.9) that for $n \ge m$
\[ b^n_p(\boldsymbol{\mu}) = \sum_{\mathbf{m} \in \mathcal{N}^m_{\mathcal{X}}} b^m_p(\mathbf{m})\, B_{\mathbf{m}}\!\left(\frac{\boldsymbol{\mu}}{n}\right) + O\!\left(\frac{1}{n}\right) = p\!\left(\frac{\boldsymbol{\mu}}{n}\right) + O\!\left(\frac{1}{n}\right), \quad \boldsymbol{\mu} \in \mathcal{N}^n_{\mathcal{X}}. \]
From this, we deduce that $\min b^n_p \ge \min p + O(\tfrac{1}{n})$ for any $n \ge m$, and as a consequence $\lim_{n \to \infty,\, n \ge m} \min b^n_p \ge \min p$. If we now use Equation (24), we see that $\lim_{n \to \infty,\, n \ge m} \min b^n_p = \min p$. The proof of the other equality is completely analogous. □

REFERENCES

D. M. Cifarelli and E. Regazzini. De Finetti's contributions to probability and statistics. Statistical Science, 11:253–282, 1996.

A. P. Dawid. Probability, symmetry, and frequency. British Journal for the Philosophy of Science, 36(2):107–128, 1985.

G. de Cooman and E. Miranda. Weak and strong laws of large numbers for coherent lower previsions. Journal of Statistical Planning and Inference, 2006. Submitted for publication.

G. de Cooman and E. Miranda. Symmetry of models versus models of symmetry. In W. L. Harper and G. R. Wheeler, editors, Probability and Inference: Essays in Honor of Henry E. Kyburg, Jr., pages 67–149. King's College Publications, 2007.

B. de Finetti. La prévision: ses lois logiques, ses sources subjectives. Annales de l'Institut Henri Poincaré, 7:1–68, 1937. English translation in Kyburg Jr. and Smokler (1964).

B. de Finetti. Teoria delle Probabilità. Einaudi, Turin, 1970.

B. de Finetti. Theory of Probability, volume 1. John Wiley & Sons, Chichester, 1974. English translation of de Finetti (1970).

B. de Finetti. Theory of Probability, volume 2. John Wiley & Sons, Chichester, 1975. English translation of de Finetti (1970).

P. Diaconis and D. Freedman. Finite exchangeable sequences. The Annals of Probability, 8:745–764, 1980.

W. Feller. An Introduction to Probability Theory and Its Applications, volume II. John Wiley and Sons, New York, 1971.

D. C. Heath and W. D. Sudderth. De Finetti's theorem on exchangeable variables. The American Statistician, 30:188–189, 1976.

C. Heitzinger, A. Hössinger, and S. Selberherr. On Smoothing Three-Dimensional Monte Carlo Ion Implantation Simulation Results. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 22(7):879–883, 2003.

E. Hewitt and L. J. Savage. Symmetric measures on Cartesian products. Transactions of the American Mathematical Society, 80:470–501, 1955.

N. L. Johnson, S. Kotz, and N. Balakrishnan. Discrete Multivariate Distributions. Wiley Series in Probability and Statistics. John Wiley and Sons, New York, 1997.

O. Kallenberg. Foundations of Modern Probability. Springer-Verlag, New York, second edition, 2002.

O. Kallenberg. Probabilistic Symmetries and Invariance Principles. Springer, New York, 2005.

H. E. Kyburg Jr. and H. E. Smokler, editors. Studies in Subjective Probability. Wiley, New York, 1964. Second edition (with new material) 1980.

G. G. Lorentz. Bernstein Polynomials. Chelsea Publishing Company, New York, NY, second edition, 1986.

E. Miranda and G. de Cooman. Marginal extension in the theory of coherent lower previsions. International Journal of Approximate Reasoning, 2006. doi: 10.1016/j.ijar.2006.12.009. In press.

E. Miranda, G. de Cooman, and E. Quaeghebeur. The Hausdorff moment problem under finite additivity. Journal of Theoretical Probability, 2007. doi: 10.1007/s10959-007-0055-4. In press.

H. Prautzsch, W. Boehm, and M. Paluszny. Bézier and B-Spline Techniques. Springer, Berlin, 2002.

W. Trump and H. Prautzsch. Arbitrary degree elevation of Bézier representations. Computer Aided Geometric Design, 13:387–398, 1996.

P. Walley. Statistical Reasoning with Imprecise Probabilities. Chapman and Hall, London, 1991.

P. Walley, R. Pelessoni, and P. Vicig. Direct algorithms for checking consistency and making inferences from conditional probability assessments. Journal of Statistical Planning and Inference, 126:119–151, 2004.

P. Whittle. Probability via Expectation. Springer, New York, fourth edition, 2000.

S. L. Zabell. Predicting the unpredictable. Synthese, 90:205–232, 1992. Reprinted in Zabell (2005).

S. L. Zabell. Symmetry and Its Discontents: Essays on the History of Inductive Probability. Cambridge Studies in Probability, Induction, and Decision Theory. Cambridge University Press, Cambridge, UK, 2005.

GHENT UNIVERSITY, SYSTEMS RESEARCH GROUP, TECHNOLOGIEPARK–ZWIJNAARDE 914, 9052 ZWIJNAARDE, BELGIUM
E-mail address: gert.decooman@ugent.be, erik.quaeghebuer@ugent.be

REY JUAN CARLOS UNIVERSITY, DEPT. OF STATISTICS AND OPERATIONS RESEARCH. C-TULIPÁN, S/N, 28933, MÓSTOLES, SPAIN
E-mail address: enrique.miranda@urjc.es
