Order-distance and other metric-like functions on jointly distributed random variables

Ehtibar N. Dzhafarov∗ and Janne V. Kujala†

Abstract

We construct a class of real-valued nonnegative binary functions on a set of jointly distributed random variables, which satisfy the triangle inequality and vanish at identical arguments (pseudo-quasi-metrics). These functions are useful in dealing with the problem of selective probabilistic causality encountered in behavioral sciences and in quantum physics. The problem reduces to that of ascertaining the existence of a joint distribution for a set of variables with known distributions of certain subsets of this set. Any violation of the triangle inequality or its consequences by one of our functions when applied to such a set rules out the existence of this joint distribution. We focus on an especially versatile and widely applicable pseudo-quasi-metric called an order-distance and its special case called a classification distance.

Keywords: Bell-CHSH-Fine inequalities, Einstein-Podolsky-Rosen paradigm, probabilistic causality in behavioral sciences, pseudo-quasi-metrics on random variables, quantum entanglement, selective influences.

2010 Mathematics Subject Classification: Primary 60B99, Secondary 81Q99, 91E45.

We show how certain metric-like functions on jointly distributed random variables (pseudo-quasi-metrics, introduced in Section 1) can be used in dealing with the problem of selective probabilistic causality (introduced in Section 2), illustrating this on examples taken from behavioral sciences and quantum physics (Section 3). Although most of Section 2 applies to arbitrary pseudo-quasi-metrics on jointly distributed random variables, we single out one, termed order-distance, which is especially useful due to its versatility. We discuss examples of other pseudo-quasi-metrics and rules for their construction in Section 4.
∗ Corresponding author. Purdue University, USA, ehtibar@purdue.edu. Supported by AFOSR grant FA9550-09-1-0252.
† University of Jyväskylä, Finland, janne.v.kujala@jyu.fi. Supported by Academy of Finland grant 121855.

1 Order p.q.-metrics

Random variables in this paper are understood in the broadest sense, as measurable functions $X : V_s \to V$, no restrictions being imposed on the sample spaces $(V_s, \Sigma_s, \mu_s)$ and the induced probability spaces $(V, \Sigma, \mu)$, with the usual meaning of the terms (sets of values $V_s, V$, sigma-algebras $\Sigma_s, \Sigma$, and probability measures $\mu_s, \mu$). In particular, any set $X$ of jointly distributed random variables (functions on the same sample space) is a random variable, and its induced probability space (or, simply, distribution) $\overline{X} = (V, \Sigma, \mu)$ is referred to as the joint distribution of its elements.

Given a class of random variables $\mathcal{X}$, not necessarily jointly distributed, let $\mathcal{X}^*$ be the class of distributions $\overline{X}$ for all $X \in \mathcal{X}$. For any class function $f^* : \mathcal{X}^* \to \mathbb{R}$ (reals), the function $f : \mathcal{X} \to \mathbb{R}$ defined by $f(X) = f^*(\overline{X})$ is called observable (as it does not depend on sample spaces, which are typically unobservable). We will conveniently confuse $f$ and $f^*$ for observable functions, so that if $f$ is defined on $\mathcal{X}$, then $f(Y)$, identified with $f^*(\overline{Y})$, is also defined for any $Y \notin \mathcal{X}$ with $\overline{Y} \in \mathcal{X}^*$. (This convention is used in Section 2, when we apply a function defined on a set of random variables $H$ to different but identically distributed sets of $A$-variables.)

For an arbitrary nonempty set $\Omega$, let $H = \{H_\omega : \omega \in \Omega\}$ be an indexed set of jointly distributed random variables $H_\omega$ with distributions $\overline{H_\omega} = (V_\omega, \Sigma_\omega, \mu_\omega)$. For any $\alpha, \beta \in \Omega$, the ordered pair $(H_\alpha, H_\beta)$ is a random variable with distribution $(V_\alpha \times V_\beta, \Sigma_\alpha \times \Sigma_\beta, \mu_{\alpha,\beta})$, and $H \times H$ is a set of jointly distributed random variables (hence also a random variable).
Definition 1.1. We call an observable function $d : H \times H \to \mathbb{R}$ a pseudo-quasi-metric (p.q.-metric) on $H$ if, for all $\alpha, \beta, \gamma \in \Omega$,
(i) $d(H_\alpha, H_\beta) \ge 0$,
(ii) $d(H_\alpha, H_\alpha) = 0$,
(iii) $d(H_\alpha, H_\gamma) \le d(H_\alpha, H_\beta) + d(H_\beta, H_\gamma)$.

For terminological clarity, the conventional pseudometrics (also called semimetrics) are obtained by adding the property $d(H_\alpha, H_\beta) = d(H_\beta, H_\alpha)$; the conventional quasimetrics are obtained by adding the property $\alpha \ne \beta \Rightarrow d(H_\alpha, H_\beta) > 0$. A conventional metric is both a pseudometric and a quasimetric. (See, e.g., Zolotarev, 1976, for discussion of a variety of metrics and pseudometrics on random variables.)

By an obvious argument we can generalize the triangle inequality (iii): for any $H_{\alpha_1}, \ldots, H_{\alpha_l} \in H$ ($l \ge 3$),

$$d(H_{\alpha_1}, H_{\alpha_l}) \le \sum_{i=2}^{l} d(H_{\alpha_{i-1}}, H_{\alpha_i}). \tag{1}$$

We refer to this inequality (which plays a central role in this paper) as the chain inequality.

Let

$$R \subset \bigcup_{(\alpha,\beta) \in \Omega \times \Omega} V_\alpha \times V_\beta,$$

and let us write $a \preceq b$ to designate $(a, b) \in R$. Let $R$ be a total order, that is, transitive, reflexive, and connected in the sense that for any $(a, b) \in \bigcup_{(\alpha,\beta) \in \Omega \times \Omega} V_\alpha \times V_\beta$, at least one of the relations $a \preceq b$ and $b \preceq a$ holds. We define the equivalence $a \sim b$ and the strict order $a \prec b$ induced by $\preceq$ in the usual way. Finally, we assume that for any $(\alpha, \beta) \in \Omega \times \Omega$, the sets

$$\{(a, b) : a \in V_\alpha,\ b \in V_\beta,\ a \preceq b\}$$

are $\mu_{\alpha,\beta}$-measurable. This implies the $\mu_{\alpha,\beta}$-measurability of the sets

$$\{(a, b) : a \in V_\alpha,\ b \in V_\beta,\ a \prec b\}, \qquad \{(a, b) : a \in V_\alpha,\ b \in V_\beta,\ a \sim b\}.$$

Thus, if all $V_\omega$ are intervals of reals, $\preceq$ can be chosen to coincide with $\le$, and (assuming the usual Borel sigma-algebra) all the properties above are satisfied.
Another example: for arbitrary $V_\omega$, provided each $\Sigma_\omega$ contains at least $n > 1$ disjoint nonempty sets, one can partition $V_\omega$ as $\bigcup_{k=1}^{n} V_\omega^{(k)}$, with $V_\omega^{(k)} \in \Sigma_\omega$, and put $a \preceq b$ if and only if $a \in V_\alpha^{(k)}$, $b \in V_\beta^{(l)}$ and $k \le l$. Again, all properties above are clearly satisfied.

Definition 1.2. The function

$$D(H_\alpha, H_\beta) = \Pr[H_\alpha \prec H_\beta] = \int_{a \prec b} d\mu_{\alpha,\beta}(a, b)$$

is called an order p.q.-metric, or order-distance, on $H$.

That the definition is well-constructed follows from

Theorem 1.3. Order-distance $D$ is a p.q.-metric on $H$.

Proof. Let $\alpha, \beta, \gamma \in \Omega$, and $H_\alpha = A$, $H_\beta = B$, and $H_\gamma = X$. That $D(A, B)$ is determined by the distribution of $(A, B)$ is obvious from the definition. The properties $D(A, B) \ge 0$ and $D(A, A) = 0$ are obvious too. To prove the triangle inequality, decompose each probability according to the position of the third variable:

$$\begin{aligned}
D(A, B) = \Pr[A \prec B] ={}& \Pr[A \prec B \prec X] + \Pr[A \prec B \sim X] + \Pr[A \prec X \prec B] \\ &+ \Pr[A \sim X \prec B] + \Pr[X \prec A \prec B], \\
D(A, X) = \Pr[A \prec X] ={}& \Pr[A \prec X \prec B] + \Pr[A \prec B \sim X] + \Pr[A \prec B \prec X] \\ &+ \Pr[A \sim B \prec X] + \Pr[B \prec A \prec X], \\
D(X, B) = \Pr[X \prec B] ={}& \Pr[X \prec B \prec A] + \Pr[X \prec A \sim B] + \Pr[X \prec A \prec B] \\ &+ \Pr[A \sim X \prec B] + \Pr[A \prec X \prec B].
\end{aligned}$$

So

$$\begin{aligned}
D(A, X) + D(X, B) - D(A, B) ={}& \Pr[B \prec A \prec X] + \Pr[A \sim B \prec X] + \Pr[X \prec B \prec A] \\ &+ \Pr[X \prec A \sim B] + \Pr[A \prec X \prec B] \ge 0.
\end{aligned}$$

Since in the last expression all events are pairwise exclusive, we have

$$D(A, X) + D(X, B) - D(A, B) \le 1.$$

This may seem an attractive addition to the triangle inequality. The inequality is redundant, however, as it is subsumed by the triangle inequalities holding on $\{A, B, X\}$. Rewriting the expression above as

$$D(A, B) + 1 - D(X, B) - D(A, X) \ge 0,$$

it immediately follows from

$$D(A, B) + D(B, X) - D(A, X) \ge 0$$

and

$$D(B, X) = \Pr[B \prec X] \le 1 - \Pr[X \prec B] = 1 - D(X, B).$$
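As a numerical illustration of Theorem 1.3, the following Python sketch computes the order-distance for a finite joint distribution and checks properties (i)-(iii) of Definition 1.1 by brute force. The representation of a joint distribution as a dictionary from outcome tuples to probabilities, and the helper names `order_distance`, `random_joint`, and `is_pq_metric`, are assumptions made for this example only; the order $\preceq$ is taken to be the usual $\le$ on numbers.

```python
import itertools
import random


def order_distance(joint, i, j):
    """D(H_i, H_j) = Pr[H_i < H_j] for a joint pmf {outcome tuple: probability}."""
    return sum(p for outcome, p in joint.items() if outcome[i] < outcome[j])


def random_joint(n_vars, values, rng):
    """Hypothetical helper: a random joint pmf on all n_vars-tuples of values."""
    outcomes = list(itertools.product(values, repeat=n_vars))
    weights = [rng.random() for _ in outcomes]
    total = sum(weights)
    return {o: w / total for o, w in zip(outcomes, weights)}


def is_pq_metric(joint, n_vars, tol=1e-12):
    """Check nonnegativity, D(A, A) = 0, and the triangle inequality."""
    idx = range(n_vars)
    D = {(i, j): order_distance(joint, i, j) for i in idx for j in idx}
    if any(D[i, i] != 0 for i in idx) or any(v < 0 for v in D.values()):
        return False
    # Triangle inequality D(A, B) <= D(A, X) + D(X, B) for every triple.
    return all(D[a, b] <= D[a, x] + D[x, b] + tol
               for a, b, x in itertools.product(idx, repeat=3))


rng = random.Random(0)
joint = random_joint(3, (1, 2, 3), rng)
print(is_pq_metric(joint, 3))

# The order-distance is generally not symmetric: here H_0 < H_1 with probability 1.
asym = {(1, 2): 1.0}
print(order_distance(asym, 0, 1), order_distance(asym, 1, 0))
```

The second example shows why $D$ is only a quasi-type distance: $D(A, B)$ and $D(B, A)$ may differ, and $D(B, A) = 0$ does not force $A$ and $B$ to coincide.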
2 Selective probabilistic causality

Consider an indexed set $W = \{W^\lambda : \lambda \in \Lambda\}$, with each $W^\lambda$ being a set referred to as a (deterministic) input, with the elements of $\{\lambda\} \times W^\lambda$ called input points. Input points therefore are pairs of the form $x = (\lambda, w)$ and should not be confused with input values $w$. A nonempty set $\Phi \subset \prod_{\lambda \in \Lambda} W^\lambda$ is called a set of (allowable) treatments; a treatment therefore is also a set of pairs of the form $(\lambda, w)$.

Let there be a collection of sets of random variables, referred to as (random) outputs,

$$A_\phi = \{A^\lambda_\phi : \lambda \in \Lambda\}, \quad \phi \in \Phi,$$

such that the distribution of $A_\phi$ (i.e., the joint distribution of all $A^\lambda_\phi$ in $A_\phi$) is known for every treatment $\phi$. We define

$$A^\lambda = \{A^\lambda_\phi : \phi \in \Phi\}, \quad \lambda \in \Lambda,$$

with the understanding that $A^\lambda$ is not a random variable (i.e., $A^\lambda_\phi$ for different $\phi$ are not jointly distributed).

The following problem is encountered in a wide variety of contexts (see Dzhafarov, 2003; Dzhafarov & Gluhovsky, 2006; Kujala & Dzhafarov, 2008). We say that the dependence of random outputs $A^\lambda_\phi$ on the deterministic inputs $W^\lambda$ is (canonically) selective if, for every $\lambda \in \Lambda$ and every $\phi \in \Phi$, the output $A^\lambda_\phi$ is "influenced" by none of the input points in $\phi$ except, possibly, for the one belonging to $\{\lambda\} \times W^\lambda$. The question is how one should define this selectivity of "influences" rigorously, and how one can determine whether this selectivity holds. This problem was introduced to behavioral sciences in Sternberg (1969) and Townsend (1984). In quantum physics, using different terminology, it was introduced in Bell (1964) and elaborated in Fine (1982a-b). The definition can be given in several equivalent forms, of which we present the one focal for the present context.

Definition 2.1. The dependence of $\{A^\lambda : \lambda \in \Lambda\}$ on $\{W^\lambda : \lambda \in \Lambda\}$ (or the "influence" of the latter on the former) is (canonically) selective if there is a set of jointly distributed random variables

$$H = \{H^\lambda_w : w \in W^\lambda,\ \lambda \in \Lambda\}$$

(one random variable for every value of every input), such that, for every $\phi \in \Phi$,

$$\overline{H_\phi} = \overline{A_\phi},$$

where $H_\phi = \{H^\lambda_w : (\lambda, w) \in \phi,\ \lambda \in \Lambda\}$ and $A_\phi = \{A^\lambda_\phi : \lambda \in \Lambda\}$ (the corresponding elements of $H_\phi$ and $A_\phi$ being those sharing the same $\lambda$).

This definition is known as the Joint Distribution Criterion (JDC) for selectivity of influences, and the set $H$ satisfying this definition is referred to as a (hypothetical) JDC-set. Specialized forms of this criterion in quantum physics can be found in Suppes & Zanotti (1981) and Fine (1982a-b); in the behavioral context and in complete generality this criterion is given (derived from an equivalent definition) in Dzhafarov & Kujala (2010).

Remark 2.2. The adjective "canonical" in the definition refers to the one-to-one correspondence between $W^\lambda$ and $A^\lambda$ sharing the same $\lambda$. A seemingly more general scheme, in which different $A^\lambda$ are selectively influenced by different (possibly overlapping) subsets of $\{W^\lambda : \lambda \in \Lambda\}$, is always reducible to the canonical form by considering, for every $A^\lambda$, the Cartesian product of the inputs influencing it a single input, and redefining correspondingly the sets of input points and the set of allowable treatments.

The simplest consequence of JDC is that the selectivity of influences implies marginal selectivity (Dzhafarov, 2003; Townsend & Schweickert, 1989), defined as follows. For any $\Lambda' \subset \Lambda$ we can uniquely present any $\phi \in \Phi$ as $\phi' \cup \phi''$, where $\phi' \in \prod_{\lambda \in \Lambda'} W^\lambda$ and $\phi'' \in \prod_{\lambda \in \Lambda - \Lambda'} W^\lambda$. Then, if JDC is satisfied, the joint distribution of $\{A^\lambda_{\phi' \cup \phi''} : \lambda \in \Lambda'\}$ does not depend on $\phi''$.

Remark 2.3. In the following we always assume that marginal selectivity is satisfied.
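For the simplest case of two inputs and two outputs ($\Lambda = \{1, 2\}$), marginal selectivity says that the distribution of $A^1_\phi$ does not depend on the value of the second input, and the distribution of $A^2_\phi$ does not depend on the value of the first. The Python sketch below checks this for finite tables. The data layout (a dictionary from treatments to joint probability tables) and the names `marginal` and `is_marginally_selective` are assumptions of this illustration, as are the numbers in the two example designs.

```python
from collections import defaultdict


def marginal(table, k):
    """Marginal pmf of the k-th output from a joint table {(a1, a2): prob}."""
    m = defaultdict(float)
    for outcome, p in table.items():
        m[outcome[k]] += p
    return dict(m)


def is_marginally_selective(tables, tol=1e-9):
    """tables maps a treatment (w1, w2) to the joint pmf of (A^1, A^2).

    Output k must have the same marginal in all treatments that share
    the value of input k.
    """
    for k in (0, 1):
        reference = {}  # input value -> first marginal seen for output k
        for treatment, table in tables.items():
            m = marginal(table, k)
            ref = reference.setdefault(treatment[k], m)
            values = set(m) | set(ref)
            if any(abs(m.get(v, 0.0) - ref.get(v, 0.0)) > tol for v in values):
                return False
    return True


# Made-up 2 x 2 design with binary outputs coded 0 and 1.
tables_ok = {
    ("x", "y"): {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4},
    ("x", "y'"): {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25},
    ("x'", "y"): {(0, 0): 0.3, (0, 1): 0.0, (1, 0): 0.2, (1, 1): 0.5},
    ("x'", "y'"): {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.4, (1, 1): 0.3},
}
# Same design, but the marginal of A^1 at x now changes with the second input.
tables_bad = dict(tables_ok)
tables_bad[("x", "y'")] = {(0, 0): 0.4, (0, 1): 0.4, (1, 0): 0.1, (1, 1): 0.1}

print(is_marginally_selective(tables_ok), is_marginally_selective(tables_bad))
```

Passing this check is only a necessary condition: marginal selectivity can hold while no JDC-set exists, which is what the distance tests below are designed to detect.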
The relevance of the order-distance and other p.q.-metrics on the sets of jointly distributed random variables to the problem of selectivity lies in the general test (necessary condition) for selectivity of influences, formulated after the following definition.

Definition 2.4. We call a sequence of input points

$$x_1 = (\alpha_1, w_1), \ldots, x_l = (\alpha_l, w_l)$$

(where $w_i \in W^{\alpha_i}$ for $i = 1, \ldots, l$, with $l \ge 3$) treatment-realizable if there are treatments $\phi_1, \ldots, \phi_l \in \Phi$ (not necessarily pairwise distinct), such that

$$\{x_1, x_l\} \subset \phi_1 \quad\text{and}\quad \{x_{i-1}, x_i\} \subset \phi_i \text{ for } i = 2, \ldots, l.$$

If a JDC-set $H$ exists, then for any p.q.-metric $d$ on $H$ we should have

$$d\left(H^{\alpha_1}_{w_1}, H^{\alpha_l}_{w_l}\right) = d\left(A^{\alpha_1}_{\phi_1}, A^{\alpha_l}_{\phi_1}\right)$$

and

$$d\left(H^{\alpha_{i-1}}_{w_{i-1}}, H^{\alpha_i}_{w_i}\right) = d\left(A^{\alpha_{i-1}}_{\phi_i}, A^{\alpha_i}_{\phi_i}\right) \quad\text{for } i = 2, \ldots, l,$$

whence

$$d\left(A^{\alpha_1}_{\phi_1}, A^{\alpha_l}_{\phi_1}\right) \le \sum_{i=2}^{l} d\left(A^{\alpha_{i-1}}_{\phi_i}, A^{\alpha_i}_{\phi_i}\right). \tag{2}$$

This chain inequality, written entirely in terms of observable probabilities, is referred to as a p.q.-metric test for selectivity of influences. If this inequality is violated for at least one treatment-realizable sequence of input points, no JDC-set $H$ exists, and the selectivity is ruled out. Note: if the sequence $\phi_1, \ldots, \phi_l \in \Phi$ for a given $x_1, \ldots, x_l$ can be chosen in more than one way, the observable quantities $d\left(A^{\alpha_1}_{\phi_1}, A^{\alpha_l}_{\phi_1}\right)$ and $d\left(A^{\alpha_{i-1}}_{\phi_i}, A^{\alpha_i}_{\phi_i}\right)$ remain invariant due to the (tacitly assumed) marginal selectivity.

As an example, let $\Lambda = \{1, 2\}$, $W^1 = [0, 1]$, $W^2 = [0, 1]$, $\Phi = W^1 \times W^2$. For any $\phi = \{(1, v), (2, w)\} = (v, w)$, let $\{A^1_\phi, A^2_\phi\}$ have a bivariate normal distribution with zero means, unit variances, and correlation $\rho = \min(1, v + w)$. Marginal selectivity is trivially satisfied. Do $\{W^1, W^2\}$ influence $\{A^1, A^2\}$ selectively?
For any bivariate normally distributed $(A, B)$, let us define $A \prec B$ iff $A < 0$, $B \ge 0$. Then the corresponding order-distance on the hypothetical JDC-set $H$ is

$$D\left(H^1_v, H^2_w\right) = \frac{\arccos(\min(1, v + w))}{2\pi}.$$

The sequence of input points $(1, 0), (2, 1), (1, 1), (2, 0)$ is treatment-realizable, so if $H$ exists, we should have

$$D\left(H^1_0, H^2_0\right) \le D\left(H^1_0, H^2_1\right) + D\left(H^2_1, H^1_1\right) + D\left(H^1_1, H^2_0\right).$$

The numerical substitutions yield, however,

$$\tfrac{1}{4} \le 0 + 0 + 0,$$

and as this is false, the hypothesis that $\{W^1, W^2\}$ influence $\{A^1, A^2\}$ selectively is rejected.

The theorem below and its corollary show that one only needs to check the chain inequality for a special subset of all possible treatment-realizable sequences $x_1, \ldots, x_l$.

Definition 2.5. A treatment-realizable sequence $x_1, \ldots, x_l$ is called irreducible if $x_1 \ne x_l$ and the only subsequences $\{x_{i_1}, \ldots, x_{i_k}\}$ with $k > 1$ that are subsets of treatments are pairs $\{x_1, x_l\}$ and $\{x_{i-1}, x_i\}$, for $i = 2, \ldots, l$. Otherwise the sequence is reducible.

Theorem 2.6. Given a p.q.-metric $d$ on the hypothetical JDC-set $H$, inequality (2) is satisfied for all treatment-realizable sequences if and only if this inequality holds for all irreducible sequences.

Proof. We prove this theorem by showing that if (2) is violated for some reducible sequence $x_1, \ldots, x_l$, then it is violated for some proper subsequence thereof. Clearly, $x_1 \ne x_l$ because otherwise (2) is not violated. For $l = 3$, $x_1, x_2, x_3$ is reducible only if it is contained in a treatment: but then (2) would be satisfied. So $l > 3$, and the reducibility of $x_1, \ldots, x_l$ means that there is a pair $\{x_p, x_q\}$ belonging to a treatment, with $(p, q) \ne (1, l)$ and $q > p + 1$. But then (2) must be violated for either $x_p, \ldots, x_q$ or $x_1, \ldots, x_p, x_q, \ldots, x_l$ (allowing for $p = 1$ or $q = l$ but not both). Indeed, if (2) held for both of these subsequences, substituting the bound on the distance between the outputs at $x_p$ and $x_q$ obtained from the first into the second would yield (2) for the entire sequence.
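The arithmetic of the bivariate normal example above can be reproduced in a few lines of Python. The sketch below uses the closed form $\arccos(\rho)/2\pi$ for $\Pr[A < 0, B \ge 0]$ and adds a rough Monte Carlo check of that closed form; the function names, the sample size, and the seed are choices made for this illustration only.

```python
import math
import random


def order_distance(v, w):
    """Order-distance between the outputs at input values v and w.

    Equals Pr[A < 0, B >= 0] = arccos(rho) / (2 pi) for standard bivariate
    normal (A, B) with correlation rho = min(1, v + w). By symmetry the same
    value applies when the roles of the two outputs are exchanged.
    """
    rho = min(1.0, v + w)
    return math.acos(rho) / (2 * math.pi)


def simulated_distance(rho, n=100_000, seed=0):
    """Monte Carlo estimate of Pr[A < 0, B >= 0] for correlation rho."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        a = z1
        b = rho * z1 + math.sqrt(1 - rho * rho) * z2  # corr(a, b) = rho
        hits += (a < 0 and b >= 0)
    return hits / n


# Chain inequality for the sequence (1,0), (2,1), (1,1), (2,0):
# D(H^1_0, H^2_0) <= D(H^1_0, H^2_1) + D(H^2_1, H^1_1) + D(H^1_1, H^2_0).
lhs = order_distance(0, 0)
rhs = order_distance(0, 1) + order_distance(1, 1) + order_distance(1, 0)
violated = lhs > rhs
print(lhs, rhs, violated)
```

Every term on the right-hand side corresponds to correlation 1, for which the event $A < 0 \le B$ has probability zero, while the left-hand side corresponds to independent outputs; this is what makes the violation so stark in this example.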
If $\Phi = \prod_{\lambda \in \Lambda} W^\lambda$ (all logically possible treatments are allowable), then any subsequence $x_{i_1}, \ldots, x_{i_k}$ of input points with pairwise distinct $\alpha_{i_1}, \ldots, \alpha_{i_k}$ belongs to some treatment. Therefore an irreducible sequence cannot contain points of more than two inputs, and it is easy to see that then it must be a sequence of pairwise distinct $x_1 \in \{\alpha\} \times W^\alpha$, $x_2 \in \{\beta\} \times W^\beta$, ..., $x_{2m-1} \in \{\alpha\} \times W^\alpha$, $x_{2m} \in \{\beta\} \times W^\beta$ ($\alpha \ne \beta$). It is also easy to see that if $m > 2$, each of the subsets $\{x_1, x_4\}$ and $\{x_2, x_5\}$ will belong to a treatment. Hence $m = 2$ is the only possibility for an irreducible sequence.

Corollary 2.7. If $\Phi = \prod_{\lambda \in \Lambda} W^\lambda$, then inequality (2) is satisfied for all treatment-realizable sequences if and only if this inequality holds for all tetradic sequences of the form $x, y, s, t$, with $x, s \in \{\alpha\} \times W^\alpha$, $y, t \in \{\beta\} \times W^\beta$, $x \ne s$, $y \ne t$, $\alpha \ne \beta$.

Remark 2.8. This formulation is given in Dzhafarov and Kujala (2010), although there it is unnecessarily confined to metrics of a special kind.

3 An application

The four tables below represent results of an experiment with a $2 \times 2$ factorial design, $\{x, x'\} \times \{y, y'\}$, and two binary responses, $A$ and $B$. In relation to our general notation, we have here $\Lambda = \{1, 2\}$, $W^1 = \{x, x'\}$, $W^2 = \{y, y'\}$, and four treatments $(x, y), \ldots, (x', y')$; for every treatment $\phi$, the random outputs $A^1_\phi$ and $A^2_\phi$ are represented by, respectively, $A_\phi$ and $B_\phi$, each having two possible values, arbitrarily labeled. This design is arguably the simplest possible, and it is ubiquitous in science. In a psychological double-detection experiment (see, e.g., Townsend & Nozawa, 1995), the input values may represent presence ($x$ and $y$) or absence ($x'$ and $y'$) of a designated signal in two stimuli labeled 1 and 2, presented side-by-side.
The participant in such an experiment is asked to indicate whether the signal was present or absent in stimulus 1 and in stimulus 2. The output values $A = \circ$ and $B = \sqcap$ may indicate either that the response was "signal present" or that the response was correct; and analogously for $A = \bullet$ and $B = \sqcup$ (either "signal absent" or an incorrect response). The entries $p_{ij}$, $q_{ij}$, etc. represent joint probabilities of the corresponding outcomes; $a_{i\cdot}$, $a'_{i\cdot}$, etc. represent marginal probabilities. The question to be answered is: does the response to a given stimulus ($A$ to 1 and $B$ to 2) selectively depend on that stimulus alone (despite $A$ and $B$ being stochastically dependent for every treatment), or is $A$ or $B$ influenced by both 1 and 2?

$$\begin{array}{c|cc|c}
\phi = (x, y) & B_{xy} = \sqcup & B_{xy} = \sqcap & \\ \hline
A_{xy} = \bullet & p_{11} & p_{12} & a_{1\cdot} \\
A_{xy} = \circ & p_{21} & p_{22} & a_{2\cdot} \\ \hline
 & b_{\cdot 1} & b_{\cdot 2} &
\end{array}
\qquad
\begin{array}{c|cc|c}
\phi = (x', y) & B_{x'y} = \sqcup & B_{x'y} = \sqcap & \\ \hline
A_{x'y} = \bullet & r_{11} & r_{12} & a'_{1\cdot} \\
A_{x'y} = \circ & r_{21} & r_{22} & a'_{2\cdot} \\ \hline
 & b_{\cdot 1} & b_{\cdot 2} &
\end{array}$$

$$\begin{array}{c|cc|c}
\phi = (x, y') & B_{xy'} = \sqcup & B_{xy'} = \sqcap & \\ \hline
A_{xy'} = \bullet & q_{11} & q_{12} & a_{1\cdot} \\
A_{xy'} = \circ & q_{21} & q_{22} & a_{2\cdot} \\ \hline
 & b'_{\cdot 1} & b'_{\cdot 2} &
\end{array}
\qquad
\begin{array}{c|cc|c}
\phi = (x', y') & B_{x'y'} = \sqcup & B_{x'y'} = \sqcap & \\ \hline
A_{x'y'} = \bullet & s_{11} & s_{12} & a'_{1\cdot} \\
A_{x'y'} = \circ & s_{21} & s_{22} & a'_{2\cdot} \\ \hline
 & b'_{\cdot 1} & b'_{\cdot 2} &
\end{array}$$

Another important situation in which we encounter formally the same problem is the Einstein-Podolsky-Rosen (EPR) paradigm. Two particles are emitted from a common source in such a way that they remain entangled (have highly correlated properties, such as momenta or spins) as they run away from each other (Aspect, 1999; Mermin, 1985). An experiment may consist, e.g., in measuring the spin of electron 1 along one of two axes, $x$ or $x'$, and (in another location but simultaneously in some inertial frame of reference) measuring the spin of electron 2 along one of two axes, $y$ or $y'$.
The outcome $A$ of a measurement on electron 1 is a random variable with two possible values, "up" or "down," and the same holds for $B$, the outcome of a measurement on electron 2. The question here is: do the measurements on electrons 1 and 2 selectively affect, respectively, $A$ and $B$ (even though generally $A$ and $B$ are not independent at any of the four combinations of spin axes)? If the answer is negative, then the measurement of one electron affects the outcome of the measurement of another electron even though no signal can be exchanged between two distant events that are simultaneous in some frame of reference. What makes this situation formally identical to the double-detection example described above is that the measurements performed along different axes on the same particle, $x$ and $x'$ or $y$ and $y'$, are non-commuting, i.e., they cannot be performed simultaneously. This makes it possible to consider such measurements as mutually exclusive values of an input.

Theorem 3.1 [Fine, 1982a-b]. A JDC-set $H = \{H^1_x, H^1_{x'}, H^2_y, H^2_{y'}\}$ satisfying

$$\overline{\{H^1_x, H^2_y\}} = \overline{\{A_{xy}, B_{xy}\}}, \quad \overline{\{H^1_x, H^2_{y'}\}} = \overline{\{A_{xy'}, B_{xy'}\}},$$
$$\overline{\{H^1_{x'}, H^2_y\}} = \overline{\{A_{x'y}, B_{x'y}\}}, \quad \overline{\{H^1_{x'}, H^2_{y'}\}} = \overline{\{A_{x'y'}, B_{x'y'}\}}$$

exists if and only if the following eight inequalities are satisfied:

$$\begin{aligned}
-1 &\le p_{11} + r_{11} + s_{11} - q_{11} - a'_{1\cdot} - b_{\cdot 1} \le 0, \\
-1 &\le q_{11} + s_{11} + r_{11} - p_{11} - a'_{1\cdot} - b'_{\cdot 1} \le 0, \\
-1 &\le r_{11} + p_{11} + q_{11} - s_{11} - a_{1\cdot} - b_{\cdot 1} \le 0, \\
-1 &\le s_{11} + q_{11} + p_{11} - r_{11} - a_{1\cdot} - b'_{\cdot 1} \le 0.
\end{aligned} \tag{3}$$

We refer to (3) as Bell-CHSH-Fine inequalities, where CHSH abbreviates Clauser, Horne, Shimony, & Holt (1969): in this work Bell's (1964) approach was developed into a special version of (3).

Remark 3.2. The proof given in Fine (1982a-b) that (3) is both necessary and sufficient (under marginal selectivity) for the existence of a JDC-set can be conceptually simplified: the Bell-CHSH-Fine inequalities can be algebraically shown to be the criterion for the existence of a vector $Q$ with 16 probabilities

$$\Pr\left[H^1_x = \bullet, H^1_{x'} = \bullet, H^2_y = \sqcup, H^2_{y'} = \sqcup\right], \ldots, \Pr\left[H^1_x = \circ, H^1_{x'} = \circ, H^2_y = \sqcap, H^2_{y'} = \sqcap\right]$$

that sum to one and whose appropriately chosen partial sums yield the 8 observable probabilities $p_{11}, q_{11}, r_{11}, s_{11}, a_{1\cdot}, b_{\cdot 1}, a'_{1\cdot}, b'_{\cdot 1}$ (other probabilities being determined due to marginal selectivity). This is a simple linear programming task, and the Bell-CHSH-Fine inequalities can be derived "mechanically" by a facet enumeration algorithm (see Werner & Wolf, 2001a-b, and Basoalto & Percival, 2003).

The point of interest in the present context is that the Bell-CHSH-Fine inequalities, whose rather obscure structure does not seem to fit their fundamental importance, turn out to be interpretable as the triangle inequalities for appropriately chosen order-distances. Consider the chain inequalities for the order-distance $D_1$ obtained by putting $\bullet = \sqcup = 1$, $\circ = \sqcap = 2$, and identifying $\preceq$ with $\le$:

$$\begin{aligned}
q_{12} = D_1(H^1_x, H^2_{y'}) &\le D_1(H^1_x, H^2_y) + D_1(H^2_y, H^1_{x'}) + D_1(H^1_{x'}, H^2_{y'}) = p_{12} + r_{21} + s_{12}, \\
p_{12} = D_1(H^1_x, H^2_y) &\le D_1(H^1_x, H^2_{y'}) + D_1(H^2_{y'}, H^1_{x'}) + D_1(H^1_{x'}, H^2_y) = q_{12} + s_{21} + r_{12}, \\
s_{12} = D_1(H^1_{x'}, H^2_{y'}) &\le D_1(H^1_{x'}, H^2_y) + D_1(H^2_y, H^1_x) + D_1(H^1_x, H^2_{y'}) = r_{12} + p_{21} + q_{12}, \\
r_{12} = D_1(H^1_{x'}, H^2_y) &\le D_1(H^1_{x'}, H^2_{y'}) + D_1(H^2_{y'}, H^1_x) + D_1(H^1_x, H^2_y) = s_{12} + q_{21} + p_{12}.
\end{aligned} \tag{4}$$

Consider also the inequalities for the order-distance $D_2$ obtained by putting $\bullet = \sqcap = 1$, $\circ = \sqcup = 2$, and identifying $\preceq$ with $\le$:

$$\begin{aligned}
q_{11} = D_2(H^1_x, H^2_{y'}) &\le D_2(H^1_x, H^2_y) + D_2(H^2_y, H^1_{x'}) + D_2(H^1_{x'}, H^2_{y'}) = p_{11} + r_{22} + s_{11}, \\
p_{11} = D_2(H^1_x, H^2_y) &\le D_2(H^1_x, H^2_{y'}) + D_2(H^2_{y'}, H^1_{x'}) + D_2(H^1_{x'}, H^2_y) = q_{11} + s_{22} + r_{11}, \\
s_{11} = D_2(H^1_{x'}, H^2_{y'}) &\le D_2(H^1_{x'}, H^2_y) + D_2(H^2_y, H^1_x) + D_2(H^1_x, H^2_{y'}) = r_{11} + p_{22} + q_{11}, \\
r_{11} = D_2(H^1_{x'}, H^2_y) &\le D_2(H^1_{x'}, H^2_{y'}) + D_2(H^2_{y'}, H^1_x) + D_2(H^1_x, H^2_y) = s_{11} + q_{22} + p_{11}.
\end{aligned} \tag{5}$$

Theorem 3.3. Each right-hand Bell-CHSH-Fine inequality is equivalent to the corresponding chain inequality in (4) for the order-distance $D_1$. Each left-hand Bell-CHSH-Fine inequality is equivalent to the corresponding chain inequality in (5) for the order-distance $D_2$.

Proof. We show the proof for the first of the Bell-CHSH-Fine double-inequalities. The equivalence of

$$p_{11} + r_{11} + s_{11} - q_{11} - a'_{1\cdot} - b_{\cdot 1} \le 0$$

to

$$q_{12} \le p_{12} + r_{21} + s_{12}$$

obtains by using the identities

$$q_{12} = a_{1\cdot} - q_{11}, \quad p_{12} = a_{1\cdot} - p_{11}, \quad r_{21} = b_{\cdot 1} - r_{11}, \quad s_{12} = a'_{1\cdot} - s_{11}.$$

The equivalence of

$$p_{11} + r_{11} + s_{11} - q_{11} - a'_{1\cdot} - b_{\cdot 1} \ge -1$$

to

$$q_{11} \le p_{11} + r_{22} + s_{11}$$

follows from the identity

$$r_{22} = 1 + r_{11} - a'_{1\cdot} - b_{\cdot 1}.$$

4 Concluding remarks

The order-distances are versatile and have a broad sphere of applicability because order relations on the domains of any given set of random variables can always be defined in many different ways.
If no other structure is available, this can always be done by the partitioning of the domains mentioned in Section 1 and used in the example with bivariate normal distributions in Section 2 as well as for the binary variables of the previous section:

$$V_\omega = \bigcup_{k=1}^{n} V_\omega^{(k)}, \quad V_\omega^{(k)} \in \Sigma_\omega, \quad \omega \in \Omega,$$

putting $a \preceq b$ if and only if $a \in V_\alpha^{(k)}$, $b \in V_\beta^{(l)}$ and $k \le l$. Due to its universality and convenience of use, the resulting order-distance deserves a special name, classification distance.

Under additional constraints one can suggest many other p.q.-metrics on sets of jointly distributed random variables. Thus, if the variables in $H$ are real-valued with the conventional Borel sigma-algebras, one can define, for any $A, B \in H$,

$$d^{(p)}(A, B) = \begin{cases} \sqrt[p]{\mathrm{E}\left[|A - B|^p\right]} & \text{for } 1 \le p < \infty, \\ \operatorname{ess\,sup} |A - B| & \text{for } p = \infty, \end{cases} \tag{6}$$

where

$$\operatorname{ess\,sup} |A - B| = \inf\{v : \Pr[|A - B| \le v] = 1\}.$$

These p.q.-metrics are conventional metrics. In the context of selective influences these metrics have been introduced in Kujala & Dzhafarov (2008) and further analyzed in Dzhafarov & Kujala (2010). An important property of $d^{(p)}$ is that the result of a $d^{(p)}$-based distance-type test is not invariant with respect to input-value-specific transformations of the random variables $A^\lambda_\phi$, $\phi \in \Phi$, $\lambda \in \Lambda$. This means that the test can be performed on a potential infinity of sets of random variables $B^\lambda_\phi = F\left(x^\lambda, A^\lambda_\phi\right)$, with $x^\lambda \in \left(\{\lambda\} \times W^\lambda\right) \cap \phi$.

If the jointly distributed random variables constituting the set $H$ are discrete, one can use information-based p.q.-metrics. Perhaps the simplest of them is

$$h(A \mid B) = -\sum_{a,b} p_{AB}(a, b) \log \frac{p_{AB}(a, b)}{p_B(b)}, \quad A, B \in H, \tag{7}$$

with the conventions $0 \log \frac{0}{0} = 0 \log 0 = 0$. This function is called conditional entropy.
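Formula (7) can be evaluated directly from a joint probability table. The following Python sketch does so for a randomly generated joint distribution of three discrete variables and also checks numerically the triangle inequality $h(A \mid B) \le h(A \mid C) + h(C \mid B)$ that a p.q.-metric must satisfy. The dictionary representation and the names `marginalize` and `cond_entropy` are assumptions of this example; natural logarithms are used.

```python
import itertools
import math
import random


def marginalize(joint, idx):
    """Marginal pmf of the variables at positions idx (keys are tuples)."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in idx)
        out[key] = out.get(key, 0.0) + p
    return out


def cond_entropy(joint, i, j):
    """h(H_i | H_j) as in (7), in nats, with the convention 0 log 0 = 0."""
    p_ij = marginalize(joint, (i, j))
    p_j = marginalize(joint, (j,))
    return -sum(p * math.log(p / p_j[(b,)])
                for (a, b), p in p_ij.items() if p > 0)


# Random joint pmf of three variables with values 0, 1, 2 (illustrative data).
rng = random.Random(1)
outcomes = list(itertools.product((0, 1, 2), repeat=3))
weights = [rng.random() for _ in outcomes]
total = sum(weights)
joint = {o: w / total for o, w in zip(outcomes, weights)}

# h(A | B) <= h(A | C) + h(C | B) for all choices of A, B, C among the three.
triangle_ok = all(
    cond_entropy(joint, a, b)
    <= cond_entropy(joint, a, c) + cond_entropy(joint, c, b) + 1e-9
    for a, b, c in itertools.product(range(3), repeat=3)
)
print(triangle_ok)

# Two independent fair bits: h(A | B) equals log 2.
fair_bits = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
print(cond_entropy(fair_bits, 0, 1))
```

In general $h(A \mid B) \ne h(B \mid A)$, so this function is a p.q.-metric rather than a conventional metric.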
The identity $h(A \mid A) = 0$ is obvious, and the triangle inequality,

$$h(A \mid B) \le h(A \mid C) + h(C \mid B),$$

follows from the standard information theory (in)equalities,

$$h(A \mid B) \le h(A, C \mid B), \qquad h(A, C \mid B) = h(A \mid C, B) + h(C \mid B),$$

and

$$h(A \mid C, B) \le h(A \mid C).$$

Note that, unlike with the distance $d^{(p)}$ above, the test of selectiveness based on $h(A \mid B)$ (and other information-based distances) is invariant with respect to all bijective transformations of the variables. The additively symmetrized (i.e., pseudometric) version of this p.q.-metric, $h(A \mid B) + h(B \mid A)$, is well-known (Cover & Thomas, 1990).

There are numerous ways of creating new p.q.-metrics from the ones already constructed, including those taken from outside probabilistic context. Thus, if $d$ is a p.q.-metric on a set $S$, then, for any set $H$ of jointly distributed random variables taking their values in $S$,

$$D(A, B) = \mathrm{E}[d(A, B)], \quad A, B \in H,$$

is a p.q.-metric on $H$. This follows from the fact that expectation $\mathrm{E}$ preserves inequalities and equalities identically satisfied for all possible realizations of the arguments. Thus, the distance $d^{(1)}(A, B) = \mathrm{E}[|A - B|]$ trivially obtains from the metric $|a - b|$ on reals. In the same way one obtains the well-known Fréchet distance

$$F(A, B) = \mathrm{E}\left[\frac{|A - B|}{1 + |A - B|}\right].$$

Below we present an incomplete list of transformations which, given a p.q.-metric (quasimetric, pseudometric, conventional metric) $d$ on a space $H$ of jointly distributed random variables, produce a new p.q.-metric (respectively, quasimetric, pseudometric, or conventional metric) on the same space. The proofs are trivial or well-known. The arrows $\Longrightarrow$ should be read "can be transformed into."

1. $d \Longrightarrow d^q$ ($0 < q < 1$).
In this way, for example, we can obtain metrics

$$d^{(p,q)}(A, B) = \begin{cases} \left(\mathrm{E}\left[|A - B|^p\right]\right)^{q/p} & \text{for } 1 \le p < \infty, \\ \left(\operatorname{ess\,sup} |A - B|\right)^q & \text{for } p = \infty \end{cases}$$

from the metrics $d^{(p)}$ defined in (6).

2. $d \Longrightarrow d/(1 + d)$, a standard way of creating a bounded p.q.-metric.

3. $d_1, d_2 \Longrightarrow \max\{d_1, d_2\}$ or $d_1, d_2 \Longrightarrow d_1 + d_2$. These transformations can be used to symmetrize p.q.-metrics, $d(A, B) + d(B, A)$ or $\max\{d(A, B), d(B, A)\}$ (although this is never useful when using chain inequalities as necessary conditions: any violation of a chain inequality with the symmetrized quantities implies a violation of this inequality by the original p.q.-metric, but not vice versa).

4. A generalization of the previous: $\{d_\upsilon : \upsilon \in \Upsilon\} \Longrightarrow \sup\{d_\upsilon\}$ and $\{d_\upsilon : \upsilon \in \Upsilon\} \Longrightarrow \mathrm{E}[d_U]$, where $\{d_\upsilon : \upsilon \in \Upsilon\}$ is a family of p.q.-metrics, and $U$ designates a random variable with a probability measure $m$, so that

$$d(A, B) = \int_{\upsilon \in \Upsilon} d_\upsilon(A, B)\, dm(\upsilon).$$

To illustrate the latter way of constructing p.q.-metrics, consider a classification distance with binary partitions: the domain $V_\omega$ of every $H_\omega$ in $H$ is partitioned into two (measurable) subsets, $W^{(1)}_{\omega,\upsilon}$ and $W^{(2)}_{\omega,\upsilon}$. Making these partitions random, i.e., allowing the index $\upsilon$ to randomly vary in any way whatever, we get a new p.q.-metric. In the special case when all random variables in $H$ take their values in the set of real numbers, and $W^{(1)}_{\omega,\upsilon}$ is defined by $z \le \upsilon$ ($z \in V_\omega \subset \mathbb{R}$, $\upsilon \in \mathbb{R}$), the randomization of the partitions reduces to that of the separation point $\upsilon$. The p.q.-metric then becomes

$$d_S(A, B) = \Pr[A \le U < B],$$

where $U$ is some random variable.
An additively symmetrized (i.e., pseudometric) version of this p.q.-metric, $d_S(A, B) + d_S(B, A)$, was introduced in Taylor (1984, 1985) under the name "separation (pseudo)metric," and shown to be a conventional metric if $U$ is chosen stochastically independent of all random variables in $H$.

References

[1] Aspect, A. (1999). Bell's inequality tests: More ideal than ever. Nature 398 189–190.
[2] Basoalto, R.M., & Percival, I.C. (2003). BellTest and CHSH experiments with more than two settings. J. Phys. A 36 7411–7423.
[3] Bell, J. (1964). On the Einstein-Podolsky-Rosen paradox. Physics 1 195–200.
[4] Clauser, J.F., Horne, M.A., Shimony, A., & Holt, R.A. (1969). Proposed experiment to test local hidden-variable theories. Phys. Rev. Lett. 23 880–884.
[5] Dzhafarov, E.N. (2003). Selective influence through conditional independence. Psychometrika 68 7–26.
[6] Dzhafarov, E.N., & Gluhovsky, I. (2006). Notes on selective influence, probabilistic causality, and probabilistic dimensionality. J. Math. Psych. 50 390–401.
[7] Dzhafarov, E.N., & Kujala, J.V. (2010). The Joint Distribution Criterion and the Distance Tests for selective probabilistic causality. Frontiers in Quantitative Psychology and Measurement 1:151 doi: 10.3389/fpsyg.2010.00151.
[8] Fine, A. (1982a). Joint distributions, quantum correlations, and commuting observables. J. Math. Phys. 23 1306–1310.
[9] Fine, A. (1982b). Hidden variables, joint probability, and the Bell inequalities. Phys. Rev. Lett. 48 291–295.
[10] Kujala, J.V., & Dzhafarov, E.N. (2008). Testing for selectivity in the dependence of random variables on external factors. J. Math. Psych. 52 128–144.
[11] Mermin, N.D. (1985). Is the moon there when nobody looks? Reality and the quantum theory. Physics Today 38 38–47.
[12] Sternberg, S. (1969). The discovery of processing stages: Extensions of Donders' method.
In W.G. Koster (Ed.), Attention and Performance II. Acta Psychologica 30 276–315.
[13] Suppes, P., & Zanotti, M. (1981). When are probabilistic explanations possible? Synthese 48 191–199.
[14] Taylor, M.D. (1984). Separation metrics for real-valued random variables. Int. J. Math. Math. Sci. 7 407–408.
[15] Taylor, M.D. (1985). New metrics for weak convergence of distribution functions. Stochastica 9 5–17.
[16] Townsend, J.T. (1984). Uncovering mental processes with factorial experiments. J. Math. Psych. 28 363–400.
[17] Townsend, J.T., & Nozawa, G. (1995). Spatio-temporal properties of elementary perception: An investigation of parallel, serial, and coactive theories. J. Math. Psych. 39 321–359.
[18] Townsend, J.T., & Schweickert, R. (1989). Toward the trichotomy method of reaction times: Laying the foundation of stochastic mental networks. J. Math. Psych. 33 309–327.
[19] Werner, R.F., & Wolf, M.M. (2001a). All multipartite Bell correlation inequalities for two dichotomic observables per site. arXiv:quant-ph/0102024v1.
[20] Werner, R.F., & Wolf, M.M. (2001b). Bell inequalities and entanglement. arXiv:quant-ph/0107093v2.
[21] Zolotarev, V.M. (1976). Metric distances in spaces of random variables and their distributions. Mathematics of the USSR-Sbornik 30 373–401.

PROCEEDINGS OF THE AMERICAN MATHEMATICAL SOCIETY
Volume 00, Number 0, Pages 000–000
S 0002-9939(XX)0000-0

ORDER-DISTANCE AND OTHER METRIC-LIKE FUNCTIONS ON JOINTLY DISTRIBUTED RANDOM VARIABLES

EHTIBAR N. DZHAFAROV AND JANNE V. KUJALA

Abstract. We construct a class of real-valued nonnegative binary functions on a set of jointly distributed random variables, which satisfy the triangle inequality and vanish at identical arguments (pseudo-quasi-metrics).
We apply these functions to the problem of selective probabilistic causality encountered in behavioral sciences and in quantum physics. The problem reduces to that of ascertaining the existence of a joint distribution for a set of variables with known distributions of certain subsets of this set. Any violation of the triangle inequality by one of our functions when applied to such a set rules out the existence of the joint distribution. We focus on an especially versatile and widely applicable class of pseudo-quasi-metrics called order-distances. We show, in particular, that the Bell-CHSH-Fine inequalities of quantum physics follow from the triangle inequalities for appropriately defined order-distances.

We show how certain metric-like functions on jointly distributed random variables (pseudo-quasi-metrics introduced in Section 1) can be used in dealing with the problem of selective probabilistic causality (introduced in Section 2), illustrating this on examples taken from behavioral sciences and quantum physics (Section 3). Although most of Section 2 applies to arbitrary pseudo-quasi-metrics on jointly distributed random variables, we single out one, termed order-distance, which is especially useful due to its versatility. We discuss examples of other pseudo-quasi-metrics and rules for their construction in Section 4.

1. Order p.q.-metrics

Random variables in this paper are understood in the broadest sense, as measurable functions $X: V_s \to V$, no restrictions being imposed on the sample spaces $(V_s, \Sigma_s, \mu_s)$ and the induced probability spaces $(V, \Sigma, \mu)$, with the usual meaning of the terms (sets of values $V_s, V$, sigma-algebras $\Sigma_s, \Sigma$, and probability measures $\mu_s, \mu$).
In particular, any set $X$ of jointly distributed random variables (functions on the same sample space) is a random variable, and its induced probability space (or, simply, distribution) $\overline{X} = (V, \Sigma, \mu)$ is referred to as the joint distribution of its elements. Given a class of random variables $\mathcal{X}$, not necessarily jointly distributed, let $\mathcal{X}^*$ be the class of distributions $\overline{X}$ for all $X \in \mathcal{X}$. For any class function $f^*: \mathcal{X}^* \to \mathbb{R}$ (reals), the function $f: \mathcal{X} \to \mathbb{R}$ defined by $f(X) = f^*(\overline{X})$ is called observable (as it does not depend on sample spaces, typically unobservable). We will conveniently confuse $f$ and $f^*$ for observable functions, so that if $f$ is defined on $\mathcal{X}$, then $f(Y)$, identified with $f^*(\overline{Y})$, is also defined for any $Y \notin \mathcal{X}$ with $\overline{Y} \in \mathcal{X}^*$. (This convention is used in Section 2, when we apply a function defined on a set of random variables $H$ to different but identically distributed sets of $A$-variables.)

2000 Mathematics Subject Classification. Primary 60B99, Secondary 81Q99, 91E45.
Key words and phrases. Bell-CHSH-Fine inequalities, Einstein-Podolsky-Rosen paradigm, probabilistic causality, pseudo-quasi-metrics on random variables, quantum entanglement, selective influences.
First author's work is supported by AFOSR grant FA9550-09-1-0252. Second author's work is supported by Academy of Finland grant 121855.
© XXXX American Mathematical Society

For an arbitrary nonempty set $\Omega$, let $H = \{H_\omega : \omega \in \Omega\}$ be an indexed set of jointly distributed random variables $H_\omega$ with distributions $\overline{H_\omega} = (V_\omega, \Sigma_\omega, \mu_\omega)$. For any $\alpha, \beta \in \Omega$, the ordered pair $(H_\alpha, H_\beta)$ is a random variable with distribution $(V_\alpha \times V_\beta, \Sigma_\alpha \times \Sigma_\beta, \mu_{\alpha,\beta})$, and $H \times H$ is a set of jointly distributed random variables (hence also a random variable).

Definition 1.1.
We call an observable function $d: H \times H \to \mathbb{R}$ a pseudo-quasi-metric (p.q.-metric) on $H$ if, for all $\alpha, \beta, \gamma \in \Omega$,
(i) $d(H_\alpha, H_\beta) \ge 0$,
(ii) $d(H_\alpha, H_\alpha) = 0$,
(iii) $d(H_\alpha, H_\gamma) \le d(H_\alpha, H_\beta) + d(H_\beta, H_\gamma)$.

For terminological clarity, the conventional pseudometrics (also called semimetrics) obtain by adding the property $d(H_\alpha, H_\beta) = d(H_\beta, H_\alpha)$; the conventional quasimetrics are obtained by adding the property $\alpha \ne \beta \Rightarrow d(H_\alpha, H_\beta) > 0$. A conventional metric is both a pseudometric and a quasimetric. (See, e.g., [27] for a discussion of a variety of metrics and pseudometrics on random variables.)

By an obvious argument we can generalize the triangle inequality (iii): for any $H_{\alpha_1}, \ldots, H_{\alpha_l} \in H$ ($l \ge 3$),
\[ d(H_{\alpha_1}, H_{\alpha_l}) \le \sum_{i=2}^{l} d(H_{\alpha_{i-1}}, H_{\alpha_i}). \tag{1.1} \]
We refer to this inequality (which plays a central role in this paper) as the chain inequality.

Let
\[ R \subset \bigcup_{(\alpha,\beta) \in \Omega \times \Omega} V_\alpha \times V_\beta, \]
and write $a \preceq b$ to designate $(a, b) \in R$. Let $R$ be a total order, that is, transitive, reflexive, and connected in the sense that for any $(a, b) \in \bigcup_{(\alpha,\beta) \in \Omega \times \Omega} V_\alpha \times V_\beta$, at least one of the relations $a \preceq b$ and $b \preceq a$ holds. We define the equivalence $a \sim b$ and strict order $a \prec b$ induced by $\preceq$ in the usual way. Finally, we assume that for any $(\alpha, \beta) \in \Omega \times \Omega$, the sets
\[ \{(a, b) : a \in V_\alpha,\ b \in V_\beta,\ a \preceq b\} \]
are $\mu_{\alpha,\beta}$-measurable. This implies the $\mu_{\alpha,\beta}$-measurability of the sets
\[ \{(a, b) : a \in V_\alpha,\ b \in V_\beta,\ a \prec b\}, \qquad \{(a, b) : a \in V_\alpha,\ b \in V_\beta,\ a \sim b\}. \]

Thus, if all $V_\omega$ are intervals of reals, $\preceq$ can be chosen to coincide with $\le$, and (assuming the usual Borel sigma-algebra) all the properties above are satisfied. Another example: for arbitrary $V_\omega$, provided each $\Sigma_\omega$ contains at least $n > 1$ disjoint nonempty sets, one can partition $V_\omega$ as $\bigcup_{k=1}^{n} V_\omega^{(k)}$, with $V_\omega^{(k)} \in \Sigma_\omega$, and put $a \preceq b$ if and only if $a \in V_\alpha^{(k)}$, $b \in V_\beta^{(l)}$ and $k \le l$.
Again, all properties above are clearly satisfied.

Definition 1.2. The function
\[ D(H_\alpha, H_\beta) = \Pr[H_\alpha \prec H_\beta] = \int_{a \prec b} d\mu_{\alpha,\beta}(a, b) \]
is called an order p.q.-metric, or order-distance, on $H$.

That the definition is well-constructed follows from

Theorem 1.3. Order-distance $D$ is a p.q.-metric on $H$.

Proof. Let $\alpha, \beta, \gamma \in \Omega$, and $H_\alpha = A$, $H_\beta = B$, and $H_\gamma = X$. That $D(A, B)$ is determined by the distribution of $(A, B)$ is obvious from the definition. The properties $D(A, B) \ge 0$ and $D(A, A) = 0$ are obvious too. To prove the triangle inequality, write
\[
\begin{aligned}
D(A, B) = \Pr[A \prec B] ={}& \Pr[A \prec B \prec X] + \Pr[A \prec B \sim X] + \Pr[A \prec X \prec B] \\
&+ \Pr[A \sim X \prec B] + \Pr[X \prec A \prec B], \\
D(A, X) = \Pr[A \prec X] ={}& \Pr[A \prec X \prec B] + \Pr[A \prec B \sim X] + \Pr[A \prec B \prec X] \\
&+ \Pr[A \sim B \prec X] + \Pr[B \prec A \prec X], \\
D(X, B) = \Pr[X \prec B] ={}& \Pr[X \prec B \prec A] + \Pr[X \prec A \sim B] + \Pr[X \prec A \prec B] \\
&+ \Pr[A \sim X \prec B] + \Pr[A \prec X \prec B].
\end{aligned}
\]
So
\[
\begin{aligned}
D(A, X) + D(X, B) - D(A, B) ={}& \Pr[B \prec A \prec X] + \Pr[A \sim B \prec X] + \Pr[X \prec B \prec A] \\
&+ \Pr[X \prec A \sim B] + \Pr[A \prec X \prec B] \ge 0. \qquad \square
\end{aligned}
\]

Since in the last expression all events are pairwise exclusive, we have
\[ D(A, X) + D(X, B) - D(A, B) \le 1. \]
This may seem an attractive addition to the triangle inequality. The inequality is redundant, however, as it is subsumed by the triangle inequalities holding on $\{A, B, X\}$. Rewritten as
\[ D(A, B) + 1 - D(X, B) - D(A, X) \ge 0, \]
it follows immediately from
\[ D(A, B) + D(B, X) - D(A, X) \ge 0 \]
and
\[ D(B, X) = \Pr[B \prec X] \le 1 - \Pr[X \prec B] = 1 - D(X, B). \]

2. Selective probabilistic causality

Consider an indexed set $W = \{W^\lambda : \lambda \in \Lambda\}$, with each $W^\lambda$ being a set referred to as a (deterministic) input, with the elements of $\{\lambda\} \times W^\lambda$ called input points.
Input points therefore are pairs of the form $x = (\lambda, w)$, with $w \in W^\lambda$, and should not be confused with input values $w$. A nonempty set $\Phi \subset \prod_{\lambda \in \Lambda} W^\lambda$ is called a set of (allowable) treatments. A treatment therefore is a function $\phi: \Lambda \to \bigcup_{\lambda \in \Lambda} W^\lambda$ such that $\phi(\lambda) \in W^\lambda$ for any $\lambda \in \Lambda$. Note that the symbol $\phi$ not followed by an argument always refers to the entire function, the set $\{(\lambda, \phi(\lambda)) : \lambda \in \Lambda\}$.

In the following we use two kinds of random variables: those indexed as $A^\lambda_\phi$ (each corresponding to a fixed index $\lambda \in \Lambda$ and a fixed function $\phi$) and those indexed as $H^\lambda_w$ (with $w \in W^\lambda$), corresponding to input points $(\lambda, w)$.

Let there be a collection of sets of random variables, referred to as (random) outputs,
\[ A_\phi = \{A^\lambda_\phi : \lambda \in \Lambda\}, \quad \phi \in \Phi, \]
such that the distribution of $A_\phi$ (i.e., the joint distribution of all $A^\lambda_\phi$ in $A_\phi$) is known for every treatment $\phi$. We define
\[ A^\lambda = \{A^\lambda_\phi : \phi \in \Phi\}, \quad \lambda \in \Lambda, \]
with the understanding that $A^\lambda$ is not a random variable (i.e., $A^\lambda_\phi$ for different $\phi$ are not jointly distributed).

To illustrate the notation, let $\Lambda = \{1, 2, \ldots\}$ and $W^\lambda$ be the set of reals for all $\lambda \in \Lambda$. A treatment $\phi$ then is a real-valued function (sequence)
\[ \{(1, \phi(1)), (2, \phi(2)), \ldots\} = (\phi(1), \phi(2), \ldots), \]
where $\phi(1) \in W^1$, $\phi(2) \in W^2$, etc. Let $\Phi$ be a nonempty set of such sequences. Fixing one of them, $\phi = (w_1, w_2, \ldots)$,
\[ A_\phi = A_{(w_1, w_2, \ldots)} = \{A^1_{(w_1, w_2, \ldots)}, A^2_{(w_1, w_2, \ldots)}, \ldots\}; \]
fixing, say, $\lambda = 2$ and allowing $(w_1, w_2, \ldots)$ to range over $\Phi$,
\[ A^\lambda = A^2 = \{A^2_{(w_1, w_2, \ldots)} : (w_1, w_2, \ldots) \in \Phi\}. \]

The following problem is encountered in a wide variety of contexts [6, 7, 15].
We say that the dependence of random outputs $A^\lambda_\phi$ on the deterministic inputs $W^\lambda$ is (canonically) selective if, for any distinct $\lambda, \lambda' \in \Lambda$ and any $\phi \in \Phi$, the output $A^\lambda_\phi$ is “not influenced” by $\phi(\lambda')$. The question is how one should define this selectivity of “influences” rigorously, and how one can determine whether this selectivity holds. This problem was introduced to behavioral sciences by Sternberg [18] and Townsend [22]. In quantum physics, using different terminology, it was introduced by Bell [3] and elaborated by Fine [10, 11]. The definition can be given in several equivalent forms, of which we present the one focal for the present context.

Definition 2.1. The dependence of outputs $\{A^\lambda : \lambda \in \Lambda\}$ on inputs $\{W^\lambda : \lambda \in \Lambda\}$ (or the “influence” of the latter on the former) is (canonically) selective if there is a set of jointly distributed random variables
\[ H = \{H^\lambda_w : w \in W^\lambda,\ \lambda \in \Lambda\} \]
(one random variable for every value of every input), such that, for any treatment $\phi \in \Phi$,
\[ \overline{H_\phi} = \overline{A_\phi} \]
(that is, $H_\phi$ and $A_\phi$ have the same distribution), where $H_\phi = \{H^\lambda_{\phi(\lambda)} : \lambda \in \Lambda\}$ and $A_\phi = \{A^\lambda_\phi : \lambda \in \Lambda\}$ (the corresponding elements of $H_\phi$ and $A_\phi$ being those sharing the same $\lambda$).

This definition is known as the Joint Distribution Criterion (JDC) for selectivity of influences, and the set $H$ satisfying this definition is referred to as a (hypothetical) JDC-set. Specialized forms of this criterion in quantum physics can be found in [19] and [10, 11]; in the behavioral context and in complete generality this criterion is given (derived from an equivalent definition) in [8].

Remark 2.2. The adjective “canonical” in the definition refers to the one-to-one correspondence between $W^\lambda$ and $A^\lambda$ sharing the same $\lambda$.
A seemingly more general scheme, in which different $A^\lambda$ are selectively influenced by different (possibly overlapping) subsets of $\{W^\lambda : \lambda \in \Lambda\}$, is always reducible to the canonical form by treating, for every $A^\lambda$, the Cartesian product of the inputs influencing it as a single input, and redefining correspondingly the sets of input points and the set of allowable treatments.

The simplest consequence of JDC is that the selectivity of influences implies marginal selectivity [6, 24], defined as follows. For any $\Lambda' \subset \Lambda$ we can uniquely present any $\phi \in \Phi$ as $\phi' \cup \overline{\phi'}$, where $\phi' \in \prod_{\lambda \in \Lambda'} W^\lambda$ and $\overline{\phi'} \in \prod_{\lambda \in \Lambda - \Lambda'} W^\lambda$. Then, if JDC is satisfied, the joint distribution of $\{A^\lambda_{\phi' \cup \overline{\phi'}} : \lambda \in \Lambda'\}$ does not depend on $\overline{\phi'}$.

Remark 2.3. In the following we always assume that marginal selectivity is satisfied.

The relevance of the order-distance and other p.q.-metrics on the sets of jointly distributed random variables to the problem of selectivity lies in the general test (necessary condition) for selectivity of influences, formulated after the following definition.

Definition 2.4. We call a sequence of input points
\[ x_1 = (\alpha_1, w_1), \ldots, x_l = (\alpha_l, w_l) \]
(where $w_i \in W^{\alpha_i}$ for $i = 1, \ldots, l \ge 3$) treatment-realizable if there are treatments $\phi_1, \ldots, \phi_l \in \Phi$ (not necessarily pairwise distinct), such that
\[ \{x_1, x_l\} \subset \phi_1 \quad \text{and} \quad \{x_{i-1}, x_i\} \subset \phi_i \ \text{for } i = 2, \ldots, l. \]

If a JDC-set $H$ exists, then for any p.q.-metric $d$ on $H$ we should have
\[ d(H^{\alpha_1}_{w_1}, H^{\alpha_l}_{w_l}) = d(A^{\alpha_1}_{\phi_1}, A^{\alpha_l}_{\phi_1}) \]
and
\[ d(H^{\alpha_{i-1}}_{w_{i-1}}, H^{\alpha_i}_{w_i}) = d(A^{\alpha_{i-1}}_{\phi_i}, A^{\alpha_i}_{\phi_i}) \]
for $i = 2, \ldots, l$, whence
\[ d(A^{\alpha_1}_{\phi_1}, A^{\alpha_l}_{\phi_1}) \le \sum_{i=2}^{l} d(A^{\alpha_{i-1}}_{\phi_i}, A^{\alpha_i}_{\phi_i}). \tag{2.1} \]
This chain inequality, written entirely in terms of observable probabilities, is referred to as a p.q.-metric test for selectivity of influences.
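For finite-valued outputs, the test (2.1) can be sketched in a few lines of Python for the order-distance $D$ of Definition 1.2. This is a minimal illustration, not the authors' implementation: the function names `order_distance`, `swap`, and `chain_slack`, the dict representation of a joint pmf, and the numerical values (chosen so that marginal selectivity holds) are all assumptions made for the example.

```python
from typing import Dict, Sequence, Tuple

# A joint pmf of an observed pair of outputs: {(a, b): probability}.
Pmf = Dict[Tuple[int, int], float]


def order_distance(joint: Pmf) -> float:
    """Order-distance D(A, B) = Pr[A < B], with the order taken to be <= on numbers."""
    return sum(p for (a, b), p in joint.items() if a < b)


def swap(joint: Pmf) -> Pmf:
    """Joint pmf of (B, A) obtained from the joint pmf of (A, B)."""
    return {(b, a): p for (a, b), p in joint.items()}


def chain_slack(d_ends: float, d_links: Sequence[float]) -> float:
    """Right-hand side of (2.1) minus its left-hand side; negative means (2.1) is violated."""
    return sum(d_links) - d_ends


# Hypothetical data: two inputs with values x, x' and y, y'; outputs coded 1 or 2.
same = {(1, 1): 0.5, (2, 2): 0.5}      # the two outputs always coincide
opposite = {(1, 2): 0.5, (2, 1): 0.5}  # the two outputs always differ
observed = {("x", "y"): same, ("x'", "y"): same,
            ("x'", "y'"): same, ("x", "y'"): opposite}

# Sequence of input points x, y, x', y': the end points (x, y') give the left-hand side.
d_ends = order_distance(observed[("x", "y'")])
d_links = [
    order_distance(observed[("x", "y")]),         # D(H_x, H_y)
    order_distance(swap(observed[("x'", "y")])),  # D(H_y, H_x')
    order_distance(observed[("x'", "y'")]),       # D(H_x', H_y')
]
slack = chain_slack(d_ends, d_links)
print("slack:", slack, "-> selectivity ruled out" if slack < 0 else "-> test passed")
```

With these made-up tables the slack is negative, so the chain inequality fails for this sequence of input points.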
If this inequality is violated for at least one treatment-realizable sequence of input points, no JDC-set $H$ exists, and the selectivity is ruled out. Note: if the sequence $\phi_1, \ldots, \phi_l \in \Phi$ for a given $x_1, \ldots, x_l$ can be chosen in more than one way, the observable quantities $d(A^{\alpha_1}_{\phi_1}, A^{\alpha_l}_{\phi_1})$ and $d(A^{\alpha_{i-1}}_{\phi_i}, A^{\alpha_i}_{\phi_i})$ remain invariant due to the (tacitly assumed) marginal selectivity.

As an example, let $\Lambda = \{1, 2\}$, $W^1 = [0, 1]$, $W^2 = [0, 1]$, $\Phi = W^1 \times W^2$. Let $\{A^1_\phi, A^2_\phi\}$ for any treatment $\phi$ have a bivariate normal distribution with zero means, unit variances, and correlation $\rho = \min(1, w_1 + w_2)$, where $w_1 = \phi(1)$, $w_2 = \phi(2)$. Marginal selectivity is trivially satisfied. Do $\{W^1, W^2\}$ influence $\{A^1, A^2\}$ selectively? For any bivariate normally distributed $(A, B)$, let us define $A \prec B$ iff $A < 0$, $B \ge 0$. Then the corresponding order-distance on the hypothetical JDC-set $H$ is
\[ D(H^1_{w_1}, H^2_{w_2}) = \frac{\arccos(\min(1, w_1 + w_2))}{2\pi}. \]
The sequence of input points $(1, 0), (2, 1), (1, 1), (2, 0)$ is treatment-realizable, so if $H$ exists, we should have
\[ D(H^1_0, H^2_0) \le D(H^1_0, H^2_1) + D(H^2_1, H^1_1) + D(H^1_1, H^2_0). \]
The numerical substitutions yield, however,
\[ \tfrac{1}{4} \le 0 + 0 + 0, \]
and as this is false, the hypothesis that $\{W^1, W^2\}$ influence $\{A^1, A^2\}$ selectively is rejected.

The theorem below and its corollary show that one only needs to check the chain inequality for a special subset of all possible treatment-realizable sequences $x_1, \ldots, x_l$.

Definition 2.5. A treatment-realizable sequence $x_1, \ldots, x_l$ is called irreducible if $x_1 \ne x_l$ and the only subsequences $\{x_{i_1}, \ldots, x_{i_k}\}$ with $k > 1$ that are subsets of treatments are pairs $\{x_1, x_l\}$ and $\{x_{i-1}, x_i\}$, for $i = 2, \ldots, l$.
Otherwise the sequence is reducible.

Theorem 2.6. Given a p.q.-metric $d$ on the hypothetical JDC-set $H$, inequality (2.1) is satisfied for all treatment-realizable sequences if and only if this inequality holds for all irreducible sequences.

Proof. We prove this theorem by showing that if (2.1) is violated for some reducible sequence $x_1, \ldots, x_l$, then it is violated for some proper subsequence thereof. Clearly, $x_1 \ne x_l$ because otherwise (2.1) is not violated. For $l = 3$, $x_1, x_2, x_3$ is reducible only if it is contained in a treatment: but then (2.1) would be satisfied. So $l > 3$, and the reducibility of $x_1, \ldots, x_l$ means that there is a pair $\{x_p, x_q\}$ belonging to a treatment, with $(p, q) \ne (1, l)$ and $q > p + 1$. But then (2.1) must be violated for either $x_p, \ldots, x_q$ or $x_1, \ldots, x_p, x_q, \ldots, x_l$ (allowing for $p = 1$ or $q = l$ but not both). □

If $\Phi = \prod_{\lambda \in \Lambda} W^\lambda$ (all logically possible treatments are allowable), then any subsequence $x_{i_1}, \ldots, x_{i_k}$ of input points with pairwise distinct $\alpha_{i_1}, \ldots, \alpha_{i_k}$ belongs to some treatment. Therefore an irreducible sequence cannot contain points of more than two inputs, and it is easy to see that then it must be a sequence of pairwise distinct $x_1 \in \{\alpha\} \times W^\alpha$, $x_2 \in \{\beta\} \times W^\beta$, ..., $x_{2m-1} \in \{\alpha\} \times W^\alpha$, $x_{2m} \in \{\beta\} \times W^\beta$ ($\alpha \ne \beta$). It is also easy to see that if $m > 2$, each of the subsets $\{x_1, x_4\}$ and $\{x_2, x_5\}$ will belong to a treatment. Hence $m = 2$ is the only possibility for an irreducible sequence.

Corollary 2.7. If $\Phi = \prod_{\lambda \in \Lambda} W^\lambda$, then inequality (2.1) is satisfied for all treatment-realizable sequences if and only if this inequality holds for all tetradic sequences of the form $x, y, s, t$, with $x, s \in \{\alpha\} \times W^\alpha$, $y, t \in \{\beta\} \times W^\beta$, $x \ne s$, $y \ne t$, $\alpha \ne \beta$.

Remark 2.8.
This formulation is given in [8], although there it is unnecessarily confined to metrics of a special kind.

3. An application

The four tables below represent results of an experiment with a $2 \times 2$ factorial design, $\{x, x'\} \times \{y, y'\}$, and two binary responses, $A$ and $B$. In relation to our general notation, we have here $\Lambda = \{1, 2\}$, $W^1 = \{x, x'\}$, $W^2 = \{y, y'\}$, and four treatments $(x, y), \ldots, (x', y')$; for every treatment $\phi$, the random outputs $A^1_\phi$ and $A^2_\phi$ are represented by, respectively, $A_\phi$ and $B_\phi$, each having two possible values, arbitrarily labeled.

This design is arguably the simplest possible, and it is ubiquitous in science. In a psychological double-detection experiment (see, e.g., [23]), the input values may represent presence ($x$ and $y$) or absence ($x'$ and $y'$) of a designated signal in two stimuli labeled 1 and 2, presented side-by-side. The participant in such an experiment is asked to indicate whether the signal was present or absent in stimulus 1 and in stimulus 2. The output values $A = \circ$ and $B = \sqcap$ may indicate either that the response was “signal present” or that the response was correct; and analogously for $A = \bullet$ and $B = \sqcup$ (either “signal absent” or an incorrect response). The entries $p_{ij}$, $q_{ij}$, etc. represent joint probabilities of the corresponding outcomes; $a_{i\cdot}$, $a'_{i\cdot}$, etc. represent marginal probabilities. The question to be answered is: does the response to a given stimulus ($A$ to 1 and $B$ to 2) selectively depend on that stimulus alone (despite $A$ and $B$ being stochastically dependent for every treatment), or is $A$ or $B$ influenced by both 1 and 2?
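Before any test of selectivity is applied to such a design, the marginal selectivity assumed in Remark 2.3 can be checked directly from the four tables of joint probabilities. The following Python sketch is only an illustration under assumed conventions: each table is a 2×2 nested list with rows indexed by the values of $A$ and columns by the values of $B$, the names `marginals` and `marginally_selective` are hypothetical, and the numbers are made up.

```python
import math
from typing import Dict, List, Tuple

Table = List[List[float]]  # Table[i][j] = Pr[A = i-th value, B = j-th value]


def marginals(t: Table) -> Tuple[List[float], List[float]]:
    """Row sums (distribution of A) and column sums (distribution of B)."""
    rows = [sum(r) for r in t]
    cols = [sum(c) for c in zip(*t)]
    return rows, cols


def marginally_selective(tables: Dict[Tuple[str, str], Table], tol: float = 1e-9) -> bool:
    """True if A's marginal depends only on the first input and B's only on the second."""
    a_by_x: Dict[str, List[float]] = {}
    b_by_y: Dict[str, List[float]] = {}
    for (x, y), t in tables.items():
        a, b = marginals(t)
        for seen, key, m in ((a_by_x, x, a), (b_by_y, y, b)):
            ref = seen.setdefault(key, m)  # first marginal seen for this input value
            if any(not math.isclose(u, v, abs_tol=tol) for u, v in zip(ref, m)):
                return False
    return True


# Hypothetical joint probabilities for the treatments (x, y), (x, y'), (x', y), (x', y').
tables = {
    ("x", "y"): [[0.3, 0.2], [0.1, 0.4]],
    ("x", "y'"): [[0.4, 0.1], [0.2, 0.3]],
    ("x'", "y"): [[0.2, 0.1], [0.2, 0.5]],
    ("x'", "y'"): [[0.25, 0.05], [0.35, 0.35]],
}
print(marginally_selective(tables))
```

In this made-up example the marginals of $A$ agree across $y$ and $y'$ for each of $x$, $x'$, and the marginals of $B$ agree across $x$ and $x'$ for each of $y$, $y'$, so the check passes; it says nothing yet about selectivity itself.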
\[
\begin{array}{c|cc|c}
\phi = (x, y) & B_{xy} = \sqcup & B_{xy} = \sqcap & \\ \hline
A_{xy} = \bullet & p_{11} & p_{12} & a_{1\cdot} \\
A_{xy} = \circ & p_{21} & p_{22} & a_{2\cdot} \\ \hline
 & b_{\cdot 1} & b_{\cdot 2} &
\end{array}
\qquad
\begin{array}{c|cc|c}
\phi = (x', y) & B_{x'y} = \sqcup & B_{x'y} = \sqcap & \\ \hline
A_{x'y} = \bullet & r_{11} & r_{12} & a'_{1\cdot} \\
A_{x'y} = \circ & r_{21} & r_{22} & a'_{2\cdot} \\ \hline
 & b_{\cdot 1} & b_{\cdot 2} &
\end{array}
\]
\[
\begin{array}{c|cc|c}
\phi = (x, y') & B_{xy'} = \sqcup & B_{xy'} = \sqcap & \\ \hline
A_{xy'} = \bullet & q_{11} & q_{12} & a_{1\cdot} \\
A_{xy'} = \circ & q_{21} & q_{22} & a_{2\cdot} \\ \hline
 & b'_{\cdot 1} & b'_{\cdot 2} &
\end{array}
\qquad
\begin{array}{c|cc|c}
\phi = (x', y') & B_{x'y'} = \sqcup & B_{x'y'} = \sqcap & \\ \hline
A_{x'y'} = \bullet & s_{11} & s_{12} & a'_{1\cdot} \\
A_{x'y'} = \circ & s_{21} & s_{22} & a'_{2\cdot} \\ \hline
 & b'_{\cdot 1} & b'_{\cdot 2} &
\end{array}
\]

Another important situation in which we encounter formally the same problem is the Einstein-Podolsky-Rosen (EPR) paradigm. Two particles are emitted from a common source in such a way that they remain entangled (have highly correlated properties, such as momenta or spins) as they run away from each other [1, 16]. An experiment may consist, e.g., in measuring the spin of electron 1 along one of two axes, $x$ or $x'$, and (in another location but simultaneously in some inertial frame of reference) measuring the spin of electron 2 along one of two axes, $y$ or $y'$. The outcome $A$ of a measurement on electron 1 is a random variable with two possible values, “up” or “down,” and the same holds for $B$, the outcome of a measurement on electron 2. The question here is: do the measurements on electrons 1 and 2 selectively affect, respectively, $A$ and $B$ (even though generally $A$ and $B$ are not independent at any of the four combinations of spin axes)? If the answer is negative, then the measurement of one electron affects the outcome of the measurement of another electron even though no signal can be exchanged between two distant events that are simultaneous in some frame of reference. What makes this situation formally identical to the double-detection example described above is that the measurements performed along different axes on the same particle, $x$ and $x'$ or $y$ and $y'$, are non-commuting, i.e., they cannot be performed simultaneously.
This makes it possible to consider such measurements as mutually exclusive values of an input.

Theorem 3.1 (Fine [10, 11]). A JDC-set $H = \{H^1_x, H^1_{x'}, H^2_y, H^2_{y'}\}$ satisfying
\[
\begin{aligned}
\overline{\{H^1_x, H^2_y\}} &= \overline{\{A_{xy}, B_{xy}\}}, & \overline{\{H^1_x, H^2_{y'}\}} &= \overline{\{A_{xy'}, B_{xy'}\}}, \\
\overline{\{H^1_{x'}, H^2_y\}} &= \overline{\{A_{x'y}, B_{x'y}\}}, & \overline{\{H^1_{x'}, H^2_{y'}\}} &= \overline{\{A_{x'y'}, B_{x'y'}\}}
\end{aligned}
\]
exists if and only if the following eight inequalities are satisfied:
\[
\begin{aligned}
-1 &\le p_{11} + r_{11} + s_{11} - q_{11} - a'_{1\cdot} - b_{\cdot 1} \le 0, \\
-1 &\le q_{11} + s_{11} + r_{11} - p_{11} - a'_{1\cdot} - b'_{\cdot 1} \le 0, \\
-1 &\le r_{11} + p_{11} + q_{11} - s_{11} - a_{1\cdot} - b_{\cdot 1} \le 0, \\
-1 &\le s_{11} + q_{11} + p_{11} - r_{11} - a_{1\cdot} - b'_{\cdot 1} \le 0.
\end{aligned}
\tag{3.1}
\]

We refer to (3.1) as Bell-CHSH-Fine inequalities, where CHSH abbreviates Clauser, Horne, Shimony, & Holt [4]: in this work Bell's [3] approach was developed into a special version of (3.1).

Remark 3.2. The proof given in [10, 11] that (3.1) is both necessary and sufficient (under marginal selectivity) for the existence of a JDC-set can be conceptually simplified: the Bell-CHSH-Fine inequalities can be algebraically shown to be the criterion for the existence of a vector $Q$ with 16 probabilities
\[ \Pr[H^1_x = \bullet, H^1_{x'} = \bullet, H^2_y = \sqcup, H^2_{y'} = \sqcup],\ \ldots,\ \Pr[H^1_x = \circ, H^1_{x'} = \circ, H^2_y = \sqcap, H^2_{y'} = \sqcap] \]
that sum to one and whose appropriately chosen partial sums yield the 8 observable probabilities
\[ p_{11},\ q_{11},\ r_{11},\ s_{11},\ a_{1\cdot},\ b_{\cdot 1},\ a'_{1\cdot},\ b'_{\cdot 1} \]
(other probabilities being determined due to marginal selectivity). This is a simple linear programming task, and the Bell-CHSH-Fine inequalities can be derived “mechanically” by a facet enumeration algorithm (see [25, 26] and [2]). For extensions of the Bell-CHSH-Fine inequalities to multiple particles, multiple spin axes, and multiple random outputs, see [9] and [17].
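Inequalities (3.1) are easy to evaluate once the eight observable probabilities are known. The Python sketch below is an illustration only; the function names `bell_chsh_fine` and `satisfies_31`, and the argument names (with the suffix `_p` standing for a prime), are assumptions, and the numbers are made up.

```python
from typing import List


def bell_chsh_fine(p11: float, q11: float, r11: float, s11: float,
                   a1: float, a1_p: float, b1: float, b1_p: float) -> List[float]:
    """Middle expressions of the four double inequalities (3.1).

    p, q, r, s refer to treatments (x, y), (x, y'), (x', y), (x', y');
    a1, a1_p are Pr[A = first value] under x, x';
    b1, b1_p are Pr[B = first value] under y, y'.
    """
    return [
        p11 + r11 + s11 - q11 - a1_p - b1,
        q11 + s11 + r11 - p11 - a1_p - b1_p,
        r11 + p11 + q11 - s11 - a1 - b1,
        s11 + q11 + p11 - r11 - a1 - b1_p,
    ]


def satisfies_31(values: List[float], tol: float = 1e-12) -> bool:
    """True if every expression lies in [-1, 0], i.e. all eight inequalities hold."""
    return all(-1.0 - tol <= v <= tol for v in values)


# Independent fair outputs in every treatment: all inequalities hold.
fair = bell_chsh_fine(0.25, 0.25, 0.25, 0.25, 0.5, 0.5, 0.5, 0.5)
# Outputs always matched except under (x, y'), where they always differ: (3.1) fails.
extreme = bell_chsh_fine(0.5, 0.0, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5)
print(satisfies_31(fair), satisfies_31(extreme))
```

The second data set respects marginal selectivity (all marginals equal 1/2) yet violates the first right-hand inequality, so by Theorem 3.1 no JDC-set can exist for it.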
For modern accounts of mathematical and interpretational aspects of the entanglement problem in quantum physics, see [12, 13, 14].

The point of interest in the present context is that the Bell-CHSH-Fine inequalities, whose rather obscure structure does not seem to fit their fundamental importance, turn out to be interpretable as the triangle inequalities for appropriately chosen order-distances. Consider the chain inequalities for the order-distance $D_1$ obtained by putting $\bullet = \sqcup = 1$, $\circ = \sqcap = 2$, and identifying $\preceq$ with $\le$:
\[
\begin{aligned}
q_{12} = D_1(H^1_x, H^2_{y'}) &\le D_1(H^1_x, H^2_y) + D_1(H^2_y, H^1_{x'}) + D_1(H^1_{x'}, H^2_{y'}) = p_{12} + r_{21} + s_{12}, \\
p_{12} = D_1(H^1_x, H^2_y) &\le D_1(H^1_x, H^2_{y'}) + D_1(H^2_{y'}, H^1_{x'}) + D_1(H^1_{x'}, H^2_y) = q_{12} + s_{21} + r_{12}, \\
s_{12} = D_1(H^1_{x'}, H^2_{y'}) &\le D_1(H^1_{x'}, H^2_y) + D_1(H^2_y, H^1_x) + D_1(H^1_x, H^2_{y'}) = r_{12} + p_{21} + q_{12}, \\
r_{12} = D_1(H^1_{x'}, H^2_y) &\le D_1(H^1_{x'}, H^2_{y'}) + D_1(H^2_{y'}, H^1_x) + D_1(H^1_x, H^2_y) = s_{12} + q_{21} + p_{12}.
\end{aligned}
\tag{3.2}
\]
Consider also the inequalities for the order-distance $D_2$ obtained by putting $\bullet = \sqcap = 1$, $\circ = \sqcup = 2$, and identifying $\preceq$ with $\le$:
\[
\begin{aligned}
q_{11} = D_2(H^1_x, H^2_{y'}) &\le D_2(H^1_x, H^2_y) + D_2(H^2_y, H^1_{x'}) + D_2(H^1_{x'}, H^2_{y'}) = p_{11} + r_{22} + s_{11}, \\
p_{11} = D_2(H^1_x, H^2_y) &\le D_2(H^1_x, H^2_{y'}) + D_2(H^2_{y'}, H^1_{x'}) + D_2(H^1_{x'}, H^2_y) = q_{11} + s_{22} + r_{11}, \\
s_{11} = D_2(H^1_{x'}, H^2_{y'}) &\le D_2(H^1_{x'}, H^2_y) + D_2(H^2_y, H^1_x) + D_2(H^1_x, H^2_{y'}) = r_{11} + p_{22} + q_{11}, \\
r_{11} = D_2(H^1_{x'}, H^2_y) &\le D_2(H^1_{x'}, H^2_{y'}) + D_2(H^2_{y'}, H^1_x) + D_2(H^1_x, H^2_y) = s_{11} + q_{22} + p_{11}.
\end{aligned}
\tag{3.3}
\]

Theorem 3.3. Each right-hand Bell-CHSH-Fine inequality is equivalent to the corresponding chain inequality in (3.2) for the order-distance $D_1$.
Each left-hand Bell-CHSH-Fine inequality is equivalent to the corresponding chain inequality in (3.3) for the order-distance $D_2$.

Proof. We show the proof for the first of the Bell-CHSH-Fine double-inequalities. The equivalence of
\[ p_{11} + r_{11} + s_{11} - q_{11} - a'_{1\cdot} - b_{\cdot 1} \le 0 \]
to
\[ q_{12} \le p_{12} + r_{21} + s_{12} \]
obtains by using the identities
\[ q_{12} = a_{1\cdot} - q_{11}, \quad p_{12} = a_{1\cdot} - p_{11}, \quad r_{21} = b_{\cdot 1} - r_{11}, \quad s_{12} = a'_{1\cdot} - s_{11}. \]
The equivalence of
\[ p_{11} + r_{11} + s_{11} - q_{11} - a'_{1\cdot} - b_{\cdot 1} \ge -1 \]
to
\[ q_{11} \le p_{11} + r_{22} + s_{11} \]
follows from the identity
\[ r_{22} = 1 + r_{11} - a'_{1\cdot} - b_{\cdot 1}. \qquad \square \]

4. Concluding remarks

The order-distances are versatile and have a broad sphere of applicability because order relations on the domains of any given set of random variables can always be defined in many different ways. If no other structure is available, this can always be done by the partitioning of the domains mentioned in Section 1 and used in the example with bivariate normal distributions in Section 2 as well as for the binary variables of the previous section:
\[ V_\omega = \bigcup_{k=1}^{n} V_\omega^{(k)}, \quad V_\omega^{(k)} \in \Sigma_\omega, \quad \omega \in \Omega, \]
putting $a \preceq b$ if and only if $a \in V_\alpha^{(k)}$, $b \in V_\beta^{(l)}$ and $k \le l$. Due to its universality and convenience of use, the resulting order-distance deserves a special name, classification distance.

There are numerous ways of creating new p.q.-metrics from the ones already constructed, including those taken from outside the probabilistic context. Thus, if $d$ is a p.q.-metric on a set $S$, then, for any set $H$ of jointly distributed random variables taking their values in $S$,
\[ D(A, B) = E[d(A, B)], \quad A, B \in H, \]
is a p.q.-metric on $H$. This follows from the fact that expectation $E$ preserves inequalities and equalities identically satisfied for all possible realizations of the arguments.
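The construction $D(A, B) = E[d(A, B)]$ is easy to try out on a finite example. The sketch below assumes three jointly distributed real-valued variables with a made-up joint pmf, and uses two choices of $d$ on the reals that satisfy properties (i)-(iii): the positive part of $b - a$, and the indicator of $a < b$, for which the expectation is the order-distance $\Pr[A < B]$. The names `expected_d`, `positive_part`, and `strictly_less` are illustrative, not taken from the paper.

```python
from itertools import permutations
from typing import Callable, Dict, Tuple

Joint = Dict[Tuple[float, ...], float]  # {(value of H_0, H_1, H_2): probability}


def positive_part(a: float, b: float) -> float:
    """d(a, b) = max(b - a, 0): nonnegative, zero at a = b, obeys the triangle inequality."""
    return max(b - a, 0.0)


def strictly_less(a: float, b: float) -> float:
    """d(a, b) = 1 if a < b else 0; its expectation is the order-distance Pr[A < B]."""
    return 1.0 if a < b else 0.0


def expected_d(joint: Joint, i: int, j: int, d: Callable[[float, float], float]) -> float:
    """D(H_i, H_j) = E[d(H_i, H_j)] under the joint pmf."""
    return sum(p * d(v[i], v[j]) for v, p in joint.items())


# Hypothetical joint distribution of (H_0, H_1, H_2).
joint = {(0, 1, 2): 0.2, (2, 0, 1): 0.3, (1, 1, 0): 0.5}

# Check the triangle inequality D(H_i, H_k) <= D(H_i, H_j) + D(H_j, H_k) for all ordered triples.
for d in (positive_part, strictly_less):
    for i, j, k in permutations(range(3)):
        lhs = expected_d(joint, i, k, d)
        rhs = expected_d(joint, i, j, d) + expected_d(joint, j, k, d)
        assert lhs <= rhs + 1e-12, (d.__name__, i, j, k)
print("triangle inequality holds for all ordered triples")
```

The small tolerance in the assertion only guards against floating-point rounding; two of the triples attain equality for the indicator choice of $d$.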
Another example: given any family of p.q.-metrics $\{d_\upsilon : \upsilon \in \Upsilon\}$, their average with respect to a random variable $U$ with a probability measure $m$,
\[ d(A, B) = \int_{\upsilon \in \Upsilon} d_\upsilon(A, B)\, dm(\upsilon), \]
is a p.q.-metric. As a special case, consider a classification distance with binary partitions: the domain $V_\omega$ of every $H_\omega$ in $H$ is partitioned into two (measurable) subsets, $W^{(1)}_{\omega,\upsilon}$ and $W^{(2)}_{\omega,\upsilon}$. Making these partitions random, i.e., allowing the index $\upsilon$ to randomly vary in any way whatever, we get a new p.q.-metric. In the special case when all random variables in $H$ take their values in the set of real numbers, and $W^{(1)}_{\omega,\upsilon}$ is defined by $z \le \upsilon$ ($z \in V_\omega \subset \mathbb{R}$, $\upsilon \in \mathbb{R}$), the randomization of the partitions reduces to that of the separation point $\upsilon$. The p.q.-metric then becomes
\[ d_S(A, B) = \Pr[A \le U < B], \]
where $U$ is some random variable. An additively symmetrized (i.e., pseudometric) version of this p.q.-metric, $d_S(A, B) + d_S(B, A)$, was introduced in [20, 21] under the name “separation (pseudo)metric,” and shown to be a conventional metric if $U$ is chosen stochastically independent of all random variables in $H$.

References

[1] Aspect, A. (1999). Bell's inequality tests: More ideal than ever. Nature 398 189–190.
[2] Basoalto, R.M., & Percival, I.C. (2003). BellTest and CHSH experiments with more than two settings. J. Phys. A 36 7411–7423.
[3] Bell, J. (1964). On the Einstein-Podolsky-Rosen paradox. Physics 1 195–200.
[4] Clauser, J.F., Horne, M.A., Shimony, A., & Holt, R.A. (1969). Proposed experiment to test local hidden-variable theories. Phys. Rev. Lett. 23 880–884.
[5] Cover, T.M., & Thomas, J.A. (1990). Elements of Information Theory. New York: Wiley.
[6] Dzhafarov, E.N. (2003). Selective influence through conditional independence. Psychometrika 68 7–26.
[7] Dzhafarov, E.N., & Gluhovsky, I. (2006). Notes on selective influence, probabilistic causality, and probabilistic dimensionality. J. Math. Psych. 50 390–401.
[8] Dzhafarov, E.N., & Kujala, J.V. (2010). The Joint Distribution Criterion and the Distance Tests for selective probabilistic causality. Frontiers in Quantitative Psychology and Measurement 1:151 doi: 10.3389/fpsyg.2010.00151.
[9] Dzhafarov, E.N., & Kujala, J.V. (2010). Selectivity in probabilistic causality: Where psychology runs into quantum physics.
[10] Fine, A. (1982). Joint distributions, quantum correlations, and commuting observables. J. Math. Phys. 23 1306–1310.
[11] Fine, A. (1982). Hidden variables, joint probability, and the Bell inequalities. Phys. Rev. Lett. 48 291–295.
[12] Gudder, S. (2010). Finite quantum measure spaces. Amer. Math. Mon. 117 512–527.
[13] Khrennikov, A. (2008). EPR–Bohm experiment and Bell's inequality: Quantum physics meets probability theory. Theor. Math. Phys. 157 1448–1460.
[14] Khrennikov, A. (2009). Contextual Approach to Quantum Formalism. Berlin: Springer.
[15] Kujala, J.V., & Dzhafarov, E.N. (2008). Testing for selectivity in the dependence of random variables on external factors. J. Math. Psych. 52 128–144.
[16] Mermin, N.D. (1985). Is the moon there when nobody looks? Reality and the quantum theory. Physics Today 38 38–47.
[17] Peres, A. (1999). All the Bell inequalities. Found. Phys. 29 589–614.
[18] Sternberg, S. (1969). The discovery of processing stages: Extensions of Donders' method. In W.G. Koster (Ed.), Attention and Performance II. Acta Psychologica 30 276–315.
[19] Suppes, P., & Zanotti, M. (1981). When are probabilistic explanations possible? Synthese 48 191–199.
[20] Taylor, M.D. (1984). Separation metrics for real-valued random variables. Int. J. Math. Math. Sci. 7 407–408.
[21] Taylor, M.D. (1985). New metrics for weak convergence of distribution functions. Stochastica 9 5–17.
[22] Townsend, J.T. (1984). Uncovering mental processes with factorial experiments. J. Math. Psych. 28 363–400.
[23] Townsend, J.T., & Nozawa, G. (1995). Spatio-temporal properties of elementary perception: An investigation of parallel, serial, and coactive theories. J. Math. Psych. 39 321–359.
[24] Townsend, J.T., & Schweickert, R. (1989). Toward the trichotomy method of reaction times: Laying the foundation of stochastic mental networks. J. Math. Psych. 33 309–327.
[25] Werner, R.F., & Wolf, M.M. (2001). All multipartite Bell correlation inequalities for two dichotomic observables per site. arXiv:quant-ph/0102024v1.
[26] Werner, R.F., & Wolf, M.M. (2001). Bell inequalities and entanglement. arXiv:quant-ph/0107093v2.
[27] Zolotarev, V.M. (1976). Metric distances in spaces of random variables and their distributions. Mathematics of the USSR-Sbornik 30 373–401.

Purdue University, USA
E-mail address: ehtibar@purdue.edu
URL: http://www2.psych.purdue.edu/~ehtibar

University of Jyväskylä, Finland
E-mail address: janne.v.kujala@jyu.fi
URL: http://users.jyu.fi/~jvkujala
