Binary Expansion Group Intersection Network

Sicheng Zhou∗ and Kai Zhang†

March 27, 2026

Abstract

Conditional independence is central to modern statistics, but beyond special parametric families it rarely admits an exact covariance characterization. We introduce the binary expansion group intersection network (BEGIN), a distribution-free graphical representation for multivariate binary data and bit-encoded multinomial variables. For arbitrary binary random vectors and bit representations of multinomial variables, we prove that conditional independence is equivalent to a sparse linear representation of conditional expectations, to a block factorization of the corresponding interaction covariance matrix, and to block diagonality of an associated generalized Schur complement. The resulting graph is indexed by the intersection of multiplicative groups of binary interactions, yielding an analogue of Gaussian graphical modeling beyond the Gaussian setting. This viewpoint treats data bits as atoms and local BEGIN molecules as building blocks for large Markov random fields. We also show how dyadic bit representations allow BEGIN to approximate conditional independence for general random vectors under mild regularity conditions. A key technical device is the Hadamard prism, a linear map that links interaction covariances to group structure.

1 Introduction

Conditional independence is a cornerstone of statistical reasoning. It underlies the interpretation of multivariate associations, the construction of graphical models, and many procedures in causal inference, variable selection, and structure learning. In classical settings, conditional independence is often studied through parametric models whose covariance structure has a direct probabilistic interpretation.
In many modern applications, however, the available data are heterogeneous, high dimensional, or only weakly modeled, so exact parametric assumptions are difficult to justify.

This tension is especially acute in distribution-free inference. On the one hand, conditional independence is one of the most natural ways to formalize "no direct association after adjustment." On the other hand, fully distribution-free conditional-independence testing is fundamentally impossible without additional structure; see, for example, Shah and Peters (2020). The challenge, then, is to identify structure that is both mathematically exact and as assumption-lean as possible.

This paper builds on the multiresolution viewpoint of binary expansion statistics (Zhang, 2019; Zhang et al., 2021; Brown et al., 2025), which treats data bits as atomic units of information. At the bit level, binary variables exhibit an exact linearity that does not persist for general variables. That observation suggests asking whether conditional independence can be characterized exactly through covariances once one works with suitable binary interaction features beyond the original variables themselves.

∗ Sicheng Zhou is an undergraduate student (E-mail: sichengz@mit.edu), Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139.
† Kai Zhang is a Professor (Corresponding author. E-mail: zhangk@email.unc.edu), Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599.

Graphical models provide the natural language for this question (Lauritzen, 1996; Wainwright and Jordan, 2008; Koller and Friedman, 2009; Drton and Maathuis, 2017). In Gaussian graphical models, conditional independence is equivalent to sparsity of the precision matrix.
That equivalence drives both the interpretation of the graph and the design of scalable estimation procedures. Outside the Gaussian family, however, zeros in the inverse covariance matrix generally do not encode conditional independence.

For discrete data, several important alternatives are available. Classical log-linear models already provide exact factorization-based characterizations of binary conditional independence, whereas Ising models and more general Markov random fields impose explicit factorization assumptions. Loh and Wainwright (2012) study generalized covariance matrices built from sufficient statistics of multinomial exponential-family models and show that their inverses can reflect graph structure under those modeling assumptions. Lauritzen et al. (2021) derive strong implications for binary distributions under total positivity constraints. Our goal here is different: an exact covariance-based characterization in an interaction basis for arbitrary multivariate binary distributions.

With the central result Theorem 2.3, this paper makes the following four main contributions:

(i) We prove that for arbitrary binary random vectors $(\mathbf{A}, \mathbf{B}, \mathbf{C})$, conditional independence $\mathbf{A} \perp\!\!\!\perp \mathbf{C} \mid \mathbf{B}$ is equivalent to a covariance structure indexed by the intersection of multiplicative groups of binary interactions. The key object is not the inverse of the full covariance matrix, but a generalized Schur complement associated with the interaction block generated by $\mathbf{B}$.

(ii) We show that the same framework remains valid for bit-encoded multinomial variables, including rank-deficient cases created by deterministic constraints or structural zeros.

(iii) We introduce the Hadamard prism, a convenient linear map for the covariance algebra of binary interactions that clarifies the link between covariance of binary interactions, Walsh–Hadamard transforms, and Boolean Fourier analysis.
(iv) We extend the framework beyond discrete data by showing that dyadic quantizations preserve conditional independence asymptotically and yield explicit approximation bounds under Hölder-type continuity of the relevant conditional laws.

Taken together, these results yield the graph interpretation in Corollary 2.5, which continues to hold in singular multinomial encodings. To our knowledge, this corollary provides the first exact distribution-free covariance-based graphical characterization of conditional independence for arbitrary multivariate binary data. The graph that emerges is indexed not merely by the original variables but by intersections of multiplicative groups generated by their interactions. For that reason, we call the resulting representation the binary expansion group intersection network (BEGIN).

A useful way to position BEGIN is relative to Gaussian graphical models and generalized covariance constructions. BEGIN is Gaussian-like in spirit because conditional independence is read off from a sparse matrix object and the resulting structure suggests nodewise projection viewpoints. It is fundamentally non-Gaussian, however, because the relevant nodes are interaction features and the correct matrix object is a generalized Schur complement rather than an ordinary precision matrix. Compared with exponential-family generalized covariance methods, BEGIN requires weaker modeling assumptions: it does not rely on strict positivity, a prescribed clique factorization, or a particular parametric family. By viewing data bits as atoms and BEGIN structures as molecules, we provide examples showing how local BEGIN structures can serve as building blocks for larger Markov random fields.

Section 2 develops the BEGIN characterization for binary variables and bit-represented multinomial variables. Section 3 studies dyadic approximations for general random vectors.
Section 4 closes with brief remarks on implications and future work. Proofs are provided in the supplementary material.

Notation

We use the following notation throughout. For $\mathbf{y} \in \mathbb{R}^p$, $\mathrm{diag}(\mathbf{y})$ denotes the diagonal matrix with diagonal entries $\mathbf{y}$. For a matrix $M$, $M^{+}$ denotes the Moore–Penrose inverse. The symbol $H_p$ denotes the $2^p \times 2^p$ Hadamard matrix obtained by Sylvester's construction,

\[
H_p = \underbrace{\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \otimes \cdots \otimes \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}}_{p \text{ factors}}. \tag{1.1}
\]

For a binary random vector $\mathbf{X} = (X_1, \ldots, X_p) \in \{\pm 1\}^p$, let $\langle \mathbf{X} \rangle$ denote the multiplicative group generated by the coordinates of $\mathbf{X}$, and for multiple binary vectors $\mathbf{X}_1, \ldots, \mathbf{X}_k$, let $\langle \mathbf{X}_1, \ldots, \mathbf{X}_k \rangle$ denote the group generated by the union of their coordinates. The power vector $\mathbf{X}^{\otimes} \in \{\pm 1\}^{2^p}$ collects the elements of $\langle \mathbf{X} \rangle$ via

\[
\mathbf{X}^{\otimes} := \begin{pmatrix} 1 \\ X_1 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ X_2 \end{pmatrix} \otimes \cdots \otimes \begin{pmatrix} 1 \\ X_p \end{pmatrix}. \tag{1.2}
\]

We index the coordinates of $\mathbf{X}^{\otimes}$ by $\Lambda \in \{0, 1\}^p$ and write $X^{\Lambda} := \prod_{j=1}^{p} X_j^{\Lambda_j}$. For a matrix $M$ and index sets $\mathcal{R}$, $\mathcal{C}$, $M[\mathcal{R}, \mathcal{C}]$ denotes the corresponding submatrix. For a random object $Z$, $\mathrm{Supp}(Z)$ denotes its support with positive probabilities. For a set $\mathcal{S}$, $|\mathcal{S}|$ denotes its cardinality. For probability measures $P$ and $Q$ on a common measurable space, $\mathrm{TV}(P, Q) := \sup_{S} |P(S) - Q(S)|$ denotes total variation distance.

2 Main Results

2.1 Characterization of Conditional Independence for Binary Variables

Let $\mathbf{A} \in \{\pm 1\}^r$, $\mathbf{B} \in \{\pm 1\}^s$, and $\mathbf{C} \in \{\pm 1\}^t$ be binary random vectors.

Definition 2.1. We say that $\mathbf{A}$ and $\mathbf{C}$ are conditionally independent given $\mathbf{B}$, written $\mathbf{A} \perp\!\!\!\perp \mathbf{C} \mid \mathbf{B}$, if for all $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ with $P(\mathbf{B} = \mathbf{b}) > 0$,

\[
P(\mathbf{A} = \mathbf{a}, \mathbf{C} = \mathbf{c} \mid \mathbf{B} = \mathbf{b}) = P(\mathbf{A} = \mathbf{a} \mid \mathbf{B} = \mathbf{b}) \, P(\mathbf{C} = \mathbf{c} \mid \mathbf{B} = \mathbf{b}).
\]

To connect conditional independence with covariance structure, we allow zero-probability cells in the joint distribution of $(\mathbf{A}, \mathbf{B}, \mathbf{C})$. This is essential for multinomial variables encoded by data bits.
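If a multinomial variable has $m$ categories with positive probability and $2^{k-1} < m \le 2^{k}$, then it can be represented by $k$ bits, but that representation may impose deterministic constraints among interaction features and hence lead to singular covariance matrices. The following rank characterization makes that singularity transparent.

Theorem 2.2. For a binary random vector $\mathbf{X} \in \{\pm 1\}^p$, let

\[
\Sigma_{\langle \mathbf{X} \rangle \setminus \{1\}} := \mathrm{Cov}\big( \langle \mathbf{X} \rangle \setminus \{1\}, \, \langle \mathbf{X} \rangle \setminus \{1\} \big)
\]

be the covariance matrix of the nonconstant elements of $\langle \mathbf{X} \rangle$. Then

\[
\mathrm{rank}\big( \Sigma_{\langle \mathbf{X} \rangle \setminus \{1\}} \big) = |\mathrm{Supp}(\mathbf{X})| - 1.
\]

We now define the interaction index sets that govern BEGIN:

\[
\mathcal{B} := \langle \mathbf{B} \rangle \setminus \{1\}, \qquad \mathcal{L} := \langle \mathbf{A}, \mathbf{B} \rangle \setminus \langle \mathbf{B} \rangle, \qquad \mathcal{R} := \langle \mathbf{B}, \mathbf{C} \rangle \setminus \langle \mathbf{B} \rangle.
\]

Let $\Sigma$ be the covariance matrix of the concatenated interaction vector indexed by $\mathcal{B} \cup \mathcal{L} \cup \mathcal{R}$, ordered as $(\mathcal{B}, \mathcal{L}, \mathcal{R})$. When the joint pmf of $(\mathbf{A}, \mathbf{B}, \mathbf{C})$ is strictly positive, $\Sigma$ is positive definite; in general it is only positive semidefinite.

Theorem 2.3 (BEGIN). The following statements are equivalent.

(a) $\mathbf{A} \perp\!\!\!\perp \mathbf{C} \mid \mathbf{B}$.

(b) Sparse conditional-expectation representation. For every $\Lambda_1 \in \{0, 1\}^r$ and $\Lambda_2 \in \{0, 1\}^t$, there exist coefficient vectors $\alpha_{\Lambda_1}, \gamma_{\Lambda_2} \in \mathbb{R}^{2^s}$ such that

\[
E\big[ A^{\Lambda_1} \mid \mathbf{B}, \mathbf{C} \big] = E\big[ A^{\Lambda_1} \mid \mathbf{B} \big] = \alpha_{\Lambda_1}^{\top} \mathbf{B}^{\otimes}, \qquad
E\big[ C^{\Lambda_2} \mid \mathbf{A}, \mathbf{B} \big] = E\big[ C^{\Lambda_2} \mid \mathbf{B} \big] = \gamma_{\Lambda_2}^{\top} \mathbf{B}^{\otimes}. \tag{2.1}
\]

(c) Block factorization of covariance blocks. There exist matrices $M_1$ and $M_2$ such that

\[
\Sigma = \begin{pmatrix}
\Sigma_{\mathcal{B}} & \Sigma_{\mathcal{B}} M_1^{\top} & \Sigma_{\mathcal{B}} M_2^{\top} \\
M_1 \Sigma_{\mathcal{B}} & \Sigma_{\mathcal{L}} & M_1 \Sigma_{\mathcal{B}} M_2^{\top} \\
M_2 \Sigma_{\mathcal{B}} & M_2 \Sigma_{\mathcal{B}} M_1^{\top} & \Sigma_{\mathcal{R}}
\end{pmatrix}. \tag{2.2}
\]

(d) Block-diagonal generalized Schur complement. The generalized Schur complement of $\Sigma_{\mathcal{B}}$ in $\Sigma$,

\[
S := \Sigma[\mathcal{L} \cup \mathcal{R}, \mathcal{L} \cup \mathcal{R}] - \Sigma[\mathcal{L} \cup \mathcal{R}, \mathcal{B}] \, \Sigma_{\mathcal{B}}^{+} \, \Sigma[\mathcal{B}, \mathcal{L} \cup \mathcal{R}],
\]

is block diagonal with respect to $(\mathcal{L}, \mathcal{R})$; that is,

\[
S = \begin{pmatrix} S_{\mathcal{L}} & 0 \\ 0 & S_{\mathcal{R}} \end{pmatrix}.
\]

Theorem 2.3 identifies the exact covariance object behind conditional independence for binary data. Part (b) follows from the binary expansion linear effect (BELIEF) representation in Brown et al. (2025).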
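As a hedged numerical illustration of part (d) of Theorem 2.3 in the simplest case $r = s = t = 1$, the sketch below computes the off-diagonal block $S[\mathcal{L}, \mathcal{R}]$ of the generalized Schur complement with exact rational arithmetic. It is not taken from the paper: the function names (`cov`, `schur_cross_block`, `ci_pmf`, `dependent_pmf`) and all probabilities are hypothetical choices made for illustration. With $s = 1$, $\Sigma_{\mathcal{B}}$ reduces to the scalar $\mathrm{Var}(B)$, so its Moore–Penrose inverse is a reciprocal (or zero).

```python
from fractions import Fraction
from itertools import product


def center(a, b, c):
    """The single interaction in the center set {B}."""
    return b


# Left wing {A, AB} and right wing {C, BC}, as functions of (a, b, c).
LEFT = [lambda a, b, c: a, lambda a, b, c: a * b]
RIGHT = [lambda a, b, c: c, lambda a, b, c: b * c]


def cov(pmf, f, g):
    """Exact covariance of f(A, B, C) and g(A, B, C) under a pmf on {-1, +1}^3."""
    mean_f = sum(p * f(*x) for x, p in pmf.items())
    mean_g = sum(p * g(*x) for x, p in pmf.items())
    mean_fg = sum(p * f(*x) * g(*x) for x, p in pmf.items())
    return mean_fg - mean_f * mean_g


def schur_cross_block(pmf):
    """Return S[L, R], the (left, right) block of the generalized Schur complement."""
    var_b = cov(pmf, center, center)
    # Sigma_B is 1 x 1 here: its Moore-Penrose inverse is 1/Var(B), or 0 if Var(B) = 0.
    pinv_b = 1 / var_b if var_b != 0 else Fraction(0)
    return [
        [cov(pmf, l, r) - cov(pmf, l, center) * pinv_b * cov(pmf, center, r)
         for r in RIGHT]
        for l in LEFT
    ]


def ci_pmf():
    """Hypothetical pmf with A independent of C given B (all numbers are made up)."""
    p_b = {1: Fraction(2, 5), -1: Fraction(3, 5)}
    p_a_plus = {1: Fraction(1, 4), -1: Fraction(2, 3)}  # P(A = +1 | B = b)
    p_c_plus = {1: Fraction(1, 2), -1: Fraction(1, 5)}  # P(C = +1 | B = b)
    pmf = {}
    for a, b, c in product((1, -1), repeat=3):
        pa = p_a_plus[b] if a == 1 else 1 - p_a_plus[b]
        pc = p_c_plus[b] if c == 1 else 1 - p_c_plus[b]
        pmf[(a, b, c)] = p_b[b] * pa * pc
    return pmf


def dependent_pmf():
    """Hypothetical pmf with C = A almost surely and B independent of A."""
    p_b = {1: Fraction(2, 5), -1: Fraction(3, 5)}
    return {(a, b, a): Fraction(1, 2) * p_b[b]
            for a, b in product((1, -1), repeat=2)}


if __name__ == "__main__":
    print(schur_cross_block(ci_pmf()))         # every entry is exactly zero
    print(schur_cross_block(dependent_pmf()))  # at least one nonzero entry
```

For the conditionally independent pmf every entry of the cross block vanishes exactly, while for the pmf with $C = A$ the entry indexed by $(A, C)$ is nonzero. The second pmf has zero-probability cells, which the generalized Schur complement handles without modification.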
The novelty here is that this interaction-level linearity is equivalent to the covariance factorization in part (c) and to the generalized Schur-complement sparsity in part (d). Under conditional independence, the conditional expectation of every interaction on the left or right depends on $(\mathbf{A}, \mathbf{B}, \mathbf{C})$ only through the $2^s$ interaction coordinates in $\mathbf{B}^{\otimes}$; without conditional independence, the corresponding representations generally require $2^{r+s}$ or $2^{s+t}$ coefficients.

Parts (c) and (d) of Theorem 2.3 provide a one-to-one characterization of conditional independence in terms of the covariance structure of $\langle \mathbf{A}, \mathbf{B} \rangle \cap \langle \mathbf{B}, \mathbf{C} \rangle$ for an arbitrary binary vector $(\mathbf{A}, \mathbf{B}, \mathbf{C})$. We emphasize that conditional independence in binary variables must be expressed through intersections of groups: for binary variables, the $\sigma$-field is determined by the group they generate. If one replaces $\mathcal{B}$ by a proper subset of the group intersection, the equivalence can fail in either direction: sparsity of the corresponding Schur complement need not imply conditional independence, and conditional independence need not imply sparsity. The supplementary material provides explicit counterexamples for both failures. Because the relevant covariance blocks are indexed by intersections of multiplicative groups of binary interactions, we call this structure the binary expansion group intersection network (BEGIN).

Theorem 2.3 also shows why the ordinary inverse covariance matrix is not the right object for describing conditional independence outside the Gaussian setting. When $\Sigma$ is singular, the Moore–Penrose inverse $\Sigma^{+}$ need not reflect the relevant sparsity pattern. BEGIN instead isolates the interaction block generated by $\mathbf{B}$ and the associated generalized Schur complement. This is motivated by Theorem 2.5 of Brown et al. (2025), which implies that the rows of $\Sigma[\mathcal{L} \cup \mathcal{R}, \mathcal{B}]$ lie in the row space of $\Sigma_{\mathcal{B}}$.
Equivalently, if we define

\[
M := \Sigma[\mathcal{L} \cup \mathcal{R}, \mathcal{B}] \, \Sigma_{\mathcal{B}}^{+}, \tag{2.3}
\]

then

\[
\Sigma[\mathcal{L} \cup \mathcal{R}, \mathcal{B}] = M \Sigma_{\mathcal{B}}, \qquad \Sigma[\mathcal{B}, \mathcal{L} \cup \mathcal{R}] = \Sigma_{\mathcal{B}} M^{\top}. \tag{2.4}
\]

This deterministic row-space identity motivates the use of the Schur–Banachiewicz inverse of Ouellette (1981), rather than the more commonly used Moore–Penrose inverse $\Sigma^{+}$.

Definition 2.4. Define the Schur–Banachiewicz generalized inverse of $\Sigma$ by

\[
\Omega := \begin{pmatrix}
\Sigma_{\mathcal{B}}^{+} + \Sigma_{\mathcal{B}}^{+} F S^{+} F^{\top} \Sigma_{\mathcal{B}}^{+} & -\Sigma_{\mathcal{B}}^{+} F S^{+} \\
-S^{+} F^{\top} \Sigma_{\mathcal{B}}^{+} & S^{+}
\end{pmatrix}, \qquad F := \Sigma[\mathcal{B}, \mathcal{L} \cup \mathcal{R}]. \tag{2.5}
\]

We use the Schur–Banachiewicz inverse rather than $\Sigma^{+}$ because its block $\Omega[\mathcal{L} \cup \mathcal{R}, \mathcal{L} \cup \mathcal{R}]$ is exactly $S^{+}$. It therefore preserves the separation structure in Corollary 2.5 induced by the generalized Schur complement, even when $\Sigma$ is singular, and yields an exact characterization for rank-deficient multinomial encodings.

Corollary 2.5. The Schur–Banachiewicz inverse $\Omega$ is symmetric and satisfies $\Sigma \Omega \Sigma = \Sigma$. Moreover, the following statements are equivalent.

(a) $\mathbf{A} \perp\!\!\!\perp \mathbf{C} \mid \mathbf{B}$.

(b) $\Omega[\mathcal{L}, \mathcal{R}] = 0$.

(c) In the undirected graph with vertex set $\mathcal{B} \cup \mathcal{L} \cup \mathcal{R}$ and an edge between two distinct vertices whenever the corresponding entry of $\Omega$ is nonzero, the set $\mathcal{B}$ separates $\mathcal{L}$ from $\mathcal{R}$.

Corollary 2.5 shows that BEGIN plays the same structural role for binary interaction features that the precision matrix plays in Gaussian graphical models. In particular, since $\Omega[\mathcal{L} \cup \mathcal{R}, \mathcal{L} \cup \mathcal{R}] = S^{+}$, the graph can be read directly from the sparsity pattern of the Schur–Banachiewicz inverse. As in Gaussian graphical modeling, this viewpoint suggests possible nodewise estimation strategies, though developing their finite-sample theory is beyond the scope of this note. Unlike the Gaussian case, however, the relevant nodes are interaction features and the underlying matrix may be singular.

It is also useful to compare BEGIN with the generalized covariance approach of Loh and Wainwright (2012) and with the Ising model.
A pairwise Ising model is a strictly positive Markov random field on the original variables, so its graph is specified by a factorized likelihood; classical log-linear models likewise provide exact factorization-based characterizations of binary conditional independence. Loh and Wainwright (2012) remain within discrete exponential-family graphical models and show that, after augmenting the covariance matrix by sufficient statistics dictated by a triangulation, its inverse is block graph-structured. BEGIN differs in that it identifies conditional independence exactly through the covariance conditions in Theorem 2.3, without assuming strict positivity, a clique factorization, or a fixed parametric likelihood. In this sense, BEGIN can be viewed as a local building block for Markov random fields over binary or multinomial variables. Section 2.2 provides examples illustrating how such local BEGIN structures can be assembled into larger Markov graphs.

We also note that when $\mathrm{Supp}(\mathbf{B}) \subsetneq \{\pm 1\}^s$, the coefficients in part (b) and the matrices in part (c) need not be unique. The equivalence itself is unaffected: BEGIN is a structural statement about existence, factorization, and sparsity, not about unique representations.

2.2 Examples

By incorporating interactions, the BEGIN framework can represent conditional independence structures that are difficult to display faithfully in classical ways and can serve as building blocks for Markov structures over binary variables.

1. Three binary variables. In the simplest case $r = s = t = 1$, Figure 1(a) shows BEGIN for $A \perp\!\!\!\perp C \mid B$. In addition to the original variables, BEGIN introduces the interaction nodes $AB$ and $BC$. The graph splits naturally into a left wing $\{A, AB\}$, a center $\{B\}$, and a right wing $\{C, BC\}$.
The left wing together with the center generates $\langle A, B \rangle = \{1, A, B, AB\}$, the center together with the right wing generates $\langle B, C \rangle = \{1, B, C, BC\}$, and their intersection, after removing the constant 1, is $\langle B \rangle \setminus \{1\} = \{B\}$.

2. A binary first-order Markov chain. Let $(A_1, \ldots, A_k) \in \{\pm 1\}^k$ be a (not necessarily stationary) first-order Markov chain. BEGIN contains the chain nodes $A_1, \ldots, A_k$ together with the interaction nodes $A_j A_{j+1}$ for $j = 1, \ldots, k - 1$; see Figure 1(b) for $k = 4$. An unrestricted joint pmf on $\{\pm 1\}^k$ has $2^k - 1$ free parameters, whereas a nonstationary first-order Markov model has only $2k - 1$. BEGIN makes that reduction visible at the interaction-node level and suggests a sparse matrix representation for the chain. Moreover, Figure 1(b) is the union of the overlapping BEGIN molecules $\langle A_1, A_2 \rangle \cap \langle A_2, A_3 \rangle$ and $\langle A_2, A_3 \rangle \cap \langle A_3, A_4 \rangle$; more generally, the chain is assembled from the BEGIN molecules on $\langle A_j, A_{j+1} \rangle \cap \langle A_{j+1}, A_{j+2} \rangle$, $j = 1, \ldots, k - 2$.

3. A higher-order conditioning set. Brown et al. (2025) provide an example in which $B \perp\!\!\!\perp (A_1, A_2, A_3) \mid (A_1 A_2, A_2 A_3, A_3 A_1)$. This form of conditional independence is not naturally expressed by a standard graph on $A_1$, $A_2$, $A_3$, and $B$. BEGIN, by contrast, yields a direct undirected graph on interaction nodes corresponding to $\langle A_1 A_2, A_1 A_3, B \rangle \cap \langle A_1, A_2, A_3 \rangle$; see Figure 1(c).

4. A Markov random field beyond the Ising model. The BEGIN structure in the previous example can also serve as a building block for a four-node global Markov random field. Under the relabeling $X_1 = B$, $X_2 = A_1 A_2$, $X_3 = A_1 A_2 A_3$, and $X_4 = A_1 A_3$, Figure 1(c) represents $X_1 \perp\!\!\!\perp X_3 \mid (X_2, X_4)$ through the group intersection $\langle X_1, X_2, X_4 \rangle \cap \langle X_2, X_3, X_4 \rangle$.
If $(X_1, X_2, X_3, X_4) \in \{\pm 1\}^4$ further satisfies the BEGIN molecule $X_2 \perp\!\!\!\perp X_4 \mid (X_1, X_3)$, then these two statements are exactly the nontrivial separation relations of the four-cycle standard graph $X_1 - X_2 - X_3 - X_4 - X_1$. Hence the distribution satisfies the global Markov property with respect to this graph. If the joint pmf is strictly positive, then $(X_1, X_2, X_3, X_4)$ is an Ising model on the four-cycle. Without strict positivity, the same pair of BEGIN molecules still defines a global Markov random field, but not necessarily an Ising model, because zeros in the joint pmf are allowed. Thus, this pair of BEGIN molecules yields a class of four-cycle global Markov random fields that is strictly larger than the Ising family.

Figure 1: Examples of BEGIN, where conditional independence is represented through intersections of multiplicative groups of binary interactions.
(a) BEGIN for $A \perp\!\!\!\perp C \mid B$ as $\langle A, B \rangle \cap \langle B, C \rangle$.
(b) BEGIN for a Markov chain $(A_1, A_2, A_3, A_4)$.
(c) Left wing, center, and right wing of BEGIN for $B \perp\!\!\!\perp (A_1, A_2, A_3) \mid (A_1 A_2, A_2 A_3, A_1 A_3)$, corresponding to $\langle A_1 A_2, A_1 A_3, B \rangle \cap \langle A_1, A_2, A_3 \rangle$.

2.3 The Hadamard prism

The proof of Theorem 2.3, provided in the supplementary material, relies on a linear mapping from $\mathbb{R}^{2^p}$ to $\mathbb{R}^{2^p \times 2^p}$ that packages the covariance algebra into a matrix operator. This mapping is closely related to convolution on $(\mathbb{Z}_2)^p$ and is diagonalized by the Walsh–Hadamard transform. Related constructions also appear in the literature on group-circulant matrices and Boolean Fourier analysis (Terras, 1999; O'Donnell, 2014).
Because this operator is central to the covariance characterization underlying BEGIN and may also be of independent interest for future research, we give it a dedicated name.

Definition 2.6. For $\mathbf{y} \in \mathbb{R}^{2^p}$, define the Hadamard prism of $\mathbf{y}$ by

\[
\eta_p(\mathbf{y}) := \frac{1}{2^p} H_p \, \mathrm{diag}(H_p \mathbf{y}) \, H_p. \tag{2.6}
\]

Because $H_p$ is orthogonal up to scale, the eigenvalues of $\eta_p(\mathbf{y})$ are proportional to the coordinates of $H_p \mathbf{y}$. For binary interaction vectors, those coordinates are directly linked to cell probabilities and interaction means (Zhang, 2019). The Hadamard prism also satisfies a recursion that is useful for second-moment calculations: for $\mathbf{y}_1, \mathbf{y}_2 \in \mathbb{R}^{2^d}$,

\[
\eta_{d+1}\!\left( \begin{pmatrix} \mathbf{y}_1 \\ \mathbf{y}_2 \end{pmatrix} \right) = \begin{pmatrix} \eta_d(\mathbf{y}_1) & \eta_d(\mathbf{y}_2) \\ \eta_d(\mathbf{y}_2) & \eta_d(\mathbf{y}_1) \end{pmatrix}. \tag{2.7}
\]

This recursive form also suggests that the Hadamard prism may be useful beyond BEGIN for studying structured covariance patterns of binary variables.

3 Approximating Conditional Independence for General Variables

The binary and multinomial theory above can be used as a multiresolution approximation device for general random vectors. Following Zhang (2019), Zhang et al. (2021) and Brown et al. (2025), we consider binary expansion as a way to encode real-valued variables through data bits. The classical expansion

\[
U = \sum_{k=1}^{\infty} \frac{A_k}{2^k}, \qquad U \in [-1, 1], \quad A_k \in \{\pm 1\},
\]

suggests approximating $U$ by its $d$-bit truncation $U_d = \sum_{k=1}^{d} A_k / 2^k$.

After marginal standardization, let $(\mathbf{U}, \mathbf{V}, \mathbf{W}) \in [-1, 1]^r \times [-1, 1]^s \times [-1, 1]^t$. Define the dyadic quantizer

\[
Q_d(x) := -1 + 2^{-d} + 2^{1-d} \left\lfloor 2^{d-1} (x + 1) \right\rfloor \in \left\{ -1 + 2^{-d}, \; -1 + 2^{-d} + 2^{1-d}, \; \ldots, \; 1 - 2^{-d} \right\},
\]

and apply it componentwise to vectors. The next result shows that exact dyadic conditional independence at every resolution implies the population notion and yields an explicit approximation rate under Hölder-type continuity.

Theorem 3.1.
For $(\mathbf{U}, \mathbf{V}, \mathbf{W}) \in [-1, 1]^r \times [-1, 1]^s \times [-1, 1]^t$ and $d \ge 1$, define

\[
\mathcal{U}_d := \sigma\big( Q_d(\mathbf{U}) \big), \qquad \mathcal{V}_d := \sigma\big( Q_d(\mathbf{V}) \big), \qquad \mathcal{W}_d := \sigma\big( Q_d(\mathbf{W}) \big).
\]

Then the following statements hold.

(a) If $\mathcal{U}_d \perp\!\!\!\perp \mathcal{W}_d \mid \mathcal{V}_d$ for every $d \ge 1$, then $\mathbf{U} \perp\!\!\!\perp \mathbf{W} \mid \mathbf{V}$.

(b) Suppose $\mathbf{U} \perp\!\!\!\perp \mathbf{W} \mid \mathbf{V}$. Assume there exist $\alpha \in (0, 1]$ and constants $L_U, L_W < \infty$ such that for all $\mathbf{v}, \mathbf{v}' \in [-1, 1]^s$,

\[
\mathrm{TV}\big( \mathcal{L}(\mathbf{U} \mid \mathbf{V} = \mathbf{v}), \, \mathcal{L}(\mathbf{U} \mid \mathbf{V} = \mathbf{v}') \big) \le L_U \| \mathbf{v} - \mathbf{v}' \|_2^{\alpha}
\]

and

\[
\mathrm{TV}\big( \mathcal{L}(\mathbf{W} \mid \mathbf{V} = \mathbf{v}), \, \mathcal{L}(\mathbf{W} \mid \mathbf{V} = \mathbf{v}') \big) \le L_W \| \mathbf{v} - \mathbf{v}' \|_2^{\alpha}.
\]

Define

\[
\Delta_d := \sup_{S \in \mathcal{U}_d, \, T \in \mathcal{W}_d} E \big| P(S \cap T \mid \mathcal{V}_d) - P(S \mid \mathcal{V}_d) \, P(T \mid \mathcal{V}_d) \big|.
\]

Then, for every $d \ge 1$,

\[
\Delta_d \le L_U L_W \, s^{\alpha} \, 2^{2\alpha(1 - d) - 2}.
\]

In particular, $\Delta_d \to 0$ as $d \to \infty$.

Part (a) shows that exact dyadic conditional independence at every resolution implies the population statement. Part (b) controls the converse direction quantitatively. Without continuity assumptions, discretization can create spurious conditional associations or Simpson's paradox; see, for example, Gong and Meng (2021). Under Hölder-type regularity, however, the dyadic approximation error decays at an explicit rate. This provides theoretical support for using BEGIN on the leading data bits of continuous or mixed-type variables as a principled approximation to conditional independence.

4 Discussion

This note establishes an exact covariance-based graphical characterization of conditional independence for arbitrary multivariate binary data in the binary-expansion interaction basis, including singular multinomial encodings. The characterization is distribution-free and is expressed in objects that are natural from the perspective of multiresolution binary expansion. For multinomial and discretized continuous variables, the same viewpoint provides a principled way to relate exact bit-level statements to approximation results for more general variables.
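To make the link between discretized continuous variables and bit-level statements concrete, the following sketch implements the dyadic quantizer $Q_d$ of Section 3 and recovers the $d$ data bits $A_1, \ldots, A_d \in \{\pm 1\}$ whose truncated expansion $\sum_{k=1}^{d} A_k / 2^k$ equals $Q_d(x)$. The function names (`dyadic_index`, `dyadic_quantize`, `data_bits`) are hypothetical, and folding the endpoint $x = 1$ into the top cell is an assumption of this sketch, since the formula for $Q_d$ would otherwise return $1 + 2^{-d}$ there.

```python
import math


def dyadic_index(x, d):
    """Index j in {0, ..., 2^d - 1} of the dyadic cell of width 2^(1-d) containing x."""
    if not -1.0 <= x <= 1.0:
        raise ValueError("x must lie in [-1, 1]")
    j = math.floor(2 ** (d - 1) * (x + 1.0))
    # Assumption of this sketch: the endpoint x = 1 is folded into the top cell.
    return min(j, 2 ** d - 1)


def dyadic_quantize(x, d):
    """Q_d(x) = -1 + 2^(-d) + 2^(1-d) * floor(2^(d-1) * (x + 1)): the cell midpoint."""
    return -1.0 + 2.0 ** (-d) + 2.0 ** (1 - d) * dyadic_index(x, d)


def data_bits(x, d):
    """Bits A_1, ..., A_d in {-1, +1} with sum_k A_k / 2^k equal to Q_d(x)."""
    j = dyadic_index(x, d)
    # The k-th binary digit of j (most significant first), mapped from {0, 1} to {-1, +1}.
    return [2 * ((j >> (d - k)) & 1) - 1 for k in range(1, d + 1)]


if __name__ == "__main__":
    x, d = 0.3, 5
    bits = data_bits(x, d)
    truncated = sum(a / 2 ** k for k, a in enumerate(bits, start=1))
    print(bits, truncated, dyadic_quantize(x, d))
```

Applying `data_bits` componentwise to samples of $(\mathbf{U}, \mathbf{V}, \mathbf{W})$ yields binary vectors of the kind to which the binary theory of Section 2 applies; the choice of $d$ governs the trade-off between approximation error and the size of the interaction feature space.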
These results suggest several directions for future work. One concerns structure learning: how should one estimate the sparse BEGIN graph efficiently from finite samples when the interaction feature space is large? The Schur-complement characterization and the BELIEF representation suggest nodewise procedures, regularized inverse problems, and screening rules tailored to interaction groups, but developing their finite-sample properties lies beyond the scope of this note. A second concerns statistical theory: high-dimensional consistency, robustness to approximate sparsity, and finite-sample guarantees remain open. A third concerns causal and scientific interpretation: the bit-level perspective suggests resolution-dependent notions of adjustment, mediation, and an approximation of causality.

Acknowledgments

Zhang's research was partially supported by NSF grants DMS-2152289 and TI-2449855, as well as BSF grant 2024055. The initial formulation and proof of Theorem 2.3 were completed while Zhou was a junior student at the Princeton International School of Mathematics and Science (PRISMS). Zhou and Zhang thank PRISMS for supporting research collaborations involving high school students. The Hadamard prism was developed during Zhang's visit to Michael Baiocchi at Stanford University. Zhang thanks Baiocchi and Stanford University for the hospitality. The authors thank Michael Baiocchi, Emmanuel Candès, Peng Ding, Fang Han, Jan Hannig, Daniel Kessler, Han Liu, Yufeng Liu, Xiao-Li Meng, Heyang Ni, Art Owen, Evan Schwartz, Chengchun Shi, Daniel Yekutieli, Wan Zhang, Yuhao Zhou, Hongtu Zhu, and José Zubizarreta for helpful comments and discussions.

References

Brown, B., K. Zhang, and X.-L. Meng (2025). BELIEF in dependence: Leveraging atomic linearity in data bits for rethinking generalized linear models. The Annals of Statistics 53(3), 1068–1094.

Drton, M. and M. H.
Maathuis (2017). Structure learning in graphical modeling. Annual Review of Statistics and Its Application 4(1), 365–393.

Gong, R. and X.-L. Meng (2021). Judicious judgment meets unsettling updating: dilation, sure loss and Simpson's paradox. Statistical Science 36(2), 169–190.

Koller, D. and N. Friedman (2009). Probabilistic graphical models: principles and techniques. MIT Press.

Lauritzen, S., C. Uhler, and P. Zwiernik (2021). Total positivity in exponential families with application to binary variables. The Annals of Statistics 49(3), 1436–1459.

Lauritzen, S. L. (1996). Graphical models. Clarendon Press.

Loh, P.-L. and M. J. Wainwright (2012). Structure estimation for discrete graphical models: Generalized covariance matrices and their inverses. Advances in Neural Information Processing Systems 25.

O'Donnell, R. (2014). Analysis of Boolean functions. Cambridge University Press.

Ouellette, D. V. (1981). Schur complements and statistics. Linear Algebra and its Applications 36, 187–295.

Shah, R. D. and J. Peters (2020). The hardness of conditional independence testing and the generalised covariance measure. The Annals of Statistics 48(3), 1514–1538.

Terras, A. (1999). Fourier analysis on finite groups and applications. Number 43. Cambridge University Press.

Wainwright, M. J. and M. I. Jordan (2008). Graphical models, exponential families, and variational inference. Foundations and Trends® in Machine Learning 1(1-2), 1–305.

Zhang, K. (2019). BET on independence. Journal of the American Statistical Association 114(528), 1620–1637.

Zhang, K., W. Zhang, Z. Zhao, and W. Zhou (2021). BEAUTY powered BEAST. arXiv preprint arXiv:2103.00674.
