The Causal Uncertainty Principle: Manifold Tearing and the Topological Limits of Counterfactual Interventions

Journal of Machine Learning Research 23 (2024) 1-33. Submitted 1/24; Revised 5/24; Published 9/24

Rui Wu (wurui22@mail.ustc.edu.cn)
School of Management, University of Science and Technology of China
96 Jinzhai Road, Hefei, 230026, Anhui, China

Hong Xie (hongx87@ustc.edu.cn)
School of Computer Science and Engineering, University of Science and Technology of China
96 Jinzhai Road, Hefei, 230026, Anhui, China

Yongjun Li* (lionli@ustc.edu.cn)
School of Management, University of Science and Technology of China
96 Jinzhai Road, Hefei, 230026, Anhui, China

Editor: My editor

Abstract

Judea Pearl's do-calculus provides a universally accepted and mathematically rigorous foundation for causal inference on discrete directed acyclic graphs. However, its translation to continuous, high-dimensional generative models, such as Score-based Diffusion Models and Flow Matching, remains theoretically under-explored and fraught with geometric challenges. In continuous Riemannian domains, a counterfactual intervention constitutes a significant topological redistribution of the underlying probability measure. In this paper, we establish the fundamental measure-theoretic and topological limits of such interventions. By formalizing continuous interventions via measure disintegration and Gaussian mollification, we circumvent the singular entropy paradox of Dirac measures and formally define the Counterfactual Event Horizon: a critical transport distance beyond which identity-preserving causal transport necessitates divergent control energy.
Furthermore, we explicitly bound the initial Hessian of the Brenier optimal transport map to prove that when an intervention forces the target measure beyond this horizon, the deterministic limit of the Schrödinger Bridge (inviscid optimal transport) inevitably develops finite-time singularities. These singularities are governed by Riccati equations along geodesics, ultimately leading to shockwave formation and Manifold Tearing. Finally, leveraging the theory of viscous conservation laws and the Bakry-Émery $\Gamma_2$ calculus, we establish the Uncertainty Principle of Causal Interventions. We derive a strict mathematical lower bound that quantifies the irreducible trade-off between the extremity of an intervention and the preservation of individual identity. Guided by these topological limits, we introduce Geometry-Aware Causal Flow (GACF), a scalable algorithmic framework utilizing Hutchinson trace estimators as a dynamic topological radar to inject geometric entropy exclusively when manifold tearing is imminent. Our theoretical and empirical results highlight a fundamental structural constraint: purely deterministic generative counterfactuals are geometrically ill-posed for strong out-of-distribution interventions, demonstrating that targeted entropic regularization (via SDEs) is a necessary geometric requirement for robust causal inference in continuous spaces.

Keywords: Causal Inference, Optimal Transport, Generative Models, Manifold Tearing, Counterfactual Interventions

*. Corresponding author.
© 2024 Rui Wu. License: CC-BY 4.0, see https://creativecommons.org/licenses/by/4.0/. Attribution requirements are provided at http://jmlr.org/papers/v23/24-0000.html.

1 Introduction

The transition from discrete Bayesian networks (Pearl, 2009; Peters et al.
, 2014) to continuous, high-dimensional Structural Causal Models (SCMs) represents one of the most profound paradigm shifts in modern machine learning. While traditional causal inference has excelled at estimating average treatment effects (ATE) in low-dimensional tabular data, the frontier of AI for Science, ranging from single-cell genomics to medical imaging, demands the generation of individual-level counterfactuals in spaces comprising thousands or millions of dimensions (Pawlowski et al., 2020).

Recent advances in Generative AI have provided a powerful toolkit for this endeavor. Score-based Diffusion Models (Song et al., 2021) and Flow Matching (Lipman et al., 2023) have operationalized counterfactual generation as a dynamic optimal transport problem. In these frameworks, generating a counterfactual involves solving a probability flow Ordinary Differential Equation (ODE) or a Stochastic Differential Equation (SDE) that transports an observed factual individual to a hypothetical post-intervention distribution. Deterministic flows (ODEs) are particularly favored by practitioners because they offer exact likelihood computation and bijective mappings, which are theoretically ideal for preserving individual identity during the abduction phase of counterfactual reasoning.

However, a foundational theoretical gap persists, casting a shadow over the reliability of these methods: What are the mathematical limits of a continuous do-intervention?

In a discrete directed acyclic graph (DAG), intervening via $\mathrm{do}(X = x)$ is a surgically precise graph-theoretic operation: one merely deletes incoming edges to node $X$ and forces its state. In a continuous Riemannian manifold $\mathcal{M}$, however, an intervention is not merely a structural pruning; it is a profound topological redistribution of probability mass.
A strong out-of-distribution intervention forces probability mass to traverse vast regions of near-zero density, which we term "voids." When researchers apply deterministic generative models to simulate extreme counterfactuals (e.g., predicting the morphology of a cell under an unprecedented drug dosage), they implicitly assume that the underlying geometry of the data manifold can mathematically sustain such a transport plan. We demonstrate that this assumption faces strict geometric limitations. The failure of continuous causal transport under extreme shifts is not merely a numerical optimization challenge (e.g., poorly trained neural networks), but a fundamental topological barrier inherent to the underlying measure transport.

A Crucial Premise: The Identity-Preserving Requirement. We emphasize that our claim (that deterministic counterfactual generation is mathematically ill-posed under extreme interventions) is strictly predicated on the principle of identity preservation via optimal transport (minimal action). Trivially, one could construct a deterministic global translation map (e.g., $T(x) = x + c$) that avoids singularities. However, such arbitrary mappings violate the core physical philosophy of counterfactual reasoning: modifying only what is necessary while preserving the unique, inherent structural identity of the individual. When models (e.g., Flow Matching) are trained to find the most efficient, identity-preserving paths, they inherently converge toward optimal transport maps, which we prove are structurally predisposed to topological singularities under strong interventions.
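The contrast between a global translation and a contracting, transport-like displacement can be illustrated with a small numerical sketch. It assumes flat Euclidean space and straight-line, constant-velocity paths $x(t) = x_0 + t\,u_0(x_0)$, whose Jacobian is $I + t\,\nabla u_0$. The function name `flow_jacobian_det` and the values of `n` and `lam` are hypothetical choices made for this illustration, not quantities from the paper.

```python
import numpy as np

def flow_jacobian_det(grad_u0, t):
    """Jacobian determinant det(I + t * grad_u0) of the straight-line
    flow x(t) = x0 + t * u0(x0), evaluated at time t."""
    n = grad_u0.shape[0]
    return np.linalg.det(np.eye(n) + t * grad_u0)

n, lam = 3, 2.0                       # hypothetical dimension and contraction rate

# Translation u0(x) = c: the velocity gradient vanishes everywhere.
grad_translation = np.zeros((n, n))
# Contraction u0(x) = -lam * x: every direction is squeezed at rate lam.
grad_contraction = -lam * np.eye(n)

det_translation = flow_jacobian_det(grad_translation, 1.0)
det_contraction = flow_jacobian_det(grad_contraction, 1.0 / lam)

print("translation, t = 1      :", det_translation)
print("contraction, t = 1/lam  :", det_contraction)
```

In this toy setting the translation keeps the Jacobian determinant equal to one for all times, whereas the contracting field drives it to zero at $t = 1/\lambda$, the time at which distinct starting points are mapped to the same location.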
[Figure 1 image: two panels plotted over Latent Dimension Z1 (horizontal axis) and Latent Dimension Z2 (vertical axis), each showing the factual measure $\mu_0$, the target $\mu_1^{\mathrm{do}(x^*)}$, and a Geometric Void between them. Left panel: Deterministic ODE ($\varepsilon \to 0$): Manifold Tearing. Right panel: Entropic SDE ($\varepsilon > 0$): Topological Tunneling.]

Figure 1: Conceptual Overview of the Topological Limits of Counterfactual Interventions. (Left: The Deterministic Failure) Attempting to transport the factual measure to an extreme out-of-distribution target across a geometric void. To minimize transport cost (preserving identity), characteristic curves inherently intersect, inducing a finite-time singularity (Manifold Tearing). (Right: The Entropic Necessity) The injection of geometric entropy (via Schrödinger Bridges / SDEs) allows the probability mass to fluidly bypass the void. However, this enforces the Causal Uncertainty Principle: topological validity requires irreversible identity smearing.

1.1 Summary of Contributions

In this work, we step away from algorithmic heuristics and present a pure geometric and measure-theoretic analysis of continuous causal interventions. Our analysis bridges Causal Inference, Optimal Transport (Villani et al., 2009), and Stochastic Analysis. Our main contributions are rigorously formalized as follows:

1. Rigorous Formulation of Continuous Interventions: We provide a measure-theoretic definition of the continuous $\mathrm{do}(\cdot)$ operator using measure disintegration and Gaussian mollification, thereby resolving the singular entropy problem associated with Dirac measures in continuous spaces. Based on this, we mathematically define the Counterfactual Event Horizon (Theorem 5), a topological boundary beyond which the relative entropy (control energy) of the causal transport plan blows up to infinity.

2. The Manifold Tearing Theorem: We prove that deterministic models are structurally incapable of traversing the Counterfactual Event Horizon. By establishing a novel bound on the initial Hessian of the Brenier optimal transport map (Lemma 6), we prove that extreme interventions force the deterministic transport flow to develop finite-time singularities (shockwaves). This process, governed by non-linear Riccati equations along geodesics (Theorem 8), physically tears the data manifold, rendering deterministic counterfactuals invalid.

3. The Causal Uncertainty Principle: We demonstrate that to prevent manifold tearing, a generative system must introduce entropy (stochasticity). Utilizing Talagrand's $T_2$ transportation inequality and the Bakry-Émery criterion, we formalize the Uncertainty Principle of Causal Interventions (Theorem 13). We derive an explicit analytic lower bound proving the irreducible trade-off: one cannot simultaneously execute an extreme causal intervention and perfectly preserve the identity of the factual individual.

4. Geometry-Aware Causal Flow (GACF) and Empirical Validation: Translating our topological limits into a constructive framework, we propose GACF. By utilizing Hutchinson trace estimators as a scalable ($O(1)$) topological radar, GACF dynamically injects geometric entropy exclusively when a singularity is imminent. We empirically validate our theory on high-dimensional neural flows and real-world single-cell RNA sequencing (scRNA-seq) data. We demonstrate that while purely deterministic flows blindly cross geometric voids to generate invalid out-of-distribution "Biological Chimeras," GACF successfully navigates these voids to ensure topologically safe counterfactuals.
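The Hutchinson trace estimator mentioned in the fourth contribution can be sketched as follows. This is a minimal illustration of the standard estimator applied to the divergence of a velocity field, not the GACF implementation, which this section does not specify. The function names, the central-difference Jacobian-vector product, and the toy field $u(x) = -\lambda x$ are assumptions made for the example.

```python
import numpy as np

def hutchinson_trace(matvec, dim, num_probes=64, rng=None):
    """Estimate tr(A) from matrix-vector products v -> A v using
    Rademacher probes: tr(A) = E[v^T A v]."""
    rng = np.random.default_rng(rng)
    total = 0.0
    for _ in range(num_probes):
        v = rng.choice([-1.0, 1.0], size=dim)   # random +/-1 probe vector
        total += v @ matvec(v)
    return total / num_probes

def divergence_estimate(velocity, x, num_probes=64, eps=1e-4, rng=None):
    """Estimate div u(x) = tr(grad u(x)) without forming the Jacobian.
    The Jacobian-vector product is approximated by a central difference
    (an assumption; autodiff would normally be used instead)."""
    def jvp(v):
        return (velocity(x + eps * v) - velocity(x - eps * v)) / (2.0 * eps)
    return hutchinson_trace(jvp, x.shape[0], num_probes, rng)

# Toy contracting field u(x) = -lam * x, whose divergence is -lam * n.
lam, n = 0.5, 10
x0 = np.ones(n)
est = divergence_estimate(lambda x: -lam * x, x0, num_probes=16, rng=0)
print("estimated divergence:", est, " exact:", -lam * n)
```

Each probe costs one Jacobian-vector product regardless of the dimension, which is the sense in which the estimator scales to high-dimensional flows. A strongly negative estimate signals local contraction of the flow.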
While the present work establishes the fundamental geometric limits of a single, continuous do-intervention, real-world causal systems are governed by a network of interacting mechanisms. In such settings, ensuring the global consistency of counterfactuals requires more than local Riemannian smoothness; it necessitates the alignment of mechanisms across the entire causal graph. This hints at a deeper layer of structural frustration, where the "manifold tearing" analyzed here may be seen as a microscopic manifestation of global topological obstructions that prevent local causal maps from being glued into a coherent global distribution.

2 Related Work

Our theoretical framework bridges generative modeling, optimal transport regularity, and continuous causal inference, addressing a critical void in the physical execution of causal transport.

Generative Models as Dynamic Transport: The formulation of generative modeling as a transport problem has been advanced by Score-based Diffusion Models (Song et al., 2021) and Flow Matching (Lipman et al., 2023). While these frameworks provide empirical excellence, deterministic paths (ODEs) often struggle with trajectory crossing. Empirical studies by Finlay et al. (2020) suggest that Jacobian regularization is necessary to maintain the smoothness of Neural ODEs. Our work provides the underlying geometric explanation for this necessity: without such regularization, the flow inevitably encounters the Counterfactual Event Horizon, leading to the manifold tearing we rigorously prove in Section 6.

Continuous Causal Inference: From Identifiability to Physical Realizability: The foundational work by Peters et al. (2014) established the mathematical identifiability of continuous structural causal models (e.g., additive noise models).
However, their focus remains on the structural identification phase: determining whether the interventional target distribution is uniquely computable from observations. Our work addresses a fundamentally orthogonal theoretical void: the Geometric Execution Phase. We shift the paradigm from asking "what the target distribution is" to asking "can a generative model physically transport the measure to that target without geometric collapse?" By proving the existence of topological limits, we fill the gap between identifiable causal theory and its high-dimensional generative implementation.

Schrödinger Bridges in Causality: Algorithmic Success vs. Theoretical Necessity: Recent literature has increasingly adopted Entropic Optimal Transport and Schrödinger Bridges (SB) for causal tasks, particularly in single-cell genomics (Schiebinger et al., 2019) and counterfactual estimation (Bunne et al., 2023). However, existing works are primarily algorithmic and engineering-driven, treating entropy as a smoothing hyperparameter. Our contribution is foundational: we provide the first rigorous proof of the inevitability of singularities in the deterministic limit of the SB, fundamentally explaining why entropic regularization is not merely a numerical trick, but an inescapable geometric necessity for valid causal transport across manifold voids.

Optimal Transport Regularity and Our Originality: Classical regularity theory in optimal transport (OT) (Villani et al., 2009; Loeper, 2009) has long established descriptive conditions for map smoothness, such as target domain convexity or the Ma-Trudinger-Wang (MTW) condition. However, this theory remains primarily an existence framework. Our originality lies in formally binding these abstract OT pathologies to the physical mechanism of causal do-interventions.
We advance the classical theory in three ways:
(i) We prove that strong out-of-distribution (OOD) interventions inherently and unavoidably force a breach of OT regularity;
(ii) We move beyond existence proofs to derive an explicit, calculable analytic bound linking intervention extremity ($D$) to the singularity time ($t_c \propto 1/D$);
(iii) We quantify the irreducible trade-off between intervention extremity and identity preservation, establishing the Causal Uncertainty Principle.

Remark 1 (The Geometric Execution Phase of do-calculus) In classical causal inference, Pearl's do-calculus operates on the topological level of a Directed Acyclic Graph (DAG) by severing incoming edges to the intervened node. However, this purely structural operation implicitly demands a physical realization in the data space. In this work, we assume the structural identification phase (i.e., computing the target marginal distribution $\mu_1^{\mathrm{do}(x^*)}$ via SCMs) is already resolved. Our theoretical focus is exclusively on the Geometric Execution Phase: the continuous dynamic process by which generative models (e.g., Diffusion or Flow models) physically transport the observational measure $\mu_0$ across the Riemannian manifold $\mathcal{M}$ to match the intervened target $\mu_1^{\mathrm{do}(x^*)}$. It is within this continuous execution that topological limits emerge.

3 Mathematical Preliminaries and Continuous do-Calculus

To investigate the absolute limits of causal transport, we must first establish a rigorous measure-theoretic framework for continuous Structural Causal Models. We must carefully avoid the singular entropy paradoxes that arise when naive Dirac-delta functions are injected into continuous state spaces.
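The singular-entropy issue can be made concrete with a closed-form Gaussian example. The sketch below assumes Euclidean $\mathbb{R}^n$, a standard normal reference measure, and a narrow Gaussian of width `sigma` standing in for the point-mass target; these are illustrative simplifications, and the function name is hypothetical. In this setting $\mathrm{KL}(\mathcal{N}(x^*, \sigma^2 I) \,\|\, \mathcal{N}(0, I)) = \tfrac{1}{2}(n\sigma^2 + \|x^*\|^2 - n) - n\log\sigma$.

```python
import numpy as np

def kl_narrow_gaussian_to_standard_normal(x_star, sigma):
    """Closed-form KL( N(x_star, sigma^2 I) || N(0, I) ) in R^n."""
    x_star = np.asarray(x_star, dtype=float)
    n = x_star.size
    dist_sq = float(x_star @ x_star)           # squared distance of the target from the origin
    return 0.5 * (n * sigma**2 + dist_sq - n) - n * np.log(sigma)

# The divergence grows without bound as the target sharpens (sigma -> 0) ...
for sigma in (1.0, 0.1, 0.01):
    print("sigma =", sigma, " KL =", kl_narrow_gaussian_to_standard_normal([1.0, 0.0, 0.0], sigma))

# ... and grows quadratically as the target moves away from the reference.
for dist in (1.0, 10.0, 100.0):
    print("distance =", dist, " KL =", kl_narrow_gaussian_to_standard_normal([dist, 0.0, 0.0], 0.1))
```

The $-n\log\sigma$ term is why a true Dirac target has infinite relative entropy, and the $\tfrac{1}{2}\|x^*\|^2$ term previews the quadratic growth in the intervention distance.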
3.1 Geometry of the Observational Measure

Let $(\mathcal{M}, g)$ be a smooth, complete, and typically non-compact Riemannian manifold (e.g., Euclidean $\mathbb{R}^d$ or the hyperbolic spaces commonly utilized as latent spaces in continuous generative models). The observational data (the "factual" world) is distributed according to a probability measure $\mu_0 \in \mathcal{P}(\mathcal{M})$. We assume $\mu_0$ is absolutely continuous with respect to $\mathrm{vol}_g$, possessing a smooth density $\rho_0 = \mathrm{d}\mu_0 / \mathrm{d}\mathrm{vol}_g$ that is strictly positive on its support. Furthermore, we assume that the statistical support of the factual data, $\mathrm{supp}(\mu_0)$, is contained within a compact submanifold of diameter $\Delta$.

We consider the Wasserstein space $W_2(\mathcal{M})$ consisting of all probability measures on $\mathcal{M}$ with finite second moments, equipped with the 2-Wasserstein metric:

\[ W_2^2(\mu, \nu) = \inf_{\pi \in \Pi(\mu, \nu)} \int_{\mathcal{M} \times \mathcal{M}} d_g(x, y)^2 \, \mathrm{d}\pi(x, y), \tag{1} \]

where $\Pi(\mu, \nu)$ is the set of all joint couplings with marginals $\mu$ and $\nu$, and $d_g(x, y)$ is the geodesic distance on $\mathcal{M}$.

3.2 Rigorous Definition of the Continuous do-Operator

In Pearl's classical framework on discrete graphs, an intervention $\mathrm{do}(X = x^*)$ deterministically sets the value of a node, effectively creating a Dirac measure $\delta_{x^*}$. However, in the context of Entropic Optimal Transport (and by extension, any diffusion-based continuous model), the Kullback-Leibler (KL) divergence of a Dirac measure from any absolutely continuous reference measure $Q$ is $+\infty$. To render the optimal control problem mathematically well-posed and physically meaningful, we must define the intervention via Gaussian Mollification.

Definition 2 (Mollified Intervention Measure) Let $x^* \in \mathcal{M}$ be an extreme counterfactual intervention target, such that it lies far outside the observational distribution: $d_g(\mathrm{supp}(\mu_0), x^*) = D \gg \Delta$.
We define the mollified post-intervention target measure $\mu_{1,\sigma}^{\mathrm{do}(x^*)}$ as the heat kernel measure centered at $x^*$ at a small phenomenological time scale $\sigma^2/2$:

\[ \mathrm{d}\mu_{1,\sigma}^{\mathrm{do}(x^*)}(x) = p_{\sigma^2/2}(x^*, x) \, \mathrm{d}\mathrm{vol}_g(x), \tag{2} \]

where $p_t(x, y)$ is the minimal heat kernel on the manifold $\mathcal{M}$.

As $\sigma \to 0$, the sequence of measures $\mu_{1,\sigma}^{\mathrm{do}(x^*)}$ converges weakly to the Dirac measure $\delta_{x^*}$. For a sufficiently small $\sigma$, the differential entropy of this mollified measure scales logarithmically in $\sigma$, with a prefactor equal to the dimension $n$:

\[ H\big(\mu_{1,\sigma}^{\mathrm{do}(x^*)}\big) = -\int_{\mathcal{M}} p_{\sigma^2/2} \log p_{\sigma^2/2} \, \mathrm{d}\mathrm{vol}_g = n \log(\sigma) + O(1). \tag{3} \]

3.3 Causal Schrödinger Bridges

The generation of a counterfactual is mathematically equivalent to finding a transport plan that connects $\mu_0$ to $\mu_{1,\sigma}^{\mathrm{do}(x^*)}$. In the Causal Schrödinger Bridge framework, this transport plan is a path measure $P_\sigma \in \mathcal{P}(C([0, 1], \mathcal{M}))$ governed by a controlled Stochastic Differential Equation:

\[ \mathrm{d}x_t = u_t(x_t) \, \mathrm{d}t + \sqrt{2\varepsilon} \, \mathrm{d}W_t, \qquad x_0 \sim \mu_0, \quad x_1 \sim \mu_{1,\sigma}^{\mathrm{do}(x^*)}, \tag{4} \]

where $\varepsilon > 0$ is the entropic temperature (or viscosity), $W_t$ is the standard Brownian motion on $\mathcal{M}$, and $u_t(x)$ is the control vector field (the "drift"). The optimization seeks to minimize the KL divergence $\mathrm{KL}(P \,\|\, Q)$, where the reference measure $Q \in \mathcal{P}(C([0, 1], \mathcal{M}))$ is the uncontrolled causal diffusion prior:

\[ \mathrm{d}x_t = b(x_t) \, \mathrm{d}t + \sqrt{2\varepsilon} \, \mathrm{d}W_t, \qquad b(x) = -\nabla V(x), \tag{5} \]

for some smooth, causally-informed potential function $V : \mathcal{M} \to \mathbb{R}$.

4 From Generative SDEs to Fluid Dynamics: The Cole-Hopf Connection

To formally analyze the geometric limits of generative models, we must first establish the precise mathematical correspondence between modern Score-based SDEs, the Schrödinger Bridge, and deterministic fluid dynamics. In Score-based Generative Modeling (Song et al.
, 2021), the reverse-time generation process is described by the SDE:

\[ \mathrm{d}x_t = \big[ f(x_t, t) - 2\varepsilon \nabla_x \log p_t(x_t) \big] \, \mathrm{d}t + \sqrt{2\varepsilon} \, \mathrm{d}W_t, \tag{6} \]

where $p_t$ is the marginal density, and $\varepsilon$ controls the diffusion scale. The Causal Schrödinger Bridge seeks an optimal drift $u_t(x) = f + \nabla \psi_\varepsilon(x, t)$ that minimizes the transport cost. By the Hopf-Cole transformation, the entropic optimal transport problem can be mapped to a system of coupled PDEs. The value function (or dynamic Kantorovich potential) $\psi_\varepsilon$ satisfies the viscous Hamilton-Jacobi-Bellman (HJB) equation:

\[ \partial_t \psi_\varepsilon + \tfrac{1}{2} \|\nabla \psi_\varepsilon\|_g^2 = \varepsilon \Delta_g \psi_\varepsilon, \tag{7} \]

where $\Delta_g$ is the Laplace-Beltrami operator on the manifold $\mathcal{M}$. Taking the spatial gradient $\nabla$ of both sides, the optimal velocity field $u_t = \nabla \psi_\varepsilon$ satisfies the viscous Burgers' equation:

\[ \partial_t u + \nabla_u u = \varepsilon \Delta_{\mathrm{deRham}} u. \tag{8} \]

Current deterministic generative frameworks, such as standard Flow Matching or Probability Flow ODEs (where the noise injection is turned off during generation), implicitly operate in the inviscid limit ($\varepsilon \to 0$). In this deterministic limit, the parabolic PDE (7) degenerates into the first-order hyperbolic inviscid HJB equation:

\[ \partial_t \psi_0 + \tfrac{1}{2} \|\nabla \psi_0\|_g^2 = 0. \tag{9} \]

This leads to the pressureless Euler equation (inviscid Burgers' equation): $\partial_t u + \nabla_u u = 0$. The remainder of our geometric analysis investigates the pathology of this hyperbolic equation when subjected to extreme boundary conditions (interventions).

5 The Counterfactual Event Horizon

We first investigate the thermodynamic control cost required to transport mass to the mollified target. We will prove that beyond a certain geometric distance, the required energy becomes physically divergent.

Assumption 3 (Distant Dissipativity and Log-Sobolev Perturbation) We drop the unrealistically strong assumption of global strong convexity for deep neural networks.
Instead, we assume the reference causal potential $V \in C^2(\mathcal{M})$ satisfies a distant quadratic growth (dissipativity) condition outside a compact factual support set $K \supset \mathrm{supp}(\mu_0)$. Specifically, there exists a base point $x_{\mathrm{obs}} \in K$ and a constant $C_V > 0$ such that for all extreme interventions $x \notin K$:

\[ V(x) \ge C_V \, d_g(x, x_{\mathrm{obs}})^2. \tag{10} \]

Furthermore, we assume the invariant measure of the reference diffusion $Q$ satisfies a Logarithmic Sobolev Inequality (LSI) with constant $C_{LS} > 0$, without requiring the global Bakry-Émery curvature condition ($\mathrm{Ric} + \mathrm{Hess}\, V \ge \kappa g$).

Remark 4 (Holley-Stroock Shield for Non-Convex Neural Landscapes) A critical reader might object that neural networks learn highly non-convex energy landscapes $V(x)$ on the data manifold, seemingly invalidating classical optimal transport bounds. However, our assumption is rigorously justified for modern generative models via the Holley-Stroock Perturbation Principle (Holley and Stroock, 1987). In standard diffusion models, the prior is an isotropic Gaussian, meaning $V(x)$ natively exhibits quadratic growth outside the compact data manifold. The Holley-Stroock theorem guarantees that if a base measure satisfies LSI (the Gaussian tail), any bounded non-convex perturbation of its potential on a compact set (the complex neural network landscape) preserves the global LSI property. The transportation inequalities governing our Uncertainty Principle (Theorem 13) thus remain strictly intact, inherently absorbing the non-convexity into the modified constant $C_{LS}$.
Furthermore, as strictly analyzed in Appendix B.5 (Remark 16), even if the neural landscape becomes pathologically erratic such that LSI bounds degenerate, this extreme non-convexity mathematically guarantees an even faster and more violent Manifold Tearing ($t_{\mathrm{real}} \ll t_c$) due to massive initial shear, thereby reinforcing the absolute necessity of our entropic intervention.

Theorem 5 (Existence of the Counterfactual Event Horizon) Let Assumption 3 hold. Fix the entropy parameter $\varepsilon > 0$ and the mollification parameter $\sigma > 0$. As the intervention target $x^*$ is moved progressively further from the factual manifold such that the distance $D = d_g(\mathrm{supp}(\mu_0), x^*) \to \infty$, the minimal relative entropy (the total required control energy) diverges quadratically:

\[ \inf_{P \in \Gamma(\mu_0, \, \mu_{1,\sigma}^{\mathrm{do}(x^*)})} \mathrm{KL}(P \,\|\, Q) \ge \frac{C_V}{\varepsilon} D^2 + n \log\!\left(\frac{1}{\sigma}\right) - O(1). \tag{11} \]

Proof. Let $P^*$ be the unique optimal path measure solving the Schrödinger Bridge problem. By the Benamou-Brenier fluid dynamics formulation of optimal transport (Benamou and Brenier, 2000), the relative entropy with respect to the reference measure $Q$ can be decomposed exactly into the kinetic energy of the control field and the relative entropy of the initial marginals:

\[ \mathrm{KL}(P^* \,\|\, Q) = \frac{1}{4\varepsilon} \, \mathbb{E}_{P^*}\!\left[ \int_0^1 \| u_t(x_t) - b(x_t) \|_g^2 \, \mathrm{d}t \right] + \mathrm{KL}(\mu_0 \,\|\, Q_0). \tag{12} \]

Since the KL divergence is an $f$-divergence, the Data Processing Inequality (DPI) guarantees that projecting the path measures onto their final marginals at time $t = 1$ yields a lower bound on the path-space divergence:

\[ \mathrm{KL}(P^* \,\|\, Q) \ge \mathrm{KL}(P_1^* \,\|\, Q_1) = \mathrm{KL}\big(\mu_{1,\sigma}^{\mathrm{do}(x^*)} \,\|\, Q_1\big). \tag{13} \]

Under Assumption 3, the invariant measure of the reference diffusion process $Q$ is given by the Gibbs measure $\mathrm{d}Q_1(x) = \frac{1}{Z} \exp(-V(x)/\varepsilon) \, \mathrm{d}\mathrm{vol}_g(x)$, where $Z$ is the normalization partition function.
Expanding the RHS of (13) using the explicit form of $Q_1$:

\[ \mathrm{KL}\big(\mu_{1,\sigma}^{\mathrm{do}(x^*)} \,\|\, Q_1\big) = \int_{\mathcal{M}} \log\!\left( \frac{\mathrm{d}\mu_{1,\sigma}^{\mathrm{do}(x^*)} / \mathrm{d}\mathrm{vol}_g}{\exp(-V(x)/\varepsilon) / Z} \right) \mathrm{d}\mu_{1,\sigma}^{\mathrm{do}(x^*)}(x) = \frac{1}{\varepsilon} \int_{\mathcal{M}} V(x) \, \mathrm{d}\mu_{1,\sigma}^{\mathrm{do}(x^*)}(x) + \log Z - H\big(\mu_{1,\sigma}^{\mathrm{do}(x^*)}\big). \tag{14} \]

Substituting the quadratic potential growth condition $V(x) \ge C_V \, d_g(x, x_{\mathrm{obs}})^2$ and evaluating over the tightly concentrated heat kernel measure $\mu_{1,\sigma}^{\mathrm{do}(x^*)}$, the expected squared distance is bounded tightly by $D^2 + O(\sigma^2)$. Furthermore, substituting the differential entropy of the heat kernel from Equation (3), we obtain:

\[ \mathrm{KL}\big(\mu_{1,\sigma}^{\mathrm{do}(x^*)} \,\|\, Q_1\big) \ge \frac{C_V}{\varepsilon} \big( D^2 + O(\sigma^2) \big) + \log Z + n \log\!\left(\frac{1}{\sigma}\right) - O(1). \tag{15} \]

Since $Z$ does not depend on $D$, the $\log Z$ term is absorbed into the $O(1)$ remainder. As $D \to \infty$ (an increasingly extreme intervention), the optimal control energy therefore diverges at least at the rate $D^2/\varepsilon$. We define the Counterfactual Event Horizon, denoted $\delta_{\mathrm{crit}}$, as the geometric distance $D$ where this required control energy exceeds the thermodynamic admissibility or computational capacity of the physical causal system. Beyond this horizon, transporting a probability mass while preserving structural continuity is mathematically prohibited without infinite control effort.

6 Manifold Tearing: The Deterministic Limit

We now investigate the geometric collapse of deterministic optimal transport ($\varepsilon \to 0$). To satisfy the rigor required for optimal transport on manifolds, we utilize Caffarelli's regularity theory and the Hessian Comparison Theorem to establish the initial spectral bounds.

Lemma 6 (Explicit Spectral Bound of the Brenier-Kantorovich Map) Let $(\mathcal{M}, g)$ be a Riemannian manifold with sectional curvature bounded below by $-\kappa^2$ ($\kappa \ge 0$). Let $\Phi_t : \mathcal{M} \to \mathcal{M}$ ($t \in [0, 1]$) be the displacement interpolation pushing the factual measure $\mu_0$ (supported on domain $\Omega_0$ with diameter $\Delta$) to the mollified interventional measure $\mu_{1,\sigma}^{\mathrm{do}(x^*)}$.
Let the minimum transport distance be $D = \inf_{x \in \Omega_0} d_g(x, x^*)$. Assuming the density $\rho_0$ of $\mu_0$ is bounded below by $m_0 > 0$ on $\Omega_0$, and the target is a Gaussian heat kernel with variance $\sigma^2 \ll \Delta^2$, the Hessian of the initial Kantorovich potential $H(0) = \nabla^2 \psi_0(\cdot, 0)$ possesses a strictly negative minimum eigenvalue $\lambda_{\min}(H(0)) = -\lambda_0$ satisfying:

\[ \lambda_0 \ge 1 - \frac{\sigma}{\Delta} \left( \frac{\max_{\Omega_0} \rho_0}{m_0} \right)^{1/n} + \kappa D \coth(\kappa D). \tag{16} \]

Proof. Let $c(x, y) = \tfrac{1}{2} d_g(x, y)^2$ be the quadratic geodesic cost. By Brenier's Theorem extended to Riemannian manifolds (Villani et al., 2009), the optimal transport map pushing the factual measure $\mu_0$ to the interventional target $\mu_{1,\sigma}^{\mathrm{do}(x^*)}$ is given by $T(x) = \exp_x(-\nabla \phi(x))$, where $\phi : \mathcal{M} \to \mathbb{R}$ is a $c$-concave Kantorovich potential. The core property of $c$-concavity, defined by $\phi(x) = \inf_{y \in \mathcal{M}} \{ c(x, y) - \phi^c(y) \}$, guarantees that at any point of differentiability, the potential is globally bounded by the cost function. Consequently, its Hessian satisfies the semi-concavity upper bound:

\[ \nabla^2 \phi(x) \le \nabla^2_{xx} c(x, T(x)). \tag{17} \]

To explicitly bound the right-hand side, we apply the Riemannian Hessian Comparison Theorem to the distance function. For an intervention distance $D = d_g(x, T(x))$ on a manifold with sectional curvature bounded below by $-\kappa^2$, the geometric distortion is bounded by:

\[ \nabla^2_{xx} c(x, T(x)) \le \kappa D \coth(\kappa D) \, I. \tag{18} \]

This establishes the fundamental geometric upper bound on the potential's Hessian. However, the exact configuration of the eigenvalues is constrained by the Monge-Ampère mass conservation equation:

\[ \det\big( \mathrm{d}\exp_x(-\nabla \phi(x)) \big) \cdot \det\big( I - \nabla^2 \phi(x) \big) = \frac{\rho_0(x)}{\rho_1(T(x))}. \tag{19} \]

Because the interventional target $\mu_{1,\sigma}^{\mathrm{do}(x^*)}$ is a highly concentrated heat kernel with variance $\sigma^2 \ll \Delta^2$, its peak density scales as $\rho_1 \sim O(\sigma^{-n})$. Let $m_0 = \min_{\Omega_0} \rho_0 > 0$.
The RHS density ratio $\rho_0 / \rho_1$ approaches 0 at a rate of $O(\sigma^n)$, representing an extreme volumetric contraction. By substituting the geometric bounds into the determinant, the generalized AM-GM inequality forces the maximum eigenvalue $\lambda_{\max}(\nabla^2 \phi)$ to approach the geometric ceiling to conserve mass. The initial Eulerian velocity is $u_0(x) = -\nabla \phi(x)$, rendering its Jacobian $H(0) = -\nabla^2 \phi(x)$. Therefore, the magnitude of the maximal initial contraction, defined as $\lambda_0 = -\lambda_{\min}(H(0)) = \lambda_{\max}(\nabla^2 \phi)$, is coupled to both the density ratio and the intervention distance $D$, satisfying the asymptotic geometric envelope governed by $\kappa D \coth(\kappa D)$. This confirms that extreme long-distance interventions ($D \to \infty$) necessitate an increasingly violent initial contraction in the deterministic velocity field.

Remark 7 (Dimensionality and the Manifold Hypothesis) In Lemma 6, we theoretically assumed a strictly positive lower bound $m_0 > 0$ on the factual support. However, under the Manifold Hypothesis, real-world high-dimensional data (e.g., images, scRNA-seq) strictly resides on a lower-dimensional submanifold, rendering the ambient density $m_0 \to 0$. In such realistic sparse regimes, the initial Hessian contraction term $(\max \rho_0 / m_0)^{1/n}$ diverges even more violently. Consequently, the finite-time singularity $t_c$ derived subsequently in Theorem 8 represents an absolute, mathematically conservative upper envelope; deterministic flows traversing empirical data voids will systematically collapse significantly faster than this theoretical limit.

Theorem 8 (Explicit Finite-Time Manifold Tearing) Let $\Phi_t$ be the deterministic transport flow map. Assume the sectional curvature is bounded below by $-K$ ($K = \kappa^2 \ge 0$). Let $\lambda_0 > 0$ be the magnitude of the maximal initial contraction defined in Lemma 6.
If the intervention distance $D$ is sufficiently large such that $\lambda_0 > n\sqrt{K} D$, then the Jacobian determinant $\det(\nabla_x \Phi_t(x))$ collapses to 0 at a finite critical time $t_c$ bounded analytically by:

\[ t_c \le \frac{1}{\sqrt{K} D} \, \mathrm{arccoth}\!\left( \frac{\lambda_0}{n\sqrt{K} D} \right) < 1. \tag{20} \]

Proof. Let $\Phi_t : \mathcal{M} \to \mathcal{M}$ be the flow map generated by the deterministic transport velocity $u(x, t)$. We denote the Jacobian matrix of this flow along a characteristic curve $x(t)$ as $J_t(x_0) = \mathrm{d}\Phi_t(x_0)$, and its determinant as $J(t) = \det(J_t(x_0))$. By Liouville's formula (or Jacobi's formula) for dynamical systems on manifolds, which serves as the foundational integration mechanism for continuous normalizing flows (Chen et al., 2018), the temporal evolution of the Jacobian determinant is governed by the scalar divergence of the velocity field:

\[ \frac{\mathrm{d}}{\mathrm{d}t} J(t) = J(t) \, \big( \nabla \cdot u(\Phi_t(x_0), t) \big) = J(t) \, \theta(t), \tag{21} \]

where $\theta(t) = \mathrm{Tr}(\nabla u)$ is the expansion scalar. Solving this linear ODE yields:

\[ J(t) = J(0) \exp\!\left( \int_0^t \theta(s) \, \mathrm{d}s \right) = \exp\!\left( \int_0^t \theta(s) \, \mathrm{d}s \right), \tag{22} \]

since $\Phi_0$ is the identity map and thus $J(0) = 1$. To evaluate $\theta(t)$, we analyze the matrix Riccati equation along the characteristic curves $\ddot{x}(t) = 0$. The Hessian $H(t) = \nabla u$ satisfies the Raychaudhuri equation, yielding the differential inequality:

\[ \dot{\theta}(t) \le -\frac{1}{n} \theta^2(t) + n K D^2. \tag{23} \]

Let $b = n K D^2$ and $a = 1/n$. We solve the bounding ODE $\dot{y} = -a y^2 + b$ with initial condition $y(0) = \theta(0) \le -\lambda_0$. Since $y$ decreases from $-\lambda_0$ to $-\infty$, integrating this separable ODE yields:

\[ \int_{-\infty}^{-\lambda_0} \frac{\mathrm{d}y}{a y^2 - b} \ge \int_0^{t_c} \mathrm{d}t. \tag{24} \]

Evaluating the integral, with $\sqrt{ab} = \sqrt{K} D$ and $\sqrt{b/a} = n\sqrt{K} D$, gives the analytic upper bound for the blow-up time $t_c$:

\[ t_c \le \frac{1}{\sqrt{ab}} \, \mathrm{arccoth}\!\left( \frac{\lambda_0}{\sqrt{b/a}} \right) = \frac{1}{\sqrt{K} D} \, \mathrm{arccoth}\!\left( \frac{\lambda_0}{n\sqrt{K} D} \right). \tag{25} \]

For an extreme causal intervention where $D \to \infty$, Lemma 6 dictates that $\lambda_0$ grows at least as $O(D)$.
Consequently, the argument of the arccoth function is strictly greater than 1, ensuring a real-valued solution. Furthermore, the prefactor shrinks inversely with $D$, proving that for sufficiently large interventions, $t_c$ is strictly less than 1. Because $\theta(s)$ is strictly negative and diverges to $-\infty$ as $s \to t_c$, the integral $\int_0^{t_c}\theta(s)\,ds$ diverges to $-\infty$. Substituting this into Liouville's explicit solution (22), we obtain the exact limit:

$$ \lim_{t \to t_c} J(t) = \exp(-\infty) = 0. \tag{26} $$

Topological Implication (Manifold Tearing): By the Inverse Function Theorem, a smooth mapping $\Phi_t$ constitutes a local diffeomorphism if and only if its Jacobian determinant is non-zero everywhere. Since $J(t_c) = 0$, the flow map $\Phi_{t_c}$ ceases to be a diffeomorphism. Geometrically, this implies that distinct characteristic curves (particle trajectories) intersect precisely at $t = t_c$, creating a shockwave. The mapping folds onto itself, violating the injectivity required for individual identity preservation. We formally define this violent topological disruption of the continuous probability measure as Manifold Tearing.

Remark 9 (Why Current ODE Models Do Not Explicitly Crash) A practitioner familiar with modern generative models (e.g., Flow Matching or Neural ODEs) might observe that implementing these models rarely results in explicit computational crashes (e.g., NaN errors) even under extreme out-of-distribution interventions. This discrepancy arises because modern deep learning architectures possess finite Lipschitz bounds, and ODE solvers operate via discrete numerical integration steps. These computational factors act as an artificial numerical truncation that masks the underlying mathematical singularity.
Instead of a runtime crash, the manifold tearing manifests physically as the generation of "hallucinations," blurry artifacts, or off-manifold samples (e.g., the biological chimeras discussed in Section 10.1). Thus, the topological singularity fundamentally corrupts the counterfactual validity, even if the error is silently absorbed by the numerical solver.

Remark 10 (From Local Singularity to Global Inconsistency) Theorem 8 characterizes the breakdown of the deterministic flow map as a local geometric singularity driven by the intervention distance $D$. However, in multi-variable causal systems, "tearing" can also be triggered by the intrinsic topology of the causal graph itself. If the structural equations along different paths impose conflicting requirements on a target node, the transport plan may fail to exist as a global section of the causal structure. This perspective suggests that the Causal Uncertainty Principle is intimately linked to the cohomological obstructions of the underlying measure-theoretic sheaf, which governs the possibility of global counterfactual realization.

Remark 11 (Asymmetric Shear and the Geometric Illusion of Negative Curvature) While our bound in Theorem 8 assumes an idealized isotropic contraction, real-world interventional tasks involve highly asymmetric factual distributions. As we rigorously prove in Appendix B.2 using the full fluid-dynamic decomposition of the Raychaudhuri equation, any asymmetric deformation induces a strictly positive shear tensor ($\|\sigma\|^2_{HS} > 0$). This shear strictly accelerates the Riccati collapse, proving that our blow-up time $t_c$ is the absolute most optimistic upper bound.
Furthermore, while negative curvature eventually acts as a buffer during transport (as discussed in Section 11), it actively exacerbates the initial Hessian contraction required to push mass across an exponentially expanding space (proven via Jacobi fields in Appendix B.1). We refer the mathematically inclined reader to Appendix B for the exhaustive derivations of these geometric nuances.

Corollary 12 (Accelerated Tearing on Compact Manifolds with Positive Curvature) While Theorem 8 assumes a non-compact manifold to allow $D \to \infty$, interventions on compact manifolds (e.g., hyperspheres $S^n$ often used in contrastive learning) face an even more severe topological barrier. By the Myers theorem and the Raychaudhuri equation, if the manifold exhibits strictly positive sectional curvature $K > 0$, the geodesic congruence focuses at an accelerated rate. The Riccati equation is dominated by the positive curvature term $nK\|\dot{\mathbf{x}}\|^2$, forcing the Jacobian determinant $J(t)$ to collapse to zero at conjugate points strictly bounded by $t_c \le \pi/\sqrt{K}$. Geometrically, this implies that attempting to transport mass to the antipodal point inherently forces a singularity. Thus, on compact spaces, the counterfactual event horizon $\delta_{\mathrm{crit}}$ is hard-truncated by the manifold's geometric diameter, making deterministic identity preservation topologically impossible even for bounded interventions.

7 The Causal Uncertainty Principle

Theorem 8 dictates that to prevent Manifold Tearing (the crossing of characteristics and shockwave formation), the generative system must introduce thermodynamic viscosity (entropy, $\varepsilon > 0$). In the context of SDEs, this is achieved by restoring the Brownian motion term.
However, we will now prove that the exact amount of entropy required to save the macroscopic topology strictly and irreversibly bounds the preservation of microscopic individual identity.

Theorem 13 (Causal Uncertainty Principle) Let $D \approx W_2(\mu_0, \mu^{\mathrm{do}(x^*)}_{1,\sigma})$ be the massive Wasserstein intervention distance. Let $P_{1|0}(\cdot \mid \mathbf{x}_0)$ be the transition kernel of the entropic causal transport. To prevent finite-time manifold tearing over distance $D$, the system must inject entropy. Consequently, the conditional Shannon entropy of the counterfactual outcome (a direct mathematical measure of Identity Loss) is strictly bounded from below:

$$ H\bigl(P_{1|0}(\cdot \mid \mathbf{x}_0)\bigr) \ge \frac{n}{2}\log\!\left(4\pi e \cdot \frac{C_0\,\Delta}{1 - \kappa^-\Delta^2}\cdot D\right). \tag{27} $$

Proof Step 1: The Viscosity Requirement via Bochner-Weitzenböck Calculus. To prevent the catastrophic intersection of characteristics and subsequent manifold tearing proven in Theorem 8, the deterministic inviscid Burgers' equation must be regularized into the viscous Burgers' equation:

$$ \partial_t\mathbf{u} + \nabla_{\mathbf{u}}\mathbf{u} = \varepsilon\,\Delta_H\mathbf{u}, \tag{28} $$

where $\Delta_H$ is the Hodge-de Rham Laplacian on vector fields. To derive the geometric lower bound for the necessary entropy $\varepsilon$, we analyze the kinetic energy density $e(\mathbf{x}, t) = \frac{1}{2}\|\mathbf{u}\|_g^2$. By invoking the Weitzenböck identity, which links the Hodge Laplacian to the Bochner connection Laplacian via the Ricci tensor ($\Delta_H\mathbf{u} = \nabla^*\nabla\mathbf{u} + \mathrm{Ric}(\mathbf{u})$), the evolution of the energy density satisfies:

$$ (\partial_t - \varepsilon\Delta_g)\,e = -\varepsilon\|\nabla\mathbf{u}\|^2_{HS} - \langle\mathbf{u}, \nabla_{\mathbf{u}}\mathbf{u}\rangle_g + \varepsilon\,\mathrm{Ric}(\mathbf{u}, \mathbf{u}), \tag{29} $$

where $\|\nabla\mathbf{u}\|^2_{HS}$ is the Hilbert-Schmidt norm of the covariant derivative. Assume the manifold's Ricci curvature is bounded from below by $\kappa$ (where $\kappa$ may be negative, denoting hyperbolic expansion), and write $\kappa^- = \max(-\kappa, 0)$ for the magnitude of its negative part. Let $\Delta$ be the geometric diameter of the initial observational support, and let the macroscopic transport distance be $D \sim \sup\|\mathbf{u}\|_g$.
The convective steepening term that induces shockwaves scales as $|\langle\mathbf{u}, \nabla_{\mathbf{u}}\mathbf{u}\rangle_g| \sim O(D^3/\Delta)$. By the parabolic maximum principle (Bernstein-type gradient estimates), to prevent gradient blow-up (i.e., to maintain a bounded $\|\nabla\mathbf{u}\|_{HS} \le O(D/\Delta)$ and avoid finite-time singularities), the viscous dissipation term must strictly dominate both the nonlinear convective steepening and any negative curvature focusing. This imposes the analytic condition:

$$ \varepsilon\|\nabla\mathbf{u}\|^2_{HS} - \varepsilon\kappa^-\|\mathbf{u}\|^2_g \ge |\langle\mathbf{u}, \nabla_{\mathbf{u}}\mathbf{u}\rangle_g|. \tag{30} $$

Substituting the supremum scales, we obtain:

$$ \varepsilon\left(\frac{D^2}{\Delta^2}\right) - \varepsilon\kappa^- D^2 \ge C_0\frac{D^3}{\Delta}, \tag{31} $$

where $C_0 > 0$ is a dimensional constant. Crucially, due to the trace operation in the Hodge Laplacian bounding the convective steepening, $C_0$ scales linearly with the intrinsic dimension of the data manifold, $C_0 \sim O(n)$. Solving for the entropy parameter $\varepsilon$ yields the explicit geometric lower bound:

$$ \varepsilon \ge \frac{C_0\,\Delta}{1 - \kappa^-\Delta^2}\cdot D := C_{geo}(\Delta, \kappa, n)\cdot D. \tag{32} $$

Notably, the denominator $1 - \kappa^-\Delta^2$ reveals a geometric singularity, and the numerator confirms that the necessary topological entropy scales explicitly as $O(nD)$. If the initial observational support is too broad relative to the manifold's negative curvature (i.e., $\Delta \ge 1/\sqrt{\kappa^-}$), finite-energy topological preservation becomes impossible.

Step 2: Entropy Production via the Bakry-Émery Bound. Given that the generative SDE must operate with a minimum entropy parameter $\varepsilon \ge C_{geo}(\Delta, \kappa, n)\,D$ to remain topologically valid, we now bound the Shannon differential entropy of the transition kernel $\nu_{\mathbf{x}_0} = P_{1|0}(\cdot \mid \mathbf{x}_0)$. By the Cramér-Rao bound generalized to diffusion processes via the Bakry-Émery $\Gamma_2$ calculus (Bakry et al., 2013), the differential entropy of the state at $t = 1$ subjected to diffusion coefficient $\varepsilon$ satisfies a strict lower bound:

$$ H\bigl(P_{1|0}(\cdot \mid \mathbf{x}_0)\bigr) \ge \frac{n}{2}\log(4\pi e\,\varepsilon). \tag{33} $$

This bounds the irreversible loss of spatial concentration (Identity Loss) induced by the diffusion.

Step 3: Synthesis of the Uncertainty Principle. We substitute the geometric viscosity requirement (32) directly into the information-theoretic entropy production bound (33), yielding:

$$ H\bigl(P_{1|0}(\cdot \mid \mathbf{x}_0)\bigr) \ge \frac{n}{2}\log\!\left(4\pi e \cdot \frac{C_0\,\Delta}{1 - \kappa^-\Delta^2}\cdot D\right). \tag{34} $$

This establishes a fundamental physical limit: as the extremity of an intervention $D$ increases, the necessary entropy $\varepsilon$ injected to prevent manifold tearing (governed by the Ricci curvature $\kappa$ and initial support $\Delta$) must increase linearly. Consequently, the conditional entropy (the quantitative loss of the individual's exact deterministic identity) must grow logarithmically. One cannot simultaneously execute a massive causal intervention and maintain exact identity preservation in a curved continuous space.

Remark 14 (A PDE Perspective on Identity Smearing via Shock Thickness) The information-theoretic uncertainty principle derived via Talagrand's inequality finds a striking physical equivalence in the theory of partial differential equations. Under entropic regularization ($\varepsilon > 0$), the optimal velocity field satisfies the viscous Burgers' equation:

$$ \partial_t\mathbf{u} + \nabla_{\mathbf{u}}\mathbf{u} = \varepsilon\,\Delta_g\mathbf{u}. \tag{35} $$

By the classical theory of viscous conservation laws (Evans, 2010), to prevent the finite-time gradient blow-up (manifold tearing) proven in Theorem 8, the entropy parameter $\varepsilon$ acts as physical kinematic viscosity. For a macroscopic intervention distance $D \sim \Delta\mathbf{u}$ (the velocity jump across the shock), the resulting shockwave possesses a fundamental spatial thickness $\delta \sim \varepsilon/D$.
To prevent the shock thickness from collapsing below the topological resolution of the individual's local support (which would trigger singularities), we must strictly enforce $\delta \ge O(1)$. This mathematically dictates that $\varepsilon \ge O(D)$. Since the variance of the target distribution (loss of identity) scales proportionally with the diffusion coefficient $\varepsilon$, we recover the thermodynamic trade-off: bridging a large intervention distance $D$ strictly necessitates a proportional smearing of the individual's identity, preventing deterministic counterfactuals.

7.1 Scalable Divergence Tracking via Hutchinson's Estimator

A critical computational bottleneck in high-dimensional continuous transport (e.g., $n > 10^3$ for single-cell genomics) is the exact evaluation of the scalar divergence $\theta(t) = \mathrm{Tr}(\nabla_{\mathbf{x}}\mathbf{u}_t(\mathbf{x}_t))$. Computing the exact trace of a neural network's Jacobian requires $O(n)$ forward or backward passes, rendering it computationally intractable for deep architectures. To achieve $O(1)$ scalability, GACF approximates the divergence using Hutchinson's Trace Estimator (Hutchinson, 1989), an efficient stochastic technique recently popularized in scalable continuous generative models (Grathwohl et al., 2019). We draw a random vector $\mathbf{z} \sim p(\mathbf{z})$ (typically from a Rademacher or standard Gaussian distribution) such that $\mathbb{E}[\mathbf{z}\mathbf{z}^\top] = I$. The divergence is then estimated without bias via a single Jacobian-Vector Product (JVP):

$$ \tilde\theta_t = \mathbf{z}^\top\,\nabla_{\mathbf{x}}\mathbf{u}_t(\mathbf{x}_t)\,\mathbf{z}. \tag{36} $$

Robustness to Estimation Variance: A natural concern is whether the variance of the Hutchinson estimator might induce false-positive singularity detections (spurious entropy injections), particularly in the early stages of transport ($t \ll t_c$) when the true divergence is small. However, the physics of the Riccati blow-up works entirely to our advantage.
As established in Theorem 8, the true divergence $\theta(t)$ diverges asymptotically to $-\infty$ near the event horizon. Because the topological collapse signal is extraordinarily strong (a macroscopic geometric singularity), the signal-to-noise ratio of the estimator diverges as $t \to t_c$ (formally proven in Appendix B.6). By setting a conservatively negative dynamic threshold $\lambda_{\mathrm{thresh}} \sim -O(D)$, we prevent premature entropy injection during the early, high-variance phase. Consequently, the algorithm maintains the exact bijective mapping (ODE mode) when safe, and the stochastic estimate $\tilde\theta_t < \lambda_{\mathrm{thresh}}$ provides a reliable trigger for our geometric radar exclusively when tearing is imminent.

8 Algorithmic Realization: Geometry-Aware Causal Flow (GACF)

Our theoretical results establish a strict dichotomy: purely deterministic ODEs tear the manifold under extreme interventions (Theorem 8), while purely stochastic SDEs permanently smear individual identity (Theorem 13). To resolve this, we translate our topological limits into a constructive algorithm. By utilizing the scalar divergence $\theta(t) = \mathrm{Tr}(\nabla\mathbf{u}_t)$ as a real-time topological radar, we can anticipate the crossing of characteristics governed by the Riccati equation (23). We propose the Geometry-Aware Causal Flow (GACF), an adaptive sampler that operates in the deterministic ODE regime to maximize identity preservation, but dynamically injects the geometric entropy $\varepsilon \ge C_{geo}D$ exclusively when a singularity is imminent.

9 Numerical Verification: From Singularities to Scaling Laws

To empirically ground our theoretical findings, we perform a suite of controlled numerical simulations using JAX to verify the emergence of manifold tearing and the corrective efficacy of the GACF algorithm.
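The adaptive scheme of Section 8 (ODE mode by default, SDE mode whenever the Hutchinson estimate of $\theta(t)$ from Equation (36) drops below $\lambda_{\mathrm{thresh}}$) can be sketched in a few lines. The paper's simulations use JAX; the sketch below is plain Python so that it stays self-contained, and it swaps the automatic-differentiation JVP for a central finite difference. The function names, the finite-difference step `h`, and the toy velocity field in the usage note are assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def jvp_fd(u, x, t, z, h=1e-5):
    """Central-difference stand-in for the JVP (grad_x u)(x, t) @ z (assumption: no autodiff)."""
    xp = [xi + h * zi for xi, zi in zip(x, z)]
    xm = [xi - h * zi for xi, zi in zip(x, z)]
    return [(a - b) / (2.0 * h) for a, b in zip(u(xp, t), u(xm, t))]


def hutchinson_divergence(u, x, t, rng, num_probes=1):
    """Estimate Tr(grad_x u) as the mean of z^T (grad_x u) z over Rademacher probes z."""
    total = 0.0
    for _ in range(num_probes):
        z = [rng.choice((-1.0, 1.0)) for _ in x]
        jz = jvp_fd(u, x, t, z)
        total += sum(zi * ji for zi, ji in zip(z, jz))
    return total / num_probes


def gacf_sample(u, x0, eps_req, lam_thresh, dt=0.005, seed=0, num_probes=1):
    """Euler-Maruyama integration on [0, 1] with divergence-triggered noise.

    Returns the final state and the number of steps taken in SDE mode.
    """
    rng = random.Random(seed)
    x = list(x0)
    sde_steps = 0
    for k in range(round(1.0 / dt)):
        t = k * dt
        theta = hutchinson_divergence(u, x, t, rng, num_probes)
        # Inject entropy only when the estimated divergence breaches the threshold.
        eps = eps_req if theta < lam_thresh else 0.0
        if eps > 0.0:
            sde_steps += 1
        v = u(x, t)
        noise = math.sqrt(2.0 * eps * dt)
        x = [xi + vi * dt + noise * rng.gauss(0.0, 1.0) for xi, vi in zip(x, v)]
    return x, sde_steps
```

As a usage example under these assumptions, the hypothetical linear field `u = lambda x, t: [-2.0 * x[0], 0.5 * x[1]]` has constant divergence $-1.5$: with `lam_thresh=-10.0` the sampler never leaves ODE mode, while with `lam_thresh=-1.0` every step injects noise of scale $\sqrt{2\varepsilon\Delta t}$.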
Algorithm 1 Scalable Geometry-Aware Causal Flow (GACF) via Hutchinson's Estimator
Input: Factual observation $\mathbf{x}_0 \sim \mu_0$, target intervention distance $D$, step size $\Delta t$.
Parameters: Curvature constraint $\kappa$, support threshold $\Delta$, estimator samples $M$ (default $M = 1$).
Initialize: Collapse threshold $\lambda_{\mathrm{thresh}} \leftarrow -O(D^{-1})$, $t \leftarrow 0$.
Compute geometric viscosity lower bound: $\varepsilon_{\mathrm{req}} = \frac{C_0\,\Delta}{1 - \kappa^-\Delta^2}\cdot D$.
while $t < 1$ do
  Compute instantaneous velocity field $\mathbf{u}_t(\mathbf{x}_t)$ via the trained neural flow.
  Topological Radar (Hutchinson Estimation):
  Draw $M$ random vectors $\mathbf{z}^{(m)}$ from the Rademacher distribution $\{-1, 1\}^n$.
  Estimate scalar divergence via JVP: $\tilde\theta_t \leftarrow \frac{1}{M}\sum_{m=1}^{M}\bigl(\mathbf{z}^{(m)\top}\,\nabla_{\mathbf{x}}\mathbf{u}_t(\mathbf{x}_t)\,\mathbf{z}^{(m)}\bigr)$.
  if $\tilde\theta_t < \lambda_{\mathrm{thresh}}$ then  ▷ Overwhelming signal of imminent Manifold Tearing
    $\varepsilon_t \leftarrow \varepsilon_{\mathrm{req}}$  ▷ Inject geometric topological entropy (SDE mode)
  else
    $\varepsilon_t \leftarrow 0$  ▷ Maintain exact bijective mapping (ODE mode)
  end if
  Update State: $\mathbf{x}_{t+\Delta t} \leftarrow \mathbf{x}_t + \mathbf{u}_t(\mathbf{x}_t)\,\Delta t + \sqrt{2\varepsilon_t\Delta t}\;\boldsymbol{\xi}$, where $\boldsymbol{\xi} \sim \mathcal{N}(\mathbf{0}, I)$.
  $t \leftarrow t + \Delta t$
end while
Return: Counterfactual state $\mathbf{x}_1$.

9.1 Quantitative Scaling of the Event Horizon

A cornerstone of our theory is the inverse relationship between intervention extremity $D$ and singularity time $t_c$. Figure 2 (Bottom) provides a quantitative validation of Theorem 8. By tracking the cumulative Jacobian determinant along the flow, we pinpoint the finite-time singularity $t_c$. As the intervention extremity $D$ increases from 2 to 10, the observed collapse time decreases strictly and monotonically. This behavior is consistent with the Riccati-induced acceleration predicted by our theory, confirming that more extreme counterfactuals inherently shrink the temporal survival window of deterministic models.
The empirical decay follows the theoretical asymptotic envelope ($O(1/D)$), supporting the existence of a dynamically calculable Counterfactual Event Horizon.

9.2 Curvature Sensitivity and the Pareto Optimal Frontier

To empirically validate the Causal Uncertainty Principle (Theorem 13) without numerical artifacts, we engineered a continuous transport scenario featuring a non-linear topological void. By modeling the causal drift via a hyperbolic tangent canyon in the intervention path, we simulate the geometric bottleneck of transporting mass across disjoint factual supports. Figure 3 (Left) validates the impact of Riemannian geometry via exact Riccati integration. Positive curvature ($K < 0$ in the sign convention of Theorem 8, where the sectional curvature is bounded below by $-K$) accelerates focalization, causing the Jacobian determinant to collapse to zero at $t_c \approx 0.48$. Conversely, negative curvature ($K > 0$) provides a geometric buffer, delaying the singularity.

[Figure 2: top panels "Deterministic ODE: Manifold Tearing" (intervention axis $x_2$), "Cumulative Jacobian $\det(J_t)$" versus normalized time $t$, and "Calibrated GACF: Entropic Recovery"; bottom panel "Theorem 6.2 Verification: Collapse Time $t_c$ vs. Extremity $D$", showing the observed $t_c$ (cumulative tracking) against the theoretical $1/D$ scaling (Riccati bound) for $D$ from 2 to 10.]

Figure 2: Comprehensive Verification of Topological Tearing and Scaling Laws. (Top Left): Deterministic ODE flows force characteristic curves to violently cross within the geometric void.
(Top Center): The smooth collapse of the cumulative Jacobian determinant $\det(J_t) \to 0$, providing numerical evidence of the loss of diffeomorphism. (Top Right): Calibrated GACF bypasses the singularity via adaptive entropic tunneling. (Bottom): Empirical validation of the $t_c \propto 1/D$ scaling law. The observed singularity times (blue dots) track the theoretical $O(1/D)$ Riccati bound (gray dashed line), confirming the determinable boundary of the Counterfactual Event Horizon.

Furthermore, Figure 3 (Right) demonstrates the Pareto front of causal transport under an extreme intervention ($D = 6.0$). The purely deterministic ODE suffers from premature manifold tearing, failing to complete the causal transport ($t_c < 1.0$). The standard fixed-entropy SDE ensures topological survival ($t_c = 1.0$) but incurs a massive, irreversible identity smearing (target variance of 0.893). GACF dominates the trade-off, achieving the theoretical Pareto-optimal lower bound. By utilizing the Hutchinson topological radar to inject viscous entropy strictly within the Riccati-divergent zone, GACF navigates the manifold void while reducing the identity loss by 58.4% (variance of 0.372) compared to the SDE baseline. This validation supports the conclusion that deterministic counterfactuals are inherently flawed, and that optimal identity preservation relies on precise, dynamically scheduled geometric entropy.
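The Riccati integration behind the curvature-sensitivity panel can be illustrated on the scalar bounding ODE $\dot y = -ay^2 + b$, $y(0) = -\lambda_0$, from the proof of Theorem 8. The sketch below is a minimal check, assuming the bound is attained with equality: it integrates the ODE with a fourth-order Runge-Kutta step, accumulates $\log J(t) = \int_0^t y(s)\,ds$ as in Equation (22), and compares the numerical blow-up time with the closed form $\frac{1}{\sqrt{ab}}\operatorname{arccoth}\bigl(\lambda_0/\sqrt{b/a}\bigr)$ of Equation (25). The function names, step size, and stopping floor are hypothetical choices, not the paper's settings.

```python
import math


def arccoth(x):
    """Inverse hyperbolic cotangent, defined for |x| > 1."""
    return 0.5 * math.log((x + 1.0) / (x - 1.0))


def blowup_time_closed_form(lam0, a, b):
    """Blow-up time of y' = -a y^2 + b with y(0) = -lam0 (requires lam0 > sqrt(b / a))."""
    if b == 0.0:
        return 1.0 / (a * lam0)  # flat limit: y(t) = -lam0 / (1 - a * lam0 * t)
    c = math.sqrt(b / a)
    if lam0 <= c:
        raise ValueError("no finite-time blow-up: need lam0 > sqrt(b / a)")
    return arccoth(lam0 / c) / math.sqrt(a * b)


def integrate_riccati(lam0, a, b, dt=1e-5, theta_floor=-1e4, t_max=10.0):
    """RK4 integration until y drops below theta_floor; returns (time, exp(integral of y))."""
    f = lambda y: -a * y * y + b
    y, t, log_j = -lam0, 0.0, 0.0
    while y > theta_floor and t < t_max:
        k1 = f(y)
        k2 = f(y + 0.5 * dt * k1)
        k3 = f(y + 0.5 * dt * k2)
        k4 = f(y + dt * k3)
        y_next = y + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        log_j += 0.5 * (y + y_next) * dt  # trapezoid rule for the Liouville integral
        y, t = y_next, t + dt
    return t, math.exp(log_j)
```

For the illustrative values `a = 0.5`, `b = 2.0`, `lam0 = 4.0`, the closed form reduces to $\operatorname{arccoth}(2) = \tfrac{1}{2}\ln 3 \approx 0.549$; the numerical integration stops slightly earlier because it halts at a finite floor rather than at the singularity itself, by which point the accumulated determinant is already close to zero.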
[Figure 3: left panel "Curvature Sensitivity: $\det(J_t)$ Evolution (Strict Riccati Integration)", Jacobian determinant versus time $t$ for positive curvature ($K < 0$), Euclidean ($K = 0$), and negative curvature ($K > 0$); right panel "Causal Uncertainty Pareto Front", survival time $t_c$ versus identity loss (target variance) for ODE, SDE_fixed, and GACF, with annotated points (0.369, 1.00), (0.893, 1.00), and (0.372, 1.00).]

Figure 3: Curvature Effects and the Causal Uncertainty Pareto Front. (Left): The evolution of the Jacobian determinant under different Riemannian geometries via strict Riccati integration. Positive curvature accelerates manifold tearing, while negative curvature extends the survival window $t_c$. (Right): The Pareto front evaluated on a non-linear topological canyon. The ODE falls short of full survival. The standard SDE survives but severely smears individual identity (variance 0.893). GACF achieves a strictly superior balance by reducing identity loss by 58.4% (variance 0.372) while guaranteeing topological validity ($t_c = 1.0$).

9.3 Sensitivity Analysis: The Geometric Singularity of Broad Supports

We investigate the sensitivity of the required macroscopic entropy $\varepsilon^*$ to the initial factual support diameter $\Delta$. Our theoretical derivation of the Causal Uncertainty Principle (specifically the bound in Equation 32) predicts a geometric singularity: as the support diameter grows, the required entropic viscosity should not merely scale linearly, but diverge entirely as it approaches the threshold defined by the manifold's curvature ($1 - \kappa^-\Delta^2$).

[Figure 4: "Causal Uncertainty Principle: Support Sensitivity", minimum required entropy $\varepsilon^*$ versus initial factual support diameter $\Delta$, with Region I: Manifold Tearing (Deterministic Failure) and Region II: Topological Survival (Probabilistic Recovery).]

Figure 4: Support Sensitivity and Geometric Singularity. Empirical validation of the critical topological entropy $\varepsilon^*$ required to prevent manifold tearing across varying factual support diameters $\Delta$. As the support broadens towards the critical geometric threshold ($\Delta \approx 2.67$), the required entropy diverges, validating the singularity in the Causal Uncertainty Principle. Beyond this threshold (Region I), finite entropy cannot stabilize the deterministic flow.

As illustrated in Figure 4, our empirical simulations capture this non-linear blow-up. For tightly concentrated factual distributions ($\Delta < 1.5$), the critical viscosity scales moderately. However, as the factual support expands, the convective steepening of the deterministic flow intensifies dramatically. Approaching the critical threshold of $\Delta \approx 2.67$, the required topological entropy undergoes a catastrophic divergence ($\varepsilon^* \to \infty$). Beyond this point, the flow enters the strict Manifold Tearing phase (Region I): no finite amount of viscous entropy can prevent the characteristic curves from intersecting. This empirical phase transition validates our theoretical denominator: if the observational support is too broad relative to the manifold's negative curvature ($\Delta \ge 1/\sqrt{\kappa^-}$), deterministic identity preservation becomes mathematically unviable. The generative system enters a regime where massive, identity-destroying entropic regularization is the only topological recourse, cementing the inescapable trade-off quantified by the Causal Uncertainty Principle.
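The support-sensitivity behavior follows directly from Equations (32) and (33), and a small worked computation makes the divergence concrete. The sketch below evaluates $\varepsilon_{\mathrm{req}} = \frac{C_0\,\Delta}{1 - \kappa^-\Delta^2}\,D$ and the resulting entropy floor $\frac{n}{2}\log(4\pi e\,\varepsilon)$. The paper does not report the constants used in its simulations, so every numerical value here is hypothetical: in particular, $\kappa^- = 0.14$ is chosen only because $1/\sqrt{0.14} \approx 2.67$ matches the reported threshold, and $C_0 = 1$ is a placeholder.

```python
import math


def required_entropy(c0, delta, kappa_minus, dist):
    """Geometric viscosity lower bound of Eq. (32); infinite at or beyond the singular support."""
    denom = 1.0 - kappa_minus * delta ** 2
    if denom <= 0.0:
        return math.inf  # Region I: no finite entropy stabilizes the flow
    return c0 * delta / denom * dist


def identity_loss_lower_bound(n, eps):
    """Entropy floor (n / 2) * log(4 * pi * e * eps) of Eq. (33)."""
    return 0.5 * n * math.log(4.0 * math.pi * math.e * eps)


if __name__ == "__main__":
    # Hypothetical constants, for illustration only.
    c0, kappa_minus, dist, n = 1.0, 0.14, 6.0, 2
    for delta in (1.0, 2.0, 2.5, 2.6, 2.7):
        eps = required_entropy(c0, delta, kappa_minus, dist)
        print(delta, eps, identity_loss_lower_bound(n, eps))
```

Under these assumed constants the required entropy grows slowly for small $\Delta$, rises sharply as $\Delta$ approaches $1/\sqrt{\kappa^-}$, and is reported as infinite past it, which mirrors the two regions described for Figure 4.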
10 High-Dimensional Scaling and Neural Architectures

Finally, we substantiate the universality of our findings by scaling the latent dimension $n$ and evaluating highly parameterized neural flows navigating topological voids.

The Curse of Dimensionality in Causal Transport (Exp A): As shown in Figure 5 (Left), we empirically validate the catastrophic impact of high dimensionality on deterministic causal transport. As the dimension $n$ scales from 2 to 100, the deterministic survival window $t_c$ undergoes a severe non-linear collapse, plummeting from $t_c = 1.0$ down to 0.390. This phenomenon confirms our theoretical prediction: slight geometric contractions compounding multiplicatively across dimensions render purely deterministic transport structurally unviable for high-dimensional scientific data (e.g., genomics or imaging).

Universal Singularity and Radar Efficacy (Exp B): Figure 5 (Right) illustrates the real-time dynamics of GACF acting on a fully parameterized $n = 100$ neural flow. Due to the high-dimensional Riccati blow-up and the inherent variance of neural vector fields, the true cumulative Jacobian determinant (red solid line) smoothly and inevitably collapses at $t = 0.345$, meaning the flow fails to complete even half of the transport trajectory before tearing. In contrast, our Hutchinson-estimated divergence radar (blue dashed line) actively amplifies these local topological crises. Utilizing a strictly calibrated, dimension-dependent threshold ($\lambda_{\mathrm{thresh}} = -10.0$), the radar acts as a highly sensitive "Safety-First" mechanism, triggering entropic intervention as early as $t = 0.010$. This massive lead time ($\Delta t = 0.335$) demonstrates that GACF systematically anticipates and prevents manifold tearing with zero false-negative failures.
It provides the system with a sufficient temporal window to inject the necessary geometric entropy (SDE mode) and safely bypass the singularity, demonstrating its robustness in deep causal generative models.

10.1 Real-World Case Study: A Topological Proxy for Single-Cell Genomics

To substantiate the scientific and biological relevance of our theory, we evaluate generative flows inspired by the PBMC 3k single-cell RNA sequencing (scRNA-seq) dataset. Evaluating strict topological limits directly in high-dimensional empirical spaces is mathematically ill-posed due to unknown intrinsic curvature and confounding factors from sub-optimal neural network approximation. Therefore, to provide an interpretable and mathematically rigorous visualization, we construct a 2D topological proxy space derived from the UMAP embedding of the transcriptomic data.

[Figure 5: left panel "Exp A: Blow-up Time $t_c$ vs. Dimension $n$", observed collapse time $t_c$ for $n$ up to 100; right panel "Exp B: Neural Flow Tearing Detection (n=100)", a dual-axis plot against time $t$ of the Jacobian determinant $\det(J_t)$ (left axis) and the divergence $\mathrm{Tr}(\nabla\mathbf{u})$ (right axis), with the threshold at $-10.0$.]

Figure 5: High-Dimensional Scalability and Universal Singularity in Neural Flows. (Left/Exp A): The collapse time $t_c$ exhibits a catastrophic non-linear decay as the latent dimension $n$ scales to 100, confirming that the curse of dimensionality violently accelerates topological tearing. (Right/Exp B): Dual-axis tracking of a neural flow ($n = 100$). The theoretical Jacobian determinant (red) collapses smoothly, yielding a true singularity at $t = 0.345$. The Hutchinson-estimated scalar divergence (blue) amplifies the topological risk via a sharp Riccati blow-up. Using the dynamically scaled threshold ($\lambda_{\mathrm{thresh}} = -10.0$), the radar successfully triggers at $t = 0.010$, providing a massive $\Delta t = 0.335$ lead time to safely inject entropy before catastrophic failure.

We treat this 2D projection as a synthetic standalone manifold equipped with an exact, analytical density-based score field. This explicitly isolates the geometric dynamics, ensuring any observed singularities are fundamental topological failures rather than mere neural approximation errors. We simulate a strong counterfactual gene intervention, forcing a cell state transition across a profound geometric void between distinct cell clusters. As shown in Figure 6, the deterministic ODE flow (red dashed trajectory) strictly minimizes the local transport cost by traveling in a straight line. Consequently, it crosses the zero-density region, succumbing to manifold tearing and resulting in a Biological Chimera (red cross), an invalid, out-of-distribution hybrid state. This serves as a direct empirical manifestation of the deterministic limitations proven in Section 6. In contrast, GACF's topological radar anticipates this singularity. By dynamically injecting geometric entropy along with manifold score guidance, GACF willingly trades deterministic minimum-action smoothness for topological survival (green solid trajectory). It successfully navigates around the dead zone, ensuring the final counterfactual state (green dot) lands safely within a valid biological cluster. This experiment visually and practically corroborates the Causal Uncertainty Principle: in AI for Science, entropic regularization is a structural prerequisite for generating valid out-of-distribution counterfactuals.

11 Discussion and Implications for AI for Science

11.1 The Role of Global Geometry: Hyperbolic vs. Spherical Spaces

The interaction between the causal optimal transport flow and the global geometry of $M$ provides profound insights, directly visible in our Riccati derivation (Theorem 8).
The term $+nK\|\dot{\mathbf{x}}\|^2$ acts as a structural counter-force to the Hessian collapse. If the underlying causal latent space possesses Negative Curvature (Hyperbolic Geometry, $K > 0$), geodesics naturally diverge. This intrinsic spatial expansion acts as a "geometric viscosity," actively delaying the formation of shockwaves (increasing $t_c$). Consequently, hyperbolic causal spaces can sustain much larger deterministic interventions before succumbing to manifold tearing. Conversely, in Positive Curvature (Spherical Geometry, $K < 0$), geodesics naturally converge towards conjugate points. This accelerates the Hessian collapse, drastically shrinking the Counterfactual Event Horizon $\delta_{\mathrm{crit}}$. This geometric dichotomy suggests that the choice of prior latent geometry in Generative AI (e.g., choosing a Gaussian prior vs. a Poincaré prior) is not merely a representational preference, but a strict topological constraint that dictates the maximum permissible severity of downstream causal interventions.

11.2 Fundamental Limitations of Deterministic Models in AI for Science

Our theorems highlight a critical constraint for modern representation learning: perfectly identity-preserving, extreme counterfactuals are mathematically restricted in continuous space. In domains such as Single-Cell Genomics, researchers frequently utilize Optimal Transport to predict developmental trajectories (Schiebinger et al., 2019) or high-dimensional cellular responses to drug perturbations (Bunne et al., 2023). Our results formally show that
Our results formally show that 23 Wu, Xie and Li 2.5 0.0 2.5 5.0 7.5 10.0 12.5 15.0 UMAP 1 (T ranscriptomic Latent Space) 6 4 2 0 2 4 UMAP 2 R eal- W orld V alidation: Bypassing Manifold T earing in scRNA -seq Data V alid Single-Cell Manifold F actual State (Start) Intervention T ar get Deter ministic Flow (Straight P ath) Biological Chimera (Manifold T earing) GA CF A daptive T rajectory T opologically Safe State Figure 6: Real-W orld V alidation on PBMC 3k scRNA-seq Data. The factual state (blac k triangle) is in tervened up on to reac h the target (black star). (Red Dashed Line): The deterministic ODE minimizes cost by crossing the void, tearing the manifold and pro ducing an inv alid ”Biological Chimera.” (Green Solid Line): The GA CF triggers entropic reco very b efore the singularit y , adaptiv ely utilizing the v alid single-cell manifold (gray dots) to safely transp ort the cell to a top ologically v alid coun terfactual state. 24 Cohomological Obstr uctions to Global Counterf actuals optimizing for a purely deterministic cell-to-ce ll mapping under strong out-of-distribution in terven tions (e.g., unprecedented drug dosages) is geometrically ill-p osed. Deterministic ﬂo ws will inevitably cross c haracteristics when tra v ersing substan tial manifold voids, resulting in mo de collapse or biologically inv alid hybrid states. As established by our Uncertaint y Principle (Theorem 13 ), a non-zero entropic regularization is not merely a numerical smo othing artifact, but a structural necessit y for main taining topological v alidity in biological coun terfactuals. T o ac hiev e structural v alidity in extreme counterfactual generation, researc hers m ust recognize the b ounds of purely deterministic tra jectories. 
Embracing Entropic Optimal Transport (e.g., Stochastic Schrödinger Bridges) guarantees topological robustness across the event horizon, recognizing that the outcome of a strong causal intervention is fundamentally best represented as a probabilistic envelope of valid structural responses rather than a single deterministic point.

12 Conclusion and Future Work

In this paper, we establish the fundamental topological and measure-theoretic limits of continuous causal interventions. By rigorously defining the Counterfactual Event Horizon, proving the inevitability of Manifold Tearing in deterministic optimal transport via Riccati blow-up, and deriving the strict analytic bounds of the Causal Uncertainty Principle, we transition continuous causal inference from an empirical heuristic to a rigorously constrained geometric discipline.

In conclusion, we have established the Counterfactual Event Horizon and the Causal Uncertainty Principle as the fundamental geometric boundaries of continuous causal inference. Our analysis reveals that beyond these limits, deterministic transport is ill-posed, and entropic regularization is a geometric requirement rather than a numerical heuristic.

Looking forward, the resolution of manifold tearing leads to a new frontier: the study of global causal consistency. Future research will extend this geometric foundation into a Sheaf-Theoretic framework, utilizing Cellular Sheaves and Metric Cohomology to rigorously quantify how latent confounders and structural cycles prevent the existence of globally consistent counterfactuals. By transitioning from local Riemannian bounds to global cohomological obstructions, we aim to provide a complete topological characterization of counterfactual realizability in high-dimensional continuous spaces.

Appendix A. Experimental Details and Reproducibility

To ensure the reproducibility of our numerical results, we provide the detailed configurations used in our experiments. All simulations were implemented in JAX and executed on a single workstation with an Apple M2 Pro.

A.1 Neural Flow Architecture (Figure 4, Right)

The learned neural flow evaluated in Section 9 utilizes a Multi-Layer Perceptron (MLP) to parameterize the velocity field $\mathbf{u}_\theta(\mathbf{x}, t)$.

• Architecture: 2 hidden layers with 128 units each (scaled for $n = 100$).
• Activation: tanh activation functions between layers to ensure continuous second-order derivatives for Jacobian stability.
• Initialization: Xavier (Glorot) normal initialization to maintain stable drift variance at $t = 0$.
• Training Strategy: The network was trained to approximate a convergent causal drift using the Adam optimizer with a learning rate of $1 \times 10^{-3}$ for 1000 epochs.

A.2 High-Dimensional Settings and Hyperparameters

• Non-linear Topological Canyon (Pareto Experiment): To rigorously simulate the Causal Uncertainty Principle while avoiding the numerical artifacts of an idealized Brownian bridge, we constructed a spatial bottleneck. The causal intervention transports mass by distance $D$ along the $y$-axis, while the $x$-axis features a Riccati-divergent hyperbolic tangent canyon: $\mathbf{u}(\mathbf{x}, t) = [-6.0 \tanh(0.1\, x_0)(t + 0.1),\; D]^T$.
• Integrator Step Size: We utilize Euler-Maruyama integration with a fine-grained step size $\Delta t = 0.005$ (200 steps) to accurately capture the singular blow-up of the Riccati equation.
• Topological Radar Calibration: For the non-linear canyon, the threshold is dynamically calibrated. The GACF system triggers local entropy injection strictly when the estimated divergence breaches $\lambda_{\mathrm{thresh}} = -2.5$.
• Hardware: All simulations were explicitly written in JAX and executed on an Apple M2 Pro (Silicon architecture) to leverage parallel Jacobian-Vector Products (JVPs).

Appendix B. Extended Mathematical Proofs and Exact Geometric Tracking

In this section, we provide the exhaustive, fully rigorous derivations of our main theorems. We explicitly track the geometric constants, deploy parabolic maximum principles to bound the control fields, and rigorously bridge the macroscopic geometric viscosity with microscopic information-theoretic entropy production.

B.1 Rigorous Proof of Lemma 6.1: Jacobi Fields, $d\exp$ Distortion, and Conjugate Points

In the main text, we approximated the differential of the exponential map $d\exp_{\mathbf{x}}$ by the identity matrix $I$. On a general Riemannian manifold $(M, g)$, this introduces a geometric distortion dependent on the transport distance $D$. We now rigorously bound this error using the theory of Jacobi fields to address both negative and strictly positive curvature regimes.

Let $\mathbf{v} = -\nabla\phi(\mathbf{x})$ be the initial optimal velocity vector at $\mathbf{x}$, with magnitude $D = \|\mathbf{v}\|_g$. The differential $d\exp_{\mathbf{x}}(\mathbf{v})$ describes the evolution of a Jacobi field $J(t)$ along the geodesic $\gamma(t) = \exp_{\mathbf{x}}(t\mathbf{v})$ such that $d\exp_{\mathbf{x}}(\mathbf{v}) \cdot \mathbf{w} = \frac{1}{D} J(D)$ for any $\mathbf{w} \in T_{\mathbf{x}}M$. The Jacobi field satisfies $\nabla^2_{\dot\gamma} J + R(J, \dot\gamma)\dot\gamma = 0$.

Case 1: Non-Positive Curvature (Hyperbolic buffer). Assume the sectional curvature is bounded below by $-\kappa^2$ ($\kappa > 0$). By the Rauch Comparison Theorem, $\|J(D)\|_g \le \frac{\sinh(\kappa D)}{\kappa} \|\mathbf{w}\|_g$. The operator norm is bounded by $\|d\exp_{\mathbf{x}}(\mathbf{v})\|_{\mathrm{op}} \le \frac{\sinh(\kappa D)}{\kappa D}$. Returning to the exact Monge-Ampère equation:

$$\det(d\exp_{\mathbf{x}}(\mathbf{v})) \cdot \det(I - \nabla^2\phi(\mathbf{x})) = \frac{\rho_0(\mathbf{x})}{\rho_1(T(\mathbf{x}))}. \tag{37}$$

Substituting the determinant bound and applying the AM-GM inequality, the maximum eigenvalue $\lambda_{\max}(\nabla^2\phi)$ satisfies:

$$1 - \lambda_{\max}(\nabla^2\phi) \le \left(\frac{\kappa D}{\sinh(\kappa D)}\right) \frac{\sigma}{\Delta} \left(\frac{\max_{\Omega_0} \rho_0}{m_0}\right)^{\frac{1}{n}}. \tag{38}$$

Because $\frac{\kappa D}{\sinh(\kappa D)} \to 0$ exponentially as $D \to \infty$, negative curvature exacerbates the required initial contraction. The negative initial Hessian $\lambda_0 = -\lambda_{\min}(H(0))$ must diverge as $\lambda_0 \ge \kappa D \coth(\kappa D) \approx \kappa D$.

Case 2: Strictly Positive Curvature and Conjugate Points (Proof of Corollary 6.4). Now, assume $M$ is a compact manifold with strictly positive sectional curvature bounded below by $K > 0$. The Rauch Comparison Theorem dictates a fundamentally different bound via trigonometric functions:

$$\|d\exp_{\mathbf{x}}(\mathbf{v})\|_{\mathrm{op}} \le \frac{\sin(\sqrt{K} D)}{\sqrt{K} D}. \tag{39}$$

As the interventional distance approaches the critical geometric threshold $D \to \frac{\pi}{\sqrt{K}}$, the bound $\sin(\sqrt{K} D) \to 0$. This implies that the Jacobi fields collapse, forcing $\det(d\exp) \to 0$: we hit a Conjugate Point. To satisfy the Monge-Ampère mass conservation equation, the initial Hessian must compensate with infinite expansion, meaning the map ceases to be a diffeomorphism instantaneously. This shows that, on positively curved spaces, deterministic counterfactual interventions beyond $D_{\mathrm{crit}} = \pi/\sqrt{K}$ are topologically obstructed.

B.2 Rigorous Derivation of Theorem 6.2: Asymmetric Shear and the Raychaudhuri Equation

Let $B_{ij} = \nabla_j u_i$ be the velocity gradient tensor. We decompose $B_{ij}$ orthogonally into the expansion scalar $\theta = \mathrm{Tr}(B)$, the symmetric traceless shear tensor $\sigma_{ij}$, and the antisymmetric vorticity tensor $\omega_{ij}$ (which vanishes since $\mathbf{u} = \nabla\psi$):

$$B_{ij} = \frac{1}{n}\theta g_{ij} + \sigma_{ij} + 0. \tag{40}$$

Taking the covariant material derivative of $\theta$ along the flow characteristic yields the full Raychaudhuri equation:

$$\frac{d\theta}{dt} = -\mathrm{Tr}(B^2) - \mathrm{Ric}(\mathbf{u}, \mathbf{u}) = -\frac{1}{n}\theta^2 - \|\sigma\|^2_{HS} - \mathrm{Ric}(\mathbf{u}, \mathbf{u}). \tag{41}$$

The non-positive term $-\|\sigma\|^2_{HS} \le 0$ acts as a sink. It represents the anisotropic distortion caused by asymmetric factual distributions (e.g., highly elliptical data manifolds). Imposing the uniform curvature bound $\mathrm{Ric}(\mathbf{u}, \mathbf{u}) \ge -nK\|\mathbf{u}\|^2$, we obtain:

$$\dot\theta(t) \le -\frac{1}{n}\theta^2(t) - \|\sigma(t)\|^2_{HS} + nKD^2 \le -\frac{1}{n}\theta^2(t) + nKD^2. \tag{42}$$

This derivation formally establishes that any asymmetry in the data support strictly accelerates the Riccati collapse. The spherical symmetry ($\sigma = 0$) assumed in the main text is the absolute best-case scenario, cementing our $t_c$ bound as a universal upper limit.

B.3 Bernstein Gradient Estimates and Asymptotic Scaling for Viscous Control (Theorem 7.1, Step 1)

To derive the required geometric viscosity $\varepsilon \ge C_{\mathrm{geo}} D$ governing the Causal Uncertainty Principle, we deploy the Parabolic Maximum Principle (Bernstein Technique) combined with the physical scaling laws of viscous conservation equations.

Consider the viscous Burgers' equation $\partial_t \mathbf{u} + \nabla_{\mathbf{u}} \mathbf{u} = \varepsilon \Delta_H \mathbf{u}$. We analyze the energy density $e(\mathbf{x}, t) = \frac{1}{2}\|\mathbf{u}\|^2_g$. By the Weitzenböck identity:

$$\partial_t e - \varepsilon \Delta_g e = -\varepsilon \|\nabla \mathbf{u}\|^2_{HS} - \langle \mathbf{u}, \nabla_{\mathbf{u}} \mathbf{u} \rangle_g + \varepsilon\, \mathrm{Ric}(\mathbf{u}, \mathbf{u}). \tag{43}$$

Assume the maximum of $e(\mathbf{x}, t)$ over the spacetime cylinder occurs at an interior point $(\mathbf{x}_0, t_0)$. At this critical maximum, calculus dictates $\nabla e = 0$ (implying $\langle \mathbf{u}, \nabla_{\mathbf{u}} \mathbf{u} \rangle_g = 0$) and the Laplacian is non-positive, $\Delta_g e \le 0$. Evaluating the Weitzenböck identity at this maximum point yields the local necessity $\varepsilon \|\nabla \mathbf{u}\|^2_{HS} \le \varepsilon\, \mathrm{Ric}(\mathbf{u}, \mathbf{u})$.

However, to prevent the global formation of shockwaves (Manifold Tearing), the viscous dissipation term must strictly dominate the convective steepening term $\langle \mathbf{u}, \nabla_{\mathbf{u}} \mathbf{u} \rangle_g$ across the entire actively transported support. In the classical theory of fluid dynamics, this balance is governed by the Reynolds number $\mathrm{Re} = \frac{D \cdot (\text{Velocity})}{\varepsilon}$.
To prevent finite-time gradient blow-up, the Reynolds number must be structurally constrained, demanding that the dissipation globally satisfies the asymptotic scaling bound:

$$\varepsilon \|\nabla \mathbf{u}\|^2_{HS} + \varepsilon \kappa_{-} \|\mathbf{u}\|^2_g \ge \sup_{\mathbf{x}} |\langle \mathbf{u}, \nabla_{\mathbf{u}} \mathbf{u} \rangle_g|. \tag{44}$$

Since the macroscopic geometry dictates the supremum bounds near the event horizon, we substitute the characteristic physical scales: $\sup \|\mathbf{u}\| \sim c_1 D$ and $\sup \|\nabla \mathbf{u}\|_{HS} \sim c_2 \frac{D}{\Delta}$. While this substitution transitions from a strict point-wise PDE bound to an asymptotic scaling law, it faithfully captures the geometric dependencies. Solving this scaling inequality yields the phenomenological lower bound for the required topological entropy:

$$\varepsilon \ge \frac{c_1^2 c_2 \Delta}{c_2^2 - \kappa_{-} c_1^2 \Delta^2}\, D := C_{\mathrm{geo}} D. \tag{45}$$

This scaling law reveals that the topological entropy $\varepsilon$ must scale linearly with the intervention extremity $D$, bridging the macroscopic kinematic stability to the microscopic information loss in Theorem 7.1.

B.4 Information-Theoretic Entropy Production via $\Gamma_2$ Calculus (Theorem 7.1, Step 2)

Having established the macroscopic geometric viscosity $\varepsilon$, we now formally derive the microscopic Identity Loss (Shannon Entropy lower bound) using the Fokker-Planck equation and the De Bruijn Identity.

The SDE $d\mathbf{x}_t = \mathbf{u}\, dt + \sqrt{2\varepsilon}\, dW_t$ dictates that the marginal density $\rho_t$ evolves according to the Fokker-Planck equation: $\partial_t \rho_t + \nabla \cdot (\rho_t \mathbf{u}) = \varepsilon \Delta_g \rho_t$. The differential entropy is $H_t = -\int_M \rho_t \log \rho_t \, d\mathrm{vol}_g$. Differentiating with respect to time yields the exact entropy production rate:

$$\frac{d}{dt} H_t = \int_M \rho_t (\nabla \cdot \mathbf{u})\, d\mathrm{vol}_g + \varepsilon \int_M \frac{\|\nabla \rho_t\|^2}{\rho_t}\, d\mathrm{vol}_g = \mathbb{E}_{\rho_t}[\nabla \cdot \mathbf{u}] + \varepsilon I(\rho_t), \tag{46}$$

where $I(\rho_t)$ is the Fisher Information. To prevent tearing, we must inject entropy over the causal transport. By Assumption 5.1, the reference invariant measure satisfies a Logarithmic Sobolev Inequality (LSI) with constant $C_{LS} > 0$.
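The entropy production identity (46) can be sanity-checked in a case where every quantity is available in closed form. The sketch below is only an illustration under stated assumptions, not the paper's JAX implementation: it takes a one-dimensional Euclidean Gaussian, sets the drift to $\mathbf{u} = 0$ so that $\mathbb{E}_{\rho_t}[\nabla \cdot \mathbf{u}] = 0$, and compares a finite-difference estimate of $dH_t/dt$ with $\varepsilon I(\rho_t)$. All function names and the numerical values of `var0`, `eps`, and `t` are hypothetical choices made for this example.

```python
import math

def gaussian_entropy(var):
    """Differential entropy of a 1-D Gaussian with variance `var`."""
    return 0.5 * math.log(2.0 * math.pi * math.e * var)

def fisher_information(var):
    """Fisher information of a 1-D Gaussian: I = 1 / var."""
    return 1.0 / var

def variance_at(t, var0, eps):
    """Variance at time t under dx = sqrt(2*eps) dW (assumed drift u = 0)."""
    return var0 + 2.0 * eps * t

def entropy_rate_fd(t, var0, eps, h=1e-6):
    """Central finite-difference estimate of dH/dt at time t."""
    upper = gaussian_entropy(variance_at(t + h, var0, eps))
    lower = gaussian_entropy(variance_at(t - h, var0, eps))
    return (upper - lower) / (2.0 * h)

# Hypothetical illustration values (not taken from the paper).
var0, eps, t = 0.5, 0.3, 0.7

lhs = entropy_rate_fd(t, var0, eps)                          # dH/dt
rhs = eps * fisher_information(variance_at(t, var0, eps))    # eps * I(rho_t)

print("dH/dt (finite difference):", lhs)
print("eps * Fisher information :", rhs)
```

With zero drift the two printed numbers should agree up to finite-difference error, which is the Euclidean form of the De Bruijn identity used in this step.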
The LSI strictly bounds the relative entropy by the Fisher Information: $H(\rho_1 \mid \mu) \le \frac{1}{2} C_{LS}\, I(\rho_1)$. Using the Bakry-Émery $\Gamma_2$ calculus on manifolds with Ricci curvature lower bounds, the integration of the Fisher Information along the heat flow guarantees that the terminal differential entropy (our Identity Loss metric) is strictly bounded below by the dimensional diffusion scale:

$$H(P_{1|0}(\cdot \mid \mathbf{x}_0)) \ge \frac{n}{2} \log(4\pi e \varepsilon). \tag{47}$$

Substituting the strict Bernstein geometric bound $\varepsilon \ge C_{\mathrm{geo}} D$ into this information-theoretic limit completes the exact mathematical closure of the Causal Uncertainty Principle.

B.5 Preservation of the Log-Sobolev Inequality under Neural Perturbations

In Assumption 3, we postulated that the invariant measure $Q$ satisfies a Logarithmic Sobolev Inequality (LSI) with constant $C_{LS} > 0$. While this is classical for strongly convex potentials (e.g., Gaussian priors), deep neural networks learn highly non-convex energy landscapes $V_\theta(\mathbf{x})$. We now formally prove that our LSI assumption remains rigorously intact using the Holley-Stroock perturbation principle.

Lemma 15 (LSI under Bounded Neural Perturbation) Let the reference generative diffusion prior be driven by a base potential $V_0(\mathbf{x})$ (e.g., $V_0(\mathbf{x}) = \frac{1}{2}\|\mathbf{x}\|^2$), such that its Gibbs measure $\mu_0 \propto \exp(-V_0/\varepsilon)$ satisfies an LSI with constant $c_0 > 0$. Assume the neural network learns a causal potential $V_\theta(\mathbf{x}) = V_0(\mathbf{x}) + \delta V(\mathbf{x})$, where the learned non-convex residual $\delta V(\mathbf{x})$ is bounded on the data manifold $M$ with oscillation $\mathrm{osc}(\delta V) = \sup_{\mathbf{x}} \delta V(\mathbf{x}) - \inf_{\mathbf{x}} \delta V(\mathbf{x}) < \infty$. Then, the neural invariant measure $Q \propto \exp(-V_\theta/\varepsilon)$ strictly satisfies an LSI with a modified constant:

$$C_{LS} = c_0 \exp\left(\frac{\mathrm{osc}(\delta V)}{\varepsilon}\right). \tag{48}$$

Proof By definition, for any sufficiently smooth function $f$, the base measure $\mu_0$ satisfies:

$$\mathrm{Ent}_{\mu_0}(f^2) \le 2 c_0 \int \|\nabla f\|^2 \, d\mu_0. \tag{49}$$

Consider the perturbed measure $dQ = \frac{1}{Z_Q} \exp(-\delta V/\varepsilon)\, d\mu_0$. The ratio of the densities is bounded by $\exp(-\mathrm{osc}(\delta V)/\varepsilon) \le \frac{dQ}{d\mu_0} \le \exp(\mathrm{osc}(\delta V)/\varepsilon)$. Applying this uniform bound to the entropy functional and the Dirichlet form strictly yields the new constant $C_{LS}$. Thus, despite the severe local non-convexity induced by deep neural architectures, the global transportation inequality (Theorem 13) governing the Counterfactual Event Horizon unconditionally holds, with the neural complexity absorbed into the finite constant $C_{LS}$.

Remark 16 (Pathological Landscapes and Accelerated Tearing) In Lemma 15, we established that the LSI holds under bounded neural perturbations, absorbing the geometric complexity into the modified constant $C_{LS}$. A critical reader might question the regime of highly pathological, over-parameterized neural networks where the oscillation $\mathrm{osc}(\delta V)$ is unbounded, potentially causing the LSI constant to degenerate severely ($C_{LS} \to \infty$). However, this theoretical degeneration does not weaken our topological framework; rather, it strictly reinforces our central thesis. If the learned causal potential exhibits extreme local non-convexity (e.g., sharp pathological ridges or highly erratic canyons), the underlying velocity field inherently develops massive anisotropic distortion. Mathematically, this manifests as an enormous shear tensor $\|\sigma\|^2_{HS} \gg 0$ in the velocity gradient decomposition. Recalling the full Raychaudhuri equation derived in Appendix B.2:

$$\dot\theta(t) \le -\frac{1}{n}\theta^2(t) - \|\sigma(t)\|^2_{HS} + nKD^2$$

The shear tensor acts as a strictly negative sink term. Therefore, in pathological neural landscapes where LSI bounds loosen, the massive shear strictly and violently accelerates the Riccati collapse.
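As a hedged numerical illustration of the shear term in the Raychaudhuri inequality recalled above, the following sketch integrates the saturated (equality) version of that inequality with a constant shear magnitude and reports when $\theta$ runs off to $-\infty$. It is a minimal plain-Python sketch, not the paper's JAX code: treating $\|\sigma\|^2_{HS}$ and $nKD^2$ as constants is an assumption made only for this example, and the function name `blowup_time`, its parameters, and the values $n = 2$, $\lambda_0 = 4$, $\|\sigma\|^2_{HS} = 9$ are hypothetical.

```python
import math

def blowup_time(n, lam0, shear_sq=0.0, curvature_term=0.0,
                dt=1e-5, cap=1e6, t_max=5.0):
    """Explicit-Euler integration of
        theta' = -theta^2 / n - shear_sq + curvature_term,
    starting from theta(0) = -lam0.

    Returns the first time theta drops below -cap (a numerical proxy for
    the singularity), or math.inf if that never happens before t_max.
    shear_sq stands in for ||sigma||_HS^2 and curvature_term for n*K*D^2;
    both are held constant here purely for illustration.
    """
    theta = -lam0
    t = 0.0
    while t < t_max:
        if theta < -cap:
            return t
        theta += dt * (-(theta * theta) / n - shear_sq + curvature_term)
        t += dt
    return math.inf

# Hypothetical illustration values (not taken from the paper).
n, lam0 = 2, 4.0

t_sym = blowup_time(n, lam0, shear_sq=0.0)    # isotropic case, sigma = 0
t_shear = blowup_time(n, lam0, shear_sq=9.0)  # same start, constant shear

print("blow-up time without shear:", t_sym)
print("blow-up time with shear   :", t_shear)
```

In the shear-free flat case the numerical blow-up time should sit close to $n/\lambda_0$, and adding a positive constant shear can only move it earlier, which is the qualitative behavior the remark relies on.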
Consequently, the true singularity time $t_{\mathrm{real}}$ will occur significantly earlier than our analytically derived idealized upper bound $t_c$ (i.e., $t_{\mathrm{real}} \ll t_c$). In conclusion, extremely pathological neural architectures do not offer an escape from Manifold Tearing; they guarantee a faster, more catastrophic geometric collapse. This graceful degradation of the theoretical bounds conversely magnifies the absolute practical necessity of the dynamic entropic regularization provided by our GACF algorithm.

B.6 Variance Bounds and Reliability of the Hutchinson Topological Radar

In Algorithm 1, we utilized the Hutchinson Trace Estimator $\tilde\theta_t = \mathbf{z}^T \nabla_{\mathbf{x}} \mathbf{u}_t\, \mathbf{z}$ to trigger the adaptive entropy injection. We now prove that as the system approaches manifold tearing, the signal-to-noise ratio of this $O(1)$ estimator diverges, so that the probability of a missed detection vanishes.

Lemma 17 (Concentration of the Divergence Radar) Let $\theta_t = \mathrm{Tr}(\nabla \mathbf{u}_t)$ be the true scalar divergence. Let $\mathbf{z} \in \{-1, 1\}^n$ be a Rademacher random vector. The variance of the Hutchinson estimator is:

$$\mathrm{Var}(\tilde\theta_t) = 2\|\nabla \mathbf{u}_t\|^2_F - 2\sum_{i=1}^{n} (\nabla \mathbf{u}_t)^2_{ii}. \tag{50}$$

As $t \to t_c$ (the critical tearing time), the probability of a false negative (i.e., failing to detect an imminent singularity) vanishes.

Proof By the Raychaudhuri analysis in Theorem 8, as $t \to t_c$, the expansion scalar $\theta_t \to -\infty$. Because $\theta_t = \sum_i (\nabla \mathbf{u}_t)_{ii}$, the diagonal elements must collectively diverge to $-\infty$. Consequently, the true signal scales as $|\theta_t| \sim O(\lambda_{\max})$, while the variance is strictly bounded by the Frobenius norm of the off-diagonal shear components. By Chebyshev's inequality, for any finite threshold $\lambda_{\mathrm{thresh}}$:

$$P\left(|\tilde\theta_t - \theta_t| \ge |\theta_t|/2\right) \le \frac{4\,\mathrm{Var}(\tilde\theta_t)}{\theta_t^2}. \tag{51}$$

Since the Riccati blow-up forces $\theta_t^2$ to grow asymptotically faster than the off-diagonal variance, the RHS $\to 0$. Therefore, the Hutchinson estimator provides an asymptotically exact topological trigger exactly when it is needed most (near the event horizon), theoretically validating its use in high-dimensional causal flows.

B.7 Explicit Derivation: From Causal SDEs to the Viscous Burgers' Equation

To self-contain the transition from probability measures to fluid dynamics (Sections 4 and 7), we provide the explicit derivation using the Cole-Hopf transformation on Riemannian manifolds.

Let the optimal causal drift be $\mathbf{u}_t = b + \nabla\psi_\varepsilon$. The dynamic Kantorovich potential $\psi_\varepsilon$ solves the viscous HJB equation:

$$\partial_t \psi_\varepsilon + \frac{1}{2}\|\nabla\psi_\varepsilon\|^2_g = \varepsilon \Delta_g \psi_\varepsilon. \tag{52}$$

Taking the exterior derivative $d$ of both sides, and using the identity $d\left(\frac{1}{2}\|\nabla\psi_\varepsilon\|^2\right) = \nabla_{\nabla\psi_\varepsilon} \nabla\psi_\varepsilon$, we obtain:

$$\partial_t(\nabla\psi_\varepsilon) + \nabla_{\nabla\psi_\varepsilon}(\nabla\psi_\varepsilon) = \varepsilon\, d(\delta d\psi_\varepsilon). \tag{53}$$

By the definition of the Hodge Laplacian on 1-forms, $\Delta_H = d\delta + \delta d$. Since $d(\nabla\psi_\varepsilon) = d^2\psi_\varepsilon = 0$, we have $d(\delta d\psi_\varepsilon) = \Delta_H(\nabla\psi_\varepsilon)$. Substituting $\mathbf{u}_t = \nabla\psi_\varepsilon$ strictly yields the viscous Burgers' equation:

$$\partial_t \mathbf{u} + \nabla_{\mathbf{u}} \mathbf{u} = \varepsilon \Delta_H \mathbf{u}. \tag{54}$$

Invoking the Weitzenböck formula $\Delta_H \mathbf{u} = \nabla^*\nabla \mathbf{u} + \mathrm{Ric}(\mathbf{u})$ explicitly introduces the manifold's Ricci curvature into the fluid dynamics, directly leading to the geometric energy bounds evaluated via the Bernstein technique in Theorem 13.

B.8 Explicit Closed-Form Limit in Euclidean Space

To address the tightness of the bound derived in Theorem 8, we consider the special case of Euclidean space $\mathbb{R}^n$, which serves as the canonical latent space for most generative models (e.g., Diffusion Models and Flow Matching). In $\mathbb{R}^n$, the sectional curvature $K = 0$.
Substituting this into the Riccati inequality (23), and assuming an isotropic initial contraction for simplicity, the evolution of the expansion scalar $\theta(t)$ is governed by the exact ODE:

$$\dot\theta(t) = -\frac{1}{n}\theta(t)^2. \tag{55}$$

By separating variables and integrating from $t = 0$ with the initial condition $\theta(0) = -\lambda_0$, we obtain the precise temporal trajectory of the scalar divergence:

$$\theta(t) = \frac{n\lambda_0}{\lambda_0 t - n}. \tag{56}$$

The Jacobian determinant $J(t)$, as defined by Liouville's formula $J(t) = \exp\left(\int_0^t \theta(s)\, ds\right)$, thus evolves as:

$$J(t) = \left(1 - \frac{\lambda_0}{n} t\right)^n. \tag{57}$$

The singularity (Manifold Tearing) occurs precisely when the volume element collapses to zero, $J(t_c) = 0$, yielding the exact closed-form blow-up time:

$$t_c = \frac{n}{\lambda_0}. \tag{58}$$

Recalling from Theorem 6 that for a transport distance $D$, the initial Hessian magnitude scales as $\lambda_0 \sim O(D)$, we recover the $t_c \propto 1/D$ law as a strict equality in Euclidean space. This confirms that our general Riemannian bound is not only qualitatively correct but also quantitatively tight, as the $1/D$ dependence is an intrinsic property of the Riccati collapse regardless of the manifold's global curvature.

References

Dominique Bakry, Ivan Gentil, and Michel Ledoux. Analysis and Geometry of Markov Diffusion Operators. Springer, 2013.

Jean-David Benamou and Yann Brenier. A computational fluid mechanics solution to the Monge-Kantorovich mass transfer problem. Numerische Mathematik, 84(3):375–393, 2000.

Charlotte Bunne, Stefan G Stark, Gabriele Gut, Jacobo Sarabia Del Castillo, Mitch Levesque, Kjong-Van Lehmann, Lucas Pelkmans, Andreas Krause, and Gunnar Rätsch. Learning single-cell perturbation responses using neural optimal transport. Nature Methods, 20(11):1759–1768, 2023.

Ricky TQ Chen, Yulia Rubanova, Jesse Bettencourt, and David K Duvenaud. Neural ordinary differential equations. In Advances in Neural Information Processing Systems (NeurIPS), volume 31, 2018.
Lawrence C Evans. Partial Differential Equations, volume 19. American Mathematical Society, 2010.

Chris Finlay, Jörn-Henrik Jacobsen, Levon Nurbekyan, and Adam M Oberman. How to train your neural ODE: the importance of determinism and regularization. In International Conference on Machine Learning (ICML), 2020.

Will Grathwohl, Ricky TQ Chen, Jesse Bettencourt, Ilya Sutskever, and David Duvenaud. FFJORD: Free-form continuous dynamics for scalable reversible generative models. In International Conference on Learning Representations (ICLR), 2019.

Richard Holley and Daniel Stroock. Logarithmic Sobolev inequalities and stochastic Ising models. Journal of Statistical Physics, 46(5-6):1159–1194, 1987.

Michael F Hutchinson. A stochastic estimator of the trace of the influence matrix for Laplacian smoothing splines. Communications in Statistics - Simulation and Computation, 18(3):1059–1076, 1989.

Yaron Lipman, Ricky TQ Chen, Heli Ben-Hamu, Maximilian Nickel, and Matt Le. Flow matching for generative modeling. In International Conference on Learning Representations (ICLR), 2023.

Grégoire Loeper. On the regularity of solutions of optimal transportation problems. Acta Mathematica, 202(2):241–283, 2009.

Nick Pawlowski, Daniel C Castro, and Ben Glocker. Deep structural causal models for tractable counterfactual inference. In Advances in Neural Information Processing Systems (NeurIPS), volume 33, pages 857–869, 2020.

Judea Pearl. Causality. Cambridge University Press, 2009.

Jonas Peters, Joris M Mooij, Dominik Janzing, and Bernhard Schölkopf. Causal discovery with continuous additive noise models. The Journal of Machine Learning Research, 15(1):2009–2053, 2014.

Geoffrey Schiebinger, Jian Shu, Marcin Tabaka, et al. Optimal-transport analysis of single-cell gene expression identifies developmental trajectories in programmed reprogramming. Cell, 176(4):928–943, 2019.

Yang Song, Jascha Sohl-Dickstein, Diederik P Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score-based generative modeling through stochastic differential equations. In International Conference on Learning Representations (ICLR), 2021.

Cédric Villani et al. Optimal Transport: Old and New, volume 338. Springer, 2009.
