The Causal Uncertainty Principle: Manifold Tearing and the Topological Limits of Counterfactual Interventions

Judea Pearl's do-calculus provides a foundation for causal inference, but its translation to continuous generative models remains fraught with geometric challenges. We establish the fundamental limits of such interventions. We define the Counterfactu…

Authors: Rui Wu, Hong Xie, Yongjun Li

Journal of Machine Learning Research 23 (2024) 1-33. Submitted 1/24; Revised 5/24; Published 9/24.

Rui Wu (wurui22@mail.ustc.edu.cn)
School of Management, University of Science and Technology of China, 96 Jinzhai Road, Hefei, 230026, Anhui, China

Hong Xie (hongx87@ustc.edu.cn)
School of Computer Science and Engineering, University of Science and Technology of China, 96 Jinzhai Road, Hefei, 230026, Anhui, China

Yongjun Li* (lionli@ustc.edu.cn)
School of Management, University of Science and Technology of China, 96 Jinzhai Road, Hefei, 230026, Anhui, China

Editor: My editor

Abstract

Judea Pearl's do-calculus provides a universally accepted and mathematically rigorous foundation for causal inference on discrete directed acyclic graphs. However, its translation to continuous, high-dimensional generative models, such as Score-based Diffusion Models and Flow Matching, remains theoretically under-explored and fraught with geometric challenges. In continuous Riemannian domains, a counterfactual intervention constitutes a significant topological redistribution of the underlying probability measure. In this paper, we establish the fundamental measure-theoretic and topological limits of such interventions. By formalizing continuous interventions via measure disintegration and Gaussian mollification, we circumvent the singular entropy paradox of Dirac measures and formally define the Counterfactual Event Horizon: a critical transport distance beyond which identity-preserving causal transport necessitates divergent control energy.
Furthermore, we explicitly bound the initial Hessian of the Brenier optimal transport map to prove that when an intervention forces the target measure beyond this horizon, the deterministic limit of the Schrödinger Bridge (inviscid optimal transport) inevitably develops finite-time singularities. These singularities are governed by Riccati equations along geodesics, ultimately leading to shockwave formation and Manifold Tearing. Finally, leveraging the theory of viscous conservation laws and the Bakry-Émery $\Gamma_2$ calculus, we establish the Uncertainty Principle of Causal Interventions. We derive a strict mathematical lower bound that quantifies the irreducible trade-off between the extremity of an intervention and the preservation of individual identity. Guided by these topological limits, we introduce Geometry-Aware Causal Flow (GACF), a scalable algorithmic framework utilizing Hutchinson trace estimators as a dynamic topological radar to inject geometric entropy exclusively when manifold tearing is imminent. Our theoretical and empirical results highlight a fundamental structural constraint: purely deterministic generative counterfactuals are geometrically ill-posed for strong out-of-distribution interventions, demonstrating that targeted entropic regularization (via SDEs) is a necessary geometric requirement for robust causal inference in continuous spaces.

Keywords: Causal Inference, Optimal Transport, Generative Models, Manifold Tearing, Counterfactual Interventions

*. Corresponding author.
© 2024 Rui Wu. License: CC-BY 4.0, see https://creativecommons.org/licenses/by/4.0/. Attribution requirements are provided at http://jmlr.org/papers/v23/24-0000.html.

1 Introduction

The transition from discrete Bayesian networks (Pearl, 2009; Peters et al., 2014) to continuous, high-dimensional Structural Causal Models (SCMs) represents one of the most profound paradigm shifts in modern machine learning. While traditional causal inference has excelled at estimating average treatment effects (ATE) in low-dimensional tabular data, the frontier of AI for Science, ranging from single-cell genomics to medical imaging, demands the generation of individual-level counterfactuals in spaces comprising thousands or millions of dimensions (Pawlowski et al., 2020).

Recent advances in Generative AI have provided a powerful toolkit for this endeavor. Score-based Diffusion Models (Song et al., 2021) and Flow Matching (Lipman et al., 2023) have operationalized counterfactual generation as a dynamic optimal transport problem. In these frameworks, generating a counterfactual involves solving a probability flow Ordinary Differential Equation (ODE) or a Stochastic Differential Equation (SDE) that transports an observed factual individual to a hypothetical post-intervention distribution. Deterministic flows (ODEs) are particularly favored by practitioners because they offer exact likelihood computation and bijective mappings, which are theoretically ideal for preserving individual identity during the abduction phase of counterfactual reasoning.

However, a foundational theoretical gap persists, casting a shadow over the reliability of these methods: What are the mathematical limits of a continuous do-intervention? In a discrete directed acyclic graph (DAG), intervening via $\mathrm{do}(X = x)$ is a surgically precise graph-theoretic operation: one merely deletes incoming edges to node $X$ and forces its state. In a continuous Riemannian manifold $\mathcal{M}$, however, an intervention is not merely a structural pruning; it is a profound topological redistribution of probability mass.
A strong out-of-distribution intervention forces probability mass to traverse vast regions of near-zero density, which we term "voids." When researchers apply deterministic generative models to simulate extreme counterfactuals (e.g., predicting the morphology of a cell under an unprecedented drug dosage), they implicitly assume that the underlying geometry of the data manifold can mathematically sustain such a transport plan. We demonstrate that this assumption faces strict geometric limitations. The failure of continuous causal transport under extreme shifts is not merely a numerical optimization challenge (e.g., poorly trained neural networks), but a fundamental topological barrier inherent to the underlying measure transport.

A Crucial Premise: The Identity-Preserving Requirement. We emphasize that our claim (deterministic counterfactual generation being mathematically ill-posed under extreme interventions) is strictly predicated on the principle of identity preservation via optimal transport (minimal action). Trivially, one could construct a deterministic global translation map (e.g., $T(x) = x + c$) that avoids singularities. However, such arbitrary mappings violate the core physical philosophy of counterfactual reasoning: modifying only what is necessary while preserving the unique, inherent structural identity of the individual. When models (e.g., Flow Matching) are trained to find the most efficient, identity-preserving paths, they inherently converge toward optimal transport maps, which we prove are structurally predisposed to topological singularities under strong interventions.
[Figure 1 image: two panels over Latent Dimension Z1 (horizontal axis) and Latent Dimension Z2 (vertical axis), each marking the factual measure $\mu_0$, the target $\mu_1^{\mathrm{do}(x^*)}$, and the Geometric Void between them. Left panel: "Deterministic ODE ($\varepsilon \to 0$): Manifold Tearing". Right panel: "Entropic SDE ($\varepsilon > 0$): Topological Tunneling".]

Figure 1: Conceptual Overview of the Topological Limits of Counterfactual Interventions. (Left: The Deterministic Failure): Attempting to transport the factual measure to an extreme out-of-distribution target across a geometric void. To minimize transport cost (preserving identity), characteristic curves inherently intersect, inducing a finite-time singularity (Manifold Tearing). (Right: The Entropic Necessity): The injection of geometric entropy (via Schrödinger Bridges / SDEs) allows the probability mass to fluidly bypass the void. However, this enforces the Causal Uncertainty Principle: topological validity requires irreversible identity smearing.

1.1 Summary of Contributions

In this work, we step away from algorithmic heuristics and present a pure geometric and measure-theoretic analysis of continuous causal interventions. Our analysis bridges Causal Inference, Optimal Transport (Villani et al., 2009), and Stochastic Analysis. Our main contributions are rigorously formalized as follows:

1. Rigorous Formulation of Continuous Interventions: We provide a measure-theoretic definition of the continuous $\mathrm{do}(\cdot)$ operator using measure disintegration and Gaussian mollification, thereby resolving the singular entropy problem associated with Dirac measures in continuous spaces. Based on this, we mathematically define the Counterfactual Event Horizon (Theorem 5), a topological boundary beyond which the relative entropy (control energy) of the causal transport plan blows up to infinity.

2. The Manifold Tearing Theorem: We prove that deterministic models are structurally incapable of traversing the Counterfactual Event Horizon. By establishing a novel bound on the initial Hessian of the Brenier optimal transport map (Lemma 6), we prove that extreme interventions force the deterministic transport flow to develop finite-time singularities (shockwaves). This process, governed by non-linear Riccati equations along geodesics (Theorem 8), physically tears the data manifold, rendering deterministic counterfactuals invalid.

3. The Causal Uncertainty Principle: We demonstrate that to prevent manifold tearing, a generative system must introduce entropy (stochasticity). Utilizing Talagrand's $T_2$ transportation inequality and the Bakry-Émery criterion, we formalize the Uncertainty Principle of Causal Interventions (Theorem 13). We derive an explicit analytic lower bound proving the irreducible trade-off: one cannot simultaneously execute an extreme causal intervention and perfectly preserve the identity of the factual individual.

4. Geometry-Aware Causal Flow (GACF) and Empirical Validation: Translating our topological limits into a constructive framework, we propose GACF. By utilizing Hutchinson trace estimators as a scalable ($O(1)$) topological radar, GACF dynamically injects geometric entropy exclusively when a singularity is imminent. We empirically validate our theory on high-dimensional neural flows and real-world single-cell RNA sequencing (scRNA-seq) data. We demonstrate that while purely deterministic flows blindly cross geometric voids to generate invalid out-of-distribution "Biological Chimeras," GACF successfully navigates these voids to ensure topologically safe counterfactuals.
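The "topological radar" in contribution 4 rests on Hutchinson's trace estimator, which can be sketched in a few lines. The snippet below is an illustrative sketch only, not the authors' GACF implementation: it estimates the expansion scalar $\theta = \mathrm{Tr}(\nabla u)$ of a velocity field with Rademacher probes, and it approximates each Jacobian-vector product by a central finite difference where a practical system would presumably use automatic differentiation. The names `hutchinson_divergence`, `num_probes`, `h`, and the toy field `contracting_field` are assumptions introduced for illustration.

```python
import random


def hutchinson_divergence(u, x, num_probes=64, h=1e-4, rng=None):
    """Estimate div u(x) = Tr(grad u)(x) with Hutchinson's estimator.

    Each Rademacher probe z contributes z . (J z), where J is the Jacobian
    of u at x. The product J z is approximated by a central finite
    difference (an assumption made to keep this sketch dependency-free).
    """
    rng = rng or random.Random(0)
    total = 0.0
    for _ in range(num_probes):
        # Rademacher probe: independent +/-1 entries.
        z = [rng.choice((-1.0, 1.0)) for _ in x]
        x_plus = [xi + h * zi for xi, zi in zip(x, z)]
        x_minus = [xi - h * zi for xi, zi in zip(x, z)]
        # Directional derivative J z ~ (u(x + h z) - u(x - h z)) / (2 h).
        jz = [(p - m) / (2.0 * h) for p, m in zip(u(x_plus), u(x_minus))]
        total += sum(zi * ji for zi, ji in zip(z, jz))
    return total / num_probes


def contracting_field(x):
    # Hypothetical linear field with Jacobian diag(-2, -3, -0.5),
    # so its exact divergence is -5.5.
    return [-2.0 * x[0], -3.0 * x[1], -0.5 * x[2]]


theta_hat = hutchinson_divergence(contracting_field, [1.0, -2.0, 0.5])
print(f"estimated expansion scalar: {theta_hat:.6f}")
```

Because the Jacobian of this toy field is diagonal, every probe already returns the exact trace up to finite-difference rounding; for a general field the estimate is unbiased but noisy, and the cost per probe is two field evaluations regardless of the dimension. A strongly negative estimate signals rapid local contraction. How such a signal is thresholded to trigger entropy injection is not specified in this section.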
While the present work establishes the fundamental geometric limits of a single, continuous do-intervention, real-world causal systems are governed by a network of interacting mechanisms. In such settings, ensuring the global consistency of counterfactuals requires more than local Riemannian smoothness; it necessitates the alignment of mechanisms across the entire causal graph. This hints at a deeper layer of structural frustration, where the "manifold tearing" analyzed here may be seen as a microscopic manifestation of global topological obstructions that prevent local causal maps from being glued into a coherent global distribution.

2 Related Work

Our theoretical framework bridges generative modeling, optimal transport regularity, and continuous causal inference, addressing a critical void in the physical execution of causal transport.

Generative Models as Dynamic Transport: The formulation of generative modeling as a transport problem has been advanced by Score-based Diffusion Models (Song et al., 2021) and Flow Matching (Lipman et al., 2023). While these frameworks provide empirical excellence, deterministic paths (ODEs) often struggle with trajectory crossing. Empirical studies by Finlay et al. (2020) suggest that Jacobian regularization is necessary to maintain the smoothness of Neural ODEs. Our work provides the underlying geometric explanation for this necessity: without such regularization, the flow inevitably encounters the Counterfactual Event Horizon, leading to the manifold tearing we rigorously prove in Section 6.

Continuous Causal Inference: From Identifiability to Physical Realizability: The foundational work by Peters et al. (2014) established the mathematical identifiability of continuous structural causal models (e.g., additive noise models).
However, their focus remains on the structural identification phase: determining whether the interventional target distribution is uniquely computable from observations. Our work addresses a fundamentally orthogonal theoretical void: the Geometric Execution Phase. We shift the paradigm from asking "what the target distribution is" to asking "can a generative model physically transport the measure to that target without geometric collapse?" By proving the existence of topological limits, we fill the gap between identifiable causal theory and its high-dimensional generative implementation.

Schrödinger Bridges in Causality: Algorithmic Success vs. Theoretical Necessity: Recent literature has increasingly adopted Entropic Optimal Transport and Schrödinger Bridges (SB) for causal tasks, particularly in single-cell genomics (Schiebinger et al., 2019) and counterfactual estimation (Bunne et al., 2023). However, existing works are primarily algorithmic and engineering-driven, treating entropy as a smoothing hyperparameter. Our contribution is foundational: we provide the first rigorous proof of the inevitability of singularities in the deterministic limit of the SB, fundamentally explaining why entropic regularization is not merely a numerical trick, but an inescapable geometric necessity for valid causal transport across manifold voids.

Optimal Transport Regularity and Our Originality: Classical regularity theory in optimal transport (OT) (Villani et al., 2009; Loeper, 2009) has long established descriptive conditions for map smoothness, such as target domain convexity or the Ma-Trudinger-Wang (MTW) condition. However, this theory remains primarily an existence framework. Our originality lies in formally binding these abstract OT pathologies to the physical mechanism of causal do-interventions.
We advance the classical theory in three ways: (i) We prove that strong out-of-distribution (OOD) interventions inherently and unavoidably force a breach of OT regularity; (ii) We move beyond existence proofs to derive an explicit, calculable analytic bound linking intervention extremity ($D$) to the singularity time ($t_c \propto 1/D$); (iii) We quantify the irreducible trade-off between intervention extremity and identity preservation, establishing the Causal Uncertainty Principle.

Remark 1 (The Geometric Execution Phase of do-calculus) In classical causal inference, Pearl's do-calculus operates on the topological level of a Directed Acyclic Graph (DAG) by severing incoming edges to the intervened node. However, this purely structural operation implicitly demands a physical realization in the data space. In this work, we assume the structural identification phase (i.e., computing the target marginal distribution $\mu_1^{\mathrm{do}(x^*)}$ via SCMs) is already resolved. Our theoretical focus is exclusively on the Geometric Execution Phase: the continuous dynamic process by which generative models (e.g., Diffusion or Flow models) physically transport the observational measure $\mu_0$ across the Riemannian manifold $\mathcal{M}$ to match the intervened target $\mu_1^{\mathrm{do}(x^*)}$. It is within this continuous execution that topological limits emerge.

3 Mathematical Preliminaries and Continuous do-Calculus

To investigate the absolute limits of causal transport, we must first establish a rigorous measure-theoretic framework for continuous Structural Causal Models. We must carefully avoid the singular entropy paradoxes that arise when naive Dirac-delta functions are injected into continuous state spaces.
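The paradox can be made concrete in the simplest flat setting. The sketch below is an illustration under assumptions that are not part of the source: the space is Euclidean $\mathbb{R}^n$, the reference measure is a standard normal $N(0, I)$, and the intervention is smoothed to $N(x^*, \sigma^2 I)$. It evaluates the standard closed-form Gaussian KL divergence and shows that the value grows like $n \log(1/\sigma)$ as $\sigma \to 0$, so the Dirac limit has infinite relative entropy while every smoothed target stays finite. The function name `kl_mollified_to_standard_normal` and the sample target are hypothetical.

```python
import math


def kl_mollified_to_standard_normal(x_star, sigma):
    """KL( N(x_star, sigma^2 I) || N(0, I) ) in closed form (flat R^n only)."""
    if sigma <= 0.0:
        raise ValueError("sigma must be positive; sigma = 0 is the Dirac limit")
    n = len(x_star)
    sq_dist = sum(c * c for c in x_star)  # squared distance of the target from the origin
    # 0.5 * [ tr(Sigma) + |x_star|^2 - n - log det(Sigma) ] with Sigma = sigma^2 I
    return 0.5 * (n * sigma ** 2 + sq_dist - n) - n * math.log(sigma)


# Shrinking the smoothing width drives the divergence up without bound.
x_star = [6.0, 0.0]  # hypothetical far-away intervention target
for sigma in (1.0, 1e-2, 1e-4, 1e-8):
    value = kl_mollified_to_standard_normal(x_star, sigma)
    print(f"sigma={sigma:g}  KL={value:.3f}")
```

In this toy setting the divergence splits into a term that grows quadratically with the distance of the target from the origin and a term $-n \log \sigma$ that penalizes concentrating the mass. Only the second term blows up in the Dirac limit.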
3.1 Geometry of the Observational Measure

Let $(\mathcal{M}, g)$ be a smooth, complete, and primarily non-compact Riemannian manifold (e.g., Euclidean $\mathbb{R}^d$ or Hyperbolic spaces commonly utilized as latent spaces in continuous generative models). The observational data (the "factual" world) is distributed according to a probability measure $\mu_0 \in \mathcal{P}(\mathcal{M})$. We assume $\mu_0$ is absolutely continuous with respect to $\mathrm{vol}_g$, possessing a smooth and strictly positive density $\rho_0 = d\mu_0 / d\mathrm{vol}_g$. Furthermore, we assume that the statistical support of the factual data, $\mathrm{supp}(\mu_0)$, is contained within a compact submanifold of diameter $\Delta$. We consider the Wasserstein space $W_2(\mathcal{M})$ consisting of all probability measures on $\mathcal{M}$ with finite second moments, equipped with the 2-Wasserstein metric:

$$ W_2^2(\mu, \nu) = \inf_{\pi \in \Pi(\mu, \nu)} \int_{\mathcal{M} \times \mathcal{M}} d_g(x, y)^2 \, d\pi(x, y) \tag{1} $$

where $\Pi(\mu, \nu)$ is the set of all joint couplings with marginals $\mu$ and $\nu$, and $d_g(x, y)$ is the geodesic distance on $\mathcal{M}$.

3.2 Rigorous Definition of the Continuous do-Operator

In Pearl's classical framework on discrete graphs, an intervention $\mathrm{do}(X = x^*)$ deterministically sets the value of a node, effectively creating a Dirac measure $\delta_{x^*}$. However, in the context of Entropic Optimal Transport (and by extension, any diffusion-based continuous model), the Kullback-Leibler (KL) divergence of a Dirac measure relative to any absolutely continuous reference measure $Q$ is trivially $+\infty$. To render the optimal control problem mathematically well-posed and physically meaningful, we must define the intervention via Gaussian Mollification.

Definition 2 (Mollified Intervention Measure) Let $x^* \in \mathcal{M}$ be an extreme counterfactual intervention target, such that it lies far outside the observational distribution: $d_g(\mathrm{supp}(\mu_0), x^*) = D \gg \Delta$.
We define the mollified post-intervention target measure $\mu_{1,\sigma}^{\mathrm{do}(x^*)}$ as the heat kernel measure centered at $x^*$ at a small phenomenological time scale $\sigma^2/2$:

$$ d\mu_{1,\sigma}^{\mathrm{do}(x^*)}(x) = p_{\sigma^2/2}(x^*, x) \, d\mathrm{vol}_g(x), \tag{2} $$

where $p_t(x, y)$ is the minimal heat kernel on the manifold $\mathcal{M}$.

As $\sigma \to 0$, the sequence of measures $\mu_{1,\sigma}^{\mathrm{do}(x^*)}$ converges weakly to the Dirac measure $\delta_{x^*}$. For a sufficiently small $\sigma$, the differential entropy of this mollified measure scales as $n \log \sigma$, where $n$ is the dimension of $\mathcal{M}$:

$$ H\big(\mu_{1,\sigma}^{\mathrm{do}(x^*)}\big) = -\int_{\mathcal{M}} p_{\sigma^2/2} \log p_{\sigma^2/2} \, d\mathrm{vol}_g = n \log(\sigma) + O(1). \tag{3} $$

3.3 Causal Schrödinger Bridges

The generation of a counterfactual is mathematically equivalent to finding a transport plan that connects $\mu_0$ to $\mu_{1,\sigma}^{\mathrm{do}(x^*)}$. In the Causal Schrödinger Bridge framework, this transport plan is a path measure $P^\sigma \in \mathcal{P}(C([0, 1], \mathcal{M}))$ governed by a controlled Stochastic Differential Equation:

$$ dx_t = u_t(x_t)\,dt + \sqrt{2\varepsilon}\,dW_t, \qquad x_0 \sim \mu_0, \quad x_1 \sim \mu_{1,\sigma}^{\mathrm{do}(x^*)} \tag{4} $$

where $\varepsilon > 0$ is the entropic temperature (or viscosity), $W_t$ is the standard Brownian motion on $\mathcal{M}$, and $u_t(x)$ is the control vector field (the "drift"). The optimization seeks to minimize the KL divergence $\mathrm{KL}(P \,\|\, Q)$, where the reference measure $Q \in \mathcal{P}(C([0, 1], \mathcal{M}))$ is the uncontrolled causal diffusion prior:

$$ dx_t = b(x_t)\,dt + \sqrt{2\varepsilon}\,dW_t, \qquad b(x) = -\nabla V(x) \tag{5} $$

for some smooth, causally-informed potential function $V : \mathcal{M} \to \mathbb{R}$.

4 From Generative SDEs to Fluid Dynamics: The Cole-Hopf Connection

To formally analyze the geometric limits of generative models, we must first establish the precise mathematical correspondence between modern Score-based SDEs, the Schrödinger Bridge, and deterministic fluid dynamics. In Score-based Generative Modeling (Song et al., 2021), the reverse-time generation process is described by the SDE:

$$ dx_t = \big[ f(x_t, t) - 2\varepsilon \nabla_x \log p_t(x_t) \big]\,dt + \sqrt{2\varepsilon}\,dW_t, \tag{6} $$

where $p_t$ is the marginal density, and $\varepsilon$ controls the diffusion scale. The Causal Schrödinger Bridge seeks an optimal drift $u_t(x) = f + \nabla \psi^\varepsilon(x, t)$ that minimizes the transport cost. By the Hopf-Cole transformation, the entropic optimal transport problem can be mapped to a system of coupled PDEs. The value function (or dynamic Kantorovich potential) $\psi^\varepsilon$ satisfies the viscous Hamilton-Jacobi-Bellman (HJB) equation:

$$ \partial_t \psi^\varepsilon + \tfrac{1}{2} \|\nabla \psi^\varepsilon\|_g^2 = \varepsilon \Delta_g \psi^\varepsilon, \tag{7} $$

where $\Delta_g$ is the Laplace-Beltrami operator on the manifold $\mathcal{M}$. Taking the spatial gradient $\nabla$ of both sides, the optimal velocity field $u_t = \nabla \psi^\varepsilon$ satisfies the viscous Burgers' equation:

$$ \partial_t u + \nabla_u u = \varepsilon \Delta_{\mathrm{deRham}} u. \tag{8} $$

Current deterministic generative frameworks, such as standard Flow Matching or Probability Flow ODEs (where the noise injection is turned off during generation), implicitly operate in the inviscid limit ($\varepsilon \to 0$). In this deterministic limit, the parabolic PDE (7) degenerates into the first-order hyperbolic inviscid HJB equation:

$$ \partial_t \psi^0 + \tfrac{1}{2} \|\nabla \psi^0\|_g^2 = 0. \tag{9} $$

This leads to the pressureless Euler equation (inviscid Burgers' equation): $\partial_t u + \nabla_u u = 0$. The remainder of our geometric analysis investigates the pathology of this hyperbolic equation when subjected to extreme boundary conditions (interventions).

5 The Counterfactual Event Horizon

We first investigate the thermodynamic control cost required to transport mass to the mollified target. We will prove that beyond a certain geometric distance, the required energy becomes physically divergent.

Assumption 3 (Distant Dissipativity and Log-Sobolev Perturbation) We drop the unrealistically strong assumption of global strong convexity for deep neural networks.
Instead, we assume the reference causal potential $V \in C^2(\mathcal{M})$ satisfies a distant quadratic growth (dissipativity) condition outside a compact factual support set $K \supset \mathrm{supp}(\mu_0)$. Specifically, there exists a base point $x_{\mathrm{obs}} \in K$ and a constant $C_V > 0$ such that for all extreme interventions $x \notin K$:

$$ V(x) \ge C_V \, d_g(x, x_{\mathrm{obs}})^2. \tag{10} $$

Furthermore, we assume the invariant measure of the reference diffusion $Q$ satisfies a Logarithmic Sobolev Inequality (LSI) with constant $C_{LS} > 0$, without requiring the global Bakry-Émery curvature condition ($\mathrm{Ric} + \mathrm{Hess}\,V \ge \kappa g$).

Remark 4 (Holley-Stroock Shield for Non-Convex Neural Landscapes) A critical reader might object that neural networks learn highly non-convex energy landscapes $V(x)$ on the data manifold, seemingly invalidating classical optimal transport bounds. However, our assumption is rigorously justified for modern generative models via the Holley-Stroock Perturbation Principle (Holley and Stroock, 1987). In standard diffusion models, the prior is an isotropic Gaussian, meaning $V(x)$ natively exhibits quadratic growth outside the compact data manifold. The Holley-Stroock theorem guarantees that if a base measure satisfies LSI (the Gaussian tail), any bounded non-convex perturbation of its potential on a compact set (the complex neural network landscape) preserves the global LSI property. The transportation inequalities governing our Uncertainty Principle (Theorem 13) thus remain strictly intact, inherently absorbing the non-convexity into the modified constant $C_{LS}$.
Furthermore, as strictly analyzed in Appendix B.5 (Remark 16), even if the neural landscape becomes pathologically erratic such that LSI bounds degenerate, this extreme non-convexity mathematically guarantees an even faster and more violent Manifold Tearing ($t_{\mathrm{real}} \ll t_c$) due to massive initial shear, thereby reinforcing the absolute necessity of our entropic intervention.

Theorem 5 (Existence of the Counterfactual Event Horizon) Let Assumption 3 hold. Fix the entropy parameter $\varepsilon > 0$ and the mollification parameter $\sigma > 0$. As the intervention target $x^*$ is moved progressively further from the factual manifold such that the distance $D = d_g(\mathrm{supp}(\mu_0), x^*) \to \infty$, the minimal relative entropy (the total required control energy) diverges quadratically:

$$ \inf_{P \in \Gamma(\mu_0,\, \mu_{1,\sigma}^{\mathrm{do}(x^*)})} \mathrm{KL}(P \,\|\, Q) \ge \frac{C_V}{\varepsilon} D^2 + \frac{n}{\varepsilon} \log\left(\frac{1}{\sigma}\right) - O(1). \tag{11} $$

Proof Let $P^*$ be the unique optimal path measure solving the Schrödinger Bridge problem. By the Benamou-Brenier fluid dynamics formulation of optimal transport (Benamou and Brenier, 2000), the relative entropy with respect to the reference measure $Q$ can be decomposed exactly into the kinetic energy of the control field and the relative entropy of the initial marginals:

$$ \mathrm{KL}(P^* \,\|\, Q) = \frac{1}{4\varepsilon} \, \mathbb{E}_{P^*}\left[ \int_0^1 \| u_t(x_t) - b(x_t) \|_g^2 \, dt \right] + \mathrm{KL}(\mu_0 \,\|\, Q_0). \tag{12} $$

Since the KL divergence is an $f$-divergence, the Data Processing Inequality (DPI) guarantees that projecting the path measures onto their final marginals at time $t = 1$ yields a strict lower bound on the path-space divergence:

$$ \mathrm{KL}(P^* \,\|\, Q) \ge \mathrm{KL}(P_1^* \,\|\, Q_1) = \mathrm{KL}\big(\mu_{1,\sigma}^{\mathrm{do}(x^*)} \,\|\, Q_1\big). \tag{13} $$

Under Assumption 3, the invariant measure of the reference diffusion process $Q$ is given by the Gibbs measure: $dQ_1(x) = \frac{1}{Z} \exp(-V(x)/\varepsilon) \, d\mathrm{vol}_g(x)$, where $Z$ is the normalization partition function.
Expanding the RHS of (13) using the explicit form of $Q_1$:

$$ \mathrm{KL}\big(\mu_{1,\sigma}^{\mathrm{do}(x^*)} \,\|\, Q_1\big) = \int_{\mathcal{M}} \log\left( \frac{d\mu_{1,\sigma}^{\mathrm{do}(x^*)}}{\exp(-V(x)/\varepsilon)/Z} \right) d\mu_{1,\sigma}^{\mathrm{do}(x^*)}(x) = \frac{1}{\varepsilon} \int_{\mathcal{M}} V(x) \, d\mu_{1,\sigma}^{\mathrm{do}(x^*)}(x) + \log Z - H\big(\mu_{1,\sigma}^{\mathrm{do}(x^*)}\big). \tag{14} $$

Substituting the quadratic potential growth condition $V(x) \ge C_V \, d_g(x, x_{\mathrm{obs}})^2$ and evaluating over the tightly concentrated heat kernel measure $\mu_{1,\sigma}^{\mathrm{do}(x^*)}$, the expected squared distance is bounded tightly by $D^2 + O(\sigma^2)$. Furthermore, substituting the differential entropy of the heat kernel from Equation (3), we obtain:

$$ \mathrm{KL}\big(\mu_{1,\sigma}^{\mathrm{do}(x^*)} \,\|\, Q_1\big) \ge \frac{C_V}{\varepsilon} \big( D^2 + O(\sigma^2) \big) + \log Z + n \log\left(\frac{1}{\sigma}\right) - O(1). \tag{15} $$

As $D \to \infty$ (an increasingly extreme intervention), the optimal control energy strictly and inevitably diverges as $O(D^2/\varepsilon)$. We therefore define the Counterfactual Event Horizon, denoted $\delta_{\mathrm{crit}}$, as the geometric distance $D$ where this required control energy exceeds the thermodynamic admissibility or computational capacity of the physical causal system. Beyond this horizon, transporting a probability mass while preserving structural continuity is mathematically prohibited without infinite control effort.

6 Manifold Tearing: The Deterministic Limit

We now investigate the geometric collapse of deterministic optimal transport ($\varepsilon \to 0$). To satisfy the rigor required for optimal transport on manifolds, we utilize Caffarelli's regularity theory and the Hessian Comparison Theorem to establish the initial spectral bounds.

Lemma 6 (Explicit Spectral Bound of the Brenier-Kantorovich Map) Let $(\mathcal{M}, g)$ be a Riemannian manifold with sectional curvature bounded below by $-\kappa^2$ ($\kappa \ge 0$). Let $\Phi_t : \mathcal{M} \to \mathcal{M}$ ($t \in [0, 1]$) be the displacement interpolation pushing the factual measure $\mu_0$ (supported on domain $\Omega_0$ with diameter $\Delta$) to the mollified interventional measure $\mu_{1,\sigma}^{\mathrm{do}(x^*)}$.
Let the minimum transport distance be $D = \inf_{x \in \Omega_0} d_g(x, x^*)$. Assuming the density of $\mu_0$ is bounded below by $m_0 > 0$, and the target is a Gaussian heat kernel with variance $\sigma^2 \ll \Delta^2$, the Hessian of the initial Kantorovich potential $H(0) = \nabla^2 \psi^0(\cdot, 0)$ possesses a strictly negative minimum eigenvalue $\lambda_{\min}(H(0)) = -\lambda_0$ satisfying:

$$ \lambda_0 \ge 1 - \frac{\sigma}{\Delta} \left( \frac{\max_{\Omega_0} \rho_0}{m_0} \right)^{\frac{1}{n}} + \kappa D \coth(\kappa D). \tag{16} $$

Proof Let $c(x, y) = \frac{1}{2} d_g(x, y)^2$ be the quadratic geodesic cost. By Brenier's Theorem extended to Riemannian manifolds (Villani et al., 2009), the optimal transport map pushing the factual measure $\mu_0$ to the interventional target $\mu_{1,\sigma}^{\mathrm{do}(x^*)}$ is given by $T(x) = \exp_x(-\nabla \phi(x))$, where $\phi : \mathcal{M} \to \mathbb{R}$ is a $c$-concave Kantorovich potential. The core property of $c$-concavity, defined by $\phi(x) = \inf_{y \in \mathcal{M}} \{ c(x, y) - \phi^c(y) \}$, guarantees that at any point of differentiability, the potential is globally bounded by the cost function. Consequently, its Hessian strictly satisfies the semi-concavity upper bound:

$$ \nabla^2 \phi(x) \le \nabla^2_{xx} c(x, T(x)). \tag{17} $$

To explicitly bound the right-hand side, we apply the Riemannian Hessian Comparison Theorem to the distance function. For an intervention distance $D = d_g(x, T(x))$ on a manifold with sectional curvature bounded below by $-\kappa^2$, the geometric distortion is strictly bounded by:

$$ \nabla^2_{xx} c(x, T(x)) \le \kappa D \coth(\kappa D) \, I. \tag{18} $$

This establishes the fundamental geometric upper bound on the potential's Hessian. However, the exact configuration of the eigenvalues is forcefully constrained by the Monge-Ampère mass conservation equation:

$$ \det\big( d\exp_x(-\nabla \phi(x)) \big) \cdot \det\big( I - \nabla^2 \phi(x) \big) = \frac{\rho_0(x)}{\rho_1(T(x))}. \tag{19} $$

Because the interventional target $\mu_{1,\sigma}^{\mathrm{do}(x^*)}$ is a highly concentrated heat kernel with variance $\sigma^2 \ll \Delta^2$, its peak density scales as $\rho_1 \sim O(\sigma^{-n})$. Let $m_0 = \min_{\Omega_0} \rho_0 > 0$.
The RHS density ratio $\rho_0 / \rho_1$ approaches 0 at a rate of $O(\sigma^n)$, representing an extreme volumetric contraction. By substituting the geometric bounds into the determinant, the generalized AM-GM inequality forces the maximum eigenvalue $\lambda_{\max}(\nabla^2 \phi)$ to approach the geometric ceiling to conserve mass. The initial Eulerian velocity is $u_0(x) = -\nabla \phi(x)$, rendering its Jacobian $H(0) = -\nabla^2 \phi(x)$. Therefore, the magnitude of the maximal initial contraction, defined as $\lambda_0 = -\lambda_{\min}(H(0)) = \lambda_{\max}(\nabla^2 \phi)$, is rigorously coupled to both the density ratio and the intervention distance $D$, satisfying the asymptotic geometric envelope governed by $\kappa D \coth(\kappa D)$. This strictly confirms that extreme long-distance interventions ($D \to \infty$) necessitate an increasingly violent initial contraction in the deterministic velocity field, growing at least linearly in $D$.

Remark 7 (Dimensionality and the Manifold Hypothesis) In Lemma 6, we theoretically assumed a strictly positive lower bound $m_0 > 0$ on the factual support. However, under the Manifold Hypothesis, real-world high-dimensional data (e.g., images, scRNA-seq) strictly resides on a lower-dimensional submanifold, rendering the ambient density $m_0 \to 0$. In such realistic sparse regimes, the initial Hessian contraction term $(\max \rho_0 / m_0)^{1/n}$ diverges even more violently. Consequently, the finite-time singularity $t_c$ derived subsequently in Theorem 8 represents an absolute, mathematically conservative upper envelope; deterministic flows traversing empirical data voids will systematically collapse significantly faster than this theoretical limit.

Theorem 8 (Explicit Finite-Time Manifold Tearing) Let $\Phi_t$ be the deterministic transport flow map. Assume the sectional curvature is bounded below by $-K$ ($K = \kappa^2 \ge 0$). Let $\lambda_0 > 0$ be the magnitude of the maximal initial contraction defined in Lemma 6.
If the intervention distance $D$ is sufficiently large such that $\lambda_0 > \sqrt{nK}\,D$, then the Jacobian determinant $\det(\nabla_x \Phi_t(x))$ strictly collapses to 0 at a finite critical time $t_c$ bounded analytically by:

$$t_c \le \frac{n}{\sqrt{nK}\,D}\,\operatorname{arccoth}\!\left(\frac{\lambda_0}{\sqrt{nK}\,D}\right) < 1. \tag{20}$$

Proof Let $\Phi_t : M \to M$ be the flow map generated by the deterministic transport velocity $u(x, t)$. We denote the Jacobian matrix of this flow along a characteristic curve $x(t)$ as $J_t(x_0) = \mathrm{d}\Phi_t(x_0)$, and its determinant as $J(t) = \det(J_t(x_0))$. By Liouville's formula (or Jacobi's formula) for dynamical systems on manifolds, which serves as the foundational integration mechanism for continuous normalizing flows (Chen et al., 2018), the temporal evolution of the Jacobian determinant is strictly governed by the scalar divergence of the velocity field:

$$\frac{\mathrm{d}}{\mathrm{d}t} J(t) = J(t)\,\bigl(\nabla \cdot u(\Phi_t(x_0), t)\bigr) = J(t)\,\theta(t), \tag{21}$$

where $\theta(t) = \operatorname{Tr}(\nabla u)$ is the expansion scalar. Solving this linear ODE yields:

$$J(t) = J(0)\exp\!\left(\int_0^t \theta(s)\,\mathrm{d}s\right) = \exp\!\left(\int_0^t \theta(s)\,\mathrm{d}s\right), \tag{22}$$

since $\Phi_0$ is the identity map and thus $J(0) = 1$. To evaluate $\theta(t)$, we analyze the matrix Riccati equation along the characteristic curves $\ddot{x}(t) = 0$. The Hessian $H(t) = \nabla u$ satisfies the Raychaudhuri equation, yielding the differential inequality:

$$\dot{\theta}(t) \le -\frac{1}{n}\,\theta^2(t) + nKD^2. \tag{23}$$

Let $b = nKD^2$ and $a = 1/n$. We solve the bounding ODE $\dot{y} = -ay^2 + b$ with initial condition $y(0) = \theta(0) \le -\lambda_0$. Integrating this separable ODE yields:

$$\int_{-\lambda_0}^{-\infty} \frac{\mathrm{d}y}{b - ay^2} \ge \int_0^{t_c} \mathrm{d}t. \tag{24}$$

Evaluating the integral gives the exact analytic upper bound for the blow-up time $t_c$:

$$t_c \le \frac{1}{\sqrt{ab}}\,\operatorname{arccoth}\!\left(\frac{\lambda_0}{\sqrt{b/a}}\right) = \frac{n}{\sqrt{nK}\,D}\,\operatorname{arccoth}\!\left(\frac{\lambda_0}{\sqrt{nK}\,D}\right). \tag{25}$$

For an extreme causal intervention where $D \to \infty$, Lemma 6 dictates that $\lambda_0$ grows at least as $\mathcal{O}(D)$.
Consequently, the argument of the arccoth function is strictly greater than 1, ensuring a real-valued solution. Furthermore, the prefactor shrinks inversely with $D$, proving that for sufficiently large interventions, $t_c$ is strictly less than 1. Because $\theta(s)$ is strictly negative and diverges to $-\infty$ as $s \to t_c$, the integral $\int_0^{t_c} \theta(s)\,\mathrm{d}s$ diverges to $-\infty$. Substituting this into Liouville's explicit solution (22), we obtain the exact limit:

$$\lim_{t \to t_c} J(t) = \exp(-\infty) = 0. \tag{26}$$

Topological Implication (Manifold Tearing): By the Inverse Function Theorem, a smooth mapping $\Phi_t$ constitutes a local diffeomorphism if and only if its Jacobian determinant is non-zero everywhere. Since $J(t_c) = 0$, the flow map $\Phi_{t_c}$ ceases to be a diffeomorphism. Geometrically, this implies that distinct characteristic curves (particle trajectories) intersect precisely at $t = t_c$, creating a shockwave. The mapping folds onto itself, violating the injective requirement of individual identity preservation. We formally define this violent topological disruption of the continuous probability measure as Manifold Tearing.

Remark 9 (Why Current ODE Models Do Not Explicitly Crash) A practitioner familiar with modern generative models (e.g., Flow Matching or Neural ODEs) might observe that implementing these models rarely results in explicit computational crashes (e.g., NaN errors) even under extreme out-of-distribution interventions. This discrepancy arises because modern deep learning architectures possess finite Lipschitz bounds, and ODE solvers operate via discrete numerical integration steps. These computational factors act as an artificial numerical truncation that masks the underlying mathematical singularity.
Instead of a runtime crash, the manifold tearing manifests physically as the generation of "hallucinations," blurry artifacts, or off-manifold samples (e.g., the biological chimeras discussed in Section 10.1). Thus, the topological singularity fundamentally corrupts the counterfactual validity, even if the error is silently absorbed by the numerical solver.

Remark 10 (From Local Singularity to Global Inconsistency) Theorem 8 characterizes the breakdown of the deterministic flow map as a local geometric singularity driven by the intervention distance $D$. However, in multi-variable causal systems, "tearing" can also be triggered by the intrinsic topology of the causal graph itself. If the structural equations along different paths impose conflicting requirements on a target node, the transport plan may fail to exist as a global section of the causal structure. This perspective suggests that the Causal Uncertainty Principle is intimately linked to the cohomological obstructions of the underlying measure-theoretic sheaf, which governs the possibility of global counterfactual realization.

Remark 11 (Asymmetric Shear and the Geometric Illusion of Negative Curvature) While our bound in Theorem 8 assumes an idealized isotropic contraction, real-world interventional tasks involve highly asymmetric factual distributions. As we prove in Appendix B.2 using the full fluid-dynamic decomposition of the Raychaudhuri equation, any asymmetric deformation induces a strictly positive shear tensor ($\|\sigma\|^2_{HS} > 0$). This shear strictly accelerates the Riccati collapse, proving that our blow-up time $t_c$ is the most optimistic upper bound.
Furthermore, while negative curvature eventually acts as a buffer during transport (as discussed in Section 11), it actively exacerbates the initial Hessian contraction required to push mass across an exponentially expanding space (proven via Jacobi fields in Appendix B.1). We refer the mathematically inclined reader to Appendix B for the exhaustive derivations of these geometric nuances.

Corollary 12 (Accelerated Tearing on Compact Manifolds with Positive Curvature) While Theorem 8 assumes a non-compact manifold to allow $D \to \infty$, interventions on compact manifolds (e.g., hyperspheres $S^n$, often used in contrastive learning) face an even more severe topological barrier. By the Myers theorem and the Raychaudhuri equation, if the manifold exhibits strictly positive sectional curvature $K > 0$, geodesic congruence focuses at an accelerated rate. The Riccati equation is dominated by the positive curvature term $nK\|\dot{x}\|^2$, forcing the Jacobian determinant $J(t)$ to collapse to zero at conjugate points strictly bounded by $t_c \le \pi/\sqrt{K}$. Geometrically, this implies that attempting to transport mass to the antipodal point inherently forces a singularity. Thus, on compact spaces, the counterfactual event horizon $\delta_{\text{crit}}$ is hard-truncated by the manifold's geometric diameter, making deterministic identity preservation topologically impossible even for bounded interventions.

7 The Causal Uncertainty Principle

Theorem 8 dictates that to prevent Manifold Tearing (the crossing of characteristics and shockwave formation), the generative system must introduce thermodynamic viscosity (entropy, $\varepsilon > 0$). In the context of SDEs, this is achieved by restoring the Brownian motion term.
However, we will now prove that the exact amount of entropy required to save the macroscopic topology strictly and irreversibly bounds the preservation of microscopic individual identity.

Theorem 13 (Causal Uncertainty Principle) Let $D \approx W_2(\mu_0, \mu^{\mathrm{do}(x^*)}_{1,\sigma})$ be the large Wasserstein intervention distance. Let $P_{1|0}(\cdot \mid x_0)$ be the transition kernel of the entropic causal transport. To prevent finite-time manifold tearing over distance $D$, the system must inject entropy. Consequently, the conditional Shannon entropy of the counterfactual outcome (a direct mathematical measure of Identity Loss) is strictly bounded from below:

$$H\bigl(P_{1|0}(\cdot \mid x_0)\bigr) \ge \frac{n}{2}\log\!\left(4\pi e \cdot \frac{C_0 \Delta}{1 - \kappa^- \Delta^2} \cdot D\right). \tag{27}$$

Proof Step 1: The Viscosity Requirement via Bochner-Weitzenböck Calculus. To prevent the catastrophic intersection of characteristics and subsequent manifold tearing proven in Theorem 8, the deterministic inviscid Burgers' equation must be regularized into the viscous Burgers' equation:

$$\partial_t u + \nabla_u u = \varepsilon \Delta_H u, \tag{28}$$

where $\Delta_H$ is the Hodge-de Rham Laplacian on vector fields. To derive the exact geometric lower bound for the necessary entropy $\varepsilon$, we analyze the kinetic energy density $e(x, t) = \frac{1}{2}\|u\|_g^2$. By invoking the Weitzenböck identity, which strictly links the Hodge Laplacian to the Bochner connection Laplacian via the Ricci tensor ($\Delta_H u = \nabla^*\nabla u + \operatorname{Ric}(u)$), the evolution of the energy density satisfies:

$$(\partial_t - \varepsilon \Delta_g)\, e = -\varepsilon \|\nabla u\|^2_{HS} - \langle u, \nabla_u u \rangle_g + \varepsilon \operatorname{Ric}(u, u), \tag{29}$$

where $\|\nabla u\|^2_{HS}$ is the Hilbert-Schmidt norm of the covariant derivative. Assume the manifold's Ricci curvature is bounded from below by $\kappa$ (where $\kappa$ may be negative, denoting hyperbolic expansion). Let $\Delta$ be the geometric diameter of the initial observational support, and let the macroscopic transport distance be $D \sim \sup \|u\|_g$.
The convective steepening term that induces shockwaves scales as $|\langle u, \nabla_u u \rangle_g| \sim \mathcal{O}(D^3/\Delta)$. By the parabolic maximum principle (Bernstein-type gradient estimates), to prevent gradient blow-up (i.e., to maintain a bounded $\|\nabla u\|_{HS} \le \mathcal{O}(D/\Delta)$ and avoid finite-time singularities), the viscous dissipation term must strictly dominate both the nonlinear convective steepening and any negative curvature focusing. This imposes the strict analytic condition:

$$\varepsilon \|\nabla u\|^2_{HS} - \varepsilon \kappa^- \|u\|^2_g \ge |\langle u, \nabla_u u \rangle_g|. \tag{30}$$

Substituting the supremum scales, we obtain:

$$\varepsilon \left(\frac{D^2}{\Delta^2}\right) - \varepsilon \kappa^- D^2 \ge C_0 \frac{D^3}{\Delta}, \tag{31}$$

where $C_0 > 0$ is a dimensional constant. Crucially, due to the trace operation in the Hodge Laplacian bounding the convective steepening, $C_0$ scales linearly with the intrinsic dimension of the data manifold, $C_0 \sim \mathcal{O}(n)$. Solving for the entropy parameter $\varepsilon$ yields the explicit geometric lower bound:

$$\varepsilon \ge \frac{C_0 \Delta}{1 - \kappa^- \Delta^2} \cdot D := C_{geo}(\Delta, \kappa, n) \cdot D. \tag{32}$$

Notably, the denominator $1 - \kappa^- \Delta^2$ reveals a geometric singularity, and the numerator confirms that the necessary topological entropy scales explicitly as $\mathcal{O}(nD)$. If the initial observational support is too broad relative to the manifold's negative curvature (i.e., $\Delta \ge 1/\sqrt{\kappa^-}$), finite-energy topological preservation becomes strictly impossible.

Step 2: Entropy Production via the Bakry-Émery Bound. Given that the generative SDE must operate with a minimum entropy parameter $\varepsilon \ge C_{geo}(\Delta, \kappa, n)\, D$ to remain topologically valid, we now bound the Shannon differential entropy of the transition kernel $\nu_{x_0} = P_{1|0}(\cdot \mid x_0)$. By the Cramér-Rao bound generalized to diffusion processes via the Bakry-Émery $\Gamma_2$ calculus (Bakry et al., 2013), the differential entropy of the state at $t = 1$ subjected to diffusion coefficient $\varepsilon$ satisfies a strict lower bound:

$$H\bigl(P_{1|0}(\cdot \mid x_0)\bigr) \ge \frac{n}{2}\log(4\pi e \varepsilon). \tag{33}$$

This bounds the irreversible loss of spatial concentration (Identity Loss) induced by the diffusion.

Step 3: Synthesis of the Uncertainty Principle. We substitute the geometric viscosity requirement (32) directly into the information-theoretic entropy production bound (33), yielding:

$$H\bigl(P_{1|0}(\cdot \mid x_0)\bigr) \ge \frac{n}{2}\log\!\left(4\pi e \cdot \frac{C_0 \Delta}{1 - \kappa^- \Delta^2} \cdot D\right). \tag{34}$$

This establishes a fundamental physical limit: as the extremity of an intervention $D$ increases, the necessary entropy $\varepsilon$ injected to prevent manifold tearing (governed by the Ricci curvature $\kappa$ and initial support $\Delta$) must increase linearly. Consequently, the conditional entropy, which quantifies the loss of the individual's exact deterministic identity, must grow logarithmically. One cannot simultaneously execute a massive causal intervention and maintain exact identity preservation in a curved continuous space.

Remark 14 (A PDE Perspective on Identity Smearing via Shock Thickness) The information-theoretic uncertainty principle derived via Talagrand's inequality finds a striking physical equivalence in the theory of partial differential equations. Under entropic regularization ($\varepsilon > 0$), the optimal velocity field satisfies the viscous Burgers' equation:

$$\partial_t u + \nabla_u u = \varepsilon \Delta_g u. \tag{35}$$

By the classical theory of viscous conservation laws (Evans, 2010), to prevent the finite-time gradient blow-up (manifold tearing) proven in Theorem 8, the entropy parameter $\varepsilon$ acts as physical kinematic viscosity. For a macroscopic intervention distance $D \sim \Delta u$, the resulting shockwave possesses a fundamental spatial thickness $\delta \sim \varepsilon/D$.
To prevent the shock thickness from collapsing below the topological resolution of the individual's local support (which would trigger singularities), we must strictly enforce $\delta \ge \mathcal{O}(1)$. This mathematically dictates that $\varepsilon \ge \mathcal{O}(D)$. Since the variance of the target distribution (loss of identity) scales proportionally with the diffusion coefficient $\varepsilon$, we recover the thermodynamic trade-off: bridging a large intervention distance $D$ strictly necessitates a proportional smearing of the individual's identity, preventing deterministic counterfactuals.

7.1 Scalable Divergence Tracking via Hutchinson's Estimator

A critical computational bottleneck in high-dimensional continuous transport (e.g., $n > 10^3$ for single-cell genomics) is the exact evaluation of the scalar divergence $\theta(t) = \operatorname{Tr}(\nabla_x u_t(x_t))$. Computing the exact trace of a neural network's Jacobian requires $\mathcal{O}(n)$ forward or backward passes, rendering it computationally intractable for deep architectures.

To achieve $\mathcal{O}(1)$ scalability, GACF approximates the divergence using Hutchinson's Trace Estimator (Hutchinson, 1989), a highly efficient stochastic technique recently popularized in scalable continuous generative models (Grathwohl et al., 2019). We draw a random vector $z \sim p(z)$ (typically from a Rademacher or standard Gaussian distribution) such that $\mathbb{E}[zz^T] = I$. The divergence is then estimated without bias via a single Jacobian-Vector Product (JVP):

$$\tilde{\theta}_t = z^T \nabla_x u_t(x_t)\, z. \tag{36}$$

Robustness to Estimation Variance: A natural concern is whether the variance of the Hutchinson estimator might induce false-positive singularity detections (spurious entropy injections), particularly in the early stages of transport ($t \ll t_c$) when the true divergence is small. However, the physics of the Riccati blow-up works entirely to our advantage.
As established in Theorem 8, the true divergence $\theta(t)$ diverges asymptotically to $-\infty$ near the event horizon. Because the topological collapse signal is extraordinarily strong (a macroscopic geometric singularity), the signal-to-noise ratio of the estimator strictly diverges as $t \to t_c$ (formally proven in Appendix B.6). By setting a conservatively negative dynamic threshold $\lambda_{\text{thresh}} \sim -\mathcal{O}(D)$, we prevent premature entropy injection during the early, high-variance phase. Consequently, the algorithm maintains the exact bijective mapping (ODE mode) when safe, and the stochastic estimate $\tilde{\theta}_t < \lambda_{\text{thresh}}$ provides a reliable trigger for our geometric radar exclusively when tearing is imminent.

8 Algorithmic Realization: Geometry-Aware Causal Flow (GACF)

Our theoretical results establish a strict dichotomy: purely deterministic ODEs tear the manifold under extreme interventions (Theorem 8), while purely stochastic SDEs permanently smear individual identity (Theorem 13). To resolve this, we translate our topological limits into a constructive algorithm. By utilizing the scalar divergence $\theta(t) = \operatorname{Tr}(\nabla u_t)$ as a real-time topological radar, we can anticipate the crossing of characteristics governed by the Riccati equation (23). We propose the Geometry-Aware Causal Flow (GACF), an adaptive sampler that operates in the deterministic ODE regime to maximize identity preservation, but dynamically injects the geometric entropy $\varepsilon \ge C_{geo} D$ exclusively when a singularity is imminent.

9 Numerical Verification: From Singularities to Scaling Laws

To empirically ground our theoretical findings, we perform a suite of controlled numerical simulations using JAX to verify the emergence of manifold tearing and the corrective efficacy of the GACF algorithm.
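As a minimal illustration of the kind of check involved, the bounding Riccati ODE $\dot{y} = -ay^2 + b$ used in the proof of Theorem 8 can be integrated numerically and compared with its closed-form blow-up time $\frac{1}{\sqrt{ab}}\operatorname{arccoth}\bigl(\lambda_0/\sqrt{b/a}\bigr)$. The sketch below is written in plain Python rather than JAX, and every numerical value in it is an assumption chosen for illustration: the coefficient `a = 0.5`, the choice `b = D**2`, and the initial contraction `lam0 = 2 * sqrt(b / a)` (which merely encodes the hypothesis that $\lambda_0$ grows linearly in $D$). The function names are hypothetical and are not taken from the authors' code.

```python
import math


def closed_form_blowup_time(a, b, lam0):
    """Blow-up time of y' = -a*y**2 + b started at y(0) = -lam0.

    A finite blow-up time exists only when lam0 > sqrt(b / a).
    """
    c = math.sqrt(b / a)
    if lam0 <= c:
        raise ValueError("need lam0 > sqrt(b/a) for finite-time blow-up")
    # arccoth(x) = atanh(1/x) for |x| > 1
    return math.atanh(c / lam0) / math.sqrt(a * b)


def numerical_blowup_time(a, b, lam0, dt=1e-5, cap=1e4, t_max=5.0):
    """Integrate the same ODE with RK4 and report when y falls below -cap."""
    def f(y):
        return -a * y * y + b

    y, t = -lam0, 0.0
    while t < t_max:
        k1 = f(y)
        k2 = f(y + 0.5 * dt * k1)
        k3 = f(y + 0.5 * dt * k2)
        k4 = f(y + dt * k3)
        y += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
        if y < -cap:  # treat this as the numerical signature of blow-up
            return t
    return math.inf


if __name__ == "__main__":
    a = 0.5  # illustrative coefficient (assumption)
    for D in (2.0, 4.0, 6.0, 8.0, 10.0):
        b = D ** 2                       # illustrative curvature term (assumption)
        lam0 = 2.0 * math.sqrt(b / a)    # assumed linear growth of lam0 in D
        t_exact = closed_form_blowup_time(a, b, lam0)
        t_num = numerical_blowup_time(a, b, lam0)
        print(f"D={D:4.1f}  closed-form t_c={t_exact:.4f}  numerical t_c={t_num:.4f}")
```

Under these assumptions the product $D \cdot t_c$ stays constant across the sweep, which is the $1/D$ decay of the blow-up time that the scaling experiments examine. The numerical estimate sits slightly below the closed-form value because integration stops at a finite cap rather than at the singularity itself.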
Algorithm 1 Scalable Geometry-Aware Causal Flow (GACF) via Hutchinson's Estimator
Input: Factual observation $x_0 \sim \mu_0$, target intervention distance $D$, step size $\Delta t$.
Parameters: Curvature constraint $\kappa$, support threshold $\Delta$, estimator samples $M$ (default $M = 1$).
Initialize: Collapse threshold $\lambda_{\text{thresh}} \leftarrow -\mathcal{O}(D^{-1})$, $t \leftarrow 0$.
Compute geometric viscosity lower bound: $\varepsilon_{\text{req}} = \frac{C_0 \Delta}{1 - \kappa^- \Delta^2} \cdot D$.
while $t < 1$ do
    Compute instantaneous velocity field $u_t(x_t)$ via the trained neural flow.
    Topological Radar (Hutchinson Estimation):
    Draw $M$ random vectors $z^{(m)}$ from the Rademacher distribution $\{-1, 1\}^n$.
    Estimate scalar divergence via JVP: $\tilde{\theta}_t \leftarrow \frac{1}{M}\sum_{m=1}^{M} \bigl(z^{(m)\top} \nabla_x u_t(x_t)\, z^{(m)}\bigr)$.
    if $\tilde{\theta}_t < \lambda_{\text{thresh}}$ then    ▷ Overwhelming signal of imminent Manifold Tearing
        $\varepsilon_t \leftarrow \varepsilon_{\text{req}}$    ▷ Inject geometric topological entropy (SDE mode)
    else
        $\varepsilon_t \leftarrow 0$    ▷ Maintain exact bijective mapping (ODE mode)
    end if
    Update State: $x_{t+\Delta t} \leftarrow x_t + u_t(x_t)\,\Delta t + \sqrt{2\varepsilon_t \Delta t}\;\xi$, where $\xi \sim \mathcal{N}(\mathbf{0}, I)$.
    $t \leftarrow t + \Delta t$
end while
Return: Counterfactual state $x_1$.

9.1 Quantitative Scaling of the Event Horizon

A cornerstone of our theory is the inverse relationship between intervention extremity $D$ and singularity time $t_c$. Figure 2 (Bottom) provides a rigorous quantitative validation of Theorem 8. By tracking the cumulative Jacobian determinant along the flow, we accurately pinpoint the finite-time singularity $t_c$. As the intervention extremity $D$ increases from 2 to 10, the observed collapse time strictly and monotonically decreases. This behavior mirrors the Riccati-induced acceleration predicted by our theory, confirming that more extreme counterfactuals inherently shrink the temporal survival window of deterministic models.
The empirical decay robustly follows the theoretical asymptotic envelope ($\mathcal{O}(1/D)$), supporting the existence of a dynamically calculable Counterfactual Event Horizon.

9.2 Curvature Sensitivity and the Pareto Optimal Frontier

To empirically validate the Causal Uncertainty Principle (Theorem 13) without numerical artifacts, we engineered a rigorous continuous transport scenario featuring a non-linear topological void. By modeling the causal drift via a hyperbolic tangent canyon in the intervention path, we simulate the exact geometric bottleneck of transporting mass across disjoint factual supports.

Figure 3 (Left) validates the strict impact of Riemannian geometry via exact Riccati integration. Positive curvature ($K < 0$ in the sign convention of Theorem 8, where the sectional curvature is bounded below by $-K$) accelerates focalization, causing the Jacobian determinant to cleanly collapse to zero at $t_c \approx 0.48$. Conversely, negative curvature ($K > 0$) provides a geometric buffer, delaying the singularity.

[Figure 2 panels. Top Left: "Deterministic ODE: Manifold Tearing" (vertical axis: Intervention Axis (x2)). Top Center: "Cumulative Jacobian Collapse" (Normalized Time $t$ vs. $\det(J_t)$). Top Right: "Calibrated GACF: Entropic Recovery". Bottom: "Theorem 6.2 Verification: Collapse Time $t_c$ vs. Extremity $D$" (Intervention Distance $D$ vs. Singularity Time $t_c$; legend: Observed $t_c$ (Cumulative Tracking), Theoretical $1/D$ Scaling (Riccati bound)).]

Figure 2: Comprehensive Verification of Topological Tearing and Scaling Laws. (Top Left): Deterministic ODE flows force characteristic curves to violently cross within the geometric void.
(Top Center): The precise, smooth collapse of the cumulative Jacobian determinant $\det(J_t) \to 0$, providing direct numerical evidence of the loss of diffeomorphism. (Top Right): Calibrated GACF effectively bypasses the singularity via adaptive entropic tunneling. (Bottom): Empirical validation of the $t_c \propto 1/D$ scaling law. The observed singularity times (blue dots) strictly track the theoretical $\mathcal{O}(1/D)$ Riccati bound (gray dashed line), confirming the determinable boundary of the Counterfactual Event Horizon.

Furthermore, Figure 3 (Right) demonstrates the strict Pareto front of causal transport under an extreme intervention ($D = 6.0$). The purely deterministic ODE inherently suffers from premature manifold tearing, failing to complete the causal transport ($t_c < 1.0$). The standard fixed-entropy SDE ensures topological survival ($t_c = 1.0$) but incurs a massive, irreversible identity smearing (target variance of 0.893). Strikingly, GACF dominates the trade-off, achieving the theoretical Pareto-optimal lower bound. By utilizing the Hutchinson topological radar to inject viscous entropy strictly within the Riccati-divergent zone, GACF successfully navigates the manifold void while reducing the identity loss by 58.4% (variance of 0.372) compared to the SDE baseline. This validation confirms that deterministic counterfactuals are inherently flawed under extreme interventions, and that optimal identity preservation relies on precise, dynamically scheduled geometric entropy.
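The switching behaviour credited to GACF here (deterministic ODE steps by default, entropy injected only while the Hutchinson estimate sits below the threshold) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' JAX implementation: the Jacobian-vector product is approximated by central finite differences instead of automatic differentiation, the velocity field is supplied by the caller as an ordinary function `u(x, t)`, and the names `hutchinson_divergence`, `gacf_step`, and `gacf_sample` are hypothetical.

```python
import math
import random


def hutchinson_divergence(u, x, t, rng, num_probes=1, h=1e-4):
    """Estimate Tr(du/dx) at (x, t) with Rademacher probes.

    The Jacobian-vector product is approximated by central finite
    differences; an autodiff JVP would replace this in practice.
    """
    total = 0.0
    for _ in range(num_probes):
        z = [rng.choice((-1.0, 1.0)) for _ in x]
        x_plus = [xi + h * zi for xi, zi in zip(x, z)]
        x_minus = [xi - h * zi for xi, zi in zip(x, z)]
        jvp = [(p - m) / (2.0 * h) for p, m in zip(u(x_plus, t), u(x_minus, t))]
        total += sum(zi * ji for zi, ji in zip(z, jvp))
    return total / num_probes


def gacf_step(u, x, t, dt, lam_thresh, eps_req, rng, num_probes=1):
    """One Euler(-Maruyama) update using the switching rule of Algorithm 1."""
    theta = hutchinson_divergence(u, x, t, rng, num_probes)
    eps = eps_req if theta < lam_thresh else 0.0  # SDE mode vs. ODE mode
    scale = math.sqrt(2.0 * eps * dt)
    velocity = u(x, t)
    x_new = [xi + vi * dt + scale * rng.gauss(0.0, 1.0)
             for xi, vi in zip(x, velocity)]
    return x_new, theta, eps


def gacf_sample(u, x0, dt, lam_thresh, eps_req, seed=0):
    """Transport x0 from t = 0 to t = 1; also count the steps taken in SDE mode."""
    rng = random.Random(seed)
    x = list(x0)
    sde_steps = 0
    for k in range(round(1.0 / dt)):
        x, _, eps = gacf_step(u, x, k * dt, dt, lam_thresh, eps_req, rng)
        sde_steps += 1 if eps > 0.0 else 0
    return x, sde_steps


if __name__ == "__main__":
    # Hypothetical toy field whose contraction rate grows in time,
    # so its divergence -(1 + 8t) crosses the threshold mid-trajectory.
    def ramp_field(x, t):
        return [-(1.0 + 8.0 * t) * x[0], 1.0]

    x1, sde_steps = gacf_sample(ramp_field, [1.0, 0.0], dt=0.01,
                                lam_thresh=-5.04, eps_req=0.1)
    print("final state:", x1, "SDE-mode steps:", sde_steps)
```

In the toy run the sampler stays in ODE mode for the first half of the trajectory and switches to SDE mode once the divergence drops below the (assumed) threshold. For a Jacobian that is diagonal, as in this toy field, a single Rademacher probe recovers the trace exactly; for a general neural field the estimate is noisy, and `num_probes` trades cost against variance.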
[Figure 3 panels. Left: "Curvature Sensitivity: $\det(J_t)$ Evolution (Strict Riccati Integration)" (Time $t$ vs. Jacobian Determinant; legend: Positive Curvature ($K < 0$), Euclidean ($K = 0$), Negative Curvature ($K > 0$)). Right: "Causal Uncertainty Pareto Front (Mathematically Honest Run)" (Identity Loss (Target Variance) vs. Survival Time $t_c$; legend: ODE, SDE_fixed, GACF).]

Figure 3: Curvature Effects and the Causal Uncertainty Pareto Front. (Left): The evolution of the Jacobian determinant under different Riemannian geometries via strict Riccati integration. Positive curvature accelerates manifold tearing, while negative curvature extends the survival window $t_c$. (Right): The Pareto front evaluated on a non-linear topological canyon. The ODE falls short of full survival. The standard SDE survives but severely smears individual identity (variance 0.893). GACF optimally bounds the Causal Uncertainty Principle, achieving a strictly superior balance by reducing identity loss by 58.4% (variance 0.372) while guaranteeing topological validity ($t_c = 1.0$).

9.3 Sensitivity Analysis: The Geometric Singularity of Broad Supports

We investigate the sensitivity of the required macroscopic entropy $\varepsilon^*$ to the initial factual support diameter $\Delta$. Our theoretical derivation of the Causal Uncertainty Principle (specifically the bound in Equation 32) predicts a geometric singularity: as the support diameter grows, the required entropic viscosity should not merely scale linearly, but diverge entirely as it approaches the threshold defined by the manifold's curvature ($1 - \kappa^- \Delta^2$).

[Figure 4 panel: "Causal Uncertainty Principle: Support Sensitivity" (Initial Factual Support Diameter ($\Delta$) vs. Min. Required Entropy ($\varepsilon^*$); Region I: Manifold Tearing (Deterministic Failure); Region II: Topological Survival (Probabilistic Recovery)).]

Figure 4: Support Sensitivity and Geometric Singularity. Empirical validation of the critical topological entropy $\varepsilon^*$ required to prevent manifold tearing across varying factual support diameters $\Delta$. As the support broadens towards the critical geometric threshold ($\Delta \approx 2.67$), the required entropy diverges, validating the singularity in the Causal Uncertainty Principle. Beyond this threshold (Region I), finite entropy cannot stabilize the deterministic flow.

As illustrated in Figure 4, our empirical simulations capture this non-linear blow-up. For tightly concentrated factual distributions ($\Delta < 1.5$), the critical viscosity scales moderately. However, as the factual support expands, the convective steepening of the deterministic flow intensifies dramatically. Approaching the critical threshold of $\Delta \approx 2.67$, the required topological entropy undergoes a catastrophic divergence ($\varepsilon^* \to \infty$). Beyond this point, the flow enters the strict Manifold Tearing phase (Region I). The physical constraints of the system are shattered; no finite amount of viscous entropy can prevent the characteristic curves from intersecting.

This empirical phase transition validates our theoretical denominator: if the observational support is too broad relative to the manifold's negative curvature, deterministic identity preservation becomes mathematically unviable. The generative system enters a regime where massive, identity-destroying entropic regularization is the only topological recourse, cementing the inescapable trade-off quantified by the Causal Uncertainty Principle.
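The divergence described above follows directly from the closed-form bound of Equation 32, which can be tabulated with a few lines of Python. The sketch below is illustrative only: the value of `KAPPA_MINUS` is an assumption back-solved from the reported critical diameter ($1/\sqrt{\kappa^-} \approx 2.67$) rather than a number stated in the text, and `C0 = 1`, `D = 1` are arbitrary normalizations. The helper `identity_loss_lower_bound` evaluates the entropy bound of Equation 33 for a given $\varepsilon$.

```python
import math


def required_entropy(delta, kappa_minus, D, C0=1.0):
    """Viscosity lower bound C0*delta / (1 - kappa_minus*delta**2) * D (Eq. 32).

    Returns math.inf once delta reaches the critical diameter
    1/sqrt(kappa_minus), where no finite entropy suffices.
    """
    denom = 1.0 - kappa_minus * delta ** 2
    if denom <= 0.0:
        return math.inf
    return C0 * delta / denom * D


def identity_loss_lower_bound(n, eps):
    """Entropy lower bound (n/2) * log(4*pi*e*eps) from Eq. 33."""
    return 0.5 * n * math.log(4.0 * math.pi * math.e * eps)


# Assumption: curvature magnitude chosen so the critical diameter is about 2.67.
KAPPA_MINUS = 1.0 / 2.67 ** 2

if __name__ == "__main__":
    for delta in (0.5, 1.0, 1.5, 2.0, 2.5, 2.6, 2.66, 2.7):
        eps = required_entropy(delta, KAPPA_MINUS, D=1.0)
        print(f"delta={delta:4.2f}  eps_req={eps:10.3f}")
```

With these assumed constants the bound grows slowly for small diameters, steepens sharply as $\Delta$ approaches 2.67, and is reported as infinite beyond it, which is the qualitative shape of the two regions described for Figure 4.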
10 High-Dimensional Scaling and Neural Architectures

Finally, we substantiate the universality of our findings by scaling the latent dimension $n$ and evaluating highly parameterized neural flows navigating topological voids.

The Curse of Dimensionality in Causal Transport (Exp A): As shown in Figure 5 (Left), we empirically validate the catastrophic impact of high dimensionality on deterministic causal transport. As the dimension $n$ scales from 2 to 100, the deterministic survival window $t_c$ undergoes a severe non-linear collapse, plummeting from $t_c = 1.0$ down to 0.390. This phenomenon confirms our theoretical prediction: slight geometric contractions compounding multiplicatively across dimensions render purely deterministic transport structurally unviable for high-dimensional scientific data (e.g., genomics or imaging).

Universal Singularity and Radar Efficacy (Exp B): Figure 5 (Right) illustrates the real-time dynamics of GACF acting on a fully parameterized $n = 100$ neural flow. Due to the high-dimensional Riccati blow-up and the inherent variance of neural vector fields, the true cumulative Jacobian determinant (red solid line) smoothly and inevitably collapses at $t = 0.345$, meaning the flow fails to complete even half of the transport trajectory before tearing. In contrast, our Hutchinson-estimated divergence radar (blue dashed line) actively amplifies these local topological crises. Utilizing a strictly calibrated, dimension-dependent threshold ($\lambda_{\text{thresh}} = -10.0$), the radar acts as a highly sensitive "Safety-First" mechanism, triggering entropic intervention as early as $t = 0.010$. This massive lead time ($\Delta t = 0.335$) demonstrates that GACF systematically anticipates and prevents manifold tearing with zero false-negative failures.
It provides the system with a sufficient temporal window to inject the necessary geometric entropy (SDE mode) and safely bypass the singularity, demonstrating its robustness in deep causal generative models.

[Figure 5 panels. Left: "Exp A: Blow-up Time $t_c$ vs. Dimension $n$" (Dimension $n$ vs. Collapse Time $t_c$; legend: Observed $t_c$). Right: "Exp B: Neural Flow Tearing Detection (n=100)" (Time $t$ vs. Jacobian Determinant on the left axis and Divergence Value on the right axis; legend: $\det(J_t)$ (LHS), $\operatorname{Tr}(\nabla u)$ (RHS), Threshold (-10.0)).]

Figure 5: High-Dimensional Scalability and Universal Singularity in Neural Flows. (Left/Exp A): The collapse time $t_c$ exhibits a catastrophic non-linear decay as the latent dimension $n$ scales to 100, confirming that the curse of dimensionality violently accelerates topological tearing. (Right/Exp B): Dual-axis tracking of a neural flow ($n = 100$). The theoretical Jacobian determinant (red) collapses smoothly, yielding a true singularity at $t = 0.345$. The Hutchinson-estimated scalar divergence (blue) amplifies the topological risk via a sharp Riccati blow-up. Using the dynamically scaled threshold ($\lambda_{\text{thresh}} = -10.0$), the radar successfully triggers at $t = 0.010$, providing a massive $\Delta t = 0.335$ lead time to safely inject entropy before catastrophic failure.

10.1 Real-World Case Study: A Topological Proxy for Single-Cell Genomics

To substantiate the scientific and biological relevance of our theory, we evaluate generative flows inspired by the PBMC 3k single-cell RNA sequencing (scRNA-seq) dataset. Evaluating strict topological limits directly in high-dimensional empirical spaces is mathematically ill-posed due to unknown intrinsic curvature and confounding factors from sub-optimal neural network approximation. Therefore, to provide an interpretable and mathematically rigorous visualization, we construct a 2D topological proxy space derived from the UMAP embedding of the transcriptomic data.

We treat this 2D projection as a synthetic standalone manifold equipped with an exact, analytical density-based score field. This explicitly isolates the geometric dynamics, ensuring any observed singularities are fundamental topological failures rather than mere neural approximation errors. We simulate a strong counterfactual gene intervention, forcing a cell state transition across a profound geometric void between distinct cell clusters.

As shown in Figure 6, the deterministic ODE flow (red dashed trajectory) strictly minimizes the local transport cost by traveling in a straight line. Consequently, it crosses the zero-density region, succumbing to manifold tearing and resulting in a Biological Chimera (red cross), an invalid, out-of-distribution hybrid state. This serves as a direct empirical manifestation of the deterministic limitations proven in Section 6.

In contrast, GACF's topological radar anticipates this singularity. By dynamically injecting geometric entropy along with manifold score guidance, GACF willingly trades deterministic minimum-action smoothness for topological survival (green solid trajectory). It successfully navigates around the dead zone, ensuring the final counterfactual state (green dot) lands safely within a valid biological cluster. This experiment visually and practically corroborates the Causal Uncertainty Principle: in AI for Science, entropic regularization is a structural prerequisite for generating valid out-of-distribution counterfactuals.

11 Discussion and Implications for AI for Science

11.1 The Role of Global Geometry: Hyperbolic vs. Spherical Spaces

The interaction between the causal optimal transport flow and the global geometry of $M$ provides profound insights, directly visible in our Riccati derivation (Theorem 8). The term $+nK\|\dot{x}\|^2$ acts as a structural counter-force to the Hessian collapse.

If the underlying causal latent space possesses Negative Curvature (Hyperbolic Geometry, $K > 0$ in the convention of Theorem 8), geodesics naturally diverge. This intrinsic spatial expansion acts as a "geometric viscosity," actively delaying the formation of shockwaves (increasing $t_c$). Consequently, hyperbolic causal spaces can sustain much larger deterministic interventions before succumbing to manifold tearing.

Conversely, in Positive Curvature (Spherical Geometry, $K < 0$), geodesics naturally converge towards conjugate points. This accelerates the Hessian collapse, drastically shrinking the Counterfactual Event Horizon $\delta_{\text{crit}}$. This geometric dichotomy suggests that the choice of prior latent geometry in Generative AI (e.g., choosing a Gaussian prior vs. a Poincaré prior) is not merely a representational preference, but a strict topological constraint that dictates the maximum permissible severity of downstream causal interventions.

11.2 Fundamental Limitations of Deterministic Models in AI for Science

Our theorems highlight a critical constraint for modern representation learning: perfectly identity-preserving, extreme counterfactuals are mathematically restricted in continuous space. In domains such as Single-Cell Genomics, researchers frequently utilize Optimal Transport to predict developmental trajectories (Schiebinger et al., 2019) or high-dimensional cellular responses to drug perturbations (Bunne et al., 2023).
Our results formally show that 23 Wu, Xie and Li 2.5 0.0 2.5 5.0 7.5 10.0 12.5 15.0 UMAP 1 (T ranscriptomic Latent Space) 6 4 2 0 2 4 UMAP 2 R eal- W orld V alidation: Bypassing Manifold T earing in scRNA -seq Data V alid Single-Cell Manifold F actual State (Start) Intervention T ar get Deter ministic Flow (Straight P ath) Biological Chimera (Manifold T earing) GA CF A daptive T rajectory T opologically Safe State Figure 6: Real-W orld V alidation on PBMC 3k scRNA-seq Data. The factual state (blac k triangle) is in tervened up on to reac h the target (black star). (Red Dashed Line): The deterministic ODE minimizes cost by crossing the void, tearing the manifold and pro ducing an inv alid ”Biological Chimera.” (Green Solid Line): The GA CF triggers entropic reco very b efore the singularit y , adaptiv ely utilizing the v alid single-cell manifold (gray dots) to safely transp ort the cell to a top ologically v alid coun terfactual state. 24 Cohomological Obstr uctions to Global Counterf actuals optimizing for a purely deterministic cell-to-ce ll mapping under strong out-of-distribution in terven tions (e.g., unprecedented drug dosages) is geometrically ill-p osed. Deterministic flo ws will inevitably cross c haracteristics when tra v ersing substan tial manifold voids, resulting in mo de collapse or biologically inv alid hybrid states. As established by our Uncertaint y Principle (Theorem 13 ), a non-zero entropic regularization is not merely a numerical smo othing artifact, but a structural necessit y for main taining topological v alidity in biological coun terfactuals. T o ac hiev e structural v alidity in extreme counterfactual generation, researc hers m ust recognize the b ounds of purely deterministic tra jectories. 
Embracing Entropic Optimal Transport (e.g., Stochastic Schrödinger Bridges) guarantees topological robustness across the event horizon, recognizing that the outcome of a strong causal intervention is best represented as a probabilistic envelope of valid structural responses rather than a single deterministic point.

12 Conclusion and Future Work

In this paper, we establish the fundamental topological and measure-theoretic limits of continuous causal interventions. By rigorously defining the Counterfactual Event Horizon, proving the inevitability of Manifold Tearing in deterministic optimal transport via Riccati blow-up, and deriving the analytic bounds of the Causal Uncertainty Principle, we transition continuous causal inference from an empirical heuristic to a rigorously constrained geometric discipline.

In conclusion, we have established the Counterfactual Event Horizon and the Causal Uncertainty Principle as the fundamental geometric boundaries of continuous causal inference. Our analysis reveals that beyond these limits, deterministic transport is ill-posed, and entropic regularization is a geometric requirement rather than a numerical heuristic.

Looking forward, the resolution of manifold tearing leads to a new frontier: the study of global causal consistency. Future research will extend this geometric foundation into a Sheaf-Theoretic framework, utilizing Cellular Sheaves and Metric Cohomology to rigorously quantify how latent confounders and structural cycles prevent the existence of globally consistent counterfactuals. By transitioning from local Riemannian bounds to global cohomological obstructions, we aim to provide a complete topological characterization of counterfactual realizability in high-dimensional continuous spaces.

Appendix A. Experimental Details and Reproducibility

To ensure the reproducibility of our numerical results, we provide the detailed configurations used in our experiments. All simulations were implemented in JAX and executed on a single workstation with an Apple M2 Pro.

A.1 Neural Flow Architecture (Figure 4, Right)

The learned neural flow evaluated in Section 9 utilizes a Multi-Layer Perceptron (MLP) to parameterize the velocity field $u_\theta(x, t)$.

• Architecture: 2 hidden layers with 128 units each (scaled for $n = 100$).
• Activation: tanh activation functions between layers to ensure continuous second-order derivatives for Jacobian stability.
• Initialization: Xavier (Glorot) normal initialization to maintain stable drift variance at $t = 0$.
• Training Strategy: The network was trained to approximate a convergent causal drift using the Adam optimizer with a learning rate of $1 \times 10^{-3}$ for 1000 epochs.

A.2 High-Dimensional Settings and Hyperparameters

• Non-linear Topological Canyon (Pareto Experiment): To rigorously simulate the Causal Uncertainty Principle while avoiding the numerical artifacts of an idealized Brownian bridge, we constructed a spatial bottleneck. The causal intervention transports mass by distance $D$ along the $y$-axis, while the $x$-axis features a Riccati-divergent hyperbolic tangent canyon: $u(x, t) = [-6.0\,\tanh(0.1\,x_0)\,(t + 0.1),\; D]^{T}$.
• Integrator Step Size: We utilize Euler-Maruyama integration with a fine-grained step size $\Delta t = 0.005$ (200 steps) to accurately capture the singular blow-up of the Riccati equation.
• Topological Radar Calibration: For the non-linear canyon, the threshold is dynamically calibrated. The GACF system triggers local entropy injection strictly when the estimated divergence breaches $\lambda_{\mathrm{thresh}} = -2.5$.
• Hardware: All simulations were written in JAX and executed on an Apple M2 Pro (Apple silicon) to leverage parallel Jacobian-Vector Products (JVPs).

Appendix B. Extended Mathematical Proofs and Exact Geometric Tracking

In this section, we provide the full derivations of our main theorems. We explicitly track the geometric constants, deploy parabolic maximum principles to bound the control fields, and bridge the macroscopic geometric viscosity with microscopic information-theoretic entropy production.

B.1 Rigorous Proof of Lemma 6.1: Jacobi Fields, $d\exp$ Distortion, and Conjugate Points

In the main text, we approximated the differential of the exponential map $d\exp_x$ by the identity matrix $I$. On a general Riemannian manifold $(\mathcal{M}, g)$, this introduces a geometric distortion dependent on the transport distance $D$. We now bound this error using the theory of Jacobi fields to address both negative and strictly positive curvature regimes.

Let $v = -\nabla\phi(x)$ be the initial optimal velocity vector at $x$, with magnitude $D = \|v\|_g$. The differential $d\exp_x(v)$ describes the evolution of a Jacobi field $J(t)$ along the geodesic $\gamma(t) = \exp_x(tv)$ such that $d\exp_x(v)\cdot w = \frac{1}{D} J(D)$ for any $w \in T_x\mathcal{M}$. The Jacobi field satisfies $\nabla^2_{\dot\gamma} J + R(J, \dot\gamma)\dot\gamma = 0$.

Case 1: Non-Positive Curvature (Hyperbolic buffer). Assume the sectional curvature is bounded below by $-\kappa^2$ ($\kappa > 0$). By the Rauch Comparison Theorem, $\|J(D)\|_g \le \frac{\sinh(\kappa D)}{\kappa}\|w\|_g$. The operator norm is bounded by $\|d\exp_x(v)\|_{op} \le \frac{\sinh(\kappa D)}{\kappa D}$. Returning to the exact Monge-Ampère equation:

$$ \det(d\exp_x(v)) \cdot \det(I - \nabla^2\phi(x)) = \frac{\rho_0(x)}{\rho_1(T(x))}. \tag{37} $$

Substituting the determinant bound and applying the AM-GM inequality, the maximum eigenvalue $\lambda_{\max}(\nabla^2\phi)$ satisfies:

$$ 1 - \lambda_{\max}(\nabla^2\phi) \le \left(\frac{\kappa D}{\sinh(\kappa D)}\right) \frac{\sigma}{\Delta} \left(\frac{\max_{\Omega_0}\rho_0}{m_0}\right)^{1/n}. \tag{38} $$

Because $\frac{\kappa D}{\sinh(\kappa D)} \to 0$ exponentially as $D \to \infty$, negative curvature exacerbates the required initial contraction. The negative initial Hessian $\lambda_0 = -\lambda_{\min}(H(0))$ must diverge as $\lambda_0 \ge \kappa D \coth(\kappa D) \approx \kappa D$.

Case 2: Strictly Positive Curvature and Conjugate Points (Proof of Corollary 6.4). Now, assume $\mathcal{M}$ is a compact manifold with strictly positive sectional curvature bounded below by $K > 0$. The Rauch Comparison Theorem dictates a fundamentally different bound via trigonometric functions:

$$ \|d\exp_x(v)\|_{op} \le \frac{\sin(\sqrt{K} D)}{\sqrt{K} D}. \tag{39} $$

As the interventional distance approaches the critical geometric threshold $D \to \pi/\sqrt{K}$, the bound $\sin(\sqrt{K} D) \to 0$. This implies that the Jacobi fields collapse, forcing $\det(d\exp) \to 0$: we hit a Conjugate Point. To satisfy the Monge-Ampère mass conservation equation, the initial Hessian must compensate with infinite expansion, meaning the map ceases to be a diffeomorphism instantaneously. This proves that on positively curved spaces, deterministic counterfactual interventions beyond $D_{\mathrm{crit}} = \pi/\sqrt{K}$ are topologically obstructed.

B.2 Rigorous Derivation of Theorem 6.2: Asymmetric Shear and the Raychaudhuri Equation

Let $B_{ij} = \nabla_j u_i$ be the velocity gradient tensor. We decompose $B_{ij}$ orthogonally into the expansion scalar $\theta = \mathrm{Tr}(B)$, the symmetric traceless shear tensor $\sigma_{ij}$, and the antisymmetric vorticity tensor $\omega_{ij}$ (which vanishes since $u = \nabla\psi$):

$$ B_{ij} = \frac{1}{n}\theta\, g_{ij} + \sigma_{ij} + \underbrace{\omega_{ij}}_{=0}. \tag{40} $$

Taking the covariant material derivative of $\theta$ along the flow characteristic yields the full Raychaudhuri equation:

$$ \frac{d\theta}{dt} = -\mathrm{Tr}(B^2) - \mathrm{Ric}(u, u) = -\frac{1}{n}\theta^2 - \|\sigma\|^2_{HS} - \mathrm{Ric}(u, u). \tag{41} $$

The non-positive term $-\|\sigma\|^2_{HS} \le 0$ acts as a sink. It represents the anisotropic distortion caused by asymmetric factual distributions (e.g., highly elliptical data manifolds). Imposing the uniform curvature bound $\mathrm{Ric}(u, u) \ge -nK\|u\|^2$ with $\|u\| = D$, we obtain:

$$ \dot\theta(t) \le -\frac{1}{n}\theta^2(t) - \|\sigma(t)\|^2_{HS} + nKD^2 \le -\frac{1}{n}\theta^2(t) + nKD^2. \tag{42} $$

This derivation establishes that any asymmetry in the data support strictly accelerates the Riccati collapse. The spherical symmetry ($\sigma = 0$) assumed in the main text is the best-case scenario, so our $t_c$ is a universal upper bound on the tearing time.

B.3 Bernstein Gradient Estimates and Asymptotic Scaling for Viscous Control (Theorem 7.1, Step 1)

To derive the required geometric viscosity $\varepsilon \ge C_{geo} D$ governing the Causal Uncertainty Principle, we deploy the Parabolic Maximum Principle (Bernstein Technique) combined with the physical scaling laws of viscous conservation equations. Consider the viscous Burgers' equation $\partial_t u + \nabla_u u = \varepsilon \Delta_H u$. We analyze the energy density $e(x, t) = \frac{1}{2}\|u\|^2_g$. By the Weitzenböck identity:

$$ \partial_t e - \varepsilon\Delta_g e = -\varepsilon\|\nabla u\|^2_{HS} - \langle u, \nabla_u u\rangle_g + \varepsilon\,\mathrm{Ric}(u, u). \tag{43} $$

Assume the maximum of $e(x, t)$ over the spacetime cylinder occurs at an interior point $(x_0, t_0)$. At this critical maximum, calculus dictates $\nabla e = 0$ (implying $\langle u, \nabla_u u\rangle_g = 0$) and the Laplacian is non-positive, $\Delta_g e \le 0$. Evaluating the Weitzenböck identity at this maximum point yields the local necessity $\varepsilon\|\nabla u\|^2_{HS} \le \varepsilon\,\mathrm{Ric}(u, u)$. However, to prevent the global formation of shockwaves (Manifold Tearing), the viscous dissipation term must strictly dominate the convective steepening term $\langle u, \nabla_u u\rangle_g$ across the entire actively transported support. In the classical theory of fluid dynamics, this balance is governed by the Reynolds number $Re = \frac{D \cdot (\text{Velocity})}{\varepsilon}$.
To prevent finite-time gradient blow-up, the Reynolds number must be structurally constrained, demanding that the dissipation globally satisfies the asymptotic scaling bound:

$$ \varepsilon\|\nabla u\|^2_{HS} + \varepsilon\kappa_{-}\|u\|^2_g \ge \sup_x |\langle u, \nabla_u u\rangle_g|. \tag{44} $$

Since the macroscopic geometry dictates the supremum bounds near the event horizon, we substitute the characteristic physical scales: $\sup\|u\| \sim c_1 D$ and $\sup\|\nabla u\|_{HS} \sim c_2 \frac{D}{\Delta}$. While this substitution transitions from a strict point-wise PDE bound to an asymptotic scaling law, it faithfully captures the geometric dependencies. Solving this scaling inequality yields the phenomenological lower bound for the required topological entropy:

$$ \varepsilon \ge \frac{c_1^2 c_2 \Delta}{c_2^2 - \kappa_{-} c_1^2 \Delta^2}\, D := C_{geo} D. \tag{45} $$

This scaling law reveals that the topological entropy $\varepsilon$ must scale linearly with the intervention extremity $D$, bridging the macroscopic kinematic stability to the microscopic information loss in Theorem 7.1.

B.4 Information-Theoretic Entropy Production via $\Gamma_2$ Calculus (Theorem 7.1, Step 2)

Having established the macroscopic geometric viscosity $\varepsilon$, we now derive the microscopic Identity Loss (Shannon entropy lower bound) using the Fokker-Planck equation and the De Bruijn identity. The SDE $dx_t = u\,dt + \sqrt{2\varepsilon}\,dW_t$ dictates that the marginal density $\rho_t$ evolves according to the Fokker-Planck equation $\partial_t\rho_t + \nabla\cdot(\rho_t u) = \varepsilon\Delta_g\rho_t$. The differential entropy is $H_t = -\int_{\mathcal{M}} \rho_t \log\rho_t \, d\mathrm{vol}_g$. Differentiating with respect to time yields the exact entropy production rate:

$$ \frac{d}{dt}H_t = \int_{\mathcal{M}} \rho_t (\nabla\cdot u)\, d\mathrm{vol}_g + \varepsilon\int_{\mathcal{M}} \frac{\|\nabla\rho_t\|^2}{\rho_t}\, d\mathrm{vol}_g = \mathbb{E}_{\rho_t}[\nabla\cdot u] + \varepsilon I(\rho_t), \tag{46} $$

where $I(\rho_t)$ is the Fisher Information. To prevent tearing, we must inject entropy over the causal transport.
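A quick sanity check on the entropy production rate (46): in flat $\mathbb{R}^n$ with zero drift ($u = 0$) and an isotropic Gaussian marginal, every term is available in closed form, so the identity $\frac{d}{dt}H_t = \varepsilon I(\rho_t)$ can be verified numerically. The sketch below is illustrative only and is not the paper's JAX code; the function names and the parameter values (`n`, `var0`, `eps`, `t`) are assumptions chosen for the example.

```python
import math


def gaussian_entropy(n, var):
    """Differential entropy (nats) of N(0, var * I_n)."""
    return 0.5 * n * math.log(2.0 * math.pi * math.e * var)


def gaussian_fisher(n, var):
    """Fisher information int |grad rho|^2 / rho for N(0, var * I_n)."""
    return n / var


def heat_flow_variance(var0, eps, t):
    """Variance at time t under dx = sqrt(2 * eps) dW with zero drift."""
    return var0 + 2.0 * eps * t


def entropy_rate_fd(n, var0, eps, t, h=1e-6):
    """Central finite difference of H_t along the heat flow."""
    hi = gaussian_entropy(n, heat_flow_variance(var0, eps, t + h))
    lo = gaussian_entropy(n, heat_flow_variance(var0, eps, t - h))
    return (hi - lo) / (2.0 * h)


# Hypothetical example values (not taken from the paper's experiments).
n, var0, eps, t = 100, 0.5, 0.05, 0.3

lhs = entropy_rate_fd(n, var0, eps, t)                              # dH/dt
rhs = eps * gaussian_fisher(n, heat_flow_variance(var0, eps, t))    # eps * I(rho_t)
print("dH/dt (finite difference):", lhs)
print("eps * I(rho_t):           ", rhs)

# Point-mass start (var0 = 0) run to t = 1: variance 2 * eps.
point_mass_entropy = gaussian_entropy(n, heat_flow_variance(0.0, eps, 1.0))
print("entropy from a point mass at t = 1:", point_mass_entropy)
```

In this flat, drift-free setting the point-mass entropy at $t = 1$ equals $\frac{n}{2}\log(4\pi e\varepsilon)$, since the heat kernel has variance $2\varepsilon$ per coordinate.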
By Assumption 5.1, the reference invariant measure satisfies a Logarithmic Sobolev Inequality (LSI) with constant $C_{LS} > 0$. The LSI bounds the relative entropy by the Fisher Information: $H(\rho_1 \mid \mu) \le \tfrac{1}{2} C_{LS}\, I(\rho_1)$. Using the Bakry-Émery $\Gamma_2$ calculus on manifolds with Ricci curvature lower bounds, the integration of the Fisher Information along the heat flow guarantees that the terminal differential entropy (our Identity Loss metric) is bounded below by the dimensional diffusion scale:

$$ H(P_{1|0}(\cdot \mid x_0)) \ge \frac{n}{2}\log(4\pi e\varepsilon). \tag{47} $$

Substituting the Bernstein scaling bound $\varepsilon \ge C_{geo} D$ into this information-theoretic limit completes the derivation of the Causal Uncertainty Principle.

B.5 Preservation of the Log-Sobolev Inequality under Neural Perturbations

In Assumption 3, we postulated that the invariant measure $Q$ satisfies a Logarithmic Sobolev Inequality (LSI) with constant $C_{LS} > 0$. While this is classical for strongly convex potentials (e.g., Gaussian priors), deep neural networks learn highly non-convex energy landscapes $V_\theta(x)$. We now prove that our LSI assumption remains intact using the Holley-Stroock perturbation principle.

Lemma 15 (LSI under Bounded Neural Perturbation) Let the reference generative diffusion prior be driven by a base potential $V_0(x)$ (e.g., $V_0(x) = \frac{1}{2}\|x\|^2$), such that its Gibbs measure $\mu_0 \propto \exp(-V_0/\varepsilon)$ satisfies an LSI with constant $c_0 > 0$. Assume the neural network learns a causal potential $V_\theta(x) = V_0(x) + \delta V(x)$, where the learned non-convex residual $\delta V(x)$ is bounded on the data manifold $\mathcal{M}$ with oscillation $\mathrm{osc}(\delta V) = \sup_x \delta V(x) - \inf_x \delta V(x) < \infty$. Then, the neural invariant measure $Q \propto \exp(-V_\theta/\varepsilon)$ satisfies an LSI with a modified constant:

$$ C_{LS} = c_0 \exp\left(\frac{\mathrm{osc}(\delta V)}{\varepsilon}\right). \tag{48} $$

Proof By definition, for any sufficiently smooth function $f$, the base measure $\mu_0$ satisfies:

$$ \mathrm{Ent}_{\mu_0}(f^2) \le 2 c_0 \int \|\nabla f\|^2 \, d\mu_0. \tag{49} $$

Consider the perturbed measure $dQ = \frac{1}{Z_Q}\exp(-\delta V/\varepsilon)\, d\mu_0$. The ratio of the densities is bounded by $\exp(-\mathrm{osc}(\delta V)/\varepsilon) \le \frac{dQ}{d\mu_0} \le \exp(\mathrm{osc}(\delta V)/\varepsilon)$. Applying this uniform bound to the entropy functional and the Dirichlet form yields the new constant $C_{LS}$. Thus, despite the severe local non-convexity induced by deep neural architectures, the global transportation inequality (Theorem 13) governing the Counterfactual Event Horizon continues to hold, with the neural complexity absorbed into the finite constant $C_{LS}$.

Remark 16 (Pathological Landscapes and Accelerated Tearing) In Lemma 15, we established that the LSI holds under bounded neural perturbations, absorbing the geometric complexity into the modified constant $C_{LS}$. A critical reader might question the regime of highly pathological, over-parameterized neural networks where the oscillation $\mathrm{osc}(\delta V)$ is unbounded, potentially causing the LSI constant to degenerate severely ($C_{LS} \to \infty$). However, this theoretical degeneration does not weaken our topological framework; rather, it reinforces our central thesis. If the learned causal potential exhibits extreme local non-convexity (e.g., sharp pathological ridges or highly erratic canyons), the underlying velocity field inherently develops massive anisotropic distortion. Mathematically, this manifests as an enormous shear tensor $\|\sigma\|^2_{HS} \gg 0$ in the velocity gradient decomposition. Recalling the full Raychaudhuri equation derived in Appendix B.2:

$$ \dot\theta(t) \le -\frac{1}{n}\theta^2(t) - \|\sigma(t)\|^2_{HS} + nKD^2 $$

The shear tensor acts as a strictly negative sink term.
Therefore, in pathological neural landscapes where the LSI bounds loosen, the large shear strictly accelerates the Riccati collapse. Consequently, the true singularity time $t_{\mathrm{real}}$ will occur significantly earlier than our analytically derived idealized upper bound $t_c$ (i.e., $t_{\mathrm{real}} \ll t_c$). In conclusion, extremely pathological neural architectures do not offer an escape from Manifold Tearing; they guarantee a faster, more catastrophic geometric collapse. This degradation of the theoretical bounds underscores the practical necessity of the dynamic entropic regularization provided by our GACF algorithm.

B.6 Variance Bounds and Reliability of the Hutchinson Topological Radar

In Algorithm 1, we utilized the Hutchinson Trace Estimator $\tilde\theta_t = z^T \nabla_x u_t\, z$ to trigger the adaptive entropy injection. We now prove that as the system approaches manifold tearing, the signal-to-noise ratio of this $O(1)$ estimator diverges, so the probability of a missed detection vanishes.

Lemma 17 (Concentration of the Divergence Radar) Let $\theta_t = \mathrm{Tr}(\nabla u_t)$ be the true scalar divergence. Let $z \in \{-1, 1\}^n$ be a Rademacher random vector. The variance of the Hutchinson estimator is:

$$ \mathrm{Var}(\tilde\theta_t) = 2\|\nabla u_t\|^2_F - 2\sum_{i=1}^{n} (\nabla u_t)^2_{ii}. \tag{50} $$

As $t \to t_c$ (the critical tearing time), the probability of a false negative (i.e., failing to detect a singularity) tends to zero.

Proof By the Raychaudhuri analysis in Theorem 8, as $t \to t_c$, the expansion scalar $\theta_t \to -\infty$. Because $\theta_t = \sum_i (\nabla u_t)_{ii}$, the diagonal elements must collectively diverge to $-\infty$. Consequently, the true signal scales as $|\theta_t| \sim O(\lambda_{\max})$, while by (50) the variance is controlled by the squared Frobenius norm of the off-diagonal shear components.
By Chebyshev's inequality, for any finite threshold $\lambda_{\mathrm{thresh}}$:

$$ P\left(|\tilde\theta_t - \theta_t| \ge |\theta_t|/2\right) \le \frac{4\,\mathrm{Var}(\tilde\theta_t)}{\theta_t^2}. \tag{51} $$

Since the Riccati blow-up forces $\theta_t^2$ to grow asymptotically faster than the off-diagonal variance, the right-hand side tends to 0. Therefore, the Hutchinson estimator provides an asymptotically exact topological trigger exactly when it is needed most (near the event horizon), theoretically validating its use in high-dimensional causal flows.

B.7 Explicit Derivation: From Causal SDEs to the Viscous Burgers' Equation

To make the transition from probability measures to fluid dynamics (Sections 4 and 7) self-contained, we provide the explicit derivation using the Cole-Hopf transformation on Riemannian manifolds. Let the optimal causal drift be $u_t = b + \nabla\psi_\varepsilon$. The dynamic Kantorovich potential $\psi_\varepsilon$ solves the viscous HJB equation:

$$ \partial_t\psi_\varepsilon + \frac{1}{2}\|\nabla\psi_\varepsilon\|^2_g = \varepsilon\Delta_g\psi_\varepsilon. \tag{52} $$

Taking the exterior derivative $d$ of both sides, and using the identity $d\left(\frac{1}{2}\|\nabla\psi_\varepsilon\|^2\right) = \nabla_{\nabla\psi_\varepsilon}\nabla\psi_\varepsilon$, we obtain:

$$ \partial_t(\nabla\psi_\varepsilon) + \nabla_{\nabla\psi_\varepsilon}(\nabla\psi_\varepsilon) = \varepsilon\, d(\delta d\psi_\varepsilon). \tag{53} $$

By the definition of the Hodge Laplacian on 1-forms, $\Delta_H = d\delta + \delta d$. Since $d(\nabla\psi_\varepsilon) = d^2\psi_\varepsilon = 0$, we have $d(\delta d\psi_\varepsilon) = \Delta_H(\nabla\psi_\varepsilon)$. Substituting $u_t = \nabla\psi_\varepsilon$ yields the viscous Burgers' equation:

$$ \partial_t u + \nabla_u u = \varepsilon\Delta_H u. \tag{54} $$

Invoking the Weitzenböck formula $\Delta_H u = \nabla^*\nabla u + \mathrm{Ric}(u)$ explicitly introduces the manifold's Ricci curvature into the fluid dynamics, directly leading to the geometric energy bounds evaluated via the Bernstein technique in Theorem 13.

B.8 Explicit Closed-Form Limit in Euclidean Space

To address the tightness of the bound derived in Theorem 8, we consider the special case of Euclidean space $\mathbb{R}^n$, which serves as the canonical latent space for most generative models (e.g., Diffusion Models and Flow Matching). In $\mathbb{R}^n$, the sectional curvature $K = 0$.
Substituting this into the Riccati inequality (23), and assuming an isotropic initial contraction for simplicity, the evolution of the expansion scalar $\theta(t)$ is governed by the exact ODE:

$$ \dot\theta(t) = -\frac{1}{n}\theta(t)^2. \tag{55} $$

By separating variables and integrating from $t = 0$ with the initial condition $\theta(0) = -\lambda_0$, we obtain the precise temporal trajectory of the scalar divergence:

$$ \theta(t) = \frac{n\lambda_0}{\lambda_0 t - n}. \tag{56} $$

The Jacobian determinant $J(t)$, as defined by Liouville's formula $J(t) = \exp\left(\int_0^t \theta(s)\, ds\right)$, thus evolves as:

$$ J(t) = \left(1 - \frac{\lambda_0}{n} t\right)^n. \tag{57} $$

The singularity (Manifold Tearing) occurs precisely when the volume element collapses to zero, $J(t_c) = 0$, yielding the exact closed-form blow-up time:

$$ t_c = \frac{n}{\lambda_0}. \tag{58} $$

Recalling from Theorem 6 that for a transport distance $D$, the initial Hessian magnitude scales as $\lambda_0 \sim O(D)$, we recover the $t_c \propto 1/D$ law as a strict equality in Euclidean space. This confirms that our general Riemannian bound is not only qualitatively correct but also quantitatively tight, as the $1/D$ dependence is an intrinsic property of the Riccati collapse regardless of the manifold's global curvature.

References

Dominique Bakry, Ivan Gentil, and Michel Ledoux. Analysis and Geometry of Markov Diffusion Operators. Springer, 2013.

Jean-David Benamou and Yann Brenier. A computational fluid mechanics solution to the Monge-Kantorovich mass transfer problem. Numerische Mathematik, 84(3):375–393, 2000.

Charlotte Bunne, Stefan G Stark, Gabriele Gut, Jacobo Sarabia Del Castillo, Mitch Levesque, Kjong-Van Lehmann, Lucas Pelkmans, Andreas Krause, and Gunnar Rätsch. Learning single-cell perturbation responses using neural optimal transport. Nature Methods, 20(11):1759–1768, 2023.

Ricky TQ Chen, Yulia Rubanova, Jesse Bettencourt, and David K Duvenaud. Neural ordinary differential equations. In Advances in Neural Information Processing Systems (NeurIPS), volume 31, 2018.

Lawrence C Evans. Partial Differential Equations, volume 19. American Mathematical Society, 2010.

Chris Finlay, Jörn-Henrik Jacobsen, Levon Nurbekyan, and Adam M Oberman. How to train your neural ODE: the importance of determinism and regularization. In International Conference on Machine Learning (ICML), 2020.

Will Grathwohl, Ricky TQ Chen, Jesse Bettencourt, Ilya Sutskever, and David Duvenaud. FFJORD: Free-form continuous dynamics for scalable reversible generative models. In International Conference on Learning Representations (ICLR), 2019.

Richard Holley and Daniel Stroock. Logarithmic Sobolev inequalities and stochastic Ising models. Journal of Statistical Physics, 46(5-6):1159–1194, 1987.

Michael F Hutchinson. A stochastic estimator of the trace of the influence matrix for Laplacian smoothing splines. Communications in Statistics-Simulation and Computation, 18(3):1059–1076, 1989.

Yaron Lipman, Ricky TQ Chen, Heli Ben-Hamu, Maximilian Nickel, and Matt Le. Flow matching for generative modeling. In International Conference on Learning Representations (ICLR), 2023.

Grégoire Loeper. On the regularity of solutions of optimal transportation problems. Acta Mathematica, 202(2):241–283, 2009.

Nick Pawlowski, Daniel C Castro, and Ben Glocker. Deep structural causal models for tractable counterfactual inference. In Advances in Neural Information Processing Systems (NeurIPS), volume 33, pages 857–869, 2020.

Judea Pearl. Causality. Cambridge University Press, 2009.

Jonas Peters, Joris M Mooij, Dominik Janzing, and Bernhard Schölkopf. Causal discovery with continuous additive noise models. The Journal of Machine Learning Research, 15(1):2009–2053, 2014.

Geoffrey Schiebinger, Jian Shu, Marcin Tabaka, et al. Optimal-transport analysis of single-cell gene expression identifies developmental trajectories in programmed reprogramming. Cell, 176(4):928–943, 2019.

Yang Song, Jascha Sohl-Dickstein, Diederik P Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score-based generative modeling through stochastic differential equations. In International Conference on Learning Representations (ICLR), 2021.

Cédric Villani et al. Optimal Transport: Old and New, volume 338. Springer, 2009.
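As a numerical companion to the closed-form Euclidean limit of Appendix B.8, the sketch below integrates the Riccati ODE (55) together with Liouville's formula using a hand-written RK4 step and compares the result with (56)–(58). It is a minimal illustration in plain Python, not the authors' JAX implementation; the helper names and the values $n = 100$, $\lambda_0 = 50$ are assumptions picked for the example.

```python
import math


def theta_closed(t, n, lam0):
    """Closed-form expansion scalar, Eq. (56)."""
    return n * lam0 / (lam0 * t - n)


def jacobian_closed(t, n, lam0):
    """Closed-form Jacobian determinant, Eq. (57)."""
    return (1.0 - lam0 * t / n) ** n


def integrate_riccati(n, lam0, t_end, steps=20000):
    """RK4 for theta' = -theta^2 / n and (log J)' = theta, theta(0) = -lam0."""
    def rhs(theta):
        # Returns (d theta / dt, d log J / dt) at the given theta.
        return -theta * theta / n, theta

    h = t_end / steps
    theta, log_j = -lam0, 0.0
    for _ in range(steps):
        a1, b1 = rhs(theta)
        a2, b2 = rhs(theta + 0.5 * h * a1)
        a3, b3 = rhs(theta + 0.5 * h * a2)
        a4, b4 = rhs(theta + h * a3)
        theta += h * (a1 + 2.0 * a2 + 2.0 * a3 + a4) / 6.0
        log_j += h * (b1 + 2.0 * b2 + 2.0 * b3 + b4) / 6.0
    return theta, math.exp(log_j)


# Hypothetical example values: dimension n and initial contraction lam0.
n, lam0 = 100, 50.0
t_c = n / lam0                    # blow-up time, Eq. (58)
t_eval = 0.5 * t_c                # stay safely before the singularity

theta_num, jac_num = integrate_riccati(n, lam0, t_eval)
print("t_c =", t_c)
print("theta: numeric", theta_num, "closed form", theta_closed(t_eval, n, lam0))
print("J:     numeric", jac_num, "closed form", jacobian_closed(t_eval, n, lam0))
```

Evaluating closer to $t_c$ requires more steps, since $\theta$ diverges there; the comparison is only meaningful strictly before the blow-up time.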

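The variance identity (50) of Lemma 17 can be checked exactly on a small example: for a symmetric toy Jacobian, enumerating all $2^n$ Rademacher vectors gives the exact mean and variance of $z^T A z$, which can then be compared with the trace and with $2\|A\|_F^2 - 2\sum_i A_{ii}^2$. The matrix below is hypothetical (a strongly contracting diagonal with small off-diagonal shear) and is not taken from the paper's experiments; the function names are likewise assumptions of this sketch.

```python
import itertools


def quad_form(a, z):
    """Hutchinson sample z^T A z for one probe vector z."""
    n = len(z)
    return sum(z[i] * a[i][j] * z[j] for i in range(n) for j in range(n))


def hutchinson_moments(a):
    """Exact mean and variance of z^T A z over all 2^n Rademacher vectors."""
    n = len(a)
    vals = [quad_form(a, z) for z in itertools.product((-1.0, 1.0), repeat=n)]
    mean = sum(vals) / len(vals)
    var = sum((v - mean) ** 2 for v in vals) / len(vals)
    return mean, var


def variance_formula(a):
    """Right-hand side of Eq. (50); valid for symmetric A."""
    n = len(a)
    frob2 = sum(a[i][j] ** 2 for i in range(n) for j in range(n))
    diag2 = sum(a[i][i] ** 2 for i in range(n))
    return 2.0 * frob2 - 2.0 * diag2


def chebyshev_bound(trace, var):
    """Upper bound from Eq. (51) on P(|estimate - trace| >= |trace| / 2)."""
    return 4.0 * var / (trace * trace)


# Hypothetical symmetric velocity gradient (u = grad psi makes it a Hessian).
A = [
    [-3.0, 0.2, 0.0, 0.1],
    [0.2, -2.5, 0.3, 0.0],
    [0.0, 0.3, -4.0, 0.2],
    [0.1, 0.0, 0.2, -3.5],
]

trace = sum(A[i][i] for i in range(len(A)))
mean, var = hutchinson_moments(A)
print("trace:", trace, "exact estimator mean:", mean)
print("exact variance:", var, "formula (50):", variance_formula(A))
print("Chebyshev bound (51):", chebyshev_bound(trace, var))
```

Scaling the diagonal of `A` while holding the off-diagonal entries fixed makes the Chebyshev bound shrink quadratically, which is the mechanism the proof of Lemma 17 relies on.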