Hierarchy of extreme-event predictability in turbulence revealed by machine learning

Yuxuan Yang
National University of Singapore, Singapore
y.yang.65@u.nus.edu

Chenyu Dong
National University of Singapore, Singapore
chenyu.dong@u.nus.edu

Gianmarco Mengaldo
National University of Singapore, Singapore
mpegim@nus.edu.sg

ABSTRACT

Extreme-event predictability in turbulence is strongly state dependent, yet event-by-event predictability horizons are difficult to quantify without access to governing equations or costly perturbation ensembles. Here we train an autoregressive conditional diffusion model on direct numerical simulations of the two-dimensional Kolmogorov flow and use a CRPS-based skill score to define an event-wise predictability horizon. Enstrophy extremes exhibit a pronounced hierarchy: forecast skill persists from ≈ 1 to > 4 Lyapunov times across events. Spectral filtering shows that these horizons are controlled predominantly by large-scale structures. Extremes are preceded by intense strain cores organizing quadrupolar vortex packets, whose lifetime sharply separates long- from short-horizon events. These results identify coherent-structure persistence as a governing mechanism for the predictability of turbulence extremes and provide a data-driven route to diagnose predictability limits from observations.

1 Introduction

Predicting the future evolution of natural phenomena such as the Earth's atmosphere is central to science and carries major societal and economic implications [1]. Such systems are high-dimensional and multiscale (often with intrinsic or effective stochasticity), and their chaotic dynamics limits predictability to a finite horizon [2, 3, 4, 5, 6, 7]. These limits are most consequential for extreme events, i.e., rare, high-impact excursions far from the system's normal behavior, where early warning is essential [8, 9].

Recent numerical studies indicate that extreme-event predictability is strongly event dependent: different extremes can exhibit markedly different predictability [10]. This variability suggests that extremes are not universally less predictable and may be controlled by identifiable physical mechanisms [11]. However, the origins of this event-to-event spread in predictability remain poorly understood. For instance, in the context of fluid mechanics, it is unclear what flow organizations (if any) separate more predictable extremes from rapidly unpredictable ones. A key open question is whether such differences can be diagnosed directly from data, without access to the governing equations. Addressing this question is nontrivial because extremes are rare and therefore difficult to learn and characterize reliably with data-driven models [12, 13]. Nevertheless, equation-free predictability diagnostics are highly desirable, both for realistic systems where only observations are available and for building mechanistic understanding that can inform forecasting and control [14].

To place this question in context, it is useful to recall how predictability is traditionally quantified in dynamical systems. The evaluation of predictability traces back to the sensitivity to initial conditions introduced by Lorenz [6], later formalized through the largest Lyapunov exponent [15], which characterizes the exponential growth of infinitesimal perturbations. Subsequently, various extensions have been developed to quantify predictability over finite temporal scales, thereby enabling the investigation of local predictability in nonlinear regimes [16, 17, 18, 19, 20]. In practice, many of these approaches are inherently ensemble-dependent, where multiple trajectories are generated by perturbing the initial conditions of a known forward operator [21].
Such requirements pose significant challenges: the governing equations are often unknown in realistic settings, the computational cost of ensembles can be prohibitive, and results frequently exhibit high sensitivity to the choice of initial perturbations [10].

Data-driven approaches offer equation-free alternatives that bypass the need for an explicit forward operator. These methods include local dynamical indices rooted in extreme value theory [22, 23, 24], recent state-dependent predictability metrics such as time-lagged recurrence [25], and information-theoretic measures that quantify predictability through divergences between forecast and climatological distributions [26, 27, 28, 29]. These approaches characterize predictability as a property of the system or of typical states, but not as a forecast problem conditioned on a particular event. As a result, they do not provide an event-wise score curve from which a concrete event-wise predictability horizon (also referred to as limit) can be extracted.

Recent advances in machine learning provide a paradigm shift, whereby data-driven approaches can directly learn the evolution of forecast distributions [30, 31], effectively acting as surrogate models. This new route enables the explicit estimation of event-conditioned predictability limits, bringing ensemble-style forecast verification into an equation-free, data-driven setting.

In this work, we introduce a diffusion-based ensemble forecasting framework that implements this principle and allows quantifying event-wise predictability limits of extreme events from data. Forecast distributions are generated by an autoregressive conditional diffusion model [30], enabling stable probabilistic predictions at long lead times.
These forecast distributions are evaluated using probabilistic scoring rules, and an event-wise predictability horizon is defined from the lead-time decay of forecast skill, expressed in units of the Lyapunov time.

Applying the proposed framework to a well-known two-dimensional turbulence testbed, the Kolmogorov flow, we find a pronounced hierarchy of predictability across extremes, with predictability horizons spanning from ∼ 1 to > 4 Lyapunov times. The resulting separation into high- and low-predictability events is consistent with rankings obtained from DNS perturbation ensembles, supporting the physical relevance of the data-driven estimates. Scale-filtering further shows that these horizons are controlled predominantly by large-scale modes. Finally, coherent-structure analysis identifies the precursor organization responsible for this hierarchy: extreme events are anchored to intense strain cores that organize quadrupolar vortices, and the temporal persistence of these quadrupoles distinguishes highly predictable extremes from those with short predictability. Together, these findings identify coherent-structure persistence as the mechanism setting extreme-event predictability horizons and show that event-wise, skill-based diagnostics provide a practical route to quantify these predictability limits directly from data.

2 Methodology

We consider a homogeneous and isotropic turbulent flow contained in a doubly periodic square domain of area $L^2 = (2\pi)^2$, driven by a sinusoidal body force with a specified wavenumber.
This system is known as the Kolmogorov flow [32], and it is governed by the incompressible Navier-Stokes equations

$$\nabla \cdot \mathbf{u} = 0, \tag{1a}$$

$$\frac{\partial \mathbf{u}}{\partial t} + \mathbf{u} \cdot \nabla \mathbf{u} = -\nabla p + \nu \nabla^2 \mathbf{u} + \mathbf{f}, \tag{1b}$$

where $\mathbf{u}$ is the velocity field, $p$ is the kinematic pressure, $\nu$ is the kinematic viscosity, and $\mathbf{f}$ is the forcing term defined as

$$\mathbf{f}(y) = \left( f_0 \left( \frac{L}{2\pi} \right) \cos\left( \frac{2\pi n_f}{L}\, y \right),\; 0 \right), \tag{2}$$

with forcing amplitude $f_0$ and integer forcing mode $n_f = 4$ (corresponding to physical wavenumber $k_f = 2\pi n_f / L$). The forcing term in Eq. (2) acts only in the $x$-direction. Following Refs. [33, 34], we define the Reynolds number as

$$\mathrm{Re} = \frac{\sqrt{f_0}\,(L/2\pi)^2}{\nu}, \tag{3}$$

and set $\mathrm{Re} = 100$, a regime known to exhibit extreme events. The two-dimensional incompressible Navier-Stokes equations (1) are solved by direct numerical simulation (DNS) on a $128 \times 128$ grid in a doubly periodic domain of size $L^2 = (2\pi)^2$.

Extreme events in this flow configuration take the form of intermittent bursts of energy dissipation, and are identified as sudden surges in the instantaneous spatially averaged enstrophy $\Omega = \langle \omega^2 \rangle$, with $\omega$ the vorticity. Specifically, we define extreme events as those states with enstrophy $\Omega$ belonging to the top 1% of the enstrophy distribution in the dataset. This criterion corresponds to a threshold value of approximately $\Omega_{\mathrm{th}} \approx 8.5$.

Using this deterministic DNS dataset, we construct a diffusion-based autoregressive probabilistic model that learns the conditional distribution of future velocity fields given the current state, i.e., $p(x_{t+1} \mid x_t)$ [35, 30]. The model follows an autoregressive denoising diffusion formulation trained on velocity-field evolution, implemented with a U-Net backbone [36] and several established minor architectural modernizations [37].
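
The extreme-event criterion described above can be illustrated with a short sketch. It assumes vorticity snapshots stored as a NumPy array of shape (T, N, N); the function names, and the choice to treat each contiguous excursion above the 99th percentile as a single event located at its enstrophy maximum, are illustrative assumptions rather than the procedure used in the paper.

```python
import numpy as np

def enstrophy_series(omega):
    """Spatially averaged enstrophy <omega^2> for snapshots of shape (T, N, N)."""
    omega = np.asarray(omega, dtype=float)
    return np.mean(omega ** 2, axis=(1, 2))

def extreme_event_peaks(series, quantile=0.99):
    """Return (threshold, peak indices) for excursions above a quantile.

    Assumption: one event per contiguous run of samples strictly above the
    threshold, located at the maximum of the series within that run.
    """
    series = np.asarray(series, dtype=float)
    threshold = float(np.quantile(series, quantile))
    above = series > threshold
    peaks, start = [], None
    for i, flag in enumerate(above):
        if flag and start is None:
            start = i  # an excursion begins here
        if start is not None and (not flag or i == len(series) - 1):
            end = i + 1 if flag else i  # exclusive end of the excursion
            peaks.append(start + int(np.argmax(series[start:end])))
            start = None
    return threshold, peaks
```

Applied to the enstrophy time series, the returned indices play the role of the peak times $t_e$ used later to align forecasts.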
Although the training data consist of deterministic realizations, the diffusion model learns the conditional distribution of future states given the present state, thereby providing a stochastic representation of the flow dynamics. Ensemble forecasts are then generated by sampling from the learned distribution, without explicitly introducing perturbations to the initial conditions. Once trained, the model recursively samples future states, enabling probabilistic forecasts over arbitrarily long horizons and facilitating event-wise predictability assessment at extended lead times. More details regarding the model are provided in the Supplemental Material.

Forecast quality is assessed with the continuous ranked probability score (CRPS) [38, 39]. For an extreme event $e$ whose enstrophy attains its maximum at time $t_e$, we assess the forecast for the scalar observable $\Omega(t_e)$ using predictions initialized at time $t = t_e - T$, i.e., at lead time $T = t_e - t$ (in Sec. 3, we equivalently write $T = -(t - t_e)$ to represent time relative to the event peak). We evaluate forecast skill at the enstrophy peak (in analogy with Ref. [10]) rather than at the threshold-crossing time because the latter depends on the arbitrary choice of the detection threshold and shifts when that threshold is varied. In contrast, the peak time $t_e$ is uniquely defined, ensuring that the resulting predictability estimates are robust to the threshold used to identify extremes. The CRPS for event $e$ at lead time $T$ is given by

$$\mathrm{CRPS}^{T}_{e}(D, \Omega_e) = \int_{\mathbb{R}} \left( F_D(\Omega) - H(\Omega - \Omega_e) \right)^2 \, d\Omega, \tag{4}$$

where $D$ is the distribution of the forecasted enstrophy, $F_D$ is its cumulative distribution function, $H$ is the unit step function, and $\Omega_e \in \mathbb{R}$ is the ground-truth enstrophy at the event peak. A vanishing $\mathrm{CRPS}^{T}_{e}$ in Eq. (4) corresponds to a perfect probabilistic forecast, whereas large values indicate degraded predictive performance due to either substantial bias between the predicted mean and the ground truth or excessive forecast spread.

To determine whether a forecast at lead time $T$ retains skill for predicting extreme events, we compare $\mathrm{CRPS}^{T}_{e}$ against a reference $\mathrm{CRPS}_{\mathrm{ref}}$ defined as

$$\mathrm{CRPS}_{\mathrm{ref}} = \int_{\mathbb{R}} \left( F_L(\Omega) - H(\Omega - \Omega_{\mathrm{th}}) \right)^2 \, d\Omega, \tag{5}$$

where $F_L$ is the cumulative distribution function of the long-term distribution of enstrophy (denoted by the subscript $L$), and $\Omega_{\mathrm{th}}$ is the threshold of extreme events. The reference score defined in Eq. (5) represents the best performance attainable under climatological prediction, since using the extreme-event threshold yields the lower bound of the CRPS in the absence of state-dependent information. Using Eqs. (4) and (5), we introduce the event-wise predictability score for event $e$ at lead time $T$,

$$S_e(T) = 1 - \frac{\mathrm{CRPS}^{T}_{e}}{\mathrm{CRPS}_{\mathrm{ref}}}. \tag{6}$$

The event-wise predictability score $S_e(T)$ has an upper bound of one, with $S_e(T) = 1$ corresponding to a perfect forecast. Values $S_e(T) \le 0$ indicate that the forecast skill in predicting extreme events is no better than that obtained from the long-term distribution (i.e., climatology). The score $S_e(T)$ measures forecast skill relative to a reference prediction, in the spirit of the continuous ranked probability skill score (CRPSS) [40]. The key difference here is that $S_e(T)$ is computed for individual extreme events and specific lead times, rather than as an average over an entire dataset. This allows us to directly assess the predictability of particular extreme events, enabling further event-wise analysis.

The predictability horizon (also referred to as limit) $T^*_e$ for event $e$ is then the largest lead time $T$ such that $S_e(T) > 0$; equivalently, it is the lead time at which the forecast $\mathrm{CRPS}^{T}_{e}$ degrades to the long-term reference $\mathrm{CRPS}_{\mathrm{ref}}$.
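
Eqs. (4)-(6) and the horizon definition can be sketched for finite ensembles as follows. This is a minimal illustration, not the authors' implementation: it assumes that both the forecast distribution $D$ and the long-term distribution are represented by samples, and it evaluates the CRPS of the resulting empirical CDF through the standard identity $\mathrm{CRPS} = \mathbb{E}|X - y| - \tfrac{1}{2}\mathbb{E}|X - X'|$. All function names are hypothetical.

```python
import numpy as np

def crps_ensemble(samples, obs):
    """CRPS of the empirical CDF of `samples` against a scalar `obs` (Eq. 4)."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    term_obs = np.mean(np.abs(x - obs))
    # 0.5 * mean_{i,j} |x_i - x_j|, computed from the sorted sample in O(n log n)
    ranks = np.arange(1, n + 1)
    term_spread = np.sum((2 * ranks - n - 1) * x) / n**2
    return float(term_obs - term_spread)

def skill_score(forecast_samples, obs_peak, climatology_samples, threshold):
    """Event-wise score S_e(T) = 1 - CRPS_e^T / CRPS_ref (Eqs. 5 and 6)."""
    crps_forecast = crps_ensemble(forecast_samples, obs_peak)
    crps_ref = crps_ensemble(climatology_samples, threshold)
    return 1.0 - crps_forecast / crps_ref

def predictability_horizon(lead_times, scores):
    """Largest lead time with S_e(T) > 0; returns 0.0 if no lead time is skilful."""
    lead_times = np.asarray(lead_times, dtype=float)
    scores = np.asarray(scores, dtype=float)
    skilful = lead_times[scores > 0.0]
    return float(skilful.max()) if skilful.size else 0.0
```

In use, `skill_score` would be evaluated once per event and lead time, with `forecast_samples` holding the ensemble of predicted peak enstrophies, and the resulting curve passed to `predictability_horizon`.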
In practice, we evaluate $S_e(T)$ for each detected extreme event using different lead times. The statistics of the resulting predictability limits $T^*_e$ are expressed both in units of the system's Lyapunov time $T_\lambda = 3.6/\sqrt{f_0}$ [10] and in simulation time units.

3 Results

Fig. 1 displays $S_e(T)$ (blue colormap) along with the associated predictability limits $T^*_e$ for each extreme event in the test set (i.e., $N_{\mathrm{events}} = 211$). We also categorize extreme events into three regimes (right axis): a high-predictability regime ($T^*_e > 3T_\lambda$), a low-predictability regime (bottom 8% quantile, chosen to approximately match the number of events in the high-predictability regime), and an intermediate regime comprising the remaining events. The horizontal axis at the bottom shows the time relative to the event, $t - t_e \le 0$, in simulation time units, while the top axis shows the same time normalized by the Lyapunov time $T_\lambda$. Note that increasingly negative values of $t - t_e$ correspond to longer lead times, i.e., earlier predictions. The figure demonstrates that predictability rapidly declines with increasing lead time.

Figure 1: Predictability evolution of extreme events prior to occurrence. The heatmap shows $S_e(T)$ against $t - t_e$ (bottom axis) and normalized time $(t - t_e)/T_\lambda$ (top axis), with darker shadings indicating higher predictability. Vertical ticks to the left of each shading mark the individual predictability limit (horizon) $T^*_e$ for each corresponding extreme event. The box plot above the heatmap summarizes the distribution of the event-wise predictability score $S_e(T)$ for all events. The right axis indicates the predictability regime of each event (high predictability, medium predictability, or low predictability) based on its corresponding $T^*_e$.

Most extreme events lose predictability within approximately 7 simulation time units, corresponding to 1-2 Lyapunov times ($T_\lambda$).
This is visible in the heatmap as the colors transition from dark blue (high $S_e(T)$) on the right (short lead time) to light blue (low $S_e(T)$) on the left (long lead time). However, certain extreme events exhibit remarkably long predictability limits, extending up to 15 simulation time units, corresponding to more than 4 Lyapunov times.

Given this event-dependent nature of predictability, which the proposed framework allowed us to detect, we ask: what drives the different predictability horizons of different events?

To answer this question, we first examine whether the predictability of extreme events in the Kolmogorov flow is governed primarily by large-scale coherent structures. To this end, we repeat the forecasts obtained with the diffusion-based autoregressive probabilistic model after systematically removing small-scale components from the initial conditions. More specifically, we apply an isotropic spectral cutoff in Fourier space, where modes with radial wavenumber $k = (k_x^2 + k_y^2)^{1/2} > k_c$ are removed. For our computational domain ($L = 2\pi$, $N = 128$), the cutoff is defined as $k_c = r_c k_{\max}$, with $r_c$ being the cutoff ratio and $k_{\max} = N/2 = 64$. By progressively decreasing $r_c$, we can determine the largest spatial scale whose removal does not alter the event-wise predictability limits.

Fig. 2 illustrates how predictability changes under this filtering; the selected physical wavelengths $\lambda_c = 2\pi/k_c$ and corresponding cutoff ratios are listed on each subfigure (these values are also reported in a table in the Supplemental Material for the reader's convenience). The physical wavelength $\lambda_c/L$ is increased linearly, corresponding to a decrease in the cutoff ratio $r_c$, thereby progressively removing larger spatial structures.

Fig. 2(a) depicts the scatter plots of filtered versus original (i.e., without filtering) predictability limits $T^*_e$. We observe that the predictability limits after filtering remain similar to the original ones for low-predictability (blue dots) and medium-predictability (grey dots) extreme events up to a physical wavelength $\lambda_c = 0.30L$ (the points representing extreme events in the scatter plots remain clustered along the diagonal). For high-predictability extreme events (orange dots), we start to see a breakdown in predictability slightly earlier, at physical wavelengths between $\lambda_c = 0.25L$ and $\lambda_c = 0.30L$. As a whole, extreme events are insensitive to small-scale filtering up to $\lambda_c = 0.30L$, as shown by the relatively high correlation coefficients ($R \ge 0.84$). When removing larger scales corresponding to wavelength $\lambda_c = 0.30L$, we start seeing a breakdown in predictability across all extreme events (i.e., high-, medium- and low-predictability events), as shown by the significantly degraded correlation coefficient ($R \le 0.73$) that is reflected by the spread of points off the diagonal.

To further assess this behavior, in Fig. 2(b) we show the evolution of the average predictability limits $\langle T^*_e \rangle$ for each extreme event category (high-, medium-, low-predictability) as a function of the filter wavelength $\lambda_c/L$. High-predictability events (orange line) retain long forecast horizons (limits) until structures with physical wavelengths between $\lambda_c = 0.25L$ and $\lambda_c = 0.30L$ are removed, consistent with the results shown in Fig. 2(a). Once structures with a physical wavelength $\lambda_c = 0.30L$ are removed, we start seeing a breakdown in predictability across all events, namely high-, medium-, and low-predictability.
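
The isotropic spectral cutoff used in these filtering experiments can be sketched as below. The sketch assumes a square $N \times N$ field on the $L = 2\pi$ periodic domain, so that Fourier modes carry integer wavenumbers and $\lambda_c / L = 1/k_c$; the function names are hypothetical and the snippet is only meant to make the definitions of $k_c$, $r_c$ and $\lambda_c$ concrete.

```python
import numpy as np

def cutoff_ratio_from_wavelength(lambda_over_L, n):
    """Cutoff ratio r_c = k_c / k_max for a wavelength lambda_c / L (L = 2*pi)."""
    k_c = 1.0 / lambda_over_L   # lambda_c = 2*pi / k_c and L = 2*pi
    return k_c / (n / 2)        # k_max = N / 2

def lowpass_filter(field, r_c):
    """Remove Fourier modes with radial wavenumber k > k_c = r_c * k_max."""
    field = np.asarray(field, dtype=float)
    n = field.shape[0]
    k = np.fft.fftfreq(n, d=1.0 / n)              # integer wavenumbers
    kx, ky = np.meshgrid(k, k, indexing="xy")     # array layout is [y, x]
    k_radial = np.sqrt(kx**2 + ky**2)
    k_c = r_c * (n // 2)
    field_hat = np.fft.fft2(field)
    field_hat[k_radial > k_c] = 0.0               # sharp isotropic cutoff
    return np.fft.ifft2(field_hat).real
```

For example, on the $N = 128$ grid a cutoff wavelength $\lambda_c = 0.25L$ corresponds to $k_c = 4$, i.e., the forcing wavenumber $n_f$, and `cutoff_ratio_from_wavelength(0.25, 128)` gives $r_c = 4/64$.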

The critical scale, $\lambda_c \approx 0.30L$, is identified by comparing the distribution of predictability limits at each filtering scale with the baseline (unfiltered) distribution using the Wasserstein distance, with statistical significance assessed via a paired permutation test (see Supplemental Material). We further note that the breakdown wavelength identified through statistical analysis corresponds to a degradation of structural coherence in the flow, as illustrated in Fig. 2(c). Specifically, we compare the flow field for two filtering scales, namely $\lambda_c = 0.016L$ (i.e., no filtering) and $\lambda_c = 0.35L$. The baseline flow exhibits sharp coherent structures, whereas strong filtering retains only smooth vortex clusters. In particular, the removal of structures with characteristic wavelengths of order $\lambda_c/L \approx 0.3$ eliminates the distinct flow organization, leading to a rapid loss of predictability.

Figure 2: Influence of spatial filtering scale $r_c$ on predictability limit $T^*_e$. (a) Scatter plots of filtered versus baseline $T^*_e$. (b) Evolution of $T^*_e$ for high (orange) and low (blue) predictability limits events. The vertical dashed area ($\lambda_c/L = 0.3$) marks the region of predictability breakdown. (c) Visualization of structural degradation in vorticity fields, with the baseline presented on the right and the breakdown scale ($\lambda_c/L = 0.35$) on the left.

The negligible sensitivity of predictability to the removal of small-scale structures shown in Fig. 2 can be understood in terms of the energetic hierarchy of two-dimensional turbulence [41]. While small-scale motions contribute substantially to enstrophy, they carry only a minor fraction of the total kinetic energy and are therefore less influential in shaping the large-scale flow evolution [42]. As a result, filtering out high-wavenumber modes primarily removes fine-scale fluctuations without substantially altering the dominant coherent structures [43].
Predictability is therefore constrained by the evolution of energy-dominant large-scale structures, which persist over longer timescales and govern the growth of forecast errors.

These observations naturally raise the question of which coherent structures are responsible for extreme events, and which of their characteristics control the predictability limit. We therefore examine the flow structure associated with extreme events and its relation to predictability.

Figure 3: (a) Ensemble-averaged evolution of $\Omega$ (green, left axis) and $|Q_{\min}|$ (purple, right axis), with shaded regions denoting one standard deviation. The insets display representative snapshots of the $Q$-criterion and vorticity at the peak of the extreme event. (b) Temporal evolution of coherent-structure statistics within the analysis window centered on the strain core, defined by the maximum of $|Q_{\min}|$. (c) Cumulative distribution functions (CDFs) of structural lifetimes $\tau$ for high-predictability (orange) and low-predictability (blue) events. The inset shows the bootstrap distribution of the mean lifetime difference $\Delta\tau = \langle\tau\rangle_{\mathrm{high}} - \langle\tau\rangle_{\mathrm{low}}$, yielding $p = 0.017$.

Fig. 3(a) presents the composite temporal evolution of enstrophy (green line) and the local minimum of the $Q$-criterion (purple line), conditionally averaged over all detected extreme events and aligned at the event time. Both quantities exhibit pronounced variations near the extreme, with the intensification of negative $Q$ (shown as $|Q_{\min}|$, purple curve) systematically preceding the peak in enstrophy (green curve). This temporal offset reveals a strong correlation between intense strain-dominated regions and subsequent enstrophy amplification, suggesting that the emergence of a strain core acts as a precursor to enstrophy bursts [44].
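
For readers unfamiliar with the diagnostic, the following sketch computes a $Q$-criterion field and locates a strain core as the position of its minimum. It assumes the common definition $Q = -\tfrac{1}{2}\,\partial_i u_j\,\partial_j u_i$ (positive where rotation dominates, negative where strain dominates), spectral derivatives on a doubly periodic grid, and arrays indexed as [y, x]; the normalization and numerical details used in the paper may differ, and the function names are hypothetical.

```python
import numpy as np

def q_criterion_2d(u, v, L=2 * np.pi):
    """Q = -0.5 * (u_x^2 + 2 u_y v_x + v_y^2) on a periodic N x N grid."""
    n = u.shape[0]
    k = 2 * np.pi * np.fft.fftfreq(n, d=L / n)    # angular wavenumbers
    kx, ky = np.meshgrid(k, k, indexing="xy")     # kx along axis 1, ky along axis 0

    def ddx(f):
        return np.fft.ifft2(1j * kx * np.fft.fft2(f)).real

    def ddy(f):
        return np.fft.ifft2(1j * ky * np.fft.fft2(f)).real

    ux, uy, vx, vy = ddx(u), ddy(u), ddx(v), ddy(v)
    return -0.5 * (ux**2 + 2.0 * uy * vx + vy**2)

def strain_core(q):
    """Grid indices (iy, ix) of the most strain-dominated point, min(Q)."""
    return np.unravel_index(np.argmin(q), q.shape)
```

On a Taylor-Green-type cellular flow, this gives $Q > 0$ at the vortex centers and $Q < 0$ at the saddle points between them, which is the sense in which negative $Q$ marks strain-dominated regions.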
As shown in the inset snapshots, while the vorticity field is spatially complex and fragmented, the strain core identified by the minimum of $Q$ provides a clear and localized reference point around which dynamically relevant coherent structures organize. This makes the strain core a more effective anchor than vorticity extrema for identifying the flow structures that govern extreme events and influence their predictability.

Motivated by this observation, we apply a structure classifier to the flow field within a local window of size $\lambda_c/L \approx 0.3$ centered on the strain core. This wavelength corresponds to the statistically identified critical length at which predictability begins to break down (Fig. 2). Based on the number of dynamically significant vortex cores contained within this window, identified after applying an intensity threshold to exclude weak vortices, the local flow configuration is classified into four categories: isolated (0–1 vortex), dipole (2 vortices), quadrupole (3–4 vortices), and complex configurations (more than 4 vortices). The 3–4 category is labeled as quadrupole because quadrupolar arrangements may occasionally appear with only three detected cores when one vortex falls below the intensity threshold or lies partially outside the detection window. Fig. 3(b) shows the temporal evolution of the relative occurrence of these structure types, demonstrating that quadrupoles become increasingly dominant as the system approaches the enstrophy peak. This trend suggests that quadrupoles constitute a characteristic dynamical configuration underlying extreme events, motivating a closer examination of their temporal persistence and their impact on predictability. We therefore quantify the structural lifetime of quadrupolar configurations in the vicinity of the strain core and examine its relationship to event predictability.
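
The classification rule above can be sketched in a few lines. The mapping from core count to category follows the text; everything else is an assumption made for illustration: cores are taken to be 4-connected regions of a thresholded boolean mask inside the window, and the lifetime is read as the length of the unbroken run of quadrupole labels ending at the enstrophy peak. The paper's actual vortex-detection procedure is not specified here, and the function names are hypothetical.

```python
def classify_structure(n_cores):
    """Map the number of detected vortex cores to a structure category."""
    if n_cores <= 1:
        return "isolated"
    if n_cores == 2:
        return "dipole"
    if n_cores <= 4:
        return "quadrupole"
    return "complex"

def count_cores(mask):
    """Count 4-connected regions of truthy cells in a 2D list-of-lists mask.

    Assumption: `mask` marks cells inside the analysis window whose vortex
    intensity exceeds the chosen threshold.
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for i in range(rows):
        for j in range(cols):
            if mask[i][j] and not seen[i][j]:
                count += 1
                seen[i][j] = True
                stack = [(i, j)]
                while stack:  # flood fill of one connected region
                    a, b = stack.pop()
                    for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        p, q = a + da, b + db
                        if (0 <= p < rows and 0 <= q < cols
                                and mask[p][q] and not seen[p][q]):
                            seen[p][q] = True
                            stack.append((p, q))
    return count

def quadrupole_lifetime(core_counts, dt):
    """Duration of the continuous quadrupole run ending at the peak.

    `core_counts` is time-ordered and its last entry is the enstrophy peak;
    the result is zero when the peak configuration is not a quadrupole.
    """
    steps = 0
    for n_cores in reversed(core_counts):
        if classify_structure(n_cores) != "quadrupole":
            break
        steps += 1
    return steps * dt
```

A sequence of core counts such as `[1, 2, 4, 3, 4]` sampled every `dt` therefore yields a lifetime of `3 * dt`, while `[4, 4, 5]` yields zero.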
The structural lifetime $\tau$ is defined as the continuous time interval over which a quadrupolar structure persists prior to the enstrophy maximum in a Lagrangian frame; events whose peak configuration is not classified as a quadrupole are assigned a lifetime of zero. The same intensity threshold and window size as in Fig. 3(b) are employed to ensure consistency of the structural classification.

As demonstrated in Fig. 3(c), a pronounced separation in structural persistence emerges between high-predictability (orange) and low-predictability (blue) events. The cumulative distribution functions (CDFs) show that high-predictability events are systematically associated with longer-lived quadrupoles. This separation is further supported by a one-sided bootstrap test of the lifetime difference, $\Delta\tau = \langle\tau\rangle_{\mathrm{high}} - \langle\tau\rangle_{\mathrm{low}}$ (inset), testing the hypothesis $\Delta\tau > 0$, with $p = 0.017$ based on 20,000 bootstrap resamples. These results indicate that the temporal stability of quadrupoles plays an important role in shaping the predictability of extreme events in Kolmogorov flow. The persistence of the quadrupole suggests a structural resilience against the turbulent background, which limits local error growth compared to low-predictability events [45]. Consequently, the flow evolution with long-lived quadrupoles remains comparatively more predictable, explaining the observed persistence gap. Notably, this enhanced predictability is not associated with the largest instantaneous enstrophy values. Instead, the longest predictable extreme events typically exhibit moderate enstrophy peaks accompanied by long-lived coherent structures. By contrast, the least predictable events tend to coincide with the strongest enstrophy bursts accompanied by short-lived coherent structures, reflecting a general loss of structural coherence and predictability.

We further verified that the distribution of predictability limits is robust across different diffusion architectures.
We tested three denoising architectures: a U-Net with ConvNeXt blocks, a U-Net with ConvNeXt blocks and attention, and a U-Net with ResNet blocks and attention. All three yield nearly identical statistics of $T^*_e$ (see Supplemental Material). These results demonstrate that the observed predictability limits and event-wise hierarchies are governed primarily by the underlying flow dynamics rather than by model-specific biases.

To validate the predictability estimates obtained with the diffusion models, we perform DNS ensemble simulations via perturbations of the initial conditions for a subset of extreme events, initializing $2T_\lambda$ prior to the events' peak. Despite the fundamentally different constructions of the two approaches, we find that the diffusion-based predictability estimates qualitatively align with the divergence behavior observed in DNS ensembles (see Supplemental Material).

4 Conclusions

In this work, we introduced a diffusion-based ensemble framework to quantify predictability in complex systems from a data-driven, event-wise perspective. Rather than relying on predefined perturbation ensembles or explicit assumptions about the underlying dynamics, the framework provides a statistical characterization of forecast uncertainty directly from data, enabling scalable and reproducible predictability analyses.

Applied to the two-dimensional turbulent Kolmogorov flow, our results reveal a pronounced heterogeneity in the predictability of extreme events, with predictability limits spanning from approximately 1 to more than 4 Lyapunov times. Crucially, this variability cannot be explained by event intensity alone. Instead, highly predictable extremes are consistently associated with persistent, large-scale coherent structures, whereas the least predictable events coincide with intense, small-scale-dominated bursts and a loss of structural coherence.
Through scale-filtering and structural analyses, we demonstrate that predictability is ultimately constrained by the evolution and persistence of energy-dominant large-scale flow structures. Long-lived quadrupolar configurations emerge as a representative example of such structures, providing a concrete physical link between flow organization and extended predictability horizons.

More broadly, these findings reveal that the predictability of extreme events in turbulent flows is fundamentally constrained by the persistence of coherent structures that organize the flow dynamics. By linking event-wise predictability to the structural stability of quadrupoles, the proposed framework provides a route to uncover predictability hierarchies of extreme events and their structural origins in high-dimensional chaotic systems, with implications for forecasting, uncertainty quantification, and extreme-event analysis.

References

[1] Yuejian Zhu, Zoltan Toth, Richard Wobus, David Richardson, and Kenneth Mylne. The economic value of ensemble-based weather forecasts. Bulletin of the American Meteorological Society, 83(1):73–84, 2002.
[2] J-P Eckmann and David Ruelle. Ergodic theory of chaos and strange attractors. Reviews of Modern Physics, 57(3):617, 1985.
[3] A. Pikovsky and A. Politi. Lyapunov Exponents: A Tool to Explore Complex Dynamics. Cambridge University Press, 2016.
[4] H. Risken and T. Frank. The Fokker-Planck Equation: Methods of Solution and Applications. Springer Series in Synergetics. Springer Berlin Heidelberg, 2012.
[5] G.A. Pavliotis. Stochastic Processes and Applications: Diffusion Processes, the Fokker-Planck and Langevin Equations. Texts in Applied Mathematics. Springer New York, 2014.
[6] E.N. Lorenz. Deterministic nonperiodic flow. Journal of the Atmospheric Sciences, 1963.
[7] Stefan Siegert and Holger Kantz. Prediction of complex dynamics: who cares about chaos? Chaos Detection and Predictability, pages 249–269, 2016.
[8] Hugo LD de S. Cavalcante, Marcos Oriá, Didier Sornette, Edward Ott, and Daniel J Gauthier. Predictability and suppression of extreme events in a chaotic system. Physical Review Letters, 111(19):198701, 2013.
[9] Yuan Yuan and Adrián Lozano-Durán. Limits to extreme event forecasting in chaotic systems. Physica D: Nonlinear Phenomena, 467:134246, 2024.
[10] Alberto Vela-Martín and Marc Avila. Large-scale patterns set the predictability limit of extreme events in Kolmogorov flow. Journal of Fluid Mechanics, 986:A2, 2024.
[11] Valerio Lucarini, Vera Melinda Galfi, Jacopo Riboldi, and Gabriele Messori. Typicality of the 2021 western North America summer heatwave. Environmental Research Letters, 18(1):015004, 2023.
[12] Leonardo Olivetti and Gabriele Messori. Do data-driven models beat numerical models in forecasting weather extremes? A comparison of IFS HRES, Pangu-Weather, and GraphCast. Geoscientific Model Development, 17(21):7915–7962, 2024.
[13] Jiawen Wei, Aniruddha Bora, Vivek Oommen, Chenyu Dong, Juntao Yang, Jeff Adie, Chen Chen, Simon See, George Karniadakis, and Gianmarco Mengaldo. XAI4Extremes: An interpretable machine learning framework for understanding extreme-weather precursors under climate change. arXiv preprint arXiv:2503.08163, 2025.
[14] P Trent Vonich and Gregory J Hakim. Predictability limit of the 2021 Pacific Northwest heatwave from deep-learning sensitivity analysis. Geophysical Research Letters, 51(19):e2024GL110651, 2024.
[15] Valery Iustinovich Oseledec. A multiplicative ergodic theorem, Lyapunov characteristic numbers for dynamical systems. Transactions of the Moscow Mathematical Society, 19:197–231, 1968.
[16] Bruno Eckhardt and Demin Yao. Local Lyapunov exponents in chaotic systems. Physica D: Nonlinear Phenomena, 65(1-2):100–108, 1993.
[17] Guido Boffetta, P Giuliani, G Paladin, and A Vulpiani. An extension of the Lyapunov analysis for the predictability problem. Journal of the Atmospheric Sciences, 55(23):3409–3416, 1998.
[18] E Aurell, Guido Boffetta, Andrea Crisanti, G Paladin, and Angelo Vulpiani. Predictability in the large: an extension of the concept of Lyapunov exponent. Journal of Physics A: Mathematical and General, 30(1):1, 1997.
[19] Ruiqiang Ding and Jianping Li. Nonlinear finite-time Lyapunov exponent and predictability. Physics Letters A, 364(5):396–400, 2007.
[20] Xiao-Wei Huai, Jian-Ping Li, Rui-Qiang Ding, Jie Feng, and De-Qiang Liu. Quantifying local predictability of the Lorenz system using the nonlinear local Lyapunov exponent. Atmospheric and Oceanic Science Letters, 10(5):372–378, 2017.
[21] Alberto Vela-Martín. Predictability of isotropic turbulence by massive ensemble forecasting. Physical Review Fluids, 9(12):L122601, 2024.
[22] Valerio Lucarini, Davide Faranda, Jorge Miguel Milhazes de Freitas, Mark Holland, Tobias Kuna, Matthew Nicol, Mike Todd, Sandro Vaienti, et al. Extremes and recurrence in dynamical systems. John Wiley & Sons, 2016.
[23] Chenyu Dong, Gabriele Messori, Davide Faranda, Adriano Gualandi, Valerio Lucarini, and Gianmarco Mengaldo. Spatio-temporal dynamical indices for complex systems. Chaos, Solitons & Fractals, 201:117248, 2025.
[24] Zhou Fang and Gianmarco Mengaldo. Dynamical errors in machine learning forecasts. Chaos, Solitons & Fractals, 201:117376, 2025.
[25] Chenyu Dong, Davide Faranda, Adriano Gualandi, Valerio Lucarini, and Gianmarco Mengaldo. Time-lagged recurrence: A data-driven method to estimate the predictability of dynamical systems. Proceedings of the National Academy of Sciences, 122(20):e2420252122, 2025.
[26] Timothy DelSole. Predictability and information theory. Part I: Measures of predictability. Journal of the Atmospheric Sciences, 61(20):2425–2440, 2004.
[27] Tapio Schneider and Stephen M Griffies. A conceptual framework for predictability studies. Journal of Climate, 12(10):3133–3155, 1999.
[28] Richard Kleeman. Measuring dynamical prediction utility using relative entropy. Journal of the Atmospheric Sciences, 59(13):2057–2072, 2002.
[29] Thomas M Cover. Elements of information theory. John Wiley & Sons, 1999.
[30] Georg Kohl, Li-Wei Chen, and Nils Thuerey. Benchmarking autoregressive conditional diffusion models for turbulent flow simulation. Neural Networks, page 108641, 2026.
[31] Ira JS Shokar, Rich R Kerswell, and Peter H Haynes. Stochastic latent transformer: Efficient modeling of stochastically forced zonal jets. Journal of Advances in Modeling Earth Systems, 16(6):e2023MS004177, 2024.
[32] Emmanouil D Fylladitakis et al. Kolmogorov flow: seven decades of history. Journal of Applied Mathematics and Physics, 6(11):2227, 2018.
[33] Gary J Chandler and Rich R Kerswell. Invariant recurrent solutions embedded in a turbulent two-dimensional Kolmogorov flow. Journal of Fluid Mechanics, 722:554–595, 2013.
[34] Mohammad Farazmand and Themistoklis P Sapsis. A variational approach to probing extreme events in turbulent dynamical systems. Science Advances, 3(9):e1701533, 2017.
[35] Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33:6840–6851, 2020.
[36] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-Net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, October 5-9, 2015, Proceedings, Part III 18, pages 234–241. Springer, 2015.
[37] Prafulla Dhariwal and Alexander Nichol. Diffusion models beat GANs on image synthesis. Advances in Neural Information Processing Systems, 34:8780–8794, 2021.
[38] Tilmann Gneiting and Adrian E. Raftery. Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association, 102(477):359–378, 2007.
[39] Jeffrey L Anderson. A method for producing and evaluating probabilistic forecasts from ensemble model integrations. Journal of Climate, 9(7):1518–1530, 1996.
[40] D.S. Wilks. Chapter 8 - Forecast verification. In Daniel S. Wilks, editor, Statistical Methods in the Atmospheric Sciences, volume 100 of International Geophysics, pages 301–394. Academic Press, 2011.
[41] Robert H Kraichnan. Inertial-range transfer in two- and three-dimensional turbulence. Journal of Fluid Mechanics, 47(3):525–535, 1971.
[42] Guido Boffetta and Robert E Ecke. Two-dimensional turbulence. Annual Review of Fluid Mechanics, 44(1):427–451, 2012.
[43] Marie Farge et al. Wavelet transforms and their applications to turbulence. Annual Review of Fluid Mechanics, 24(1):395–458, 1992.
[44] Mohammad Farazmand and Themistoklis P Sapsis. Extreme events: Mechanisms and prediction. Applied Mechanics Reviews, 71(5):050801, 2019.
[45] Guido Boffetta and Stefano Musacchio. Chaos and predictability of homogeneous-isotropic turbulence. Physical Review Letters, 119(5):054102, 2017.
