Non-equilibrium functional inequalities for finite Markov chains

Functional inequalities such as the Poincaré and log-Sobolev inequalities quantify convergence to equilibrium in continuous-time Markov chains by linking generator properties to variance and entropy decay. However, many applications, including multis…

Authors: Bastian Hilder, Patrick van Meurs, Upanshu Sharma

Bastian Hilder∗, Patrick van Meurs†, Upanshu Sharma‡

Abstract

Functional inequalities such as the Poincaré and log-Sobolev inequalities quantify convergence to equilibrium in continuous-time Markov chains by linking generator properties to variance and entropy decay. However, many applications, including multiscale and non-reversible dynamics, require analysing probability measures that are not at equilibrium, where the classical theory tied to steady states no longer applies. We introduce generalised versions of these inequalities for arbitrary positive measures on a finite state space, retaining key structural properties of their classical counterparts. In particular, we prove continuity of the associated constants with respect to the reference measure and establish explicit positive lower bounds. As an application, we derive quantitative coarse-graining error estimates for non-reversible Markov chains, both with and without explicit scale separation, and propose a quantitative criterion for assessing the quality of coarse-graining maps.

Keywords. functional inequalities; relative entropy; Fisher information; continuous-time Markov chains; coarse-graining

Mathematics Subject Classification (2020). 39B05; 39B62; 60J27; 60J28; 34C29; 34E13

1 Introduction

Functional inequalities such as the Poincaré inequality (PI) and the log-Sobolev inequality (LSI) are central tools in the quantitative analysis of Markov semigroups and diffusion processes. In the setting of a continuous-time Markov chain with an irreducible generator matrix, these inequalities quantify how fast observables and distributions relax towards the steady state (or towards equilibrium). The PI controls the exponential decay of the variance of observables, thereby determining the rate of convergence to steady state in the $L^2$-sense.
The LSI provides a stronger, entropy-based control, implying exponential convergence of the law of the process to equilibrium in relative entropy. In both cases, the inequality constants (respectively called the PI and LSI constants) encapsulate the strength of the mixing mechanism, in that small (large) constants correspond to slow (fast) equilibration. These inequalities have thus become fundamental in the analysis of convergence to equilibrium, concentration of measure, and the stability of stochastic systems, see [DSC96, SC97, AF99, BT06, MT06, EF18, SS19] for a non-exhaustive list. Most of these references focus on the special setting of reversible Markov chains, which allows for considerably deeper analysis.

∗ Department of Mathematics, Technische Universität München, Boltzmannstrasse 3, 85748 Garching b. München, Germany. Email: bastian.hilder@tum.de
† Faculty of Mathematics and Physics, Kanazawa University, Kakuma, Kanazawa 920-1192, Japan. Email: pjpvmeurs@staff.kanazawa-u.ac.jp
‡ School of Mathematics and Statistics, University of New South Wales, Sydney 2052, Australia. Email: upanshu.sharma@unsw.edu.au

However, the assumptions of reversibility and equilibrium (characterised by the steady state) are restrictive in certain applications. For instance, complex chemical and biomolecular systems are often modelled via high-dimensional, possibly non-reversible, Markov chains constructed from molecular dynamics [PWS+11, SS13], which are routinely coarse-grained into clusters [DW05, KW07, FSW18]. Inspired by coarse-graining approaches developed for diffusion processes [LL10, Cho03], the recent work [HS24] by two of the authors provides a reduced dynamics on the clusters by defining an effective generator through conditional expectations. The accuracy of this (and related) reduced models depends on the mixing properties within each cluster, governed by measures that are not the steady states. In these contexts, the classical PI and LSI, defined with the steady state as the reference measure, are no longer applicable, which leads to the following natural question.

What is a meaningful generalisation of functional inequalities for reference measures other than the steady state?

To the best of our knowledge, a comprehensive framework for non-equilibrium functional inequalities is not available in the literature. Therefore, in this work, we introduce generalised versions of the Poincaré and log-Sobolev inequalities (called gPI and gLSI) defined for arbitrary reference measures in the class of positive probability measures on the state space of the underlying continuous-time Markov chain, see (21) and (22). These inequalities are built upon natural generalisations of the Dirichlet form and the Fisher information. We show that these latter generalisations retain key analytical properties of their classical counterparts: non-negativity, convexity, and continuity in their argument, see Proposition 3.2. Using these generalised functionals, we establish a family of functional inequalities parametrised by the reference probability measure $\zeta$, whose optimal constants are denoted by $\alpha_{\mathrm{gPI}}(\zeta)$ and $\alpha_{\mathrm{gLSI}}(\zeta)$, see (23). Beyond the use of non-equilibrium reference measures, a crucial point of this study is that we do not require the Markov chain to be reversible (i.e. to be in detailed balance), which is often assumed when dealing with such functional inequalities.

We demonstrate that these generalised constants possess several desirable properties. As in the equilibrium case, there is a hierarchy for the functional inequalities, cf. [BT06].
Specifically, the gLSI implies the gPI with $\alpha_{\mathrm{gLSI}}(\zeta) \leq 2\,\alpha_{\mathrm{gPI}}(\zeta)$ (consistent with the classical relation in Remark 2.6 and with (25)), see Proposition 4.1. Furthermore, both the gPI and the gLSI satisfy the tensorisation property, see Proposition 4.2. Beyond these classical properties, we prove that the constants vary continuously with $\zeta$, see Proposition 3.5 and Theorem 4.4. In addition, we provide strictly positive lower bounds for the constants, see Theorem 4.7. In particular, these lower bounds also hold for the classical PI and LSI constants, by which we establish their positivity for irreducible generators that need not be reversible. These properties require $\zeta$ to be strictly positive. However, this is typically guaranteed in applications where $\zeta$ is linked to solutions of forward Kolmogorov equations with irreducible generators, see for instance [HPST20, Lemma C.1].

We demonstrate the utility of these generalised inequalities by applying them to two problems where non-equilibrium analysis is essential. First, we derive quantitative stability estimates for the time-dependent probability distribution of a Markov chain relative to any other such measure $\zeta$, see Section 5.1. Here $\zeta$ is time-dependent, and consequently the gPI and gLSI constants are time-dependent. We apply our continuity result and lower bounds to establish an exponential convergence result for the solutions of the Markov chain. Second, we use the generalised functional inequalities to study clustering in Markov chains. In several works [LL10, ZHS16, LLO17, DLP+18, LLS19, HNS20, HS24] on coarse-graining estimates, including some of our own, functional inequalities are imposed for conditional measures that are not steady states of the underlying dynamics. Although this hypothesis has led to strong and elegant results, its structural meaning in a genuinely non-equilibrium setting has remained unclear.
The present framework is a first step towards alleviating this issue by providing a systematic interpretation of such inequalities for continuous-time Markov chains. To demonstrate its applicability, we show that the LSI assumption with respect to non-steady states in [HS24] can be dropped entirely using our framework, where we specifically use the lower bounds. This provides quantitative coarse-graining error estimates both in the absence and presence of explicit scale separation in Markov chains, see Proposition 5.3 and Theorem 5.6. Finally, our generalised inequalities allow us to provide a direct criterion for assessing the quality of a coarse-graining map, see Section 5.2.3. In both of the aforementioned applications, the continuity of the functional inequality constants plays a central role in controlling deviations from the steady state.

Finally, we discuss how our functional inequalities can potentially be useful for constructing coarse-graining maps for Markov chains, and for Γ-calculus. In addition, we discuss how our framework can potentially be generalised to other settings, including countably infinite state spaces and coarse-grained diffusion processes under appropriate additional assumptions. For the details on this, we refer to Section 6.

In summary, this work extends the scope of classical functional inequalities to the non-stationary realm. It provides an analytical framework for studying relaxation, comparison, and reduction of continuous-time Markov chains at non-equilibrium.

Outline of the article and summary of notation. In Section 2 we recall the classical notions of the PI and the LSI. Section 3 presents the generalised functionals and introduces the generalised Poincaré and log-Sobolev constants along with their basic properties. Section 4 contains our main results on important properties of the generalised constants.
In Section 5 we demonstrate the utility of these results in two applications, and in Section 6 we speculate how these results could be applied to related research areas. Finally, Appendix A demonstrates that the naive generalisation of the classical inequalities generally fails to give a well-defined theory.

The following table summarises important notation used throughout this paper. Additional notation will be introduced in Section 3.1.

$\mathcal X, \mathcal Y, \mathcal Z$: finite state space
$|\mathcal Z|$: cardinality of a finite space $\mathcal Z$
$\mathbb R^{|\mathcal Z|}_{>0}$: vectors in $\mathbb R^{|\mathcal Z|}$ with strictly positive coordinates
$\mathbf 1$: constant unit function (Sec. 3.1)
$\mathcal P(\mathcal Z)$: space of probability measures on $\mathcal Z$
$\mathcal P_+(\mathcal Z)$: space of strictly positive probability measures on $\mathcal Z$
$\eta_*$: lower bound for $\eta \in \mathcal P_+(\mathcal Z)$ (Thm. 4.7)
$\mathcal D_\eta(\mathcal Z)$: space of probability densities with respect to $\eta \in \mathcal P_+(\mathcal Z)$ (6)
$M, L$: irreducible generators for continuous-time Markov chains
$M_*$: smallest positive entry in generator $M$ (Thm. 4.7)
$M_\zeta$: $\zeta$-symmetrised version of generator $M$ (36)
$\pi$: steady state of continuous-time Markov chain
$\operatorname{var}_\eta$: variance with respect to probability measure $\eta$ (2)
$\mathcal H_\eta$: relative entropy with respect to probability measure $\eta$ (7)
$\operatorname{Ent}_\eta$: centred entropy with respect to probability measure $\eta$ (11)
$\mathcal E$: classical Dirichlet form with respect to the steady state $\pi$ (3)
$\mathcal E_\zeta$: generalised Dirichlet form with respect to probability measure $\zeta$ (16)
$\mathcal R$: classical Fisher information with respect to the steady state $\pi$ (8)
$\mathcal R_\zeta$: generalised Fisher information with respect to probability measure $\zeta$ (17)
$\alpha_{\mathrm{PI}}, \alpha_{\mathrm{gPI}}$: classical and generalised Poincaré constants (5), (23)
$\alpha_{\mathrm{LSI}}, \alpha_{\mathrm{gLSI}}$: classical and generalised log-Sobolev constants (10), (23)
$\alpha_{\mathrm{sLSI}}$: standard LSI constant (13)
$\xi$: coarse-graining map (56)
$\Lambda_y$: $y$-level set of $\xi : \mathcal X \to \mathcal Y$ (Sec. 5.2.1)

2 The classical functional inequalities

In this section, we introduce the classical notions of the PI and the LSI.
Here and elsewhere in the paper, we fix $\mathcal Z$ as a finite state space with $|\mathcal Z| \geq 2$ the number of states, and fix $M \in \mathbb R^{|\mathcal Z| \times |\mathcal Z|}$ as an irreducible, but not necessarily reversible, generator (or transition rate matrix) on $\mathcal Z$. We will interchangeably interpret $M$ as a matrix or as an operator. In addition, let $\mathcal P(\mathcal Z)$ be the space of probability measures on $\mathcal Z$. Similar to $M$, we will treat measures on $\mathcal Z$ as either vectors in $\mathbb R^{|\mathcal Z|}$ or functions. The law $t \mapsto \mu_t \in \mathcal P(\mathcal Z)$ of the Markov chain driven by $M$ evolves according to the forward Kolmogorov equation
\[
  \frac{d\mu}{dt} = M^T \mu, \qquad \mu\big|_{t=0} = \mu_0, \tag{1}
\]
with initial data $\mu_0 \in \mathcal P(\mathcal Z)$. Irreducibility ensures that this Markov chain admits a unique strictly positive steady state $\pi \in \mathcal P_+(\mathcal Z)$, i.e. $M^T \pi = 0$. Here, we denote the space of strictly positive probability measures by $\mathcal P_+(\mathcal Z)$.

To define the PI we introduce
\[
  \mathbb E_\pi[f] := \sum_{z \in \mathcal Z} f(z)\,\pi(z)
\]
for the expectation of $f \in \mathbb R^{|\mathcal Z|}$ with respect to $\pi$ and
\[
  (f, g)_\pi := \sum_{z \in \mathcal Z} f(z)\,g(z)\,\pi(z)
\]
for the $\pi$-weighted inner product on $\mathbb R^{|\mathcal Z|}$.

Definition 2.1. For $f \in \mathbb R^{|\mathcal Z|}$ we define the variance and Dirichlet form of $f$ respectively as
\[
  \operatorname{var}_\pi(f) := \mathbb E_\pi[f^2] - \big(\mathbb E_\pi[f]\big)^2 = (f^2, \mathbf 1)_\pi - \big((f, \mathbf 1)_\pi\big)^2, \tag{2}
\]
\[
  \mathcal E_\pi(f, M) := (f, -Mf)_\pi = -\sum_{z,z' \in \mathcal Z} f(z) f(z') M(z,z') \pi(z) = \frac12 \sum_{z,z' \in \mathcal Z} \big(f(z) - f(z')\big)^2 M(z,z') \pi(z). \tag{3}
\]

Henceforth, for $\mathcal E_\pi(f, M)$ and other objects depending on $M$, we drop $M$ from the notation and write $\mathcal E_\pi(f)$ if there is no risk of confusion about the underlying generator.

Classically, the Dirichlet form is defined as $\tilde{\mathcal E}_\pi(f, g) = (f, -Mg)_\pi$ and requires $M$ to be reversible. Yet, throughout this paper we only deal with $f = g$, and therefore introduce the Dirichlet form as a function of one variable. Since we do not require $M$ to be reversible, $\mathcal E_\pi$ is not a Dirichlet form in the classical sense.
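As a small numerical illustration of Definition 2.1 (a sketch only, not part of the original analysis), the following Python snippet evaluates the variance (2) and the first and last expressions for the Dirichlet form in (3). The three-state cycle generator `M`, the test vector `f` and all function names are hypothetical choices made for this illustration. The cycle is irreducible and non-reversible, and its steady state is uniform because every column of `M` sums to zero.

```python
# Assumed example: unit-rate cycle 0 -> 1 -> 2 -> 0 (irreducible, non-reversible).
M = [[-1.0, 1.0, 0.0],
     [0.0, -1.0, 1.0],
     [1.0, 0.0, -1.0]]
pi = [1.0 / 3.0] * 3  # uniform steady state, since each column of M sums to zero


def apply_generator(M, f):
    """Return the vector (M f)(z) = sum_{z'} M(z, z') f(z')."""
    n = len(f)
    return [sum(M[z][w] * f[w] for w in range(n)) for z in range(n)]


def variance(eta, f):
    """var_eta(f) = E_eta[f^2] - (E_eta[f])^2, as in (2)."""
    mean = sum(fz * ez for fz, ez in zip(f, eta))
    second_moment = sum(fz * fz * ez for fz, ez in zip(f, eta))
    return second_moment - mean * mean


def dirichlet_inner(M, eta, f):
    """First expression in (3): the weighted inner product (f, -M f)_eta."""
    Mf = apply_generator(M, f)
    return -sum(f[z] * Mf[z] * eta[z] for z in range(len(f)))


def dirichlet_half_sum(M, eta, f):
    """Last expression in (3): (1/2) sum (f(z) - f(z'))^2 M(z, z') eta(z)."""
    n = len(f)
    return 0.5 * sum((f[z] - f[w]) ** 2 * M[z][w] * eta[z]
                     for z in range(n) for w in range(n))


f = [1.0, -2.0, 0.5]
# (M^T pi)(z') = sum_z M(z, z') pi(z); this vanishes at the steady state.
stationarity = [sum(M[z][w] * pi[z] for z in range(3)) for w in range(3)]

print(variance(pi, f), dirichlet_inner(M, pi, f), dirichlet_half_sum(M, pi, f))
```

The two Dirichlet-form evaluations agree here because `pi` is invariant; for a reference measure that is not a steady state they generally differ.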
Nevertheless, it is typically still called a Dirichlet form by a slight abuse of notation.

Definition 2.2. The (classical) Poincaré inequality (PI) is satisfied if there exists a constant $\alpha > 0$ such that
\[
  \forall f \in \mathbb R^{|\mathcal Z|} : \quad \operatorname{var}_\pi(f) \leq \frac{1}{\alpha}\, \mathcal E_\pi(f). \tag{4}
\]
The supremum over all possible choices of $\alpha$ is called the Poincaré constant. We denote it by
\[
  \alpha_{\mathrm{PI}}(M) := \inf_{\substack{f \in \mathbb R^{|\mathcal Z|} \\ f \notin \langle \mathbf 1 \rangle}} \frac{\mathcal E_\pi(f, M)}{\operatorname{var}_\pi(f)}. \tag{5}
\]

Next, we introduce the classical notion of the log-Sobolev inequality. Throughout this article, let $\mathbb R^{|\mathcal Z|}_{>0}$ be the space of vectors in $\mathbb R^{|\mathcal Z|}$ with strictly positive coordinates. Then, let
\[
  \mathcal D_\pi(\mathcal Z) := \big\{ \varphi \in \mathbb R^{|\mathcal Z|}_{>0} : \mathbb E_\pi[\varphi] = 1 \big\} \tag{6}
\]
be the space of positive densities with respect to $\pi$. In the following, we use $f, g$ for functions in $\mathbb R^{|\mathcal Z|}$ and $\varphi, \psi$ for densities with respect to some probability measure on $\mathcal Z$.

Definition 2.3. For $\varphi \in \mathcal D_\pi(\mathcal Z)$ we define the relative entropy and ($M$-)Fisher information respectively as
\[
  \mathcal H_\pi(\varphi) := \sum_{z \in \mathcal Z} \varphi(z) \log \varphi(z)\, \pi(z), \tag{7}
\]
\[
  \mathcal R_\pi(\varphi, M) := \big(\varphi, -M \log \varphi\big)_\pi = -\sum_{z,z' \in \mathcal Z} \varphi(z) \log \varphi(z')\, M(z,z')\, \pi(z). \tag{8}
\]

Definition 2.4. The log-Sobolev inequality (LSI) is satisfied if there exists a constant $\alpha > 0$ such that
\[
  \forall \varphi \in \mathcal D_\pi(\mathcal Z) : \quad \mathcal H_\pi(\varphi) \leq \frac{1}{\alpha}\, \mathcal R_\pi(\varphi, M). \tag{9}
\]
The supremum over all possible choices of $\alpha$ is called the log-Sobolev constant. We denote it by
\[
  \alpha_{\mathrm{LSI}}(M) := \inf_{\substack{\varphi \in \mathcal D_\pi(\mathcal Z) \\ \varphi \neq \mathbf 1}} \frac{\mathcal R_\pi(\varphi, M)}{\mathcal H_\pi(\varphi)}. \tag{10}
\]

Strictly speaking, (9) is the so-called modified LSI, as opposed to the standard LSI introduced in Remark 2.6. Since this paper predominantly deals with generalising (9), we simply refer to it as the LSI instead of the modified LSI.

We recall that, similar to the abbreviated notation $\mathcal E_\pi(f)$, we simply write $\alpha_{\mathrm{PI}}$, $\mathcal R_\pi(\varphi)$, $\alpha_{\mathrm{LSI}}$ whenever the underlying generator is clear from the context.
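To make Definitions 2.3 and 2.4 concrete, the sketch below evaluates the relative entropy (7) and the Fisher information (8) for one density on a small chain. The generator `M` (a unit-rate three-state cycle with uniform steady state), the density `phi` and the function names are assumptions made for this illustration only. Since (10) is an infimum, the ratio computed for any single density is merely an upper bound on the log-Sobolev constant of this toy chain, not its value.

```python
import math

# Assumed example: unit-rate cycle 0 -> 1 -> 2 -> 0 with uniform steady state.
M = [[-1.0, 1.0, 0.0],
     [0.0, -1.0, 1.0],
     [1.0, 0.0, -1.0]]
pi = [1.0 / 3.0] * 3


def relative_entropy(eta, phi):
    """H_eta(phi) = sum_z phi(z) log(phi(z)) eta(z), as in (7)."""
    return sum(p * math.log(p) * e for p, e in zip(phi, eta))


def fisher_information(M, eta, phi):
    """R_eta(phi) = -sum_{z,z'} phi(z) log(phi(z')) M(z,z') eta(z), as in (8)."""
    n = len(phi)
    return -sum(phi[z] * math.log(phi[w]) * M[z][w] * eta[z]
                for z in range(n) for w in range(n))


phi = [1.5, 0.9, 0.6]  # strictly positive with E_pi[phi] = 1, i.e. a density
mean_phi = sum(p * e for p, e in zip(phi, pi))

H = relative_entropy(pi, phi)
R = fisher_information(M, pi, phi)
ratio = R / H  # an upper bound on alpha_LSI for this chain, by (10)

print(mean_phi, H, R, ratio)
```

Both functionals vanish at the constant density, which is why $\varphi = \mathbf 1$ is excluded from the infimum in (10).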
A key difference between the two functional inequalities above is the domain of the functions $f$ and $\varphi$ over which these inequalities are defined. The following remark discusses this.

Remark 2.5. First, the relative entropy $\mathcal H_\pi(f)$ is only defined for $f \in \mathbb R^{|\mathcal Z|}_{\geq 0}$ and the Fisher information $\mathcal R_\pi(f)$ only for $f \in \mathbb R^{|\mathcal Z|}_{>0}$. Thus, the full class of $f \in \mathbb R^{|\mathcal Z|}$ in the PI cannot be considered for the LSI. Second, for the LSI, the inequality in (9) could be considered for all $f \in \mathbb R^{|\mathcal Z|}_{>0}$ rather than the restricted class $\varphi \in \mathcal D_\pi(\mathcal Z) \subset \mathbb R^{|\mathcal Z|}_{>0}$. Yet, this inequality would be meaningless to consider as the LSI. To see this, note that any $f \in \mathbb R^{|\mathcal Z|}_{>0}$ can be uniquely decomposed as $\beta \varphi$ with $\beta > 0$ and $\varphi \in \mathcal D_\pi(\mathcal Z)$; in particular $\beta = \mathbb E_\pi[f]$ and $\varphi = f / \mathbb E_\pi[f]$. Then
\[
  \mathcal H_\pi(f) = \beta\, \mathcal H_\pi(\varphi) + \beta \log \beta, \qquad \mathcal R_\pi(f) = \beta\, \mathcal R_\pi(\varphi).
\]
Consequently, the LSI constant would read
\[
  \inf_{\varphi \in \mathcal D_\pi(\mathcal Z)} \inf_{\beta > 0} \frac{\mathcal R_\pi(\varphi)}{\mathcal H_\pi(\varphi) + \log \beta},
\]
which equals $-\infty$ due to the minimisation over $\beta$. This issue can be fixed by working with the centred entropy instead, defined as
\[
  \operatorname{Ent}_\pi(f) := \mathcal H_\pi(f) - \mathbb E_\pi[f] \log\big(\mathbb E_\pi[f]\big). \tag{11}
\]
In this case $\operatorname{Ent}_\pi(\beta \varphi) = \beta\, \mathcal H_\pi(\varphi)$ and thus
\[
  \alpha_{\mathrm{LSI}} = \inf_{\varphi \in \mathcal D_\pi(\mathcal Z)} \frac{\mathcal R_\pi(\varphi)}{\mathcal H_\pi(\varphi)} = \inf_{f \in \mathbb R^{|\mathcal Z|}_{>0}} \frac{\mathcal R_\pi(f)}{\operatorname{Ent}_\pi(f)}.
\]
Consequently, all results pertaining to the LSI can be extended to $\mathbb R^{|\mathcal Z|}_{>0}$ using this centred entropy. However, since we only consider the case of densities, where $\operatorname{Ent}_\pi(\varphi) = \mathcal H_\pi(\varphi)$ for $\varphi \in \mathcal D_\pi(\mathcal Z)$, we use the setting (9) throughout this paper.

The following remark provides some context for these functional inequalities in the setting of Markov chains.

Remark 2.6. The Poincaré and log-Sobolev inequalities have been used to quantify the convergence to equilibrium in Markov chains, see [BT06] and references therein.
In particular it follows that the solution $\mu_t$ to the forward Kolmogorov equation (1) converges to the steady state $\pi$ exponentially fast in variance and relative entropy with rates $\alpha_{\mathrm{PI}}$ and $\alpha_{\mathrm{LSI}}$ respectively [BT06, Section 1], due to the relations
\[
  \frac{d}{dt} \operatorname{var}_\pi\Big(\frac{\mu_t}{\pi}\Big) = -2\, \mathcal E_\pi\Big(\frac{\mu_t}{\pi}\Big) \leq -2\, \alpha_{\mathrm{PI}} \operatorname{var}_\pi\Big(\frac{\mu_t}{\pi}\Big), \qquad
  \frac{d}{dt} \mathcal H_\pi\Big(\frac{\mu_t}{\pi}\Big) = -\mathcal R_\pi\Big(\frac{\mu_t}{\pi}\Big) \leq -\alpha_{\mathrm{LSI}}\, \mathcal H_\pi\Big(\frac{\mu_t}{\pi}\Big). \tag{12}
\]
Regarding both constants, $0 < \alpha_{\mathrm{LSI}} \leq 2 \alpha_{\mathrm{PI}}$ (see the last line of the proof of [BT06, Proposition 3.6]). Furthermore, in the special case of reversible Markov chains, $\alpha_{\mathrm{PI}}$ is the spectral gap associated to the generator matrix $M$ [DSC96, Section 2.2].

We also point out that the Fisher information is connected to the Dirichlet form via
\[
  \mathcal R_\pi(\varphi) = \big(\varphi, -M \log \varphi\big)_\pi \geq \frac12\, \mathcal E_\pi\big(\sqrt{\varphi}\big).
\]
This gives rise to the standard LSI (sLSI) (we follow the naming convention introduced in [BT06]), which measures the ratio between $\mathcal E_\pi(f)$ and $\operatorname{Ent}_\pi(f^2)$ (introduced in (11)). The corresponding constant is given by [DSC96, Equation (1.7)]
\[
  \alpha_{\mathrm{sLSI}}(M) := \inf_{\substack{f \in \mathbb R^{|\mathcal Z|}_{>0} \\ f \notin \langle \mathbf 1 \rangle}} \frac{\mathcal E_\pi(f, M)}{\operatorname{Ent}_\pi(f^2)}. \tag{13}
\]

3 Generalised functional inequalities

As stated in the introduction, our aim is to generalise the PI and the LSI to the non-equilibrium setting. This translates to defining these inequalities with the steady state $\pi$ replaced by a general strictly positive probability measure $\zeta \in \mathcal P_+(\mathcal Z)$.

We first point out that the naive generalisation of replacing $\pi$ by $\zeta$ in the PI and the LSI does not work, even though the definitions of all functionals can directly be extended to any $\zeta \in \mathcal P_+(\mathcal Z)$. We focus on the classical Poincaré inequality (4), whose naive generalisation reads as
\[
  \tilde\alpha_{\mathrm{PI}}(\zeta, M) := \inf_{\substack{f \in \mathbb R^{|\mathcal Z|} \\ f \notin \langle \mathbf 1 \rangle}} \frac{(f, -Mf)_\zeta}{\operatorname{var}_\zeta(f)}, \tag{14}
\]
where $\langle \mathbf 1 \rangle$ denotes the space of constant functions. As for the classical inequality, we need to impose $f \notin \langle \mathbf 1 \rangle$ to ensure that $\operatorname{var}_\zeta(f) \neq 0$.
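The behaviour of the naive ratio in (14) can be probed numerically. The sketch below is an illustration under stated assumptions, not the argument of Proposition A.1: the cycle generator `M`, the non-stationary measure `zeta` and the one-parameter family of test vectors are all hypothetical. The numerator of (14) is read as the inner-product form $(f, -Mf)_\zeta$, i.e. the first expression in (3) with $\pi$ replaced by $\zeta$. Shifting a fixed non-constant vector by a constant $c$ leaves the variance unchanged, while for this example the numerator equals $0.3 - 0.2\,c$, so the ratio decreases without bound as $c$ grows.

```python
# Assumed example: unit-rate cycle 0 -> 1 -> 2 -> 0; its steady state is uniform.
M = [[-1.0, 1.0, 0.0],
     [0.0, -1.0, 1.0],
     [1.0, 0.0, -1.0]]
zeta = [0.5, 0.3, 0.2]  # strictly positive, but not a steady state of M


def naive_dirichlet(M, zeta, f):
    """Inner-product form (f, -M f)_zeta, i.e. (3) with pi replaced by zeta."""
    n = len(f)
    Mf = [sum(M[z][w] * f[w] for w in range(n)) for z in range(n)]
    return -sum(f[z] * Mf[z] * zeta[z] for z in range(n))


def variance(zeta, f):
    """var_zeta(f), computed in centred form for numerical stability."""
    mean = sum(fz * wz for fz, wz in zip(f, zeta))
    return sum((fz - mean) ** 2 * wz for fz, wz in zip(f, zeta))


def naive_ratio(c):
    """Ratio in (14) for the test vector f = c * 1 + (0, 1, 0)."""
    f = [c, c + 1.0, c]
    return naive_dirichlet(M, zeta, f) / variance(zeta, f)


for c in (0.0, 10.0, 1000.0):
    print(c, naive_ratio(c))
```

The shift by a constant is harmless at the steady state, where the numerator does not depend on $c$; the dependence on $c$ appears only because $M^T \zeta \neq 0$.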
A simple calculation (see Proposition A.1 in the appendix) then shows that $\tilde\alpha_{\mathrm{PI}}(\zeta) = -\infty$ for any $\zeta \neq \pi$.

In this section we first introduce generalisations of the Dirichlet form and the Fisher information, which are then used to define non-equilibrium versions of the Poincaré and log-Sobolev inequalities. We conclude this section by proving a number of properties of these generalisations.

3.1 Notation

Other than the Dirichlet form and the Fisher information, we extend the other functionals and notation from Section 2 by replacing $\pi$ in their definitions by $\zeta \in \mathcal P_+(\mathcal Z)$. In this manner we define $\mathbb E_\zeta$, $(f, g)_\zeta$, $\operatorname{var}_\zeta$, $\mathcal D_\zeta(\mathcal Z)$ and $\mathcal H_\zeta$. In addition, we define for $f \in \mathbb R^{|\mathcal Z|}$
\[
  \|f\|^2 := \sum_{z \in \mathcal Z} f^2(z), \qquad \|f\|_\zeta^2 := \sum_{z \in \mathcal Z} f^2(z)\, \zeta(z).
\]
Let $\mathbf 1 : \mathcal Z \to \mathbb R$ be the constant function equal to 1. For $f_1, \ldots, f_k \in \mathbb R^{|\mathcal Z|}$, we denote their span by $\langle f_1, \ldots, f_k \rangle$. For $\zeta \in \mathcal P_+(\mathcal Z)$ and $U \subset \mathbb R^{|\mathcal Z|}$ a subspace, we denote by $U^{\perp_\zeta}$ the orthogonal complement with respect to the inner product $(f, g)_\zeta$, e.g.
\[
  \langle \mathbf 1 \rangle^{\perp_\zeta} = \big\{ f \in \mathbb R^{|\mathcal Z|} : (f, \mathbf 1)_\zeta = 0 \big\}.
\]
In a similar manner, we also introduce the orthogonal complement in the flat geometry as
\[
  \langle \mathbf 1 \rangle^{\perp} = \Big\{ f \in \mathbb R^{|\mathcal Z|} : (f, \mathbf 1) = \sum_{z \in \mathcal Z} f(z) = 0 \Big\}.
\]
In particular, since $\mathbb R^{|\mathcal Z|} = \langle \mathbf 1 \rangle \oplus \langle \mathbf 1 \rangle^{\perp_\zeta}$, we have
\[
  \forall f \in \mathbb R^{|\mathcal Z|}, \ \exists!\, c \in \mathbb R, \ \exists!\, h \in \langle \mathbf 1 \rangle^{\perp_\zeta} : \quad f = c \mathbf 1 + h. \tag{15}
\]
Finally, we set
\[
  \zeta_* := \min_{z \in \mathcal Z} \zeta(z) > 0.
\]

3.2 Generalised Dirichlet form and the Fisher information

As already discussed, a key interpretation for the classical notions of the Dirichlet form and the Fisher information is that these objects can be respectively derived as the dissipation of variance and relative entropy with respect to the steady state $\pi$ along the solution $t \mapsto \mu_t$ of the forward Kolmogorov equation (1), see (12).
Following this connection, to motivate the generalisation towards non-equilibrium measures, we consider two time-dependent solutions $t \mapsto \mu_t, \zeta_t$ of (1), and consider the non-equilibrium evolution of variance and relative entropy, i.e. $\frac{d}{dt} \operatorname{var}_{\zeta_t}\big(\frac{\mu_t}{\zeta_t}\big)$ and $\frac{d}{dt} \mathcal H_{\zeta_t}\big(\frac{\mu_t}{\zeta_t}\big)$. We want our generalisation to be such that these derivatives satisfy equations similar to those in (12), see (18) below. This leads to the following generalisations of the Dirichlet form and the Fisher information.

Definition 3.1. Let $\zeta \in \mathcal P_+(\mathcal Z)$. For any $f \in \mathbb R^{|\mathcal Z|}$ we define the generalised Dirichlet form with respect to $\zeta$ as
\[
  \mathcal E_\zeta(f, M) := \frac12 \sum_{z,z' \in \mathcal Z} M(z,z')\, \zeta(z) \big(f(z) - f(z')\big)^2 = (f, -Mf)_\zeta + \frac12 (\mathbf 1, M f^2)_\zeta. \tag{16}
\]
For any $\varphi \in \mathcal D_\zeta(\mathcal Z)$ (defined in (6)) we define the generalised Fisher information with respect to $\zeta$ as
\[
  \mathcal R_\zeta(\varphi, M) := \sum_{z,z' \in \mathcal Z} M(z,z')\, \zeta(z)\, \varphi(z) \Big[ \frac{\varphi(z')}{\varphi(z)} - 1 - \log\Big(\frac{\varphi(z')}{\varphi(z)}\Big) \Big]. \tag{17}
\]

As in Section 2, we write $\mathcal E_\zeta(f)$, $\mathcal R_\zeta(\varphi)$ in place of $\mathcal E_\zeta(f, M)$, $\mathcal R_\zeta(\varphi, M)$ if there is no risk of confusion about the underlying generator.

Following our motivation above, it is easy to check that for two solutions $t \mapsto \mu_t, \zeta_t$ of (1), we indeed have the property
\[
  \frac{d}{dt} \operatorname{var}_{\zeta_t}\Big(\frac{\mu_t}{\zeta_t}\Big) = -2\, \mathcal E_{\zeta_t}\Big(\frac{\mu_t}{\zeta_t}\Big), \qquad
  \frac{d}{dt} \mathcal H_{\zeta_t}\Big(\frac{\mu_t}{\zeta_t}\Big) = -\mathcal R_{\zeta_t}\Big(\frac{\mu_t}{\zeta_t}\Big), \tag{18}
\]
i.e. the generalised Dirichlet form and the generalised Fisher information are indeed the dissipation of non-equilibrium variance and relative entropy, that is, the dissipation along non-steady-state solutions of (1).

Furthermore, these objects reduce to their classical counterparts (3) and (8) when $\zeta$ is replaced by the steady state $\pi$, i.e. when $M^T \zeta = 0$. Regarding the generalised Dirichlet form, it equals the right-hand side of (3) with $\pi$ replaced by $\zeta$. Yet, the last equality in (3) relies on $\pi$ being invariant.
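The two expressions for the generalised Dirichlet form in (16), and the generalised Fisher information (17), can be checked numerically on a small example. The following sketch assumes a hypothetical three-state cycle generator `M` and a reference measure `zeta` that is strictly positive but not a steady state; the function names and test vectors are illustrative choices as well.

```python
import math

# Assumed example: unit-rate cycle 0 -> 1 -> 2 -> 0 and a non-stationary measure.
M = [[-1.0, 1.0, 0.0],
     [0.0, -1.0, 1.0],
     [1.0, 0.0, -1.0]]
zeta = [0.5, 0.3, 0.2]


def apply_generator(M, f):
    """Return the vector (M f)(z) = sum_{z'} M(z, z') f(z')."""
    n = len(f)
    return [sum(M[z][w] * f[w] for w in range(n)) for z in range(n)]


def generalised_dirichlet(M, zeta, f):
    """First expression in (16): (1/2) sum M(z,z') zeta(z) (f(z) - f(z'))^2."""
    n = len(f)
    return 0.5 * sum(M[z][w] * zeta[z] * (f[z] - f[w]) ** 2
                     for z in range(n) for w in range(n))


def inner_form_with_correction(M, zeta, f):
    """Second expression in (16): (f, -M f)_zeta + (1/2) (1, M f^2)_zeta."""
    Mf = apply_generator(M, f)
    Mf2 = apply_generator(M, [fz * fz for fz in f])
    inner = -sum(fz * mz * wz for fz, mz, wz in zip(f, Mf, zeta))
    correction = 0.5 * sum(mz * wz for mz, wz in zip(Mf2, zeta))
    return inner + correction


def generalised_fisher(M, zeta, phi):
    """(17) with Phi(x) = x - 1 - log(x); diagonal terms vanish as Phi(1) = 0."""
    n = len(phi)
    total = 0.0
    for z in range(n):
        for w in range(n):
            x = phi[w] / phi[z]
            total += M[z][w] * zeta[z] * phi[z] * (x - 1.0 - math.log(x))
    return total


f = [1.0, -2.0, 0.5]
phi = [1.2, 0.8, 0.8]  # density with respect to zeta: E_zeta[phi] = 1

print(generalised_dirichlet(M, zeta, f), inner_form_with_correction(M, zeta, f))
print(generalised_fisher(M, zeta, phi))
```

Because the diagonal of `M` never contributes to either sum, both functionals are sums of non-negative terms, in contrast to the inner-product form $(f, -Mf)_\zeta$ taken on its own.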
The second equality in (16) shows the additional term which is needed to have equality for non-steady states. Regarding the generalised Fisher information, we note that it can be derived as a modulation of its classical counterpart [Hil17, Section 5.1]. These facts explain the label of 'generalisations' that has been used above.

We further remark that $\mathcal E_\zeta$, similar to $\mathcal E_\pi$, is not a Dirichlet form in the strict sense, and that the domain of definition for $\mathcal E_\zeta$ and $\mathcal R_\zeta$ with respect to $\zeta$ can be further enlarged to general probability measures by extending them via lower-semicontinuous envelopes. We skip these details here for simplicity of presentation.

The generalised Dirichlet form and the Fisher information share several properties with their classical variants. We summarise these in the following result.

Proposition 3.2. For any $\zeta \in \mathcal P_+(\mathcal Z)$, the generalised Dirichlet form (16) and the generalised Fisher information (17) with respect to $\zeta$ satisfy the following.
1. Continuity: The maps $(\zeta, f) \mapsto \mathcal E_\zeta(f)$ and $(\zeta, \varphi) \mapsto \mathcal R_\zeta(\varphi)$ are continuous on $\mathcal P_+(\mathcal Z) \times \mathbb R^{|\mathcal Z|}$ and $\{(\zeta, \varphi) : \zeta \in \mathcal P_+(\mathcal Z),\ \varphi \in \mathcal D_\zeta(\mathcal Z)\}$ respectively.
2. Non-negativity: $\mathcal E_\zeta(f), \mathcal R_\zeta(\varphi) \geq 0$ with equality if and only if $f \in \langle \mathbf 1 \rangle$ and $\varphi = \mathbf 1$ respectively.
3. Convexity: $\mathcal E_\zeta(\cdot)$ and $\mathcal R_\zeta(\cdot)$ are convex on their domains of definition.
4. Connection: Let $f \in \mathbb R^{|\mathcal Z|}$ with $\mathbb E_\zeta[f] = 0$. Then, for all $\delta \in \mathbb R$ with $|\delta| \leq \frac12 \|f\|_\infty^{-1}$, we have $\mathbf 1 + \delta f \in \mathcal D_\zeta(\mathcal Z)$ and
\[
  \mathcal R_\zeta(\mathbf 1 + \delta f) = \delta^2 \big( \mathcal E_\zeta(f) + \delta R \big), \tag{19}
\]
where the remainder term satisfies $|R| \leq 3 \|M\|_\infty |\mathcal Z| \|f\|_\infty^3$.

Proof. The continuity of the generalised Dirichlet form and the generalised Fisher information follows directly from their respective definitions, see (16) and (17), respectively.

Next we show the non-negativity and convexity of $\mathcal E_\zeta$.
The non-negativity of $\mathcal E_\zeta(\cdot)$ follows from the definition (16) since $M(z,z') \geq 0$ for $z \neq z'$ and the term $M(z,z)$ does not contribute to the summation. Additionally, we clearly have $\mathcal E_\zeta(c \mathbf 1) = 0$. On the other hand, if $\mathcal E_\zeta(f) = 0$, then $M(z,z')(f(z) - f(z'))^2 = 0$ for every $z \neq z'$. Since $M$ is irreducible, there is a path $(z_j)_{j=1}^J \subset \mathcal Z$ between every $z$ and $z'$ such that $M(z_j, z_{j+1}) > 0$. This yields that $f$ is constant along this path. Since $z, z'$ were arbitrary, we conclude that $f$ is constant on $\mathcal Z$. Finally, the convexity of $\mathcal E_\zeta$ follows directly from the convexity of $f \mapsto (f(z) - f(z'))^2$, which can be written as a composition of $x \mapsto x^2$ with a linear map.

Next, we prove the non-negativity and convexity for $\mathcal R_\zeta$. Let $\Phi(x) := x - 1 - \log x$ for $x > 0$. Note that $\Phi \geq 0$ is strictly convex and that $\Phi(1) = 0$. Then, the non-negativity of $\mathcal R_\zeta$ follows from $M(z,z') \geq 0$ for $z \neq z'$. Similarly to $\mathcal E_\zeta$, using that $\zeta, \varphi > 0$ and that $\Phi(x) = 0$ only at $x = 1$, we obtain that $\mathcal R_\zeta$ vanishes if and only if $\frac{\varphi(z')}{\varphi(z)} = 1$ for all $z \neq z'$, which together with $\varphi \in \mathcal D_\zeta$ shows $\varphi = \mathbf 1$, see also [HPST20, Lemma 2.5].

To show convexity, we observe that it is sufficient to show convexity of $\varphi \mapsto \varphi(z)\, \Phi\big(\frac{\varphi(z')}{\varphi(z)}\big)$. This either follows from direct calculation using that $(x, y) \mapsto -x \log(y/x)$ is convex on $\mathbb R^2_{>0}$, see also [HPST20, Remark 2.12], or using an abstract result on perspective functions, see e.g. [Com18].

Now we prove the final part. Since $\|\delta f\|_\infty \leq \frac12 < 1$ and $\mathbb E_\zeta[f] = 0$, we have $\varphi := \mathbf 1 + \delta f \in \mathcal D_\zeta(\mathcal Z)$, using which
\[
  \mathcal R_\zeta(\mathbf 1 + \delta f) = \sum_{\substack{z,z' \in \mathcal Z \\ z \neq z'}} M(z,z')\, \zeta(z) \Big[ \big(1 + \delta f(z')\big) - \big(1 + \delta f(z)\big) - \big(1 + \delta f(z)\big) \log\big(1 + \delta f(z')\big) + \big(1 + \delta f(z)\big) \log\big(1 + \delta f(z)\big) \Big].
\]
Taylor expanding $\log(1 + x) = x - \frac12 x^2 - \sum_{k=3}^\infty \frac1k (-x)^k$ leads to
\[
  \mathcal R_\zeta(\mathbf 1 + \delta f) = \frac{\delta^2}{2} \sum_{\substack{z,z' \in \mathcal Z \\ z \neq z'}} M(z,z')\, \zeta(z) \big(f(z) - f(z')\big)^2 + \delta^3 R = \delta^2 \big( \mathcal E_\zeta(f) + \delta R \big),
\]
where we estimate the remainder term
\[
  R := \sum_{z,z' \in \mathcal Z} M(z,z')\, \frac{\zeta(z)}{\delta^3} \Big[ \frac{\delta^3}{2} f(z) \big( f(z')^2 - f(z)^2 \big) + \big(1 + \delta f(z)\big) \sum_{k=3}^\infty \frac{(-\delta)^k f(z')^k}{k} - \big(1 + \delta f(z)\big) \sum_{k=3}^\infty \frac{(-\delta)^k f(z)^k}{k} \Big]
\]
as (with $F := \|f\|_\infty$ and $|\delta| F \leq \frac12$)
\[
  |R| \leq \bigg| \sum_{\substack{z,z' \in \mathcal Z \\ z \neq z'}} M(z,z')\, \zeta(z) \bigg| \Big( F^3 + \frac{2}{|\delta|^3} (1 + |\delta| F) \sum_{k=3}^\infty \frac{(|\delta| F)^k}{k} \Big)
  \leq \|M\|_\infty |\mathcal Z| F^3 \Big( 1 + 3 \sum_{\ell=0}^\infty \frac{(|\delta| F)^\ell}{\ell + 3} \Big)
  \leq \|M\|_\infty |\mathcal Z| F^3 \Big( 1 + \sum_{k=0}^\infty \frac{1}{2^k} \Big).
\]
This shows that, indeed, $R$ is uniformly bounded in $\delta$ and that the asserted estimate holds.

Regarding (19), a similar relation holds between $\mathcal H_\zeta$ and $\operatorname{var}_\zeta$. Indeed, under the same assumptions as for (19), we have
\[
  \mathcal H_\zeta(\mathbf 1 + \delta f) = \delta^2 \Big( \frac12 \operatorname{var}_\zeta(f) + \delta \tilde R \Big), \tag{20}
\]
where the remainder term $\tilde R$ satisfies $|\tilde R| \leq \frac13 \|f\|_\infty^3$. To see this, a simplified version of the proof of (19) reveals that
\[
  \mathcal H_\zeta(\mathbf 1 + \delta f) = \sum_{z \in \mathcal Z} \big(1 + \delta f(z)\big) \Big( \delta f(z) - \frac12 (\delta f(z))^2 - \sum_{k=3}^\infty \frac{(-\delta f(z))^k}{k} \Big) \zeta(z) = \delta\, \mathbb E_\zeta[f] + \frac12 \delta^2\, \mathbb E_\zeta[f^2] + \delta^3 \tilde R,
\]
where now
\[
  \tilde R = \sum_{z \in \mathcal Z} \frac{\zeta(z)}{\delta^3} \Big( -\frac{\delta^3 f(z)^3}{2} - \big(1 + \delta f(z)\big) \sum_{k=3}^\infty \frac{(-\delta)^k f(z)^k}{k} \Big)
  = \sum_{z \in \mathcal Z} \frac{\zeta(z)}{\delta^3} \Big( -\sum_{k=3}^\infty \frac{(-\delta f(z))^k}{k} + \sum_{k=2}^\infty \frac{(-\delta f(z))^{k+1}}{k} \Big)
  = \sum_{z \in \mathcal Z} \frac{\zeta(z)}{\delta^3} \sum_{k=3}^\infty \frac{(-\delta f(z))^k}{k(k-1)}.
\]
Then, applying $|f(z)| \leq \|f\|_\infty$, noting that $\sum_{z \in \mathcal Z} \zeta(z) = 1$, using $|\delta| \|f\|_\infty \leq \frac12$, and $k(k-1) \geq 6$ for $k \geq 3$, the asserted estimate on $|\tilde R|$ follows.

The following remark discusses the connection between the generalised Fisher information introduced above and yet another generalisation of the Fisher information introduced by some of the authors in [HPST20].

Remark 3.3.
In [HPST20, Definition 1.5], the authors introduce a different generalisation of the Fisher information inspired by large deviations of independent copies of Markov chains. In the language of this article, for any $\lambda \in (0, 1)$ and $f \in \mathbb R^{|\mathcal Z|}_{>0}$, this version of the Fisher information reads (note the difference in domain of definition compared to (17))
\[
  \mathcal R^\lambda_\zeta(f) := \sum_{z,z' \in \mathcal Z} M(z,z')\, \zeta(z)\, f(z) \Big[ \frac{f(z')}{f(z)} - 1 - \frac{1}{\lambda} \Big( \Big(\frac{f(z')}{f(z)}\Big)^\lambda - 1 \Big) \Big].
\]
Noting that
\[
  \lim_{\lambda \to 0} \Big[ x - 1 - \frac{1}{\lambda} \big( x^\lambda - 1 \big) \Big] = x - 1 - \lim_{\lambda \to 0} \frac{e^{\lambda \log x} - 1}{\lambda} = x - 1 - \log x,
\]
we obtain $\mathcal R^\lambda_\zeta(f) \to \mathcal R_\zeta(f)$ as $\lambda \to 0$, where the limit is defined in (17). In particular, the results in this paper can be appropriately generalised if $\mathcal R_\zeta$ is replaced by $\mathcal R^\lambda_\zeta$, using that $x \mapsto x - 1 - \frac{1}{\lambda}(x^\lambda - 1)$ is convex and
\[
  \mathcal R^\lambda_\zeta(\mathbf 1 + \delta f) = \delta^2 \big( (1 - \lambda)\, \mathcal E_\zeta(f) + O(\delta) \big),
\]
which follows from a Taylor expansion of $x \mapsto x - 1 - \frac{1}{\lambda}(x^\lambda - 1)$.

3.3 Generalised PI and LSI inequalities

With the generalised objects defined above, we are now ready to introduce the corresponding (non-equilibrium) generalised functional inequalities.

Definition 3.4. Let $M \in \mathbb R^{|\mathcal Z| \times |\mathcal Z|}$ be an irreducible generator. A strictly positive probability measure $\zeta \in \mathcal P_+(\mathcal Z)$ satisfies the
• generalised Poincaré inequality (gPI) if there exists a constant $\alpha > 0$ such that
\[
  \forall f \in \mathbb R^{|\mathcal Z|} : \quad \operatorname{var}_\zeta(f) \leq \frac{1}{\alpha}\, \mathcal E_\zeta(f); \tag{21}
\]
• generalised log-Sobolev inequality (gLSI) if there exists a constant $\alpha > 0$ such that
\[
  \forall \varphi \in \mathcal D_\zeta(\mathcal Z) : \quad \mathcal H_\zeta(\varphi) \leq \frac{1}{\alpha}\, \mathcal R_\zeta(\varphi), \tag{22}
\]
where $\mathcal D_\zeta(\mathcal Z)$ is the space of densities with respect to $\zeta$ (defined in (6)).
The suprema over all possible choices of $\alpha$ are called respectively the generalised Poincaré and the generalised log-Sobolev constants.
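On a two-state space the best constant in (21) can be computed by hand, which gives a convenient sanity check. The sketch below is an illustration under assumptions: the rates `a` and `b`, the measure `zeta` and the function names are hypothetical. With $\mathcal E_\zeta(f) = \frac12 \sum_{z,z'} M(z,z') \zeta(z) (f(z) - f(z'))^2$, both $\mathcal E_\zeta(f)$ and $\operatorname{var}_\zeta(f)$ are multiples of $(f(0) - f(1))^2$, so their ratio is the same for every non-constant $f$ and equals $(a\,\zeta(0) + b\,\zeta(1)) / (2\,\zeta(0)\zeta(1))$. At the steady state $\pi = (b, a)/(a+b)$ this reduces to $a + b$.

```python
def two_state_generator(a, b):
    """Generator with jump rate a from state 0 to 1 and rate b from 1 to 0."""
    return [[-a, a], [b, -b]]


def generalised_dirichlet(M, zeta, f):
    """(1/2) sum_{z,z'} M(z,z') zeta(z) (f(z) - f(z'))^2."""
    n = len(f)
    return 0.5 * sum(M[z][w] * zeta[z] * (f[z] - f[w]) ** 2
                     for z in range(n) for w in range(n))


def variance(zeta, f):
    """var_zeta(f) in centred form."""
    mean = sum(fz * wz for fz, wz in zip(f, zeta))
    return sum((fz - mean) ** 2 * wz for fz, wz in zip(f, zeta))


def gpi_ratio(M, zeta, f):
    """Ratio of generalised Dirichlet form to variance for a non-constant f."""
    return generalised_dirichlet(M, zeta, f) / variance(zeta, f)


def closed_form(a, b, zeta):
    """Hand-computed value of the ratio on two states."""
    return (a * zeta[0] + b * zeta[1]) / (2.0 * zeta[0] * zeta[1])


a, b = 2.0, 1.0
M = two_state_generator(a, b)
pi = [b / (a + b), a / (a + b)]  # steady state of the two-state chain
zeta = [0.8, 0.2]                # a non-equilibrium reference measure

print(gpi_ratio(M, pi, [1.0, 0.0]), gpi_ratio(M, zeta, [1.0, 0.0]))
```

The example also shows that the best constant depends on the reference measure: moving `zeta` away from `pi` changes the ratio even though the generator is fixed.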
We denote the generalised Poincaré and log-Sobolev constants by $\alpha_{\mathrm{gPI}}(\cdot, M), \alpha_{\mathrm{gLSI}}(\cdot, M) : \mathcal P_+(\mathcal Z) \to [0, \infty)$; they are given by
\[
  \alpha_{\mathrm{gPI}}(\zeta, M) := \inf_{\substack{f \in \mathbb R^{|\mathcal Z|} \\ f \notin \langle \mathbf 1 \rangle}} \frac{\mathcal E_\zeta(f, M)}{\operatorname{var}_\zeta(f)}, \qquad
  \alpha_{\mathrm{gLSI}}(\zeta, M) := \inf_{\substack{\varphi \in \mathcal D_\zeta(\mathcal Z) \\ \varphi \neq \mathbf 1}} \frac{\mathcal R_\zeta(\varphi, M)}{\mathcal H_\zeta(\varphi)}. \tag{23}
\]

As in the case of the classical variants, we write $\alpha_{\mathrm{gPI}}(\zeta), \alpha_{\mathrm{gLSI}}(\zeta)$ instead of $\alpha_{\mathrm{gPI}}(\zeta, M), \alpha_{\mathrm{gLSI}}(\zeta, M)$ if there is no risk of confusion about the underlying generator. Since we are particularly interested in the dependence on $\zeta$, we keep it in the notation. Note that (21) and (22) hold trivially for all $\zeta \in \mathcal P_+(\mathcal Z)$, all $f = c \mathbf 1 \in \langle \mathbf 1 \rangle$ and $\varphi = \mathbf 1$ respectively, since $\operatorname{var}_\zeta(c \mathbf 1) = \mathcal E_\zeta(c \mathbf 1) = 0$ and $\mathcal H_\zeta(\mathbf 1) = \mathcal R_\zeta(\mathbf 1) = 0$. Hence, removing them in (23) does not change the fact that $\alpha_{\mathrm{gPI}}(\zeta)$ and $\alpha_{\mathrm{gLSI}}(\zeta)$ are respectively the suprema over all constants $\alpha$ in (21) and (22).

We now prove the continuity of $\zeta \mapsto \alpha_{\mathrm{gPI}}(\zeta)$ as well as equivalent characterisations of the generalised PI constant (see Proposition 3.5 below), which will be useful in the forthcoming analysis. In the first two characterisations, the space of test functions is reduced. The first characterisation simplifies the fraction, and the second characterisation removes the dependence on $\zeta$ from the class over which $f$ is minimised. From these characterisations, it is relatively easy to prove continuity of $\zeta \mapsto \alpha_{\mathrm{gPI}}(\zeta)$. The final, third characterisation draws a bridge to the formula of the generalised LSI constant, thereby establishing a useful connection between the gPI and the gLSI. The proof of this exploits the continuity of $\alpha_{\mathrm{gPI}}$ and the fact that the generalised Dirichlet form and the variance appear as the leading-order terms of, respectively, the generalised Fisher information and the relative entropy close to $\varphi = \mathbf 1$, see (19) and (20). The result makes use of notation introduced in Section 3.1.

Proposition 3.5.
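The gPI constant $\alpha_{\mathrm{gPI}}$ defined in (23) is continuous on $\mathcal{P}_{+}(\mathcal{Z})$. Moreover, we have the following alternative characterisations for any $\zeta \in \mathcal{P}_{+}(\mathcal{Z})$:
\[
\alpha_{\mathrm{gPI}}(\zeta) = \inf_{\substack{f \in \langle \mathbf{1} \rangle^{\perp_{\zeta}} \\ \|f\|_{\zeta} = 1}} \mathcal{E}_{\zeta}(f) = \inf_{\substack{f \in \langle \mathbf{1} \rangle^{\perp} \\ \|f\| = 1}} \frac{\mathcal{E}_{\zeta}(f)}{\operatorname{var}_{\zeta}(f)}. \tag{24}
\]
In addition, for any $\mathcal{P}_{+}(\mathcal{Z}) \ni \zeta_{\varepsilon} \to \zeta$ and any $0 < \delta_{\varepsilon} \to 0$ as $\varepsilon \to 0$, we have
\[
\alpha_{\mathrm{gPI}}(\zeta) = \frac{1}{2} \lim_{\varepsilon \to 0} \inf_{\substack{\phi \in \mathcal{D}_{\zeta_{\varepsilon}}(\mathcal{Z}) \\ 0 < \|\phi - \mathbf{1}\|_{\zeta_{\varepsilon}} < \delta_{\varepsilon}}} \frac{\mathcal{R}_{\zeta_{\varepsilon}}(\phi)}{\mathcal{H}_{\zeta_{\varepsilon}}(\phi)}. \tag{25}
\]

Proof. Since $\mathbb{R}^{|\mathcal{Z}|} = \langle \mathbf{1} \rangle \oplus \langle \mathbf{1} \rangle^{\perp_{\zeta}}$, any $f \notin \langle \mathbf{1} \rangle$ can be decomposed into $f = c\mathbf{1} + g$, where $c \in \mathbb{R}$ and $g \in \langle \mathbf{1} \rangle^{\perp_{\zeta}} \setminus \{0\}$; in particular $\mathbb{E}_{\zeta}[g] = 0$. From this we claim that
\[
\alpha_{\mathrm{gPI}}(\zeta) = \inf_{g \in \langle \mathbf{1} \rangle^{\perp_{\zeta}} \setminus \{0\}} \frac{\mathcal{E}_{\zeta}(g)}{\operatorname{var}_{\zeta}(g)} = \inf_{g \in \langle \mathbf{1} \rangle^{\perp_{\zeta}} \setminus \{0\}} \mathcal{E}_{\zeta}\!\left( \frac{g}{\|g\|_{\zeta}} \right) = \inf_{\substack{f \in \langle \mathbf{1} \rangle^{\perp_{\zeta}} \\ \|f\|_{\zeta} = 1}} \mathcal{E}_{\zeta}(f).
\]
Indeed, the first equality follows by using $f = c\mathbf{1} + g$ as above, $\mathcal{E}_{\zeta}(c\mathbf{1} + g) = \mathcal{E}_{\zeta}(g)$ and $\operatorname{var}_{\zeta}(c\mathbf{1} + g) = \operatorname{var}_{\zeta}(g)$. The second equality follows since $\mathcal{E}_{\zeta}(\beta g) = \beta^{2} \mathcal{E}_{\zeta}(g)$ for any $\beta \in \mathbb{R}$ and $\operatorname{var}_{\zeta}(g) = \mathbb{E}_{\zeta}[g^{2}] = \|g\|_{\zeta}^{2}$. The second equality in (24) follows by repeating the same arguments as above using the alternative decomposition $\mathbb{R}^{|\mathcal{Z}|} = \langle \mathbf{1} \rangle \oplus \langle \mathbf{1} \rangle^{\perp}$ and noting that $\operatorname{var}_{\zeta}(\beta g) = \beta^{2} \operatorname{var}_{\zeta}(g)$.

We now prove the continuity of $\alpha_{\mathrm{gPI}}$. Using the second alternative characterisation we write
\[
\alpha_{\mathrm{gPI}}(\zeta) = \inf_{\substack{f \in \langle \mathbf{1} \rangle^{\perp} \\ \|f\| = 1}} F(\zeta, f), \qquad F(\zeta, f) := \frac{\mathcal{E}_{\zeta}(f)}{\operatorname{var}_{\zeta}(f)},
\]
where the domain of $F(\cdot,\cdot)$ is given by
\[
\operatorname{Dom}(F) = \bigcup_{\tau > 0} K_{\tau}, \qquad K_{\tau} := \big\{ \zeta \in \mathcal{P}_{+}(\mathcal{Z}) : \zeta(z) \ge \tau,\ \forall z \in \mathcal{Z} \big\} \times \big\{ f \in \langle \mathbf{1} \rangle^{\perp} : \|f\| = 1 \big\}.
\]
Note that $F \in C(K_{\tau})$ for any $\tau > 0$ since $K_{\tau}$ is a compact set, $\mathcal{E}_{\zeta}(f)$ and $\operatorname{var}_{\zeta}(f)$ are continuous on $K_{\tau}$ (the former follows from Proposition 3.2), and for any $(\zeta, f) \in K_{\tau}$
\[
\operatorname{var}_{\zeta}(f) = \mathbb{E}_{\zeta}\big[ (f - \mathbb{E}_{\zeta}[f])^{2} \big] \ge \tau \|f - \mathbb{E}_{\zeta}[f]\|^{2} = \tau \big( \|f\|^{2} + \|\mathbb{E}_{\zeta}[f]\|^{2} \big) \ge \tau \|f\|^{2} = \tau > 0,
\]
where the second equality follows since $f \in \langle \mathbf{1} \rangle^{\perp}$ and $\mathbb{E}_{\zeta}[f] \in \langle \mathbf{1} \rangle$.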
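Therefore $\zeta \mapsto \alpha_{\mathrm{gPI}}(\zeta)$ is continuous on $\{ \zeta \in \mathcal{P}(\mathcal{Z}) : \zeta(z) \ge \tau,\ \forall z \in \mathcal{Z} \}$. Since $\tau > 0$ is arbitrary, it follows that $\zeta \mapsto \alpha_{\mathrm{gPI}}(\zeta)$ is continuous on $\mathcal{P}_{+}(\mathcal{Z})$.

Finally, we prove (25). Let $\zeta, \zeta_{\varepsilon}, \delta_{\varepsilon}$ be given. We will prove that
\[
\liminf_{\varepsilon \to 0} \inf_{\substack{\phi \in \mathcal{D}_{\zeta_{\varepsilon}}(\mathcal{Z}) \\ 0 < \|\phi - \mathbf{1}\|_{\zeta_{\varepsilon}} < \delta_{\varepsilon}}} \frac{\mathcal{R}_{\zeta_{\varepsilon}}(\phi)}{\mathcal{H}_{\zeta_{\varepsilon}}(\phi)} \ge 2\alpha_{\mathrm{gPI}}(\zeta), \tag{26}
\]
\[
\limsup_{\varepsilon \to 0} \inf_{\substack{\phi \in \mathcal{D}_{\zeta_{\varepsilon}}(\mathcal{Z}) \\ 0 < \|\phi - \mathbf{1}\|_{\zeta_{\varepsilon}} < \delta_{\varepsilon}}} \frac{\mathcal{R}_{\zeta_{\varepsilon}}(\phi)}{\mathcal{H}_{\zeta_{\varepsilon}}(\phi)} \le 2\alpha_{\mathrm{gPI}}(\zeta), \tag{27}
\]
which together establish (25).

We start with proving (26). Recall $\zeta_{*} = \min_{z \in \mathcal{Z}} \zeta(z) > 0$ and take $\varepsilon$ small enough such that $(\zeta_{\varepsilon})_{*} > \frac{1}{2}\zeta_{*}$ and $\delta_{\varepsilon} \le \frac{1}{4}\sqrt{2\zeta_{*}}$ (this upper bound is chosen a posteriori). Let $\phi$ be any admissible function in (25) (note that $\phi$ and the following objects depend on $\varepsilon$). Let
\[
\delta := \|\phi - \mathbf{1}\|_{\zeta_{\varepsilon}} \in (0, \delta_{\varepsilon}), \qquad f := \frac{\phi - \mathbf{1}}{\delta} \in \mathbb{R}^{|\mathcal{Z}|}.
\]
By construction, $\phi = \mathbf{1} + \delta f$, $\mathbb{E}_{\zeta_{\varepsilon}}[f] = 0$ (i.e. $f \in \langle \mathbf{1} \rangle^{\perp_{\zeta_{\varepsilon}}}$) and $\operatorname{var}_{\zeta_{\varepsilon}}(f) = \|f\|_{\zeta_{\varepsilon}}^{2} = 1$, where the latter follows from the definition of $\delta$. Using $\delta < \delta_{\varepsilon} \le \frac{1}{4}\sqrt{2\zeta_{*}}$ and
\[
1 = \|f\|_{\zeta_{\varepsilon}}^{2} = \sum_{z \in \mathcal{Z}} f^{2}(z)\, \zeta_{\varepsilon}(z) \ge (\zeta_{\varepsilon})_{*} \|f\|_{\infty}^{2} \ge \frac{\zeta_{*}}{2} \|f\|_{\infty}^{2}, \tag{28}
\]
we obtain $\delta \|f\|_{\infty} < \frac{1}{2}$. Hence, the expansions of the generalised Fisher information and relative entropy in Proposition 3.2 and (20) apply. In particular, since $\|f\|_{\infty}^{2} \le \frac{2}{\zeta_{*}}$, the corresponding remainder terms in (19), (20) are bounded by some constants $C, C' > 0$ independent of $\varepsilon$, which is crucial for what follows. With this we obtain (recall that $\operatorname{var}_{\zeta_{\varepsilon}}(f) = 1$)
\[
\frac{\mathcal{R}_{\zeta_{\varepsilon}}(\phi)}{\mathcal{H}_{\zeta_{\varepsilon}}(\phi)} = \frac{\mathcal{R}_{\zeta_{\varepsilon}}(\mathbf{1} + \delta f)}{\mathcal{H}_{\zeta_{\varepsilon}}(\mathbf{1} + \delta f)} \ge \frac{2\mathcal{E}_{\zeta_{\varepsilon}}(f) - C\delta}{1 + C'\delta} \ge \frac{2\mathcal{E}_{\zeta_{\varepsilon}}(f)}{1 + C'\delta_{\varepsilon}} - C\delta_{\varepsilon}. \tag{29}
\]
Then, using that $\phi$ is arbitrary, we obtain
\[
\inf_{\substack{\phi \in \mathcal{D}_{\zeta_{\varepsilon}}(\mathcal{Z}) \\ 0 < \|\phi - \mathbf{1}\|_{\zeta_{\varepsilon}} < \delta_{\varepsilon}}} \frac{\mathcal{R}_{\zeta_{\varepsilon}}(\phi)}{\mathcal{H}_{\zeta_{\varepsilon}}(\phi)} \ge \frac{2}{1 + C'\delta_{\varepsilon}} \inf_{\substack{f \in \langle \mathbf{1} \rangle^{\perp_{\zeta_{\varepsilon}}} \\ \|f\|_{\zeta_{\varepsilon}} = 1}} \mathcal{E}_{\zeta_{\varepsilon}}(f) - C\delta_{\varepsilon}. \tag{30}
\]
By the first characterisation of $\alpha_{\mathrm{gPI}}(\zeta)$ we recognise the minimisation on the right-hand side as $\alpha_{\mathrm{gPI}}(\zeta_{\varepsilon})$.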
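Then, (26) follows from $\delta_{\varepsilon} \to 0$ and the continuity of $\alpha_{\mathrm{gPI}}$.

We now prove (27). While most of the proof is the same as that for (26), we are going to establish (29) and (30) in the reverse order with reversed inequalities. We choose $\varepsilon$ small enough such that $(\zeta_{\varepsilon})_{*} > \frac{1}{2}\zeta_{*}$ and $\delta_{\varepsilon} \le \delta_{0}$ for some $\varepsilon$-independent $\delta_{0} > 0$ that we choose later by means of two upper bounds. Let $\delta \in (0, \delta_{\varepsilon})$ and let $f$ be as in the infimum on the right-hand side of (30), i.e. $f \in \langle \mathbf{1} \rangle^{\perp_{\zeta_{\varepsilon}}}$ and $\|f\|_{\zeta_{\varepsilon}} = 1$. Take $\phi := \mathbf{1} + \delta f$, and note that $\phi$ belongs to the class from the infimum on the left-hand side of (30). Since (28) still holds, we have for $\delta_{0} \le \frac{1}{4}\sqrt{2\zeta_{*}}$ that Proposition 3.2 and (20) apply with $f, \delta$. This yields the reverse of (29), given by
\[
\frac{\mathcal{R}_{\zeta_{\varepsilon}}(\phi)}{\mathcal{H}_{\zeta_{\varepsilon}}(\phi)} \le \frac{2\mathcal{E}_{\zeta_{\varepsilon}}(f)}{1 - C'\delta_{\varepsilon}} + C\delta_{\varepsilon}
\]
for some constants $C, C' > 0$ independent of $\varepsilon$, where we have assumed $\delta_{0} \le \frac{1}{2}(C')^{-1}$ to ensure that the denominator is positive. Moving the constants involving $\delta_{\varepsilon}$ to the left-hand side and taking the infimum over $f$ on both sides, we get
\[
(1 - C'\delta_{\varepsilon}) \left( \inf_{\substack{\phi \in \mathcal{D}_{\zeta_{\varepsilon}}(\mathcal{Z}) \\ 0 < \|\phi - \mathbf{1}\|_{\zeta_{\varepsilon}} < \delta_{\varepsilon}}} \frac{\mathcal{R}_{\zeta_{\varepsilon}}(\phi)}{\mathcal{H}_{\zeta_{\varepsilon}}(\phi)} - C\delta_{\varepsilon} \right) \le 2 \inf_{\substack{f \in \langle \mathbf{1} \rangle^{\perp_{\zeta_{\varepsilon}}} \\ \|f\|_{\zeta_{\varepsilon}} = 1}} \mathcal{E}_{\zeta_{\varepsilon}}(f) = 2\alpha_{\mathrm{gPI}}(\zeta_{\varepsilon}).
\]
Taking the limsup over $\varepsilon$ yields (27).

4 Properties of the generalised PI and LSI constants

We now establish the main properties of the generalised constants $\alpha_{\mathrm{gPI}}(\zeta)$ and $\alpha_{\mathrm{gLSI}}(\zeta)$ defined in (23). We begin with two elementary observations: first, that $2\alpha_{\mathrm{gPI}}(\zeta) \ge \alpha_{\mathrm{gLSI}}(\zeta)$; and second, that both the gPI and gLSI tensorise, meaning that a product Markov chain inherits the smallest gPI and gLSI constants among its components. We then show that, in addition to $\zeta \mapsto \alpha_{\mathrm{gPI}}(\zeta)$, also $\zeta \mapsto \alpha_{\mathrm{gLSI}}(\zeta)$ is continuous on $\mathcal{P}_{+}(\mathcal{Z})$. Finally, we derive positive lower bounds on $\alpha_{\mathrm{gLSI}}(\zeta)$, which then, by the inequality $2\alpha_{\mathrm{gPI}}(\zeta) \ge \alpha_{\mathrm{gLSI}}(\zeta)$, also imply the same bounds for $\alpha_{\mathrm{gPI}}(\zeta)$.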
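The first characterisation in (24) lends itself to a direct numerical check. The following is a minimal sketch, not part of the original analysis, assuming the generalised Dirichlet form $\mathcal{E}_{\zeta}(f) = \frac{1}{2}\sum_{z,z'} M(z,z')\zeta(z)(f(z)-f(z'))^{2}$. Writing $\mathcal{E}_{\zeta}(f) = f^{\top} A f$ for a symmetric matrix $A$ and substituting $g = \sqrt{\zeta}\, f$ turns the constrained infimum into the second-smallest eigenvalue of $\operatorname{diag}(\zeta)^{-1/2} A \operatorname{diag}(\zeta)^{-1/2}$. The function names below are hypothetical.

```python
import numpy as np

def dirichlet_matrix(zeta, M):
    """Symmetric A with f @ A @ f = 0.5 * sum_{z,z'} M[z,z'] * zeta[z] * (f[z]-f[z'])**2.

    Assumption: this is the generalised Dirichlet form used in the text.
    """
    W = zeta[:, None] * M              # W[z, z'] = zeta(z) * M(z, z')
    np.fill_diagonal(W, 0.0)           # diagonal terms do not contribute
    Ws = 0.5 * (W + W.T)               # only the symmetric part enters the form
    return np.diag(Ws.sum(axis=1)) - Ws  # weighted graph Laplacian

def variance(zeta, f):
    """Variance of f under the probability vector zeta."""
    mean = zeta @ f
    return zeta @ (f - mean) ** 2

def gpi_constant(zeta, M):
    """Infimum of the Dirichlet form over zeta-centred, zeta-normalised f.

    With g = sqrt(zeta) * f the constraints become |g| = 1 and g orthogonal
    to sqrt(zeta), which spans the kernel of S for an irreducible generator,
    so the infimum is the second-smallest eigenvalue of S.
    """
    A = dirichlet_matrix(zeta, M)
    s = 1.0 / np.sqrt(zeta)
    S = s[:, None] * A * s[None, :]
    return np.linalg.eigvalsh(S)[1]    # eigenvalues are returned in ascending order

if __name__ == "__main__":
    # Cyclic chain on three states: M[i, i+1] = 1, M[i, i] = -1.
    M = np.array([[-1.0, 1.0, 0.0],
                  [0.0, -1.0, 1.0],
                  [1.0, 0.0, -1.0]])
    print(gpi_constant(np.ones(3) / 3, M))                  # approximately 1.5
    print(gpi_constant(np.array([0.5, 0.25, 0.25]), M))
```

For a non-uniform reference measure the value changes with $\zeta$, which is the dependence the continuity statement in Proposition 3.5 is about.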
4.1 gLSI implies gPI and tensorisation

The following result shows that $\alpha_{\mathrm{gLSI}}(\zeta) \le 2\alpha_{\mathrm{gPI}}(\zeta)$, which in particular implies that the gLSI implies the gPI. Note that this is also a property of the corresponding classical variants [BT06, Section 3].

Proposition 4.1. For any $\zeta \in \mathcal{P}_{+}(\mathcal{Z})$, $\alpha_{\mathrm{gLSI}}(\zeta) \le 2\alpha_{\mathrm{gPI}}(\zeta)$.

The proof is a direct consequence of the characterisation of $\alpha_{\mathrm{gPI}}$ in (25); in particular, taking $\zeta_{\varepsilon} = \zeta$, removing the condition $\|\phi - \mathbf{1}\|_{\zeta_{\varepsilon}} < \delta_{\varepsilon}$ from (25) (which can only decrease the infimum) and using (23), we arrive at the required result.

The following result presents the tensorisation property of the generalised PI and LSI inequalities, which carries over from the classical counterparts [DSC96, Lemma 3.2].

Proposition 4.2. Let $d \ge 1$. For each $i \in \{1, \ldots, d\}$ let $\mathcal{Z}_{i}$ be a finite state space, $M_{i}$ be an irreducible generator on $\mathcal{Z}_{i}$ and $\zeta_{i} \in \mathcal{P}_{+}(\mathcal{Z}_{i})$. Define
\[
\mathcal{Z} := \prod_{i=1}^{d} \mathcal{Z}_{i}, \qquad \zeta := \zeta_{1} \otimes \ldots \otimes \zeta_{d} \in \mathcal{P}_{+}(\mathcal{Z}), \qquad M := \frac{1}{d} \sum_{i=1}^{d} \underbrace{I \otimes \ldots \otimes I}_{i-1} \otimes M_{i} \otimes \underbrace{I \otimes \ldots \otimes I}_{d-i} \in \mathbb{R}^{|\mathcal{Z}| \times |\mathcal{Z}|}.
\]
Then, $M$ is an irreducible generator on $\mathcal{Z}$ and $\zeta$ satisfies the gPI and gLSI with constants
\[
\alpha_{\mathrm{gPI}}(\zeta, M) = \frac{1}{d} \min_{1 \le i \le d} \alpha_{\mathrm{gPI}}(\zeta_{i}, M_{i}), \qquad \alpha_{\mathrm{gLSI}}(\zeta, M) = \frac{1}{d} \min_{1 \le i \le d} \alpha_{\mathrm{gLSI}}(\zeta_{i}, M_{i}).
\]

Proof. The proof for the gPI follows exactly as in the classical case, see [DSC96, Lemma 3.2] for instance. For the gLSI we provide a proof for $d = 2$, and note that the calculations easily generalise to $d > 2$ (albeit with messier notation). In the following $z = (x, y)$ and $z' = (x', y')$.
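Using $\Phi(r, s) = r\big( \frac{s}{r} - 1 - \log \frac{s}{r} \big)$ and $\phi_{x}(\cdot) = \phi(x, \cdot)$, $\phi_{y}(\cdot) = \phi(\cdot, y)$ for $\phi \in \mathcal{D}_{\zeta}(\mathcal{Z})$, we find
\[
\begin{aligned}
\mathcal{R}_{\zeta}(\phi) &= \sum_{(x,y) \in \mathcal{Z}} \zeta_{1}(x)\zeta_{2}(y) \sum_{(x',y') \in \mathcal{Z}} M\big( (x,y), (x',y') \big)\, \Phi\big( \phi(x,y), \phi(x',y') \big) \\
&= \frac{1}{2} \sum_{y \in \mathcal{Z}_{2}} \zeta_{2}(y) \sum_{x \in \mathcal{Z}_{1}} \zeta_{1}(x) \sum_{x' \in \mathcal{Z}_{1}} M_{1}(x,x')\, \Phi\big( \phi(x,y), \phi(x',y) \big) \\
&\quad + \frac{1}{2} \sum_{x \in \mathcal{Z}_{1}} \zeta_{1}(x) \sum_{y \in \mathcal{Z}_{2}} \zeta_{2}(y) \sum_{y' \in \mathcal{Z}_{2}} M_{2}(y,y')\, \Phi\big( \phi(x,y), \phi(x,y') \big) \\
&= \frac{1}{2} \mathbb{E}_{\zeta_{2}}\big[ \mathcal{R}_{\zeta_{1}}(\phi_{y}) \big] + \frac{1}{2} \mathbb{E}_{\zeta_{1}}\big[ \mathcal{R}_{\zeta_{2}}(\phi_{x}) \big].
\end{aligned} \tag{31}
\]
Here and in the rest of this proof we use the convention that $\mathbb{E}_{\zeta_{1}}$, $\mathcal{R}_{\zeta_{1}}$ and $\mathcal{H}_{\zeta_{1}}$ only act on the $x$-variable, while similar operations with respect to $\zeta_{2}$ only act on the $y$-variable.

For any $\phi \in \mathcal{D}_{\zeta}(\mathcal{Z})$, we define the marginal density $g \in \mathcal{D}_{\zeta_{2}}(\mathcal{Z}_{2})$ and the family of conditional densities $\Psi_{y} \in \mathcal{D}_{\zeta_{1}}(\mathcal{Z}_{1})$ for $y \in \mathcal{Z}_{2}$ as
\[
g(y) := \mathbb{E}_{\zeta_{1}}(\phi_{y}) = \sum_{x \in \mathcal{Z}_{1}} \zeta_{1}(x)\, \phi(x,y) > 0, \qquad \Psi_{y}(x) := \frac{\phi(x,y)}{g(y)},
\]
which leads to the decomposition
\[
\forall \phi \in \mathcal{D}_{\zeta}(\mathcal{Z}): \quad \phi(x,y) = g(y)\, \Psi_{y}(x).
\]
Therefore for any $\phi \in \mathcal{D}_{\zeta}(\mathcal{Z})$ we find
\[
\mathcal{H}_{\zeta}(\phi) = \sum_{y \in \mathcal{Z}_{2}} \zeta_{2}(y) \sum_{x \in \mathcal{Z}_{1}} \zeta_{1}(x)\, g(y) \Psi_{y}(x) \big[ \log \Psi_{y}(x) + \log g(y) \big] = \mathbb{E}_{\zeta_{2}}\big[ g(\cdot)\, \mathcal{H}_{\zeta_{1}}(\Psi_{(\cdot)}) \big] + \mathcal{H}_{\zeta_{2}}(g),
\]
where the second equality follows since $\mathbb{E}_{\zeta_{1}}(\Psi_{y}) = 1$. Using the notation $\alpha^{j}_{\mathrm{gLSI}} := \alpha_{\mathrm{gLSI}}(\zeta_{j}, M_{j})$ for $j = 1, 2$, we therefore find
\[
\begin{aligned}
\min\{ \alpha^{1}_{\mathrm{gLSI}}, \alpha^{2}_{\mathrm{gLSI}} \}\, \mathcal{H}_{\zeta}(\phi) &\le \alpha^{1}_{\mathrm{gLSI}}\, \mathbb{E}_{\zeta_{2}}\big[ g(\cdot)\, \mathcal{H}_{\zeta_{1}}(\Psi_{(\cdot)}) \big] + \alpha^{2}_{\mathrm{gLSI}}\, \mathcal{H}_{\zeta_{2}}(g) \\
&\le \mathbb{E}_{\zeta_{2}}\big[ g(\cdot)\, \mathcal{R}_{\zeta_{1}}(\Psi_{(\cdot)}) \big] + \mathcal{R}_{\zeta_{2}}(g) = \mathbb{E}_{\zeta_{2}}\big[ \mathcal{R}_{\zeta_{1}}(\phi_{y}) \big] + \mathcal{R}_{\zeta_{2}}(g),
\end{aligned} \tag{32}
\]
where the final equality follows since the generalised Fisher information is 1-homogeneous, i.e. $\mathcal{R}_{\zeta_{1}}(c\xi) = c\, \mathcal{R}_{\zeta_{1}}(\xi)$ for any $c \in \mathbb{R}$ and $\xi \in \mathcal{D}_{\zeta_{1}}(\mathcal{Z}_{1})$.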
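Since $(r, s) \mapsto \Phi(r, s)$ is convex for $r, s > 0$, using Jensen's inequality along with $g(y) = \mathbb{E}_{\zeta_{1}}(\phi_{y})$, we have the bound
\[
\begin{aligned}
\mathcal{R}_{\zeta_{2}}(g) &= \sum_{y,y' \in \mathcal{Z}_{2}} \zeta_{2}(y) M_{2}(y,y')\, \Phi\big( \mathbb{E}_{\zeta_{1}}(\phi_{y}), \mathbb{E}_{\zeta_{1}}(\phi_{y'}) \big) \\
&\le \sum_{y,y' \in \mathcal{Z}_{2}} \zeta_{2}(y) M_{2}(y,y')\, \mathbb{E}_{\zeta_{1}} \Phi\big( \phi_{y}, \phi_{y'} \big) = \sum_{x \in \mathcal{Z}_{1}} \big[ \mathcal{R}_{\zeta_{2}}(\phi_{x}) \big] \zeta_{1}(x) = \mathbb{E}_{\zeta_{1}}\big[ \mathcal{R}_{\zeta_{2}}(\phi_{x}) \big].
\end{aligned}
\]
Substituting this bound into (32) and using (31) we find
\[
\min\{ \alpha^{1}_{\mathrm{gLSI}}, \alpha^{2}_{\mathrm{gLSI}} \}\, \mathcal{H}_{\zeta}(\phi) \le 2 \mathcal{R}_{\zeta}(\phi),
\]
which implies that $2\alpha_{\mathrm{gLSI}}(\zeta, M) \ge \min\{ \alpha^{1}_{\mathrm{gLSI}}, \alpha^{2}_{\mathrm{gLSI}} \}$.

To obtain the opposite inequality, we repeat the arguments above with the simple choices $\phi(x,y) = \xi(x)$ and $\phi(x,y) = \eta(y)$. For the first choice, the second term in (31) vanishes, which yields $\mathcal{H}_{\zeta}(\phi) = \mathcal{H}_{\zeta_{1}}(\xi)$ and $\mathcal{R}_{\zeta}(\phi) = \frac{1}{2}\mathcal{R}_{\zeta_{1}}(\xi)$, and thus $2\alpha_{\mathrm{gLSI}}(\zeta, M) \le \alpha_{\mathrm{gLSI}}(\zeta_{1}, M_{1})$. Similarly, for the second choice of $\phi$ we get $2\alpha_{\mathrm{gLSI}}(\zeta, M) \le \alpha_{\mathrm{gLSI}}(\zeta_{2}, M_{2})$.

4.2 Continuity of the generalised LSI constant

With the continuity of the generalised PI constant already established in Proposition 3.5, we now prove that $\alpha_{\mathrm{gLSI}}$ is continuous as well. The proof is based on the variational definition of $\alpha_{\mathrm{gLSI}}(\zeta)$ as the infimum of $\mathcal{R}_{\zeta}(\phi)/\mathcal{H}_{\zeta}(\phi)$ over $\phi$, see (23). If $\mathcal{R}_{\zeta}(\phi)/\mathcal{H}_{\zeta}(\phi)$ were continuous in $\phi \in \mathcal{D}_{\zeta}(\mathcal{Z})$, then the proof would be much simpler. However, $\mathcal{R}_{\zeta}(\phi)/\mathcal{H}_{\zeta}(\phi)$ may be discontinuous at $\phi = \mathbf{1}$; see Example 4.3 below. To obtain continuity of $\alpha_{\mathrm{gLSI}}$, we establish $\Gamma$-convergence of $\mathcal{R}_{\zeta}(\phi)/\mathcal{H}_{\zeta}(\phi)$, see (33), which implies the convergence of minima and thus the continuity of $\alpha_{\mathrm{gLSI}}$.

Example 4.3 (Discontinuity of $\mathcal{R}_{\zeta}(\phi)/\mathcal{H}_{\zeta}(\phi)$ at $\phi = \mathbf{1}$). Let $\mathcal{Z} = \mathbb{Z}/(3\mathbb{Z}) \cong \{0, 1, 2\}$. We treat $i \in \mathcal{Z}$ as indices. We consider the cyclic chain given by $M_{i,i+1} = 1$, $M_{ii} = -1$ and $0$ otherwise. In vector notation, let
\[
\zeta = \frac{1}{4} \begin{pmatrix} 2 \\ 1 \\ 1 \end{pmatrix}, \qquad f^{1} = \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix}, \qquad f^{2} = \begin{pmatrix} 1 \\ -2 \\ 0 \end{pmatrix}, \qquad \phi^{k} = \mathbf{1} + \delta f^{k} \quad (k = 1, 2),
\]
where we will later let $\delta \to 0$.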
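By construction, $\mathbb{E}_{\zeta}[f^{k}] = 0$ for $k = 1, 2$. Then, for $\delta$ small enough, $\phi^{k} \in \mathcal{D}_{\zeta}(\mathcal{Z})$, and the expansions in (19) and (20) apply and yield
\[
\mathcal{R}_{\zeta}(\phi^{k}) = \delta^{2} \mathcal{E}_{\zeta}(f^{k}) + O(\delta^{3}), \qquad \mathcal{H}_{\zeta}(\phi^{k}) = \frac{\delta^{2}}{2} \operatorname{var}_{\zeta}(f^{k}) + O(\delta^{3}).
\]
Straightforward computations yield
\[
\operatorname{var}_{\zeta}(f^{k}) = \sum_{i=0}^{2} (f^{k}_{i})^{2} \zeta_{i} = \begin{cases} \frac{1}{2} & \text{if } k = 1, \\ \frac{3}{2} & \text{if } k = 2, \end{cases}
\]
\[
\mathcal{E}_{\zeta}(f^{k}) = \frac{1}{2} \sum_{i,j=0}^{2} M_{ij} \zeta_{i} (f^{k}_{i} - f^{k}_{j})^{2} = \frac{1}{2} \sum_{i=0}^{2} \zeta_{i} (f^{k}_{i} - f^{k}_{i+1})^{2} = \begin{cases} \frac{7}{8} & \text{if } k = 1, \\ \frac{23}{8} & \text{if } k = 2, \end{cases}
\]
and therefore $\mathcal{R}_{\zeta}(\phi)/\mathcal{H}_{\zeta}(\phi)$ is not continuous at $\phi = \mathbf{1}$ since
\[
\lim_{\delta \to 0} \frac{\mathcal{R}_{\zeta}(\mathbf{1} + \delta f^{k})}{\mathcal{H}_{\zeta}(\mathbf{1} + \delta f^{k})} = \frac{2\mathcal{E}_{\zeta}(f^{k})}{\operatorname{var}_{\zeta}(f^{k})} = \begin{cases} \frac{7}{2} & \text{if } k = 1, \\ \frac{23}{6} & \text{if } k = 2. \end{cases}
\]

Theorem 4.4. Let $M$ be a generator of an irreducible Markov chain on a finite state space $\mathcal{Z}$ and $\zeta \in \mathcal{P}_{+}(\mathcal{Z})$, the set of positive probability measures on $\mathcal{Z}$. The generalised LSI constant $\alpha_{\mathrm{gLSI}}(\zeta)$, see (23), is continuous on $\mathcal{P}_{+}(\mathcal{Z})$.

Proof. We show that for any $\zeta \in \mathcal{P}_{+}(\mathcal{Z})$ and any sequence $\{\zeta_{\varepsilon}\}_{\varepsilon > 0} \subset \mathcal{P}_{+}(\mathcal{Z})$ with $\zeta_{\varepsilon} \to \zeta$ as $\varepsilon \to 0$ we have $\alpha_{\mathrm{gLSI}}(\zeta_{\varepsilon}) \to \alpha_{\mathrm{gLSI}}(\zeta)$ as $\varepsilon \to 0$. Let such $\zeta, \zeta_{\varepsilon}$ be given. We introduce the corresponding functionals $E_{\varepsilon}, E_{0} : \mathbb{R}^{|\mathcal{Z}|}_{>0} \to \mathbb{R} \cup \{+\infty\}$ as
\[
E_{\varepsilon}(\phi) := \begin{cases} \dfrac{\mathcal{R}_{\zeta_{\varepsilon}}(\phi)}{\mathcal{H}_{\zeta_{\varepsilon}}(\phi)}, & \text{if } \phi \in \mathcal{D}_{\zeta_{\varepsilon}}(\mathcal{Z}) \setminus \{\mathbf{1}\}, \\ +\infty, & \text{otherwise,} \end{cases}
\qquad
E_{0}(\phi) := \begin{cases} \dfrac{\mathcal{R}_{\zeta}(\phi)}{\mathcal{H}_{\zeta}(\phi)}, & \text{if } \phi \in \mathcal{D}_{\zeta}(\mathcal{Z}) \setminus \{\mathbf{1}\}, \\ 2\alpha_{\mathrm{gPI}}(\zeta), & \text{if } \phi = \mathbf{1}, \\ +\infty, & \text{otherwise.} \end{cases}
\]
In the following, we will prove the stronger result
\[
\Gamma\text{-}\lim_{\varepsilon \to 0} E_{\varepsilon} = E_{0}, \tag{33}
\]
i.e. the functional $E_{\varepsilon}$ $\Gamma$-converges to the functional $E_{0}$ as $\varepsilon \to 0$, which is equivalent to the following two conditions:

(Γ1) For any $\phi \in \mathbb{R}^{|\mathcal{Z}|}_{>0}$ and any sequence $\{\phi_{\varepsilon}\}_{\varepsilon > 0} \subset \mathbb{R}^{|\mathcal{Z}|}_{>0}$ which satisfies $\phi_{\varepsilon} \to \phi$ as $\varepsilon \to 0$, it holds
\[
\liminf_{\varepsilon \to 0} E_{\varepsilon}(\phi_{\varepsilon}) \ge E_{0}(\phi). \tag{34}
\]

(Γ2) For any $\phi \in \mathbb{R}^{|\mathcal{Z}|}_{>0}$ there exists a sequence $\{\phi_{\varepsilon}\}_{\varepsilon > 0} \subset \mathbb{R}^{|\mathcal{Z}|}_{>0}$ such that $\phi_{\varepsilon} \to \phi$ as $\varepsilon \to 0$ and it holds
\[
\limsup_{\varepsilon \to 0} E_{\varepsilon}(\phi_{\varepsilon}) \le E_{0}(\phi).
\]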
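(35)

This $\Gamma$-convergence result, along with the observation that $E_{\varepsilon}, E_{0}$ take finite values at some $\phi \in \mathbb{R}^{|\mathcal{Z}|}_{>0}$, ensures that the infima converge, i.e. $\alpha_{\mathrm{gLSI}}(\zeta_{\varepsilon}) \to \alpha_{\mathrm{gLSI}}(\zeta)$ as $\varepsilon \to 0$, see [Bra02], which is the claimed result.

We first point out that (34), (35) hold if $\phi \neq \mathbf{1}$. This follows since $(\eta, \phi) \mapsto \mathcal{H}_{\eta}(\phi), \mathcal{R}_{\eta}(\phi)$ are continuous maps on $\mathcal{P}_{+}(\mathcal{Z}) \times \mathbb{R}^{|\mathcal{Z}|}_{>0}$ and $\mathcal{H}_{\eta}(\phi) > 0$ for all $\phi \in \mathcal{D}_{\eta}(\mathcal{Z}) \setminus \{\mathbf{1}\}$, see Proposition 3.2. Consequently, for any $\phi \in \mathcal{D}_{\zeta}(\mathcal{Z}) \setminus \{\mathbf{1}\}$ and any sequence $\{\phi_{\varepsilon}\}_{\varepsilon > 0} \subset \mathcal{D}_{\zeta_{\varepsilon}}(\mathcal{Z})$ with $\phi_{\varepsilon} \to \phi$, we have that $\|\phi_{\varepsilon} - \mathbf{1}\| \ge \frac{1}{2}\|\phi - \mathbf{1}\| > 0$ for all $\varepsilon > 0$ small enough, and then
\[
\lim_{\varepsilon \to 0} E_{\varepsilon}(\phi_{\varepsilon}) = E_{0}(\phi).
\]
Note that one example is given by $\phi_{\varepsilon} = \phi\, \frac{\zeta}{\zeta_{\varepsilon}}$, which proves the existence statement in (35).

We now prove (34) for $\phi = \mathbf{1}$. We denote the left-hand side in (34) by $a \in [0, \infty]$. We may assume that $a < \infty$, as otherwise (34) is trivially satisfied. Then, upon extracting a subsequence of $\varepsilon$ (not relabelled), we may assume that $\infty > E_{\varepsilon}(\phi_{\varepsilon}) \to a$ as $\varepsilon \to 0$. Finiteness implies that $\{\phi_{\varepsilon}\}_{\varepsilon > 0} \subset \mathcal{D}_{\zeta_{\varepsilon}}(\mathcal{Z})$ satisfies $\phi_{\varepsilon} \neq \mathbf{1}$ for all $\varepsilon$. We introduce $h_{\varepsilon} = \phi_{\varepsilon} - \mathbf{1}$, and recall from the proof of Proposition 3.5 that $0 < \|h_{\varepsilon}\|_{\zeta_{\varepsilon}} \to 0$ as $\varepsilon \to 0$. Then, (34) follows from (25), applied with $\delta_{\varepsilon} = 2\|h_{\varepsilon}\|_{\zeta_{\varepsilon}}$ so that $\phi_{\varepsilon}$ is admissible in the infimum, by
\[
\lim_{\varepsilon \to 0} E_{\varepsilon}(\phi_{\varepsilon}) = \lim_{\varepsilon \to 0} \frac{\mathcal{R}_{\zeta_{\varepsilon}}(\phi_{\varepsilon})}{\mathcal{H}_{\zeta_{\varepsilon}}(\phi_{\varepsilon})} \ge \lim_{\varepsilon \to 0} \inf_{\substack{\phi \in \mathcal{D}_{\zeta_{\varepsilon}}(\mathcal{Z}) \\ 0 < \|\phi - \mathbf{1}\|_{\zeta_{\varepsilon}} < 2\|h_{\varepsilon}\|_{\zeta_{\varepsilon}}}} \frac{\mathcal{R}_{\zeta_{\varepsilon}}(\phi)}{\mathcal{H}_{\zeta_{\varepsilon}}(\phi)} = 2\alpha_{\mathrm{gPI}}(\zeta) = E_{0}(\mathbf{1}).
\]

Next, we show (35) for $\phi = \mathbf{1}$. We use Proposition 3.5 to characterise
\[
E_{0}(\mathbf{1}) = \lim_{\varepsilon \to 0} \inf_{\substack{\phi \in \mathcal{D}_{\zeta_{\varepsilon}}(\mathcal{Z}) \\ 0 < \|\phi - \mathbf{1}\|_{\zeta_{\varepsilon}} < \varepsilon}} \frac{\mathcal{R}_{\zeta_{\varepsilon}}(\phi)}{\mathcal{H}_{\zeta_{\varepsilon}}(\phi)}.
\]
Let $\phi_{\varepsilon}$ be a minimising sequence for the sequence of minimisation problems on the right-hand side. From the definition of $E_{\varepsilon}$ we then obtain that $E_{0}(\mathbf{1}) = \lim_{\varepsilon \to 0} E_{\varepsilon}(\phi_{\varepsilon})$, and thus it is left to verify that $\phi_{\varepsilon} \to \mathbf{1}$ as $\varepsilon \to 0$. This follows from $\|\phi_{\varepsilon} - \mathbf{1}\|_{\zeta_{\varepsilon}} < \varepsilon$; see (28) for details.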
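The discontinuity in Example 4.3, which motivates the $\Gamma$-convergence argument above, can be observed numerically. The sketch below is an illustration only; it assumes the formulas $\mathcal{R}_{\zeta}(\phi) = \sum_{z,z'} M(z,z')\zeta(z)\phi(z)\,\Phi(\phi(z')/\phi(z))$ with $\Phi(x) = x - 1 - \log x$ and $\mathcal{H}_{\zeta}(\phi) = \sum_{z} \zeta(z)\Psi(\phi(z))$ with $\Psi(x) = x\log x - x + 1$, and the function names are hypothetical. Approaching $\phi = \mathbf{1}$ along the two directions $f^{1}$ and $f^{2}$ gives two different limiting values of the ratio, each equal to $2\mathcal{E}_{\zeta}(f^{k})/\operatorname{var}_{\zeta}(f^{k})$.

```python
import numpy as np

def fisher_information(zeta, M, phi):
    """Generalised Fisher information (assumed form, see the lead-in)."""
    ratio = phi[None, :] / phi[:, None]          # ratio[z, z'] = phi(z') / phi(z)
    Phi = ratio - 1.0 - np.log(ratio)            # vanishes on the diagonal
    return float(np.sum(M * zeta[:, None] * phi[:, None] * Phi))

def relative_entropy(zeta, phi):
    """Relative entropy of the density phi with respect to zeta (assumed form)."""
    return float(np.sum(zeta * (phi * np.log(phi) - phi + 1.0)))

def ratio_near_one(zeta, M, f, delta):
    """Ratio R/H evaluated at phi = 1 + delta * f."""
    phi = 1.0 + delta * f
    return fisher_information(zeta, M, phi) / relative_entropy(zeta, phi)

if __name__ == "__main__":
    # Data of Example 4.3: cyclic chain on three states.
    M = np.array([[-1.0, 1.0, 0.0],
                  [0.0, -1.0, 1.0],
                  [1.0, 0.0, -1.0]])
    zeta = np.array([0.5, 0.25, 0.25])
    f1 = np.array([0.0, 1.0, -1.0])
    f2 = np.array([1.0, -2.0, 0.0])
    for delta in (1e-1, 1e-2, 1e-3):
        # The two columns settle on different values as delta decreases.
        print(delta, ratio_near_one(zeta, M, f1, delta), ratio_near_one(zeta, M, f2, delta))
```

Since the ratio has direction-dependent limits at $\phi = \mathbf{1}$, no single value of $E_{0}(\mathbf{1})$ makes it continuous; the choice $2\alpha_{\mathrm{gPI}}(\zeta)$ is the smallest of these limits, which is what lower semicontinuity requires.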
The following remark discusses the additional continuity of the gPI and gLSI constants with the generator matrix as an additional variable.

Remark 4.5. So far, when dealing with generalised functional inequalities, we have suppressed the explicit dependence on the generator in the analysis by keeping it fixed. By applying similar arguments as in the proofs of Proposition 3.5 and Theorem 4.4, it can be shown that the maps $(\zeta, M) \mapsto \alpha_{\mathrm{gPI}}(\zeta, M), \alpha_{\mathrm{gLSI}}(\zeta, M)$ are continuous over $\mathcal{P}_{+}(\mathcal{Z}) \times \mathcal{M}(\mathcal{Z})$, where $\mathcal{M}(\mathcal{Z})$ is the space of irreducible generators on $\mathcal{Z}$, which is a convex cone in $\mathbb{R}^{|\mathcal{Z}| \times |\mathcal{Z}|}$. To see this it is sufficient to note that $\mathcal{E}_{\zeta, M}(f)$ and $\mathcal{R}_{\zeta, M}(\phi)$ depend linearly on $M$.

4.3 Lower bounds on $\alpha_{\mathrm{gPI}}(\zeta)$ and $\alpha_{\mathrm{gLSI}}(\zeta)$

In the classical setting where the steady state $\pi$ is the reference measure, [SC97] provides implicit lower bounds on the PI and LSI constants. More precisely, [SC97, Section 3.2] establishes various implicit lower bounds on $\alpha_{\mathrm{PI}}$ and [SC97, Theorem 2.2.3] provides an implicit lower bound for the constant appearing in the standard LSI (13). This also provides a lower bound for the constant appearing in the classical LSI (9), see [BT06, Proposition 3.6] for the reversible case and [Mic18, Section 1] for the general case. In this section, we present explicit lower bounds on the generalised PI and LSI constants for any reference measure $\zeta$.

In Theorem 4.7 below, we establish the positivity of the gLSI constant, which, in turn, by Proposition 4.1, implies the positivity of the gPI constant as well. The proof of this main result requires the following lemma, which collects several properties of a symmetrised generator and makes use of the standard LSI (recall (13)), which involves the centred entropy (recall $\operatorname{Ent}_{\pi}(\cdot)$ introduced in Remark 2.5).
Given $\zeta \in \mathcal{P}_{+}(\mathcal{Z})$ we define the $\zeta$-symmetrised generator $M_{\zeta} \in \mathbb{R}^{|\mathcal{Z}| \times |\mathcal{Z}|}$ as
\[
\forall z' \neq z: \quad M_{\zeta}(z, z') := \frac{1}{2} \left[ M(z, z') + \frac{\zeta(z')}{\zeta(z)} M(z', z) \right]; \qquad M_{\zeta}(z, z) := - \sum_{\substack{z' \in \mathcal{Z} \\ z' \neq z}} M_{\zeta}(z, z'). \tag{36}
\]
Note that the second term in the definition of $M_{\zeta}$ is precisely the adjoint operator to $M$ in $L^{2}(\zeta)$, which clarifies that this is a symmetrised generator. We also introduce the Dirichlet form using the generator $M_{\zeta}$ as
\[
\mathcal{E}_{\zeta}(f, M_{\zeta}) = \frac{1}{2} \sum_{z,z' \in \mathcal{Z}} M_{\zeta}(z, z')\, \zeta(z) \big( f(z) - f(z') \big)^{2} = \mathcal{E}_{\zeta}(f, M). \tag{37}
\]
The following lemma then establishes important properties of $M_{\zeta}$ and provides estimates relating its spectral gap to the standard LSI constant of $M_{\zeta}$ introduced in Remark 2.6.

Lemma 4.6. Let $M$ be a generator of an irreducible Markov chain on a finite state space $\mathcal{Z}$ and $\zeta \in \mathcal{P}_{+}(\mathcal{Z})$. The symmetrised generator $M_{\zeta}$ (36) is irreducible and is reversible with respect to its steady state $\zeta$. Consequently, $\zeta$ satisfies the PI and the standard LSI, i.e.,
\[
\forall f \in \mathbb{R}^{|\mathcal{Z}|}_{>0}: \quad \operatorname{var}_{\zeta}(f) \le \frac{1}{\lambda_{\zeta}} \mathcal{E}_{\zeta}(f, M_{\zeta}), \qquad \operatorname{Ent}_{\zeta}(f^{2}) \le \frac{1}{\alpha_{\mathrm{sLSI}}(M_{\zeta})} \mathcal{E}_{\zeta}(f, M_{\zeta}), \tag{38}
\]
where $\operatorname{Ent}_{\zeta}(\cdot)$ is the centred entropy (11), $\mathcal{E}_{\zeta}(\cdot, M_{\zeta})$ is the Dirichlet form using the generator $M_{\zeta}$, and $\lambda_{\zeta} = \alpha_{\mathrm{PI}}(M_{\zeta}) = \alpha_{\mathrm{gPI}}(\zeta, M) > 0$ is the spectral gap for the generator $-M_{\zeta}$. Furthermore, we have the relation
\[
\frac{\lambda_{\zeta}}{2 + \log \frac{1}{\zeta_{*}}} \le \alpha_{\mathrm{sLSI}}(M_{\zeta}) \le 2\lambda_{\zeta}, \tag{39}
\]
where $\zeta_{*} = \min_{z} \zeta(z)$.

Proof. The irreducibility of $M_{\zeta}$ follows from the irreducibility of $M$. Using the characterisation
\[
(M_{\zeta} f)(z) = \sum_{\substack{z' \in \mathcal{Z} \\ z' \neq z}} M_{\zeta}(z, z') \big( f(z') - f(z) \big), \qquad \forall f \in \mathbb{R}^{|\mathcal{Z}|},
\]
it follows that $(M_{\zeta})^{T} \zeta = 0$ (i.e. $\zeta$ is the steady state), since for any $f \in \mathbb{R}^{|\mathcal{Z}|}$ we have
\[
(M_{\zeta} f, \mathbf{1})_{\zeta} = \frac{1}{2} \sum_{z,z' \in \mathcal{Z}} \big[ M(z, z')\zeta(z) + \zeta(z') M(z', z) \big] \big( f(z') - f(z) \big) = 0.
\]
The final equality follows by exchanging the indices in the second sum.
Finally, the reversibility with respect to $\zeta$ follows since for $z \neq z'$
\[
\zeta(z) M_{\zeta}(z, z') = \zeta(z') M_{\zeta}(z', z).
\]
The standard LSI (38) follows from classical arguments for reversible Markov chains (see for instance [DSC96, Section 3.1]), with $\alpha_{\mathrm{sLSI}}(M_{\zeta}) > 0$ presented in [SC97, Theorem 2.2.3]. The connection between the PI constant and the spectral gap is standard for reversible Markov chains [BT06].

Now we prove the bounds on $\alpha_{\mathrm{sLSI}}(M_{\zeta})$ in (39). The upper bound is standard [DSC96, Lemma 3.1] and follows exactly as in the proof of Proposition 4.1. The lower bound requires the Rothaus lemma [BGL14, Lemma 5.1.4], which states that for any $g \in \mathbb{R}^{|\mathcal{Z}|}$ and $a \in \mathbb{R}$ we have
\[
\operatorname{Ent}_{\zeta}\big( (g + a)^{2} \big) \le \operatorname{Ent}_{\zeta}\big( g^{2} \big) + 2\, \mathbb{E}_{\zeta}(g^{2}).
\]
Using $g = f - \mathbb{E}_{\zeta}(f)$ and $a = \mathbb{E}_{\zeta}(f)$, the Rothaus lemma becomes
\[
\operatorname{Ent}_{\zeta}(f^{2}) \le \operatorname{Ent}_{\zeta}\big( (f - a)^{2} \big) + 2 \operatorname{var}_{\zeta}(f). \tag{40}
\]
Setting $m := \mathbb{E}_{\zeta}((f - a)^{2}) = \operatorname{var}_{\zeta}(f)$ we find
\[
\operatorname{Ent}_{\zeta}\big( (f - a)^{2} \big) = \mathbb{E}_{\zeta}\left[ (f - a)^{2} \log \frac{(f - a)^{2}}{m} \right] \le m \log \frac{\|f - a\|_{\infty}^{2}}{m}.
\]
With $\zeta_{*} = \min_{z} \zeta(z)$ and
\[
\|f - a\|_{\infty}^{2} \le \|f - a\|^{2} \le \frac{1}{\zeta_{*}} \|f - a\|_{\zeta}^{2} = \frac{m}{\zeta_{*}},
\]
where the norms are defined in Section 3.1, we arrive at
\[
\operatorname{Ent}_{\zeta}\big( (f - a)^{2} \big) \le m \log \frac{1}{\zeta_{*}} = \operatorname{var}_{\zeta}(f) \log \frac{1}{\zeta_{*}}.
\]
Substituting into (40) we find
\[
\operatorname{Ent}_{\zeta}(f^{2}) \le \left( 2 + \log \frac{1}{\zeta_{*}} \right) \operatorname{var}_{\zeta}(f) \le \left( 2 + \log \frac{1}{\zeta_{*}} \right) \frac{1}{\lambda_{\zeta}} \mathcal{E}_{\zeta}(f, M_{\zeta}),
\]
where the second inequality follows from the Poincaré inequality for $\zeta$ with respect to $M_{\zeta}$. This concludes our lower bound for $\alpha_{\mathrm{sLSI}}(M_{\zeta})$.

Using the previous lemma, we now establish the main theorem of this section, which provides lower bounds on the generalised Poincaré and log-Sobolev constants.

Theorem 4.7. Let $M$ be a generator of an irreducible Markov chain on a finite state space $\mathcal{Z}$ and $\zeta \in \mathcal{P}_{+}(\mathcal{Z})$. Let $\zeta_{*} = \min_{z} \zeta(z)$ and $M_{*} = \min\{ M(z, z') : z, z' \in \mathcal{Z},\ M(z, z') > 0 \}$ be the smallest positive rate in $M$.
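(1) We have the lower bound
\[
\alpha_{\mathrm{gPI}}(\zeta) \ge \frac{\zeta_{*} M_{*}}{|\mathcal{Z}| - 1}. \tag{41}
\]
(2) Using $\lambda_{\zeta}$ for the spectral gap of $-M_{\zeta}$ and $\alpha_{\mathrm{sLSI}}(M_{\zeta})$ for the standard LSI constant (defined in Lemma 4.6) we have
\[
\alpha_{\mathrm{gLSI}}(\zeta) \ge 2\alpha_{\mathrm{sLSI}}(M_{\zeta}) \ge \frac{\lambda_{\zeta}}{1 + \frac{1}{2} \log \frac{1}{\zeta_{*}}} = \frac{\alpha_{\mathrm{gPI}}(\zeta)}{1 + \frac{1}{2} \log \frac{1}{\zeta_{*}}} \ge \frac{\zeta_{*} M_{*}}{|\mathcal{Z}| - 1} \cdot \frac{1}{1 + \frac{1}{2} \log \frac{1}{\zeta_{*}}}. \tag{42}
\]

Proof. We start with the proof of (41). Consider the definition of $\alpha_{\mathrm{gPI}}(\zeta)$ in (23), and take a corresponding function $f \in \mathbb{R}^{|\mathcal{Z}|}$ with $f \notin \langle \mathbf{1} \rangle$. Let $\overline{f} = \max_{z \in \mathcal{Z}} f(z)$ and $\underline{f} = \min_{z \in \mathcal{Z}} f(z)$, with some corresponding maximiser $\overline{z} \in \mathcal{Z}$ and minimiser $\underline{z} \in \mathcal{Z}$. Since $M$ is irreducible there exists a path $(z_{i})_{i=0}^{\ell} \subset \mathcal{Z}$ of length $\ell + 1 \le |\mathcal{Z}|$ with $z_{0} = \underline{z}$ and $z_{\ell} = \overline{z}$ such that $M(z_{i}, z_{i-1}) > 0$ for all $i = 1, \ldots, \ell$. Setting $\kappa_{i} := f(z_{i}) - f(z_{i-1})$, we get $\overline{f} - \underline{f} = f(z_{\ell}) - f(z_{0}) = \sum_{i=1}^{\ell} \kappa_{i}$. Hence, by the Cauchy–Schwarz inequality,
\[
(\overline{f} - \underline{f})^{2} = \left( \sum_{i=1}^{\ell} \kappa_{i} \right)^{2} \le \ell \sum_{i=1}^{\ell} \kappa_{i}^{2}.
\]
Using this we obtain
\[
2\mathcal{E}_{\zeta}(f) \ge \sum_{i=1}^{\ell} M(z_{i}, z_{i-1})\, \zeta(z_{i}) \big( f(z_{i}) - f(z_{i-1}) \big)^{2} \ge M_{*} \zeta_{*} \frac{(\overline{f} - \underline{f})^{2}}{\ell} \ge \frac{1}{|\mathcal{Z}| - 1} M_{*} \zeta_{*} (\overline{f} - \underline{f})^{2}.
\]
In addition, we bound the denominator in (23) as
\[
\operatorname{var}_{\zeta}(f) = \frac{1}{2} \sum_{z,z' \in \mathcal{Z}} \big( f(z) - f(z') \big)^{2} \zeta(z) \zeta(z') \le \frac{1}{2} (\overline{f} - \underline{f})^{2}.
\]
Dividing the two bounds gives $\mathcal{E}_{\zeta}(f)/\operatorname{var}_{\zeta}(f) \ge \zeta_{*} M_{*}/(|\mathcal{Z}| - 1)$, and (41) follows directly.

Regarding (42), the second inequality is (39), the last inequality is a direct application of (41) and the equality follows from $\alpha_{\mathrm{gPI}}(\zeta) = \alpha_{\mathrm{PI}}(M_{\zeta}) = \lambda_{\zeta}$. The first inequality requires more work. To prove it, let $\phi$ be as in (23), i.e. $\phi \in \mathcal{D}_{\zeta}(\mathcal{Z})$ and $\phi \neq \mathbf{1}$. We will bound $\mathcal{R}_{\zeta}(\phi)$ from below and $\mathcal{H}_{\zeta}(\phi)$ from above. Using the inequality $r - 1 - \log r \ge (\sqrt{r} - 1)^{2}$ for $r > 0$ we find
\[
\mathcal{R}_{\zeta}(\phi) \ge \sum_{z,z' \in \mathcal{Z}} M(z, z')\, \zeta(z) \Big( \sqrt{\phi(z')} - \sqrt{\phi(z)} \Big)^{2} = 2\mathcal{E}_{\zeta}\big( \sqrt{\phi}, M \big) = 2\mathcal{E}_{\zeta}\big( \sqrt{\phi}, M_{\zeta} \big),
\]
where the Dirichlet form $\mathcal{E}_{\zeta}(\cdot, M_{\zeta})$ is defined in (37).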
Then, using the (alternative form of the) classical LSI (38), for any density $\phi \in \mathcal{D}_{\zeta}(\mathcal{Z})$ (recall (6)) we find
\[
\mathcal{H}_{\zeta}(\phi) = \operatorname{Ent}_{\zeta}(\phi) \le \frac{1}{\alpha_{\mathrm{sLSI}}(M_{\zeta})} \mathcal{E}_{\zeta}\big( \sqrt{\phi}, M_{\zeta} \big) \le \frac{1}{2\alpha_{\mathrm{sLSI}}(M_{\zeta})} \mathcal{R}_{\zeta}(\phi),
\]
where $\operatorname{Ent}_{\zeta}$ is the centred entropy (11). Therefore, we have $\alpha_{\mathrm{gLSI}}(\zeta) \ge 2\alpha_{\mathrm{sLSI}}(M_{\zeta})$. The second and final inequalities in (42) follow from Lemma 4.6 and (41) respectively.

If one has more information on $M$, then sharper bounds than (41) are available; see e.g. [SC97, Section 3.2]. Furthermore, we also have the following straightforward upper bound for the gLSI constant:
\[
\alpha_{\mathrm{gLSI}}(\zeta) \le 2\alpha_{\mathrm{gPI}}(\zeta) = 2\alpha_{\mathrm{PI}}(\zeta, M_{\zeta}) = 2\lambda_{\zeta}. \tag{43}
\]
We do not expect that the explicit lower bound (42) is sharp. Yet, the following remark demonstrates that the lower bound has to be at least linear in $\zeta_{*}$.

Remark 4.8. Here we demonstrate by means of an example that, at least for certain $M$, the lower bounds on $\alpha_{\mathrm{gPI}}(\zeta)$ and $\alpha_{\mathrm{gLSI}}(\zeta)$ have to be at least linear in $\zeta_{*}$, i.e. $\alpha_{\mathrm{gPI}}(\zeta) + \alpha_{\mathrm{gLSI}}(\zeta) \le C\zeta_{*}$. Similar to Example 4.3, but now for a larger state space, let $\mathcal{Z} = \mathbb{Z}/(2N\mathbb{Z}) \cong \{0, 1, \ldots, 2N - 1\}$ with large $N$, and let $M$ be the cyclic chain given by $M_{i,i+1} = 1$, $M_{ii} = -1$ and $0$ otherwise. Let
\[
\zeta_{i} = \begin{cases} \frac{\varepsilon}{N} & i = N - 1,\ 2N - 1, \\ \frac{1}{N} & \text{otherwise,} \end{cases}
\qquad
f_{i} = \begin{cases} 1 & i < N, \\ -1 & i \ge N, \end{cases}
\qquad
\phi_{i} = \begin{cases} \frac{1}{2} & i < N, \\ \frac{3}{2} & i \ge N. \end{cases}
\]
Note that $\mathbb{E}_{\zeta}[f] = 0$. Except for errors of size $O(\varepsilon, \frac{1}{N})$, $\zeta$ and $\phi$ are normalised. These errors do not play a significant role in the following computation, and thus we neglect them. Next we estimate $\alpha_{\mathrm{gPI}}(\zeta)$ and $\alpha_{\mathrm{gLSI}}(\zeta)$. First, since $\mathbb{E}_{\zeta}[f] = 0$,
\[
\operatorname{var}_{\zeta}(f) = \sum_{i} \zeta_{i} f_{i}^{2} = 1 + O\left( \frac{1}{N} \right).
\]
Similarly, with $\Psi(x) = x \log x - x + 1 \ge 0$, which vanishes only at $x = 1$,
\[
\mathcal{H}_{\zeta}(\phi) = \sum_{i} \zeta_{i} \Psi(\phi_{i}) = \frac{1}{2} \Psi\left( \frac{1}{2} \right) + \frac{1}{2} \Psi\left( \frac{3}{2} \right) + O\left( \varepsilon, \frac{1}{N} \right),
\]
which is also close to a universal, positive constant.
Second,
\[
\mathcal{E}_{\zeta}(f) = \frac{1}{2} \sum_{i \neq j} \zeta_{i} M_{ij} (f_{i} - f_{j})^{2} = \frac{1}{2} \sum_{i=0}^{2N-1} \zeta_{i} M_{i,i+1} (f_{i} - f_{i+1})^{2} = 2\zeta_{N-1} + 2\zeta_{2N-1} = \frac{4\varepsilon}{N} = 4\zeta_{*}.
\]
Similarly, with $\Phi(x) = x - 1 - \log x$, which vanishes only at $x = 1$,
\[
\begin{aligned}
\mathcal{R}_{\zeta}(\phi) &= \sum_{i \neq j} \zeta_{i} \phi_{i} M_{ij}\, \Phi\left( \frac{\phi_{j}}{\phi_{i}} \right) = \sum_{i=0}^{2N-1} \zeta_{i} \phi_{i} M_{i,i+1}\, \Phi\left( \frac{\phi_{i+1}}{\phi_{i}} \right) \\
&= \zeta_{N-1} \phi_{N-1} \Phi(3) + \zeta_{2N-1} \phi_{2N-1} \Phi\left( \frac{1}{3} \right) = \frac{\varepsilon}{2N} \left[ 3\Phi\left( \frac{1}{3} \right) + \Phi(3) \right] = C\zeta_{*}.
\end{aligned}
\]
Thus, for $\varepsilon, \frac{1}{N}$ small enough, we obtain $\alpha_{\mathrm{gPI}}(\zeta) \le C\zeta_{*}$ and $\alpha_{\mathrm{gLSI}}(\zeta) \le C'\zeta_{*}$ for some universal constants $C, C' > 0$.

5 Applications

We now present two applications of our generalised functional inequalities. The first one deals with decay estimates for the distance between two solutions to the forward Kolmogorov equation (1) in variance and relative entropy, see Section 5.1. The second one concerns quantitative error estimates in coarse-graining, which improve the results in [HS24]. Here, we present results in the case of general generators, see Section 5.2.1, and a generator with explicit scale separation, see Section 5.2.2. The first result also provides a recipe to evaluate the quality of the chosen coarse-graining map, which is presented in Section 5.2.3.

5.1 Comparing two solutions to the forward Kolmogorov equation

As discussed in Section 3.2, the generalised Dirichlet form and Fisher information are the dissipation of the generalised variance and entropy, respectively, along solutions of the forward Kolmogorov equation (1). In the classical case, this relation allows one to estimate the rate of convergence to equilibrium in variance and entropy, respectively. Using the generalised Poincaré inequality and generalised log-Sobolev inequality, we can extend this to measure the convergence in time of one solution of (1) to another.
We point out that such results comparing two different solutions are not entirely new in the literature: estimates in $L^{2}(\pi)$ can be derived using the spectral gap of the symmetrised generator with respect to $\pi$ by a straightforward time-derivative argument; estimates in TV-norm can be derived by using Doeblin's minorisation condition, which always holds for irreducible Markov chains, and following the arguments in [MT09, Theorems 16.2.3, 16.2.4]; and estimates in Wasserstein distance can be derived under a geometric condition [LP17, Theorem 14.6]. While all these approaches are used to compare the time-dependent solution to the steady state, the proof techniques generalise to the setting of this section, which compares two different time-dependent solutions. Nonetheless, we discuss our estimate since it uses relative entropy, which is not used in the literature, and the constants appearing in the estimates below are different from those appearing when using the techniques stated above.

Consider two time-dependent solutions $t \mapsto \zeta_{t}, \mu_{t}$ to the forward Kolmogorov equation (1) with two different initial conditions $\mu_{0}, \zeta_{0} \in \mathcal{P}(\mathcal{Z})$. We recall from (18) that
\[
\frac{\mathrm{d}}{\mathrm{d}t} \operatorname{var}_{\zeta_{t}}\left( \frac{\mu_{t}}{\zeta_{t}} \right) = -2\mathcal{E}_{\zeta_{t}}\left( \frac{\mu_{t}}{\zeta_{t}} \right), \qquad \frac{\mathrm{d}}{\mathrm{d}t} \mathcal{H}_{\zeta_{t}}\left( \frac{\mu_{t}}{\zeta_{t}} \right) = -\mathcal{R}_{\zeta_{t}}\left( \frac{\mu_{t}}{\zeta_{t}} \right).
\]
Therefore, using the generalised functional inequalities (21), (22) we arrive at
\[
\frac{\mathrm{d}}{\mathrm{d}t} \operatorname{var}_{\zeta_{t}}\left( \frac{\mu_{t}}{\zeta_{t}} \right) = -2\mathcal{E}_{\zeta_{t}}\left( \frac{\mu_{t}}{\zeta_{t}} \right) \le -2\alpha_{\mathrm{gPI}}(\zeta_{t}) \operatorname{var}_{\zeta_{t}}\left( \frac{\mu_{t}}{\zeta_{t}} \right),
\]
\[
\frac{\mathrm{d}}{\mathrm{d}t} \mathcal{H}_{\zeta_{t}}\left( \frac{\mu_{t}}{\zeta_{t}} \right) = -\mathcal{R}_{\zeta_{t}}\left( \frac{\mu_{t}}{\zeta_{t}} \right) \le -\alpha_{\mathrm{gLSI}}(\zeta_{t})\, \mathcal{H}_{\zeta_{t}}\left( \frac{\mu_{t}}{\zeta_{t}} \right).
\]
Note that this setup allows for considerably more flexibility compared to the classical setup since it allows us to prove concentration estimates for time-dependent solutions $\zeta_{t}$ rather than the steady state $\pi$ only.
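The monotone decay of the relative entropy between two solutions can be illustrated with a small simulation. This is a rough sketch under stated assumptions: the forward Kolmogorov equation (1) is taken to be $\partial_{t}\mu = M^{T}\mu$, the time stepping is explicit Euler (a swapped-in discretisation, not the continuous-time flow itself), and the function names are hypothetical. For a step size $h$ with $h \max_{z}|M(z,z)| \le 1$ the matrix $I + hM$ is row-stochastic, so the relative entropy between the two discrete trajectories is non-increasing at every step.

```python
import numpy as np

def relative_entropy(mu, zeta):
    """H_zeta(mu / zeta) = sum_z mu(z) * log(mu(z) / zeta(z)) for probability vectors."""
    return float(np.sum(mu * np.log(mu / zeta)))

def evolve(mu0, M, t_final, n_steps):
    """Explicit Euler steps for d/dt mu = M^T mu (assumed form of equation (1)).

    Returns an array of shape (n_steps + 1, len(mu0)) with the trajectory.
    """
    h = t_final / n_steps
    P = np.eye(len(mu0)) + h * M       # row-stochastic if h * max|M[z, z]| <= 1
    traj = [np.asarray(mu0, dtype=float)]
    for _ in range(n_steps):
        traj.append(traj[-1] @ P)      # row-vector update: mu <- mu (I + h M)
    return np.array(traj)

if __name__ == "__main__":
    # Cyclic chain on three states, two different strictly positive initial data.
    M = np.array([[-1.0, 1.0, 0.0],
                  [0.0, -1.0, 1.0],
                  [1.0, 0.0, -1.0]])
    zeta = evolve(np.array([0.5, 0.25, 0.25]), M, t_final=5.0, n_steps=5000)
    mu = evolve(np.array([0.1, 0.2, 0.7]), M, t_final=5.0, n_steps=5000)
    H = [relative_entropy(m, z) for m, z in zip(mu, zeta)]
    for k in (0, 1000, 2000, 5000):
        print(k * 5.0 / 5000, H[k])    # relative entropy at selected times
```

Comparing the observed decay rate of `H` with the time integral of $\alpha_{\mathrm{gLSI}}(\zeta_{s})$ would require computing the gLSI constant along the trajectory, which this sketch does not attempt.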
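We now apply Gronwall's lemma to obtain
\[
\operatorname{var}_{\zeta_{t}}\left( \frac{\mu_{t}}{\zeta_{t}} \right) \le \operatorname{var}_{\zeta_{0}}\left( \frac{\mu_{0}}{\zeta_{0}} \right) \exp\left( -\int_{0}^{t} 2\alpha_{\mathrm{gPI}}(\zeta_{s})\, \mathrm{d}s \right), \qquad \mathcal{H}_{\zeta_{t}}\left( \frac{\mu_{t}}{\zeta_{t}} \right) \le \mathcal{H}_{\zeta_{0}}\left( \frac{\mu_{0}}{\zeta_{0}} \right) \exp\left( -\int_{0}^{t} \alpha_{\mathrm{gLSI}}(\zeta_{s})\, \mathrm{d}s \right). \tag{44}
\]
As the convergence rate is encoded in a time integral due to the time-dependence of $\zeta_{t}$, these estimates are not directly usable by themselves. However, combining the lower bound on $\alpha_{\mathrm{gLSI}}$ in Theorem 4.7 with $\alpha_{\mathrm{gLSI}} \le 2\alpha_{\mathrm{gPI}}$ stated in Proposition 4.1, we find that
\[
2\alpha_{\mathrm{gPI}}(\zeta_{t}) \ge \alpha_{\mathrm{gLSI}}(\zeta_{t}) \ge \frac{(\zeta_{t})_{*} M_{*}}{|\mathcal{Z}| - 1} \cdot \frac{1}{1 + \frac{1}{2} \log \frac{1}{(\zeta_{t})_{*}}}, \tag{45}
\]
where we recall that $(\zeta_{t})_{*}$ is the minimal value of $\zeta_{t}$ and $M_{*}$ is the smallest positive entry of $M$. Using the lower bounds on solutions to forward Kolmogorov equations provided by [HS24, Proposition A.1], we obtain the next result, which compares two solutions to the forward Kolmogorov equation.

Proposition 5.1. Let $M$ be an irreducible generator. Then, for any $\delta \in (0, 1)$ there exists an $\alpha_{*} > 0$ such that for all initial data $\zeta_{0}, \mu_{0} \in \mathcal{P}(\mathcal{Z})$ and corresponding solutions $\zeta_{t}, \mu_{t}$ of the forward Kolmogorov equation (48), it holds that
\[
0 < \alpha_{*} \le \alpha_{\mathrm{gLSI}}(\zeta_{t}) \le 2\alpha_{\mathrm{gPI}}(\zeta_{t}) \tag{46}
\]
for all $t \ge \delta$ and
\[
\operatorname{var}_{\zeta_{t}}\left( \frac{\mu_{t}}{\zeta_{t}} \right) \le \operatorname{var}_{\zeta_{0}}\left( \frac{\mu_{0}}{\zeta_{0}} \right) \min\big\{ 1, e^{-\alpha_{*}(t - \delta)} \big\}, \qquad \mathcal{H}_{\zeta_{t}}\left( \frac{\mu_{t}}{\zeta_{t}} \right) \le \mathcal{H}_{\zeta_{0}}\left( \frac{\mu_{0}}{\zeta_{0}} \right) \min\big\{ 1, e^{-\alpha_{*}(t - \delta)} \big\} \tag{47}
\]
for all $t \ge 0$.

Proof. We start with the lower bound on $\alpha_{\mathrm{gLSI}}(\zeta_{t})$. For this, we note that since $\zeta_{t}$ converges to $\pi$ as $t \to \infty$ at an exponential rate, there exists a $\tau > 0$ such that $(\zeta_{t})_{*} \ge \frac{\pi_{*}}{2} > 0$ for all $t > \tau$. In fact, such a $\tau$ can be chosen independent of $\zeta_{0}$. Next, we recall from [HS24, Proposition A.1] that there exist constants $c(\delta) > 0$, $c(\delta, \tau) > 0$ and $N \in \mathbb{N}$ such that
\[
\zeta_{t} \ge \begin{cases} c(\delta)\, t^{N}, & \text{if } t \in [0, \delta), \\ c(\delta, \tau), & \text{if } t \in [\delta, \tau]. \end{cases}
\]
Together with (45), this shows that there exists an $\alpha_{*} > 0$ such that (46) holds.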
For the second part of the proposition, we bound the exponentials in (44). We note that it is sufficient to provide an estimate for the integral containing $\alpha_{\mathrm{gLSI}}$ since this implies the same estimate for $\alpha_{\mathrm{gPI}}$ due to Proposition 4.1. First, we use that $\alpha_{\mathrm{gLSI}}(\zeta) \ge 0$ and thus
\[
\mathcal{H}_{\zeta_{t}}\left( \frac{\mu_{t}}{\zeta_{t}} \right) \le \mathcal{H}_{\zeta_{0}}\left( \frac{\mu_{0}}{\zeta_{0}} \right)
\]
for all $t \ge 0$. For $t \ge \delta$, we can additionally estimate
\[
\int_{0}^{t} \alpha_{\mathrm{gLSI}}(\zeta_{s})\, \mathrm{d}s = \int_{0}^{\delta} \alpha_{\mathrm{gLSI}}(\zeta_{s})\, \mathrm{d}s + \int_{\delta}^{t} \alpha_{\mathrm{gLSI}}(\zeta_{s})\, \mathrm{d}s \ge \int_{0}^{\delta} \alpha_{\mathrm{gLSI}}(\zeta_{s})\, \mathrm{d}s + \alpha_{*}(t - \delta) \ge \alpha_{*}(t - \delta),
\]
where we again use that $\alpha_{\mathrm{gLSI}}(\zeta) \ge 0$. Plugging this into (44) then yields the estimate (47), which completes the proof.

Recall $\alpha_{\mathrm{PI}}$ and $\alpha_{\mathrm{LSI}}$ as the classical Poincaré and log-Sobolev constants. Then, by exponential convergence to the stationary measure and continuity of the generalised Poincaré and log-Sobolev constants, see Proposition 3.5 and Theorem 4.4, we find that $\alpha_{\mathrm{gPI}}(\zeta_{t}) \to \alpha_{\mathrm{PI}}$ and $\alpha_{\mathrm{gLSI}}(\zeta_{t}) \to \alpha_{\mathrm{LSI}}$ as $t \to \infty$. A direct adaptation of the proof of Proposition 5.1 then yields the following corollary, which relates the long-term decay to the classical Poincaré and log-Sobolev constants.

Corollary 5.2. Let $M, \zeta_{t}, \mu_{t}$ be as in Proposition 5.1. Then, there exists a $\tau > 0$ such that
\[
\operatorname{var}_{\zeta_{t}}\left( \frac{\mu_{t}}{\zeta_{t}} \right) \le \operatorname{var}_{\zeta_{0}}\left( \frac{\mu_{0}}{\zeta_{0}} \right) e^{-\frac{\alpha_{\mathrm{PI}}}{2} t}, \qquad \mathcal{H}_{\zeta_{t}}\left( \frac{\mu_{t}}{\zeta_{t}} \right) \le \mathcal{H}_{\zeta_{0}}\left( \frac{\mu_{0}}{\zeta_{0}} \right) e^{-\frac{\alpha_{\mathrm{LSI}}}{2} t}
\]
for all $t \ge \tau$.

5.2 Clustering in Markov chains

We now turn to the second application of the generalised functional inequalities, which is clustering in Markov chains. Molecular systems and chemical kinetics are often modelled at a microscopic level by diffusions over large-dimensional complex energy landscapes [AT17, Tuc23]. A typical situation is described in Figure 1: the dynamics rapidly relaxes within the basin of attraction of a local minimum, while transition between neighbouring minima requires the crossing of energy barriers and therefore occurs on longer timescales.
Each such basin can be identified as a micro-state, and ignoring the fast equilibration within each basin, the slow dynamics can be approximated by a Markov chain on the set of micro-states with transition rates reflecting typical escape times between basins. This forms the basis of Markov state models typically used to study conformational dynamics [PWS+11].

Figure 1: Energy landscape with two macro-states and six micro-states.

In some situations the energy landscape exhibits barriers of different heights, as in Figure 1, wherein micro-states connected by smaller barriers equilibrate rapidly compared to transitions across large barriers. This leads to groups of micro-states, lumped together into so-called macro-states, that exhibit quick internal equilibration within the group compared to slow transitions across different macro-states. On sufficiently long time scales, the original Markov chain dynamics can be described by a reduced Markov chain on the macro-state space. This reduction is justified by the presence of pronounced disparities in the transition rates (or barrier heights) and underlies multiscale simulation and clustering/lumping approaches to Markov chains [CGP05, ELVE05, SFHD99]. Mathematically, these disparate transition rates correspond to explicit scale separation in the system, and a considerable mathematical literature has been devoted to limit passage and averaging in multiscale jump processes, see [PS08, Lah13, Zha16, HPST20, MS20, PR23, LMS25] for a non-exhaustive list. While various reduction (or coarse-graining) approaches exist in the presence of pronounced rate disparities, coarse-graining to lower-dimensional Markov chains remains essential even when such disparities are not sharp, typically due to large system sizes.
From this viewpoint, clustering even without explicit scale separation can be interpreted as constructing reduced Markov chains that approximate the slower components of the original dynamics. These ideas have inspired various spectral and variational approaches to constructing clusters [DW05, KW07]. While these techniques provide useful practical tools to cluster states together, they do not provide any error estimates to ascertain the quality of the resulting reduction.

Inspired by coarse-graining for diffusion processes [LL10, Cho03], the first and third authors recently provided a systematic construction of a reduced Markov chain given any clustering. Furthermore, this work provides an error estimate comparing the reduced chain to the original chain under the assumption of a log-Sobolev inequality [HS24, Theorem 3.1]. In that work, the LSI appeared as a technical assumption, and its interpretation was not transparent since the underlying dynamics was non-reversible and the reference measure which satisfied the LSI was not the steady state of the underlying generator [HS24, Remark 3.5 & Section 6]. With the generalised framework introduced in this paper, we can remove the log-Sobolev assumption entirely and obtain improved results both in the presence and in the absence of explicit scale separation in the system; see Proposition 5.3 and Theorem 5.6 below.

In the following, we consider coarse-grained Markov chains in two complementary regimes. Section 5.2.1 treats coarse-grained Markov chains in the absence of explicit scale separation, illustrating how the framework applies to large systems where reduction is required for computational tractability rather than asymptotic reasons. In contrast, Section 5.2.2 studies systems with explicit scale separation where the reduced dynamics admit a clear multiscale interpretation.
Moreover, in Section 5.2.3, we show how the generalised LSI and its associated lower bounds can be used to compare different clusterings, providing a quantitative notion of coarse-graining quality. This was already suggested in [HS24] in an example for a specific reversible Markov chain, where the LSI constant can be calculated. With the framework developed here, we can give a much more general approach for evaluating the quality of the coarse-graining map using the lower bound (42) and the upper bound (43). As we demonstrate in Section 5.2.3 for a non-reversible example, this provides guidance for selecting clusterings that lead to accurate reduced Markov chains (see the discussion in Section 6). In addition, with a quality score it becomes possible to try to obtain a good coarse-graining map from data by machine learning, see Section 6 for more details.

5.2.1 Clustering in Markov chains without explicit scale separation

We first discuss coarse-graining and error estimates for a general Markov chain without explicit scale separation. Before stating the result, we briefly recapitulate the setting in [HS24] to fix notation. We consider a continuous-time Markov chain on a finite state space $\mathcal{X}$ with an irreducible generator $M \in \mathbb{R}^{|\mathcal{X}| \times |\mathcal{X}|}$. Then, the corresponding forward Kolmogorov equation for the evolution of the probability density reads
$$
\begin{cases} \dfrac{\mathrm{d}}{\mathrm{d}t}\mu = M^T \mu, \\ \mu_{t=0} = \mu_0. \end{cases} \tag{48}
$$
Our goal is to define an appropriate reduced Markov chain on a smaller state space $\mathcal{Y}$, which is encoded via a so-called coarse-graining map $\xi : \mathcal{X} \to \mathcal{Y}$. We denote the level sets of $\xi$ by $\Lambda_y := \xi^{-1}(y) = \{x \in \mathcal{X} : \xi(x) = y\}$ for $y \in \mathcal{Y}$. The coarse-grained (or projected) dynamics on $\mathcal{Y}$ is exactly characterised by the push-forward of $\mu$ under the map $\xi$, i.e. $t \mapsto \hat{\mu}_t := \xi_{\#}\mu_t \in \mathcal{P}(\mathcal{Y})$ defined as
$$
\hat{\mu}_t(y) = \sum_{x \in \Lambda_y} \mu_t(x)
$$
for any $y \in \mathcal{Y}$.
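The push-forward $\xi_{\#}\mu$ defined above amounts to summing the mass of $\mu$ over each level set of $\xi$. The following minimal Python sketch illustrates this; the function name `push_forward`, the encoding of $\xi$ as a list of macro-state indices, and the six-state distribution are illustrative assumptions and not part of the paper.

```python
from fractions import Fraction


def push_forward(mu, xi, n_macro):
    """Return xi_# mu, i.e. mu_hat[y] = sum of mu[x] over all x with xi[x] == y."""
    mu_hat = [Fraction(0)] * n_macro
    for x, mass in enumerate(mu):
        mu_hat[xi[x]] += mass
    return mu_hat


# Hypothetical example: six micro-states lumped into two macro-states.
xi = [0, 0, 0, 1, 1, 1]                              # xi[x] is the macro-state of x
mu = [Fraction(k, 21) for k in (1, 2, 3, 4, 5, 6)]   # a probability vector on X
mu_hat = push_forward(mu, xi, n_macro=2)

# Mass is conserved under the push-forward.
assert sum(mu_hat) == 1
```

Exact rational arithmetic is used here only so that conservation of mass can be checked without floating-point tolerances.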
The evolution of the exact coarse-grained dynamics $\hat{\mu}_t$ is explicitly described by
$$
\begin{cases} \dfrac{\mathrm{d}}{\mathrm{d}t}\hat{\mu}_t = \hat{M}_t^T \hat{\mu}_t, \\ \hat{\mu}_{t=0}(y) = \displaystyle\sum_{x \in \Lambda_y} \mu_0(x), \end{cases}
\qquad \text{with} \qquad
\hat{M}_t(y_1,y_2) := \sum_{x_1 \in \Lambda_{y_1},\, x_2 \in \Lambda_{y_2}} M(x_1,x_2)\, \mu_t(x_1|y_1), \tag{49}
$$
see [HS24, Lemma 2.4] for a derivation. While the evolution of $\hat{\mu}_t$ is the exact dynamics induced by the full dynamics $\mu_t$ under the coarse-graining map $\xi$, it is impractical in applications since the generator is time-dependent and, in fact, requires knowledge of the full dynamics $\mu_t$.

Instead, motivated by recent developments in computational statistical mechanics [LL10, Cho03], an approximate effective dynamics $t \mapsto \eta_t \in \mathcal{P}(\mathcal{Y})$ is introduced in [HS24], which evolves according to
$$
\frac{\mathrm{d}}{\mathrm{d}t}\eta_t = N^T \eta_t, \qquad \text{with} \qquad N(y_1,y_2) := \sum_{x_1 \in \Lambda_{y_1},\, x_2 \in \Lambda_{y_2}} M(x_1,x_2)\, \rho(x_1|y_1), \tag{50}
$$
where $\rho(\cdot|y) \in \mathcal{P}(\Lambda_y)$ for $y \in \mathcal{Y}$ is the family of conditional measures corresponding to the full steady state $\rho \in \mathcal{P}(\mathcal{X})$ of $M$, i.e.
$$
\forall x \in \Lambda_y : \quad \rho(x|y) = \frac{\rho(x)}{\sum_{x' \in \Lambda_y} \rho(x')}.
$$
To construct the effective dynamics, one only needs to know the full-space steady state $\rho$ rather than the full-space dynamics. The generator $N$ of the effective dynamics (50) mimics the coarse-grained generator $\hat{M}_t$ in (49), with the crucial difference that the cluster dynamics (i.e. the dynamics on the level sets of $\xi$) has been equilibrated; we refer interested readers to [HS24] for details. The key question in the investigation in [HS24] is under what conditions this effective dynamics is a good approximation of the coarse-grained dynamics, and by extension, the full dynamics. For this setting, the first and third authors provided an error estimate, which measures the error between the coarse-grained dynamics $\hat{\mu}_t$ and the effective dynamics $\eta_t$ in relative entropy, that is, it provides a bound on $\mathcal{H}_{\eta_t}(\hat{\mu}_t/\eta_t)$, see [HS24, Theorem 3.1] and (53).
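To make the construction (50) concrete, the following Python sketch assembles the effective generator $N$ from a generator $M$, its steady state $\rho$ and a coarse-graining map $\xi$. It is only an illustration under stated assumptions: the function name `effective_generator` and the four-state non-reversible cycle (forward rate 2, backward rate 1, uniform steady state) are hypothetical and not taken from [HS24].

```python
from fractions import Fraction


def effective_generator(M, rho, xi, n_macro):
    """Assemble N(y1, y2) = sum_{x1 in L_y1, x2 in L_y2} M(x1, x2) * rho(x1 | y1).

    Also returns the push-forward of rho, i.e. the mass of each level set.
    """
    n = len(M)
    mass = [Fraction(0)] * n_macro
    for x in range(n):
        mass[xi[x]] += rho[x]
    N = [[Fraction(0)] * n_macro for _ in range(n_macro)]
    for x1 in range(n):
        cond = rho[x1] / mass[xi[x1]]          # conditional measure rho(x1 | y1)
        for x2 in range(n):
            N[xi[x1]][xi[x2]] += M[x1][x2] * cond
    return N, mass


# Hypothetical four-state cycle: forward rate 2, backward rate 1 (non-reversible).
M = [[-3, 2, 0, 1],
     [1, -3, 2, 0],
     [0, 1, -3, 2],
     [2, 0, 1, -3]]
rho = [Fraction(1, 4)] * 4     # uniform, since all column sums of M vanish
xi = [0, 0, 1, 1]              # clusters {0, 1} and {2, 3}

N, rho_hat = effective_generator(M, rho, xi, n_macro=2)

# N is again a generator, and the push-forward of rho is stationary for N.
assert all(sum(row) == 0 for row in N)
assert all(sum(rho_hat[y1] * N[y1][y2] for y1 in range(2)) == 0 for y2 in range(2))
```

In this toy example the two assertions check that the rows of $N$ sum to zero and that the push-forward $\xi_{\#}\rho$ is stationary for $N$.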
To formulate the error estimate of [HS24, Theorem 3.1], we define for any $y \in \mathcal{Y}$ the generator $M_y \in \mathbb{R}^{|\Lambda_y| \times |\Lambda_y|}$ as the restriction of $M$ to $\Lambda_y \times \Lambda_y$ with an appropriately modified diagonal such that $M_y$ is again a generator. The key assumption in [HS24, Theorem 3.1] is the existence of $\alpha > 0$ such that $\alpha \le \alpha_{\mathrm{gLSI}}(\rho(\cdot|y), M_y)$ for all $y \in \mathcal{Y}$. To provide such an $\alpha$, we introduce the symmetrised generators
$$
M_{y,\rho(\cdot|y)} := \frac{1}{2}\left(M_y + D_{\rho(\cdot|y)}\, M_y^T\, D_{\rho(\cdot|y)}^{-1}\right). \tag{51}
$$
Here we use the notation $D_{\rho(\cdot|y)} = \operatorname{diag}(\rho(\cdot|y))$, i.e., the diagonal matrix with the entries of $\rho(\cdot|y)$ on its diagonal. Note that the second term on the right-hand side of (51) is the explicit form of the adjoint of the generator in $L^2(\zeta)$, which is precisely the second term appearing in the symmetrised generator $M_\zeta$ in (36). Then, applying Theorem 4.7, we find that
$$
\alpha := \min_{y \in \mathcal{Y}} \frac{\lambda_y}{2 + \log\frac{1}{\rho(\cdot|y)_*}}, \tag{52}
$$
with $\lambda_y$ the spectral gap of $M_{y,\rho(\cdot|y)}$ and $\rho(\cdot|y)_*$ the minimum of $\rho(\cdot|y)$, satisfies $\alpha \le \alpha_{\mathrm{gLSI}}(\rho(\cdot|y), M_y)$ for all $y \in \mathcal{Y}$. Recall that Theorem 4.7 also provides a more explicit lower bound which does not require the calculation of the spectral gap. Although both lower bounds in Theorem 4.7 yield strict positivity of $\alpha_{\mathrm{gLSI}}(\rho(\cdot|y), M_y)$ for all $y \in \mathcal{Y}$, we use the sharper lower bound given by $\alpha$ here. A direct application of [HS24, Theorem 3.1, Remark 3.2] then yields the following result.

Proposition 5.3. Let $\mu \in C^1([0,\infty); \mathcal{P}(\mathcal{X}))$ be a solution to the full forward Kolmogorov equation (48) with corresponding coarse-grained dynamics $\hat{\mu}$, and let $\eta \in C^1([0,\infty); \mathcal{P}(\mathcal{Y}))$ be the solution to the effective dynamics (50), where the initial conditions $\hat{\mu}_0, \eta_0 \in \mathcal{P}_+(\mathcal{Y})$ satisfy $\hat{\mu}_0(y), \eta_0(y) \ge c_0$ for some $c_0 > 0$ and all $y \in \mathcal{Y}$.
Then, for any fixed $T > 0$ there exists a constant $C = C(M, |\mathcal{X}|, T, c_0)$ such that
$$
\mathcal{H}_{\eta_t}\!\left(\frac{\hat{\mu}_t}{\eta_t}\right) \le 2\,\mathcal{H}_{\eta_0}\!\left(\frac{\hat{\mu}_0}{\eta_0}\right) + \frac{C}{\min_{y}\, \alpha_{\mathrm{gLSI}}(\rho(\cdot|y), M_y)} \left[\mathcal{H}_{\rho}\!\left(\frac{\mu_0}{\rho}\right) - \mathcal{H}_{\rho}\!\left(\frac{\mu_t}{\rho}\right)\right] \tag{53}
$$
for all $t \in [0,T]$. In particular, the implicit prefactor $\min_y \alpha_{\mathrm{gLSI}}(\rho(\cdot|y), M_y)$ can be replaced by the explicit constant $\alpha > 0$ given in (52).

Remark 5.4. As pointed out in [HS24, Remark 3.2], it is unclear if the estimate (53) can be proven for $T = \infty$. However, one can obtain the weaker estimate
$$
\mathcal{H}_{\eta_t}\!\left(\frac{\hat{\mu}_t}{\eta_t}\right) \le \mathcal{H}_{\eta_0}\!\left(\frac{\hat{\mu}_0}{\eta_0}\right) + \frac{C}{\sqrt{\alpha}} \left[\mathcal{H}_{\rho}\!\left(\frac{\mu_0}{\rho}\right) - \mathcal{H}_{\rho}\!\left(\frac{\mu_t}{\rho}\right)\right]^{\frac{1}{2}}
$$
for all $t > 0$ with a constant $C$ now depending on $M$, $N$ and $\rho$, see [HS24, Theorem 3.1]. In fact, this estimate even holds without assuming strictly positive initial data.

We should point out that it is possible to construct another natural reduced dynamics on $\mathcal{Y}$ inspired by multiscale problems in Markov chains (see Section 5.2.2 for this connection). The following remark briefly discusses this reduced dynamics in the absence of scale separation.

Remark 5.5. The restriction $M_y \in \mathbb{R}^{|\Lambda_y| \times |\Lambda_y|}$ of the generator $M$ to $\Lambda_y \times \Lambda_y$ is assumed to be a generator in the setting of this section, which can always be guaranteed by modifying the diagonal elements of the restriction. If we additionally assume that $M_y$ is irreducible for every $y \in \mathcal{Y}$, then there exists $\rho_y \in \mathcal{P}(\Lambda_y)$ such that $(M_y)^T \rho_y = 0$, i.e., $\rho_y$ is the steady state of the restricted generator $M_y$. Note that, in the non-reversible case, the conditional steady state $\rho(\cdot|y)$ is generally not the same as the steady state $\rho_y$ within each cluster (see Section 5.2.3 for one such explicit example). Following the same strategy as in the construction of the effective dynamics above, we can introduce the reduced dynamics
$$
\frac{\mathrm{d}}{\mathrm{d}t}\mu^{\mathrm{av}}_t = (M^{\mathrm{av}})^T \mu^{\mathrm{av}}_t, \qquad \text{with} \qquad M^{\mathrm{av}}(y_1,y_2) := \sum_{x_1 \in \Lambda_{y_1},\, x_2 \in \Lambda_{y_2}} M(x_1,x_2)\, \rho_{y_1}(x_1).
$$
Here, $\mu^{\mathrm{av}}_t$ refers to the averaged dynamics in multiscale problems, see (57) below. While this dynamics is traditionally used only in the context of multiscale problems, it remains well-defined even in the absence of explicit scale separation, as is the case here. However, our proof techniques from Proposition 5.3 do not carry over to this averaged reduced dynamics. Specifically, the proof of [HS24, Lemma 3.8] fails since the push-forward steady state $\xi_{\#}\rho$ is generally not the steady state of $M^{\mathrm{av}}$. This points to the main issue that, in general, there is no reason why the coarse-grained generator $\hat{M}_t$ in (49) should be close to $M^{\mathrm{av}}$. This is in stark contrast to the effective generator $N$, see (50), which is the limit of $\hat{M}_t$ for $t \to \infty$. A notable exception to this issue is the class of reversible Markov chains, where $\rho_y(\cdot) = \rho(\cdot|y)$, see [HS24, Lemma 4.9], and therefore the effective and averaged dynamics coincide.

5.2.2 Clustering in Markov chains with explicit scale separation

The estimate (53) is useful in particular if $\alpha$ is large. To illustrate this, we now consider the error estimates for a system with explicit scale separation, which is reflected in a small model parameter $\varepsilon$. Specifically, we consider the behaviour of a particle moving between the micro-states within an energy landscape with energy barriers of vastly different sizes, similar to Figure 1. We model this as a Markov jump process on a discrete state space $\mathcal{X} = \mathcal{Y} \times \mathcal{Z}$, where $\mathcal{Y}$ labels the macro-state and $\mathcal{Z}$ labels the micro-state within a particular macro-state. We will make two assumptions throughout:
• there are only two macro-states, i.e. $\mathcal{Y} = \{0,1\}$;
• all macro-states contain an equal number of micro-states, i.e. $\mathcal{Z} = \{0, \ldots, n-1\}$ with $n \ge 1$.
This is reflected in Figure 1, where the state space (which labels each basin) is given by $\mathcal{X} = \mathcal{Y} \times \mathcal{Z} := \{0,1\} \times \{0,1,2\}$.
Both these assumptions have been made purely for the sake of notational simplicity, and all the following results can straightforwardly be extended to multiple macro-states containing varying numbers of micro-states. We refer to [HS24, Section 6] for a more detailed discussion.

To make these ideas concrete, consider a family of Markov chains parametrised by $\varepsilon > 0$ with corresponding forward Kolmogorov equations
$$
\begin{cases} \dfrac{\mathrm{d}}{\mathrm{d}t}\mu^\varepsilon = (L^\varepsilon)^T \mu^\varepsilon, \\ \mu^\varepsilon_{t=0} = \mu^\varepsilon_0, \end{cases} \tag{54}
$$
on $\mathcal{X} = \mathcal{Y} \times \mathcal{Z}$, where $\mu^\varepsilon_t$ is the probability distribution of the Markov chain with the initial datum $\mu^\varepsilon_0$. The generator $L^\varepsilon$ is assumed to admit the form
$$
L^\varepsilon = \frac{1}{\varepsilon} Q + G,
$$
where $Q, G$ are $\varepsilon$-independent generators, and $\varepsilon > 0$ is a small parameter that represents the ratio of the two time scales. In the language of Markov chains, $Q$ describes the $O(\frac{1}{\varepsilon})$ fast dynamics on the micro-states $\mathcal{Z}$ belonging to a single macro-state $y \in \mathcal{Y}$, and $G$ describes the $O(1)$ slow dynamics of jumps between different macro-states $y, y' \in \mathcal{Y}$. More precisely, we further decompose $Q, G$ into the matrices $Q_y, D_y, G_{y,1-y} \in \mathbb{R}^{n \times n}$ for $y \in \mathcal{Y}$ as
$$
L^\varepsilon = \frac{1}{\varepsilon} Q + G := \frac{1}{\varepsilon} \begin{pmatrix} Q_0 & 0 \\ 0 & Q_1 \end{pmatrix} + \begin{pmatrix} D_0 & G_{0,1} \\ G_{1,0} & D_1 \end{pmatrix}, \tag{55}
$$
i.e. with (writing $x = (y,z) \in \mathcal{X}$)
$$
Q((y,z),(y',z')) = \begin{cases} Q_y(z,z') & \text{if } y' = y, \\ 0 & \text{otherwise}, \end{cases}
\qquad
G((y,z),(y',z')) = \begin{cases} D_y(z,z') & \text{if } y' = y, \\ G_{y,y'}(z,z') & \text{otherwise}, \end{cases}
$$
and $D_y$ diagonal matrices. The matrix $Q_y \in \mathbb{R}^{n \times n}$ encodes the fast jumps between micro-states within the $y$-th macro-state. The matrix $G_{y,1-y} \in \mathbb{R}^{n \times n}$ encodes the slow transition from the $y$-th macro-state to the $(1-y)$-th macro-state. Finally, $D_y$ ensures that $G$ is a generator. Since $L^\varepsilon, Q, G$ are generators and $Q, G$ are independent of $\varepsilon$, they satisfy
$$
\forall x \in \mathcal{X} : \ \sum_{x' \in \mathcal{X}} Q(x,x') = 0 = \sum_{x' \in \mathcal{X}} G(x,x'), \qquad
\forall z \in \mathcal{Z} : \ D_y(z,z) = -\sum_{z' \in \mathcal{Z}} G_{y,1-y}(z,z').
$$
We assume that $L^\varepsilon$ is irreducible, and therefore (54) admits a stationary solution $\rho^\varepsilon \in \mathcal{P}_+(\mathcal{X})$. Additionally, we assume that $Q_0$ and $Q_1$ are irreducible generators as well. Consequently, the dynamics driven by $Q_y$ on $\mathcal{Z}$ for $y = 0,1$ admit stationary measures $\rho_y \in \mathcal{P}_+(\mathcal{Z})$.

This setting is of coarse-graining type in the sense that for $0 < \varepsilon \ll 1$ the dynamics within each macro-state equilibrates, which allows the reduction to a jump process on $\mathcal{Y}$ in the limit $\varepsilon \to 0$, i.e. we expect that the only relevant dynamics in the limit is between macro-states. To make this precise, we introduce a coarse-graining map as the projection onto the slow variables encoded in $\mathcal{Y}$,
$$
\xi : \mathcal{X} \to \mathcal{Y} \quad \text{with} \quad \xi(y,z) = y. \tag{56}
$$
As in the case without explicit scale separation, we describe the slow coarse-grained dynamics by $t \mapsto \hat{\mu}^\varepsilon_t := \xi_{\#}\mu^\varepsilon_t \in \mathcal{P}(\mathcal{Y})$ defined as
$$
\forall y \in \mathcal{Y} : \quad \hat{\mu}^\varepsilon_t(y) := \sum_{z \in \mathcal{Z}} \mu^\varepsilon_t((y,z)).
$$
We use $\Lambda_y := \xi^{-1}(y) \subset \mathcal{X}$ to describe the 'copy' of $\mathcal{Z}$ that belongs to a macro-state $y \in \mathcal{Y}$.

Deriving the limiting dynamics as $\varepsilon \to 0$ has received considerable attention in the literature. Under some mild assumptions, the solution to the coarse-grained dynamics converges to the averaged dynamics, i.e. $\hat{\mu}^\varepsilon \to \mu^{\mathrm{av}}$, where $t \mapsto \mu^{\mathrm{av}}_t \in \mathcal{P}(\mathcal{Y})$ solves
$$
\frac{\mathrm{d}}{\mathrm{d}t}\mu^{\mathrm{av}}_t = \left(L^{\mathrm{av}}\right)^T \mu^{\mathrm{av}}_t, \tag{57}
$$
see [PS08, Chapter 16] for convergence of backward equations, [LL13, Theorem 1] for a martingale approach and [HPST20, Section 3] for a variational approach. Here the limiting generator $L^{\mathrm{av}} \in \mathbb{R}^{|\mathcal{Y}| \times |\mathcal{Y}|} = \mathbb{R}^{2 \times 2}$ is defined as
$$
L^{\mathrm{av}} := \begin{pmatrix} -\lambda_0 & \lambda_0 \\ \lambda_1 & -\lambda_1 \end{pmatrix}, \qquad \lambda_y := \sum_{z,z' \in \mathcal{Z}} \rho_y(z)\, G_{y,1-y}(z,z'), \tag{58}
$$
where $\rho_y$ is the stationary measure corresponding to $Q_y$. Note that this coincides with the reduced dynamics proposed in Remark 5.5 in the absence of scale separation.
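As an illustration of (58), the following Python sketch computes the averaged rates $\lambda_y$ from the fast blocks $Q_y$ and the slow blocks $G_{y,1-y}$. The helper names `averaged_rate` and `is_stationary` are hypothetical. The blocks are chosen to match the six-state example (68) of Section 5.2.3, for which the uniform measure is stationary for both $Q_0$ and $Q_1$ because all their column sums vanish.

```python
from fractions import Fraction


def averaged_rate(rho_y, G_cross):
    """lambda_y = sum over z, z' of rho_y(z) * G_{y,1-y}(z, z'), cf. (58)."""
    return sum(rho_y[z] * sum(G_cross[z]) for z in range(len(rho_y)))


def is_stationary(rho, Q):
    """Check exactly whether rho^T Q = 0."""
    n = len(Q)
    return all(sum(rho[z] * Q[z][zp] for z in range(n)) == 0 for zp in range(n))


# Fast blocks Q_0, Q_1 and slow blocks G_{0,1}, G_{1,0} read off from (68).
Q0 = [[-3, 2, 1], [1, -3, 2], [2, 1, -3]]
Q1 = [[-3, 1, 2], [2, -3, 1], [1, 2, -3]]
G01 = [[0, 0, 0], [1, 0, 0], [0, 0, 2]]
G10 = [[0, 2, 0], [0, 0, 0], [0, 0, 1]]

uniform = [Fraction(1, 3)] * 3
assert is_stationary(uniform, Q0) and is_stationary(uniform, Q1)

lam0 = averaged_rate(uniform, G01)
lam1 = averaged_rate(uniform, G10)
L_av = [[-lam0, lam0], [lam1, -lam1]]
```

For these blocks both averaged rates equal one, so the limiting two-state chain is symmetric even though the underlying six-state chain is non-reversible.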
While these results describe the case of 'infinite scale separation', as discussed in the case without explicit scale separation, it is usually more relevant in applications to obtain a reduced model in the case of small, but finite, $\varepsilon > 0$, or even for settings without explicit scale separation. Motivated by Section 5.2.1, we again define the effective dynamics $t \mapsto \eta^\varepsilon_t \in \mathcal{P}(\mathcal{Y})$ as
$$
\frac{\mathrm{d}}{\mathrm{d}t}\eta^\varepsilon_t = \left(N^\varepsilon\right)^T \eta^\varepsilon_t, \qquad \text{with} \qquad N^\varepsilon(y_1,y_2) := \sum_{x_1 \in \Lambda_{y_1},\, x_2 \in \Lambda_{y_2}} L^\varepsilon(x_1,x_2)\, \rho^\varepsilon(x_1|y_1), \tag{59}
$$
where $\rho^\varepsilon(\cdot|y) \in \mathcal{P}(\Lambda_y)$ are the conditional measures corresponding to the full steady state $\rho^\varepsilon$. We point out that $N^\varepsilon$ is an irreducible generator with steady state $\hat{\rho}^\varepsilon = \xi_{\#}\rho^\varepsilon$ given by
$$
\hat{\rho}^\varepsilon(y) := \sum_{x \in \Lambda_y} \rho^\varepsilon(x)
$$
for all $y \in \mathcal{Y}$, see [HS24, Lemma 2.5, Proposition 2.6].

We note that the effective dynamics converges to the averaged dynamics, i.e. $\eta^\varepsilon \to \mu^{\mathrm{av}}$ as $\varepsilon \to 0$ [HS24, Theorem 4.4]. More interestingly, the effective dynamics is indeed a good approximation of the true slow-variable dynamics [HS24, Theorem 4.5], i.e. under suitable assumptions we find
$$
\sup_{t \in [0,T]} \mathcal{H}_{\eta^\varepsilon_t}\!\left(\frac{\hat{\mu}^\varepsilon_t}{\eta^\varepsilon_t}\right) \le C\varepsilon.
$$
The main assumption is the existence of a uniform log-Sobolev inequality, that is, there exists an $\alpha > 0$ such that $\alpha < \alpha_{\mathrm{gLSI}}(\rho^\varepsilon(\cdot|y), Q_y)$ for all $\varepsilon \in (0,1)$ and each $y \in \mathcal{Y}$. This is similar to the case without explicit scale separation, but we additionally require uniformity in $\varepsilon \in (0,1)$. Note that similar assumptions also appear in the coarse-graining literature on diffusion processes [LL10, ZHS16, DLP+18, HNS20] for the log-Sobolev inequality and [LLO17, LLS19, LZ19] for the Poincaré inequality. Using the framework introduced in this paper, we can remove this assumption entirely, as we will show in the remainder of this section.
We stress that for non-reversible processes $L^\varepsilon$, the conditional steady state $\rho^\varepsilon(\cdot|y)$ is in general not the steady state of the generator $Q_y$ on the level set $\Lambda_y$, since $\rho^\varepsilon(\cdot|y)$ generically depends on $\varepsilon$ while $Q_y$ does not, see Section 5.2.3 for an explicit example. However, if $L^\varepsilon$ is reversible, its stationary measure $\rho^\varepsilon$ is $\varepsilon$-independent, see [HS24, Lem. 4.9], and $\rho^\varepsilon(\cdot|y)$ is the steady state of $Q_y$.

The existence and interpretation of such an $\alpha > 0$ was left as an open question in [HS24]. Now, with the non-equilibrium functional inequalities developed in this paper, we can answer it. For this, we apply the general set-up to the scale-separated Markov chain with generator $L^\varepsilon$, see (55), and obtain the following result, which removes the assumption of a uniform log-Sobolev constant in [HS24, Thm. 4.5].

Theorem 5.6. Let $Q, G$ be as defined in (55). Then, there exist constants $\varepsilon_0, c_0, C_0, C_1, C_2, \beta > 0$ such that the following result holds. For $\varepsilon \in (0,\varepsilon_0)$ let $t \mapsto \mu^\varepsilon_t \in \mathcal{P}(\mathcal{X})$ and $t \mapsto \eta^\varepsilon_t \in \mathcal{P}(\mathcal{Y})$ be the solutions to
$$
\begin{cases} \dfrac{\mathrm{d}}{\mathrm{d}t}\mu^\varepsilon_t = (L^\varepsilon)^T \mu^\varepsilon_t, \\ \mu^\varepsilon_{t=0} = \mu^\varepsilon_0, \end{cases}
\qquad
\begin{cases} \dfrac{\mathrm{d}}{\mathrm{d}t}\eta^\varepsilon_t = (N^\varepsilon)^T \eta^\varepsilon_t, \\ \eta^\varepsilon_{t=0} = \eta^\varepsilon_0, \end{cases}
$$
where we assume the following on the initial conditions $\mu^\varepsilon_0, \eta^\varepsilon_0$:
• (strictly positive initial data) $\hat{\mu}^\varepsilon_0(x) \ge c_0$ and $\eta^\varepsilon_0(y) \ge c_0$ for all $\varepsilon \in (0,\varepsilon_0)$ and for all $x \in \mathcal{X}$ and all $y \in \mathcal{Y}$, respectively,
• (convergence) $\mu^\varepsilon_0$, $\hat{\mu}^\varepsilon_0 = \xi_{\#}\mu^\varepsilon_0$ (see (56)) and $\eta^\varepsilon_0$ converge in $\mathcal{P}(\mathcal{Y})$ as $\varepsilon \to 0$, and
• (relative entropy estimate) for all $\varepsilon \in (0,\varepsilon_0)$
$$
\mathcal{H}_{\eta^\varepsilon_0}\!\left(\frac{\hat{\mu}^\varepsilon_0}{\eta^\varepsilon_0}\right) \le C_0\, \varepsilon.
$$
Then, the estimate
$$
\mathcal{H}_{\eta^\varepsilon_t}\!\left(\frac{\hat{\mu}^\varepsilon_t}{\eta^\varepsilon_t}\right) \le \varepsilon\, C_1\, e^{-\beta t} \tag{60}
$$
holds for all $t \ge 0$ and all $\varepsilon \in (0,\varepsilon_0)$. In particular, the time-uniform estimate
$$
\sup_{t \ge 0} \mathcal{H}_{\eta^\varepsilon_t}\!\left(\frac{\hat{\mu}^\varepsilon_t}{\eta^\varepsilon_t}\right) \le C_2\, \varepsilon \tag{61}
$$
holds for all $\varepsilon \in (0,\varepsilon_0)$.

Remark 5.7. We briefly comment on the first two assumptions in Theorem 5.6.
The first assumption is consistent with the assumptions in Proposition 5.3, and we add the uniformity in $\varepsilon > 0$ to avoid degeneracy as $\varepsilon \to 0$. The second assumption that $\hat{\mu}^\varepsilon_0$ and $\eta^\varepsilon_0$ converge as $\varepsilon \to 0$ is used in the proof of [HS24, Theorem 4.5] to obtain that both the coarse-grained dynamics $\hat{\mu}^\varepsilon_t$ and the effective dynamics $\eta^\varepsilon_t$ converge to the $\varepsilon$-independent averaged dynamics $\mu^{\mathrm{av}}_t$ given by (57) uniformly on a fixed, finite time interval $[0,T]$. This is then used to obtain an $\varepsilon$-independent lower bound on $\hat{\mu}^\varepsilon_t$ and $\eta^\varepsilon_t$ on $[0,T]$. Additionally, we assume the convergence of the full initial data $\mu^\varepsilon_0$, which is used in the proof of [HS24, Theorem 4.5] to guarantee convergence of the relative entropy of $\mu^\varepsilon_0$ and $\rho^\varepsilon$ as $\varepsilon \to 0$. Since this is only needed to obtain an $\varepsilon$-uniform bound on the relative entropy of $\mu^\varepsilon_0$ and $\rho^\varepsilon$, we conjecture that it can be removed. However, we still state it here for convenience.

Proof. We first prove that for all $T > 0$ there exists a constant $C(T) > 0$ such that
$$
\sup_{t \in [0,T]} \mathcal{H}_{\eta^\varepsilon_t}\!\left(\frac{\hat{\mu}^\varepsilon_t}{\eta^\varepsilon_t}\right) \le C(T)\, \varepsilon \tag{62}
$$
for all $\varepsilon \in (0,\varepsilon_0)$. This follows from an application of [HS24, Theorem 4.5, Equation (50)]. For this, the only part left to check is [HS24, Thm. 4.5 (A2)], i.e. there exist $\varepsilon_0 > 0$ and $\alpha > 0$ such that for all $\varepsilon \in (0,\varepsilon_0)$, $y \in \mathcal{Y}$ and $\nu \in \mathcal{P}(\mathcal{Z})$ it holds that
$$
\mathcal{H}_{\rho^\varepsilon(\cdot|y)}\!\left(\frac{\nu}{\rho^\varepsilon(\cdot|y)}\right) \le \frac{1}{\alpha}\, \mathcal{R}_{\rho^\varepsilon(\cdot|y)}\!\left(\frac{\nu}{\rho^\varepsilon(\cdot|y)}, Q_y\right),
$$
where $\mathcal{R}_{\rho^\varepsilon(\cdot|y)}$ is the generalised Fisher information with respect to the generator $Q_y$, see (55), and $\rho^\varepsilon(\cdot|y)$ is the conditional measure of the full steady state $\rho^\varepsilon$ of $L^\varepsilon$. Thus, we need to show that $\alpha_{\mathrm{gLSI}}(\rho^\varepsilon(\cdot|y), Q_y) \ge \alpha$ for some $\alpha > 0$ independent of $\varepsilon \in (0,\varepsilon_0)$. To see this, we note that $\rho^\varepsilon(\cdot|y) \to \pi_y$ as $\varepsilon \to 0$, where $\pi_y$ is the positive stationary measure of the generator $Q_y$, see [HPST20, Lemma 3.3].
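Since $\alpha_{\mathrm{gLSI}}(\pi_y, Q_y) = \alpha_{\mathrm{LSI}}(Q_y) > 0$ for each $y \in \mathcal{Y}$, we have $\tilde{\alpha} := \min_{y \in \mathcal{Y}} \alpha_{\mathrm{gLSI}}(\pi_y, Q_y) > 0$. Then, the continuity of $\nu \mapsto \alpha_{\mathrm{gLSI}}(\nu, Q_y)$ for each $y$ shows that there exists an $\varepsilon_0 > 0$ such that
$$
\min_{y \in \mathcal{Y}} \alpha_{\mathrm{gLSI}}(\rho^\varepsilon(\cdot|y), Q_y) \ge \frac{\tilde{\alpha}}{2} =: \alpha
$$
for all $\varepsilon \in (0,\varepsilon_0)$. Note that $\varepsilon_0$ can be chosen uniformly in $y$ since $\mathcal{Y}$ is finite. Then, all assumptions of [HS24, Theorem 4.5] are satisfied, which yields (62).

Next, we establish a long-time estimate. Although this follows from similar arguments as in Section 5.1, we give a detailed proof since we have to carefully track the $\varepsilon$-dependence. We recall that
$$
\frac{\mathrm{d}}{\mathrm{d}t} \mathcal{H}_{\eta^\varepsilon_t}\!\left(\frac{\hat{\mu}^\varepsilon_t}{\eta^\varepsilon_t}\right) = -\mathcal{R}_{\eta^\varepsilon_t}\!\left(\frac{\hat{\mu}^\varepsilon_t}{\eta^\varepsilon_t}\right) \le -\alpha_{\mathrm{gLSI}}(\eta^\varepsilon_t, N^\varepsilon)\, \mathcal{H}_{\eta^\varepsilon_t}\!\left(\frac{\hat{\mu}^\varepsilon_t}{\eta^\varepsilon_t}\right). \tag{63}
$$
We now show that there exist $T_1 > 0$ and $\varepsilon_0 > 0$ such that
$$
\alpha_{\mathrm{gLSI}}(\eta^\varepsilon_t, N^\varepsilon) \ge \frac{1}{2}\, \alpha_{\mathrm{gLSI}}(\pi^{\mathrm{av}}, L^{\mathrm{av}}) =: \beta > 0 \tag{64}
$$
for all $t \ge T_1$ and all $\varepsilon \in (0,\varepsilon_0)$. The proof relies on the continuity of the map $(\zeta, M) \mapsto \alpha_{\mathrm{gLSI}}(\zeta, M)$ at $(\zeta, M) = (\pi^{\mathrm{av}}, L^{\mathrm{av}})$, see Theorem 4.4 and Remark 4.5, and the observation that the convergence $(\eta^\varepsilon_t, N^\varepsilon) \to (\pi^{\mathrm{av}}, L^{\mathrm{av}})$ as $\varepsilon \to 0$ and $t \to \infty$ is uniform with respect to the initial condition $\eta^\varepsilon_0$. To prove the latter, it is sufficient to show that for all $\delta > 0$ there exist $T_1 > 0$ large enough and $\varepsilon_0 > 0$ small enough such that the estimate
$$
\|\eta^\varepsilon_t - \pi^{\mathrm{av}}\|_{\mathrm{TV}} + \|N^\varepsilon - L^{\mathrm{av}}\| < \delta
$$
holds for any $t \ge T_1$ and any $\varepsilon \in (0,\varepsilon_0)$. First, we recall from [HS24, Equation (43)] that
$$
\|N^\varepsilon - L^{\mathrm{av}}\| \le C\varepsilon \tag{65}
$$
for all $\varepsilon \in (0,\varepsilon_0)$. Note that the argument below does not need the linear decay in $\varepsilon$; any decay would suffice. Second, we estimate
$$
\|\eta^\varepsilon_t - \pi^{\mathrm{av}}\|_{\mathrm{TV}} \le \|\eta^\varepsilon_t - \hat{\rho}^\varepsilon\|_{\mathrm{TV}} + \|\hat{\rho}^\varepsilon - \pi^{\mathrm{av}}\|_{\mathrm{TV}},
$$
where $\hat{\rho}^\varepsilon \in \mathcal{P}_+(\mathcal{Y})$ is the steady state of $N^\varepsilon$.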
The first term is expected to be small for sufficiently large $t$ due to the convergence of $\eta^\varepsilon_t$ to the steady state $\hat{\rho}^\varepsilon$, and the second term is expected to be small for sufficiently small $\varepsilon > 0$ due to $\hat{\rho}^\varepsilon \to \pi^{\mathrm{av}}$ as $\varepsilon \to 0$, see [HS24, Lemma 4.3] for a qualitative proof of the latter. We now quantify these convergences. For the second term, we note that
$$
\|\hat{\rho}^\varepsilon - \pi^{\mathrm{av}}\|_{\mathrm{TV}} \le C\varepsilon
$$
for all $\varepsilon \in (0,\varepsilon_0)$. Since this result is potentially of independent interest, we prove this estimate separately in Lemma 5.9 below. For the first term, by [HS24, Proposition A.2], there exist constants $C(\eta^\varepsilon_0), D(N^\varepsilon) > 0$ such that
$$
\|\eta^\varepsilon_t - \hat{\rho}^\varepsilon\|_{\mathrm{TV}} \le C(\eta^\varepsilon_0)\, e^{-D(N^\varepsilon) t}
$$
for all $t \ge 0$. Next, we recall that $N^\varepsilon \to L^{\mathrm{av}}$ as $\varepsilon \to 0$, see (65). Revisiting the proof of [HS24, Proposition A.2] we can, in fact, find constants $C, D > 0$ independent of $\varepsilon$, $t$ and the initial data $\eta^\varepsilon_0$ such that
$$
\|\eta^\varepsilon_t - \hat{\rho}^\varepsilon\|_{\mathrm{TV}} \le C e^{-Dt}
$$
for all $t \ge 0$ and $\varepsilon \in (0,\varepsilon_0)$ (potentially choosing a smaller $\varepsilon_0$). Here, we use that $D(N^\varepsilon)$ can be bounded from below by the spectral gap of $N^\varepsilon$. Since the spectrum of a finite-dimensional matrix is continuous in the matrix entries, the spectral gap of $N^\varepsilon$ converges to the (positive) spectral gap of $L^{\mathrm{av}}$. To obtain the independence of the initial data, we note in the proof of [HS24, Proposition A.2] that $\mathcal{P}(\mathcal{Y})$ can be interpreted as a bounded subset of $\mathbb{R}^{|\mathcal{Y}|}$. In summary, we obtain that
$$
\|\eta^\varepsilon_t - \pi^{\mathrm{av}}\|_{\mathrm{TV}} \le C\left(e^{-Dt} + \varepsilon\right),
$$
which can be made arbitrarily small uniformly in $t \ge T_1$ and $\varepsilon \in (0,\varepsilon_0)$ by choosing $T_1$ sufficiently large and $\varepsilon_0 > 0$ sufficiently small.

Using (63) and (64) we thus obtain that
$$
\frac{\mathrm{d}}{\mathrm{d}t} \mathcal{H}_{\eta^\varepsilon_t}\!\left(\frac{\hat{\mu}^\varepsilon_t}{\eta^\varepsilon_t}\right) \le -\beta\, \mathcal{H}_{\eta^\varepsilon_t}\!\left(\frac{\hat{\mu}^\varepsilon_t}{\eta^\varepsilon_t}\right)
$$
for all $t \ge T_1$ and $\varepsilon \in (0,\varepsilon_0)$. An application of Gronwall's inequality thus yields
$$
\mathcal{H}_{\eta^\varepsilon_t}\!\left(\frac{\hat{\mu}^\varepsilon_t}{\eta^\varepsilon_t}\right) \le \mathcal{H}_{\eta^\varepsilon_{T_1}}\!\left(\frac{\hat{\mu}^\varepsilon_{T_1}}{\eta^\varepsilon_{T_1}}\right) e^{-\beta(t - T_1)} \le \varepsilon\, C(T_1)\, e^{\beta T_1} e^{-\beta t} =: \varepsilon\, C_1\, e^{-\beta t} \tag{66}
$$
for all $t \ge T_1$ and $\varepsilon \in (0,\varepsilon_0)$, where in the second inequality we have applied the finite-time estimate (62). Thus, combining the finite-time estimate (62) and the large-time estimate (66), we find the desired error bound (60). This immediately implies the estimate (61), which completes the proof.

Remark 5.8. In contrast to the case without explicit scale separation, we expect that the averaged dynamics (57) is a good approximation of the coarse-grained dynamics $\hat{\mu}^\varepsilon_t$. Specifically, the coarse-grained generator converges to the averaged one as $\varepsilon \to 0$ uniformly on finite time intervals, see [HPST20, Lemma 3.4]. Moreover, error estimates are also available in this case, see e.g. [PS08, Remark 16.2].

To complete the proof of Theorem 5.6, it is left to establish an error estimate between the steady states of the effective generator $N^\varepsilon$ and the averaged generator $L^{\mathrm{av}}$. In particular, this estimate complements the error estimate for two solutions of the effective equation (59) and the averaged equation (57) obtained in [HS24, Theorem 4.4]. However, we point out that [HS24, Theorem 4.4] does not include an estimate between steady states, because the bound there contains the difference of the initial data.

Lemma 5.9. Let $N^\varepsilon$ and $L^{\mathrm{av}}$ be given as in (59) and (58), respectively, and let $\hat{\rho}^\varepsilon$ and $\pi^{\mathrm{av}}$ be the corresponding steady states. Then, there exist $C > 0$ and $\varepsilon_0 > 0$ such that
$$
\|\hat{\rho}^\varepsilon - \pi^{\mathrm{av}}\|_{\mathrm{TV}} \le C\varepsilon
$$
for all $\varepsilon \in (0,\varepsilon_0)$.

Remark 5.10. We highlight that the result of Lemma 5.9 is not restricted to $|\mathcal{Y}| = 2$ as in the example (55). Indeed, the proof works for any finite reduced state space $\mathcal{Y}$.

Proof.
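Since $\hat{\rho}^\varepsilon$ and $\pi^{\mathrm{av}}$ are steady states, we have that
$$
0 = (N^\varepsilon)^T \hat{\rho}^\varepsilon - (L^{\mathrm{av}})^T \pi^{\mathrm{av}} = (N^\varepsilon)^T(\hat{\rho}^\varepsilon - \pi^{\mathrm{av}}) + \left((N^\varepsilon)^T - (L^{\mathrm{av}})^T\right)\pi^{\mathrm{av}}
$$
and therefore,
$$
(N^\varepsilon)^T(\hat{\rho}^\varepsilon - \pi^{\mathrm{av}}) = -(N^\varepsilon - L^{\mathrm{av}})^T \pi^{\mathrm{av}}. \tag{67}
$$
We wish to multiply both sides of (67) by $(N^\varepsilon)^{-T}$ on the left; however, $(N^\varepsilon)^T$ is not invertible. Our strategy is to reduce $\mathbb{R}^{|\mathcal{Y}|}$ to a subspace on which $(N^\varepsilon)^T$ is invertible. With this aim, we first note that
$$
(\mathbf{1}, \hat{\rho}^\varepsilon - \pi^{\mathrm{av}}) = 0, \qquad (\mathbf{1}, (N^\varepsilon - L^{\mathrm{av}})^T \pi^{\mathrm{av}}) = ((N^\varepsilon - L^{\mathrm{av}})\mathbf{1}, \pi^{\mathrm{av}}) = 0,
$$
where we use that $\hat{\rho}^\varepsilon, \pi^{\mathrm{av}} \in \mathcal{P}(\mathcal{Y})$ and that $N^\varepsilon$ and $L^{\mathrm{av}}$ are generators. Therefore, both $\hat{\rho}^\varepsilon - \pi^{\mathrm{av}}$ and $(N^\varepsilon - L^{\mathrm{av}})^T \pi^{\mathrm{av}}$ are in $\langle \mathbf{1} \rangle^\perp$. Since $N^\varepsilon$ is an irreducible generator and thus $\dim(\ker(N^\varepsilon)) = 1$, we have that $\operatorname{rank}((N^\varepsilon)^T) = \operatorname{rank}(N^\varepsilon) = |\mathcal{Y}| - 1$ due to the rank-nullity theorem and that $(\mathbf{1}, (N^\varepsilon)^T v) = (N^\varepsilon \mathbf{1}, v) = 0$ for all $v \in \mathbb{R}^{|\mathcal{Y}|}$. These observations demonstrate that $\operatorname{ran}((N^\varepsilon)^T) = \langle \mathbf{1} \rangle^\perp$. Using in addition that $\ker((N^\varepsilon)^T) = \langle \hat{\rho}^\varepsilon \rangle$ (because $N^\varepsilon$ is an irreducible generator and $\hat{\rho}^\varepsilon$ is its unique steady state), we obtain from $\hat{\rho}^\varepsilon \notin \langle \mathbf{1} \rangle^\perp$ that the restriction $T^\varepsilon := (N^\varepsilon)^T|_{\langle \mathbf{1} \rangle^\perp}$ is invertible as a linear map from $\langle \mathbf{1} \rangle^\perp$ to itself.

Moreover, $(T^\varepsilon)^{-1}$ is bounded uniformly in $\varepsilon \in (0,\varepsilon_0)$. To see this, note that $N^\varepsilon \to L^{\mathrm{av}}$ as $\varepsilon \to 0$, see (65), and that $L^{\mathrm{av}}$ is also a generator. Since $L^{\mathrm{av}}$ is a generator, the restriction $T^0 := (L^{\mathrm{av}})^T|_{\langle \mathbf{1} \rangle^\perp}$ is also invertible as a map from $\langle \mathbf{1} \rangle^\perp$ into itself. Therefore, we have a family of invertible maps $T^\varepsilon : \langle \mathbf{1} \rangle^\perp \to \langle \mathbf{1} \rangle^\perp$, which converge to an invertible map $T^0 : \langle \mathbf{1} \rangle^\perp \to \langle \mathbf{1} \rangle^\perp$. Therefore, $(T^\varepsilon)^{-1}$ converges to $(T^0)^{-1}$ and is thus uniformly bounded. Applying $(T^\varepsilon)^{-1}$ to both sides of (67), we obtain
$$
\|\hat{\rho}^\varepsilon - \pi^{\mathrm{av}}\| \le \|(T^\varepsilon)^{-1}\|\, \|(N^\varepsilon)^T - (L^{\mathrm{av}})^T\|\, \|\pi^{\mathrm{av}}\| \le C\varepsilon
$$
for $\varepsilon$ small enough, where we used $\|(N^\varepsilon)^T - (L^{\mathrm{av}})^T\| \le C\varepsilon$, see (65), in the last estimate.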
Since all norms on $\mathbb{R}^{|\mathcal{Y}|}$ are equivalent, this completes the proof.

5.2.3 Assessing the quality of coarse-graining maps

We now return to the general error estimates without scale separation in Proposition 5.3 and recall that the constant in the error estimate depends on $\alpha^{-1}$ with $\alpha \le \min_{y \in \mathcal{Y}} \alpha_{\mathrm{gLSI}}(\rho(\cdot|y), M_y)$. Using the expression for $\alpha$ given in (52), we can now give an estimate of the quality of the chosen coarse-graining map $\xi$. For this, we note that since the constant in the error estimate behaves like $\alpha^{-1}$, a larger $\alpha$ yields a smaller error and thus indicates a "better" coarse-graining map $\xi$. While $\alpha$ given in (52) is only a lower bound for the gLSI constants $\alpha_{\mathrm{gLSI}}(\rho(\cdot|y), M_y)$ and thus only gives a sufficient criterion to evaluate the quality of $\xi$, we also provide an upper bound, which is given by $\lambda_y$, the spectral gap of the symmetrised generator corresponding to the pair $M_y$ and $\rho(\cdot|y)$, see (36) and (43). Therefore, the spectral gap $\lambda_y$ together with the minimum of the conditional measure $\rho(\cdot|y)$ provide necessary and sufficient criteria to evaluate the quality of the chosen coarse-graining map.

We demonstrate this for the example Markov chain generated by
$$
L = \begin{pmatrix}
-3\varepsilon^{-1} & 2\varepsilon^{-1} & \varepsilon^{-1} & 0 & 0 & 0 \\
\varepsilon^{-1} & -1-3\varepsilon^{-1} & 2\varepsilon^{-1} & 1 & 0 & 0 \\
2\varepsilon^{-1} & \varepsilon^{-1} & -2-3\varepsilon^{-1} & 0 & 0 & 2 \\
0 & 2 & 0 & -2-3\varepsilon^{-1} & \varepsilon^{-1} & 2\varepsilon^{-1} \\
0 & 0 & 0 & 2\varepsilon^{-1} & -3\varepsilon^{-1} & \varepsilon^{-1} \\
0 & 0 & 1 & \varepsilon^{-1} & 2\varepsilon^{-1} & -1-3\varepsilon^{-1}
\end{pmatrix} \tag{68}
$$
with $\varepsilon > 0$. This is a generator of a non-reversible Markov chain on the state space $\mathcal{X} = \{1,2,3,4,5,6\}$, and the corresponding network indicating the transitions between the states is depicted in Figure 2. Moreover, the steady state is given by
$$
\rho = \frac{1}{42 + 27\varepsilon}\left(7+4\varepsilon,\ 7+6\varepsilon,\ 7+3\varepsilon,\ 7+3\varepsilon,\ 7+5\varepsilon,\ 7+6\varepsilon\right)^T
$$
and is also $\varepsilon$-dependent. In fact, this generator aligns with the setting in Section 5.2.2 and exhibits explicit scale separation.
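The stated steady state can be checked directly against the generator (68). The following Python sketch does this in exact rational arithmetic for a few values of $\varepsilon$; the function names `generator`, `claimed_steady_state` and `residual` are illustrative choices, and the sketch is independent of the Mathematica computations in the supplementary material.

```python
from fractions import Fraction


def generator(eps):
    """Six-state generator (68); f = 1/eps is the fast rate scale."""
    f = 1 / Fraction(eps)
    return [
        [-3 * f, 2 * f, f, 0, 0, 0],
        [f, -1 - 3 * f, 2 * f, 1, 0, 0],
        [2 * f, f, -2 - 3 * f, 0, 0, 2],
        [0, 2, 0, -2 - 3 * f, f, 2 * f],
        [0, 0, 0, 2 * f, -3 * f, f],
        [0, 0, 1, f, 2 * f, -1 - 3 * f],
    ]


def claimed_steady_state(eps):
    """Closed-form steady state stated after (68)."""
    e = Fraction(eps)
    weights = [7 + 4 * e, 7 + 6 * e, 7 + 3 * e, 7 + 3 * e, 7 + 5 * e, 7 + 6 * e]
    return [w / (42 + 27 * e) for w in weights]


def residual(rho, L):
    """Return the vector rho^T L, which vanishes iff rho is stationary."""
    n = len(L)
    return [sum(rho[i] * L[i][j] for i in range(n)) for j in range(n)]


for eps in (Fraction(1, 100), Fraction(1, 3), 1, 5):
    L = generator(eps)
    rho = claimed_steady_state(eps)
    assert all(sum(row) == 0 for row in L)      # rows of a generator sum to zero
    assert sum(rho) == 1                        # rho is a probability vector
    assert residual(rho, L) == [0] * 6          # stationarity: rho^T L = 0
    # Detailed balance fails between states 1 and 2, so the chain is non-reversible.
    assert rho[0] * L[0][1] != rho[1] * L[1][0]
```

The last assertion compares the probability fluxes between states 1 and 2, which differ for every $\varepsilon > 0$, consistent with the non-reversibility of the chain.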
The explicit scale separation makes it convenient to evaluate the quality of the coarse-graining map. Since the calculations are extensive, we refrain from including all details here and only present important intermediate results. Instead, we refer to the supplementary material [HvMS26], where the full calculations are provided using Mathematica [Mat].

The natural choice for the coarse-graining map is given by $\mathcal{Y} = \{a, b\}$ and
\[
\xi(x) = \begin{cases} a, & x \in \{1, 2, 3\}, \\ b, & x \in \{4, 5, 6\}. \end{cases}
\]
In fact, this is the choice of $\xi$ discussed in Section 5.2.2. In this case, the restricted generators $L^y$ are given by
\[
L^a = \frac{1}{\varepsilon} \begin{pmatrix} -3 & 2 & 1 \\ 1 & -3 & 2 \\ 2 & 1 & -3 \end{pmatrix}, \qquad
L^b = \frac{1}{\varepsilon} \begin{pmatrix} -3 & 1 & 2 \\ 2 & -3 & 1 \\ 1 & 2 & -3 \end{pmatrix}
\]
and the conditional measures are
\[
\rho(\cdot|a) = \frac{1}{21 + 13\varepsilon} \, (7 + 4\varepsilon,\ 7 + 6\varepsilon,\ 7 + 3\varepsilon)^T, \qquad
\rho(\cdot|b) = \frac{1}{21 + 14\varepsilon} \, (7 + 3\varepsilon,\ 7 + 5\varepsilon,\ 7 + 6\varepsilon)^T.
\]

Figure 2: Depiction of the transitions of the Markov chain generated by (68).

Notice that the conditional steady states are indeed $\varepsilon$-dependent, which stems from the non-reversibility of the generator $L$. With a direct calculation, we then find that the spectral gaps of the symmetrised generators $L_{a,\rho(\cdot|a)}$ and $L_{b,\rho(\cdot|b)}$ are given by
\[
\lambda_a = -\frac{9}{2\varepsilon} + \frac{1}{2}\sqrt{\frac{3}{7}} + \left( -\frac{3}{28} - \frac{199}{196\sqrt{21}} \right) \varepsilon + O(\varepsilon^2), \qquad
\lambda_b = -\frac{9}{2\varepsilon} + \frac{1}{2}\sqrt{\frac{3}{7}} + \left( -\frac{3}{28} - \frac{179}{196\sqrt{21}} \right) \varepsilon + O(\varepsilon^2),
\]
and therefore,
\[
\alpha = \frac{9}{2(2 + \log(3))\,\varepsilon} + O(1) = O(\varepsilon^{-1}).
\]
Thus, restricting to strictly positive initial data satisfying
\[
H_{\eta_0}\!\left( \frac{\hat\mu_0}{\eta_0} \right) \le C \varepsilon,
\]
the improved estimate (53) yields that
\[
H_{\eta_t}\!\left( \frac{\hat\mu_t}{\eta_t} \right) \le C \varepsilon,
\]
which matches Theorem 5.6.

Finally, to illustrate a "bad" choice for the coarse-graining map, let
\[
\tilde\xi(x) := \begin{cases} a, & x \in \{1, 2, 4\}, \\ b, & x \in \{3, 5, 6\}. \end{cases}
\]
This does not reflect the scale separation of $L$ since it mixes micro-states from different macro-states.
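The restricted generators and conditional measures for a given partition of $\mathcal{X}$ can be sketched numerically as follows. The rule used for the restricted generator (keep the rates between states of the same cluster and reset the diagonal so that rows sum to zero) is an assumption inferred from the displayed matrices $L^a$ and $L^b$, not a quotation of the paper's definition; the function names are hypothetical and the closed-form steady state is taken from the text.

```python
import numpy as np

def generator(eps):
    """Generator (68) on X = {1,...,6}."""
    e = 1.0 / eps
    return np.array([
        [-3*e,      2*e,        e,      0.0,  0.0,      0.0],
        [   e, -1 - 3*e,      2*e,      1.0,  0.0,      0.0],
        [ 2*e,        e, -2 - 3*e,      0.0,  0.0,      2.0],
        [ 0.0,      2.0,      0.0, -2 - 3*e,    e,      2*e],
        [ 0.0,      0.0,      0.0,      2*e, -3*e,        e],
        [ 0.0,      0.0,      1.0,        e,  2*e, -1 - 3*e],
    ])

def steady_state_closed_form(eps):
    """Steady state of (68) as stated in the text."""
    w = np.array([7 + 4*eps, 7 + 6*eps, 7 + 3*eps,
                  7 + 3*eps, 7 + 5*eps, 7 + 6*eps])
    return w / (42 + 27*eps)

def restricted_generator(gen, states):
    """Assumed rule: keep intra-cluster rates, reset the diagonal so that
    every row of the restricted matrix sums to zero."""
    idx = [s - 1 for s in states]            # 1-based labels -> 0-based indices
    sub = gen[np.ix_(idx, idx)].copy()
    np.fill_diagonal(sub, 0.0)
    np.fill_diagonal(sub, -sub.sum(axis=1))
    return sub

def conditional_measure(rho, states):
    """Restrict rho to the cluster and renormalise."""
    idx = [s - 1 for s in states]
    return rho[idx] / rho[idx].sum()

eps = 0.1
L, rho = generator(eps), steady_state_closed_form(eps)
partitions = {
    "xi":       {"a": (1, 2, 3), "b": (4, 5, 6)},   # natural choice
    "xi_tilde": {"a": (1, 2, 4), "b": (3, 5, 6)},   # "bad" choice
}
for name, clusters in partitions.items():
    for y, states in clusters.items():
        print(name, y, "restricted generator:")
        print(np.round(restricted_generator(L, states), 3))
        print("conditional measure:", np.round(conditional_measure(rho, states), 4))
```

For the natural map, every entry of the restricted generators scales like $\varepsilon^{-1}$, whereas for $\tilde\xi$ each cluster retains rates of order one; this is the structural difference that the quality score is meant to detect.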
Indeed, for $\tilde\xi$ we find that
\[
\tilde L^a = \begin{pmatrix} -2\varepsilon^{-1} & 2\varepsilon^{-1} & 0 \\ \varepsilon^{-1} & -1-\varepsilon^{-1} & 1 \\ 0 & 2 & -2 \end{pmatrix}, \qquad
\tilde L^b = \begin{pmatrix} -2 & 0 & 2 \\ 0 & -\varepsilon^{-1} & \varepsilon^{-1} \\ 1 & 2\varepsilon^{-1} & -1-2\varepsilon^{-1} \end{pmatrix}
\]
and proceeding as above yields
\[
\tilde\alpha = \frac{9}{4(2 + \log(3))} + O(\varepsilon) = O(1).
\]
Thus, as expected, Proposition 5.3 yields no decay of the error between $\hat\mu$ and $\eta$ as $\varepsilon \to 0$.

Remark 5.11. A key step in the above calculations is computing the spectral gap of the symmetrised generators $L_{y,\rho(\cdot|y)}$ for $y \in \mathcal{Y}$, which requires solving an eigenvalue problem. However, since the symmetrised generators are self-adjoint matrices on $\mathbb{R}^{|\Lambda_y|}$ equipped with the weighted inner product $(\cdot, \cdot)_{\rho(\cdot|y)}$, the spectral gap can be obtained from the variational problem
\[
\lambda_y = \sup_{f \in \langle \mathbf{1} \rangle^\perp,\ f \neq 0} \frac{(f, L_{y,\rho(\cdot|y)} f)_{\rho(\cdot|y)}}{(f, f)_{\rho(\cdot|y)}}.
\]

6 Discussion

In this paper, we introduce a generalised version of the classical Poincaré and log-Sobolev inequalities for reference measures which are not the steady state of the underlying Markov chain. We prove that the associated constants have similar properties to the classical variants and also depend continuously on the reference measure. We establish a hierarchy between the constants and obtain an explicit lower bound, which is determined by the smallest positive entries of the reference measure and the generator. Moreover, we find an upper and lower bound in terms of the spectral gap of a symmetrised generator. Finally, we apply our framework to establish exponential convergence of two distributions of the same Markov chain and to obtain quantitative error estimates for an effective dynamics in coarse-graining applications. We now comment on several open questions and interesting directions of future research.

Assessing quality of coarse-graining maps. Consider a Markov chain $L$ on $\mathcal{X}$ and a coarse-graining map $\xi : \mathcal{X} \to \mathcal{Y}$ as in Section 5.3.
The corresponding value $\alpha = \alpha(\xi)$ in (52) measures how well the effective dynamics mimic the coarse-grained dynamics, and can therefore be used as a quality score for the choice of $\xi$. This is exemplified in Section 5.2.3, where the 'natural' choice $\xi$ and the 'bad' choice $\tilde\xi$ for the coarse-graining maps yield $\alpha(\xi) \gg \alpha(\tilde\xi)$. Beyond providing a diagnostic tool, the quality score $\alpha(\xi)$ induces a principled strategy for constructing coarse-graining maps, in which $\alpha(\xi)$ is viewed as an objective function to be maximised. This is in line with the established metastability viewpoint that good macroscopic (or coarse) variables should group states that equilibrate rapidly under the microscopic dynamics, while states separated by slow transitions should remain in distinct coarse clusters.

This gLSI approach differs from the classical lumpability [KS69, Buc94] and spectral [DW05] approaches. Lumpability identifies a valid coarse-grained chain only under very strong algebraic constraints on the generator (each microstate in a cluster must have exactly the same transition rate to every other cluster), a condition that is rarely met in practice. Spectral methods identify metastable sets by analysing the eigenvectors whose eigenvalues are close to the zero eigenvalue of the generator; these eigenvectors encode the chain's global slow modes, and the analysis operates entirely at the equilibrium level. In contrast, the gLSI approach does not rely on exact algebraic structure or stationarity, and measures whether the restricted dynamics on each cluster exhibits sufficient internal contractivity relative to the chosen reference measure. This provides a criterion for selecting coarse clusters that remain dynamically meaningful even when the system is far from equilibrium.
Finally, the variational character of the spectral gap for the symmetrised generator suggests a natural machine learning perspective wherein one can parametrise a family of coarse-graining maps $\xi_\theta$ and optimise $\theta$ so as to maximise $\alpha(\xi_\theta)$.

Connections to $\Gamma$-calculus. A natural question raised by the generalised functional inequalities introduced in this work is whether they admit a geometric characterisation in the spirit of Bakry-Émery theory, especially beyond reversibility. Bakry-Émery $\Gamma$-calculus provides a framework which links curvature-dimension conditions of the type $\Gamma_2 \ge \alpha \Gamma$, where $\Gamma$ is the carré-du-champ operator and $\Gamma_2$ is the corresponding iterated operator, to functional inequalities for diffusion processes [BGL14]. In particular, the curvature condition implies the LSI with constant $\alpha$. The picture changes fundamentally for Markov chains owing to the lack of a chain rule, which is unavailable because of the non-local nature of jump processes. More precisely, for Markov chains the usual carré-du-champ operator
\[
\Gamma(f, g)(z) = \frac{1}{2} \sum_{z' \in \mathcal{Z}} M(z, z') \big( f(z') - f(z) \big) \big( g(z') - g(z) \big)
\]
is well defined. The first derivative of the entropy satisfies
\[
\frac{d}{dt} H_\pi(\varphi_t) = - \sum_{z \in \mathcal{Z}} \pi(z) \, \Gamma(\varphi_t, \log \varphi_t)(z),
\]
where $\varphi_t = \frac{\mu_t}{\pi}$, $\frac{d}{dt} \mu_t = M^T \mu_t$ and $M^T \pi = 0$. However, as opposed to the diffusive case, we cannot control the second derivative of the relative entropy purely in terms of the 'diffusive' $\Gamma_2$ defined for Markov chains as
\[
\Gamma_2(f) = \frac{1}{2} \big( M \Gamma(f, f) - 2 \Gamma(f, M f) \big).
\]
Two nonlinear frameworks have been developed to address this issue in the reversible Markov-chain setting. The entropic-transport approach of [EM12, FM16] defines curvature via convexity of entropy along discrete transport geodesics, while discrete state-space $\Gamma$-calculus [CDPP09, WZ21] introduces an entropy-adapted carré-du-champ $\Gamma_\Upsilon$ leading to the exact entropy identities and the LSI.
Both these approaches are intrinsically nonlinear and rely on reversibility and stationarity. In view of this, it is natural to ask whether the generalised PI and LSI considered in this work can be related to a curvature notion that remains meaningful for non-reversible Markov chains, possibly with a reference measure that is not the steady state.

Countably infinite state space and coarse-grained diffusion processes. Throughout the article, we restrict to continuous-time Markov chains on finite state spaces. However, we expect that the definitions of the generalised functionals, see (16) and (17), and the generalised Poincaré and log-Sobolev inequalities, see (21) and (22), generalise to the case of a countably infinite state space, cf. [HPST20], where an alternative generalisation of the Fisher information (see Remark 3.3) was introduced on countable state spaces. Moreover, we expect that the properties of the generalised functionals in Section 3, and some properties of the gPI and gLSI constants, in particular the continuity, see Theorem 4.4, generalise under appropriate additional assumptions, such as assuming that $M$ has uniformly bounded rates, which guarantees that the sequence $(M(z, z'))_{z' \in \mathcal{Z}}$ is summable for any $z \in \mathcal{Z}$. However, our current proof of the lower bounds presented in Section 4.3 relies heavily on the fact that the reference measure is strictly positive with a uniform lower bound, i.e. $\zeta_* > 0$, which fails in countably infinite state spaces. Therefore, it remains open under what conditions a lower bound can still be obtained in the countably infinite setting. Similarly, the generalisation of the error estimates for clustering of Markov chains (see Section 5.2) to infinite state spaces is an open question, and we refer to [HS24, Section 6] for a discussion.

We now turn to Markov processes on a continuous state space, specifically the coarse-graining of diffusion processes in molecular dynamics.
Here, coarse-graining estimates similar to Section 5.2 have been developed for both the reversible [LLS19, LLO17, ZHS16, LZ19] and the non-reversible settings [DLP+18, LLS19, HNS20], under the key assumption that the conditional steady state of the process, denoted by $\rho(\cdot \,|\, \xi(z) = y)$, satisfies either an LSI or a PI with respect to a coarse-grained generator $L^y$ (the notation follows Section 5.2 to indicate the similarities). In the reversible setting it turns out that $(L^y)^* \rho(\cdot \,|\, \xi(z) = y) = 0$, where $(L^y)^*$ is the adjoint operator, and therefore the LSI follows under classical growth conditions on the coefficients [CGWW09, BGL14]. However, this relationship is not clear in the setting of non-reversible diffusions discussed in [LLS19, HNS20]. In light of these observations, it is natural to ask whether the non-equilibrium principles developed in this paper can be extended to diffusion processes.

Several natural questions follow from here. For instance, how can one formulate practical lower bounds for the diffusion version of the gLSI constant (note that in the diffusion setting the entropy dissipation is the same for both the reversible and the non-reversible setting), and do spectral lower bounds analogous to Theorem 4.7 carry over? It is also natural to wonder about the scaling behaviour of the gPI and gLSI in the presence of explicit scale separation, which connects coarse-graining to classical averaging problems [HNS20]. Finally, can the gLSI (or a suitable lower bound) be used as a diagnostic tool to compare coarse-graining maps? This is especially relevant in the field of molecular dynamics.

Acknowledgements

PvM was supported by JSPS KAKENHI, grant numbers JP20K14358 and JP24K06843. BH was partially supported by the Swedish Research Council (grant no. 2020-00440) and the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), Project-IDs 444753754 and 543917644.

Data availability statement

The supplementary material providing the detailed calculations for the example in Section 5.2.3 is available at https://github.com/Bastian-Hilder/non-equilibrium-functional-ineq, see also [HvMS26].

A Auxiliary results

The following result shows that the classical Poincaré inequality with constant
\[
\tilde\alpha_{\mathrm{PI}}(\zeta) = \inf_{f \in \mathbb{R}^{|\mathcal{Z}|},\ f \notin \langle \mathbf{1} \rangle} \frac{(f, -M f)_\zeta}{\operatorname{var}_\zeta(f)},
\]
see (14), only makes sense when $\zeta$ is the steady state.

Proposition A.1. Let $\zeta \in \mathcal{P}_+(\mathcal{Z})$. We have $\tilde\alpha_{\mathrm{PI}}(\zeta) > 0$ if and only if $\zeta = \pi$. Moreover, if $\zeta \neq \pi$, then $\tilde\alpha_{\mathrm{PI}}(\zeta) = -\infty$.

Proof. Take any $f \in \mathbb{R}^{|\mathcal{Z}|}$ with $f \notin \langle \mathbf{1} \rangle$ and use (15) to decompose it as $f = c \mathbf{1} + h$ with $c \in \mathbb{R}$ and $h \in \langle \mathbf{1} \rangle^{\perp_\zeta}$ with $h \neq 0$. Then,
\[
(f, -M f)_\zeta = (h, -M h)_\zeta + c \, (\mathbf{1}, -M h)_\zeta, \qquad \operatorname{var}_\zeta(f) = (h^2, \mathbf{1})_\zeta = (h, h)_\zeta,
\]
which leads to
\[
\tilde\alpha_{\mathrm{PI}}(\zeta) = \inf_{c \in \mathbb{R}} \ \inf_{h \in \langle \mathbf{1} \rangle^{\perp_\zeta},\ h \neq 0} \frac{(h, -M h)_\zeta + c \, (\mathbf{1}, -M h)_\zeta}{(h, h)_\zeta}.
\]
Clearly $\tilde\alpha_{\mathrm{PI}}(\zeta) = -\infty$ unless $(\mathbf{1}, -M h)_\zeta = 0$ for every $h \in \langle \mathbf{1} \rangle^{\perp_\zeta}$. We claim that the latter is impossible unless $\zeta = \pi$. If this claim holds, then the result follows, since $\tilde\alpha_{\mathrm{PI}}(\pi) = \alpha_{\mathrm{PI}}(\pi) > 0$.

It is left to prove the claim that $\zeta = \pi$ if and only if $(\mathbf{1}, -M h)_\zeta = 0$ for every $h \in \langle \mathbf{1} \rangle^{\perp_\zeta}$. Note that $(\mathbf{1}, -M h)_\zeta = 0$ for every $h \in \langle \mathbf{1} \rangle^{\perp_\zeta}$ if and only if $\zeta$ is an eigenvector of $M^T$. Since $M$ is an irreducible generator, the Perron-Frobenius theorem applies, and states that the only eigenvector of $M^T$ with strictly positive entries is, up to normalisation, the steady state $\pi$.

Remark A.2. While we lack a similar general result for the classical log-Sobolev inequality, it is still straightforward to find examples where the naive generalisation of the classical log-Sobolev constant is negative for any measure $\zeta$ other than the steady state $\pi$.
In fact, already for a two-point state space $\mathcal{Z} = \{1, 2\}$ such an example exists. Indeed, take the generator
\[
M = \begin{pmatrix} -1 & 1 \\ a & -a \end{pmatrix}
\]
for an arbitrary $a > 0$, and take $\zeta = (\beta, 1 - \beta)$ for any $\beta \in (0, 1)$ such that $\zeta \neq \pi = \big( \tfrac{a}{1+a}, \tfrac{1}{1+a} \big)$. Then, for any density $\varphi \in \mathcal{D}_\zeta(\mathcal{Z})$ we calculate the naive generalisation of the relative Fisher information, see (8):
\[
(\varphi, -M \log \varphi)_\zeta = \log\!\left( \frac{\varphi_1}{\varphi_2} \right) \big( \varphi_1 \beta - \varphi_2 (1 - \beta) a \big) = \varphi_2 \log\!\left( \frac{\varphi_1}{\varphi_2} \right) \left( \frac{\varphi_1}{\varphi_2} \beta - (1 - \beta) a \right) =: \varphi_2 \, F\!\left( \frac{\varphi_1}{\varphi_2} \right).
\]
A straightforward extreme value analysis of $F$ then shows that there exists a $\gamma > 0$ such that $F(\gamma) < 0$ (we use here $\beta \neq \frac{a}{1+a}$, that is, $\zeta \neq \pi$). Recalling that the relative entropy $H_\zeta$ is non-negative, this shows that the naive generalisation of the classical log-Sobolev inequality constant is generically negative.

References

[AF99] D. Aldous and J. Fill. Reversible Markov chains and random walks on graphs. 1999. Available at https://www.stat.berkeley.edu/users/aldous/RWG/book.pdf.
[AT17] M. P. Allen and D. J. Tildesley. Computer simulation of liquids. Oxford University Press, 2nd edition, 2017.
[BGL14] D. Bakry, I. Gentil, and M. Ledoux. Analysis and Geometry of Markov Diffusion Operators, volume 348 of Grundlehren der mathematischen Wissenschaften. Springer International Publishing, 2014.
[Bra02] A. Braides. Gamma-convergence for Beginners, volume 22 of Oxford Lecture Series in Mathematics and its Applications. Oxford University Press, 2002.
[BT06] S. G. Bobkov and P. Tetali. Modified logarithmic Sobolev inequalities in discrete settings. Journal of Theoretical Probability, 19(2):289–336, 2006.
[Buc94] P. Buchholz. Exact and Ordinary Lumpability in Finite Markov Chains. Journal of Applied Probability, 31(1):59–75, 1994.
[CDPP09] P. Caputo, P. Dai Pra, and G. Posta. Convex entropy decay via the Bochner–Bakry–Emery approach. Annales de l'Institut Henri Poincaré, Probabilités et Statistiques, 45(3):734–751, 2009.
[CGP05] Y. Cao, D. T. Gillespie, and L. R. Petzold. The slow-scale stochastic simulation algorithm. The Journal of Chemical Physics, 122(1):014116, 2005.
[CGWW09] P. Cattiaux, A. Guillin, F.-Y. Wang, and L. Wu. Lyapunov conditions for super Poincaré inequalities. Journal of Functional Analysis, 256(6):1821–1841, 2009.
[Cho03] A. J. Chorin. Conditional expectations and renormalization. SIAM Multiscale Modeling and Simulation, 1(1):105–118, 2003.
[Com18] P. L. Combettes. Perspective Functions: Properties, Constructions, and Examples. Set-Valued and Variational Analysis, 26(2):247–264, 2018.
[DLP+18] M. H. Duong, A. Lamacz, M. A. Peletier, A. Schlichting, and U. Sharma. Quantification of coarse-graining error in Langevin and overdamped Langevin dynamics. Nonlinearity, 31(10):4517, 2018.
[DSC96] P. Diaconis and L. Saloff-Coste. Logarithmic Sobolev inequalities for finite Markov chains. The Annals of Applied Probability, 6(3):695–750, 1996.
[DW05] P. Deuflhard and M. Weber. Robust Perron cluster analysis in conformation dynamics. Linear Algebra and its Applications, 398:161–184, 2005.
[EF18] M. Erbar and M. Fathi. Poincaré, modified logarithmic Sobolev and isoperimetric inequalities for Markov chains with non-negative Ricci curvature. Journal of Functional Analysis, 274(11):3056–3089, 2018.
[ELVE05] W. E, D. Liu, and E. Vanden-Eijnden. Nested stochastic simulation algorithm for chemical kinetic systems with disparate rates. The Journal of Chemical Physics, 123(19), 2005.
[EM12] M. Erbar and J. Maas. Ricci Curvature of Finite Markov Chains via Convexity of the Entropy. Archive for Rational Mechanics and Analysis, 206:997–1038, 2012.
[FM16] M. Fathi and J. Maas. Entropic Ricci curvature bounds for discrete interacting systems. The Annals of Applied Probability, 26(3):1774–1806, 2016.
[FSW18] K. Fackeldey, A. Sikorski, and M. Weber. Spectral clustering for non-reversible Markov chains. Computational and Applied Mathematics, 37(5):6376–6391, 2018.
[Hil17] B. Hilder. An FIR inequality for Markov jump processes on discrete state spaces. Master's thesis, Eindhoven University of Technology / University of Stuttgart, 2017. https://pure.tue.nl/ws/portalfiles/portal/89094760/1037422_Hilder.b.pdf.
[HNS20] C. Hartmann, L. Neureither, and U. Sharma. Coarse graining of nonreversible stochastic differential equations: Quantitative results and connections to averaging. SIAM Journal on Mathematical Analysis, 52(3):2689–2733, 2020.
[HPST20] B. Hilder, M. A. Peletier, U. Sharma, and O. Tse. An inequality connecting entropy distance, Fisher information and large deviations. Stochastic Processes and their Applications, 130(5):2596–2638, 2020.
[HS24] B. Hilder and U. Sharma. Quantitative Coarse-Graining of Markov Chains. SIAM Journal on Mathematical Analysis, 56(1):913–954, 2024.
[HvMS26] B. Hilder, P. van Meurs, and U. Sharma. Supplementary material for "Non-equilibrium functional inequalities for Markov chains", 2026. Available at https://github.com/Bastian-Hilder/non-equilibrium-functional-ineq.
[KS69] J. G. Kemeny and J. L. Snell. Finite Markov chains, volume 26. van Nostrand, Princeton, NJ, 1969.
[KW07] S. Kube and M. Weber. A coarse graining method for the identification of transition rates between molecular conformations. The Journal of Chemical Physics, 126(2):024103, 2007.
[Lah13] S. Lahbabi. Étude mathématique de modèles quantiques et classiques pour les matériaux aléatoires à l'échelle atomique. PhD thesis, Université de Cergy-Pontoise, 2013.
[LL10] F. Legoll and T. Lelièvre. Effective dynamics using conditional expectations. Nonlinearity, 23(9):2131–2163, 2010.
[LL13] S. Lahbabi and F. Legoll. Effective dynamics for a kinetic Monte–Carlo model with slow and fast time scales. Journal of Statistical Physics, 153(6):931–966, 2013.
[LLO17] F. Legoll, T. Lelièvre, and S. Olla. Pathwise estimates for an effective dynamics. Stochastic Processes and their Applications, 127(9):2841–2863, 2017.
[LLS19] F. Legoll, T. Lelièvre, and U. Sharma. Effective dynamics for non-reversible stochastic differential equations: a quantitative study. Nonlinearity, 32(12):4779, 2019.
[LMS25] C. Landim, D. Marcondes, and I. Seo. A resolvent approach to metastability. Journal of the European Mathematical Society, 27(4):1563–1618, 2025.
[LP17] D. A. Levin and Y. Peres. Markov chains and mixing times, volume 107. American Mathematical Soc., 2017.
[LZ19] T. Lelièvre and W. Zhang. Pathwise estimates for effective dynamics: the case of nonlinear vectorial reaction coordinates. SIAM Multiscale Modeling and Simulation, 17(3):1019–1051, 2019.
[Mat] Mathematica. Version 14.1. Wolfram Research, Inc. https://www.wolfram.com/mathematica.
[Mic18] L. Miclo. Some drawbacks of finite modified logarithmic Sobolev inequalities. Mathematica Scandinavica, pages 147–159, 2018.
[MS20] A. Mielke and A. Stephan. Coarse-graining via EDP-convergence for linear fast-slow reaction systems. Mathematical Models and Methods in Applied Sciences, 30(09):1765–1807, 2020.
[MT06] R. Montenegro and P. Tetali. Mathematical aspects of mixing times in Markov chains. Foundations and Trends in Theoretical Computer Science, 1(3):237–354, 2006.
[MT09] S. P. Meyn and R. L. Tweedie. Markov chains and stochastic stability. Cambridge University Press, 2009.
[PR23] M. A. Peletier and D. M. Renger. Fast reaction limits via Γ-convergence of the flux rate functional. Journal of Dynamics and Differential Equations, 2023.
[PS08] G. A. Pavliotis and A. Stuart. Multiscale methods: averaging and homogenization. Springer Science & Business Media, 2008.
[PWS+11] J.-H. Prinz, H. Wu, M. Sarich, B. Keller, M. Senne, M. Held, J. D. Chodera, C. Schütte, and F. Noé. Markov models of molecular kinetics: Generation and validation. The Journal of Chemical Physics, 134(17), 2011.
[SC97] L. Saloff-Coste. Lectures on finite Markov chains. Lectures on probability theory and statistics, pages 301–413, 1997.
[SFHD99] C. Schütte, A. Fischer, W. Huisinga, and P. Deuflhard. A direct approach to conformational dynamics based on hybrid Monte Carlo. Journal of Computational Physics, 151(1):146–168, 1999.
[SS13] C. Schütte and M. Sarich. Metastability and Markov state models in molecular dynamics, volume 24. American Mathematical Soc., 2013.
[SS19] A. Schlichting and M. Slowik. Poincaré and logarithmic Sobolev constants for metastable Markov chains via capacitary inequalities. The Annals of Applied Probability, 29(6):3438–3488, 2019.
[Tuc23] M. E. Tuckerman. Statistical mechanics: theory and molecular simulation. Oxford University Press, 2023.
[WZ21] F. Weber and R. Zacher. The entropy method under curvature-dimension conditions in the spirit of Bakry-Émery in the discrete setting of Markov chains. Journal of Functional Analysis, 281(5):109061, 2021.
[Zha16] W. Zhang. Asymptotic analysis of multiscale Markov chain. arXiv:1512.08944v2, 2016.
[ZHS16] W. Zhang, C. Hartmann, and C. Schütte. Effective dynamics along given reaction coordinates, and reaction rate theory. Faraday Discussions, 195:365–394, 2016.
