Cramér-type moderate deviation for double index permutation statistics

We establish a Cramér-type moderate deviation theorem for double-index permutation statistics (DIPS). To the best of our knowledge, previous results only provided Berry-Esseen type bounds for DIPS, which cannot yield moderate deviation results and ar…

Authors: Songhao Liu, Qiman Shao, Jingyu Xu

Submitted to Bernoulli

Cramér-type moderate deviation for double index permutation statistics

SONG-HAO LIU (1,a), QI-MAN SHAO (2,3,b) and JING-YU XU (3,c)

(1) School of Mathematical Sciences, Dalian University of Technology, Dalian, Liaoning, China, (a) liusonghao@dlut.edu.cn
(2) Department of Statistics and Data Science, Shenzhen International Center for Mathematics, Southern University of Science and Technology, Shenzhen, Guangdong, 518055, China, (b) shaoqm@sustech.edu.cn
(3) Department of Statistics and Data Science, Southern University of Science and Technology, Shenzhen, Guangdong, 518055, China, (c) 12131253@mail.sustech.edu.cn

We establish a Cramér-type moderate deviation theorem for double-index permutation statistics (DIPS). To the best of our knowledge, previous results only provided Berry-Esseen type bounds for DIPS, which cannot yield moderate deviation results and are insufficient to capture the optimal convergence rates for some relatively sparse DIPS. Our result overcomes these limitations: it not only recovers the optimal convergence rates for classical DIPS, such as the Mann-Whitney-Wilcoxon statistic, but also extends to sparse statistics, including the number of descents in permutations and Chatterjee's rank correlation coefficient, for which previous approaches do not apply. To prove this result, we establish a Cramér-type moderate deviation of normal approximation for bounded exchangeable pairs. Compared with existing results, our theorem requires conditions that are easier to verify.

Keywords: Stein's method; exchangeable pair approach; Cramér-type moderate deviation; double index permutation statistics

1. Introduction

Let $\{\xi(i,j,k,l)\}_{i,j,k,l\in[n]}$ be a 4-index array of real numbers.
We are interested in the double-indexed permutation statistics (DIPS) of the general form $\sum_{i,j}\xi(i,j,\pi(i),\pi(j))$, where $\pi$ is a random permutation chosen uniformly from $S_n$ (the symmetric group of degree $n$). The DIPS of the restricted form $\sum_{i,j} a_{ij} b_{\pi(i)\pi(j)}$ was first investigated by Daniels (1944) in the study of the generalized correlation coefficient, with Kendall's $\tau$ and Spearman's $\rho$ being special cases. Daniels gave a set of sufficient conditions for their asymptotic normality as $n\to\infty$. Later, various results under weakened conditions were introduced in Bloemena (1964), Jogdeo (1968), Abe (1969), Shapiro and Hubert (1979), Barbour and Eagleson (1986) and Pham et al. (1989). DIPS have been used in diverse areas: by Friedman and Rafsky (1979) (the paper that started graph-based tests), Friedman and Rafsky (1983), Schilling (1986) and Vigna (2015) in graph-based tests, by Hubert and Schultz (1976) in clustering studies, by Cliff and Ord (1981) in geography, and by Chatterjee (2021) and Shi et al. (2022) in testing dependence.

The asymptotic properties of DIPS have also been widely studied. Using Stein's method, Zhao et al. (1997) proved a Berry-Esseen type theorem for general form DIPS. However, their results do not apply to statistics such as the number of descents, the number of inversions of a permutation, or Chatterjee's rank correlation coefficient, which appear to be "too sparse". In contrast, by constructing a special exchangeable pair, Fulman (2004) obtained a convergence rate of order $n^{-1/2}$ in the Kolmogorov metric for both the number of descents and the number of inversions of a permutation. Unfortunately, this method cannot be applied to general form DIPS.
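To make the general form concrete, the following minimal Python sketch evaluates $\sum_{i,j}\xi(i,j,\pi(i),\pi(j))$ for a kernel supplied as a function and compares it with direct counts for two of the sparse examples mentioned above. The function names (`dips`, `descent_kernel`, `inversion_kernel`) and the specific indicator kernels are illustrative assumptions rather than notation from the paper, and indices are 0-based.

```python
import random

def dips(xi, perm):
    """Evaluate sum_{i,j} xi(i, j, perm[i], perm[j]) over all index pairs (0-based)."""
    n = len(perm)
    return sum(xi(i, j, perm[i], perm[j]) for i in range(n) for j in range(n))

def descent_kernel(i, j, k, l):
    # Assumed kernel: 1 when j is the position right after i and the values drop.
    return 1 if (j == i + 1 and k > l) else 0

def inversion_kernel(i, j, k, l):
    # Assumed kernel: 1 when positions are in order but values are out of order.
    return 1 if (i < j and k > l) else 0

def count_descents(perm):
    """Direct count of positions i with perm[i] > perm[i+1]."""
    return sum(1 for i in range(len(perm) - 1) if perm[i] > perm[i + 1])

def count_inversions(perm):
    """Direct count of pairs i < j with perm[i] > perm[j]."""
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])

rng = random.Random(0)
perm = list(range(8))
rng.shuffle(perm)  # a uniformly random permutation of {0, ..., 7}
print(dips(descent_kernel, perm), count_descents(perm))
print(dips(inversion_kernel, perm), count_inversions(perm))
```

The descent kernel is nonzero on only $n-1$ of the $n^2$ position pairs $(i,j)$, which is one way to read the word "sparse" in this context.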
Later, Fang and Röllin (2015) extended the result of Fulman (2004) to the multivariate setting and removed a certain condition arising from the requirement of exchangeability. Nevertheless, the results of both Fulman (2004) and Fang and Röllin (2015) address only certain special cases of DIPS and are not applicable to the general setting. Moreover, none of the existing works provides Cramér-type moderate deviation results for DIPS.

A Berry-Esseen bound describes the absolute error of a distributional approximation, while a Cramér-type moderate deviation describes the relative error. More precisely, let $\{Y_n\}_{n\ge 1}$ be a sequence of random variables that converges to $Y$ in distribution; a Cramér-type moderate deviation result states that
\[
\frac{P(Y_n > x)}{P(Y > x)} = 1 + \text{error term} \to 1, \qquad 0 \le x \le a_n,
\]
where $a_n\to\infty$. In particular, for the normalized sum of i.i.d. random variables with finite moment generating functions, the range $0 \le x \le n^{1/6}$ and the order of the error term $n^{-1/2}(1+x^3)$ are optimal; see Petrov (2012) for details.

The purpose of this paper is to establish a Cramér-type moderate deviation theorem for double-indexed permutation statistics (DIPS) of the general form $\sum_{i,j}\xi(i,j,\pi(i),\pi(j))$ with an optimal convergence rate. We hope our results can be used to yield the optimal convergence rate not only for well-known statistics, such as Kendall's $\tau$, Spearman's $\rho$ and the Mann-Whitney-Wilcoxon statistic, but also for "sparse" statistics such as the number of descents and the number of inversions of permutations and Chatterjee's rank correlation coefficient. To achieve this goal, we apply Stein's method and the exchangeable pair approach to the above statistics. Stein's method was first introduced by Stein (1972); an introduction to and a survey of Stein's method can be found in Chen et al. (2010).
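The difference between absolute and relative error can be seen numerically in the simplest i.i.d. case. The sketch below is only an illustration and is not part of the paper's argument: it takes $Y_n$ to be a standardized Binomial$(n,1/2)$ sum, computes $P(Y_n > x)$ exactly, and compares it with the standard normal tail $1-\Phi(x)$. The binomial model and the helper names are assumptions chosen for the example.

```python
import math

def normal_tail(x):
    """1 - Phi(x) for the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def binomial_tail(n, x):
    """Exact P(Y_n > x), where Y_n = (S_n - n/2) / (sqrt(n)/2) and S_n ~ Binomial(n, 1/2)."""
    threshold = n / 2.0 + x * math.sqrt(n) / 2.0
    first = math.floor(threshold) + 1  # smallest integer strictly above the threshold
    total = sum(math.comb(n, s) for s in range(max(first, 0), n + 1))
    return total / 2 ** n

def relative_error(n, x):
    """P(Y_n > x) / (1 - Phi(x)) - 1, the quantity a Cramer-type result controls."""
    return binomial_tail(n, x) / normal_tail(x) - 1.0

def absolute_error(n, x):
    """|P(Y_n > x) - (1 - Phi(x))|, the quantity a Berry-Esseen bound controls."""
    return abs(binomial_tail(n, x) - normal_tail(x))

for x in (0.0, 1.0, 2.0, 3.0):
    print(x, absolute_error(400, x), relative_error(400, x))
```

Far in the tail the absolute error is small simply because both probabilities are small, so it says little about the quality of the approximation there; the ratio remains informative, which is why the relative error is the natural object in the moderate deviation range.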
The exchangeable pair approach of Stein's method is a powerful tool for estimating convergence rates in distributional approximation. Chen et al. (2013a) developed the method to prove Cramér-type moderate deviation results in normal approximation without skewness correction for dependent random variables under a boundedness condition. Chen et al. (2013b) and Shao et al. (2021) considered Poisson approximation and nonnormal approximations, respectively. Zhang (2023) refined the results in Chen et al. (2013a) by relaxing the boundedness condition.

In the exchangeable pair approach, let $W$ be the random variable of interest; we say that $(W, W')$ is an exchangeable pair if $(W, W') \stackrel{d}{=} (W', W)$. Let $\Delta = W - W'$. It is often assumed (see, e.g., Rinott and Rotar (1997)) that there exist a constant $\lambda > 0$ and a random variable $R$ such that
\[
E\{\Delta \mid W\} = \lambda (W + R).
\]
Unfortunately, this assumption is not satisfied for general DIPS. Zhang (2023) relaxed this assumption, assuming instead that there exists $D := \Psi(W, W')$, where $\Psi$ is an antisymmetric function, satisfying $E(D \mid W) = \lambda(W + R)$. We can therefore construct a suitable $D$ for general DIPS and use the exchangeable pair approach to derive a Cramér-type moderate deviation.

By employing the tools mentioned above, we establish a Cramér-type moderate deviation result for bounded exchangeable pairs in Theorem 4.1, which can be viewed as a special case of the result obtained in Zhang (2023). The key distinction is that we optimize the proof so that our theorem removes a technical condition in Zhang (2023) which is not easy to verify in practice. Building on this result, we further derive a Cramér-type moderate deviation result for doubly-indexed permutation statistics in Theorem 2.1.
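As a small sanity check of the definition, the sketch below verifies exchangeability for one standard construction, which is also the one used later in the proof of Theorem 2.1: draw $\pi$ uniformly from $S_n$, draw an ordered pair of distinct positions $(I, J)$ uniformly, and let $\pi'$ be $\pi$ with the values at positions $I$ and $J$ interchanged. The code enumerates all outcomes for a small $n$ and checks that the joint law of $(\pi, \pi')$ is symmetric. The function names are hypothetical and the brute-force enumeration is only meant for tiny $n$.

```python
from collections import Counter
from itertools import permutations

def swap_pair_counts(n):
    """Count each outcome (pi, pi') as pi ranges over S_n and (I, J) over ordered pairs I != J."""
    counts = Counter()
    for pi in permutations(range(n)):
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                swapped = list(pi)
                swapped[i], swapped[j] = swapped[j], swapped[i]
                counts[(pi, tuple(swapped))] += 1
    return counts

def is_exchangeable(counts):
    # All outcomes carry equal weight, so exchangeability is symmetry of the joint counts.
    return all(counts[(q, p)] == c for (p, q), c in counts.items())

print(is_exchangeable(swap_pair_counts(4)))
```

Since any statistic $W = \varphi(\pi)$ and its counterpart $W' = \varphi(\pi')$ are obtained by applying the same function to both coordinates, exchangeability of $(\pi, \pi')$ carries over to $(W, W')$.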
This theorem overcomes the limitations of former results: it applies not only to classical DIPS but also to relatively sparse ones, and in both cases we are able to achieve the optimal convergence rate.

The rest of this paper is organized as follows. In Section 2, we present our main result, Theorem 2.1, a Cramér-type moderate deviation theorem for double index permutation statistics. In Section 3, we provide applications of our results to some well-known statistics such as the Mann-Whitney-Wilcoxon statistic, and to some "sparse" statistics such as the number of descents and inversions of permutations and Chatterjee's rank correlation coefficient. In Section 4, we prove the general bound and the applications.

2. Main Results

Let $\{\xi(i,j,k,l)\}_{i,j,k,l\in[n]}$ be real numbers. The double indexed permutation statistic (DIPS) is defined as
\[
\mathrm{DIPS} = \sum_{i,j=1}^{n} \xi(i,j,\pi(i),\pi(j)), \tag{2.1}
\]
where $\pi$ is a random permutation chosen uniformly from $S_n$ (the symmetric group of degree $n$). Inspired by the proof of the Combinatorial Central Limit Theorem, Zhao et al. (1997) converted the general form of DIPS (2.1) to the form
\[
\sum_{i=1}^{n} a(i,\pi(i)) + \sideset{}{'}\sum_{i,j} b(i,j,\pi(i),\pi(j)),
\]
where $\sum'_{i,j}$ denotes summation over $i \ne j$, $\{a(i,k)\}_{i,k\in[n]}$ is a real matrix and $\{b(i,j,k,l)\}_{i,j,k,l\in[n]}$ is a 4-index real array. We also use this normalized form of DIPS and consider some boundedness conditions that hold true in most cases.
For simplicity, we use the following notation throughout this article:
\[
a(i,\cdot) = \frac{1}{n}\sum_{k} a(i,k), \qquad a(\cdot,\cdot) = \frac{1}{n^2}\sum_{i,k} a(i,k),
\]
\[
b(i,j,k,\cdot) = \frac{1}{n}\sum_{l} b(i,j,k,l), \qquad b(i,j,\cdot,\cdot) = \frac{1}{n^2}\sum_{k,l} b(i,j,k,l),
\]
\[
b(i,\cdot,\cdot,\cdot) = \frac{1}{n^3}\sum_{j,k,l} b(i,j,k,l), \qquad b(\cdot,\cdot,\cdot,\cdot) = \frac{1}{n^4}\sum_{i,j,k,l} b(i,j,k,l),
\]
with averages over other index positions defined analogously. Throughout this paper, $\sum'_{i,j}$ denotes $\sum_{i,j,\, i\ne j}$.

Proposition 2.1. By a suitable normalization, the general DIPS (2.1) can be converted into the form
\[
W_n = \sum_{i} a(i,\pi(i)) + \sideset{}{'}\sum_{i,j} b(i,j,\pi(i),\pi(j)), \tag{2.2}
\]
where $\{a(i,k)\}_{i,k\in[n]}$ is a real matrix and $\{b(i,j,k,l)\}_{i,j,k,l\in[n]}$ is a 4-index real array. Here $\{a(i,k)\}_{i,k\in[n]}$ satisfies
\[
a(i,k) \equiv 0 \quad\text{or}\quad \sum_{i,k} a^2(i,k) = n-1, \tag{2.3}
\]
and
\[
a(i,\cdot) = a(\cdot,k) = 0, \tag{2.4}
\]
and, no matter what $\{a(i,k)\}_{i,k\in[n]}$ is, $\{b(i,j,k,l)\}_{i,j,k,l\in[n]}$ satisfies
\[
b(i,j,k,\cdot) = b(i,j,\cdot,l) = b(i,\cdot,k,l) = b(\cdot,j,k,l) = 0. \tag{2.5}
\]

In earlier work, Zhao et al. (1997) derived a Berry-Esseen bound for DIPS, but their result has notable limitations. First, it only applies to the case where $\{a(i,k)\}_{i,k\in[n]} \not\equiv 0$ in definition (2.2); when $\{a(i,k)\}_{i,k\in[n]} \equiv 0$, the result does not yield a convergence rate. However, in practice, there exist important double-index permutation statistics satisfying $\{a(i,k)\}_{i,k\in[n]} \equiv 0$, such as Chatterjee's rank correlation coefficient, which has recently become popular for testing independence between random variables. Moreover, even in the case $\{a(i,k)\}_{i,k\in[n]} \not\equiv 0$, the result in Zhao et al.
(1997) is still limited: for certain statistics, including the number of descents (Des) and the number of inversions (Inv) of a permutation, it fails to provide the optimal convergence rate. Although Fulman (2004) provided a Berry-Esseen bound for both Des and Inv, that result is only suitable for the specific case $\sum_{i,j} 1\{i < j\} M_{\pi(i),\pi(j)}$, where $M = (M_{i,j})$ is a real antisymmetric $n \times n$ matrix, and cannot be applied to the general DIPS.

The purpose of our result is to present the convergence rate of the normal approximation for general double index permutation statistics and to overcome the limitations of previous results. In Theorem 2.1, we give a Cramér-type moderate deviation for the DIPS under some boundedness conditions.

Theorem 2.1. For the double index permutation statistic $W_n$ defined as (2.2) in Proposition 2.1, assume that for some constant $0 < \delta < 1$ the following boundedness conditions hold:
\[
\max_{i,k} |a(i,k)| \le \delta, \qquad \max_{i,\pi} \sum_{j} |b(i,j,\pi(i),\pi(j))| \le \delta, \qquad \max_{i,j,k,l} |b(i,j,k,l)| \le \delta,
\]
\[
\max_{s\in\{1,2\},\, t\in\{1,2\}} \ \max_{i_s\in[n],\, k_t\in[n]} \ \sum_{i_{3-t}\in[n],\, j_{3-t}\in[n]} |b(i_1,i_2,k_1,k_2)| \le \delta. \tag{2.6}
\]
Without loss of generality, assume that $E\{W_n^2\} = 1 + C/\sqrt{n}$, where $C$ is a constant. For any $\theta > 0$, let
\[
\tau(\theta) := \max\{0 \le t \le 1/\delta : t^3\delta + \sqrt{n}\,\delta^3 t^2 + n\delta^3 t^3 + t\delta + t^2/n \le \theta\}.
\]
Then for any $0 \le z \le \tau(\theta)$,
\[
\left| \frac{P(W_n > z)}{1-\Phi(z)} - 1 \right| \le C_1 e^{\theta} (1+z^2)\left(\sqrt{n}\,\delta^2 + n\delta^3 + n\delta^3 z + \delta\right). \tag{2.7}
\]

Remark 1. The boundedness conditions in (2.6) are crucial for establishing the optimal convergence rate in Theorem 2.1. They ensure that the contributions from the various components of the double index permutation statistic are controlled, allowing for precise asymptotic analysis.
Although the boundedness condition may appear complicated, it is in fact satisfied by the vast majority of doubly-indexed permutation statistics, including Kendall's $\tau$, Spearman's $\rho$, the Mann-Whitney-Wilcoxon statistic, and Chatterjee's rank correlation coefficient. If $\delta = O(1/\sqrt{n})$, then Theorem 2.1 yields the optimal convergence rate
\[
\left| \frac{P(W_n > z)}{1-\Phi(z)} - 1 \right| = O(1)(1+z^3)/\sqrt{n}, \qquad 0 \le z \le n^{1/6}. \tag{2.8}
\]

Remark 2. In contrast, without such restrictions, the convergence rate obtained in our theorem inevitably includes an extra term of the form $z^3 \sum_{i,j,k,l} |b(i,j,k,l)|^3$. A similar term $\sum_{i,j,k,l} |b(i,j,k,l)|^3$ also appears in the Berry-Esseen bound of Zhao et al. (1997), which slows down the rate of convergence. For many classical double index permutation statistics, such as Kendall's $\tau$, Spearman's $\rho$, etc., we have $\max_{i,j,k,l} |b(i,j,k,l)| = O(1/n^{3/2})$, so that $\sum_{i,j,k,l} |b(i,j,k,l)|^3$ still yields the optimal convergence rate. However, for some relatively "sparse" statistics, such as the number of descents and inversions of a permutation, Chatterjee's rank correlation coefficient, etc., we only have $\max_{i,j,k,l} |b(i,j,k,l)| = O(1/n^{1/2})$. This arises because an indicator function always appears in the definition of such statistics; for example, $b(i,j,k,l) = 1\{i = j+1\}\, O(1/n^{1/2})$. In these cases, the term $\sum_{i,j,k,l} |b(i,j,k,l)|^3$ does not even tend to 0, and thus the results of previous studies are not applicable.
By contrast, the boundedness condition proposed in our theorem is still satisfied in this setting, allowing us to achieve the optimal convergence rate even for such sparse statistics.

3. Applications

In this section, the application of Theorem 2.1 is demonstrated by three examples. In addition to well-known test statistics, Theorem 2.1 also applies to some relatively "sparse" statistics such as the number of descents and inversions of a permutation, and to Chatterjee's rank correlation coefficient, a statistic for independence testing that has recently become very popular.

3.1. Chatterjee's rank correlation coefficient

Chatterjee (2021) introduced a novel and concise rank-based statistic that has recently gained considerable attention. Unlike traditional measures, this statistic corresponds to a population quantity proposed by Dette et al. (2013) that characterizes independence between random variables; moreover, its distribution under independence can be described by an asymptotic normal law. These characteristics make Chatterjee's coefficient a particularly appealing tool for both theoretical investigation and practical applications in dependence modeling. Furthermore, Chatterjee's rank correlation is a rank-based statistic and can be expressed as a double index permutation statistic (2.1). Therefore, we can use Theorem 2.1 to obtain its optimal convergence rate.

Let $(X, Y)$ be a pair of random variables, where $Y$ is not a constant. Let $(X_1, Y_1), \dots, (X_n, Y_n)$ be i.i.d. pairs with the same law as $(X, Y)$, where $n \ge 2$. Suppose that the $X_i$'s and $Y_i$'s have no ties. Rearrange the data as $(X_{(1)}, Y_{(1)}), \dots, (X_{(n)}, Y_{(n)})$ such that $X_{(1)} \le \cdots \le X_{(n)}$. Let $r_i$ be the rank of $Y_{(i)}$, that is, the number of $j$ such that $Y_{(j)} \le Y_{(i)}$.
Chatterjee's rank correlation coefficient, in normalized form, is defined as
\[
W = \sqrt{\frac{5n}{2}} \left( 1 - \frac{3\sum_{i=1}^{n-1} |r_{i+1} - r_i|}{n^2 - 1} \right). \tag{3.1}
\]
The asymptotic normality of this statistic is a corollary of the theorem in Chao et al. (1993), but an optimal convergence rate has not previously been obtained. The following theorem provides the optimal convergence rate.

Theorem 3.1. Let $W$ be defined as in (3.1). Then
\[
\left| \frac{P(W > z)}{1-\Phi(z)} - 1 \right| = O(1)(1+z^3)/\sqrt{n}, \tag{3.2}
\]
for $0 \le z \le n^{1/6}$.

Remark 3. Chatterjee (2021) mentioned that the asymptotic property of the statistic (3.1) is essentially a restatement of the main theorem of Chao et al. (1993). They consider an estimator called the "oscillation of permutation", defined as
\[
\sum_{i=1}^{n-1} |\pi(i) - \pi(i+1)|.
\]
Chatterjee's rank correlation coefficient is a normalized "oscillation of permutation". However, Chao et al. (1993) merely present the asymptotic property and do not provide the convergence rate. In a subsequent article, Chao et al. (1996) give a Berry-Esseen bound for $\sum_{i=1}^{n} \alpha_{\pi(i)\pi(i+1)}$. Taking $\alpha_{ij} = |i - j|$ gives the "oscillation of permutation", and hence, after normalization, Chatterjee's rank correlation coefficient. However, when the result in Chao et al. (1996) is applied to Chatterjee's rank correlation, the optimal convergence rate cannot be achieved. Therefore, our result appears to be the first to present the optimal convergence rate for the oscillation of permutation, and hence for Chatterjee's rank correlation coefficient.

3.2. The number of descents and inversions of a permutation

Let $M$ be a real $n \times n$ matrix and assume that $M$ is antisymmetric, that is, for each $u, v \in \{1, \dots, n\}$ we have $M_{uv} = -M_{vu}$. Note that $M_{uu} = 0$. Let $\pi$ be a permutation of size $n$, chosen uniformly from $S_n$, and consider the statistic
\[
W = \sum_{i,j:\, i < j} M_{\pi(i)\pi(j)}.
\]
This permutation statistic was considered in earlier works such as Fulman (2004) and Fang and Röllin (2015), and it is a special case of doubly-indexed permutation statistics:
\[
W = \sum_{i,j} 1\{i < j\} M_{\pi(i)\pi(j)}. \tag{3.3}
\]
The reason to study (3.3) is that two important properties of permutations, the number of descents and the number of inversions, can be represented in this form. Choosing $M_{u,u+1} = -1$ and $M_{uv} = 0$ for all other $v > u$ (for $v < u$, $M_{uv}$ is defined via antisymmetry), (3.3) becomes $2\,\mathrm{Des}(\pi^{-1}) - (n-1)$, where $\mathrm{Des}(\pi)$ is the number of descents of $\pi$; with $M_{uv} = -1$ for all $u < v$, (3.3) becomes $2\,\mathrm{Inv}(\pi^{-1}) - C_n^2$, where $\mathrm{Inv}(\pi)$ is the number of inversions of $\pi$ and $C_n^2 = n(n-1)/2$.

Using Stein's method, Zhao et al. (1997) proved a general Berry-Esseen type theorem for double indexed permutation statistics, but their results do not apply to the number of descents $\mathrm{Des}(\pi)$, which seems to be "too sparse". Using a special exchangeable pair, Fulman (2004) was able to obtain a rate of convergence of $n^{-1/2}$ in the Kolmogorov metric for both the number of descents and the number of inversions. Our results go further: we not only obtain the optimal convergence rate for well-known classical statistics, but also achieve the optimal results for these "sparse" statistics. In the following theorem we consider the normalized number of descents and inversions of a random permutation,
\[
W = \frac{\mathrm{Des} - (n-1)/2}{\sqrt{(n+1)/12}}, \qquad T = \frac{\mathrm{Inv} - C_n^2/2}{\sqrt{n(n-1)(2n+5)/72}}. \tag{3.4}
\]

Theorem 3.2. Let $\mathrm{Des}(\pi)$ and $\mathrm{Inv}(\pi)$ be the number of descents and inversions of a random permutation $\pi \in S_n$, and let $W$ and $T$ be the normalized statistics defined in (3.4).
Then we have
\[
\left| \frac{P(W > z)}{1-\Phi(z)} - 1 \right| = O(1)(1+z^3)/\sqrt{n}, \tag{3.5}
\]
\[
\left| \frac{P(T > z)}{1-\Phi(z)} - 1 \right| = O(1)(1+z^3)/\sqrt{n}, \tag{3.6}
\]
for $0 \le z \le n^{1/6}$.

3.3. The Mann-Whitney-Wilcoxon statistic

The Mann-Whitney-Wilcoxon statistic is a U-statistic of degree two. Let $x_1, \dots, x_{n_1}$ and $y_1, \dots, y_{n_2}$, $n_1 + n_2 = n$, be independent univariate random samples from unknown continuous distributions $F_X$ and $F_Y$, respectively. The Mann-Whitney-Wilcoxon statistic for testing the hypothesis $H_0 : F_X = F_Y$ is defined to be the total number of pairs $(x_i, y_j)$ for which $x_i < y_j$. Let $\pi(i)$, $i = 1, \dots, n_1$, denote the rank of $x_i$ and $\pi(n_1 + j)$, $j = 1, \dots, n_2$, denote that of $y_j$ in the combined sample. Then the Mann-Whitney-Wilcoxon statistic can be expressed as $\sum_{i,j} \xi(i,j,\pi(i),\pi(j))$, where
\[
\xi(i,j,\pi(i),\pi(j)) = 1\{1 \le i \le n_1,\ n_1 + 1 \le j \le n,\ 1 \le \pi(i) < \pi(j) \le n\},
\]
and $\pi$ is chosen uniformly from $S_n$ under $H_0$. Consider the statistic
\[
W = \frac{\sum_{i,j} \xi(i,j,\pi(i),\pi(j)) - n_1 n_2/2}{(n_1 n_2 (n+1)/12)^{1/2}}; \tag{3.7}
\]
we have the following theorem.

Theorem 3.3. Assume $W$ is defined as in (3.7). Then
\[
\left| \frac{P(W > z)}{1-\Phi(z)} - 1 \right| = O(1)(1+z^3)\,(1/n_1 + 1/n_2)^{1/2}, \tag{3.8}
\]
for $0 \le z \le \min(n_1, n_2)^{1/6}$.

4. Proof of main results

In this section, we give the proof of Theorem 2.1. In subsection 4.1, we prove Proposition 2.1. In subsection 4.2, we give Theorem 4.1, a Cramér-type moderate deviation result for bounded exchangeable pairs, which is the main tool to prove Theorem 2.1. In subsection 4.3, we give Lemma 4.1 and Lemma 4.2; the proof of Theorem 2.1 follows from a combination of Theorem 4.1 and these two lemmas, and the details are given in subsection 4.3. We put the proof of Theorem 4.1 in subsection 4.4. The proof of Theorem 4.1 is based on Stein's method and the exchangeable pair approach. We give two propositions, Proposition 4.1 and Proposition 4.2, in subsection 4.4; the proof of Theorem 4.1 follows from a combination of the two.

4.1. A decomposition of general DIPS

Before giving the proof we briefly recall the notation used above. The double-indexed permutation statistic (DIPS) is defined in (2.1), where $\pi$ is a random permutation chosen uniformly from $S_n$. As explained in the introduction (and following Zhao et al. (1997)), we aim to reduce the general kernel $\xi(i,j,k,l)$ to the normalized form
\[
W_n = \sum_{i} a(i,\pi(i)) + \sideset{}{'}\sum_{i,j} b(i,j,\pi(i),\pi(j)),
\]
with $\sum'_{i,j}$ denoting summation over $i \ne j$. For convenience we use the marginal-averaging notation introduced earlier, e.g.
\[
a(i,\cdot) = \frac{1}{n}\sum_{k} a(i,k), \qquad b(i,j,k,\cdot) = \frac{1}{n}\sum_{l} b(i,j,k,l),
\]
and similarly for higher-order averages. The proposition asserts that, after an appropriate centering and scaling, the matrix $a(i,k)$ can be chosen to satisfy either $a(i,k) \equiv 0$ or $\sum_{i,k} a^2(i,k) = n-1$, together with the marginal conditions $a(i,\cdot) = a(\cdot,k) = 0$; likewise the four-way array $b(i,j,k,l)$ may be taken to have all one-dimensional and two-dimensional marginals equal to zero (cf. (2.4) and (2.5)). The proof proceeds by successive marginal centering of the kernel $\xi$; for notational convenience we denote the fully centered kernel by $\xi^*$ as in (4.1). We now turn to the verification of Proposition 2.1.

Proof of Proposition 2.1.
For the general DIPS (2.1), let
\[
\begin{aligned}
\xi^*(i,j,k,l) = {} & \xi(i,j,k,l) - [\xi(i,j,k,\cdot) + \xi(i,j,\cdot,l) + \xi(i,\cdot,k,l) + \xi(\cdot,j,k,l)] \\
& + [\xi(i,j,\cdot,\cdot) + \xi(i,\cdot,k,\cdot) + \xi(i,\cdot,\cdot,l) + \xi(\cdot,j,k,\cdot) + \xi(\cdot,j,\cdot,l) + \xi(\cdot,\cdot,k,l)] \\
& - [\xi(i,\cdot,\cdot,\cdot) + \xi(\cdot,j,\cdot,\cdot) + \xi(\cdot,\cdot,k,\cdot) + \xi(\cdot,\cdot,\cdot,l)] + \xi(\cdot,\cdot,\cdot,\cdot).
\end{aligned} \tag{4.1}
\]
Then we have
\[
\xi^*(i,j,k,\cdot) = \xi^*(i,j,\cdot,l) = \xi^*(i,\cdot,k,l) = \xi^*(\cdot,j,k,l) = 0,
\]
and
\[
\begin{aligned}
\sum_{i,j} \xi(i,j,\pi(i),\pi(j)) &= \sum_{i,j} \xi^*(i,j,\pi(i),\pi(j)) + n\sum_{i} \xi(i,\cdot,\pi(i),\cdot) + n\sum_{j} \xi(\cdot,j,\cdot,\pi(j)) - n^2 \xi(\cdot,\cdot,\cdot,\cdot) \\
&= \sideset{}{'}\sum_{i,j} \xi^*(i,j,\pi(i),\pi(j)) + \sum_{i} \eta^*(i,\pi(i)) + n\,\eta(\cdot,\cdot),
\end{aligned} \tag{4.2}
\]
where
\[
\begin{aligned}
\eta(i,k) &= \xi^*(i,i,k,k) + n\,\xi(i,\cdot,k,\cdot) + n\,\xi(\cdot,i,\cdot,k) - n\,\xi(\cdot,\cdot,\cdot,\cdot), \\
\eta^*(i,k) &= \eta(i,k) - \eta(i,\cdot) - \eta(\cdot,k) + \eta(\cdot,\cdot), \\
\eta^*(i,\cdot) &= \eta^*(\cdot,k) = 0.
\end{aligned} \tag{4.3}
\]
Write $D$ for the DIPS in (2.1). If $\eta^*(i,k) \not\equiv 0$, we define $\sigma^2 = \sum_{i,k} \eta^{*2}(i,k)/(n-1)$, and the normalized DIPS is defined as
\[
W_n = \frac{D - n\,\eta(\cdot,\cdot)}{\sigma} = \sum_{i} \frac{1}{\sigma}\,\eta^*(i,\pi(i)) + \sideset{}{'}\sum_{i,j} \frac{1}{\sigma}\,\xi^*(i,j,\pi(i),\pi(j)) := \sum_{i} a(i,\pi(i)) + \sideset{}{'}\sum_{i,j} b(i,j,\pi(i),\pi(j)).
\]
If $\eta^*(i,k) = 0$ for all $i, k \in [n]$, we define the normalized DIPS as
\[
W_n = D - n\,\eta(\cdot,\cdot) = \sideset{}{'}\sum_{i,j} \xi^*(i,j,\pi(i),\pi(j)) := \sideset{}{'}\sum_{i,j} b(i,j,\pi(i),\pi(j)).
\]
This completes the proof of Proposition 2.1.

4.2. Cramér-type moderate deviation for bounded exchangeable pairs

To derive the Cramér-type moderate deviation for double index permutation statistics, we first provide a Cramér-type moderate deviation result for bounded exchangeable pairs.
Let $(X, X')$ be an exchangeable pair, where $X$ is $\mathcal{F}$-measurable and takes values in a measurable space $\mathcal{X}$, and let $W$ be an $\mathbb{R}$-valued random variable of interest. We consider the following condition:

(D1): Let $(X, X')$ be an exchangeable pair. Assume that there exists $D := \Psi(X, X')$, where $\Psi : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ is an antisymmetric function, satisfying $E(D \mid \mathcal{F}) = \lambda(W + R)$ for some constant $\lambda > 0$ and some random variable $R$ which is measurable with respect to $\mathcal{F}$.

Theorem 4.1. Let $(X, X')$ be an exchangeable pair satisfying condition (D1), and suppose that $(W, W')$ is also an exchangeable pair, with $\Delta = W - W'$. Let $\max\{|\Delta|, |D|\} \le \delta$ for some constant $\delta > 0$. Assume that there exists a constant $\tau > 0$ such that, for all $0 \le t \le \tau$,
\[
\begin{aligned}
(A1):\ & E\{e^{tW}\} < \infty, \\
(A2):\ & E\left\{ \left| 1 - \frac{1}{2\lambda} E\{D\Delta \mid W\} \right| e^{tW} \right\} \le \delta_1(t)\, E\{e^{tW}\}, \\
(A3):\ & E\{|R|\, e^{tW}\} \le \delta_2(t)\, E\{e^{tW}\},
\end{aligned}
\]
where for each $j = 1, 2$, the function $\delta_j(\cdot)$ is increasing and satisfies $0 \le \delta_j(\tau) < \infty$. For $\theta > 0$, let
\[
\tau_0(\theta) := \max\{0 \le t \le \min\{\tau, 1/\delta\} : t^2 (t\delta + 2\delta_1(t))/2 + 3t\,\delta_2(t) \le \theta\}.
\]
Then for any $\theta > 0$ and $0 \le z \le \tau_0(\theta)$,
\[
\left| \frac{P(W > z)}{1-\Phi(z)} - 1 \right| \le 31 e^{\theta} (1 + 9\delta) \left\{ (1+z^2)\,[\delta_1(z) + \delta + \delta \cdot \delta_2(z)] + (1+z)\,\delta_2(z) \right\}. \tag{4.4}
\]
The proof of Theorem 4.1 is given in subsection 4.4.

Remark 4. Theorem 4.1 establishes a Cramér-type moderate deviation for bounded exchangeable pairs under condition (D1), together with the boundedness assumptions on $|D|$ and $|\Delta|$. These boundedness conditions are natural, as all the examples of double-index permutation statistics we consider satisfy them. By comparison, Theorem 2.1 in Zhang (2023) addresses the unbounded case and requires verifying four main conditions (A1)-(A4).
Among them, condition (A3) is particularly difficult to verify in practice in the unbounded setting, and even when restricted to the bounded case, it still needs to be verified separately. By refining the proof, we are able to eliminate this condition entirely in the bounded setting.

4.3. Proof of Theorem 2.1

To simplify notation, we write $a_{ik} := a(i,k)$ and $b_{ijkl} := b(i,j,k,l)$, and denote $h(t) := E\{e^{tW_n}\}$ and $\Psi_t(W_n) := e^{tW_n}$.

Proof of Theorem 2.1. First, we define the exchangeable pair $(\pi, \pi')$. Let $\pi$ be a random permutation chosen uniformly from $S_n$ (the symmetric group of degree $n$), and let $(I, J)$ be a random pair of indices chosen uniformly from $\{(i,j) : i \ne j \in [n]\}$, independently of $\pi$. Let $\pi'$ be the random permutation obtained from $\pi$ by interchanging $\pi(I)$ and $\pi(J)$ and leaving the values at the other indices unchanged. Then it is easy to see that $(\pi, \pi')$ is an exchangeable pair. By Proposition 2.1, the general double index permutation statistic is defined as
\[
W_n = \sum_{i=1}^{n} a_{i\pi(i)} + \sideset{}{'}\sum_{i,j} b_{ij\pi(i)\pi(j)}, \qquad W_n' = \sum_{i=1}^{n} a_{i\pi'(i)} + \sideset{}{'}\sum_{i,j} b_{ij\pi'(i)\pi'(j)}; \tag{4.5}
\]
since $(\pi, \pi')$ is an exchangeable pair, so is $(W_n, W_n')$. We then define $D = \Psi(\pi, \pi')$, an antisymmetric function of the exchangeable pair $(\pi, \pi')$, as follows:
\[
\begin{aligned}
D &= a_{I\pi(I)} - a_{I\pi'(I)} + \sum_{s \notin \{I,J\}} b_{Is\pi(I)\pi(s)} - \sum_{s \notin \{I,J\}} b_{Is\pi'(I)\pi'(s)} \\
&= a_{I\pi(I)} - a_{I\pi(J)} + \sum_{s \notin \{I,J\}} b_{Is\pi(I)\pi(s)} - \sum_{s \notin \{I,J\}} b_{Is\pi(J)\pi(s)}.
\end{aligned} \tag{4.6}
\]
Therefore we have
\[
\begin{aligned}
E\{D \mid \pi\} &= \frac{1}{n}\sum_{i=1}^{n} a_{i\pi(i)} + \frac{1}{n(n-1)}\sum_{i=1}^{n} a_{i\pi(i)} + \frac{1}{n}\sum_{i \ne j} b_{ij\pi(i)\pi(j)} - \frac{1}{n(n-1)}\sum_{i=1}^{n} b_{ii\pi(i)\pi(i)} \\
&= \lambda (W_n + R),
\end{aligned} \tag{4.7}
\]
where
\[
\lambda = \frac{1}{n}, \qquad R = \frac{1}{n-1}\sum_{i=1}^{n} a_{i\pi(i)} - \frac{1}{n-1}\sum_{i=1}^{n} b_{ii\pi(i)\pi(i)}. \tag{4.8}
\]
Under condition (2.6), we have $|D| \le 4\delta$; similarly,
\[
\begin{aligned}
|\Delta| = |W_n - W_n'| = \Big| & a_{I\pi(I)} + a_{J\pi(J)} - a_{I\pi(J)} - a_{J\pi(I)} \\
& + b_{IJ\pi(I)\pi(J)} + b_{JI\pi(J)\pi(I)} - b_{IJ\pi(J)\pi(I)} - b_{JI\pi(I)\pi(J)} \\
& + \sum_{s \notin \{I,J\}} \big( b_{sJ\pi(s)\pi(J)} + b_{sI\pi(s)\pi(I)} + b_{Js\pi(J)\pi(s)} + b_{Is\pi(I)\pi(s)} \big) \\
& - \sum_{s \notin \{I,J\}} \big( b_{sJ\pi(s)\pi(I)} + b_{sI\pi(s)\pi(J)} + b_{Js\pi(I)\pi(s)} + b_{Is\pi(J)\pi(s)} \big) \Big| \le 16\delta.
\end{aligned} \tag{4.9}
\]
To prove Theorem 2.1, we apply Theorem 4.1 to $W_n$, so we need to verify conditions (A1)-(A3). By the definition of $W_n$, $\{a(i,k)\}_{i,k\in[n]}$ is a real matrix and $\{b(i,j,k,l)\}_{i,j,k,l\in[n]}$ is a 4-index real array, so $W_n$ is bounded. Hence $E\{e^{tW_n}\} < \infty$, and the first condition (A1) holds.

Next we consider the second condition (A2). For any absolutely continuous function $f : \mathbb{R} \to \mathbb{R}$ satisfying $E\{|f(W_n)|\} < \infty$, we have
\[
E(W_n f(W_n)) = \frac{1}{2\lambda} E\left\{ D \int_{-\Delta}^{0} f'(W_n + u)\, du \right\} - E\{R f(W_n)\}.
\]
Applying this with $f(w) = w$, we have
\[
\frac{1}{2\lambda} E\{D\Delta\} = E\{W_n^2\} + E\{R W_n\}.
\]
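For readers who want the intermediate steps, the identity above can be sketched as follows. This is a standard exchangeable-pair computation written out under the assumptions already stated in the text ($D$ antisymmetric in $(\pi,\pi')$, $E\{D \mid \pi\} = \lambda(W_n + R)$, and $W_n' = W_n - \Delta$); it is offered as a reading aid rather than as the authors' own derivation.

```latex
\begin{align*}
\int_{-\Delta}^{0} f'(W_n+u)\,du
  &= f(W_n) - f(W_n-\Delta) = f(W_n) - f(W_n'),\\
E\{D\,(f(W_n)+f(W_n'))\}
  &= 0 \qquad\text{(antisymmetry of $D$ and exchangeability of $(\pi,\pi')$)},\\
\lambda\,E\{(W_n+R)\,f(W_n)\}
  &= E\{D\,f(W_n)\}
   = \tfrac12\,E\{D\,(f(W_n)-f(W_n'))\},\\
E\{W_n f(W_n)\}
  &= \frac{1}{2\lambda}\,E\Big\{D\int_{-\Delta}^{0} f'(W_n+u)\,du\Big\} - E\{R\,f(W_n)\}.
\end{align*}
```

With $f(w) = w$ the integrand is identically 1, so the integral equals $\Delta$ and the last line reduces to $E\{W_n^2\} = \frac{1}{2\lambda}E\{D\Delta\} - E\{R W_n\}$.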
By Proposition 2.1, $W_n$ satisfies the conditions
\[
a(i,\cdot) = a(\cdot,k) = 0, \qquad b(i,j,k,\cdot) = b(i,j,\cdot,l) = b(i,\cdot,k,l) = b(\cdot,j,k,l) = 0,
\]
and therefore
\[
\begin{aligned}
E\{R W_n\} &= E\left\{ \left( \frac{1}{n-1}\sum_{i=1}^{n} a_{i\pi(i)} - \frac{1}{n-1}\sum_{i=1}^{n} b_{ii\pi(i)\pi(i)} \right) \left( \sum_{i=1}^{n} a_{i\pi(i)} + \sum_{i \ne j} b_{ij\pi(i)\pi(j)} \right) \right\} \\
&= \frac{1}{n-1} E\left\{ \sum_{i,j} a_{i\pi(i)} a_{j\pi(j)} + \sum_{i=1}^{n} \sum_{p \ne q} a_{i\pi(i)} b_{pq\pi(p)\pi(q)} \right\} \\
&\quad - \frac{1}{n-1} E\left\{ \sum_{i,j} b_{ii\pi(i)\pi(i)} a_{j\pi(j)} + \sum_{i=1}^{n} \sum_{p \ne q} b_{ii\pi(i)\pi(i)} b_{pq\pi(p)\pi(q)} \right\} \\
&= \frac{1}{(n-1)^2} \sum_{i,j} a_{ij}^2 + \frac{2}{(n-1)^2(n-2)} \sum_{i,j} a_{ij} b_{iijj} - \frac{1}{(n-1)^2} \sum_{i,j} a_{ij} b_{iijj} \\
&\quad + \frac{1}{n(n-1)^2(n-2)} \sum_{i \ne j} \sum_{k=1}^{n} b_{iikk} b_{jjkk} - \frac{1}{n(n-1)^2(n-2)} \sum_{i \ne j} \sum_{k \ne l} b_{iijj} b_{jjll} \\
&\quad + \frac{1}{n(n-1)^2(n-2)} \sum_{i=1}^{n} \sum_{j \ne k} b_{iijj} b_{iikk} - \frac{2n-3}{n(n-1)^2(n-2)} \sum_{i,j} b_{ijij}^2.
\end{aligned} \tag{4.10}
\]
Applying condition (2.6), it follows that $|E\{R W_n\}| \le 5\delta^2$. Together with $E\{W_n^2\} = 1 + C/\sqrt{n}$ and the identity $\frac{1}{2\lambda}E\{D\Delta\} = E\{W_n^2\} + E\{R W_n\}$, we deduce that
\[
\begin{aligned}
& E\left\{ \left| 1 - \frac{1}{2\lambda} E\{D\Delta \mid W_n\} \right| e^{tW_n} \right\} \\
&\quad = E\left\{ \left| \frac{1}{2\lambda} E\{D\Delta \mid W_n\} - \frac{1}{2\lambda} E\{D\Delta\} + \frac{C}{\sqrt{n}} + E\{R W_n\} \right| e^{tW_n} \right\} \\
&\quad \le \sqrt{ E\left\{ \left( \frac{1}{2\lambda} E\{D\Delta \mid W_n\} - \frac{1}{2\lambda} E\{D\Delta\} + \frac{C}{\sqrt{n}} + E\{R W_n\} \right)^{2} e^{tW_n} \right\} E\{e^{tW_n}\} } \\
&\quad \le \sqrt{ 3 E\left\{ \left[ \left( \frac{1}{2\lambda} E\{D\Delta \mid W_n\} - \frac{1}{2\lambda} E\{D\Delta\} \right)^{2} + \frac{C^2}{n} + 25\delta^4 \right] e^{tW_n} \right\} E\{e^{tW_n}\} }.
\end{aligned} \tag{4.11}
\]
By the definition of $D$ and $\Delta$, we decompose $\frac{1}{2\lambda} E\{D\Delta \mid \pi\}$ as
\[
\frac{1}{2\lambda} E\{D\Delta \mid \pi\} = \frac{n}{2} \sum_{i=1}^{2} \sum_{j=1}^{4} E\{H_i Q_j \mid \pi\},
\]
where
\[
\begin{aligned}
H_1 &= a_{I\pi(I)} - a_{I\pi(J)}, \\
H_2 &= \sum_{s \notin \{I,J\}} \big( b_{Is\pi(I)\pi(s)} - b_{Is\pi(J)\pi(s)} \big), \\
Q_1 &= a_{I\pi(I)} + a_{J\pi(J)} - a_{I\pi(J)} - a_{J\pi(I)}, \\
Q_2 &= b_{IJ\pi(I)\pi(J)} + b_{JI\pi(J)\pi(I)} - b_{IJ\pi(J)\pi(I)} - b_{JI\pi(I)\pi(J)}, \\
Q_3 &= \sum_{s \notin \{I,J\}} \big( b_{sJ\pi(s)\pi(J)} + b_{sI\pi(s)\pi(I)} + b_{Js\pi(J)\pi(s)} + b_{Is\pi(I)\pi(s)} \big), \\
Q_4 &= -\sum_{s \notin \{I,J\}} \big( b_{sJ\pi(s)\pi(I)} + b_{sI\pi(s)\pi(J)} + b_{Js\pi(I)\pi(s)} + b_{Is\pi(J)\pi(s)} \big).
\end{aligned}
\]
Then by Cauchy's inequality, we have
\[
E\left\{ \left( \frac{1}{2\lambda} E\{D\Delta \mid W_n\} - \frac{1}{2\lambda} E\{D\Delta\} \right)^{2} e^{tW_n} \right\} \le C \sum_{i=1}^{2} \sum_{j=1}^{4} E\left\{ \big( n E\{H_i Q_j \mid \pi\} - n E\{H_i Q_j\} \big)^{2} e^{tW_n} \right\}. \tag{4.12}
\]
The right-hand side of (4.12) contains eight terms. We analyze the upper bound of each term in what follows. For the first term $E\{ (n E\{H_1 Q_1 \mid \pi\} - n E\{H_1 Q_1\})^2 e^{tW_n} \}$, we first calculate the conditional expectation $n E\{H_1 Q_1 \mid \pi\}$:
\[
n E\{H_1 Q_1 \mid \pi\} = \frac{n+3}{n-1} \sum_{i=1}^{n} a_{i\pi(i)}^2 + \frac{1}{n-1} \sum_{i \ne j} \big( a_{i\pi(j)}^2 + a_{i\pi(i)} a_{j\pi(j)} + a_{i\pi(j)} a_{j\pi(i)} \big). \tag{4.13}
\]
Then, applying Cauchy's inequality, the first term of (4.12) is bounded by three parts $J_1$, $J_2$, $J_3$, where
\[
\begin{aligned}
J_1 &= E\left\{ \left( \sum_{i=1}^{n} a_{i\pi(i)}^2 - E\Big\{ \sum_{i=1}^{n} a_{i\pi(i)}^2 \Big\} \right)^{2} e^{tW_n} \right\}, \\
J_2 &= E\left\{ \left( \frac{1}{n} \sum_{i \ne j} a_{i\pi(j)}^2 - E\Big\{ \frac{1}{n} \sum_{i \ne j} a_{i\pi(j)}^2 \Big\} \right)^{2} e^{tW_n} \right\}, \\
J_3 &= E\left\{ \left( \frac{1}{n} \sum_{i \ne j} a_{i\pi(i)} a_{j\pi(j)} - E\Big\{ \frac{1}{n} \sum_{i \ne j} a_{i\pi(i)} a_{j\pi(j)} \Big\} \right)^{2} e^{tW_n} \right\}.
\end{aligned}
\]
Next, we establish a useful lemma that will help us bound $J_1$, $J_2$ and $J_3$.

Lemma 4.1.
Let $W$ be defined in (2.2) and satisfy (2.4), (2.5) and (2.6). Then for $0 < t < 1/\delta$,
$$\mathbb{E}\Big\{ \Big( \sum_{i=1}^{n} a_{i\pi(i)}^2 - \mathbb{E}\Big\{ \sum_{i=1}^{n} a_{i\pi(i)}^2 \Big\} \Big)^2 e^{tW_n} \Big\} \le C(n\delta^4 + n^2\delta^6t^2)\,h(t), \tag{4.14}$$
$$\mathbb{E}\Big\{ \Big( \frac{1}{n}\sum_{i\ne j} a_{i\pi(j)}^2 - \mathbb{E}\Big\{ \frac{1}{n}\sum_{i\ne j} a_{i\pi(j)}^2 \Big\} \Big)^2 e^{tW_n} \Big\} \le C(n\delta^4 + n^2\delta^6t^2)\,h(t), \tag{4.15}$$
$$\mathbb{E}\Big\{ \Big( \sum_{i\ne j} b_{ij\pi(i)\pi(j)}^2 - \mathbb{E}\Big\{ \sum_{i\ne j} b_{ij\pi(i)\pi(j)}^2 \Big\} \Big)^2 e^{tW_n} \Big\} \le C(n\delta^4 + n^2\delta^6t^2)\,h(t), \tag{4.16}$$
$$\mathbb{E}\Big\{ \Big( \sum_{i\ne j} b_{ij\pi(i)\pi(j)}b_{ji\pi(j)\pi(i)} - \mathbb{E}\Big\{ \sum_{i\ne j} b_{ij\pi(i)\pi(j)}b_{ji\pi(j)\pi(i)} \Big\} \Big)^2 e^{tW_n} \Big\} \le C(n\delta^4 + n^2\delta^6t^2)\,h(t). \tag{4.17}$$

The proof of Lemma 4.1 can be found in Section 5. By Lemma 4.1, we have
$$J_1 \le C(n\delta^4 + n^2\delta^6t^2)\,\mathbb{E}\{e^{tW_n}\}, \qquad J_2 \le C(n\delta^4 + n^2\delta^6t^2)\,\mathbb{E}\{e^{tW_n}\}. \tag{4.18}$$
Considering $J_3$, by conditions (2.4) and (2.6) we obtain
$$\mathbb{E}\Big\{ \frac{1}{n}\sum_{i\ne j} a_{i\pi(i)}a_{j\pi(j)} \Big\} = \frac{1}{n^3(n-1)}\sum_{i,j} a_{ij}^2 \le \frac{\delta^2}{n},$$
and therefore
$$\begin{aligned}
J_3 &\le \frac{1}{n^2}\Big| \mathbb{E}\Big\{ \sum_{i\ne j}\sum_{k\ne l} a_{i\pi(i)}a_{j\pi(j)}a_{k\pi(k)}a_{l\pi(l)}\, e^{tW_n} \Big\} \Big| + \frac{2\delta^2}{n^2}\Big| \mathbb{E}\Big\{ \sum_{i\ne j} a_{i\pi(i)}a_{j\pi(j)}\, e^{tW_n} \Big\} \Big| + \frac{\delta^4}{n^2}\,\mathbb{E}\{e^{tW_n}\} \\
&\le \frac{1}{n^2}\Big| \mathbb{E}\Big\{ \sum_{i\ne j}\sum_{k\ne l} a_{i\pi(i)}a_{j\pi(j)}a_{k\pi(k)}a_{l\pi(l)}\, e^{tW_n} \Big\} \Big| + C\delta^4\,\mathbb{E}\{e^{tW_n}\} \\
&\le \frac{1}{n^4(n-1)^2}\Big| \sum_{i\ne j}\sum_{k\ne l}\sum_{p\ne q}\sum_{u\ne v} a_{ip}a_{jq}a_{ku}a_{lv}\, \mathbb{E}\big\{ e^{tW_n} \mid \pi(i)=p,\ \pi(j)=q,\ \pi(k)=u,\ \pi(l)=v \big\} \Big| + C\delta^4\,\mathbb{E}\{e^{tW_n}\} \\
&\le n^2\delta^4 \max_{\substack{i\ne j\ne k\ne l \\ p\ne q\ne u\ne v}} \Big| \mathbb{E}\big\{ e^{tW_n} \mid \pi(i)=p,\ \pi(j)=q,\ \pi(k)=u,\ \pi(l)=v \big\} - \mathbb{E}\{e^{tW_n}\} \Big| + C\delta^4\,\mathbb{E}\{e^{tW_n}\},
\end{aligned} \tag{4.19}$$
where we use (2.6) in the last inequality, and $\max_{i_1\ne\cdots\ne i_k}$ means $\max_{i_1,\ldots,i_k}$ under the condition that $i_1,\ldots,i_k$ are all distinct. Applying (5.27) in the proof of Lemma 4.1, we have
$$\max_{\substack{i\ne j\ne k\ne l \\ p\ne q\ne u\ne v}} \Big| \mathbb{E}\big\{ e^{tW_n} \mid \pi(i)=p,\ \pi(j)=q,\ \pi(k)=u,\ \pi(l)=v \big\} - \mathbb{E}\{e^{tW_n}\} \Big| \le C\Big( \frac{1}{n} + \delta^2t^2 \Big)\mathbb{E}\{e^{tW_n}\}. \tag{4.20}$$
Combining (4.18)-(4.20), we deduce the upper bound of the first term in (4.12),
$$\mathbb{E}\big\{ (n\mathbb{E}\{H_1Q_1 \mid \pi\} - n\mathbb{E}\{H_1Q_1\})^2 e^{tW_n} \big\} \le C(n\delta^4 + n^2\delta^6t^2)\,\mathbb{E}\{e^{tW_n}\}. \tag{4.21}$$
Next we consider the term $\mathbb{E}\{(n\mathbb{E}\{H_1Q_3 \mid \pi\} - n\mathbb{E}\{H_1Q_3\})^2 e^{tW_n}\}$ in (4.12).
Since
$$\begin{aligned}
n\mathbb{E}\{H_1Q_3 \mid \pi\} &= \frac{n}{n-1}\sum_{i\ne j} a_{i\pi(i)}b_{ij\pi(i)\pi(j)} + \frac{n}{n-1}\sum_{i\ne j} a_{i\pi(i)}b_{ji\pi(j)\pi(i)} \\
&\quad + \frac{2}{n-1}\sum_{i\ne j} a_{i\pi(j)}\big( b_{ji\pi(i)\pi(j)} + b_{ji\pi(j)\pi(i)} \big) + \frac{2}{n-1}\sum_{i\ne j\ne s} a_{i\pi(i)}b_{sj\pi(s)\pi(j)},
\end{aligned} \tag{4.22}$$
this term can be divided into $J_4$, $J_5$, $J_6$, $J_7$, where
$$\begin{aligned}
J_4 &= \mathbb{E}\Big\{ \Big( \sum_{i\ne j} a_{i\pi(i)}b_{ij\pi(i)\pi(j)} - \mathbb{E}\Big\{ \sum_{i\ne j} a_{i\pi(i)}b_{ij\pi(i)\pi(j)} \Big\} \Big)^2 e^{tW_n} \Big\}, \\
J_5 &= \mathbb{E}\Big\{ \Big( \sum_{i\ne j} a_{i\pi(i)}b_{ji\pi(j)\pi(i)} - \mathbb{E}\Big\{ \sum_{i\ne j} a_{i\pi(i)}b_{ji\pi(j)\pi(i)} \Big\} \Big)^2 e^{tW_n} \Big\}, \\
J_6 &= \mathbb{E}\Big\{ \frac{1}{n^2}\Big( \sum_{i\ne j} a_{i\pi(j)}\big( b_{ji\pi(i)\pi(j)} + b_{ji\pi(j)\pi(i)} \big) - \mathbb{E}\Big\{ \sum_{i\ne j} a_{i\pi(j)}\big( b_{ji\pi(i)\pi(j)} + b_{ji\pi(j)\pi(i)} \big) \Big\} \Big)^2 e^{tW_n} \Big\}, \\
J_7 &= \mathbb{E}\Big\{ \Big( \frac{1}{n}\sum_{i\ne j\ne s} a_{i\pi(i)}b_{sj\pi(s)\pi(j)} - \mathbb{E}\Big\{ \frac{1}{n}\sum_{i\ne j\ne s} a_{i\pi(i)}b_{sj\pi(s)\pi(j)} \Big\} \Big)^2 e^{tW_n} \Big\}.
\end{aligned} \tag{4.23}$$
We use an approach similar to that of (4.19) to bound these four parts. The four proofs are very similar, so we present the bound for $J_4$ as a representative; the other three parts follow by the same argument.
Applying conditions (2.4), (2.5) and (2.6), we get
$$\mathbb{E}\Big\{ \sum_{i\ne j} a_{i\pi(i)}b_{ij\pi(i)\pi(j)} \Big\} = \frac{1}{n(n-1)}\sum_{i,j} a_{ij}b_{iijj} \le 2\delta^2,$$
and it then follows that
$$\begin{aligned}
J_4 &\le \Big| \mathbb{E}\Big\{ \sum_{i\ne j}\sum_{k\ne l} a_{i\pi(i)}b_{ij\pi(i)\pi(j)}\,a_{k\pi(k)}b_{kl\pi(k)\pi(l)}\, e^{tW_n} \Big\} \Big| + Cn\delta^4\,\mathbb{E}\{e^{tW_n}\} \\
&\le \frac{1}{n^2(n-1)^2}\Big| \sum_{i\ne j}\sum_{k\ne l}\sum_{p\ne q}\sum_{u\ne v} a_{ip}b_{ijpq}\,a_{ku}b_{kluv}\, \mathbb{E}\big\{ e^{tW_n} \mid \pi(i)=p,\ \pi(j)=q,\ \pi(k)=u,\ \pi(l)=v \big\} \Big| + Cn\delta^4\,\mathbb{E}\{e^{tW_n}\} \\
&\le Cn^2\delta^4 \max_{\substack{i\ne j\ne k\ne l \\ p\ne q\ne u\ne v}} \Big| \mathbb{E}\big\{ e^{tW_n} \mid \pi(i)=p,\ \pi(j)=q,\ \pi(k)=u,\ \pi(l)=v \big\} - \mathbb{E}\{e^{tW_n}\} \Big| + Cn\delta^4\,\mathbb{E}\{e^{tW_n}\},
\end{aligned}$$
where we use (2.6) in the last inequality. Applying (4.20), we get
$$J_4 \le C(n\delta^4 + n^2\delta^6t^2)\,\mathbb{E}\{e^{tW_n}\}. \tag{4.24}$$
By the same argument, we have
$$J_5 \le C(n\delta^4 + n^2\delta^6t^2)\,\mathbb{E}\{e^{tW_n}\}, \quad J_6 \le C(n\delta^4 + n^2\delta^6t^2)\,\mathbb{E}\{e^{tW_n}\}, \quad J_7 \le C(n\delta^4 + n^2\delta^6t^2)\,\mathbb{E}\{e^{tW_n}\}. \tag{4.25}$$
Combining (4.24) and (4.25), we obtain
$$\mathbb{E}\big\{ (n\mathbb{E}\{H_1Q_3 \mid \pi\} - n\mathbb{E}\{H_1Q_3\})^2 e^{tW_n} \big\} \le C(n\delta^4 + n^2\delta^6t^2)\,\mathbb{E}\{e^{tW_n}\}. \tag{4.26}$$
By the same argument, we obtain the upper bounds of three other terms in (4.12),
$$\begin{aligned}
\mathbb{E}\big\{ (n\mathbb{E}\{H_1Q_2 \mid \pi\} - n\mathbb{E}\{H_1Q_2\})^2 e^{tW_n} \big\} &\le C(n\delta^4 + n^2\delta^6t^2)\,\mathbb{E}\{e^{tW_n}\}, \\
\mathbb{E}\big\{ (n\mathbb{E}\{H_1Q_4 \mid \pi\} - n\mathbb{E}\{H_1Q_4\})^2 e^{tW_n} \big\} &\le C(n\delta^4 + n^2\delta^6t^2)\,\mathbb{E}\{e^{tW_n}\}, \\
\mathbb{E}\big\{ (n\mathbb{E}\{H_2Q_1 \mid \pi\} - n\mathbb{E}\{H_2Q_1\})^2 e^{tW_n} \big\} &\le C(n\delta^4 + n^2\delta^6t^2)\,\mathbb{E}\{e^{tW_n}\}.
\end{aligned} \tag{4.27}$$
Next we consider the term $\mathbb{E}\{(n\mathbb{E}\{H_2Q_3 \mid \pi\} - n\mathbb{E}\{H_2Q_3\})^2 e^{tW_n}\}$ in (4.12).
Since
$$\begin{aligned}
n\mathbb{E}\{H_2Q_3 \mid \pi\} &= \frac{n-2}{n-1}\sum_{i\ne j} b_{ij\pi(i)\pi(j)}^2 + \frac{n-2}{n-1}\sum_{i\ne j} b_{ij\pi(i)\pi(j)}b_{ji\pi(j)\pi(i)} \\
&\quad + \frac{n-3}{n-1}\sum_{i\ne j\ne s} b_{ij\pi(i)\pi(j)}\big( b_{is\pi(i)\pi(s)} + b_{si\pi(s)\pi(i)} \big) + \frac{1}{n-1}\sum_{i\ne j\ne s} b_{ij\pi(i)\pi(j)}\big( b_{sj\pi(s)\pi(j)} + b_{js\pi(j)\pi(s)} \big) \\
&\quad + \frac{2}{n-1}\sum_{i\ne j\ne p\ne q} b_{ij\pi(i)\pi(j)}b_{pq\pi(p)\pi(q)} \\
&\quad - \frac{1}{n-1}\sum_{i\ne j\ne p\ne q} b_{ip\pi(j)\pi(p)}\big( b_{qj\pi(q)\pi(j)} + b_{qi\pi(q)\pi(i)} + b_{jq\pi(j)\pi(q)} + b_{iq\pi(i)\pi(q)} \big) \\
&\quad - \frac{1}{n-1}\sum_{i\ne j\ne s} b_{is\pi(j)\pi(s)}\big( b_{sj\pi(s)\pi(j)} + b_{si\pi(s)\pi(i)} + b_{js\pi(j)\pi(s)} + b_{is\pi(i)\pi(s)} \big) \\
&:= \sum_{i=1}^{7} A_i,
\end{aligned} \tag{4.28}$$
by Cauchy's inequality the term $\mathbb{E}\{(n\mathbb{E}\{H_2Q_3 \mid \pi\} - n\mathbb{E}\{H_2Q_3\})^2 e^{tW_n}\}$ is bounded, up to a constant, by the sum of seven parts,
$$\sum_{i=1}^{7} \mathbb{E}\big\{ (A_i - \mathbb{E}(A_i))^2 e^{tW_n} \big\}. \tag{4.29}$$
We divide (4.29) into two groups according to the size of the expectations: the terms involving $A_1$, $A_2$ form one group, and the remaining terms involving $A_3$ to $A_7$ form the other. By conditions (2.5) and (2.6), the expectations of $A_1$ to $A_7$ satisfy
$$|\mathbb{E}\{A_1\}| \le Cn\delta^2, \qquad |\mathbb{E}\{A_2\}| \le Cn\delta^2, \qquad \max_{i\in\{3,\ldots,7\}} |\mathbb{E}\{A_i\}| \le C\delta^2. \tag{4.30}$$
Note that the expectations of $A_1$ and $A_2$ are of order $n\delta^2$, while the expectations of $A_3$ to $A_7$ are of order $\delta^2$. This difference leads us to adopt different approaches for the two groups.

We first consider the first group, which consists of $\mathbb{E}\{(A_1 - \mathbb{E}(A_1))^2 e^{tW_n}\}$ and $\mathbb{E}\{(A_2 - \mathbb{E}(A_2))^2 e^{tW_n}\}$. These two terms can be bounded directly by Lemma 4.1:
$$\sum_{i=1}^{2} \mathbb{E}\big\{ (A_i - \mathbb{E}(A_i))^2 e^{tW_n} \big\} \le C(n\delta^4 + n^2\delta^6t^2)\,\mathbb{E}\{e^{tW_n}\}. \tag{4.31}$$
For the second group, consisting of $\mathbb{E}\{(A_i - \mathbb{E}(A_i))^2 e^{tW_n}\}$, $i = 3,\ldots,7$, we use an approach similar to that of (4.19). The proofs of these five parts are very similar, so we present the bound for $\mathbb{E}\{(A_3 - \mathbb{E}(A_3))^2 e^{tW_n}\}$ as a representative; the other four parts are handled in the same way. Applying conditions (2.5) and (2.6), we have
$$\begin{aligned}
\mathbb{E}\big\{ (A_3 - \mathbb{E}\{A_3\})^2 e^{tW_n} \big\}
&\le C\Big| \mathbb{E}\Big\{ \sum_{i\ne j\ne p}\sum_{k\ne l\ne q} b_{ij\pi(i)\pi(j)}b_{ip\pi(i)\pi(p)}\,b_{kl\pi(k)\pi(l)}b_{kq\pi(k)\pi(q)}\, e^{tW_n} \Big\} \Big| \\
&\quad + C\delta^2\Big| \mathbb{E}\Big\{ \sum_{i\ne j\ne s} b_{ij\pi(i)\pi(j)}b_{is\pi(i)\pi(s)}\, e^{tW_n} \Big\} \Big| + C\delta^4 h(t) \\
&\le Cn^2\delta^4 \max_{B_6} \Big| \mathbb{E}\big\{ \Psi_t(W_n) \mid \pi(i)=i',\ \pi(j)=j',\ \pi(k)=k',\ \pi(l)=l',\ \pi(p)=p',\ \pi(q)=q' \big\} - h(t) \Big| + C\delta^4 h(t) \\
&\quad + Cn\delta^4 \max_{B_6} \Big| \mathbb{E}\big\{ \Psi_t(W_n) \mid \pi(i)=i',\ \pi(j)=j',\ \pi(k)=k',\ \pi(l)=l',\ \pi(p)=p' \big\} \\
&\qquad\qquad + \mathbb{E}\big\{ \Psi_t(W_n) \mid \pi(i)=i',\ \pi(j)=j',\ \pi(k)=k',\ \pi(l)=l' \big\} + \mathbb{E}\big\{ \Psi_t(W_n) \mid \pi(i)=i',\ \pi(j)=j',\ \pi(k)=k' \big\} \Big|,
\end{aligned} \tag{4.32}$$
where $B_6$ denotes the set of indices such that $i,j,k,l,p,q\in[n]$ are all distinct and $i',j',k',l',p',q'\in[n]$ are also all distinct. We define $\sigma_{ijklp\,i'j'k'l'p'} := \mathcal{P}_{pp'}\circ\mathcal{P}_{klk'l'}\circ\mathcal{P}_{iji'j'}\circ\sigma$ and
$$S_{\sigma_{ijklp\,i'j'k'l'p'}} = \sum_{i=1}^{n} a_{i\,\sigma_{ijklp\,i'j'k'l'p'}(i)} + \sum_{i\ne j} b_{ij\,\sigma_{ijklp\,i'j'k'l'p'}(i)\,\sigma_{ijklp\,i'j'k'l'p'}(j)},$$
where $\sigma$ is a random permutation chosen uniformly from $S_n$ and independent of $\pi$. Before proceeding to the next step, we introduce a key auxiliary lemma that will be critical for identifying the conditional distribution of $W_n$.
Lemma 4.2. Let $\pi$ and $\sigma$ be two independent random permutations, chosen uniformly from $S_n$. Suppose $i\ne j$ and $k\ne l$ are elements of $\{1,\ldots,n\}$ and denote by $\tau_{i,j}$ the transposition of $i$ and $j$. Then
$$\sigma_{ik} := \mathcal{P}_{ik}\sigma = \begin{cases} \sigma, & \sigma(i) = k, \\ \sigma\circ\tau_{i,\sigma^{-1}(k)}, & \sigma(i)\ne k, \end{cases} \tag{4.33}$$
$$\sigma_{ijkl} := \mathcal{P}_{ijkl}\sigma = \begin{cases}
\sigma\circ\tau_{i,\sigma^{-1}(k)}\circ\tau_{j,\sigma^{-1}(k)}, & \sigma(i)=l,\ \sigma(j)\ne k, \\
\sigma\circ\tau_{j,\sigma^{-1}(l)}\circ\tau_{i,\sigma^{-1}(l)}, & \sigma(i)\ne l,\ \sigma(j)=k, \\
\sigma\circ\tau_{i,\sigma^{-1}(k)}\circ\tau_{j,\sigma^{-1}(l)}\circ\tau_{i,j}, & \sigma(i)=l,\ \sigma(j)=k, \\
\sigma\circ\tau_{i,\sigma^{-1}(k)}\circ\tau_{j,\sigma^{-1}(l)}, & \text{otherwise},
\end{cases} \tag{4.34}$$
are two permutations that satisfy
$$\mathcal{L}(\sigma_{ik}) \overset{d}{=} \mathcal{L}(\pi \mid \pi(i) = k), \tag{4.35}$$
and
$$\mathcal{L}(\sigma_{ijkl}) \overset{d}{=} \mathcal{L}(\pi \mid \pi(i) = k,\ \pi(j) = l). \tag{4.36}$$
Moreover, for any elements $p\ne q$, $u\ne v$ of $[n]$ satisfying $p,q\notin\{i,j\}$ and $u,v\notin\{k,l\}$, we have
$$\begin{aligned}
\mathcal{L}(\mathcal{P}_{pq}\sigma_{ijkl}) &\overset{d}{=} \mathcal{L}(\pi \mid \pi(i)=k,\ \pi(j)=l,\ \pi(p)=q), \\
\mathcal{L}(\mathcal{P}_{pquv}\sigma_{ijkl}) &\overset{d}{=} \mathcal{L}(\pi \mid \pi(i)=k,\ \pi(j)=l,\ \pi(p)=u,\ \pi(q)=v).
\end{aligned} \tag{4.37}$$

The proof of Lemma 4.2 can be found in Section 5. Using Lemma 4.2, we obtain
$$\mathcal{L}\big(S_{\sigma_{ijklp\,i'j'k'l'p'}}\big) = \mathcal{L}\big(W_n \mid \pi(i)=i',\ \pi(j)=j',\ \pi(k)=k',\ \pi(l)=l',\ \pi(p)=p'\big).$$
By the definition of $S_{\sigma_{ijklp\,i'j'k'l'p'}}$ and condition (2.6), we notice that $|S_{\sigma_{ijklp\,i'j'k'l'p'}} - W_n| \le C\delta$. Therefore we deduce that
$$\mathbb{E}\big\{ \Psi_t(W_n) \mid \pi(i)=i',\ \pi(j)=j',\ \pi(k)=k',\ \pi(l)=l',\ \pi(p)=p' \big\} = \mathbb{E}\big\{ \Psi_t\big(S_{\sigma_{ijklp\,i'j'k'l'p'}}\big) \big\} \le C h(t).$$
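The constructions (4.33) and (4.34) can be checked by brute force for small $n$. The following is a minimal sketch, not part of the paper's proof: it enumerates all $\sigma \in S_n$ for $n = 4$ (with indices shifted to start at 0), applies the two maps, and counts how often each output occurs. The helper names `transposition`, `compose`, `force_one`, `force_two` and `image_counts` are hypothetical, and the composition convention $(\sigma\circ\tau)(x) = \sigma(\tau(x))$ is an assumption. If the maps push the uniform law on $S_n$ forward to the conditional laws in (4.35) and (4.36), every admissible permutation must be hit equally often.

```python
from collections import Counter
from itertools import permutations


def transposition(n, a, b):
    """Return the transposition of a and b on {0, ..., n-1} as a tuple."""
    t = list(range(n))
    t[a], t[b] = t[b], t[a]
    return tuple(t)


def compose(s, t):
    """Return the composition x -> s[t[x]] (apply t first, then s)."""
    return tuple(s[t[x]] for x in range(len(s)))


def force_one(sigma, i, k):
    """Map (4.33): sigma composed with the transposition of i and sigma^{-1}(k)."""
    n = len(sigma)
    return compose(sigma, transposition(n, i, sigma.index(k)))


def force_two(sigma, i, j, k, l):
    """Map (4.34): the four cases, written in the order they appear in the lemma."""
    n = len(sigma)
    a, b = sigma.index(k), sigma.index(l)  # sigma^{-1}(k), sigma^{-1}(l)
    if b == i and a != j:
        g = compose(transposition(n, i, a), transposition(n, j, a))
    elif b != i and a == j:
        g = compose(transposition(n, j, b), transposition(n, i, b))
    elif b == i and a == j:
        g = compose(compose(transposition(n, i, a), transposition(n, j, b)),
                    transposition(n, i, j))
    else:
        g = compose(transposition(n, i, a), transposition(n, j, b))
    return compose(sigma, g)


def image_counts(n, mapper):
    """Count how often each permutation is produced from a uniform sigma."""
    return Counter(mapper(s) for s in permutations(range(n)))


n = 4
one = image_counts(n, lambda s: force_one(s, 0, 3))
two = image_counts(n, lambda s: force_two(s, 0, 2, 3, 1))
print(len(one), set(one.values()))
print(len(two), set(two.values()))
```

Each of the $(n-1)!$ permutations with $\pi(0)=3$ should appear $n$ times in `one`, and each of the $(n-2)!$ permutations with $\pi(0)=3$, $\pi(2)=1$ should appear $n(n-1)$ times in `two`, which is what uniformity of the conditional law requires.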
By the same argument, the other two conditional expectations in the last line of (4.32) can be bounded in the same way, leading to
$$\mathbb{E}\big\{ (A_3 - \mathbb{E}(A_3))^2 e^{tW_n} \big\} \le Cn^2\delta^4 \max_{B_6} \Big| \mathbb{E}\big\{ \Psi_t(W_n) \mid \pi(i)=i',\ \pi(j)=j',\ \pi(k)=k',\ \pi(l)=l',\ \pi(p)=p',\ \pi(q)=q' \big\} - h(t) \Big| + Cn\delta^4 h(t).$$
Following (5.27), we deduce that
$$\mathbb{E}\big\{ (A_3 - \mathbb{E}(A_3))^2 e^{tW_n} \big\} \le C(n\delta^4 + n^2\delta^6t^2)\,h(t). \tag{4.38}$$
Therefore, by the same argument it follows that
$$\mathbb{E}\big\{ (A_i - \mathbb{E}(A_i))^2 e^{tW_n} \big\} \le C(n\delta^4 + n^2\delta^6t^2)\,h(t), \qquad i = 4,\ldots,7. \tag{4.39}$$
Combining (4.31), (4.38) and (4.39), we obtain
$$\mathbb{E}\big\{ (n\mathbb{E}\{H_2Q_3 \mid \pi\} - n\mathbb{E}\{H_2Q_3\})^2 e^{tW_n} \big\} \le C(n\delta^4 + n^2\delta^6t^2)\,h(t).$$
By the same method, we obtain the upper bounds of the other two terms in (4.12),
$$\begin{aligned}
\mathbb{E}\big\{ (n\mathbb{E}\{H_2Q_2 \mid \pi\} - n\mathbb{E}\{H_2Q_2\})^2 e^{tW_n} \big\} &\le C(n\delta^4 + n^2\delta^6t^2)\,h(t), \\
\mathbb{E}\big\{ (n\mathbb{E}\{H_2Q_4 \mid \pi\} - n\mathbb{E}\{H_2Q_4\})^2 e^{tW_n} \big\} &\le C(n\delta^4 + n^2\delta^6t^2)\,h(t).
\end{aligned}$$
Combining the above bounds for the eight terms in (4.12), we conclude that
$$\mathbb{E}\Big\{ \Big( \frac{1}{2\lambda}\mathbb{E}\{D\Delta \mid W_n\} - \frac{1}{2\lambda}\mathbb{E}\{D\Delta\} \Big)^2 e^{tW_n} \Big\} \le C(n\delta^4 + n^2\delta^6t^2)\,\mathbb{E}\{e^{tW_n}\}. \tag{4.40}$$
Substituting (4.40) into (4.11), we obtain
$$\mathbb{E}\Big\{ \Big| 1 - \frac{1}{2\lambda}\mathbb{E}\{D\Delta \mid W_n\} \Big| e^{tW_n} \Big\} \le C\big( \sqrt{n}\,\delta^2 + n\delta^3 + n\delta^3 t + 1/\sqrt{n} \big)\,\mathbb{E}\{e^{tW_n}\}. \tag{4.41}$$
Finally we consider the third condition (A3). For $\mathbb{E}\{R^2e^{tW_n}\}$, by Cauchy's inequality and condition (2.6), we deduce that
$$\begin{aligned}
\mathbb{E}\{R^2e^{tW_n}\} &= \mathbb{E}\Big\{ \Big( \frac{1}{n-1}\sum_{i=1}^{n} a_{i\pi(i)} - \frac{1}{n-1}\sum_{i=1}^{n} b_{ii\pi(i)\pi(i)} \Big)^2 e^{tW_n} \Big\} \\
&\le \frac{2}{(n-1)^2}\Big( \mathbb{E}\Big\{ \Big( \sum_{i=1}^{n} a_{i\pi(i)} \Big)^2 e^{tW_n} \Big\} + \mathbb{E}\Big\{ \Big( \sum_{i=1}^{n} b_{ii\pi(i)\pi(i)} \Big)^2 e^{tW_n} \Big\} \Big) \le 6\delta^2\,\mathbb{E}\{e^{tW_n}\}.
\end{aligned} \tag{4.42}$$
For condition (A3), combining (4.42) with Hölder's inequality, it follows that
$$\mathbb{E}\{|R|e^{tW_n}\} \le \sqrt{ \mathbb{E}\{R^2e^{tW_n}\}\,\mathbb{E}\{e^{tW_n}\} } \le \sqrt{6}\,\delta\,\mathbb{E}\{e^{tW_n}\}. \tag{4.43}$$
Recalling Theorem 4.1 and combining (4.41) and (4.43), we complete the proof of Theorem 2.1.

4.4. Proof of Theorem 4.1

In this subsection, we develop the proof of Theorem 4.1. Proposition 4.1 establishes a bound for $\mathbb{E}\{e^{tW}\}$ via Stein's method. In Proposition 4.2, we derive a more general moderate deviation theorem for bounded exchangeable pairs. Finally, Theorem 4.1 is obtained by combining Proposition 4.1 with Proposition 4.2.

Proposition 4.1. Under the assumptions of Theorem 4.1, for $0 \le t \le \min(\tau, 1/\delta)$, we have
$$\mathbb{E}\{e^{tW}\} \le (1 + 9\delta)\exp\Big( \frac{t^2}{2}\big( 1 + t\delta + 2\delta_1(t) \big) + 3t\delta_2(t) \Big). \tag{4.44}$$

Proof of Proposition 4.1. For $t = 0$, (4.44) holds trivially, so we only need to consider $0 < t \le \min(\tau, 1/\delta)$. Let $h(t) = \mathbb{E}\{e^{tW}\}$. In order to bound $h(t)$, we find an upper bound for $h'(t)$ using Stein's method, and then obtain a bound for $(\log h(t))'$. This technique was first considered by . . . By conditions (A1) and (A3), $\mathbb{E}\{e^{tW}\} < \infty$ and $\mathbb{E}\{|R|e^{tW}\} < \infty$. Since condition (D1), $\mathbb{E}\{D \mid W\} = \lambda(W + R)$, holds, we have $\mathbb{E}\{|W|e^{tW}\} < \infty$. Under condition (D1), by antisymmetry, it follows that $\mathbb{E}\{D(f(W) + f(W'))\} = 0$ for any absolutely continuous function $f:\mathbb{R}\to\mathbb{R}$ satisfying $\mathbb{E}\{|f(W)|\} < \infty$. We obtain
$$\begin{aligned}
0 = \mathbb{E}\{D(f(W) + f(W'))\} &= 2\,\mathbb{E}\{Df(W)\} - \mathbb{E}\{D(f(W) - f(W'))\} \\
&= 2\lambda\,\mathbb{E}\{(W + R)f(W)\} - \mathbb{E}\Big\{ D\int_{-\Delta}^{0} f'(W + u)\,du \Big\}.
\end{aligned}$$
Then
$$\mathbb{E}(Wf(W)) = \frac{1}{2\lambda}\,\mathbb{E}\Big\{ D\int_{-\Delta}^{0} f'(W + u)\,du \Big\} - \mathbb{E}\{Rf(W)\}. \tag{4.45}$$
Applying (4.45) with $f(w) = e^{tw}$, we have
$$\begin{aligned}
h'(t) = \mathbb{E}\{We^{tW}\} &= \frac{t}{2\lambda}\,\mathbb{E}\Big\{ D\int_{-\Delta}^{0} e^{t(W+u)}\,du \Big\} - \mathbb{E}\{Re^{tW}\} \\
&\le t\,\mathbb{E}\{e^{tW}\} + t\,\mathbb{E}\Big\{ \Big| \frac{1}{2\lambda}\mathbb{E}\{D\Delta \mid W\} - 1 \Big| e^{tW} \Big\} + \frac{t}{2\lambda}\Big| \mathbb{E}\Big\{ D\int_{-\Delta}^{0} \big( e^{t(W+u)} - e^{tW} \big)\,du \Big\} \Big| + \mathbb{E}\{|R|e^{tW}\} \\
&\le t\,\mathbb{E}\{e^{tW}\} + t\,\mathbb{E}\Big\{ \Big| \frac{1}{2\lambda}\mathbb{E}\{D\Delta \mid W\} - 1 \Big| e^{tW} \Big\} + t\,\mathbb{E}\Big\{ \Big| \frac{1}{2\lambda}\mathbb{E}\{D^*\Delta \mid W\} \Big| e^{tW} \Big\} + \mathbb{E}\{|R|e^{tW}\},
\end{aligned} \tag{4.46}$$
where $D^* := D^*(X, X')$ is any random variable such that $D^*(X, X') = D^*(X', X)$ and $D^* \ge |D|$. The last line uses Lemma 4.2 in Zhang (2023), and by the boundedness condition we choose
$$D^* = \delta, \tag{4.47}$$
which is a constant. Then by conditions (A2), (A3) in Theorem 4.1 and condition (D1) we have
$$h'(t) \le t\,h(t) + t\delta_1(t)h(t) + \frac{t\delta}{2}\,\mathbb{E}\{|W|e^{tW}\} + \Big( 1 + \frac{t\delta}{2} \Big)\delta_2(t)h(t).$$
Since $|W| = W + 2W^-$ with $W^- = \max(-W, 0)$, and $xe^{-tx} \le \frac{e}{t}$ for $x \ge 0$ and $t > 0$, we have
$$\mathbb{E}\{|W|e^{tW}\} = \mathbb{E}\{We^{tW}\} + 2\,\mathbb{E}\{W^-e^{-tW^-}\} \le \mathbb{E}\{We^{tW}\} + \frac{2e}{t}. \tag{4.48}$$
Then we have
$$\begin{aligned}
h'(t) &\le t\,h(t) + t\delta_1(t)h(t) + \frac{t\delta}{2}\Big( h'(t) + \frac{2e}{t} \Big) + \Big( 1 + \frac{t\delta}{2} \Big)\delta_2(t)h(t) \\
&\le \frac{2}{2 - t\delta}\Big[ t\big( 1 + \delta_1(t) \big)h(t) + \Big( 1 + \frac{t\delta}{2} \Big)\delta_2(t)h(t) + e\delta \Big] \\
&\le \Big[ t\Big( 1 + \frac{t\delta}{2 - t\delta} + 2\delta_1(t) \Big) + (2 + t\delta)\delta_2(t) \Big]h(t) + 2e\delta := g(t)h(t) + 2e\delta.
\end{aligned} \tag{4.49}$$
Let $\mu(t) = \exp\big( -\int_0^t g(s)\,ds \big)$, so that $\mu(0) = 1$. Multiplying both sides of the above inequality by $\mu(t)$, we have
$$\mu(t)h'(t) - g(t)\mu(t)h(t) \le 2e\delta\,\mu(t), \qquad\text{that is,}\qquad \frac{d}{dt}\big( \mu(t)h(t) \big) \le 2e\delta\,\mu(t).$$
Integrating both sides from $0$ to $t$, we have
$$\mu(t)h(t) - \mu(0)h(0) \le 2e\delta\int_0^t \mu(s)\,ds.$$
Noting that $h(0) = 1$, we have
$$h(t) \le \mu(t)^{-1}\Big( 1 + 2e\delta\int_0^t \mu(s)\,ds \Big).$$
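The integrating-factor step above turns a differential inequality of the form $h' \le g\,h + c$ into the explicit bound $h(t) \le \mu(t)^{-1}(1 + c\int_0^t \mu)$, with equality when the differential inequality is an equality. The following is a minimal numerical sketch of that equality case, under stated assumptions: the coefficient `g(s) = a*s + b` and the constants `a`, `b`, `c` are hypothetical stand-ins, not the $g$ and $2e\delta$ of (4.49), and the function names are illustrative only.

```python
import math


def g(s, a=1.1, b=0.05):
    """Hypothetical coefficient g(s) = a*s + b (a stand-in, not the g of (4.49))."""
    return a * s + b


def solve_ode(t_end, c, steps=2000):
    """Integrate h' = g(s) h + c, h(0) = 1, with the classical RK4 scheme."""
    h, s = 1.0, 0.0
    dt = t_end / steps
    for _ in range(steps):
        k1 = g(s) * h + c
        k2 = g(s + dt / 2) * (h + dt * k1 / 2) + c
        k3 = g(s + dt / 2) * (h + dt * k2 / 2) + c
        k4 = g(s + dt) * (h + dt * k3) + c
        h += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        s += dt
    return h


def closed_form(t_end, c, a=1.1, b=0.05, steps=20000):
    """Evaluate mu(t)^{-1} (1 + c * int_0^t mu(s) ds), mu(s) = exp(-int_0^s g)."""
    def mu(s):
        return math.exp(-(a * s * s / 2 + b * s))
    dt = t_end / steps
    # midpoint rule for the integral of mu over [0, t_end]
    integral = sum(mu((m + 0.5) * dt) for m in range(steps)) * dt
    return (1 + c * integral) / mu(t_end)


numeric = solve_ode(1.5, 0.2)
formula = closed_form(1.5, 0.2)
print(numeric, formula)
```

The two printed values should agree to high accuracy; any $h$ satisfying the inequality rather than the equality lies below this extremal solution, which is how the bound on $h(t)$ is used in the proof.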
Since $\delta_1(t)$ and $\delta_2(t)$ are nondecreasing functions, we have
$$\begin{aligned}
\mu(t) = \exp\Big( -\int_0^t g(s)\,ds \Big) &= \exp\Big( -\int_0^t \Big[ s\Big( 1 + \frac{s\delta}{2 - s\delta} + 2\delta_1(s) \Big) + (2 + s\delta)\delta_2(s) \Big]\,ds \Big) \\
&\ge \exp\Big( -\Big[ \frac{t^2}{2} + \frac{t^3\delta}{3} + t^2\delta_1(t) + 2t\delta_2(t) + \frac{t^2\delta}{2}\delta_2(t) \Big] \Big).
\end{aligned}$$
Since $\mu(t) \le e^{-t^2/2}$, we have
$$\int_0^t \mu(s)\,ds \le \int_0^t e^{-s^2/2}\,ds \le 1 + \int_1^t \frac{1}{s^3}\,ds \le \frac{3}{2},$$
and then
$$h(t) \le \mu(t)^{-1}\Big( 1 + 2e\delta\int_0^t \mu(s)\,ds \Big) \le (1 + 3e\delta)\exp\Big( \frac{t^2}{2}\big( 1 + t\delta + 2\delta_1(t) \big) + 3t\delta_2(t) \Big),$$
which implies the desired result (4.44).

Proposition 4.2. Let $(W, W')$, $\Delta$, $D$ and $D^*$ be defined as in Theorem 4.1. Assume that there exists a constant $\tau_0 > 0$ such that for all $0 \le t \le \tau_0$,

(B1): $\mathbb{E}\big\{ \big| 1 - \frac{1}{2\lambda}\mathbb{E}\{D\Delta \mid W\} \big| e^{tW} \big\} \le \kappa_1(t)e^{t^2/2}$,

(B2): $\mathbb{E}\{|R|e^{tW}\} \le \kappa_2(t)e^{t^2/2}$,

where $\kappa_1(\cdot)$ and $\kappa_2(\cdot)$ are nondecreasing functions satisfying $\kappa_1(\tau_0) < \infty$ and $\kappa_2(\tau_0) < \infty$. Then, for $0 \le z \le \tau_0$,
$$\Big| \frac{\mathbb{P}(W > z)}{1 - \Phi(z)} - 1 \Big| \le 31\Big[ (1 + z^2)\big( \kappa_1(z) + \delta(1 + \delta_3(z) + \kappa_2(z)) \big) + (1 + z)\kappa_2(z) \Big], \tag{4.50}$$
where
$$\delta_3(z) = \big[ z(1 + z\delta + 2\delta_1(z)) + (2 + z\delta)\delta_2(z) \big](1 + 3e\delta)\,e^{\frac{z^2}{2}(z\delta + 2\delta_1(z)) + 3z\delta_2(z)}. \tag{4.51}$$

Proof of Proposition 4.2. Let $z \ge 0$ be a fixed real number, and let $f_z$ be the solution to the Stein equation
$$f'(w) - wf(w) = \mathbf{1}\{w \le z\} - \Phi(z), \tag{4.52}$$
where $\Phi(\cdot)$ is the standard normal distribution function. It is well known that $f_z$ is given by
$$f_z(w) = \begin{cases} \dfrac{\Phi(w)\{1 - \Phi(z)\}}{p(w)}, & w \le z, \\[2mm] \dfrac{\Phi(z)\{1 - \Phi(w)\}}{p(w)}, & w > z, \end{cases} \tag{4.53}$$
where $p(w) = (2\pi)^{-1/2}e^{-w^2/2}$ is the standard normal density function.
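As a sanity check on (4.52) and (4.53), the following minimal sketch evaluates $f_z$ numerically and verifies that it satisfies the Stein equation away from the kink at $w = z$. The function names are hypothetical, and the derivative is approximated by a central difference, so the residual is only expected to vanish up to discretisation error.

```python
import math


def Phi(x):
    """Standard normal distribution function, via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def density(x):
    """Standard normal density p(x)."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)


def stein_solution(w, z):
    """f_z(w) as in (4.53); 1 - Phi(x) is computed as Phi(-x) for accuracy."""
    if w <= z:
        return Phi(w) * Phi(-z) / density(w)
    return Phi(z) * Phi(-w) / density(w)


def stein_residual(w, z, eps=1e-5):
    """f'(w) - w f(w) - (1{w <= z} - Phi(z)), with f' by central difference (w != z)."""
    deriv = (stein_solution(w + eps, z) - stein_solution(w - eps, z)) / (2 * eps)
    indicator = 1.0 if w <= z else 0.0
    return deriv - w * stein_solution(w, z) - (indicator - Phi(z))


z = 1.5
points = [-2.0, -0.5, 0.3, 1.0, 2.0, 3.0]
worst = max(abs(stein_residual(w, z)) for w in points)
print(worst)
```

The residual is evaluated on both sides of $z$, which exercises both branches of (4.53).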
By (4.52) and (4.45), taking $f = f_z$, we have
$$\mathbb{P}(W > z) - \{1 - \Phi(z)\} = \mathbb{E}\{f_z'(W) - Wf_z(W)\} = J_1 + J_2 + J_3, \tag{4.54}$$
where
$$J_1 = \mathbb{E}\Big\{ f_z'(W)\Big( 1 - \frac{1}{2\lambda}\mathbb{E}(D\Delta \mid W) \Big) \Big\}, \quad J_2 = \frac{1}{2\lambda}\,\mathbb{E}\Big\{ D\int_{-\Delta}^{0} \big( f_z'(W + u) - f_z'(W) \big)\,du \Big\}, \quad J_3 = \mathbb{E}\{Rf_z(W)\}.$$
We only give the details for $J_2$, because $J_1$ and $J_3$ can be bounded in a similar way. For $J_2$, observe that $f_z'(w) = wf_z(w) - \mathbf{1}\{w > z\} + \{1 - \Phi(z)\}$, and both $wf_z(w)$ and $\mathbf{1}\{w > z\}$ are increasing functions (see Chen et al. (2010), Lemma 2.3). By Lemma 4.2 in Zhang (2023) and (4.47), we have
$$\begin{aligned}
|J_2| &\le \frac{1}{2\lambda}\Big| \mathbb{E}\Big\{ D\int_{-\Delta}^{0} \{(W + u)f_z(W + u) - Wf_z(W)\}\,du \Big\} \Big| + \frac{1}{2\lambda}\Big| \mathbb{E}\Big\{ D\int_{-\Delta}^{0} \{\mathbf{1}\{W + u > z\} - \mathbf{1}\{W > z\}\}\,du \Big\} \Big| \\
&\le \frac{1}{2\lambda}\,\mathbb{E}\big\{ |\mathbb{E}\{D^*\Delta \mid W\}|\,\big( |Wf_z(W)| + \mathbf{1}\{W > z\} \big) \big\} \\
&= \frac{\delta}{2}\,\mathbb{E}\big\{ |W + R|\,\big( |Wf_z(W)| + \mathbf{1}\{W > z\} \big) \big\} = J_{21} + J_{22},
\end{aligned} \tag{4.55}$$
where
$$J_{21} = \frac{\delta}{2}\,\mathbb{E}\{|W + R| \cdot |Wf_z(W)|\}, \qquad J_{22} = \frac{\delta}{2}\,\mathbb{E}\{|W + R|\,\mathbf{1}\{W > z\}\}.$$
For any $w > 0$, it is well known that $(1 - \Phi(w))/p(w) \le \min\{1/w, \sqrt{2\pi}/2\}$. Then for $w > z$,
$$|f_z(w)| \le \frac{\sqrt{2\pi}}{2}\Phi(z), \qquad |wf_z(w)| \le \Phi(z), \tag{4.56}$$
and by symmetry, for $w < 0$,
$$|f_z(w)| \le \frac{\sqrt{2\pi}}{2}\{1 - \Phi(z)\}, \qquad |wf_z(w)| \le 1 - \Phi(z). \tag{4.57}$$
For $J_{21}$, by (4.53), (4.56) and (4.57), we have
$$J_{21} \le \frac{\delta}{2}\{1 - \Phi(z)\}\,\mathbb{E}\{|W + R|\,\mathbf{1}\{W < 0\}\} + \frac{\sqrt{2\pi}\,\delta}{2}\{1 - \Phi(z)\}\,\mathbb{E}\{|W + R| \cdot We^{W^2/2}\,\mathbf{1}\{0 \le W \le z\}\} + \frac{\delta}{2}\,\mathbb{E}\{|W + R|\,\mathbf{1}\{W > z\}\}. \tag{4.58}$$
Thus, by (4.55) and (4.58),
$$|J_2| \le \frac{\delta}{2}\{1 - \Phi(z)\}\,\mathbb{E}\{|W + R|\,\mathbf{1}\{W < 0\}\} + \frac{\sqrt{2\pi}\,\delta}{2}\{1 - \Phi(z)\}\,\mathbb{E}\{|W + R| \cdot We^{W^2/2}\,\mathbf{1}\{0 \le W \le z\}\} + \delta\,\mathbb{E}\{|W + R|\,\mathbf{1}\{W > z\}\}. \tag{4.59}$$
For the first term of (4.58), without loss of generality we assume that $\mathbb{E}\{W^2\} \le 2$, and we have
$$\frac{\delta}{2}\{1 - \Phi(z)\}\,\mathbb{E}\{|W + R|\,\mathbf{1}\{W < 0\}\} \le \delta\{1 - \Phi(z)\}(1 + \kappa_2(z)). \tag{4.60}$$
For the second term of (4.58), similarly to Lemma 4.3 in Zhang (2023), we have
$$\begin{aligned}
&\mathbb{E}\{|W + R| \cdot We^{W^2/2}\,\mathbf{1}\{0 \le W \le z\}\} \\
&= \sum_{j=1}^{\lfloor z\rfloor} \mathbb{E}\{|W + R| \cdot We^{W^2/2}\,\mathbf{1}\{j - 1 \le W < j\}\} + \mathbb{E}\{|W + R| \cdot We^{W^2/2}\,\mathbf{1}\{\lfloor z\rfloor \le W < z\}\} \\
&= \sum_{j=1}^{\lfloor z\rfloor} \mathbb{E}\{|W + R| \cdot We^{W^2/2 - jW}e^{jW}\,\mathbf{1}\{j - 1 \le W < j\}\} + \mathbb{E}\{|W + R| \cdot We^{W^2/2 - zW}e^{zW}\,\mathbf{1}\{\lfloor z\rfloor \le W < z\}\} \\
&\le \sum_{j=1}^{\lfloor z\rfloor} \mathbb{E}\Big\{ |W + R| \cdot \sup_{t\in(j-1,j)} \big( te^{t^2/2 - jt} \big)\,e^{jW}\,\mathbf{1}\{j - 1 \le W < j\} \Big\} + \mathbb{E}\Big\{ |W + R| \cdot \sup_{t\in(\lfloor z\rfloor,z)} \big( te^{t^2/2 - zt} \big)\,e^{zW}\,\mathbf{1}\{\lfloor z\rfloor \le W < z\} \Big\} \\
&\le \sum_{j=1}^{\lfloor z\rfloor} je^{\frac{(j-1)^2}{2} - j(j-1)}\,\mathbb{E}\{|W + R| \cdot e^{jW}\,\mathbf{1}\{j - 1 \le W < j\}\} + ze^{\lfloor z\rfloor^2/2 - z\lfloor z\rfloor}\,\mathbb{E}\{|W + R| \cdot e^{zW}\,\mathbf{1}\{\lfloor z\rfloor \le W < z\}\} \\
&\le 2\sum_{j=1}^{\lfloor z\rfloor} je^{-j^2/2}\,\mathbb{E}\{|W| \cdot e^{jW}\,\mathbf{1}\{j - 1 \le W < j\}\} + 2ze^{-z^2/2}\,\mathbb{E}\{|W| \cdot e^{zW}\,\mathbf{1}\{\lfloor z\rfloor \le W < z\}\} \\
&\quad + 2\sum_{j=1}^{\lfloor z\rfloor} je^{-j^2/2}\,\mathbb{E}\{|R| \cdot e^{jW}\,\mathbf{1}\{j - 1 \le W < j\}\} + 2ze^{-z^2/2}\,\mathbb{E}\{|R| \cdot e^{zW}\,\mathbf{1}\{\lfloor z\rfloor \le W < z\}\} \\
&\le 2\sum_{j=1}^{\lfloor z\rfloor} je^{-j^2/2}\Big( \mathbb{E}\{We^{jW}\} + \frac{2e}{j} \Big) + 2ze^{-z^2/2}\Big( \mathbb{E}\{We^{zW}\} + \frac{2e}{z} \Big) + 2\kappa_2(z)\Big( \sum_{j=1}^{\lfloor z\rfloor} j + z \Big) \\
&\le 2\sum_{j=1}^{\lfloor z\rfloor} je^{-j^2/2}\Big( \mathbb{E}\{We^{jW}\} + \frac{2e}{j} \Big) + 2ze^{-z^2/2}\Big( \mathbb{E}\{We^{zW}\} + \frac{2e}{z} \Big) + 4(1 + z^2)\kappa_2(z).
\end{aligned} \tag{4.61}$$
Then by (4.44), (4.49) and (4.61), we have
$$\begin{aligned}
&\frac{\sqrt{2\pi}\,\delta}{2}\{1 - \Phi(z)\}\,\mathbb{E}\{|W + R| \cdot We^{W^2/2}\,\mathbf{1}\{0 \le W \le z\}\} \\
&\le 2\sqrt{2\pi}\,\delta\{1 - \Phi(z)\}(1 + z^2)\big[ z(1 + z\delta + 2\delta_1(z)) + (2 + z\delta)\delta_2(z) \big](1 + 3e\delta)\,e^{\frac{z^2}{2}(z\delta + 2\delta_1(z)) + 3z\delta_2(z)} \\
&\quad + 2\sqrt{2\pi}\,\delta\{1 - \Phi(z)\}(1 + z^2)\kappa_2(z) \\
&= 2\sqrt{2\pi}\,\delta\{1 - \Phi(z)\}(1 + z^2)(\kappa_2(z) + \delta_3(z)).
\end{aligned} \tag{4.62}$$
For the last term of (4.58), by Markov's inequality, we have for $z > 1$,
$$\begin{aligned}
\delta\,\mathbb{E}\{|W + R|\,\mathbf{1}\{W > z\}\} &\le \delta\,\mathbb{E}\{|W|e^{zW}\}e^{-z^2} + \delta\,\mathbb{E}\{|R|e^{zW}\}e^{-z^2} \\
&\le \delta\Big( \mathbb{E}\{We^{zW}\} + \frac{2e}{z} \Big)e^{-z^2} + \delta\kappa_2(z)e^{z^2/2}e^{-z^2} \\
&\le 2\delta(1 + \delta + \delta_3(z) + \kappa_2(z))\,e^{-z^2/2}.
\end{aligned}$$
Since it is well known that for $z > 0$,
$$e^{-z^2/2} \le \sqrt{2\pi}\,(1 + z)\{1 - \Phi(z)\} \le \frac{3\sqrt{2\pi}}{2}(1 + z^2)\{1 - \Phi(z)\},$$
we deduce that
$$\delta\,\mathbb{E}\{|W + R|\,\mathbf{1}\{W > z\}\} \le 9\sqrt{2\pi}\,\delta\{1 - \Phi(z)\}(1 + z^2)(1 + \delta + \delta_3(z) + \kappa_2(z)). \tag{4.63}$$
Therefore, combining (4.60), (4.62) and (4.63), for $z > 1$ we have
$$|J_2| \le 12\sqrt{2\pi}\,\delta\{1 - \Phi(z)\}(1 + z^2)(1 + \delta_3(z) + \kappa_2(z)). \tag{4.64}$$
Similarly to (4.58), by dividing $(-\infty, \infty)$ into three parts and analyzing each part carefully, it also follows that
$$|J_1| \le 20\{1 - \Phi(z)\}(1 + z^2)\kappa_1(z), \qquad |J_3| \le 20\{1 - \Phi(z)\}(1 + z)\kappa_2(z).$$
Together with (4.54), this completes the proof for $z > 1$. For $0 \le z \le 1$, $1 - \Phi(z)$ is bounded away from zero, so we can directly use the Berry-Esseen bound in Zhang (2022) to complete the proof.

Proof of Theorem 4.1. By Proposition 4.1, we have $\mathbb{E}\{e^{tW}\} \le (1 + 9\delta)e^{\theta}e^{t^2/2}$ for $0 \le t \le \tau_0(\theta)$. By conditions (A1)-(A3), conditions (B1) and (B2) are satisfied with $\tau_0 = \tau_0(\theta)$, and
$$\kappa_1(t) = (1 + 9\delta)\delta_1(t)e^{\theta}, \qquad \kappa_2(t) = (1 + 9\delta)\delta_2(t)e^{\theta}.$$
Theorem 4.1 then follows from Proposition 4.2.

5. Proof of other results

5.1. Proof of Theorem 3.1

Proof of Theorem 3.1. By Chao et al. (1996), we know that Chatterjee's rank correlation coefficient satisfies
$$\xi_n = (\Gamma_n - \mathbb{E}\Gamma_n)/(\operatorname{Var}\Gamma_n)^{1/2} = (T_n - \mathbb{E}T_n)/(\operatorname{Var}T_n)^{1/2},$$
where $\Gamma_n$ is the oscillation of a permutation, which is the core part of Chatterjee's rank correlation coefficient, and $T_n = \sum_{i=1}^{n} a_{\pi(i)\pi(i+1)}$, where
$$a_{ij} = (\alpha_{ij} - \alpha_{i\cdot} - \alpha_{\cdot j} + \alpha_{\cdot\cdot})/B(n), \qquad B^2(n) = \sum_{i,j} (\alpha_{ij} - \alpha_{i\cdot} - \alpha_{\cdot j} + \alpha_{\cdot\cdot})^2/(n - 1) = \frac{(n + 1)(2n^2 + 7)}{45},$$
and
$$\alpha_{i\cdot} = \sum_j \alpha_{ij}/n, \qquad \alpha_{\cdot j} = \sum_i \alpha_{ij}/n, \qquad \alpha_{\cdot\cdot} = \sum_{i,j} \alpha_{ij}/n^2.$$
By a direct calculation, we obtain
$$\begin{aligned}
\mathbb{E}T_n &= -\frac{1}{n-1}\sum_i a_{ii}, \\
\operatorname{Var}T_n &= \frac{1}{n-2}\sum_{i,j} a_{ij}^2 - \frac{1}{(n-1)(n-2)}\sum_{i,j} a_{ij}a_{ji} + \frac{1}{(n-1)^2(n-2)}\Big( \sum_i a_{ii} \Big)^2 - \frac{n}{(n-1)(n-2)}\sum_i a_{ii}^2 = 1 + O(n^{-1}).
\end{aligned}$$
So the normalized statistic is defined as
$$W_n = T_n - \mathbb{E}T_n = \sum_i a_{\pi(i)\pi(i+1)} + \sum_i a_{ii}/(n - 1) = \sum_{i,j} \Big( \mathbf{1}\{j = i + 1\}a_{\pi(i)\pi(i+1)} + \frac{a_{ii}}{n(n - 1)} \Big) := \sum_{i,j} \xi(i, j, \pi(i), \pi(j)).$$
By (4.1)-(4.3) we have
$$\begin{aligned}
\xi(i,j,k,\cdot) &= \frac{a_{ii}}{n(n-1)}, & \xi(i,j,\cdot,l) &= \frac{a_{ii}}{n(n-1)}, \\
\xi(i,\cdot,k,l) &= \frac{a_{kl}}{n} + \frac{a_{ii}}{n(n-1)}, & \xi(\cdot,j,k,l) &= \frac{a_{kl}}{n} - \frac{\alpha_{\cdot\cdot}}{n(n-1)B(n)}, \\
\xi(i,j,\cdot,\cdot) &= \frac{a_{ii}}{n(n-1)}, & \xi(i,\cdot,k,\cdot) &= \frac{a_{ii}}{n(n-1)}, \\
\xi(i,\cdot,\cdot,l) &= \frac{a_{ii}}{n(n-1)}, & \xi(\cdot,j,k,\cdot) &= -\frac{\alpha_{\cdot\cdot}}{n(n-1)B(n)}, \\
\xi(\cdot,j,\cdot,l) &= -\frac{\alpha_{\cdot\cdot}}{n(n-1)B(n)}, & \xi(\cdot,\cdot,k,l) &= \frac{a_{kl}}{n} - \frac{\alpha_{\cdot\cdot}}{n(n-1)B(n)}, \\
\xi(i,\cdot,\cdot,\cdot) &= \frac{a_{ii}}{n(n-1)}, & \xi(\cdot,j,\cdot,\cdot) &= -\frac{\alpha_{\cdot\cdot}}{n(n-1)B(n)}, \\
\xi(\cdot,\cdot,k,\cdot) &= -\frac{\alpha_{\cdot\cdot}}{n(n-1)B(n)}, & \xi(\cdot,\cdot,\cdot,l) &= -\frac{\alpha_{\cdot\cdot}}{n(n-1)B(n)}, \\
\xi(\cdot,\cdot,\cdot,\cdot) &= -\frac{\alpha_{\cdot\cdot}}{n(n-1)B(n)}, & \xi^*(i,j,k,l) &= \mathbf{1}\{j = i+1\}a_{kl} - \frac{a_{kl}}{n}, \\
\eta(i,k) &= \frac{a_{ii}}{n-1} - \frac{a_{kk}}{n}, & \eta(i,\cdot) &= \frac{a_{ii}}{n-1} + \frac{\alpha_{\cdot\cdot}}{nB(n)}, \\
\eta(\cdot,k) &= -\frac{\alpha_{\cdot\cdot}}{(n-1)B(n)} - \frac{a_{kk}}{n}, & \eta(\cdot,\cdot) &= -\frac{\alpha_{\cdot\cdot}}{n(n-1)B(n)}, \\
\eta^*(i,k) &= 0, & \alpha_{\cdot\cdot} &= \frac{n^2-1}{3n},
\end{aligned}$$
and
$$W_n = \sum_{i,j}{}' \, \xi^*(i,j,\pi(i),\pi(j)) - n\eta(\cdot,\cdot) = \sum_{i,j}{}' \Big( \mathbf{1}\{j = i+1\}a_{\pi(i)\pi(j)} - \frac{1}{n}a_{\pi(i)\pi(j)} \Big) + \sqrt{\frac{5(n+1)}{n^2(2n^2+7)}}. \tag{5.1}$$
So, if we define
$$Y_n = \sum_{i,j}{}' \Big( \mathbf{1}\{j = i+1\}a_{\pi(i)\pi(j)} - \frac{1}{n}a_{\pi(i)\pi(j)} \Big) := \sum_{i,j}{}' \, b(i,j,\pi(i),\pi(j)), \tag{5.2}$$
we have for $z \ge 0$,
$$\begin{aligned}
|\mathbb{P}(\xi_n > z) - (1 - \Phi(z))| &\le \bigg| \mathbb{P}\bigg( Y_n > z(1 + C_1/n) - \sqrt{\frac{5(n+1)}{n^2(2n^2+7)}} \bigg) - \mathbb{P}\bigg( Z > z(1 + C_1/n) - \sqrt{\frac{5(n+1)}{n^2(2n^2+7)}} \bigg) \bigg| \\
&\quad + \bigg| \mathbb{P}\bigg( Z > z(1 + C_1/n) - \sqrt{\frac{5(n+1)}{n^2(2n^2+7)}} \bigg) - \mathbb{P}(Z > z) \bigg| := J_1 + J_2,
\end{aligned}$$
where $Z$ is a standard normal random variable. Since
$$\max_{i,k} |a_{ik}| = a_{1n} = a_{n1} = \frac{\sqrt{5}\,(n^2-1)}{n\sqrt{(n+1)(2n^2+7)}} \le \sqrt{\frac{5}{n}},$$
we can easily verify that $\{b(i,j,k,l)\}_{i,j,k,l\in[n]}$ satisfies condition (2.5) and the boundedness condition (2.6), where $\delta = C/\sqrt{n}$.
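The closed forms $\alpha_{\cdot\cdot} = (n^2-1)/(3n)$ and $B^2(n) = (n+1)(2n^2+7)/45$ quoted in this proof, and the constant shift $\sqrt{5(n+1)/(n^2(2n^2+7))} = -n\eta(\cdot,\cdot)$ in (5.1), can be checked exactly for small $n$. The sketch below assumes $\alpha_{ij} = |i - j|$; this choice is not stated in this section and is only inferred from the value of $\alpha_{\cdot\cdot}$, so it should be read as an assumption. The function name `centering_constants` is hypothetical.

```python
import math
from fractions import Fraction


def centering_constants(n):
    """Return (alpha_dotdot, B^2(n)) exactly, assuming alpha_ij = |i - j|."""
    alpha = [[Fraction(abs(i - j)) for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in alpha]  # alpha_{i.}; equals alpha_{.i} by symmetry
    grand = sum(row) / n               # alpha_{..}
    centred_sq = sum((alpha[i][j] - row[i] - row[j] + grand) ** 2
                     for i in range(n) for j in range(n))
    return grand, centred_sq / (n - 1)


results = {n: centering_constants(n) for n in range(2, 9)}
for n, (grand, b2) in results.items():
    # shift = alpha_dotdot / ((n - 1) B(n)), i.e. -n * eta(., .)
    shift = float(grand) / ((n - 1) * math.sqrt(b2))
    print(n, grand, b2, shift)
```

Exact rational arithmetic avoids any rounding question in the comparison of $\alpha_{\cdot\cdot}$ and $B^2(n)$ with their closed forms; only the square root in the shift is evaluated in floating point.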
B y Theorem 2.1 , w e ha v e for 0 ≤ 𝑧 ≤ 𝑛 1 / 6 , 𝐽 1 = 𝑂 ( 1 ) ( 1 − Φ ( 𝑧 ) ) ( 1 + 𝑧 3 ) √ 𝑛 . (5.3) since f or 𝑧 > 1 , we hav e 𝜙 ( 𝑧 ) ≤ 2 𝑧 ( 1 − Φ ( 𝑧 ) ) and for 0 ≤ 𝑧 ≤ 1 ,w e hav e 𝜙 ( 𝑧 ) ≤ 2 ( 1 − Φ ( 𝑧 ) ) , by t he mean v alue theorem, w e ha ve for 0 ≤ 𝑧 ≤ 𝑛 1 / 6 , 𝐽 2 = 𝑂 ( 1 ) ( 1 − Φ ( 𝑧 ) ) ( 1 + 𝑧 3 ) √ 𝑛 . (5.4) then we combine 𝐽 1 and 𝐽 2 to comple te the proof. 5.2. Proof of Theorem 3.2 Proof of Theorem 3.2 . Since 𝐷 =  𝑖 , 𝑗 ( 1 { 𝑖 < 𝑗 , 𝜋 ( 𝑖 ) − 1 = 𝜋 ( 𝑗 ) } − 1 { 𝑖 < 𝑗 , 𝜋 ( 𝑖 ) + 1 = 𝜋 ( 𝑗 ) } ) = 2 Des ( 𝜋 − 1 ) − ( 𝑛 − 1 ) : =  𝑖 , 𝑗 𝜉 ( 𝑖 , 𝑗 , 𝜋 ( 𝑖 ) , 𝜋 ( 𝑗 ) ) , where 𝜉 ( 𝑖, 𝑗 , 𝑘 , 𝑙 ) = 1 { 𝑖 < 𝑗 , 𝑘 − 1 = 𝑙 } − 1 { 𝑖 < 𝑗 , 𝑘 + 1 = 𝑙 } , b y ( 4.1 )-( 4.3 ) w e ha v e 𝜉 ( 𝑖 , 𝑗 , 𝑘 , ·) = 1 𝑛 ( 1 { 𝑖 < 𝑗 , 𝑘 = 𝑛 } − 1 { 𝑖 < 𝑗 , 𝑘 = 1 } ) , 𝜉 ( 𝑖 , 𝑗 , · , 𝑙 ) = 1 𝑛 ( 1 { 𝑖 < 𝑗 , 𝑙 = 1 } − 1 { 𝑖 < 𝑗 , 𝑙 = 𝑛 } ) 𝜉 ( 𝑖 , · , 𝑘 , 𝑙 ) = 𝑛 − 𝑖 𝑛 ( 1 { 𝑘 − 1 = 𝑙 } − 1 { 𝑘 + 1 = 𝑙 } ) , 𝜉 ( · , 𝑗 , 𝑘 , 𝑙 ) = 𝑗 − 1 𝑛 ( 1 { 𝑘 − 1 = 𝑙 } − 1 { 𝑘 + 1 = 𝑙 } ) Cramér -t ype moder ate dev iation for double index permutation statistics 27 𝜉 ( 𝑖 , 𝑗 , · , · ) = 0 , 𝜉 ( 𝑖, · , 𝑘 , ·) = 𝑛 − 𝑖 𝑛 2 ( 1 { 𝑘 = 𝑛 } − 1 { 𝑘 = 1 } ) 𝜉 ( 𝑖 , · , · , 𝑙 ) = 𝑛 − 𝑖 𝑛 2 ( 1 { 𝑙 = 1 } − 1 { 𝑙 = 𝑛 } ) , 𝜉 ( · , 𝑗 , 𝑘 , ·) = 𝑗 − 1 𝑛 2 ( 1 { 𝑘 = 𝑛 } − 1 { 𝑘 = 1 } ) 𝜉 ( · , 𝑗 , · , 𝑙 ) = 𝑗 − 1 𝑛 2 ( 1 { 𝑙 = 1 } − 1 { 𝑙 = 𝑛 } ) , 𝜉 ( · , · , 𝑘 , 𝑙 ) = 𝑛 − 1 2 𝑛 ( 1 { 𝑘 − 1 = 𝑙 } − 1 { 𝑘 + 1 = 𝑙 } ) and 𝜉 ( 𝑖 , · , · , · ) = 0 , 𝜉 ( · , 𝑗 , · , · ) = 0 , 𝜉 ( · , · , · , ·) = 0 𝜉 ( · , · , 𝑘 , ·) = 𝑛 − 1 2 𝑛 2 ( 1 { 𝑘 = 𝑛 } − 1 { 𝑘 = 1 } ) , 𝜉 ( · , · , · , 𝑙 ) = 𝑛 − 1 2 𝑛 2 ( 1 { 𝑙 = 1 } − 1 { 𝑙 = 𝑛 } ) then we hav e 𝜉 ∗ ( 𝑖 , 𝑗 , 𝑘 , 𝑙 ) = 1 { 𝑖 < 𝑗 , 𝑘 − 1 = 𝑙 } − 1 { 𝑖 < 𝑗 , 𝑘 + 1 = 𝑙 } − 1 𝑛 ( 1 { 𝑖 < 𝑗 , 𝑘 = 𝑛 } − 1 { 𝑖 < 𝑗 , 𝑘 = 1 } ) − 1 𝑛 ( 1 { 𝑖 < 𝑗 , 𝑙 = 1 } − 1 { 𝑖 < 𝑗 , 𝑙 = 𝑛 } ) − 𝑛 − 1 + 2 ( 𝑗 − 𝑖 ) 2 𝑛 1 { 𝑘 − 1 = 𝑙 } + 𝑛 − 1 + 2 ( 𝑗 − 𝑖 ) 2 𝑛 1 { 𝑘 + 1 = 𝑙 } + 𝑛 − 1 + 2 ( 𝑗 − 𝑖 ) 2 𝑛 2 1 { 𝑘 = 𝑛 } − 𝑛 − 1 + 2 ( 𝑗 − 𝑖 ) 2 𝑛 2 1 { 𝑘 = 1 } + 𝑛 − 1 + 
2 ( 𝑗 − 𝑖 ) 2 𝑛 2 1 { 𝑙 = 1 } − 𝑛 − 1 + 2 ( 𝑗 − 𝑖 ) 2 𝑛 2 1 { 𝑙 = 𝑛 } . and 𝜂 ( 𝑖 , 𝑘 ) = 𝜂 ∗ ( 𝑖 , 𝑘 ) = 𝑛 − 2 𝑖 + 1 𝑛 1 { 𝑘 = 𝑛 } − 𝑛 − 2 𝑖 + 1 𝑛 1 { 𝑘 = 1 } , 𝜂 ( · , 𝑘 ) = 𝜂 ( 𝑖, ·) = 𝜂 ( · , ·) = 0 , 𝜎 2 = 2 ( 𝑛 + 1 ) / 3 , so 𝑊 = 𝐷 − 𝑛 𝜂 ( · , · ) 𝜎 = Des − ( 𝑛 − 1 ) / 2  ( 𝑛 + 1 ) / 6 =  𝑖  6 𝑛 + 1 𝜂 ∗ ( 𝑖 , 𝜋 ( 𝑖 ) ) + ′  𝑖 , 𝑗  6 𝑛 + 1 𝜉 ∗ ( 𝑖 , 𝑗 , 𝜋 ( 𝑖 ) , 𝜋 ( 𝑗 ) ) : =  𝑖 𝑎 ( 𝑖 , 𝜋 ( 𝑖 ) ) + ′  𝑖 , 𝑗 𝑏 ( 𝑖 , 𝑗 , 𝜋 ( 𝑖 ) , 𝜋 ( 𝑗 ) ) . (5.5) and we can easily v er ify that { 𝑎 ( 𝑖, 𝑘 ) } 𝑖 , 𝑘 ∈ [ 𝑛 ] and { 𝑏 ( 𝑖 , 𝑗 , 𝑘 , 𝑙 ) } 𝑖 , 𝑗 , 𝑘 , 𝑙 ∈ [ 𝑛 ] satis f y condition ( 2.4 ), ( 2.5 ) and t he boundedness condition ( 2.6 ) where 𝛿 = 𝐶 √ 𝑛 , t heref ore w e apply Theorem 2.1 to pro v e ( 3.5 ), by a same argument we pro v e ( 3.6 ). 28 5.3. Proof of Theorem 3.3 Proof of Theorem 3.3 . Since the Mann- W hitne y- Wilco x on statistic is defined as 𝐷 =  𝑖 , 𝑗 𝜉 ( 𝑖 , 𝑗 , 𝜋 ( 𝑖 ) , 𝜋 ( 𝑗 ) ) , 𝜉 ( 𝑖 , 𝑗 , 𝜋 ( 𝑖 ) , 𝜋 ( 𝑗 ) ) = 1 { 1 ≤ 𝑖 ≤ 𝑛 1 , 𝑛 1 + 1 ≤ 𝑗 ≤ 𝑛 , 1 ≤ 𝜋 ( 𝑖 ) < 𝜋 ( 𝑗 ) ≤ 𝑛 } , b y ( 4.1 )-( 4.3 ) w e ha v e 𝜉 ( 𝑖 , 𝑗 , 𝑘 , ·) = 𝑛 − 𝑘 𝑛 1 { 1 ≤ 𝑖 ≤ 𝑛 1 , 𝑛 1 + 1 ≤ 𝑗 ≤ 𝑛 } , 𝜉 ( 𝑖 , 𝑗 , · , 𝑙 ) = 𝑙 − 1 𝑛 1 { 1 ≤ 𝑖 ≤ 𝑛 1 , 𝑛 1 + 1 ≤ 𝑗 ≤ 𝑛 } 𝜉 ( 𝑖 , · , 𝑘 , 𝑙 ) = 𝑛 2 𝑛 1 { 1 ≤ 𝑖 ≤ 𝑛 1 , 1 ≤ 𝑘 < 𝑙 ≤ 𝑛 } , 𝜉 ( · , 𝑗 , 𝑘 , 𝑙 ) = 𝑛 1 𝑛 1 { 𝑛 1 + 1 ≤ 𝑗 ≤ 𝑛 , 1 ≤ 𝑘 < 𝑙 ≤ 𝑛 } 𝜉 ( 𝑖 , 𝑗 , · , · ) = 𝑛 − 1 2 𝑛 1 { 1 ≤ 𝑖 ≤ 𝑛 1 , 𝑛 1 + 1 ≤ 𝑗 ≤ 𝑛 } , 𝜉 ( 𝑖 , · , 𝑘 , ·) = 𝑛 2 ( 𝑛 − 𝑘 ) 𝑛 2 1 { 1 ≤ 𝑖 ≤ 𝑛 1 } 𝜉 ( 𝑖 , · , · , 𝑙 ) = 𝑛 2 ( 𝑙 − 1 ) 𝑛 2 1 { 1 ≤ 𝑖 ≤ 𝑛 1 } , 𝜉 ( · , 𝑗 , 𝑘 , ·) = 𝑛 1 ( 𝑛 − 𝑘 ) 𝑛 2 1 { 𝑛 1 + 1 ≤ 𝑗 ≤ 𝑛 } 𝜉 ( · , 𝑗 , · , 𝑙 ) = 𝑛 1 ( 𝑙 − 1 ) 𝑛 2 1 { 𝑛 1 + 1 ≤ 𝑗 ≤ 𝑛 } , 𝜉 ( · , · , 𝑘 , 𝑙 ) = 𝑛 1 𝑛 2 𝑛 2 1 { 1 ≤ 𝑘 < 𝑙 ≤ 𝑛 } 𝜉 ( 𝑖 , · , · , · ) = 𝑛 2 ( 𝑛 − 1 ) 2 𝑛 2 1 { 1 ≤ 𝑖 ≤ 𝑛 1 } , 𝜉 ( · , 𝑗 , · , ·) = 𝑛 1 ( 𝑛 − 1 ) 2 𝑛 2 1 { 𝑛 1 + 1 ≤ 𝑗 ≤ 𝑛 } 𝜉 ( · , · , 𝑘 , ·) = 𝑛 1 𝑛 2 𝑛 3 ( 𝑛 − 𝑘 ) , 𝜉 ( · , · , · , 𝑙 ) = 𝑛 1 𝑛 2 𝑛 3 ( 𝑙 − 1 ) , 𝜉 ( · , · , · , ·) = 𝑛 1 𝑛 2 ( 𝑛 − 1 ) 2 𝑛 3 and 𝜉 ∗ ( 𝑖 , 𝑗 , 𝑘 , 𝑙 ) = 1 { 1 ≤ 𝑖 ≤ 𝑛 1 , 𝑛 1 + 1 ≤ 
𝑗 ≤ 𝑛 , 1 ≤ 𝑘 < 𝑙 ≤ 𝑛 } − 𝑛 − 1 + 2 ( 𝑙 − 𝑘 ) 2 𝑛 1 { 1 ≤ 𝑖 ≤ 𝑛 1 , 𝑛 1 + 1 ≤ 𝑗 ≤ 𝑛 } − 𝑛 2 𝑛 1 { 1 ≤ 𝑖 ≤ 𝑛 1 , 1 ≤ 𝑘 < 𝑙 ≤ 𝑛 } − 𝑛 1 𝑛 1 { 𝑛 1 + 1 ≤ 𝑗 ≤ 𝑛 , 1 ≤ 𝑘 < 𝑙 ≤ 𝑛 } + 𝑛 2 ( 𝑛 − 1 + 2 ( 𝑙 − 𝑘 ) ) 2 𝑛 2 1 { 1 ≤ 𝑖 ≤ 𝑛 1 } + 𝑛 1 ( 𝑛 − 1 + 2 ( 𝑙 − 𝑘 ) ) 2 𝑛 2 1 { 𝑛 1 + 1 ≤ 𝑗 ≤ 𝑛 } + 𝑛 1 𝑛 2 𝑛 2 1 { 1 ≤ 𝑘 < 𝑙 ≤ 𝑛 } − 𝑛 1 𝑛 2 ( 𝑛 − 1 + 2 ( 𝑙 − 𝑘 ) ) 2 𝑛 3 . then we hav e 𝜂 ( 𝑖 , 𝑘 ) = 1 { 1 ≤ 𝑖 ≤ 𝑛 1 }  𝑛 2 ( 𝑛 − 1 ) 2 𝑛 2 + 𝑛 2 ( 𝑛 − 𝑘 ) 𝑛  + 1 { 𝑛 1 + 1 ≤ 𝑖 ≤ 𝑛 }  𝑛 1 ( 𝑛 − 1 ) 2 𝑛 2 + 𝑛 1 ( 𝑘 − 1 ) 𝑛  + 𝑛 1 𝑛 2 ( 𝑛 − 1 ) ( 𝑛 + 1 ) 2 𝑛 3 𝜂 ∗ ( 𝑖 , 𝑘 ) = 1 { 1 ≤ 𝑖 ≤ 𝑛 1 }  𝑛 2 ( 𝑛 − 2 𝑘 + 1 ) 2 𝑛  + 1 { 𝑛 1 + 1 ≤ 𝑗 ≤ 𝑛 }  𝑛 1 ( 2 𝑘 − 𝑛 − 1 ) 2 𝑛  𝜂 ( · , ·) = 𝑛 1 𝑛 2 ( 𝑛 − 1 ) ( 𝑛 + 1 ) 2 𝑛 3 , 𝜎 2 = 𝑛 1 𝑛 2 ( 𝑛 + 1 ) 12 Cramér -t ype moder ate dev iation for double index permutation statistics 29 so 𝑊 = 𝐷 − 𝑛 𝜂 ( · , · ) 𝜎 =  𝑖 , 𝑗 𝜉 ( 𝑖 , 𝑗 , 𝜋 ( 𝑖 ) , 𝜋 ( 𝑗 ) ) − 𝑛 1 𝑛 2 ( 𝑛 − 1 ) ( 𝑛 + 1 ) / 2 𝑛 2  𝑛 1 𝑛 2 ( 𝑛 + 1 ) / 12 =  𝑖  12 𝑛 1 𝑛 2 ( 𝑛 + 1 ) 𝜂 ∗ ( 𝑖 , 𝜋 ( 𝑖 ) ) + ′  𝑖 , 𝑗  12 𝑛 1 𝑛 2 ( 𝑛 + 1 ) 𝜉 ∗ ( 𝑖 , 𝑗 , 𝜋 ( 𝑖 ) , 𝜋 ( 𝑗 ) ) : =  𝑖 𝑎 ( 𝑖 , 𝜋 ( 𝑖 ) ) + ′  𝑖 , 𝑗 𝑏 ( 𝑖 , 𝑗 , 𝜋 ( 𝑖 ) , 𝜋 ( 𝑗 ) ) . (5.6) and we can easily v er ify that { 𝑎 ( 𝑖, 𝑘 ) } 𝑖 , 𝑘 ∈ [ 𝑛 ] and { 𝑏 ( 𝑖 , 𝑗 , 𝑘 , 𝑙 ) } 𝑖 , 𝑗 , 𝑘 , 𝑙 ∈ [ 𝑛 ] satis f y condition ( 2.4 ), ( 2.5 ) and t he boundedness condition ( 2.6 ) where 𝛿 = 𝐶 √ 𝑛 , t her ef ore we apply Theorem 2.1 to pro v e ( 3.8 ). Proof of Lemma 4.1 . W e Firs t consider ( 4.14 ), b y a simple calcula tion w e ha v e E         𝑛  𝑖 = 1 𝑎 2 𝑖 𝜋 ( 𝑖 ) − E  𝑛  𝑖 = 1 𝑎 2 𝑖 𝜋 ( 𝑖 )   2 𝑒 𝑡 𝑊 𝑛        = E            𝑖 ≠ 𝑗 𝑎 2 𝑖 𝜋 ( 𝑖 ) 𝑎 2 𝑗 𝜋 ( 𝑗 )    𝑒 𝑡 𝑊 𝑛        − 2 𝑛  𝑖 , 𝑗 𝑎 2 𝑖 𝑗 E   𝑛  𝑘 = 1 𝑎 2 𝑘 𝜋 ( 𝑘 )  𝑒 𝑡 𝑊 𝑛  + 1 𝑛 2     𝑖 , 𝑗 𝑎 2 𝑖 𝑗    2 ℎ ( 𝑡 ) + E   𝑛  𝑖 = 1 𝑎 4 𝑖 𝜋 ( 𝑖 )  𝑒 𝑡 𝑊 𝑛  ≤ 𝐽 1 − 𝐽 2 + 1 𝑛 2     𝑖 , 𝑗 𝑎 2 𝑖 𝑗    2 ℎ ( 𝑡 ) + 𝑛 𝛿 4 ℎ ( 𝑡 ) . 
(5.7)

where
\[
J_1 = \mathbb{E}\Big\{ \Big( \sum_{i\ne j} a^2_{i\pi(i)}\, a^2_{j\pi(j)} \Big) e^{tW_n} \Big\}, \qquad
J_2 = \frac{2}{n} \sum_{i,j} a^2_{ij}\; \mathbb{E}\Big\{ \Big( \sum_{k=1}^{n} a^2_{k\pi(k)} \Big) e^{tW_n} \Big\}.
\]
We first consider $J_1$. For any indices $i, j \in [n]$ with $i \ne j$, let
\[
W_n^{(i,j)} = \sum_{\substack{p=1 \\ p\notin\{i,j\}}}^{n} a_{p\pi(p)} + \sum_{\substack{p\ne q \\ p,q\notin\{i,j\}}} b_{pq\pi(p)\pi(q)}, \qquad V_{ij} = W_n - W_n^{(i,j)}. \tag{5.8}
\]
Applying condition (2.6), we get
\[
|V_{ij}| = \Big| a_{i\pi(i)} + a_{j\pi(j)} + \sum_{\substack{p=1 \\ p\ne i}}^{n} \big( b_{ip\pi(i)\pi(p)} + b_{pi\pi(p)\pi(i)} \big) + \sum_{\substack{p=1 \\ p\notin\{i,j\}}}^{n} \big( b_{jp\pi(j)\pi(p)} + b_{pj\pi(p)\pi(j)} \big) \Big| \le 12\delta. \tag{5.9}
\]
For any indices $i \ne j$, we perform a Taylor expansion of $\Psi_t(W_n)$ around $W_n^{(i,j)}$. It then follows that
\[
\begin{aligned}
J_1 &= \frac{1}{n(n-1)} \sum_{i\ne j}\sum_{k\ne l} a^2_{ik} a^2_{jl}\, \mathbb{E}\big\{ \Psi_t(W_n) \,\big|\, \pi(i)=k,\ \pi(j)=l \big\} \\
&= \frac{1}{n(n-1)} \sum_{i\ne j}\sum_{k\ne l} a^2_{ik} a^2_{jl}\, \mathbb{E}\big\{ \Psi_t(W_n^{(i,j)}) \,\big|\, \pi(i)=k,\ \pi(j)=l \big\} \\
&\quad + \frac{t}{n(n-1)} \sum_{i\ne j}\sum_{k\ne l} a^2_{ik} a^2_{jl}\, \mathbb{E}\big\{ V_{ij}\, \Psi_t(W_n^{(i,j)}) \,\big|\, \pi(i)=k,\ \pi(j)=l \big\} \\
&\quad + \frac{t^2}{n(n-1)} \sum_{i\ne j}\sum_{k\ne l} a^2_{ik} a^2_{jl}\, \mathbb{E}\big\{ V_{ij}^2\, \Psi_t(W_n^{(i,j)} + U V_{ij})(1-U) \,\big|\, \pi(i)=k,\ \pi(j)=l \big\} \\
&=: J_{11} + J_{12} + J_{13},
\end{aligned} \tag{5.10}
\]
where $U$ is a uniform random variable on $[0,1]$, independent of $\pi$. Thus $J_1$ is decomposed into three parts $J_{11}$, $J_{12}$ and $J_{13}$.
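For reference, the second-order expansion behind (5.10) can be written out explicitly. This is only a sketch, and it assumes $\Psi_t(w) = e^{tw}$, which is what the interchange of $e^{tW_n}$ and $\Psi_t(W_n)$ between (5.7) and (5.10) suggests; the symbols $w$, $v$, $u$ are generic placeholders.

```latex
% Taylor's formula with integral remainder for f(w) = e^{tw}, f''(w) = t^2 e^{tw}:
\[
e^{t(w+v)} = e^{tw} + t\,v\,e^{tw} + t^{2} v^{2} \int_{0}^{1} (1-u)\, e^{t(w+uv)}\, du .
\]
% With U uniform on [0,1] and independent of everything else, this reads
\[
\Psi_t(w+v) = \Psi_t(w) + t\,v\,\Psi_t(w) + t^{2} v^{2}\, \mathbb{E}\big\{ (1-U)\,\Psi_t(w+Uv) \big\}.
\]
% Taking w = W_n^{(i,j)} and v = V_{ij} gives the three terms J_{11}, J_{12}, J_{13}.
```

Since $|V_{ij}| \le 12\delta$, the remainder satisfies $\Psi_t(w+Uv) \le e^{12 t\delta}\Psi_t(w)$, which is bounded by a constant multiple of $\Psi_t(w)$ whenever $0 < t < 1/\delta$.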
We first consider $J_{11}$. By adding and subtracting $h(t)$ and using condition (2.6), we obtain
\[
\begin{aligned}
J_{11} &\le \frac{1}{n(n-1)} \sum_{i\ne j}\sum_{k\ne l} a^2_{ik} a^2_{jl}\, h(t) + \frac{1}{n(n-1)} \sum_{i\ne j}\sum_{k\ne l} a^2_{ik} a^2_{jl}\, \Big| \mathbb{E}\big\{ \Psi_t(W_n^{(i,j)}) \,\big|\, \pi(i)=k,\ \pi(j)=l \big\} - h(t) \Big| \\
&\le \frac{1}{n(n-1)} \Big( \sum_{i,j} a^2_{ij} \Big)^{2} h(t) + \frac{\delta^4}{n(n-1)} \sum_{i\ne j}\sum_{k\ne l} \Big| \mathbb{E}\big\{ \Psi_t(W_n^{(i,j)}) \,\big|\, \pi(i)=k,\ \pi(j)=l \big\} - h(t) \Big|.
\end{aligned} \tag{5.11}
\]
Next, to estimate $J_1$, we estimate the absolute difference
\[
\Big| \mathbb{E}\big\{ \Psi_t(W_n^{(i,j)}) \,\big|\, \pi(i)=k,\ \pi(j)=l \big\} - h(t) \Big|
\]
for any fixed indices $i \ne j$, $k \ne l$. Denote by $\tau_{i,j}$ the transposition of $i$ and $j$. We define $\sigma_{ijkl}$ and $S^{(i,j)}_{\sigma_{ijkl}}$ as follows:
\[
\sigma_{ijkl} =
\begin{cases}
\sigma \circ \tau_{i,\sigma^{-1}(k)} \circ \tau_{j,\sigma^{-1}(k)}, & \sigma(i) = l,\ \sigma(j) \ne k, \\
\sigma \circ \tau_{j,\sigma^{-1}(l)} \circ \tau_{i,\sigma^{-1}(l)}, & \sigma(i) \ne l,\ \sigma(j) = k, \\
\sigma \circ \tau_{i,\sigma^{-1}(k)} \circ \tau_{j,\sigma^{-1}(l)} \circ \tau_{i,j}, & \sigma(i) = l,\ \sigma(j) = k, \\
\sigma \circ \tau_{i,\sigma^{-1}(k)} \circ \tau_{j,\sigma^{-1}(l)}, & \text{otherwise},
\end{cases} \tag{5.12}
\]
\[
S^{(i,j)}_{\sigma_{ijkl}} = \sum_{\substack{p=1 \\ p\notin\{i,j\}}}^{n} a_{p\sigma_{ijkl}(p)} + \sum_{\substack{p\ne q \\ p,q\notin\{i,j\}}} b_{pq\sigma_{ijkl}(p)\sigma_{ijkl}(q)}, \tag{5.13}
\]
where $i \ne j$, $k \ne l \in [n]$, and $\sigma$ is a random permutation chosen uniformly from $S_n$ and independent of $\pi$. By Lemma 4.2, we have
\[
\mathcal{L}(\sigma_{ijkl}) \overset{d}{=} \mathcal{L}\big( \pi \,\big|\, \pi(i)=k,\ \pi(j)=l \big). \tag{5.14}
\]
It then follows from the definition of $W_n^{(i,j)}$ in (5.8) and the definition of $S^{(i,j)}_{\sigma_{ijkl}}$ in (5.13) that
\[
\mathcal{L}\big( S^{(i,j)}_{\sigma_{ijkl}} \big) \overset{d}{=} \mathcal{L}\big( W_n^{(i,j)} \,\big|\, \pi(i)=k,\ \pi(j)=l \big). \tag{5.15}
\]
Hence $\mathbb{E}\{ \Psi_t(W_n^{(i,j)}) \mid \pi(i)=k,\ \pi(j)=l \}$ can be replaced by $\mathbb{E}\{ \Psi_t(S^{(i,j)}_{\sigma_{ijkl}}) \}$, and we perform a Taylor expansion of $\Psi_t(w)$ at $T_n = \sum_{i=1}^{n} a_{i\sigma(i)} + \sum_{i\ne j} b_{ij\sigma(i)\sigma(j)}$.
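The distributional identity (5.14) can be sanity-checked by brute force for small $n$. The sketch below is not part of the proof: it enumerates every $\sigma \in S_n$, applies the four-case construction (5.12) with 0-based indices, and counts how often each image occurs. The helper names (`transposition`, `compose`, `sigma_ijkl`, `image_counts`) are illustrative assumptions, and compositions are read right to left, as in (5.12).

```python
from collections import Counter
from itertools import permutations


def transposition(n, a, b):
    """Return the transposition tau_{a,b} on {0, ..., n-1} as a list."""
    t = list(range(n))
    t[a], t[b] = t[b], t[a]
    return t


def compose(*perms):
    """compose(f, g, h)[x] == f[g[h[x]]], i.e. the rightmost factor acts first."""
    out = list(range(len(perms[0])))
    for p in reversed(perms):
        out = [p[x] for x in out]
    return tuple(out)


def sigma_ijkl(sigma, i, j, k, l):
    """Apply the four-case construction (5.12) to the permutation sigma."""
    n = len(sigma)
    inv = [0] * n
    for pos, val in enumerate(sigma):
        inv[val] = pos
    qk, ql = inv[k], inv[l]  # sigma^{-1}(k) and sigma^{-1}(l)
    if sigma[i] == l and sigma[j] != k:
        return compose(sigma, transposition(n, i, qk), transposition(n, j, qk))
    if sigma[i] != l and sigma[j] == k:
        return compose(sigma, transposition(n, j, ql), transposition(n, i, ql))
    if sigma[i] == l and sigma[j] == k:
        return compose(sigma, transposition(n, i, qk),
                       transposition(n, j, ql), transposition(n, i, j))
    return compose(sigma, transposition(n, i, qk), transposition(n, j, ql))


def image_counts(n, i, j, k, l):
    """Count how many sigma in S_n are mapped to each image permutation."""
    return Counter(sigma_ijkl(s, i, j, k, l) for s in permutations(range(n)))
```

If (5.14) holds, every permutation $\tau$ with $\tau(i)=k$, $\tau(j)=l$ should be hit exactly $n!/(n-2)! = n(n-1)$ times, so `image_counts(5, 0, 1, 2, 3)` should contain $3! = 6$ keys, each with count 20.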
For any indices $i \ne j$, $k \ne l \in [n]$, we have
\[
\begin{aligned}
\Big| \mathbb{E}\big\{ \Psi_t(W_n^{(i,j)}) \,\big|\, \pi(i)=k,\ \pi(j)=l \big\} - h(t) \Big|
&= \Big| \mathbb{E}\big\{ \Psi_t(S^{(i,j)}_{\sigma_{ijkl}}) \big\} - h(t) \Big| \\
&= \Big| t\, \mathbb{E}\big\{ (S^{(i,j)}_{\sigma_{ijkl}} - T_n)\, \Psi_t(T_n) \big\} + t^2\, \mathbb{E}\big\{ (S^{(i,j)}_{\sigma_{ijkl}} - T_n)^2\, \Psi_t\big( T_n + U (S^{(i,j)}_{\sigma_{ijkl}} - T_n) \big)(1-U) \big\} \Big|.
\end{aligned}
\]
Similarly to (5.9), applying condition (2.6), we have
\[
\big| T_n - S^{(i,j)}_{\sigma_{ijkl}} \big| \le C\delta.
\]
Recalling that $0 < t < 1/\delta$, we obtain
\[
\begin{aligned}
\Big| \mathbb{E}\big\{ \Psi_t(W_n^{(i,j)}) \,\big|\, \pi(i)=k,\ \pi(j)=l \big\} - h(t) \Big|
&\le \Big| t\, \mathbb{E}\big\{ (T_n - S^{(i,j)}_{\sigma_{ijkl}})\, \Psi_t(T_n) \big\} \Big| + C t^2 \delta^2 h(t) \\
&\le \Big| \frac{t}{n(n-1)} \sum_{\substack{p\ne q \\ p,q\notin\{k,l\}}} \mathbb{E}\big\{ (T_n - S^{(i,j)}_{\sigma_{ijkl}})\, \Psi_t(T_n) \,\big|\, \sigma(i)=p,\ \sigma(j)=q \big\} \Big| \\
&\quad + \frac{2t\delta}{n} \max_{i\ne j,\ p\ne q} \Big| \mathbb{E}\big\{ \Psi_t(T_n) \,\big|\, \sigma(i)=p,\ \sigma(j)=q \big\} - h(t) \Big| + C\Big( \frac{1}{n} + t^2\delta^2 \Big) h(t).
\end{aligned} \tag{5.16}
\]
So we next estimate
\[
\Big| \frac{t}{n(n-1)} \sum_{\substack{p\ne q \\ p,q\notin\{k,l\}}} \mathbb{E}\big\{ (T_n - S^{(i,j)}_{\sigma_{ijkl}})\, \Psi_t(T_n) \,\big|\, \sigma(i)=p,\ \sigma(j)=q \big\} \Big|. \tag{5.17}
\]
Under the condition
\[
\sigma(i) = p,\quad \sigma(j) = q,\quad i \ne j,\quad k \ne l,\quad p \ne q,\quad p, q \notin \{k, l\}, \tag{5.18}
\]
the values of the permutations $\sigma_{ijkl}$ and $\sigma$ on the indices $i$, $j$, $\sigma^{-1}(k)$, $\sigma^{-1}(l)$ are given in the following table.
\[
\begin{array}{l|cccc|cccc}
\text{permutation} & \multicolumn{4}{c|}{\sigma} & \multicolumn{4}{c}{\sigma_{ijkl}} \\ \hline
\text{index} & i & j & \sigma^{-1}(k) & \sigma^{-1}(l) & i & j & \sigma^{-1}(k) & \sigma^{-1}(l) \\
\text{permutation(index)} & p & q & k & l & k & l & p & q
\end{array}
\]
Recalling the definitions of $T_n$ and $S^{(i,j)}_{\sigma_{ijkl}}$, under condition (5.18) it follows that
\[
\begin{aligned}
T_n - S^{(i,j)}_{\sigma_{ijkl}} &= a_{ip} + a_{jq} + b_{ijpq} + a_{\sigma^{-1}(k)k} + a_{\sigma^{-1}(l)l} - a_{\sigma^{-1}(k)p} - a_{\sigma^{-1}(l)q} \\
&\quad + \sum_{s\notin\{i,j\}} \big( b_{isp\sigma(s)} + b_{si\sigma(s)p} + b_{jsq\sigma(s)} + b_{sj\sigma(s)q} \big) \\
&\quad + b_{\sigma^{-1}(l)\sigma^{-1}(k)lk} + b_{\sigma^{-1}(k)\sigma^{-1}(l)kl} - b_{\sigma^{-1}(l)\sigma^{-1}(k)qp} - b_{\sigma^{-1}(k)\sigma^{-1}(l)pq} \\
&\quad + \sum_{s\notin\{i,j,\sigma^{-1}(k),\sigma^{-1}(l)\}} \big( b_{s\sigma^{-1}(k)\sigma(s)k} + b_{\sigma^{-1}(k)sk\sigma(s)} - b_{s\sigma^{-1}(k)\sigma(s)p} - b_{\sigma^{-1}(k)sp\sigma(s)} \big) \\
&\quad + \sum_{s\notin\{i,j,\sigma^{-1}(k),\sigma^{-1}(l)\}} \big( b_{s\sigma^{-1}(l)\sigma(s)l} + b_{\sigma^{-1}(l)sl\sigma(s)} - b_{\sigma^{-1}(l)sq\sigma(s)} - b_{s\sigma^{-1}(l)\sigma(s)q} \big).
\end{aligned} \tag{5.19}
\]
We divide the terms of (5.19) into four groups. The first group contains $a_{ip}$, $a_{jq}$ and $b_{ijpq}$, which have no random index. The second group contains $a_{\sigma^{-1}(k)k}$, $a_{\sigma^{-1}(l)l}$, $a_{\sigma^{-1}(k)p}$, $a_{\sigma^{-1}(l)q}$ and $\sum_{s\notin\{i,j\}} ( b_{isp\sigma(s)} + b_{si\sigma(s)p} + b_{jsq\sigma(s)} + b_{sj\sigma(s)q} )$, which have one random index. The third group is the third line of (5.19), which has two random indices. The last group consists of the fourth and fifth lines of (5.19), which have three random indices. Therefore the quantity in (5.17) can also be divided into four parts accordingly. The terms of (5.17) corresponding to each part can be bounded by the same method, so for each part we only give the estimate for one representative term.
The rest can be obtained by the same method. For the first part, we consider the representative term corresponding to $a_{ip}$. Applying conditions (2.4) and (2.6), we have
\[
\begin{aligned}
\Big| \frac{t}{n(n-1)} \sum_{\substack{p\ne q \\ p,q\notin\{k,l\}}} \mathbb{E}\big\{ a_{ip}\, \Psi_t(T_n) \,\big|\, \sigma(i)=p,\ \sigma(j)=q \big\} \Big|
&\le \Big| \frac{t}{n(n-1)} \sum_{\substack{q=1 \\ q\notin\{k,l\}}}^{n} ( -a_{iq} - a_{ik} - a_{il} )\, h(t) \Big| \\
&\quad + \Big| \frac{t}{n(n-1)} \sum_{\substack{p\ne q \\ p,q\notin\{k,l\}}} a_{ip} \Big[ \mathbb{E}\big\{ \Psi_t(T_n) \,\big|\, \sigma(i)=p,\ \sigma(j)=q \big\} - h(t) \Big] \Big| \\
&\le \frac{3t\delta}{n} h(t) + t\delta \max_{i\ne j,\ p\ne q} \Big| \mathbb{E}\big\{ \Psi_t(T_n) \,\big|\, \sigma(i)=p,\ \sigma(j)=q \big\} - h(t) \Big|.
\end{aligned} \tag{5.20}
\]
Next, for the second part, we consider the representative term related to $\sum_{s\notin\{i,j\}} b_{isp\sigma(s)}$. Using conditions (2.5) and (2.6), it follows that
\[
\begin{aligned}
&\Big| \frac{t}{n(n-1)} \sum_{\substack{p\ne q \\ p,q\notin\{k,l\}}} \mathbb{E}\Big\{ \sum_{s\notin\{i,j\}} b_{isp\sigma(s)}\, \Psi_t(T_n) \,\Big|\, \sigma(i)=p,\ \sigma(j)=q \Big\} \Big| \\
&\quad \le \Big| \frac{Ct}{n^3} \sum_{\substack{p\ne q \\ p,q\notin\{k,l\}}} \sum_{s\notin\{i,j\}} \sum_{r\notin\{p,q\}} b_{ispr}\, h(t) \Big| \\
&\qquad + \Big| \frac{Ct}{n^3} \sum_{\substack{p\ne q \\ p,q\notin\{k,l\}}} \sum_{s\notin\{i,j\}} \sum_{r\notin\{p,q\}} b_{ispr} \Big[ \mathbb{E}\big\{ \Psi_t(T_n) \,\big|\, \sigma(i)=p,\ \sigma(j)=q,\ \sigma(s)=r \big\} - h(t) \Big] \Big| \\
&\quad \le \frac{Ct}{n^3} \sum_{\substack{p\ne q \\ p,q\notin\{k,l\}}} \big| b_{iipp} + b_{ijpp} + b_{iipq} + b_{ijpq} \big|\, h(t) \\
&\qquad + \frac{Ct}{n} \sum_{s,r} |b_{ispr}| \cdot \max_{\substack{i\ne j\ne s \\ p\ne q\ne r}} \Big| \mathbb{E}\big\{ \Psi_t(T_n) \,\big|\, \sigma(i)=p,\ \sigma(j)=q,\ \sigma(s)=r \big\} - h(t) \Big| \\
&\quad \le \frac{Ct\delta}{n} h(t) + Ct\delta \max_{\substack{i\ne j\ne s \\ p\ne q\ne r}} \Big| \mathbb{E}\big\{ \Psi_t(T_n) \,\big|\, \sigma(i)=p,\ \sigma(j)=q,\ \sigma(s)=r \big\} - h(t) \Big|,
\end{aligned} \tag{5.21}
\]
where $a \ne b \ne c$ means that $a$, $b$, $c$ are distinct, and $a \ne b \ne c \ne d$ is defined analogously.
Then, for the third part, we consider the representative term related to $b_{\sigma^{-1}(l)\sigma^{-1}(k)lk}$. Using conditions (2.5) and (2.6), we deduce that
\[
\begin{aligned}
&\Big| \frac{t}{n(n-1)} \sum_{\substack{p\ne q \\ p,q\notin\{k,l\}}} \mathbb{E}\big\{ b_{\sigma^{-1}(l)\sigma^{-1}(k)lk}\, \Psi_t(T_n) \,\big|\, \sigma(i)=p,\ \sigma(j)=q \big\} \Big| \\
&\quad \le \Big| \frac{Ct}{n^4} \sum_{\substack{p\ne q \\ p,q\notin\{k,l\}}} \sum_{\substack{u\ne v \\ u,v\notin\{i,j\}}} b_{vulk}\, h(t) \Big| \\
&\qquad + \Big| \frac{Ct}{n^4} \sum_{\substack{p\ne q \\ p,q\notin\{k,l\}}} \sum_{\substack{u\ne v \\ u,v\notin\{i,j\}}} b_{vulk} \Big[ \mathbb{E}\big\{ \Psi_t(T_n) \,\big|\, \sigma(i)=p,\ \sigma(j)=q,\ \sigma(u)=k,\ \sigma(v)=l \big\} - h(t) \Big] \Big| \\
&\quad \le \Big| \frac{Ct}{n^4} \sum_{\substack{p\ne q \\ p,q\notin\{k,l\}}} \sum_{u\notin\{i,j\}} ( -b_{iulk} - b_{julk} - b_{uulk} )\, h(t) \Big| \\
&\qquad + Ct\delta \max_{\substack{i\ne j\ne u\ne s \\ p\ne q\ne k\ne r}} \Big| \mathbb{E}\big\{ \Psi_t(T_n) \,\big|\, \sigma(i)=p,\ \sigma(j)=q,\ \sigma(u)=k,\ \sigma(s)=r \big\} - h(t) \Big| \\
&\quad \le \frac{Ct\delta}{n} h(t) + Ct\delta \max_{\substack{i\ne j\ne u\ne s \\ p\ne q\ne k\ne r}} \Big| \mathbb{E}\big\{ \Psi_t(T_n) \,\big|\, \sigma(i)=p,\ \sigma(j)=q,\ \sigma(u)=k,\ \sigma(s)=r \big\} - h(t) \Big|.
\end{aligned}
\]
(5.22)

For the last part, we consider the representative term related to $\sum_{s\notin\{i,j,\sigma^{-1}(k),\sigma^{-1}(l)\}} b_{s\sigma^{-1}(l)\sigma(s)l}$. Using conditions (2.5) and (2.6), we have
\[
\begin{aligned}
&\Big| \frac{t}{n(n-1)} \sum_{\substack{p\ne q \\ p,q\notin\{k,l\}}} \mathbb{E}\Big\{ \sum_{s\notin\{i,j,\sigma^{-1}(k),\sigma^{-1}(l)\}} b_{s\sigma^{-1}(l)\sigma(s)l}\, \Psi_t(T_n) \,\Big|\, \sigma(i)=p,\ \sigma(j)=q \Big\} \Big| \\
&\quad \le \Big| \frac{Ct}{n^5} \sum_{\substack{p\ne q \\ p,q\notin\{k,l\}}} \sum_{\substack{u\ne v\ne s \\ u,v,s\notin\{i,j\}}} \sum_{r\notin\{p,q,k,l\}} b_{svrl}\, h(t) \Big| \\
&\qquad + \Big| \frac{Ct}{n^5} \sum_{\substack{p\ne q \\ p,q\notin\{k,l\}}} \sum_{\substack{u\ne v\ne s \\ u,v,s\notin\{i,j\}}} \sum_{r\notin\{p,q,k,l\}} b_{svrl} \Big[ \mathbb{E}\big\{ \Psi_t(T_n) \,\big|\, \sigma(i)=p,\ \sigma(j)=q,\ \sigma(u)=k,\ \sigma(v)=l,\ \sigma(s)=r \big\} - h(t) \Big] \Big| \\
&\quad \le \frac{Ct}{n^2} \Big( \sum_{s,p} |b_{svpl}|\, h(t) + \sum_{s,q} |b_{svql}|\, h(t) \Big) + \Big| \frac{Ct}{n^5} \sum_{\substack{p\ne q \\ p,q\notin\{k,l\}}} \sum_{\substack{u\ne v \\ u,v\notin\{i,j\}}} \sum_{t\in\{i,j,u,v\}} ( b_{tvkl} + b_{tvll} )\, h(t) \Big| \\
&\qquad + \frac{Ct}{n} \sum_{r,s} |b_{svrl}| \cdot \max_{\substack{i\ne j\ne u\ne v\ne s \\ p\ne q\ne k\ne l\ne r}} \Big| \mathbb{E}\big\{ \Psi_t(T_n) \,\big|\, \sigma(i)=p,\ \sigma(j)=q,\ \sigma(u)=k,\ \sigma(v)=l,\ \sigma(s)=r \big\} - h(t) \Big| \\
&\quad \le \frac{Ct\delta}{n} h(t) + Ct\delta \max_{\substack{i\ne j\ne u\ne v\ne s \\ p\ne q\ne k\ne l\ne r}} \Big| \mathbb{E}\big\{ \Psi_t(T_n) \,\big|\, \sigma(i)=p,\ \sigma(j)=q,\ \sigma(u)=k,\ \sigma(v)=l,\ \sigma(s)=r \big\} - h(t) \Big|.
\end{aligned} \tag{5.23}
\]
From (5.16) and (5.19)-(5.23), it follows that for any fixed indices $i \ne j$, $k \ne l$,
\[
\Big| \mathbb{E}\big\{ \Psi_t(W_n^{(i,j)}) \,\big|\, \pi(i)=k,\ \pi(j)=l \big\} - h(t) \Big| \le Ct\delta\, ( H + H_1 + H_2 + H_3 ) + C\Big( \frac{1}{n} + t^2\delta^2 \Big) h(t),
\]
(5.24)

where
\[
\begin{aligned}
H &= \max_{i\ne j,\ p\ne q} \Big| \mathbb{E}\big\{ \Psi_t(T_n) \,\big|\, \sigma(i)=p,\ \sigma(j)=q \big\} - h(t) \Big|, \\
H_1 &= \max_{\substack{i\ne j\ne s \\ p\ne q\ne r}} \Big| \mathbb{E}\big\{ \Psi_t(T_n) \,\big|\, \sigma(i)=p,\ \sigma(j)=q,\ \sigma(s)=r \big\} - h(t) \Big|, \\
H_2 &= \max_{\substack{i\ne j\ne u\ne s \\ p\ne q\ne k\ne r}} \Big| \mathbb{E}\big\{ \Psi_t(T_n) \,\big|\, \sigma(i)=p,\ \sigma(j)=q,\ \sigma(u)=k,\ \sigma(s)=r \big\} - h(t) \Big|, \\
H_3 &= \max_{\substack{i\ne j\ne u\ne v\ne s \\ p\ne q\ne k\ne l\ne r}} \Big| \mathbb{E}\big\{ \Psi_t(T_n) \,\big|\, \sigma(i)=p,\ \sigma(j)=q,\ \sigma(u)=k,\ \sigma(v)=l,\ \sigma(s)=r \big\} - h(t) \Big|.
\end{aligned}
\]
To bound the terms $H$, $H_1$, $H_2$, $H_3$ and simplify the right-hand side of (5.24), we now invoke the following key lemma.

Lemma 5.1. Let $\pi$ be a random permutation chosen uniformly from $S_n$ (the symmetric group of degree $n$), and let $W_n$ be defined as in (2.2) and satisfy (2.6). Suppose $k < n$. For any fixed distinct indices $i_1, \dots, i_k \in [n]$ and distinct $l_1, \dots, l_k \in [n]$, we have, for $0 < t < 1/\delta$,
\[
\Big| \mathbb{E}\big\{ \Psi_t(W_n) \,\big|\, \pi(i_1)=l_1,\ \dots,\ \pi(i_k)=l_k \big\} - h(t) \Big| \le C k^2 e^{kt\delta} h(t). \tag{5.25}
\]
The proof of Lemma 5.1 is given in the last part of Section 5. Using Lemma 5.1, for any fixed indices $i \ne j$, $k \ne l \in [n]$, it follows that
\[
\Big| \mathbb{E}\big\{ \Psi_t(W_n^{(i,j)}) \,\big|\, \pi(i)=k,\ \pi(j)=l \big\} - h(t) \Big| \le C\Big( \frac{1}{n} + t^2\delta^2 \Big) h(t). \tag{5.26}
\]
By the same argument, for any fixed distinct indices $i_1, \dots, i_k \in [n]$ and distinct $l_1, \dots, l_k \in [n]$ with $k < n$, we also obtain
\[
\Big| \mathbb{E}\big\{ \Psi_t(W_n^{(i_1,\dots,i_k)}) \,\big|\, \sigma(i_1)=l_1,\ \dots,\ \sigma(i_k)=l_k \big\} - h(t) \Big| \le C k^2 \Big( \frac{1}{n} + t^2\delta^2 \Big) h(t). \tag{5.27}
\]
Therefore we bound the second term of (5.11) by
\[
\frac{\delta^4}{n(n-1)} \sum_{k\ne l}\sum_{i\ne j} \Big| \mathbb{E}\big\{ \Psi_t(W_n^{(i,j)}) \,\big|\, \pi(i)=k,\ \pi(j)=l \big\} - h(t) \Big| \le C( n\delta^4 + n^2\delta^6 t^2 )\, h(t).
\]
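The arithmetic behind this last bound is a simple term count. As a sketch that uses only (5.26) and the fact that the double sum over $k \ne l$ and $i \ne j$ has $n^2(n-1)^2$ terms:

```latex
\[
\frac{\delta^{4}}{n(n-1)} \cdot n^{2}(n-1)^{2} \cdot C\Big( \frac{1}{n} + t^{2}\delta^{2} \Big) h(t)
= C\big( (n-1)\,\delta^{4} + n(n-1)\,\delta^{6} t^{2} \big) h(t)
\le C\big( n\delta^{4} + n^{2}\delta^{6} t^{2} \big) h(t).
\]
```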
Consequently, we obtain the bound for $J_{11}$:
\[
J_{11} \le \frac{1}{n(n-1)} \Big( \sum_{i,j} a^2_{ij} \Big)^{2} h(t) + C\big( n\delta^4 + n^2\delta^6 t^2 \big) h(t). \tag{5.28}
\]
Then, for $J_{12}$, together with (5.8) and (5.9), we deduce that
\[
\begin{aligned}
J_{12} &= \frac{t}{n(n-1)} \sum_{i\ne j}\sum_{k\ne l} a^2_{ik} a^2_{jl}\, \mathbb{E}\big\{ V_{ij}\, \Psi_t(W_n^{(i,j)}) \,\big|\, \pi(i)=k,\ \pi(j)=l \big\} \\
&= \frac{t}{n(n-1)} \sum_{i\ne j}\sum_{k\ne l} a^2_{ik} a^2_{jl}\, \mathbb{E}\big\{ ( a_{i\pi(i)} + a_{j\pi(j)} )\, \Psi_t(W_n^{(i,j)}) \,\big|\, \pi(i)=k,\ \pi(j)=l \big\} \\
&\quad + \frac{t}{n(n-1)} \sum_{i\ne j}\sum_{k\ne l} a^2_{ik} a^2_{jl}\, \mathbb{E}\Big\{ \Big( \sum_{\substack{p=1 \\ p\ne i}}^{n} ( b_{ip\pi(i)\pi(p)} + b_{pi\pi(p)\pi(i)} ) \Big) \Psi_t(W_n^{(i,j)}) \,\Big|\, \pi(i)=k,\ \pi(j)=l \Big\} \\
&\quad + \frac{t}{n(n-1)} \sum_{i\ne j}\sum_{k\ne l} a^2_{ik} a^2_{jl}\, \mathbb{E}\Big\{ \Big( \sum_{\substack{p=1 \\ p\notin\{i,j\}}}^{n} ( b_{jp\pi(j)\pi(p)} + b_{pj\pi(p)\pi(j)} ) \Big) \Psi_t(W_n^{(i,j)}) \,\Big|\, \pi(i)=k,\ \pi(j)=l \Big\} \\
&\le \frac{2t}{n(n-1)} \sum_{i\ne j}\sum_{k\ne l} a^3_{ik} a^2_{jl}\, \mathbb{E}\big\{ \Psi_t(W_n^{(i,j)}) \,\big|\, \pi(i)=k,\ \pi(j)=l \big\} \\
&\quad + \frac{Ct}{n^3} \sum_{i\ne j}\sum_{k\ne l} a^2_{ik} a^2_{jl} \Big| \sum_{\substack{p=1 \\ p\notin\{i,j\}}}^{n} \sum_{\substack{t=1 \\ t\notin\{k,l\}}}^{n} ( b_{ipkt} + b_{pitk} )\, \mathbb{E}\big\{ \Psi_t(W_n^{(i,j)}) \,\big|\, \pi(i)=k,\ \pi(j)=l,\ \pi(p)=t \big\} \Big| \\
&\le \frac{2t}{n^2} \sum_{i\ne j}\sum_{k\ne l} a^3_{ik} a^2_{jl}\, h(t) + C n^2\delta^5 t\, \Big| \mathbb{E}\big\{ \Psi_t(W_n^{(i,j)}) \,\big|\, \pi(i)=k,\ \pi(j)=l \big\} - h(t) \Big| + n\delta^4 h(t) \\
&\quad + C n^2\delta^5 t \max_{\substack{i\ne j\ne p \\ k\ne l\ne t}} \Big| \mathbb{E}\big\{ \Psi_t(W_n^{(i,j,p)}) \,\big|\, \pi(i)=k,\ \pi(j)=l,\ \pi(p)=t \big\} - h(t) \Big| \\
&\quad + C n^2\delta^5 t^2 \max_{\substack{i\ne j\ne p \\ k\ne l\ne t}} \Big| \mathbb{E}\big\{ ( W_n^{(i,j)} - W_n^{(i,j,p)} )\, \Psi_t(W_n^{(i,j,p)}) \,\big|\, \pi(i)=k,\ \pi(j)=l,\ \pi(p)=t \big\} \Big| \\
&\quad + C n^2\delta^5 t^3 \max_{\substack{i\ne j\ne p \\ k\ne l\ne t}} \Big| \mathbb{E}\big\{ ( W_n^{(i,j)} - W_n^{(i,j,p)} )^2\, \Psi_t\big( W_n^{(i,j,p)} + U ( W_n^{(i,j)} - W_n^{(i,j,p)} ) \big)(1-U) \,\big|\, \pi(i)=k,\ \pi(j)=l,\ \pi(p)=t \big\} \Big|,
\end{aligned}
\]
(5.29)

where we use (2.5), (2.6) and the range $0 < t < 1/\delta$ in the last inequality. By condition (2.6), we have $| W_n^{(i,j)} - W_n^{(i,j,p)} | \le C\delta$. Then, in view of Lemma 5.1 and (5.63)-(5.65), it follows that
\[
J_{12} \le \frac{2t}{n^2} \sum_{i\ne j}\sum_{k\ne l} a^3_{ik} a^2_{jl}\, h(t) + C( n\delta^4 + n^2\delta^6 t^2 )\, h(t). \tag{5.30}
\]
This establishes the bound for $J_{12}$. We next consider $J_{13}$. Together with (5.26), we obtain
\[
\begin{aligned}
J_{13} &= \frac{t^2}{n(n-1)} \sum_{k\ne l}\sum_{i\ne j} a^2_{ik} a^2_{jl}\, \mathbb{E}\big\{ V_{ij}^2\, \Psi_t(W_n^{(i,j)} + U V_{ij})(1-U) \,\big|\, \pi(i)=k,\ \pi(j)=l \big\} \\
&\le \frac{t^2\delta^2}{n(n-1)} \sum_{k\ne l}\sum_{i\ne j} a^2_{ik} a^2_{jl}\, e^{t\delta} h(t) + \frac{t^2\delta^2}{n(n-1)} \sum_{k\ne l}\sum_{i\ne j} a^2_{ik} a^2_{jl}\, e^{t\delta}\, \Big| \mathbb{E}\big\{ \Psi_t(W_n^{(i,j)}) \,\big|\, \pi(i)=k,\ \pi(j)=l \big\} - h(t) \Big| \\
&\le C n^2\delta^6 t^2 h(t).
\end{aligned} \tag{5.31}
\]
Combining (5.28), (5.30) and (5.31), we have the following bound for $J_1$:
\[
J_1 \le \frac{1}{n(n-1)} \Big( \sum_{i,j} a^2_{ij} \Big)^{2} h(t) + \frac{2t}{n^2} \sum_{i\ne j}\sum_{k\ne l} a^3_{ik} a^2_{jl}\, h(t) + C\big( n\delta^4 + n^2\delta^6 t^2 \big) h(t). \tag{5.32}
\]
Next we consider $J_2$. For a fixed index $k \in [n]$, we define $W_n^{(k)}$, which is close to $W_n$, as follows:
\[
W_n^{(k)} = \sum_{\substack{i=1 \\ i\ne k}}^{n} a_{i\pi(i)} + \sum_{\substack{i\ne j \\ i,j\ne k}} b_{ij\pi(i)\pi(j)}.
\]
Together with (2.6), we easily obtain
\[
|V'| = | W_n - W_n^{(k)} | \le 3\delta.
\]
We then perform a Taylor expansion of $\Psi_t(x)$ at the point $W_n^{(k)}$. It follows that
\[
\begin{aligned}
J_2 &= \frac{2}{n^2} \sum_{i,j}\sum_{k,l} a^2_{ij} a^2_{kl}\, \mathbb{E}\{ \Psi_t(W_n) \mid \pi(k)=l \} \\
&= \frac{2}{n^2} \sum_{i,j}\sum_{k,l} a^2_{ij} a^2_{kl}\, \mathbb{E}\{ \Psi_t(W_n^{(k)}) \mid \pi(k)=l \} + \frac{2t}{n^2} \sum_{i,j}\sum_{k,l} a^2_{ij} a^2_{kl}\, \mathbb{E}\{ V'\, \Psi_t(W_n^{(k)}) \mid \pi(k)=l \} \\
&\quad + \frac{2t^2}{n^2} \sum_{i,j}\sum_{k,l} a^2_{ij} a^2_{kl}\, \mathbb{E}\{ V'^2\, \Psi_t(W_n^{(k)} + U V')(1-U) \mid \pi(k)=l \} \\
&=: J_{21} + J_{22} + J_{23}.
\end{aligned} \tag{5.33}
\]
We have divided $J_2$ into three parts, which we now consider in turn.
We first consider $J_{21}$. Adding and subtracting $h(t)$, we have
\[
J_{21} \ge \frac{2}{n^2} \Big( \sum_{i,j} a^2_{ij} \Big)^{2} h(t) - \frac{2}{n^2} \sum_{i,j}\sum_{k,l} a^2_{ij} a^2_{kl}\, \Big| \mathbb{E}\{ \Psi_t(W_n^{(k)}) \mid \pi(k)=l \} - h(t) \Big|.
\]
We then define $\sigma_{kl}$ and $S^{(k)}_{\sigma_{kl}}$ as
\[
\sigma_{kl} =
\begin{cases}
\sigma, & \sigma(k) = l, \\
\sigma \circ \tau_{k,\sigma^{-1}(l)}, & \sigma(k) \ne l,
\end{cases}
\qquad
S^{(k)}_{\sigma_{kl}} = \sum_{\substack{i=1 \\ i\ne k}}^{n} a_{i\sigma_{kl}(i)} + \sum_{\substack{i\ne j \\ i,j\ne k}} b_{ij\sigma_{kl}(i)\sigma_{kl}(j)}.
\]
By Lemma 4.2, we obtain
\[
\mathcal{L}\big( S^{(k)}_{\sigma_{kl}} \big) \overset{d}{=} \mathcal{L}\big( W_n^{(k)} \,\big|\, \pi(k)=l \big).
\]
Under condition (2.6), we have $| W_n - S^{(k)}_{\sigma_{kl}} | \le C\delta$. Applying (5.27), we have, for any $k, l \in [n]$,
\[
\Big| \mathbb{E}\{ \Psi_t(W_n^{(k)}) \mid \pi(k)=l \} - h(t) \Big| \le C\Big( \frac{1}{n} + t^2\delta^2 \Big) h(t). \tag{5.34}
\]
Therefore, we obtain the lower bound for $J_{21}$:
\[
J_{21} \ge \frac{2}{n^2} \Big( \sum_{i,j} a^2_{ij} \Big)^{2} h(t) - C( n\delta^4 + n^2\delta^6 t^2 )\, h(t). \tag{5.35}
\]
Next we consider $J_{22}$. Since $V' = a_{k\pi(k)} + \sum_{s=1,\, s\ne k}^{n} \big( b_{sk\pi(s)\pi(k)} + b_{ks\pi(k)\pi(s)} \big)$, we expand $J_{22}$ by the definition of $V'$ as
\[
\begin{aligned}
J_{22} &= \frac{2t}{n^2} \sum_{i,j}\sum_{k,l} a^2_{ij} a^3_{kl}\, \mathbb{E}\big\{ \Psi_t(W_n^{(k)}) \,\big|\, \pi(k)=l \big\} \\
&\quad + \frac{2t}{n^2} \sum_{i,j}\sum_{k,l} a^2_{ij} a^2_{kl} \Big( \frac{1}{n-1} \sum_{\substack{s=1 \\ s\ne k}}^{n} \sum_{\substack{t=1 \\ t\ne l}}^{n} ( b_{sktl} + b_{kslt} )\, \mathbb{E}\big\{ \Psi_t(W_n^{(k)}) \,\big|\, \pi(k)=l,\ \pi(s)=t \big\} \Big) \\
&\ge \frac{2t}{n^2} \sum_{i\ne j}\sum_{k\ne l} a^3_{ik} a^2_{jl}\, h(t) - n^2\delta^4 \max_{k,l} \Big| \mathbb{E}\big\{ \Psi_t(W_n^{(k)}) \,\big|\, \pi(k)=l \big\} - h(t) \Big| \\
&\quad - 2n\delta^4 \Big( h(t) + \max_{k\ne s,\ l\ne t} \Big| \mathbb{E}\big\{ \Psi_t(W_n^{(k)}) \,\big|\, \pi(k)=l,\ \pi(s)=t \big\} - h(t) \Big| \Big).
\end{aligned}
\]
Together with (5.26) and (5.34), it follows that
\[
J_{22} \ge \frac{2t}{n^2} \sum_{i\ne j}\sum_{k\ne l} a^3_{ik} a^2_{jl}\, h(t) - C( n\delta^4 + n^2\delta^6 t^2 )\, h(t).
\]
(5.36)

Then we consider the lower bound for $J_{23}$. By (5.34), we bound $J_{23}$ as
\[
\begin{aligned}
J_{23} &= \frac{2t^2}{n^2} \sum_{i,j}\sum_{k,l} a^2_{ij} a^2_{kl}\, \mathbb{E}\{ V'^2\, \Psi_t(W_n^{(k)} + U V')(1-U) \mid \pi(k)=l \} \\
&\ge -\frac{2t^2}{n^2} \sum_{i,j}\sum_{k,l} a^2_{ij} a^2_{kl}\, \delta^2 e^{t\delta} \Big( h(t) + \Big| \mathbb{E}\big\{ \Psi_t(W_n^{(k)}) \,\big|\, \pi(k)=l \big\} - h(t) \Big| \Big) \\
&\ge -C n^2\delta^6 t^2 h(t).
\end{aligned} \tag{5.37}
\]
Combining (5.33) and (5.35)-(5.37), we deduce that
\[
J_2 \ge \frac{2}{n^2} \Big( \sum_{i,j} a^2_{ij} \Big)^{2} h(t) + \frac{2t}{n^2} \sum_{i\ne j}\sum_{k\ne l} a^3_{ik} a^2_{jl}\, h(t) - C n^2\delta^6 t^2 h(t). \tag{5.38}
\]
Together with (5.7), (5.32) and (5.38), this completes the proof of (4.14). By the same argument, we can also prove (4.15).

We next consider (4.16). The proofs of (4.14) and (4.16) are similar, but the properties of $\{ b(i,j,k,l) \}_{i,j,k,l\in[n]}$ are used repeatedly. We first decompose the left-hand side of (4.16) as
\[
\begin{aligned}
&\mathbb{E}\Big\{ \Big( \sum_{i\ne j} b^2_{ij\pi(i)\pi(j)} - \mathbb{E}\Big\{ \sum_{i\ne j} b^2_{ij\pi(i)\pi(j)} \Big\} \Big)^{2} e^{tW_n} \Big\} \\
&\quad = \frac{1}{n^2(n-1)^2} \Big( \sum_{i\ne j}\sum_{k\ne l} b^2_{ijkl} \Big)^{2} h(t) + \mathbb{E}\Big\{ \sum_{i\ne j}\sum_{p\ne q} b^2_{ij\pi(i)\pi(j)}\, b^2_{pq\pi(p)\pi(q)}\, \Psi_t(W_n) \Big\} \\
&\qquad - \frac{2}{n(n-1)} \sum_{i\ne j}\sum_{k\ne l} b^2_{ijkl}\; \mathbb{E}\Big\{ \sum_{p\ne q} b^2_{pq\pi(p)\pi(q)}\, \Psi_t(W_n) \Big\} \\
&\quad =: \frac{1}{n^2(n-1)^2} \Big( \sum_{i\ne j}\sum_{k\ne l} b^2_{ijkl} \Big)^{2} h(t) + Q_1 - Q_2.
\end{aligned} \tag{5.39}
\]
We estimate $Q_1$ and $Q_2$ in turn.
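The decomposition (5.39) is an expansion of a square. As a sketch, write $X$ and $\mu$ as shorthand (these two symbols are introduced here only for illustration) and assume $h(t) = \mathbb{E}\, e^{tW_n}$ and $\Psi_t(w) = e^{tw}$:

```latex
% Shorthand (illustrative): X = \sum_{i \ne j} b^2_{ij\pi(i)\pi(j)}, \mu = E X.
% Since P(\pi(i)=k, \pi(j)=l) = 1/(n(n-1)) for i \ne j, k \ne l:
\[
\mu = \mathbb{E} X = \frac{1}{n(n-1)} \sum_{i\ne j}\sum_{k\ne l} b^{2}_{ijkl}.
\]
% Expanding the square and using that \mu is deterministic:
\[
\mathbb{E}\big\{ (X-\mu)^{2} e^{tW_n} \big\}
= \mu^{2}\, h(t) + \underbrace{\mathbb{E}\big\{ X^{2}\, \Psi_t(W_n) \big\}}_{Q_1}
- \underbrace{2\mu\, \mathbb{E}\big\{ X\, \Psi_t(W_n) \big\}}_{Q_2}.
\]
```

The same expansion with $X = \sum_i a^2_{i\pi(i)}$ gives the first line of (5.7).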
We first consider $Q_1$ and decompose it into three parts:
\[
\begin{aligned}
Q_1 &= \mathbb{E}\Big\{ \sum_{i\ne j\ne p\ne q} b^2_{ij\pi(i)\pi(j)}\, b^2_{pq\pi(p)\pi(q)}\, \Psi_t(W_n) \Big\} \\
&\quad + \mathbb{E}\Big\{ \sum_{i\ne j} \sum_{\substack{p=1 \\ p\notin\{i,j\}}}^{n} \Big( b^2_{ij\pi(i)\pi(j)}\, b^2_{pi\pi(p)\pi(i)} + b^2_{ij\pi(i)\pi(j)}\, b^2_{pj\pi(p)\pi(j)} \Big) \Psi_t(W_n) \Big\} \\
&\quad + \mathbb{E}\Big\{ \sum_{i\ne j} \Big( \sum_{\substack{q=1 \\ q\ne i}}^{n} b^2_{ij\pi(i)\pi(j)}\, b^2_{iq\pi(i)\pi(q)} + \sum_{\substack{q=1 \\ q\ne j}}^{n} b^2_{ij\pi(i)\pi(j)}\, b^2_{jq\pi(j)\pi(q)} \Big) \Psi_t(W_n) \Big\} \\
&=: Q_{11} + Q_{12} + Q_{13}.
\end{aligned} \tag{5.40}
\]
For $Q_{11}$, a Taylor expansion decomposes it into three parts:
\[
\begin{aligned}
Q_{11} &= \frac{(n-4)!}{n!} \sum_{i\ne j\ne p\ne q}\sum_{k\ne l\ne u\ne v} b^2_{ijkl}\, b^2_{pquv}\, \mathbb{E}\big\{ \Psi_t(W_n) \,\big|\, \pi(i)=k,\ \pi(j)=l,\ \pi(p)=u,\ \pi(q)=v \big\} \\
&= \frac{(n-4)!}{n!} \sum_{i\ne j\ne p\ne q}\sum_{k\ne l\ne u\ne v} b^2_{ijkl}\, b^2_{pquv}\, \mathbb{E}\big\{ \Psi_t(W_n^{(i,j,p,q)}) \,\big|\, \pi(i)=k,\ \pi(j)=l,\ \pi(p)=u,\ \pi(q)=v \big\} \\
&\quad + \frac{(n-4)!\, t}{n!} \sum_{i\ne j\ne p\ne q}\sum_{k\ne l\ne u\ne v} b^2_{ijkl}\, b^2_{pquv}\, \mathbb{E}\big\{ V_{ijpq}\, \Psi_t(W_n^{(i,j,p,q)}) \,\big|\, \pi(i)=k,\ \pi(j)=l,\ \pi(p)=u,\ \pi(q)=v \big\} \\
&\quad + \frac{(n-4)!\, t^2}{n!} \sum_{i\ne j\ne p\ne q}\sum_{k\ne l\ne u\ne v} b^2_{ijkl}\, b^2_{pquv}\, \mathbb{E}\big\{ V^2_{ijpq}\, \Psi_t(W_n^{(i,j,p,q)} + U V_{ijpq})(1-U) \,\big|\, \pi(i)=k,\ \pi(j)=l,\ \pi(p)=u,\ \pi(q)=v \big\} \\
&=: Q_{111} + Q_{112} + Q_{113}.
\end{aligned} \tag{5.41}
\]
We first consider $Q_{111}$. Applying (5.27), for any fixed distinct indices $i, j, p, q \in [n]$ and distinct $k, l, u, v \in [n]$, it follows that
\[
\Big| \mathbb{E}\big\{ \Psi_t(W_n^{(i,j,p,q)}) \,\big|\, \pi(i)=k,\ \pi(j)=l,\ \pi(p)=u,\ \pi(q)=v \big\} - h(t) \Big| \le C\Big( \frac{1}{n} + \delta^2 t^2 \Big) h(t).
\]
(5.42)

Together with condition (2.6), we deduce that
\[
\begin{aligned}
Q_{111} &\le \frac{1}{n^2(n-1)^2} \Big( \sum_{i\ne j}\sum_{k\ne l} b^2_{ijkl} \Big)^{2} h(t) + \frac{C}{n^4} \Big( \sum_{i,j}\sum_{k,l} b^2_{ijkl} \Big)^{2} \Big( \frac{1}{n} + \delta^2 t^2 \Big) h(t) \\
&\le \frac{1}{n^2(n-1)^2} \Big( \sum_{i\ne j}\sum_{k\ne l} b^2_{ijkl} \Big)^{2} h(t) + C\big( n\delta^4 + n^2\delta^6 t^2 \big) h(t).
\end{aligned} \tag{5.43}
\]
As for $Q_{112}$, since
\[
\begin{aligned}
V_{ijpq} &= a_{i\pi(i)} + a_{j\pi(j)} + a_{p\pi(p)} + a_{q\pi(q)} + \sum_{\substack{s=1 \\ s\ne i}}^{n} \big( b_{si\pi(s)\pi(i)} + b_{is\pi(i)\pi(s)} \big) + \sum_{\substack{s=1 \\ s\notin\{i,j\}}}^{n} \big( b_{sj\pi(s)\pi(j)} + b_{js\pi(j)\pi(s)} \big) \\
&\quad + \sum_{\substack{s=1 \\ s\notin\{i,j,p\}}}^{n} \big( b_{sp\pi(s)\pi(p)} + b_{ps\pi(p)\pi(s)} \big) + \sum_{\substack{s=1 \\ s\notin\{i,j,p,q\}}}^{n} \big( b_{sq\pi(s)\pi(q)} + b_{qs\pi(q)\pi(s)} \big),
\end{aligned}
\]
we expand $Q_{112}$ by its definition and use (2.6) and (5.27). It then follows that
\[
\begin{aligned}
Q_{112} &\le \frac{2t}{n^2(n-1)^2} \sum_{i\ne j\ne p\ne q}\sum_{k\ne l\ne u\ne v} b^2_{ijkl}\, b^2_{pquv}\, ( a_{pu} + a_{qv} )\, h(t) + \frac{C}{n^4} \Big( \sum_{i,j}\sum_{k,l} b^2_{ijkl} \Big)^{2} \Big( \frac{1}{n} + \delta^2 t^2 \Big) h(t) \\
&\le \frac{2t}{n^2(n-1)^2} \sum_{i\ne j\ne p\ne q}\sum_{k\ne l\ne u\ne v} b^2_{ijkl}\, b^2_{pquv}\, ( a_{pu} + a_{qv} )\, h(t) + C\big( n\delta^4 + n^2\delta^6 t^2 \big) h(t).
\end{aligned} \tag{5.44}
\]
Next we consider $Q_{113}$. By (2.6) and (5.27), we obtain
\[
Q_{113} \le \frac{2t^2\delta^2}{n^2(n-1)^2} \Big( \sum_{i,j}\sum_{k,l} b^2_{ijkl} \Big)^{2} h(t) + \frac{C}{n^4} \Big( \sum_{i,j}\sum_{k,l} b^2_{ijkl} \Big)^{2} \Big( \frac{1}{n} + \delta^2 t^2 \Big) h(t) \le C n^2\delta^6 t^2 h(t). \tag{5.45}
\]
Combining (5.41) and (5.43)-(5.45), we deduce that
\[
\begin{aligned}
Q_{11} &\le \frac{1}{n^2(n-1)^2} \Big( \sum_{i\ne j}\sum_{k\ne l} b^2_{ijkl} \Big)^{2} h(t) + \frac{2t}{n^2(n-1)^2} \sum_{i\ne j\ne p\ne q}\sum_{k\ne l\ne u\ne v} b^2_{ijkl}\, b^2_{pquv}\, ( a_{pu} + a_{qv} )\, h(t) \\
&\quad + C\big( n\delta^4 + n^2\delta^6 t^2 \big) h(t).
\end{aligned} \tag{5.46}
\]
By the same argument as for $Q_{11}$, we have
\[
Q_{12} \le C\big( n\delta^4 + n^2\delta^6 t^2 \big) h(t), \qquad Q_{13} \le C\big( n\delta^4 + n^2\delta^6 t^2 \big) h(t).
\]
(5.47)

Together with (5.40), (5.46) and (5.47), it follows that
\[
\begin{aligned}
Q_1 &\le \frac{1}{n^2(n-1)^2} \Big( \sum_{i\ne j}\sum_{k\ne l} b^2_{ijkl} \Big)^{2} h(t) + \frac{2t}{n^2(n-1)^2} \sum_{i\ne j\ne p\ne q}\sum_{k\ne l\ne u\ne v} b^2_{ijkl}\, b^2_{pquv}\, ( a_{pu} + a_{qv} )\, h(t) \\
&\quad + C\big( n\delta^4 + n^2\delta^6 t^2 \big) h(t).
\end{aligned} \tag{5.48}
\]
For $Q_2$, a Taylor expansion decomposes it into three parts:
\[
\begin{aligned}
Q_2 &= \frac{2}{n^2(n-1)^2} \sum_{i\ne j}\sum_{p\ne q}\sum_{k\ne l}\sum_{u\ne v} b^2_{ijkl}\, b^2_{pquv}\, \mathbb{E}\big\{ \Psi_t(W_n^{(p,q)}) \,\big|\, \pi(p)=u,\ \pi(q)=v \big\} \\
&\quad + \frac{2t}{n^2(n-1)^2} \sum_{i\ne j}\sum_{p\ne q}\sum_{k\ne l}\sum_{u\ne v} b^2_{ijkl}\, b^2_{pquv}\, \mathbb{E}\big\{ V_{pq}\, \Psi_t(W_n^{(p,q)}) \,\big|\, \pi(p)=u,\ \pi(q)=v \big\} \\
&\quad + \frac{2t^2}{n^2(n-1)^2} \sum_{i\ne j}\sum_{p\ne q}\sum_{k\ne l}\sum_{u\ne v} b^2_{ijkl}\, b^2_{pquv}\, \mathbb{E}\big\{ V^2_{pq}\, \Psi_t(W_n^{(p,q)} + U V_{pq})(1-U) \,\big|\, \pi(p)=u,\ \pi(q)=v \big\} \\
&=: Q_{21} + Q_{22} + Q_{23},
\end{aligned} \tag{5.49}
\]
where $U$ is a uniform random variable on $[0,1]$, independent of all other random variables, and
\[
V_{pq} = a_{p\pi(p)} + a_{q\pi(q)} + \sum_{\substack{i=1 \\ i\ne p}}^{n} \big( b_{pi\pi(p)\pi(i)} + b_{ip\pi(i)\pi(p)} \big) + \sum_{\substack{i=1 \\ i\notin\{p,q\}}}^{n} \big( b_{iq\pi(i)\pi(q)} + b_{qi\pi(q)\pi(i)} \big).
\]
Adding and subtracting $h(t)$ and applying (5.27) and (2.6), we obtain the lower bound for $Q_{21}$:
\[
\begin{aligned}
Q_{21} &\ge \frac{2}{n^2(n-1)^2} \Big( \sum_{i\ne j}\sum_{k\ne l} b^2_{ijkl} \Big)^{2} h(t) - \frac{C}{n^4} \Big( \sum_{i,j}\sum_{k,l} b^2_{ijkl} \Big)^{2} \Big( \frac{1}{n} + \delta^2 t^2 \Big) h(t) \\
&\ge \frac{2}{n^2(n-1)^2} \Big( \sum_{i\ne j}\sum_{k\ne l} b^2_{ijkl} \Big)^{2} h(t) - C( n\delta^4 + n^2\delta^6 t^2 )\, h(t).
\end{aligned}
\]
(5.50) Ne xt for 𝑄 22 , w e e xpand it b y the definiton of 𝑉 𝑝 𝑞 as 𝑄 22 = 2 𝑡 𝑛 2 ( 𝑛 − 1 ) 2  𝑖 ≠ 𝑗  𝑝 ≠ 𝑞  𝑘 ≠ 𝑙  𝑢 ≠ 𝑣 𝑏 2 𝑖 𝑗 𝑘 𝑙 𝑏 2 𝑝 𝑞𝑢 𝑣 ( 𝑎 𝑝 𝑢 + 𝑎 𝑞 𝑣 ) E  Ψ 𝑡 ( 𝑊 ( 𝑝 , 𝑞 ) 𝑛 )    𝜋 ( 𝑝 ) = 𝑢 𝜋 ( 𝑞 = 𝑣 )  + 2 𝑡 𝑛 2 ( 𝑛 − 1 ) 2  𝑖 ≠ 𝑗  𝑝 ≠ 𝑞  𝑘 ≠ 𝑙  𝑢 ≠ 𝑣 𝑏 2 𝑖 𝑗 𝑘 𝑙 𝑏 2 𝑝 𝑞𝑢 𝑣 E                 𝑛  𝑠 = 1 𝑠 ≠ 𝑝 ( 𝑏 𝑝 𝑠 𝜋 ( 𝑝 ) 𝜋 ( 𝑠 ) + 𝑏 𝑠 𝑝 𝜋 ( 𝑠 ) 𝜋 ( 𝑝 ) )      Ψ 𝑡 ( 𝑊 ( 𝑝 , 𝑞 ) 𝑛 )         𝜋 ( 𝑝 ) = 𝑢 𝜋 ( 𝑞 = 𝑣 )            , + 2 𝑡 𝑛 2 ( 𝑛 − 1 ) 2  𝑖 ≠ 𝑗  𝑝 ≠ 𝑞  𝑘 ≠ 𝑙  𝑢 ≠ 𝑣 𝑏 2 𝑖 𝑗 𝑘 𝑙 𝑏 2 𝑝 𝑞𝑢 𝑣 E                 𝑛  𝑠 = 1 𝑠 ∉ { 𝑝 , 𝑞 } ( 𝑏 𝑠 𝑞 𝜋 ( 𝑠 ) 𝜋 ( 𝑞 ) + 𝑏 𝑞 𝑠 𝜋 ( 𝑞 ) 𝜋 ( 𝑠 ) )      Ψ 𝑡 ( 𝑊 ( 𝑝 , 𝑞 ) 𝑛 )         𝜋 ( 𝑝 ) = 𝑢 𝜋 ( 𝑞 = 𝑣 )            . S ince  𝑖 ≠ 𝑗  𝑝 ≠ 𝑞  𝑘 ≠ 𝑙  𝑢 ≠ 𝑣 𝑏 2 𝑖 𝑗 𝑘 𝑙 𝑏 2 𝑝 𝑞𝑢 𝑣 ( 𝑎 𝑝 𝑢 + 𝑎 𝑞 𝑣 ) =  𝑖 ≠ 𝑗 ≠ 𝑝 ≠ 𝑞  𝑘 ≠ 𝑙 ≠ 𝑢 ≠ 𝑣 𝑏 2 𝑖 𝑗 𝑘 𝑙 𝑏 2 𝑝 𝑞𝑢 𝑣 ( 𝑎 𝑝 𝑢 + 𝑎 𝑞 𝑣 ) +  𝑖 ≠ 𝑗 𝑛  𝑝 = 1  𝑘 ≠ 𝑙  𝑢 ≠ 𝑣  𝑏 2 𝑖 𝑗 𝑘 𝑙 𝑏 2 𝑝 𝑖𝑢 𝑣 ( 𝑎 𝑝 𝑢 + 𝑎 𝑖 𝑣 ) + 𝑏 2 𝑖 𝑗 𝑘 𝑙 𝑏 2 𝑝 𝑗 𝑢 𝑣 ( 𝑎 𝑝 𝑢 + 𝑎 𝑗 𝑣 )  +  𝑖 ≠ 𝑗 𝑛  𝑞 = 1 𝑞 ∉ { 𝑖 , 𝑗 }  𝑘 ≠ 𝑙  𝑢 ≠ 𝑣  𝑏 2 𝑖 𝑗 𝑘 𝑙 𝑏 2 𝑖 𝑞𝑢 𝑣 ( 𝑎 𝑖 𝑢 + 𝑎 𝑞 𝑣 ) + 𝑏 2 𝑖 𝑗 𝑘 𝑙 𝑏 2 𝑗 𝑞 𝑢 𝑣 ( 𝑎 𝑗 𝑢 + 𝑎 𝑞 𝑣 )  +  𝑖 ≠ 𝑗 ≠ 𝑝 ≠ 𝑞  𝑘 ≠ 𝑙 𝑛  𝑢 = 1  𝑏 2 𝑖 𝑗 𝑘 𝑙 𝑏 2 𝑝 𝑞𝑢 𝑘 ( 𝑎 𝑝 𝑢 + 𝑎 𝑞 𝑘 ) + 𝑏 2 𝑖 𝑗 𝑘 𝑙 𝑏 2 𝑝 𝑞𝑢𝑙 ( 𝑎 𝑝 𝑢 + 𝑎 𝑞 𝑙 )  42 +  𝑖 ≠ 𝑗 ≠ 𝑝 ≠ 𝑞  𝑘 ≠ 𝑙 𝑛  𝑣 = 1 𝑣 ∉ { 𝑘 ,𝑙 }  𝑏 2 𝑖 𝑗 𝑘 𝑙 𝑏 2 𝑝 𝑞 𝑘 𝑣 ( 𝑎 𝑝 𝑘 + 𝑎 𝑞 𝑣 ) + 𝑏 2 𝑖 𝑗 𝑘 𝑙 𝑏 2 𝑝 𝑞𝑙 𝑣 ( 𝑎 𝑝𝑙 + 𝑎 𝑞 𝑣 )  , to g ether with ( 2.6 ) and ( 5.27 ), it follo ws t hat 𝑄 22 ≥ 2 𝑡 𝑛 2 ( 𝑛 − 1 ) 2  𝑖 ≠ 𝑗 ≠ 𝑝 ≠ 𝑞  𝑘 ≠ 𝑙 ≠ 𝑢 ≠ 𝑣 𝑏 2 𝑖 𝑗 𝑘 𝑙 𝑏 2 𝑝 𝑞𝑢 𝑣 ( 𝑎 𝑝 𝑢 + 𝑎 𝑞 𝑣 ) ℎ ( 𝑡 ) − 𝑛 𝛿 4 ℎ ( 𝑡 ) − 𝐶 𝑛 4     𝑖 , 𝑗  𝑘 , 𝑙 𝑏 2 𝑖 𝑗 𝑘 𝑙    2  1 𝑛 + 𝛿 2 𝑡 2  ℎ ( 𝑡 ) ≥ 2 𝑡 𝑛 2 ( 𝑛 − 1 ) 2  𝑖 ≠ 𝑗 ≠ 𝑝 ≠ 𝑞  𝑘 ≠ 𝑙 ≠ 𝑢 ≠ 𝑣 𝑏 2 𝑖 𝑗 𝑘 𝑙 𝑏 2 𝑝 𝑞𝑢 𝑣 ( 𝑎 𝑝 𝑢 + 𝑎 𝑞 𝑣 ) ℎ ( 𝑡 ) − 𝐶 ( 𝑛 𝛿 4 + 𝑛 2 𝛿 6 𝑡 2 ) ℎ ( 𝑡 ) . 
(5.51)

Then we consider $Q_{23}$. Again by (2.6) and (5.27), we obtain
\[
Q_{23} \ge -\frac{C}{n^4} \Big( \sum_{i,j}\sum_{k,l} b^2_{ijkl} \Big)^{2} \delta^2 t^2 h(t) \ge -C n^2\delta^6 t^2 h(t). \tag{5.52}
\]
From (5.49)-(5.52), we have the following lower bound for $Q_2$:
\[
\begin{aligned}
Q_2 &\ge \frac{2}{n^2(n-1)^2} \Big( \sum_{i\ne j}\sum_{k\ne l} b^2_{ijkl} \Big)^{2} h(t) + \frac{2t}{n^2(n-1)^2} \sum_{i\ne j\ne p\ne q}\sum_{k\ne l\ne u\ne v} b^2_{ijkl}\, b^2_{pquv}\, ( a_{pu} + a_{qv} )\, h(t) \\
&\quad - C( n\delta^4 + n^2\delta^6 t^2 )\, h(t).
\end{aligned} \tag{5.53}
\]
Combining (5.39), (5.48) and (5.53), we complete the proof of (4.16). By the same argument, we can also prove (4.17).

Proof of Lemma 4.2. To obtain a Berry-Esseen bound for combinatorial central limit theorems, a transformation was constructed by Goldstein (2005) and further applied by Chen and Fang (2015) and Liu and Zhang (2023) to prove Berry-Esseen bounds and Cramér-type moderate deviation results for combinatorial central limit theorems. Our transformation (4.34) differs slightly from theirs, and we follow the idea of the proof of Lemma 4.5 in Chen et al. (2010) to prove Lemma 4.2. We only prove (4.36), since (4.35) can be proved similarly. Let $A_1, A_2, A_3, A_4$ denote the four cases of (4.34) in their respective order. Let $p_m$, $m \notin \{i, j\}$, be distinct and satisfy $p_m \notin \{k, l\}$. Under $A_1$ we have $\sigma(j) \ne k$ and $i \ne \sigma^{-1}(k)$.
Hence $\sigma^{-1}(k) \notin \{i, j\}$, and we have
\[
\begin{aligned}
P\big( \sigma_{ijkl}(m) = p_m,\ m \notin \{i,j\},\ A_1 \big)
&= P\big( \sigma_{ijkl}(m) = p_m,\ m \notin \{i,j,\sigma^{-1}(k)\},\ \sigma(i) = l,\ \sigma(j) \ne k,\ \sigma_{ijkl}(\sigma^{-1}(k)) = p_{\sigma^{-1}(k)} \big) \\
&= P\big( \sigma_{ijkl}(m) = p_m,\ m \notin \{i,j,\sigma^{-1}(k)\},\ \sigma(i) = l,\ \sigma(j) = p_{\sigma^{-1}(k)} \big) \\
&= \sum_{q\notin\{i,j\}} P\big( \sigma_{ijkl}(m) = p_m,\ m \notin \{i,j,q\},\ \sigma(i) = l,\ \sigma(j) = p_q,\ \sigma(q) = k \big) \\
&= \sum_{q\notin\{i,j\}} P\big( \sigma(m) = p_m,\ m \notin \{i,j,q\},\ \sigma(i) = l,\ \sigma(j) = p_q,\ \sigma(q) = k \big) \\
&= \frac{n-2}{n!}.
\end{aligned}
\]
Case $A_2$ can be handled similarly by interchanging the roles of $i$ and $j$, and of $k$ and $l$. So we obtain
\[
P\big( \sigma_{ijkl}(m) = p_m,\ m \notin \{i,j\},\ A_1 \cup A_2 \big) = \frac{2(n-2)}{n!}.
\]
Under $A_3$, we have $\sigma(i) = l$ and $\sigma(j) = k$, therefore
\[
\begin{aligned}
P\big( \sigma_{ijkl}(m) = p_m,\ m \notin \{i,j\},\ A_3 \big)
&= P\big( \sigma_{ijkl}(m) = p_m,\ m \notin \{i,j\},\ \sigma(i) = l,\ \sigma(j) = k \big) \\
&= P\big( \sigma(m) = p_m,\ m \notin \{i,j\},\ \sigma(i) = l,\ \sigma(j) = k \big) = \frac{1}{n!}.
\end{aligned}
\]
Finally, we divide $A_4$ into subcases depending on $R = | \{ \sigma(i), \sigma(j) \} \cap \{ k, l \} |$, and let $A_{4r} = A_4 \cap \{ R = r \}$ for $r = 0, 1, 2$. If $R = 0$, applying (4.131) in Chen et al. (2010), we have
\[
P\big( \sigma_{ijkl}(m) = p_m,\ m \notin \{i,j\},\ A_{40} \big) = \frac{(n-2)(n-3)}{n!}.
\]
For $R = 1$, by (4.132) in Chen et al. (2010),
\[
P\big( \sigma_{ijkl}(m) = p_m,\ m \notin \{i,j\},\ A_{41} \big) = \frac{2(n-2)}{n!}.
\]
Finally, if $R = 2$, we have $A_{42} = \{ \sigma(i) = k,\ \sigma(j) = l \}$. So by interchanging $i$ and $j$ in the calculation for $A_3$, it follows that
\[
P\big( \sigma_{ijkl}(m) = p_m,\ m \notin \{i,j\},\ A_{42} \big) = \frac{1}{n!}.
\]
Summing over all the cases, we obtain
\[
P\big( \sigma_{ijkl}(m) = p_m,\ m \notin \{i,j\} \big) = \frac{(n-2)(n-3)}{n!} + \frac{4(n-2)}{n!} + \frac{2}{n!} = \frac{1}{(n-2)!}.
\]
(5.54) This sho w s t ha t 𝜎 𝑖 𝑗 𝑘 𝑙 is uniformly distributed o v er the set of permut a tions 𝜏 such that 𝜏 ( 𝑖 ) = 𝑘 and 𝜏 ( 𝑗 ) = 𝑙 . Hence L ( 𝜎 𝑖 𝑗 𝑘 𝑙 ) 𝑑 = L ( 𝜋 | 𝜋 ( 𝑖 ) = 𝑘 𝜋 ( 𝑗 ) = 𝑙 ) . Thus w e complete t he proof of ( 4.36 ) and b y a same argument w e can proof ( 4.35 ). Then, w e prov e L  P 𝑝 𝑞 𝑢 𝑣 𝜎 𝑖 𝑗 𝑘 𝑙  𝑑 = L  𝜋      𝜋 ( 𝑖 ) = 𝑘 𝜋 ( 𝑗 ) = 𝑙 𝜋 ( 𝑝 ) = 𝑢 𝜋 ( 𝑞 ) = 𝑣  in ( 4.37 ) based on t he result of ( 4.36 ). In the follo wing t ables, we show the values of per mutation 𝜎 𝑖 𝑗 𝑘 𝑙 and P 𝑝 𝑞 𝑢 𝑣 𝜎 𝑖 𝑗 𝑘 𝑙 on inde x 𝑖 , 𝑗 , 𝜎 − 1 ( 𝑘 ) , 𝜎 − 1 ( 𝑙 ) in sev eral cases. 44 case f or all 𝐶 1 𝐶 2 𝐶 3 𝐶 4 inde x 𝑖 𝑗 𝑝 𝑞 𝑝 𝑞 𝑝 𝑞 𝜎 − 1 ( 𝑣 ) 𝑝 𝑞 𝜎 − 1 ( 𝑢 ) 𝜎 𝑖 𝑗 𝑘 𝑙 ( inde x ) 𝑘 𝑙 𝑢 𝑣 𝑣 𝑢 𝑢 𝜎 ( 𝑞 ) 𝑣 𝜎 ( 𝑝 ) 𝑣 𝑢 P 𝑝 𝑞 𝑢 𝑣 𝜎 𝑖 𝑗 𝑘 𝑙 ( inde x ) 𝑘 𝑙 𝑢 𝑣 𝑢 𝑣 𝑢 𝑣 𝜎 ( 𝑞 ) 𝑢 𝑣 𝜎 ( 𝑝 ) case 𝐶 5 𝐶 6 𝐶 7 inde x 𝑝 𝑞 𝜎 − 1 ( 𝑣 ) 𝑝 𝑞 𝜎 − 1 ( 𝑢 ) 𝑝 𝑞 𝜎 − 1 ( 𝑢 ) 𝜎 − 1 ( 𝑣 ) 𝜎 𝑖 𝑗 𝑘 𝑙 ( inde x ) 𝜎 ( 𝑝 ) 𝑢 𝑣 𝑣 𝜎 ( 𝑞 ) 𝑢 𝜎 ( 𝑝 ) 𝜎 ( 𝑞 ) 𝑢 𝑣 P 𝑝 𝑞 𝑢 𝑣 𝜎 𝑖 𝑗 𝑘 𝑙 ( inde x ) 𝑢 𝑣 𝜎 ( 𝑝 ) 𝑢 𝑣 𝜎 ( 𝑞 ) 𝑢 𝑣 𝜎 ( 𝑝 ) 𝜎 ( 𝑞 ) S ince Ω = ∪ 7 𝑖 = 1 𝐶 𝑖 and 𝐶 𝑖 , 𝑖 = 1 , . . . , 7 are disjoint ev ents, theref ore we hav e 𝑃  P 𝑝 𝑞 𝑢 𝑣 𝜎 𝑖 𝑗 𝑘 𝑙 ( 𝑚 ) = 𝑡 𝑚 , 𝑚 ∉ { 𝑖 , 𝑗 , 𝑝 , 𝑞 }  = 7  𝑖 = 1 𝑃  P 𝑝 𝑞 𝑢 𝑣 𝜎 𝑖 𝑗 𝑘 𝑙 ( 𝑚 ) = 𝑡 𝑚 , 𝑚 ∉ { 𝑖 , 𝑗 , 𝑝 , 𝑞 } , 𝐶 𝑖  , (5.55) where 𝑡 𝑚 , 𝑚 ∉ { 𝑖 , 𝑗 , 𝑝 , 𝑞 } are distinct and satisfy 𝑡 𝑚 ∉ { 𝑘 , 𝑙 , 𝑢 , 𝑣 } . T hen w e calculate ( 5.55 ) case b y case. R ecallin g ( 5.54 ), under 𝐶 1 it f ollo w s that 𝑃  P 𝑝 𝑞 𝑢 𝑣 𝜎 𝑖 𝑗 𝑘 𝑙 ( 𝑚 ) = 𝑡 𝑚 , 𝑚 ∉ { 𝑖 , 𝑗 , 𝑝 , 𝑞 } , 𝐶 1  = 𝑃  𝜎 𝑖 𝑗 𝑘 𝑙 ( 𝑚 ) = 𝑡 𝑚 , 𝑚 ∉ { 𝑖 , 𝑗 , 𝑝 , 𝑞 } , 𝜎 𝑖 𝑗 𝑘𝑙 ( 𝑝 ) = 𝑢 𝜎 𝑖 𝑗 𝑘𝑙 ( 𝑞 ) = 𝑣  = 1 ( 𝑛 − 2 ) ! . (5.56) B y interchan ging 𝑝 and 𝑞 , w e ha v e under 𝐶 2 , 𝑃  P 𝑝 𝑞 𝑢 𝑣 𝜎 𝑖 𝑗 𝑘 𝑙 ( 𝑚 ) = 𝑡 𝑚 , 𝑚 ∉ { 𝑖 , 𝑗 , 𝑝 , 𝑞 } , 𝐶 2  = 1 ( 𝑛 − 2 ) ! . 
(5.57) Then we calculate t he probability under 𝐶 3 and 𝐶 5 as 𝑃  P 𝑝 𝑞 𝑢 𝑣 𝜎 𝑖 𝑗 𝑘 𝑙 ( 𝑚 ) = 𝑡 𝑚 , 𝑚 ∉ { 𝑖 , 𝑗 , 𝑝 , 𝑞 } , 𝐶 3  =  𝑟 ∉ { 𝑖 , 𝑗 , 𝑝 , 𝑞 } 𝑃    𝜎 𝑖 𝑗 𝑘 𝑙 ( 𝑚 ) = 𝑡 𝑚 , 𝑚 ∉ { 𝑖 , 𝑗 , 𝑝 , 𝑞 , 𝑟 } , 𝜎 𝑖 𝑗 𝑘𝑙 ( 𝑝 ) = 𝑢 𝜎 𝑖 𝑗 𝑘𝑙 ( 𝑞 ) = 𝑡 𝑟 𝜎 𝑖 𝑗 𝑘𝑙 ( 𝑟 ) = 𝑢    = 𝑛 − 4 ( 𝑛 − 2 ) ! , (5.58) and 𝑃  P 𝑝 𝑞 𝑢 𝑣 𝜎 𝑖 𝑗 𝑘 𝑙 ( 𝑚 ) = 𝑡 𝑚 , 𝑚 ∉ { 𝑖 , 𝑗 , 𝑝 , 𝑞 } , 𝐶 5  =  𝑟 ∉ { 𝑖 , 𝑗 , 𝑝 , 𝑞 } 𝑃    𝜎 𝑖 𝑗 𝑘 𝑙 ( 𝑚 ) = 𝑡 𝑚 , 𝑚 ∉ { 𝑖 , 𝑗 , 𝑝 , 𝑞 , 𝑟 } , 𝜎 𝑖 𝑗 𝑘𝑙 ( 𝑞 ) = 𝑢, 𝜎 𝑖 𝑗 𝑘𝑙 ( 𝑝 ) = 𝑡 𝑟 𝜎 𝑖 𝑗 𝑘𝑙 ( 𝑟 ) = 𝑣    = 𝑛 − 4 ( 𝑛 − 2 ) ! . (5.59) Cramér -t ype moder ate dev iation for double index permutation statistics 45 Case 𝐶 4 and 𝐶 6 can be calc ulated similar ly upon interchan ging the roles of 𝑝 and 𝑞 , and 𝑢 and 𝑣 . So we obtain 𝑃  P 𝑝 𝑞 𝑢 𝑣 𝜎 𝑖 𝑗 𝑘 𝑙 ( 𝑚 ) = 𝑡 𝑚 , 𝑚 ∉ { 𝑖 , 𝑗 , 𝑝 , 𝑞 } , 𝐶 4  = 𝑛 − 4 ( 𝑛 − 2 ) ! , (5.60) 𝑃  P 𝑝 𝑞 𝑢 𝑣 𝜎 𝑖 𝑗 𝑘 𝑙 ( 𝑚 ) = 𝑡 𝑚 , 𝑚 ∉ { 𝑖 , 𝑗 , 𝑝 , 𝑞 } , 𝐶 6  = 𝑛 − 4 ( 𝑛 − 2 ) ! . (5.61) Then we calculate t he probability under 𝐶 7 as f ollo w s 𝑃  P 𝑝 𝑞 𝑢 𝑣 𝜎 𝑖 𝑗 𝑘 𝑙 ( 𝑚 ) = 𝑡 𝑚 , 𝑚 ∉ { 𝑖 , 𝑗 , 𝑝 , 𝑞 } , 𝐶 7  =  𝑟 , 𝑠 ∉ { 𝑖 , 𝑗 , 𝑝 , 𝑞 } 𝑟 ≠ 𝑠 𝑃  𝜎 𝑖 𝑗 𝑝 𝑞 ( 𝑚 ) = 𝑡 𝑚 , 𝑚 ∉ { 𝑖 , 𝑗 , 𝑝 , 𝑞 , 𝑟 , 𝑠 } , 𝜎 𝑖 𝑗 𝑘𝑙 ( 𝑝 ) = 𝑡 𝑟 , 𝜎 𝑖 𝑗 𝑘𝑙 ( 𝑞 ) = 𝑡 𝑠 𝜎 𝑖 𝑗 𝑘𝑙 ( 𝑟 ) = 𝑢 , 𝜎 𝑖 𝑗 𝑘𝑙 ( 𝑠 ) = 𝑣  = ( 𝑛 − 4 ) ( 𝑛 − 5 ) ( 𝑛 − 2 ) ! . (5.62) Combinin g ( 5.55 )-( 5.62 ), w e deduce t ha t 𝑃  P 𝑝 𝑞 𝑢 𝑣 𝜎 𝑖 𝑗 𝑘 𝑙 ( 𝑚 ) = 𝑡 𝑚 , 𝑚 ∉ { 𝑖 , 𝑗 , 𝑝 , 𝑞 }  = 2 ( 𝑛 − 2 ) ! + 4 ( 𝑛 − 4 ) ( 𝑛 − 2 ) ! + ( 𝑛 − 4 ) ( 𝑛 − 5 ) ( 𝑛 − 2 ) ! = 1 ( 𝑛 − 4 ) ! . It implies that L  P 𝑝 𝑞 𝑢 𝑣 𝜎 𝑖 𝑗 𝑘 𝑙  𝑑 = L  𝜋      𝜋 ( 𝑖 ) = 𝑘 𝜋 ( 𝑗 ) = 𝑙 𝑝 𝑖 ( 𝑝 ) = 𝑢 𝜋 ( 𝑞 ) = 𝑣  . Ne xt w e pro v e L  P 𝑝 𝑞 𝜎 𝑖 𝑗 𝑘 𝑙  𝑑 = L  𝜋     𝜋 ( 𝑖 ) = 𝑘 𝜋 ( 𝑗 ) = 𝑙 𝜋 ( 𝑝 ) = 𝑞  in ( 4.37 ). W e define 𝑞 𝑚 , 𝑚 ∉ { 𝑖 , 𝑗 , 𝑝 } are distinct and satis f y 𝑞 𝑚 ∉ { 𝑘 , 𝑙 , 𝑞 } . 
Then by (5.54), we have
\[
\begin{aligned}
& P\bigl( \mathcal{P}^{p}_{q} \sigma^{ij}_{kl}(m) = q_m,\ m \notin \{i, j, p\},\ \sigma^{ij}_{kl}(p) = q \bigr) \\
&\quad = P\bigl( \sigma^{ij}_{kl}(m) = q_m,\ m \notin \{i, j, p\},\ \sigma^{ij}_{kl}(p) = q \bigr) = \frac{1}{(n-2)!},
\end{aligned}
\]
and
\[
\begin{aligned}
& P\bigl( \mathcal{P}^{p}_{q} \sigma^{ij}_{kl}(m) = q_m,\ m \notin \{i, j, p\},\ \sigma^{ij}_{kl}(p) \ne q \bigr) \\
&\quad = \sum_{r \notin \{i, j, p\}} P\bigl( \sigma^{ij}_{kl}(m) = q_m,\ m \notin \{i, j, p, r\},\ \sigma^{ij}_{kl}(r) = q,\ \sigma^{ij}_{kl}(p) = q_r \bigr) = \frac{n-3}{(n-2)!}.
\end{aligned}
\]
Therefore,
\[
P\bigl( \mathcal{P}^{p}_{q} \sigma^{ij}_{kl}(m) = q_m,\ m \notin \{i, j, p\} \bigr) = \frac{1}{(n-2)!} + \frac{n-3}{(n-2)!} = \frac{1}{(n-3)!}.
\]
This implies that
\[
\mathcal{L}\bigl( \mathcal{P}^{p}_{q} \sigma^{ij}_{kl} \bigr) \stackrel{d}{=} \mathcal{L}\bigl( \pi \mid \pi(i) = k,\ \pi(j) = l,\ \pi(p) = q \bigr).
\]
This completes the proof of Lemma 4.2.

Proof of Lemma 5.1. Define $\sigma^{i_1, \dots, i_k}_{l_1, \dots, l_k}$ and $S_{\sigma^{i_1, \dots, i_k}_{l_1, \dots, l_k}}$ as
\[
\sigma^{i_1, \dots, i_k}_{l_1, \dots, l_k} =
\begin{cases}
\mathcal{P}^{i_k, i_{k-1}}_{l_k, l_{k-1}} \circ \cdots \circ \mathcal{P}^{i_1, i_2}_{l_1, l_2} \circ \sigma, & k \text{ is even}, \\
\mathcal{P}^{i_k}_{l_k} \circ \mathcal{P}^{i_{k-1}, i_{k-2}}_{l_{k-1}, l_{k-2}} \circ \cdots \circ \mathcal{P}^{i_1, i_2}_{l_1, l_2} \circ \sigma, & k \text{ is odd},
\end{cases}
\]
and
\[
S_{\sigma^{i_1, \dots, i_k}_{l_1, \dots, l_k}} = \sum_{p=1}^{n} a_{p\, \sigma^{i_1, \dots, i_k}_{l_1, \dots, l_k}(p)} + \sum_{p \ne q} b_{p\, q\, \sigma^{i_1, \dots, i_k}_{l_1, \dots, l_k}(p)\, \sigma^{i_1, \dots, i_k}_{l_1, \dots, l_k}(q)},
\]
where $\sigma$ is a random permutation chosen uniformly from $S_n$ and independent of $\pi$, and $\mathcal{P}^{i}_{k}$ and $\mathcal{P}^{ij}_{kl}$ are defined as in (4.33) and (4.34). Since $i_1, \dots, i_k \in [n]$ are all distinct and $l_1, \dots, l_k \in [n]$ are also all distinct, when $k$ is even, applying (4.36) in Lemma 4.2 $k/2$ times gives
\[
\mathcal{L}\bigl( \sigma^{i_1, \dots, i_k}_{l_1, \dots, l_k} \bigr) \stackrel{d}{=} \mathcal{L}\bigl( \pi \mid \pi(i_1) = l_1, \dots, \pi(i_k) = l_k \bigr). \tag{5.63}
\]
When $k$ is odd, applying (4.35) once and (4.36) $(k-1)/2$ times also gives (5.63). It then follows by definition that
\[
\mathcal{L}\bigl( S_{\sigma^{i_1, \dots, i_k}_{l_1, \dots, l_k}} \bigr) \stackrel{d}{=} \mathcal{L}\bigl( W_n \mid \pi(i_1) = l_1, \dots, \pi(i_k) = l_k \bigr). \tag{5.64}
\]
Denote $\bigl| \mathbb{E}\{ \Psi_t(W_n) \mid \pi(i_1) = l_1, \dots, \pi(i_k) = l_k \} - h(t) \bigr|$ by $H^{i_1, \dots, i_k}_{l_1, \dots, l_k}$. By (5.64) it follows that
\[
H^{i_1, \dots, i_k}_{l_1, \dots, l_k} = \Bigl| \mathbb{E}\bigl\{ \Psi_t\bigl( S_{\sigma^{i_1, \dots, i_k}_{l_1, \dots, l_k}} \bigr) \bigr\} - h(t) \Bigr|.
\]
Applying the Taylor expansion of $e^{tx}$ at $x = T_n = \sum_{i=1}^{n} a_{i \sigma(i)} + \sum_{i \ne j} b_{i j \sigma(i) \sigma(j)}$, we obtain
\[
\Psi_t\bigl( S_{\sigma^{i_1, \dots, i_k}_{l_1, \dots, l_k}} \bigr) = \Psi_t(T_n) + \bigl( S_{\sigma^{i_1, \dots, i_k}_{l_1, \dots, l_k}} - T_n \bigr)\, \mathbb{E}\Bigl\{ \Psi_t'\Bigl( T_n + U \bigl( S_{\sigma^{i_1, \dots, i_k}_{l_1, \dots, l_k}} - T_n \bigr) \Bigr) \Bigr\},
\]
where $U$ is a $U[0,1]$ random variable independent of all other random variables. Therefore
\[
H^{i_1, \dots, i_k}_{l_1, \dots, l_k} = \bigl| \mathbb{E}\{ \Psi_t(T_n) \} - h(t) \bigr| + t \bigl| \mathbb{E}\{ V \Psi_t(T_n + UV) \} \bigr| = t \bigl| \mathbb{E}\{ V \Psi_t(T_n + UV) \} \bigr|,
\]
where $V = S_{\sigma^{i_1, \dots, i_k}_{l_1, \dots, l_k}} - T_n$. By the definitions of $S_{\sigma^{i_1, \dots, i_k}_{l_1, \dots, l_k}}$ and $T_n$ and the condition (2.6), we have
\[
\begin{aligned}
|V| \le{} & \sum_{m=1}^{k} \bigl| a_{i_m \sigma(i_m)} \bigr| + \sum_{m=1}^{k} \bigl| a_{\sigma^{-1}(l_m)\, l_m} \bigr| + \sum_{m=1}^{k} \Bigl| a_{i_m\, \sigma^{i_1, \dots, i_k}_{l_1, \dots, l_k}(i_m)} \Bigr| + \sum_{m=1}^{k} \Bigl| a_{( \sigma^{i_1, \dots, i_k}_{l_1, \dots, l_k} )^{-1}(l_m)\, l_m} \Bigr| \\
& + \sum_{m=1}^{k} \sum_{s \notin \{i_1, \dots, i_m\}} \bigl| b_{i_m\, s\, \sigma(i_m)\, \sigma(s)} + b_{s\, i_m\, \sigma(s)\, \sigma(i_m)} \bigr| \\
& + \sum_{m=1}^{k} \sum_{s \notin \{i_1, \dots, i_m\}} \Bigl| b_{i_m\, s\, \sigma^{i_1, \dots, i_k}_{l_1, \dots, l_k}(i_m)\, \sigma^{i_1, \dots, i_k}_{l_1, \dots, l_k}(s)} + b_{s\, i_m\, \sigma^{i_1, \dots, i_k}_{l_1, \dots, l_k}(s)\, \sigma^{i_1, \dots, i_k}_{l_1, \dots, l_k}(i_m)} \Bigr| \\
& + \sum_{m=1}^{k} \sum_{s \notin \{i_1, \dots, i_k, \sigma^{-1}(l_1), \dots, \sigma^{-1}(l_m)\}} \bigl| b_{s\, \sigma^{-1}(l_m)\, \sigma(s)\, l_m} + b_{\sigma^{-1}(l_m)\, s\, l_m\, \sigma(s)} \bigr| \\
& + \sum_{m=1}^{k} \sum_{s \notin \{i_1, \dots, i_k, \sigma^{-1}(l_1), \dots, \sigma^{-1}(l_m)\}} \Bigl| b_{s\, \sigma^{-1}(l_m)\, \sigma^{i_1, \dots, i_k}_{l_1, \dots, l_k}(s)\, \sigma^{i_1, \dots, i_k}_{l_1, \dots, l_k}(\sigma^{-1}(l_m))} \Bigr| \\
& + \sum_{m=1}^{k} \sum_{s \notin \{i_1, \dots, i_k, \sigma^{-1}(l_1), \dots, \sigma^{-1}(l_m)\}} \Bigl| b_{\sigma^{-1}(l_m)\, s\, \sigma^{i_1, \dots, i_k}_{l_1, \dots, l_k}(\sigma^{-1}(l_m))\, \sigma^{i_1, \dots, i_k}_{l_1, \dots, l_k}(s)} \Bigr| \\
\le{} & C k \delta.
\end{aligned}
\]
Recalling that $0 < t < 1/\delta$, it follows that
\[
H^{i_1, \dots, i_k}_{l_1, \dots, l_k} \le C k e^{t \delta k}\, t \delta\, h(t) \le C k e^{k}\, t \delta\, h(t). \tag{5.65}
\]
This completes the proof of Lemma 5.1.

Acknowledgments

Liu S.H. was partially supported by the Fundamental Research Funds for the Central Universities DUT25RC(3)133.

References

Abe, O. (1969). A central limit theorem for the number of edges in the random intersection of two graphs. The Annals of Mathematical Statistics, 40(1):144–151.

Barbour, A. and Eagleson, G. (1986). Random association of symmetric arrays. Stochastic Analysis and Applications, 4(3):239–281.

Bloemena, A. (1964). Sampling from a Graph. Mathematical Centre Tracts. Mathematisch Centrum.

Chao, C.-C., Bai, Z., and Liang, W.-Q. (1993). Asymptotic normality for oscillation of permutation. Probability in the Engineering and Informational Sciences, 7(2):227–235.

Chao, C.-C., Zhao, L., and Liang, W.-Q. (1996). Estimating the error of a permutational central limit theorem. Probability in the Engineering and Informational Sciences, 10(4):533–541.

Chatterjee, S. (2021). A new coefficient of correlation. Journal of the American Statistical Association, 116(536):2009–2022.

Chen, L. H. and Fang, X. (2015). On the error bound in a combinatorial central limit theorem. Bernoulli, 21(1):335–359.

Chen, L. H., Fang, X., and Shao, Q.-M. (2013a). From Stein identities to moderate deviations. The Annals of Probability, 41(1):262–293.

Chen, L. H., Fang, X., and Shao, Q.-M. (2013b). Moderate deviations in Poisson approximation: a first attempt. Statistica Sinica, 23(4):1523–1540.

Chen, L. H., Goldstein, L., and Shao, Q.-M. (2010). Normal approximation by Stein's method. Springer Science & Business Media.

Cliff, A. and Ord, J. (1981). Spatial Processes: Models & Applications. Pion Limited.
Daniels, H. E. (1944). The relation between measures of correlation in the universe of sample permutations. Biometrika, 33(2):129–135.

Dette, H., Siburg, K. F., and Stoimenov, P. A. (2013). A copula-based non-parametric measure of regression dependence. Scandinavian Journal of Statistics, 40(1):21–41.

Fang, X. and Röllin, A. (2015). Rates of convergence for multivariate normal approximation with applications to dense graphs and doubly indexed permutation statistics. Bernoulli, 21(4):2157–2189.

Friedman, J. H. and Rafsky, L. C. (1979). Multivariate generalizations of the Wald-Wolfowitz and Smirnov two-sample tests. The Annals of Statistics, 7(4):697–717.

Friedman, J. H. and Rafsky, L. C. (1983). Graph-theoretic measures of multivariate association and prediction. The Annals of Statistics, 11(2):377–391.

Fulman, J. (2004). Stein's method and non-reversible Markov chains. Lecture Notes-Monograph Series, pages 69–77.

Goldstein, L. (2005). Berry-Esseen bounds for combinatorial central limit theorems and pattern occurrences, using zero and size biasing. Journal of Applied Probability, 42(3):661–683.

Hubert, L. and Schultz, J. (1976). Quadratic assignment as a general data analysis strategy. British Journal of Mathematical and Statistical Psychology, 29(2):190–241.

Jogdeo, K. (1968). Asymptotic normality in nonparametric methods. The Annals of Mathematical Statistics, 39(3):905–922.

Liu, S.-H. and Zhang, Z.-S. (2023). Cramér-type moderate deviations under local dependence. The Annals of Applied Probability, 33(6A):4747–4797.

Petrov, V. V. (2012). Sums of independent random variables, volume 82. Springer Science & Business Media.

Pham, D. T., Möcks, J., and Sroka, L. (1989). Asymptotic normality of double-indexed linear permutation statistics. Annals of the Institute of Statistical Mathematics, 41(3):415–427.

Rinott, Y. and Rotar, V. (1997). On coupling constructions and rates in the CLT for dependent summands with applications to the antivoter model and weighted U-statistics. The Annals of Applied Probability, pages 1080–1105.

Schilling, M. F. (1986). Multivariate two-sample tests based on nearest neighbors. Journal of the American Statistical Association, 81(395):799–806.

Shao, Q.-M., Zhang, M., and Zhang, Z.-S. (2021). Cramér-type moderate deviation theorems for nonnormal approximation. The Annals of Applied Probability, 31(1):247–283.

Shapiro, C. P. and Hubert, L. (1979). Asymptotic normality of permutation statistics derived from weighted sums of bivariate functions. The Annals of Statistics, 7(4):788–794.

Shi, H., Drton, M., and Han, F. (2022). Distribution-free consistent independence tests via center-outward ranks and signs. Journal of the American Statistical Association, 117(537):395–410.

Stein, C. (1972). A bound for the error in the normal approximation to the distribution of a sum of dependent random variables. In Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Volume 2: Probability Theory, pages 583–603. University of California Press.

Vigna, S. (2015). A weighted correlation index for rankings with ties. In Proceedings of the 24th International Conference on World Wide Web, pages 1166–1176.

Zhang, Z.-S. (2022). Berry–Esseen bounds for generalized U-statistics. Electronic Journal of Probability, 27:1–36.

Zhang, Z.-S. (2023). Cramér-type moderate deviation of normal approximation for unbounded exchangeable pairs. Bernoulli, 29(1):274–299.

Zhao, L., Bai, Z., Chao, C.-C., and Liang, W.-Q. (1997). Error bound in a central limit theorem of double-indexed permutation statistics. The Annals of Statistics, 25(5):2210–2227.
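
The case counts in the proof of Lemma 4.2 each end in an elementary identity in $n$: the five terms behind (5.54) sum to $1/(n-2)!$, the seven terms (5.56)-(5.62) sum to $1/(n-4)!$, and the two terms for $\mathcal{P}^{p}_{q}\sigma^{ij}_{kl}$ sum to $1/(n-3)!$. As a sanity check on that arithmetic only, the following minimal Python sketch verifies the three identities with exact rational arithmetic. The function names are hypothetical and not part of the paper, and the sketch takes the individual case probabilities as given; it does not simulate the transformation (4.34).

```python
from fractions import Fraction
from math import factorial


def two_point_total(n: int) -> Fraction:
    """Sum of the case probabilities leading to (5.54)."""
    nf = factorial(n)
    cases = [
        Fraction(2 * (n - 2), nf),        # A_1 and A_2 combined
        Fraction(1, nf),                  # A_3
        Fraction((n - 2) * (n - 3), nf),  # A_40 (R = 0)
        Fraction(2 * (n - 2), nf),        # A_41 (R = 1)
        Fraction(1, nf),                  # A_42 (R = 2)
    ]
    return sum(cases, Fraction(0))


def four_point_total(n: int) -> Fraction:
    """Sum of the case probabilities (5.56)-(5.62) over C_1, ..., C_7."""
    d = factorial(n - 2)
    cases = [
        Fraction(1, d), Fraction(1, d),            # C_1, C_2
        Fraction(n - 4, d), Fraction(n - 4, d),    # C_3, C_4
        Fraction(n - 4, d), Fraction(n - 4, d),    # C_5, C_6
        Fraction((n - 4) * (n - 5), d),            # C_7
    ]
    return sum(cases, Fraction(0))


def three_point_total(n: int) -> Fraction:
    """Sum of the two case probabilities in the single-index step."""
    d = factorial(n - 2)
    return Fraction(1, d) + Fraction(n - 3, d)


def check_identities(n_min: int = 6, n_max: int = 40) -> bool:
    """Return True if all three totals match the claimed values for every n in range."""
    for n in range(n_min, n_max + 1):
        if two_point_total(n) != Fraction(1, factorial(n - 2)):
            return False
        if four_point_total(n) != Fraction(1, factorial(n - 4)):
            return False
        if three_point_total(n) != Fraction(1, factorial(n - 3)):
            return False
    return True


if __name__ == "__main__":
    print(check_identities())
```

Exact fractions are used instead of floating point because the quantities involve factorials and would otherwise underflow or lose precision for moderate $n$. The underlying polynomial identities are $(n-2)(n-3) + 4(n-2) + 2 = n(n-1)$, $2 + 4(n-4) + (n-4)(n-5) = (n-2)(n-3)$ and $1 + (n-3) = n-2$.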
