Central limit theorem for the global clustering coefficient of random geometric graphs


Authors: Mingao Yuan, Md. Niamul Islam Sium

Mingao Yuan (Department of Mathematical Sciences, The University of Texas at El Paso, El Paso, Texas 79968, USA; e-mail: myuan2@utep.edu)
Md. Niamul Islam Sium (Department of Mathematical Sciences, The University of Texas at El Paso, El Paso, Texas 79968, USA; e-mail: msium@miners.utep.edu)

Abstract: The global clustering coefficient serves as a powerful metric for the structural analysis and comparison of complex networks. Random geometric graphs offer a realistic framework for representing the spatial constraints and geometry often found in real-world network datasets. In this paper, we establish a central limit theorem for the global clustering coefficient of random geometric graphs. Our main result identifies the centering and scaling sequences required for convergence in law to the standard normal distribution. Our approach varies by regime: in the dense case, we employ the Lyapunov CLT; in the intermediate case, we utilize the asymptotic theory of $U$-statistics with sample-size-dependent kernels; and in the sparse regime, we use the method of moments to derive the asymptotic distribution. Notably, the convergence rates for non-uniform and uniform random geometric graphs differ in the dense regime, yet they coincide in the sparse regime. In addition, we find that the global clustering coefficient for both uniform and non-uniform RGGs is asymptotically equal to $3/4$.

MSC2020 subject classifications: 62A09; 62E20.
Keywords and phrases: global clustering coefficient, random geometric graph, central limit theorem.

1. Introduction

Random geometric graphs have emerged as powerful models for networks with spatial or geometric structure.
In these models, nodes are typically distributed according to a probability distribution over a metric space, and an edge connects two nodes if the distance between them is below a certain threshold [23, 11, 2]. When the node distribution is uniform, the model is referred to as a uniform random geometric graph. Conversely, if the distribution is non-uniform, it is termed a non-uniform random geometric graph. Unlike classical random graph models such as the Erdős–Rényi model, where edges form independently with a fixed probability [8], random geometric graphs naturally incorporate spatial constraints and geometric dependencies that arise in many real-world networks [23, 4, 35]. The spatial embedding induces correlations among edges that share common endpoints, leading to structural properties fundamentally different from those of classical random graphs [5]. These models find applications in diverse domains, including wireless sensor networks [13], biological systems [14], and public health [21, 33].

The global clustering coefficient is a foundational measure for quantifying network topology, reflecting the tendency of nodes to form tightly connected clusters [26, 20]. Defined as the ratio of three times the number of triangles to the number of 2-paths [5], this metric has become central to network data analysis [24, 20, 18, 25]. For example, [24] used clustering coefficients to differentiate between network structures through extensive simulation studies.

Understanding the properties of the global clustering coefficient within random graph models is essential [18, 5, 29]. One primary research focus involves deriving analytical expressions for this metric.
For example, [18] analytically derived the clustering coefficient for an online social network model, while [5] established an expression for a random geometric graph model constructed differently from the one presented here. Another significant research direction explores the limiting distribution of the global clustering coefficient [29]. To our knowledge, the clustering coefficient of non-uniform random geometric graphs, in which nodes are distributed non-uniformly, remains less well understood.

In this paper, we establish a central limit theorem for the global clustering coefficient of non-uniform random geometric graphs. We begin by characterizing the leading-order terms for the expected counts of small subgraphs, such as edges, 2-paths, and triangles. These results are foundational to our analysis and of independent interest, as they elucidate how non-uniform vertex distributions influence local subgraph architecture. Building on these characterizations, we derive the exact asymptotic limit and limiting distribution of the clustering coefficient. Our analysis demonstrates that the global clustering coefficient in the non-uniform random geometric graph converges in probability to $3/4$, characterizing a limit that, to our knowledge, has not been previously reported. This limit coincides with the uniform case. We prove that, under appropriate scaling and centering, the coefficient converges in law to the standard normal distribution. Our methodology spans three distinct regimes: the Lyapunov central limit theorem for the dense case, the asymptotic theory of $U$-statistics with sample-size-dependent kernels for the intermediate case, and the method of moments for the sparse case. Despite sharing identical limits and asymptotic distributions, the rates of convergence for non-uniform and uniform random geometric graphs differ.
Specifically, the non-uniform model exhibits a slower convergence rate than its uniform counterpart in the dense regime. However, in the sparse regime, these convergence rates coincide. These findings highlight a critical disparity in the underlying architecture of these models, offering deeper insight into how non-uniformity shapes the global properties of geometric graphs.

The paper is organized as follows. Our main results are presented in Section 2, while the proofs are deferred to Section 3.

Notations: Throughout this paper, we adopt the Bachmann–Landau notation for asymptotic analysis. Let $c_1, c_2$ be two positive constants. For two positive sequences $a_n, b_n$, denote $a_n = \Theta(b_n)$ if $c_1 \le \frac{a_n}{b_n} \le c_2$; denote $a_n = O(b_n)$ if $\frac{a_n}{b_n} \le c_2$; $a_n = o(b_n)$ or $b_n = \omega(a_n)$ if $\lim_{n \to \infty} \frac{a_n}{b_n} = 0$. Let $X_n$ be a sequence of random variables. $X_n = O_P(a_n)$ means $\frac{X_n}{a_n}$ is bounded in probability. $X_n = o_P(a_n)$ means $\frac{X_n}{a_n}$ converges to zero in probability. $X_n \Rightarrow N(0,1)$ represents $X_n$ converging in distribution to the standard normal distribution. The notation $\sum_{i \ne j \ne k \ne l}$ represents summation over indices $i, j, k, l \in \{1, 2, \dots, n\}$ with $i \ne j$, $i \ne k$, $i \ne l$, $j \ne k$, $j \ne l$, $k \ne l$. $I[E]$ is the indicator function of event $E$. $E^c$ represents the complement of event $E$. For a set $B$, $|B|$ denotes the number of elements in $B$. For a function $g(x)$, $g^{(k)}(x)$ denotes the $k$-th derivative of $g(x)$. We also use $f'(x)$, $f''(x)$ and $f'''(x)$ to denote the first, second, and third derivatives of $f(x)$, respectively.

2. Main results

Let $G = (V, E)$ be an undirected graph on $n$ vertices, where $V = \{1, 2, \dots, n\}$ is the vertex set and $E \subseteq \{\{i, j\} : i, j \in V, i \ne j\}$ is the edge set. An adjacency matrix $A$ is a square $n \times n$ matrix used to represent a graph, where $A_{ij} = 1$ if $\{i, j\} \in E$, and $A_{ij} = 0$ otherwise.
The number of edges incident to a vertex $i$ is called its degree, denoted by $d_i$. A triangle is a set of three mutually adjacent vertices, while a 2-path is a set of three distinct vertices connected by exactly two edges. The global clustering coefficient $C_n$ is defined as the ratio of the number of triangles to the number of 2-paths [26, 19]:
\[
C_n = \frac{\sum_{i \ne j \ne k} A_{ij} A_{jk} A_{ki}}{\sum_{i \ne j \ne k} A_{ij} A_{jk}}.
\]
It quantifies the transitivity of the graph, representing the tendency of two neighbors of a common vertex to be connected.

A graph is considered random if its edge set is determined according to a specific stochastic process. In the classical Erdős–Rényi model, every edge exists independently with a fixed, uniform probability. In contrast, random geometric graphs (RGGs) incorporate spatial constraints: vertices are distributed within a metric space, and an edge exists between two vertices if and only if their distance is less than a prescribed connectivity radius [6, 12]. In this work, we focus on the random geometric graphs defined below [6, 12].

Definition 2.1. Let $r_n \in [0, 0.5]$ be a real number and $f(x)$ be a probability density function on $[0, 1]$. Given independent random variables $X_1, X_2, \dots, X_n$ distributed according to $f(x)$, the Random Geometric Graph (RGG) $G_n(f, r_n)$ is defined as
\[
A_{ij} = I[d(X_i, X_j) \le r_n],
\]
where $A_{ii} = 0$ and $d(X_i, X_j) = \min\{|X_i - X_j|,\; 1 - |X_i - X_j|\}$.

The model $G_n(f, r_n)$ is a one-dimensional random geometric graph, where the vertices are independently sampled from the density function $f(x)$ on the unit interval $[0, 1]$. An edge exists between any two vertices within distance $r_n$ [15, 16, 3]. This framework effectively captures the spatial geometry and structural dependencies characteristic of real-world networks.
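The model of Definition 2.1 is straightforward to simulate. The sketch below is our illustration (all function names are ours, not from the paper): it samples $G_n(f, r_n)$ with the circle distance and evaluates $C_n$ via the ordered-triple formula above.

```python
import random

def circle_dist(x, y):
    # d(x, y) = min(|x - y|, 1 - |x - y|): distance on the circle of circumference 1.
    d = abs(x - y)
    return min(d, 1.0 - d)

def sample_rgg(n, r, sampler, seed=0):
    # Sample n vertex locations with `sampler` and connect pairs within distance r.
    rng = random.Random(seed)
    xs = [sampler(rng) for _ in range(n)]
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if circle_dist(xs[i], xs[j]) <= r:
                adj[i][j] = adj[j][i] = 1
    return adj

def clustering_coefficient(adj):
    # C_n = (sum over ordered triples of A_ij A_jk A_ki) / (sum of A_ij A_jk).
    n = len(adj)
    triangles = paths = 0
    for j in range(n):
        nbrs = [k for k in range(n) if adj[j][k]]
        for i in nbrs:
            for k in nbrs:
                if i != k:
                    paths += 1          # i - j - k is an ordered 2-path centered at j
                    if adj[i][k]:
                        triangles += 1  # the 2-path closes into a triangle
    return triangles / paths
```

For instance, `clustering_coefficient(sample_rgg(200, 0.1, lambda rng: rng.random()))` (the uniform case) typically returns a value near $3/4$, the limit identified in this paper.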
When $f(x)$ is the uniform density, $G_n(f, r_n)$ corresponds to the uniform random geometric graph. Conversely, the model is termed a non-uniform random geometric graph when $f(x)$ is non-constant. To date, research has focused extensively on the uniform case, largely due to its analytical tractability [7, 9, 12, 34, 28, 29, 31]. However, non-uniform RGGs have recently gained prominence for their superior capacity to model the inhomogeneities inherent in empirical networks [22, 10, 30].

We now present a pivotal proposition that is fundamental to the study of the asymptotic properties of the global clustering coefficient for non-uniform random geometric graphs.

Proposition 2.2. Let $A$ be sampled from $G_n(f, r_n)$. Suppose $f(x) = g(x) I[0 \le x \le 1]$, where $g(x)$ is a periodic function with period one that is bounded away from zero and has a bounded fourth derivative. In addition, we assume $r_n = o(1)$. Let $\mu_n = \frac{E[A_{12} A_{13} A_{23}]}{E[A_{12} A_{13}]}$. Then we have
\[
E[A_{12} \mid X_1] = 2 r_n f(X_1) + \frac{f''(X_1)}{3} r_n^3 + O(r_n^5), \tag{1}
\]
\[
E[A_{12} A_{13} \mid X_1] = 4 r_n^2 f^2(X_1) + \frac{4 r_n^4}{3} f(X_1) f''(X_1) + O(r_n^6), \tag{2}
\]
\[
E[A_{12} A_{23} \mid X_1] = 4 r_n^2 f^2(X_1) + \frac{r_n^4}{3} \left( 4 (f'(X_1))^2 + 6 f(X_1) f''(X_1) \right) + O(r_n^6), \tag{3}
\]
\[
E[A_{12} A_{13} A_{23} \mid X_1] = 3 r_n^2 f^2(X_1) + \frac{5 r_n^4}{12} \left( (f'(X_1))^2 + 2 f(X_1) f''(X_1) \right) + O(r_n^5), \tag{4}
\]
\[
E[A_{12} A_{13}] = 4 r_n^2 E[f^2(X_1)] + \frac{4 r_n^4}{3} E[f(X_1) f''(X_1)] + O(r_n^5), \tag{5}
\]
\[
E[A_{12} A_{13} A_{23}] = 3 r_n^2 E[f^2(X_1)] + \frac{5 r_n^4}{12} E\!\left[ (f'(X_1))^2 + 2 f(X_1) f''(X_1) \right] + O(r_n^6), \tag{6}
\]
\[
\mu_n = \frac{3 + r_n^2 a_f}{4 + r_n^2 b_f} + O(r_n^3). \tag{7}
\]
Here, the remainder terms $O(r_n^3)$, $O(r_n^5)$ and $O(r_n^6)$ do not depend on $X_1$, and
\[
a_f = \frac{5 E[(f'(X_1))^2 + 2 f(X_1) f''(X_1)]}{12 E[f^2(X_1)]}, \qquad b_f = \frac{4 E[f(X_1) f''(X_1)]}{3 E[f^2(X_1)]}. \tag{8}
\]

Proposition 2.2 plays a crucial role in our analysis and is of independent interest for understanding the local subgraph structures of non-uniform random geometric graphs. Specifically, it characterizes the leading-order terms for the expected counts of small subgraphs, including edges, 2-paths, and triangles, thereby illuminating how non-uniform vertex distributions affect these counts. These results are also essential for deriving both the asymptotic limit and the limiting distribution of the clustering coefficient. For instance, we demonstrate in the sequel that the clustering coefficient is asymptotically equal to $\mu_n$, which converges to $3/4$.

Before moving forward, we examine the assumptions of Proposition 2.2 that underlie our primary findings. The condition $r_n = o(1)$ ensures that the resulting graph is relatively sparse. This assumption is consistent with empirical observations, as most real-world networks exhibit sparsity [1]. The condition $f(x) = g(x) I[0 \le x \le 1]$, where $g$ satisfies $g(x+1) = g(x) = g(x-1)$ for all $x \in \mathbb{R}$, effectively defines $f(x)$ as a density function on a circle with circumference 1. This assumption is crucial for the analytical tractability of the leading terms in (1)–(7) of Proposition 2.2. Without this periodic structure, the expressions for these terms would become prohibitively complex due to boundary effects. The assumption that $g(x)$ possesses a bounded fourth derivative is a technical requirement.
This condition allows for a Taylor expansion that ensures the remainder terms of (1)–(7) are $O(r_n^3)$, $O(r_n^5)$ or $O(r_n^6)$. Crucially, these bounds hold uniformly in $X_1$, ensuring the remainders are independent of the specific location of the vertex. The assumptions imposed on the density $f(x)$ in Proposition 2.2 are relatively mild and easily satisfied in practice. Notably, they are fulfilled by the uniform distribution as well as the ubiquitous von Mises density. A more comprehensive analysis of these conditions is provided later.

Equipped with Proposition 2.2, we now present a central limit theorem for the global clustering coefficient.

Theorem 2.3. Suppose the assumption of Proposition 2.2 holds. Let $\mu_n = \frac{E[A_{12} A_{13} A_{23}]}{E[A_{12} A_{13}]}$. Then the following results hold.

(I). If $n r_n^5 = \omega(1)$, then
\[
\frac{16 \sqrt{n}\, E[f^2(X_1)]}{3 r_n^2 \sigma_1} \left( C_n - \mu_n \right) \Rightarrow N(0, 1), \tag{9}
\]
provided that
\[
\sigma_1^2 = E\!\left[ \left( -3 c_f f^2(X_1) - 2 f(X_1) f''(X_1) - (f'(X_1))^2 \right)^2 \right] > 0, \qquad \text{where } c_f = \frac{E[(f'(X_1))^2]}{E[f^2(X_1)]}.
\]

(II). Suppose $n r_n = \omega(1)$. If $n r_n^5 = o(1)$ or $f(x)$ is the uniform density, then
\[
\frac{2 \sqrt{2}\, n r_n^2 E[f^2(X_1)]}{3 \sigma_{2n}} \left( C_n - \mu_n \right) \Rightarrow N(0, 1), \tag{10}
\]
where $\sigma_{2n}^2 = E[h(X_1, X_2, X_3)\, h(X_1, X_2, X_4)] = \Theta(r_n^3)$ and
\[
h(X_1, X_2, X_3) = A_{12} A_{13} A_{23} - \frac{\mu_n}{3} \left( A_{12} A_{13} + A_{12} A_{23} + A_{13} A_{23} \right). \tag{11}
\]

(III). If $n r_n = o(1)$ and $n^3 r_n^2 = \omega(1)$, then
\[
\frac{8 n \sqrt{n r_n}\, \sqrt{E[f^2(X_1)]}}{3} \left( C_n - \mu_n \right) \Rightarrow N(0, 1). \tag{12}
\]

Theorem 2.3 establishes a central limit theorem for the global clustering coefficient of random geometric graphs, focusing particularly on the non-uniform case. After suitable centering and scaling, the global clustering coefficient converges in distribution to the standard normal distribution.
When $n^3 r_n^2 = o(1)$, the expected number of triangles tends to zero according to Proposition 2.2. Consequently, this regime is excluded from the scope of Theorem 2.3.

The proof strategy for Theorem 2.3 is structured as follows. In Case (I), the dense regime, we show that the coefficient is asymptotically equivalent to a sum of i.i.d. terms, allowing us to apply the Lyapunov CLT (see Lemma 3.3). In Case (II), the intermediate dense case, we prove that the coefficient is asymptotically equivalent to a $U$-statistic of order 2 with a sample-size-dependent kernel. We then utilize the asymptotic theory for such statistics to derive the distribution (see Lemmas 3.2 and 3.4). Finally, in Case (III), the sparse regime, since no existing theory directly applies as far as we know, we employ the method of moments to derive the asymptotic distribution (see Lemma 3.5).

Theorem 2.3 provides the exact limit of the global clustering coefficient; to the best of our knowledge, this value was previously unknown for non-uniform random geometric graphs. According to Theorem 2.3, the global clustering coefficient is asymptotically equal to $\mu_n$. By Proposition 2.2, we have
\[
C_n = \frac{3 + r_n^2 a_f}{4 + r_n^2 b_f} + o_P(1).
\]
For a uniform random geometric graph, $f(x)$ is constant (specifically, $f(x) = 1$), which implies $a_f = b_f = 0$; thus, the limit is precisely $3/4$, consistent with the result established in [29]. For non-uniform random geometric graphs, the limit is $3/4$ with an additional error term of order $O(r_n^2)$. When $r_n = o(1)$, the limits are asymptotically identical. Although the global clustering coefficients for both uniform and non-uniform RGGs share the same limit and asymptotic distribution, their convergence rates differ in the dense regime.
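The $3/4$ limit is easy to observe in simulation. The sketch below is our own illustration (not from the paper); the non-uniform density $f(x) = 1 + a \cos(2\pi x)$ is an assumed example that is smooth, has period one, and is bounded away from zero for $|a| < 1$.

```python
import math
import random

def clustering(xs, r):
    # Global clustering coefficient of the RGG built on circle points xs with radius r.
    n = len(xs)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = abs(xs[i] - xs[j])
            if min(d, 1.0 - d) <= r:
                adj[i][j] = adj[j][i] = 1
    tri = path = 0
    for j in range(n):
        nbrs = [k for k in range(n) if adj[j][k]]
        for a in nbrs:
            for b in nbrs:
                if a != b:
                    path += 1
                    if adj[a][b]:
                        tri += 1
    return tri / path

def sample_cosine(n, rng, a=0.5):
    # Rejection sampling from f(x) = 1 + a*cos(2*pi*x) on [0, 1] (envelope 1 + a).
    out = []
    while len(out) < n:
        x = rng.random()
        if rng.random() * (1 + a) <= 1 + a * math.cos(2 * math.pi * x):
            out.append(x)
    return out

rng = random.Random(3)
c_uniform = clustering([rng.random() for _ in range(200)], 0.1)
c_nonuniform = clustering(sample_cosine(200, rng), 0.1)
```

Both values come out close to $3/4$; consistent with the discussion above, the non-uniform case deviates from the uniform one only by a small term of order $O(r_n^2)$ plus sampling noise.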
Note that our result for non-uniform RGGs in Case (I) does not apply to the uniform case, as the latter corresponds to the setting where $\sigma_1 = 0$. In the dense case where $n r_n^5 = \omega(1)$, the clustering coefficient for non-uniform RGGs (with $\sigma_1 > 0$) converges to the standard normal distribution at a rate of $\frac{r_n^2}{\sqrt{n}}$. In contrast, for the uniform RGG, the convergence rate is $\frac{1}{n \sqrt{r_n}}$ (see Case (II)). Since $\frac{1}{n \sqrt{r_n}} = o\left( \frac{r_n^2}{\sqrt{n}} \right)$ under the condition $n r_n^5 = \omega(1)$, the convergence rate for the non-uniform case is notably slower than that of the uniform case. In the sparse regime where $n r_n^5 = o(1)$, the convergence rates for uniform and non-uniform RGGs are identical. A similar phenomenon is observed in the asymptotic properties of the friendship paradox in [27, 32]. The paradox exhibits different limits for uniform and non-uniform random geometric graphs in the dense case, whereas these limits are identical in the sparse regime.

We now provide examples of density functions satisfying the assumptions of Theorem 2.3. Beyond the obvious case of the uniform density function, we provide the following nontrivial example. Let us consider the family of von Mises densities on $[0, 1]$, defined by
\[
f(x) = g(x) I[0 \le x \le 1], \qquad g(x) = \frac{e^{\kappa \cos(2\pi (x - \mu))}}{I_0}, \tag{13}
\]
where $\kappa \ge 0$, $\mu \in [0, 1]$, and
\[
I_0 = \frac{1}{2\pi} \int_0^{2\pi} e^{\kappa \cos(x)}\, dx.
\]
The uniform distribution is recovered when $\kappa = 0$. As the concentration parameter $\kappa$ increases, the distribution clusters more tightly around the mean $\mu$, eventually behaving like a normal distribution with variance $\frac{1}{\kappa}$. The function $g(x)$ in (13) is smooth, with bounded and continuous derivatives of all orders. It is strictly bounded from below by the positive constant $(e^{\kappa} I_0(\kappa))^{-1}$, ensuring the density is non-vanishing on its domain.
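As an aside, the normalizing constant $I_0$ in (13) is the modified Bessel function $I_0(\kappa)$, and because the integrand is smooth and periodic it can be computed to high accuracy with a plain trapezoid rule. The helper below is ours (not from the paper); it evaluates $g$ and lets one check the normalization and the lower bound $(e^{\kappa} I_0(\kappa))^{-1}$ numerically.

```python
import math

def von_mises_density(kappa, mu, m=4096):
    # Returns (g, i0), where g(x) = exp(kappa*cos(2*pi*(x - mu))) / i0 and
    # i0 = (1/2pi) * integral_0^{2pi} exp(kappa*cos(t)) dt, computed by the
    # trapezoid rule (spectrally accurate for smooth periodic integrands).
    h = 2.0 * math.pi / m
    i0 = sum(math.exp(kappa * math.cos(i * h)) for i in range(m)) * h / (2.0 * math.pi)
    g = lambda x: math.exp(kappa * math.cos(2.0 * math.pi * (x - mu))) / i0
    return g, i0

g, i0 = von_mises_density(kappa=1.0, mu=0.3)
total = sum(g((i + 0.5) / 2000) for i in range(2000)) / 2000  # ~ integral of g over [0, 1]
g_min = min(g(i / 2000) for i in range(2000))                 # minimum attained at x = mu + 1/2
```

Here `total` equals 1 up to numerical error, `g_min` matches $(e^{\kappa} I_0(\kappa))^{-1}$, and $g(x + 1) = g(x)$ holds exactly because cosine has period $2\pi$.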
Furthermore, the periodicity of the cosine function implies that $g(x)$ satisfies the periodic boundary conditions $g(x+1) = g(x) = g(x-1)$ for all $x \in \mathbb{R}$. Consequently, the von Mises distribution satisfies the assumptions of Theorem 2.3.

The parameter $\sigma_1^2$ in Case (I) does not admit a simple closed-form expression. We provide several numerical values corresponding to different sets of parameters $(\kappa, \mu)$ in Table 1. All values of $\sigma_1^2$ are positive. A larger $\kappa$ results in a larger $\sigma_1^2$, while the values remain invariant with respect to the mean $\mu$. A possible reason for this is that $f(x)$ is a probability density on a circle and the distance is translation-invariant.

Table 1. Numeric values of $\sigma_1^2$.

  $(\kappa, \mu)$   (0.1, 0.1)   (0.5, 0.1)   (1.0, 0.1)   (5, 0.1)
  $\sigma_1^2$         31.78      1264.83     13924.35     13646828.67
  $(\kappa, \mu)$   (0.1, 0.3)   (0.5, 0.3)   (1.0, 0.3)   (5, 0.3)
  $\sigma_1^2$         31.78      1264.83     13924.35     13646828.67
  $(\kappa, \mu)$   (0.1, 0.5)   (0.5, 0.5)   (1.0, 0.5)   (5, 0.5)
  $\sigma_1^2$         31.78      1264.83     13924.35     13646828.67

3. Proof of main results

In this section, we provide detailed proofs of the main results.

3.1. Proof of Proposition 2.2

Firstly, we prove (1). Note that $d(X_1, X_2) \le r_n$ is equivalent to three cases:
(a) $r_n \le X_1 \le 1 - r_n$ and $X_1 - r_n \le X_2 \le X_1 + r_n$;
(b) $0 \le X_1 \le r_n$ and ($0 \le X_2 \le X_1 + r_n$ or $1 - r_n + X_1 \le X_2 \le 1$);
(c) $1 - r_n \le X_1 \le 1$ and ($X_1 - r_n \le X_2 \le 1$ or $0 \le X_2 \le X_1 + r_n - 1$).
Then the conditional expectation $E[A_{12} \mid X_1]$ can be expressed as
\[
E[A_{12} \mid X_1] =
\begin{cases}
\int_{X_1 - r_n}^{X_1 + r_n} f(x)\, dx, & \text{if } r_n \le X_1 \le 1 - r_n, \\[4pt]
\int_0^{X_1 + r_n} f(x)\, dx + \int_{1 - r_n + X_1}^{1} f(x)\, dx, & \text{if } 0 \le X_1 \le r_n, \\[4pt]
\int_{X_1 - r_n}^{1} f(x)\, dx + \int_0^{X_1 + r_n - 1} f(x)\, dx, & \text{if } 1 - r_n \le X_1 \le 1.
\end{cases}
\]
By assumption, $f(x) = g(x) I[0 \le x \le 1]$, and $g(x)$ is a periodic function with period one. Then $g(x+1) = g(x) = g(x-1)$ for all $x \in \mathbb{R}$.
Using a change of variables in the definite integral, one has
\[
\int_{1 - r_n + X_1}^{1} f(x)\, dx = \int_{1 - r_n + X_1}^{1} g(x) I[0 \le x \le 1]\, dx = \int_{X_1 - r_n}^{0} g(x+1) I[0 \le x+1 \le 1]\, dx = \int_{X_1 - r_n}^{0} g(x)\, dx,
\]
and
\[
\int_0^{X_1 + r_n - 1} f(x)\, dx = \int_1^{X_1 + r_n} g(x-1) I[0 \le x-1 \le 1]\, dx = \int_1^{X_1 + r_n} g(x)\, dx.
\]
Hence, for all $X_1 \in [0, 1]$, we obtain
\[
E[A_{12} \mid X_1] = \int_{X_1 - r_n}^{X_1 + r_n} g(x)\, dx.
\]
By assumption, $g(x)$ has a bounded fourth derivative. Moreover, $|x - X_1| \le r_n$ for $X_1 - r_n \le x \le X_1 + r_n$. Using a fourth-order Taylor expansion of $g(x)$ at $X_1$, we have
\[
g(x) = \sum_{k=0}^{3} \frac{g^{(k)}(X_1)}{k!} (x - X_1)^k + O(r_n^4), \tag{14}
\]
where the remainder term $O(r_n^4)$ does not depend on $X_1$. Based on (14), it is easy to obtain
\[
\int_{X_1 - r_n}^{X_1 + r_n} g(x)\, dx = 2 r_n g(X_1) + \frac{g''(X_1)}{3} r_n^3 + O(r_n^5). \tag{15}
\]
Note that $f(x) = g(x) I[0 \le x \le 1]$. For $X_1 \in [0, 1]$, we have $f(X_1) = g(X_1)$ and $f''(X_1) = g''(X_1)$. Thus, equation (1) is proved.

Proof of equation (2): Equation (2) is trivial, noting that (1) holds and that $X_2$ and $X_3$ are independent given $X_1$, so
\[
E[A_{12} A_{13} \mid X_1] = E[A_{12} \mid X_1]\, E[A_{13} \mid X_1] = \left( E[A_{12} \mid X_1] \right)^2.
\]

Proof of equation (3): By the properties of conditional expectations and equation (1), we have
\[
E[A_{12} A_{23} \mid X_1] = E\big[ E[A_{12} A_{23} \mid X_1, X_2] \,\big|\, X_1 \big] = E\big[ A_{12}\, E[A_{23} \mid X_2] \,\big|\, X_1 \big]
\]
\[
= 2 r_n E[A_{12} f(X_2) \mid X_1] + \frac{r_n^3}{3} E[A_{12} f''(X_2) \mid X_1] + O(r_n^5)\, E[A_{12} \mid X_1]. \tag{16}
\]
By a similar argument as in the proof of equation (1), we get
\[
E[A_{12} f(X_2) \mid X_1] = \int_{X_1 - r_n}^{X_1 + r_n} g^2(x)\, dx = 2 r_n g^2(X_1) + \frac{2}{3} \left( (g'(X_1))^2 + g(X_1) g''(X_1) \right) r_n^3 + O(r_n^5)
\]
\[
= 2 r_n f^2(X_1) + \frac{2}{3} \left( (f'(X_1))^2 + f(X_1) f''(X_1) \right) r_n^3 + O(r_n^5), \tag{17}
\]
and
\[
E[A_{12} f''(X_2) \mid X_1] = \int_{X_1 - r_n}^{X_1 + r_n} g(x) g''(x)\, dx = 2 r_n g(X_1) g''(X_1) + O(r_n^3). \tag{18}
\]
Combining (16)–(18) yields (3).

Proof of equation (4): We prove (4) in three scenarios: (I) $X_1 \in (r_n, 1 - r_n)$; (II) $X_1 \in [0, r_n)$; and (III) $X_1 \in [1 - r_n, 1]$.

Firstly, we consider $X_1 \in (r_n, 1 - r_n)$. In this case, $A_{12} A_{13} A_{23} = 1$ if and only if one of the following conditions holds:
(a) $X_2 \in (X_1, X_1 + r_n)$, and $X_3 \in (X_1, X_1 + r_n)$ or $X_3 \in (X_2 - r_n, X_1)$;
(b) $X_2 \in (X_1 - r_n, X_1)$, and $X_3 \in (X_1 - r_n, X_1)$ or $X_3 \in (X_1, X_2 + r_n)$.
Then the conditional expectation in (4) can be expressed as
\[
E[A_{12} A_{13} A_{23} \mid X_1] = \int_{X_1}^{X_1 + r_n} f(x_2) \left( \int_{X_1}^{X_1 + r_n} f(x_3)\, dx_3 + \int_{x_2 - r_n}^{X_1} f(x_3)\, dx_3 \right) dx_2
\]
\[
+ \int_{X_1 - r_n}^{X_1} f(x_2) \left( \int_{X_1 - r_n}^{X_1} f(x_3)\, dx_3 + \int_{X_1}^{x_2 + r_n} f(x_3)\, dx_3 \right) dx_2. \tag{19}
\]
The second term of (19) can be obtained from the first term by changing $r_n$ to $-r_n$. Therefore, we only need to calculate the first term of (19). Since $f(x)$ is assumed to have a bounded fourth derivative, Taylor expanding $f(x)$ about $X_1$ yields
\[
\int_{X_1}^{X_1 + r_n} f(x_2)\, dx_2 = f(X_1) r_n + \frac{f'(X_1)}{2} r_n^2 + \frac{f''(X_1)}{6} r_n^3 + \frac{f'''(X_1)}{4!} r_n^4 + O(r_n^5). \tag{20}
\]
Then
\[
\left( \int_{X_1}^{X_1 + r_n} f(x_2)\, dx_2 \right)^2 = f(X_1)^2 r_n^2 + \frac{(f'(X_1))^2}{4} r_n^4 + f(X_1) f'(X_1) r_n^3 + \frac{2 f(X_1) f''(X_1)}{3!} r_n^4 + O(r_n^5). \tag{21}
\]
Let $F(t) = \int_0^t f(x)\, dx$.
It is straightforward to get that
\[
\int_{X_1}^{X_1 + r_n} f(x_2) \int_{x_2 - r_n}^{X_1} f(x_3)\, dx_3\, dx_2 = \int_{X_1}^{X_1 + r_n} F(X_1) f(x_2)\, dx_2 - \int_{X_1}^{X_1 + r_n} F(x_2 - r_n) f(x_2)\, dx_2. \tag{22}
\]
By (20), we have
\[
\int_{X_1}^{X_1 + r_n} F(X_1) f(x_2)\, dx_2 = F(X_1) \left( f(X_1) r_n + \frac{f'(X_1)}{2} r_n^2 + \frac{f''(X_1)}{6} r_n^3 + \frac{f'''(X_1)}{4!} r_n^4 + O(r_n^5) \right). \tag{23}
\]
Now we evaluate the second integral in (22). Applying the Taylor expansion to $F(x_2 - r_n)$ at $x_2$ yields
\[
F(x_2 - r_n) f(x_2) = \left( F(x_2) - f(x_2) r_n + \frac{f'(x_2)}{2!} r_n^2 - \frac{f''(x_2)}{3!} r_n^3 + \frac{f'''(x_2)}{4!} r_n^4 + O(r_n^5) \right) f(x_2). \tag{24}
\]
The integral of the first term of (24) is equal to
\[
\int_{X_1}^{X_1 + r_n} F(x_2) f(x_2)\, dx_2 = \frac{1}{2} \left( F(X_1 + r_n)^2 - F(X_1)^2 \right). \tag{25}
\]
By the Taylor expansion of $F(X_1 + r_n)$ at $X_1$, we have
\[
F(X_1 + r_n) = F(X_1) + f(X_1) r_n + \frac{1}{2} f'(X_1) r_n^2 + \frac{1}{3!} f''(X_1) r_n^3 + \frac{1}{4!} f'''(X_1) r_n^4 + O(r_n^5).
\]
Then we have
\[
\int_{X_1}^{X_1 + r_n} F(x_2) f(x_2)\, dx_2 = \frac{1}{2} \Big( f(X_1)^2 r_n^2 + \frac{(f'(X_1))^2}{4} r_n^4 + 2 F(X_1) f(X_1) r_n + F(X_1) f'(X_1) r_n^2
\]
\[
+ \frac{2 F(X_1) f''(X_1)}{3!} r_n^3 + \frac{2 F(X_1) f'''(X_1)}{4!} r_n^4 + f(X_1) f'(X_1) r_n^3 + \frac{2 f(X_1) f''(X_1)}{3!} r_n^4 + O(r_n^5) \Big). \tag{26}
\]
Substituting the Taylor expansion of $f(x)$, the integral of the second term of (24) is equal to
\[
r_n \int_{X_1}^{X_1 + r_n} f(x_2)^2\, dx_2 = r_n \int_{X_1}^{X_1 + r_n} \Big( f(X_1) + f'(X_1)(x_2 - X_1) + \frac{f''(X_1)}{2!} (x_2 - X_1)^2 + \frac{f'''(X_1)}{3!} (x_2 - X_1)^3 + O(r_n^4) \Big)^2 dx_2
\]
\[
= r_n \int_{X_1}^{X_1 + r_n} \Big( f(X_1)^2 + (f'(X_1))^2 (x_2 - X_1)^2 + 2 f(X_1) f'(X_1)(x_2 - X_1) + f(X_1) f''(X_1)(x_2 - X_1)^2 \Big) dx_2 + O(r_n^5)
\]
\[
= f(X_1)^2 r_n^2 + \frac{(f'(X_1))^2}{3} r_n^4 + \frac{f(X_1) f''(X_1)}{3} r_n^4 + f(X_1) f'(X_1) r_n^3 + O(r_n^5). \tag{27}
\]
Similarly, the integral of the third term of (24) is equal to
\[
\frac{r_n^2}{2!} \int_{X_1}^{X_1 + r_n} f(x_2) f'(x_2)\, dx_2 = \frac{r_n^2}{4} \left( f(X_1 + r_n)^2 - f(X_1)^2 \right) = \frac{r_n^2}{4} \left( (f'(X_1))^2 r_n^2 + 2 f(X_1) f'(X_1) r_n + f(X_1) f''(X_1) r_n^2 \right) + O(r_n^5). \tag{28}
\]
The integral of the fourth term of (24) is equal to
\[
\frac{r_n^3}{3!} \int_{X_1}^{X_1 + r_n} f''(x_2) f(x_2)\, dx_2 = \frac{f''(X_1) f(X_1)}{3!} r_n^4 + O(r_n^5). \tag{29}
\]
Combining (22)–(29) yields
\[
\int_{X_1}^{X_1 + r_n} f(x_2) \int_{x_2 - r_n}^{X_1} f(x_3)\, dx_3\, dx_2 = \frac{r_n^2}{2} f^2(X_1) + \left( \frac{f(X_1) f''(X_1)}{12} - \frac{(f'(X_1))^2}{24} \right) r_n^4 + O(r_n^5). \tag{30}
\]
By (21) and (30), we have
\[
\int_{X_1}^{X_1 + r_n} f(x_2) \int_{x_2 - r_n}^{X_1 + r_n} f(x_3)\, dx_3\, dx_2 = \frac{3}{2} f(X_1)^2 r_n^2 + \frac{5}{24} \left( (f'(X_1))^2 + 2 f(X_1) f''(X_1) \right) r_n^4 + f(X_1) f'(X_1) r_n^3 + O(r_n^5). \tag{31}
\]
Changing $r_n$ of the first term of (19) to $-r_n$, we get the second term of (19). That is,
\[
\int_{X_1 - r_n}^{X_1} f(x_2) \left( \int_{X_1 - r_n}^{X_1} f(x_3)\, dx_3 + \int_{X_1}^{x_2 + r_n} f(x_3)\, dx_3 \right) dx_2 = \frac{3}{2} f(X_1)^2 r_n^2 + \frac{5}{24} \left( (f'(X_1))^2 + 2 f(X_1) f''(X_1) \right) r_n^4 - f(X_1) f'(X_1) r_n^3 + O(r_n^5). \tag{32}
\]
In view of (19), (31) and (32), it follows that for $r_n < X_1 \le 1 - r_n$:
\[
E[A_{12} A_{13} A_{23} \mid X_1] = 3 r_n^2 f^2(X_1) + \frac{5 r_n^4}{12} \left( (f'(X_1))^2 + 2 f(X_1) f''(X_1) \right) + O(r_n^5).
\]
Now we consider the case $X_1 \in [0, r_n)$. In this case, $A_{12} A_{13} A_{23} = 1$ is satisfied if and only if one of the following four configurations occurs:
If $X_2 \in [0, X_1]$, then $X_3 \in (X_1, X_2 + r_n)$ or $X_3 \in (0, X_1)$ or $X_3 \in (1 + X_1 - r_n, 1)$.
If $X_2 \in (1 + X_1 - r_n, 1)$, then $X_3 \in [0, X_1]$ or $X_3 \in [1 + X_1 - r_n, 1]$ or $X_3 \in [X_1, X_2 + r_n - 1]$.
If $X_2 \in [X_1, r_n]$, then $X_3 \in [0, X_1 + r_n]$ or $X_3 \in [X_2 + 1 - r_n, 1]$.
If $X_2 \in [r_n, X_1 + r_n]$, then $X_3 \in (X_1, X_1 + r_n)$ or $X_3 \in (X_2 - r_n, X_1)$.
Then the conditional expectation $E[A_{12} A_{13} A_{23} \mid X_1]$ is written as
\[
E[A_{12} A_{13} A_{23} \mid X_1] = \int_0^{X_1} f(x_2) \left( \int_0^{x_2 + r_n} f(x_3)\, dx_3 + \int_{1 + X_1 - r_n}^{1} f(x_3)\, dx_3 \right) dx_2
\]
\[
+ \int_{1 + X_1 - r_n}^{1} f(x_2) \left( \int_0^{x_2 + r_n - 1} f(x_3)\, dx_3 + \int_{1 + X_1 - r_n}^{1} f(x_3)\, dx_3 \right) dx_2
\]
\[
+ \int_{X_1}^{r_n} f(x_2) \left( \int_0^{X_1 + r_n} f(x_3)\, dx_3 + \int_{x_2 + 1 - r_n}^{1} f(x_3)\, dx_3 \right) dx_2 + \int_{r_n}^{X_1 + r_n} f(x_2) \int_{x_2 - r_n}^{X_1 + r_n} f(x_3)\, dx_3\, dx_2.
\]
By assumption, $f(x) = g(x) I[0 \le x \le 1]$, and $g(x)$ satisfies $g(x+1) = g(x) = g(x-1)$ for all $x \in \mathbb{R}$. Applying a change of variables to the definite integrals, we obtain
\[
\int_{1 + X_1 - r_n}^{1} f(x_3)\, dx_3 = \int_{X_1 - r_n}^{0} g(y+1)\, dy = \int_{X_1 - r_n}^{0} g(y)\, dy,
\]
\[
\int_{x_2 + 1 - r_n}^{1} f(x_3)\, dx_3 = \int_{x_2 - r_n}^{0} g(y+1)\, dy = \int_{x_2 - r_n}^{0} g(y)\, dy,
\]
\[
\int_{1 + X_1 - r_n}^{1} f(x_2) \int_{X_1 - r_n}^{x_2 + r_n - 1} f(x_3)\, dx_3\, dx_2 = \int_{X_1 - r_n}^{0} g(x_2 + 1) \int_{X_1 - r_n}^{x_2 + r_n} g(x_3)\, dx_3\, dx_2 = \int_{X_1 - r_n}^{0} g(x_2) \int_{X_1 - r_n}^{x_2 + r_n} g(x_3)\, dx_3\, dx_2.
\]
Therefore, we have
\[
E[A_{12} A_{13} A_{23} \mid X_1] = \int_0^{X_1} g(x_2) \int_{X_1 - r_n}^{x_2 + r_n} g(x_3)\, dx_3\, dx_2 + \int_{X_1 - r_n}^{0} g(x_2) \int_{X_1 - r_n}^{x_2 + r_n} g(x_3)\, dx_3\, dx_2 + \int_{X_1}^{X_1 + r_n} g(x_2) \int_{x_2 - r_n}^{X_1 + r_n} g(x_3)\, dx_3\, dx_2
\]
\[
= \int_{X_1 - r_n}^{X_1} g(x_2) \int_{X_1 - r_n}^{x_2 + r_n} g(x_3)\, dx_3\, dx_2 + \int_{X_1}^{X_1 + r_n} g(x_2) \int_{x_2 - r_n}^{X_1 + r_n} g(x_3)\, dx_3\, dx_2.
\]
By the same argument as used in (19), we conclude that (4) holds for $X_1 \in [0, r_n)$. For $X_1 \in [1 - r_n, 1]$, it is easy to show that (4) holds by a similar argument.

Proof of equations (5) and (6): Equations (5) and (6) follow directly from (2) and (4), respectively, by taking expectations with respect to $X_1$.
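Before turning to (7), the leading terms of (5) and (6) can be checked by a direct Monte Carlo over i.i.d. triples. The sketch below is our own check (the density $f(x) = 1 + a \cos(2\pi x)$ is an assumed example, for which $E[f^2(X_1)] = \int_0^1 f^3(x)\,dx = 1 + \tfrac{3}{2} a^2$).

```python
import math
import random

def edge(x, y, r):
    # A_ij = I[d(X_i, X_j) <= r_n] with the circle distance of Definition 2.1.
    d = abs(x - y)
    return 1 if min(d, 1.0 - d) <= r else 0

def sample_f(rng, a):
    # Rejection sampler for the assumed density f(x) = 1 + a*cos(2*pi*x).
    while True:
        x = rng.random()
        if rng.random() * (1 + a) <= 1 + a * math.cos(2 * math.pi * x):
            return x

def subgraph_moments(r, trials, a=0.5, seed=2):
    # Monte Carlo estimates of E[A12*A13] (2-path centered at X1) and
    # E[A12*A13*A23] (triangle), from i.i.d. triples (X1, X2, X3).
    rng = random.Random(seed)
    cherry = tri = 0
    for _ in range(trials):
        x1, x2, x3 = sample_f(rng, a), sample_f(rng, a), sample_f(rng, a)
        if edge(x1, x2, r) and edge(x1, x3, r):
            cherry += 1
            tri += edge(x2, x3, r)
    return cherry / trials, tri / trials
```

With $r = 0.05$ and $a = 0.5$, the two estimates land close to the predicted leading terms $4 r^2 E[f^2(X_1)]$ and $3 r^2 E[f^2(X_1)]$; the residual discrepancy is consistent with the $O(r_n^4)$ corrections in (5)–(6) plus sampling noise.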
Proof of equation (7): By (5) and (6), it is easy to verify that
\[
\mu_n = \frac{3 r_n^2 \int_0^1 f^3(x)\, dx + \frac{5 r_n^4}{12} \int_0^1 f(x) \left( (f'(x))^2 + 2 f(x) f''(x) \right) dx + O(r_n^5)}{4 r_n^2 \int_0^1 f^3(x)\, dx + \frac{4 r_n^4}{3} \int_0^1 f^2(x) f''(x)\, dx + O(r_n^5)} = \frac{3 + r_n^2 a_f}{4 + r_n^2 b_f} + O(r_n^3).
\]
Then the proof is complete.

3.2. Some lemmas

Before proving Theorem 2.3, we present several lemmas. For convenience, let $\Delta_{123} = A_{12} A_{23} A_{31}$ and $P_{123} = A_{12} A_{23}$. Recall that $\mu_n = \frac{E[\Delta_{123}]}{E[P_{123}]}$. The symmetric function $h(X_1, X_2, X_3)$ defined in (11) is equal to
\[
h(X_1, X_2, X_3) = \Delta_{123} - \frac{\mu_n}{3} \left( P_{123} + P_{213} + P_{231} \right). \tag{33}
\]
Let
\[
h_1(X_1) = E[h(X_1, X_2, X_3) \mid X_1], \qquad h_2(X_1, X_2) = E[h(X_1, X_2, X_3) \mid X_1, X_2]. \tag{34}
\]
First, we provide the asymptotic count of 2-paths associated with the global clustering coefficient.

Lemma 3.1. Suppose the assumption of Proposition 2.2 holds and $n^3 r_n^2 = \omega(1)$. Then we have
\[
\frac{\sum_{i \ne j \ne k} A_{ij} A_{jk}}{4 n^3 r_n^2 E[f^2(X_1)]} = 1 + o_P(1).
\]

Proof of Lemma 3.1: By Proposition 2.2, $E[A_{12} A_{23}] = 4 r_n^2 E[f^2(X_1)] + O(r_n^4)$. The proof proceeds by showing that $\sum_{i \ne j \ne k} A_{ij} A_{jk}$ is asymptotically equal to $n^3 E[A_{12} A_{23}]$. To this end, we show that the variance of $\sum_{i \ne j \ne k} A_{ij} A_{jk}$ is of smaller order than $n^6 (E[A_{12} A_{23}])^2$. The variance can be expressed as
\[
E\left[ \left( \sum_{i \ne j \ne k} (A_{ij} A_{jk} - E[A_{ij} A_{jk}]) \right)^2 \right] = \sum_{\substack{i \ne j \ne k \\ i_1 \ne j_1 \ne k_1}} E\left[ (A_{ij} A_{jk} - E[A_{ij} A_{jk}])(A_{i_1 j_1} A_{j_1 k_1} - E[A_{i_1 j_1} A_{j_1 k_1}]) \right]. \tag{35}
\]
If $\{i, j, k\} \cap \{i_1, j_1, k_1\} = \emptyset$, then $A_{ij} A_{jk}$ and $A_{i_1 j_1} A_{j_1 k_1}$ are independent, and the expectation in (35) vanishes. Hence only index tuples with $\{i, j, k\} \cap \{i_1, j_1, k_1\} \ne \emptyset$ contribute. Suppose $|\{i, j, k\} \cap \{i_1, j_1, k_1\}| = 1$. There are at most $n^5$ such indices.
In the case where $i = i_1$, it follows from Proposition 2.2 that
\begin{align*}
E[(A_{ij}A_{jk} - E[A_{ij}A_{jk}])(A_{i_1 j_1}A_{j_1 k_1} - E[A_{ij}A_{jk}])]
&= E[A_{ij}A_{jk}A_{i j_1}A_{j_1 k_1}] - E[A_{ij}A_{jk}]\, E[A_{ij}A_{jk}] \\
&= E\left[E[A_{ij}A_{jk} \mid X_i]\, E[A_{i j_1}A_{j_1 k_1} \mid X_i]\right] - E[A_{ij}A_{jk}]\, E[A_{ij}A_{jk}] \\
&= O(r_n^4).
\end{align*}
The remaining cases, such as $i = j_1$ and $i = k_1$, can be bounded similarly.

Suppose $|\{i, j, k\} \cap \{i_1, j_1, k_1\}| = 2$. There are at most $n^4$ such indices. If $i_1, j_1 \in \{i, j, k\}$, then
\[
\left| E[(A_{ij}A_{jk} - E[A_{ij}A_{jk}])(A_{i_1 j_1}A_{j_1 k_1} - E[A_{ij}A_{jk}])] \right| \le E[A_{ij}A_{jk}A_{j_1 k_1}] + E[A_{ij}A_{jk}]\, E[A_{ij}A_{jk}] = O(r_n^3).
\]
The remaining cases $i_1, k_1 \in \{i, j, k\}$ and $j_1, k_1 \in \{i, j, k\}$ can be bounded similarly.

Suppose $\{i, j, k\} = \{i_1, j_1, k_1\}$. There are at most $n^3$ such indices. In this case,
\[
\left| E[(A_{ij}A_{jk} - E[A_{ij}A_{jk}])(A_{i_1 j_1}A_{j_1 k_1} - E[A_{ij}A_{jk}])] \right| \le E[A_{ij}A_{jk}] + E[A_{ij}A_{jk}]\, E[A_{ij}A_{jk}] = O(r_n^2).
\]
In summary, we have
\[
E\left[\left(\sum_{i \ne j \ne k} (A_{ij}A_{jk} - E[A_{ij}A_{jk}])\right)^2\right] = O(n^5 r_n^4 + n^4 r_n^3 + n^3 r_n^2) = o(n^6 r_n^4).
\]
Then the result of Lemma 3.1 follows from Markov's inequality.

Next, we present an asymptotic result for $U$-statistics of order 2 with sample-size-dependent kernels. Let $U_n = \frac{1}{2} \sum_{i \ne j} k_n(X_i, X_j)$, where $k_n(x, y)$ is a symmetric function. Without loss of generality, let $E[k_n(X_i, X_j)] = 0$. Let $q_n(x) = E[k_n(x, X_1)]$. Denote $v_n^2 = \frac{n^2}{2} E[k_n^2(X_1, X_2)] + n^3 E[q_n^2(X_1)]$. The following result from [17] characterizes the asymptotic distribution of $U_n$.

Lemma 3.2. Suppose the following conditions hold:
\[
\sup_{x,y} |k_n(x, y)| = o(v_n), \tag{36}
\]
\[
\sup_x E[|k_n(x, X_1)|] = o\left(\frac{v_n}{n}\right). \tag{37}
\]
Then $\frac{U_n}{v_n} \Rightarrow N(0, 1)$.
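The concentration in Lemma 3.1 is easy to observe numerically. The sketch below uses the uniform density on the circle (so $E[f^2(X_1)] = 1$) and counts ordered 2-paths via $\sum_j d_j(d_j - 1)$, where $d_j$ is the degree of vertex $j$; the helper names are ours:

```python
import random

def two_path_ratio(n, r, seed=1):
    # ratio of the ordered 2-path count sum_{i != j != k} A_ij A_jk
    # to its predicted scale 4 n^3 r^2 (uniform density on the circle)
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    deg = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            d = abs(xs[i] - xs[j])
            if min(d, 1.0 - d) <= r:
                deg[i] += 1
                deg[j] += 1
    # each ordered pair of distinct neighbors of j gives one 2-path
    paths = sum(d * (d - 1) for d in deg)
    return paths / (4 * n**3 * r**2)
```

For moderately large $n$ with $n r_n$ large, the returned ratio is close to 1, in line with the lemma; for a complete graph (any $r > 0.5$) it equals $n(n-1)(n-2)/(4 n^3 r^2)$ exactly.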
Lemma 3.2 is instrumental in establishing the asymptotic distribution of the global clustering coefficient for non-uniform random geometric graphs in the intermediate dense regime. Next, we provide several lemmas that establish the order of the variances and the asymptotic distributions of $h_1(X_1)$, $h_2(X_1, X_2)$ and $h(X_1, X_2, X_3)$ defined in (33) and (34).

Lemma 3.3. Under the assumption of Proposition 2.2, we have
\[
E[h_1^2(X_1)] = \frac{r_n^8}{16} \sigma_1^2 + O(r_n^9), \tag{38}
\]
where $\sigma_1^2 = E\left[\left(-3 c_f f^2(X_1) - 2 f(X_1) f''(X_1) - (f'(X_1))^2\right)^2\right]$. If $\sigma_1^2 > 0$, then
\[
\frac{4 \sum_i h_1(X_i)}{\sqrt{n}\, r_n^4 \sigma_1} \Rightarrow N(0, 1). \tag{39}
\]

Proof of Lemma 3.3: By Proposition 2.2, the function $h_1(X_1)$ has the following asymptotic expression:
\begin{align*}
h_1(X_1) &= E[\Delta_{123} \mid X_1] - \frac{\mu_n}{3}\left(E[P_{123} \mid X_1] + E[P_{213} \mid X_1] + E[P_{231} \mid X_1]\right) \\
&= 3 r_n^2 f^2(X_1) + \frac{5 r_n^4}{12}\left[(f'(X_1))^2 + 2 f(X_1) f''(X_1)\right] \\
&\quad - \frac{3 + r_n^2 a_f}{3(4 + r_n^2 b_f)}\left(12 r_n^2 f^2(X_1) + \frac{4 r_n^4}{3} f(X_1) f''(X_1) + \frac{r_n^4}{3}\left[8 (f'(X_1))^2 + 12 f(X_1) f''(X_1)\right]\right) + O(r_n^5) \\
&= \frac{r_n^4}{4}\left[(3 b_f - 4 a_f) f^2(X_1) - 2 f(X_1) f''(X_1) - (f'(X_1))^2\right] + O(r_n^5).
\end{align*}
Next, we simplify the term $3 b_f - 4 a_f$. By assumption, $f(1) = g(1) = g(0) = f(0)$ and $f'(1) = g'(1) = g'(0) = f'(0)$. Then
\[
\int_0^1 f^2(x) f''(x)\,dx = f^2(x) f'(x)\Big|_0^1 - 2 \int_0^1 f(x)(f'(x))^2\,dx = -2 \int_0^1 f(x)(f'(x))^2\,dx.
\]
Recall the definition of $a_f$ and $b_f$ in (8). Then
\[
(3 b_f - 4 a_f) \int_0^1 f^3(x)\,dx = 4 \int_0^1 f^2(x) f''(x)\,dx - \frac{5}{3} \int_0^1 f(x)(f'(x))^2\,dx - \frac{10}{3} \int_0^1 f^2(x) f''(x)\,dx = -3 \int_0^1 f(x)(f'(x))^2\,dx.
\]
Then it follows that
\[
h_1(X_1) = \frac{r_n^4}{4}\left[-3 c_f f^2(X_1) - 2 f(X_1) f''(X_1) - (f'(X_1))^2\right] + O(r_n^5). \tag{40}
\]
Then we get (38).
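The integration-by-parts identity $\int_0^1 f^2 f''\,dx = -2\int_0^1 f (f')^2\,dx$ used above is easy to check numerically for any smooth circular density. Below we use the illustrative choice $f(x) = 1 + 0.3\cos(2\pi x)$, which is a valid density on $[0,1]$ with $f(0) = f(1)$ and $f'(0) = f'(1)$; the quadrature helper is ours:

```python
import math

def trapezoid(fn, n=4096):
    # composite trapezoid rule on [0, 1]; very accurate here because
    # the integrands are smooth and 1-periodic
    h = 1.0 / n
    s = 0.5 * (fn(0.0) + fn(1.0))
    for k in range(1, n):
        s += fn(k * h)
    return s * h

# sample periodic density and its derivatives (illustrative choice)
f = lambda x: 1.0 + 0.3 * math.cos(2 * math.pi * x)
fp = lambda x: -0.6 * math.pi * math.sin(2 * math.pi * x)
fpp = lambda x: -1.2 * math.pi**2 * math.cos(2 * math.pi * x)

I1 = trapezoid(lambda x: f(x)**2 * fpp(x))   # integral of f^2 f''
I2 = trapezoid(lambda x: f(x) * fp(x)**2)    # integral of f (f')^2
```

For this choice, $I_1 = -0.36\pi^2$ and $I_2 = 0.18\pi^2$, so $I_1 = -2 I_2$ as claimed.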
Next, we establish (39) by the Lyapunov central limit theorem. By the definition of $h_1(X_1)$ in (34), it follows that $E[h_1(X_1)] = 0$. In view of (38) and (40), it is clear that
\[
s_n^2 = \sum_i E\left[\left(\frac{4 h_1(X_i)}{\sqrt{n}\, r_n^4 \sigma_1}\right)^2\right] = 1 + o(1).
\]
Furthermore, according to (40), one can easily verify that
\[
E\left[\sum_i \left(\frac{4 h_1(X_i)}{\sqrt{n}\, r_n^4 \sigma_1}\right)^4\right] = O\left(\frac{n r_n^{16}}{n^2 r_n^{16}}\right) = O\left(\frac{1}{n}\right) = o(s_n^4).
\]
Then (39) follows from the Lyapunov central limit theorem.

Lemma 3.4. Under the assumption of Proposition 2.2, we have
\[
E[h_2^2(X_1, X_2)] = \Theta(r_n^3). \tag{41}
\]
Suppose $n r_n = \omega(1)$. If $f(x)$ is the uniform density or $n r_n^5 = o(1)$, then we have
\[
\frac{\sqrt{2} \sum_{i \ne j} h_2(X_i, X_j)}{2 n \sigma_{2n}} \Rightarrow N(0, 1), \tag{42}
\]
where $\sigma_{2n}^2 = E[h_2^2(X_1, X_2)]$.

Proof of Lemma 3.4: First, we prove (41). To begin with, we find a lower bound for $E[h_2^2(X_1, X_2)]$. Define the event $E = \{X_1 \in (0.5, 0.5 + 0.1 r_n),\ X_2 \in (0.5 + 1.2 r_n, 0.5 + 1.4 r_n)\}$. On the event $E$, $A_{12} = 0$. Then $\Delta_{123} I[E] = P_{123} I[E] = P_{213} I[E] = 0$. In this case, by the definition of $h(X_1, X_2, X_3)$ in (33), the function $h_2(X_1, X_2)$ can be expressed as
\begin{align}
h_2(X_1, X_2) &= E[h(X_1, X_2, X_3) \mid X_1, X_2]\, I[E] + E[h(X_1, X_2, X_3) \mid X_1, X_2]\, I[E^c] \nonumber \\
&= -\frac{\mu_n}{3} A_{13}A_{23}\, I[E] + E[h(X_1, X_2, X_3) \mid X_1, X_2]\, I[E^c]. \tag{43}
\end{align}
Note that $I[E]\, I[E^c] = 0$. On the event $E$, if $X_3 \in (0.5 + 0.8 r_n, 0.5 + 0.9 r_n)$, then $A_{13}A_{23} = 1$. Therefore, we have
\[
h_2^2(X_1, X_2) \ge \frac{\mu_n^2}{9} A_{13}A_{23}\, I[E] \ge \frac{\mu_n^2}{9} I[E]\, I[X_3 \in (0.5 + 0.8 r_n, 0.5 + 0.9 r_n)].
\]
By assumption, $f(x)$ is bounded away from zero: there exists a positive constant $c$ such that $f(x) \ge c > 0$. Then
\[
E\left[h_2^2(X_1, X_2)\right] \ge \frac{\mu_n^2}{9} E\left[I[E]\, I[X_3 \in (0.5 + 0.8 r_n, 0.5 + 0.9 r_n)]\right]
\]
\[
\ge \frac{c^3 \mu_n^2}{9} \int_{0.5}^{0.5 + 0.1 r_n} dx_1 \int_{0.5 + 1.2 r_n}^{0.5 + 1.4 r_n} dx_2 \int_{0.5 + 0.8 r_n}^{0.5 + 0.9 r_n} dx_3 = \frac{0.002\, c^3 \mu_n^2}{9}\, r_n^3. \tag{44}
\]
Recall that $\mu_n = \frac{3}{4} + o(1)$. Then $E[h_2^2(X_1, X_2)] \ge C r_n^3$ for a positive constant $C$ and large $n$.

Next, we find an upper bound for $E[h_2^2(X_1, X_2)]$. By the properties of conditional expectation, we have
\begin{align}
E\left[h_2^2(X_1, X_2)\right] &= E\left[E[h(X_1, X_2, X_3) \mid X_1, X_2]\, E[h(X_1, X_2, X_4) \mid X_1, X_2]\right] \nonumber \\
&= E\left[E[h(X_1, X_2, X_3)\, h(X_1, X_2, X_4) \mid X_1, X_2]\right] = E\left[h(X_1, X_2, X_3)\, h(X_1, X_2, X_4)\right]. \tag{45}
\end{align}
From the definition of $h(X_1, X_2, X_3)$ in (33), a straightforward calculation yields
\begin{align}
h(X_1, X_2, X_3)\, h(X_1, X_2, X_4) &\le A_{12}A_{13}A_{23}A_{14}A_{24} + \frac{\mu_n}{3} A_{12}A_{13}A_{23}\left(A_{12}A_{24} + A_{12}A_{14} + A_{14}A_{42}\right) \nonumber \\
&\quad + \frac{\mu_n}{3} A_{12}A_{14}A_{24}\left(A_{12}A_{23} + A_{12}A_{13} + A_{13}A_{32}\right) \nonumber \\
&\quad + \frac{\mu_n^2}{9}\left(A_{12}A_{23} + A_{12}A_{13} + A_{13}A_{32}\right)\left(A_{12}A_{24} + A_{12}A_{14} + A_{14}A_{42}\right). \tag{46}
\end{align}
Note that $0 \le A_{ij} \le 1$. By Proposition 2.2, we have
\[
E[A_{12}A_{13}A_{23}A_{14}A_{24}] \le E[A_{12}A_{13}A_{14}] = E[(E[A_{12} \mid X_1])^3] = O(r_n^3). \tag{47}
\]
According to Proposition 2.2, for large $n$, one has $\mu_n \le 1$. Following a similar argument to that in (47), it is easy to show that the expectations of the remaining terms in (46) are of order $O(r_n^3)$. In view of (45), we get
\[
E\left[h_2^2(X_1, X_2)\right] = E[h(X_1, X_2, X_3)\, h(X_1, X_2, X_4)] = O(r_n^3). \tag{48}
\]
Combining (48) and (44) yields (41) of Lemma 3.4.

We will use Lemma 3.2 to derive the asymptotic distribution of $\sum_{i \ne j} h_2(X_i, X_j)$. Suppose $n r_n^5 = o(1)$. First, we verify condition (36). According to (48) and Lemma 3.3, $v_n^2 = \Theta(n^2 r_n^3 + n^3 r_n^8)$.
By Proposition 2.2 and the definition of $h_2(X_1, X_2)$, we have
\begin{align}
|h_2(X_1, X_2)| &= |E[h(X_1, X_2, X_3) \mid X_1, X_2]| \nonumber \\
&\le E[A_{12}A_{23}A_{13} \mid X_1, X_2] + \frac{\mu_n}{3}\left(E[A_{12}A_{23} \mid X_1, X_2] + E[A_{12}A_{13} \mid X_1, X_2] + E[A_{23}A_{13} \mid X_1, X_2]\right) \nonumber \\
&\le E[A_{13} \mid X_1, X_2] + \frac{\mu_n}{3}\left(E[A_{23} \mid X_1, X_2] + E[A_{13} \mid X_1, X_2] + E[A_{13} \mid X_1, X_2]\right) = O(r_n), \tag{49}
\end{align}
where the $O(r_n)$ term does not depend on $X_1, X_2$. Then
\[
\frac{\sup_{x,y} |h_2(x, y)|}{v_n} = O\left(\frac{r_n}{\sqrt{n^2 r_n^3 + n^3 r_n^8}}\right) = O\left(\frac{\sqrt{r_n}}{n r_n \sqrt{1 + n r_n^5}}\right) = o(1).
\]
Then condition (36) holds. Next, we verify condition (37). Note that
\[
|h_2(X_1, X_2)| \le E[A_{23}A_{13} \mid X_1, X_2] + \frac{\mu_n}{3}\left(E[A_{12}A_{23} \mid X_1, X_2] + E[A_{12}A_{13} \mid X_1, X_2] + E[A_{23}A_{13} \mid X_1, X_2]\right).
\]
Then, by (49) and Proposition 2.2, we have
\begin{align*}
E\left[|h_2(X_1, X_2)| \,\middle|\, X_1\right] &\le E\left[E[A_{23}A_{13} \mid X_1, X_2] \,\middle|\, X_1\right] + \frac{\mu_n}{3}\Big(E\left[E[A_{12}A_{23} \mid X_1, X_2] \,\middle|\, X_1\right] \\
&\quad + E\left[E[A_{12}A_{13} \mid X_1, X_2] \,\middle|\, X_1\right] + E\left[E[A_{23}A_{13} \mid X_1, X_2] \,\middle|\, X_1\right]\Big) \\
&= E[A_{23}A_{13} \mid X_1] + \frac{\mu_n}{3}\left(E[A_{12}A_{23} \mid X_1] + E[A_{12}A_{13} \mid X_1] + E[A_{23}A_{13} \mid X_1]\right) = O(r_n^2),
\end{align*}
where the $O(r_n^2)$ term does not depend on $X_1$. It then follows that
\[
\frac{n \sup_x E[|h_2(x, X_1)|]}{v_n} = O\left(\frac{n r_n^2}{\sqrt{n^2 r_n^3 + n^3 r_n^8}}\right) = O\left(\frac{\sqrt{r_n}}{\sqrt{1 + n r_n^5}}\right) = o(1).
\]
Then condition (37) holds, and (42) follows from Lemma 3.2. When $f(x)$ is the uniform density, it is easy to verify that $h_1(X_1) = 0$. In this case, $v_n^2 = \frac{n^2}{2} E[h_2^2(X_1, X_2)] = \Theta(n^2 r_n^3)$, and the proof for the case $n r_n^5 = o(1)$ still works. Then the proof of Lemma 3.4 is complete.

Lemma 3.5. Under the assumption of Proposition 2.2, we have
\[
\sigma_{3n}^2 = E[h^2(X_1, X_2, X_3)] = \frac{3 r_n^2}{8} E[f^2(X_1)] + O(r_n^4). \tag{50}
\]
If $n r_n = o(1)$ and $n^3 r_n^2 = \omega(1)$, then
\[
\frac{\sum_{i \ne j \ne k} h(X_i, X_j, X_k)}{n \sqrt{6n}\, \sigma_{3n}} \Rightarrow N(0, 1), \tag{51}
\]
where $\sigma_{3n}^2 = E[h^2(X_1, X_2, X_3)]$.

Proof of Lemma 3.5: First, we prove (50). By the definition of $h(X_1, X_2, X_3)$ in (33), we have
\[
h^2(X_1, X_2, X_3) = \Delta_{123} - 2 \mu_n \Delta_{123} + \frac{\mu_n^2}{9}\left(P_{123} + P_{213} + P_{231}\right) + \frac{2 \mu_n^2}{3} \Delta_{123}.
\]
In view of Proposition 2.2, taking the expectation of both sides of the previous equation yields
\begin{align*}
E[h^2(X_1, X_2, X_3)] &= \left(1 - 2 \mu_n + \frac{2 \mu_n^2}{3}\right) E[\Delta_{123}] + \frac{\mu_n^2}{3} E[P_{123}] \\
&= \left(1 - \frac{3}{2} + \frac{3}{8}\right) 3 r_n^2 E[f^2(X_1)] + \frac{3}{4} r_n^2 E[f^2(X_1)] + O(r_n^4) = \frac{3 r_n^2}{8} E[f^2(X_1)] + O(r_n^4).
\end{align*}
Then (50) holds.

We will use the method of moments to prove (51). Specifically, we show that the moments of $\frac{\sum_{i \ne j \ne k} h(X_i, X_j, X_k)}{n \sqrt{6n}\, \sigma_{3n}}$ converge to the moments of the standard normal distribution. For convenience, let $J_s = (i_s, j_s, k_s)$, where $i_s, j_s, k_s \in \{1, 2, \ldots, n\}$ are pairwise distinct, and denote $h_{J_s} = h(X_{i_s}, X_{j_s}, X_{k_s})$. Let $m$ be a positive integer. First, we show that the even-order moments converge to those of the standard normal distribution. The $2m$-th moment can be expressed as
\[
E\left[\left(\frac{\sum_{i \ne j \ne k} h(X_i, X_j, X_k)}{n \sqrt{6n}\, \sigma_{3n}}\right)^{2m}\right] = \frac{\sum_{J_1, J_2, \ldots, J_{2m}} E[h_{J_1} h_{J_2} \cdots h_{J_{2m}}]}{(6 n^3 \sigma_{3n}^2)^m}. \tag{52}
\]
Given $s \in \{1, 2, \ldots, 2m\}$, if $J_s \cap J_l = \emptyset$ for all $l \in \{1, 2, \ldots, 2m\} \setminus \{s\}$, then $h_{J_s}$ is independent of $h_{J_l}$ for all $l \in \{1, 2, \ldots, 2m\} \setminus \{s\}$. Recall that $E[h_{J_s}] = 0$. In this case,
\[
E[h_{J_1} h_{J_2} \cdots h_{J_{2m}}] = E[h_{J_s}]\, E\left[\prod_{l \in \{1, 2, \ldots, 2m\} \setminus \{s\}} h_{J_l}\right] = 0.
\]
Therefore, we may assume $J_s \cap J_{l_0} \ne \emptyset$ for some $l_0 \in \{1, 2, \ldots, 2m\} \setminus \{s\}$.
That is, the expectation in (52) is non-zero only if every $J_s$ has a non-empty intersection with some $J_t$ ($s \ne t$). Then the collection $\{J_1, J_2, \ldots, J_{2m}\}$ can be partitioned into $t$ ($1 \le t \le m$) disjoint components, where each component consists of triples that share at least one index with at least one other triple in the same component. For each $l \in \{1, 2, \ldots, t\}$, let $C_l = \{J_{s_{l1}}, J_{s_{l2}}, \ldots, J_{s_{l m_l}}\}$ be the $l$-th connected component, where $s_{lq} \in \{1, 2, \ldots, 2m\}$, $1 \le q \le m_l$ and $2 \le m_l \le 2m$.

Suppose $t = m$. In this case, each connected component $C_l$ ($1 \le l \le t$) contains exactly two identical triples. That is, the collection $\{J_1, J_2, \ldots, J_{2m}\}$ is partitioned into $m$ disjoint pairs $\{J_s, J_t\}$ such that $J_s = J_t$. There are $(2m-1)!!$ such partitions, and the expectation in (52) is identical for each such partition. Without loss of generality, let $J_t = J_{m+t}$ for $1 \le t \le m$ and $J_{t_1} \cap J_{t_2} = \emptyset$ for distinct $t_1, t_2 \in \{1, 2, \ldots, m\}$. There are 6 possible ways for the set of indices $J_t = \{i_t, j_t, k_t\}$ to equal the set of indices $J_{m+t} = \{i_{m+t}, j_{m+t}, k_{m+t}\}$. Moreover, $h_{J_1}, h_{J_2}, \ldots, h_{J_m}$ are independent. Recall that $\sigma_{3n}^2 = E[h_{J_1}^2]$. Then
\begin{align}
\frac{\sum_{J_1, J_2, \ldots, J_{2m}} E[h_{J_1} h_{J_2} \cdots h_{J_{2m}}]}{(6 n^3 \sigma_{3n}^2)^m}
&= \frac{(2m-1)!!\, 6^m \sum_{J_1, J_2, \ldots, J_m} E[h_{J_1}^2 h_{J_2}^2 \cdots h_{J_m}^2]}{(6 n^3 \sigma_{3n}^2)^m} \nonumber \\
&= \frac{(2m-1)!!\, 6^m \prod_{t=1}^{3m} (n - t + 1)}{(6 n^3 \sigma_{3n}^2)^m}\, E[h_{J_1}^2]\, E[h_{J_2}^2] \cdots E[h_{J_m}^2] \nonumber \\
&= (2m-1)!!\, \frac{n^{3m} + O(n^{3m-1})}{n^{3m}} = (2m-1)!! + o(1). \tag{53}
\end{align}
Suppose $1 \le t \le m - 1$. Since the connected components $C_l$ ($1 \le l \le t$) are disjoint, the products $\prod_{J_s \in C_1} h_{J_s}, \ldots, \prod_{J_s \in C_t} h_{J_s}$ are independent. Then
\[
E[h_{J_1} h_{J_2} \cdots h_{J_{2m}}] = E\left[\prod_{J_s \in C_1} h_{J_s}\right] \cdots E\left[\prod_{J_s \in C_t} h_{J_s}\right].
\]
By the definition of $\mu_n$, we have $\frac{\mu_n}{3} \le 1$ for all $n$. Then the absolute value of each product term in the preceding equation is bounded by
\[
\left| E\left[\prod_{J_s \in C_l} h_{J_s}\right] \right| \le E\left[\prod_{J_s \in C_l} |h_{J_s}|\right] \le E\left[\prod_{J_s \in C_l} \left(\Delta_{i_s j_s k_s} + P_{i_s j_s k_s} + P_{i_s k_s j_s} + P_{j_s i_s k_s}\right)\right].
\]
The product $\prod_{J_s \in C_l} (\Delta_{i_s j_s k_s} + P_{i_s j_s k_s} + P_{i_s k_s j_s} + P_{j_s i_s k_s})$ expands into a sum of $4^{m_l}$ terms. Each such term is a product of the form $H_1 H_2 \cdots H_{m_l}$, where $H_s \in \{\Delta_{i_s j_s k_s}, P_{i_s j_s k_s}, P_{j_s i_s k_s}, P_{i_s k_s j_s}\}$ for $1 \le s \le m_l$. Recall that the triples in $C_l$ are connected; that is, any two triples in $C_l$ share at least one common vertex. Suppose $C_l$ has $v_l$ distinct vertices. There are at most $n^{v_l}$ choices of such vertices. When $H_1 H_2 \cdots H_{m_l} = 1$, the union $\bigcup_{s=1}^{m_l} H_s$ forms a connected graph, denoted by $G_l$. Then $G_l$ has $v_l$ vertices. There exists a spanning tree $T_l$ of $G_l$ with exactly $v_l - 1$ edges; let $\mathcal{E}(T_l)$ denote its edge set. Since $0 \le A_{ij} \le 1$, we have
\[
E[H_1 H_2 \cdots H_{m_l}] \le E\left[\prod_{e \in \mathcal{E}(T_l)} A_e\right] = O(r_n^{v_l - 1}),
\]
where the last equality is obtained by repeatedly applying (1) of Proposition 2.2 $v_l - 1$ times. Since $m$ is a fixed positive integer, $4^{m_l}$ and $t$ are fixed constants. Then
\[
\left| E\left[\prod_{J_s \in C_1} h_{J_s}\right] \cdots E\left[\prod_{J_s \in C_t} h_{J_s}\right] \right| = O(r_n^{v_1 + v_2 + \cdots + v_t - t}).
\]
Therefore, we have
\[
\frac{\sum_{C_1, \ldots, C_t} E\left[\prod_{J_s \in C_1} h_{J_s}\right] \cdots E\left[\prod_{J_s \in C_t} h_{J_s}\right]}{(6 n^3 \sigma_{3n}^2)^m} = O\left(\frac{n^{v_1 + v_2 + \cdots + v_t}\, r_n^{v_1 + v_2 + \cdots + v_t - t}}{(6 n^3 \sigma_{3n}^2)^m}\right). \tag{54}
\]
Since the triples in $C_l$ are connected and the three indices within each triple are distinct, it follows that $3 \le v_l \le 3 m_l - 1$. Then $3t \le v_1 + \cdots + v_t \le 3(m_1 + \cdots + m_t) - t$. Since $m_1 + m_2 + \cdots + m_t = 2m$, we have $3t \le v_1 + \cdots + v_t \le 6m - t$. Let $w = v_1 + \cdots + v_t$.
Given that $n r_n = o(1)$, we have
\[
\frac{n^w r_n^{w - t}}{n^{w+1} r_n^{w+1-t}} = \frac{1}{n r_n} = \omega(1),
\]
which implies $n^{w+1} r_n^{w+1-t} = o(n^w r_n^{w-t})$. Recalling from (50) that $\sigma_{3n}^2 = \Theta(r_n^2)$, and noting that $1 \le t \le m - 1$ with $n^3 r_n^2 = \omega(1)$ by assumption, we obtain
\[
\frac{n^{v_1 + v_2 + \cdots + v_t}\, r_n^{v_1 + v_2 + \cdots + v_t - t}}{(n^3 \sigma_{3n}^2)^m} = O\left(\frac{n^{3t} r_n^{3t - t}}{n^{3m} r_n^{2m}}\right) = O\left(\frac{(n^3 r_n^2)^t}{(n^3 r_n^2)^m}\right) = O\left(\frac{1}{n^3 r_n^2}\right) = o(1). \tag{55}
\]
Combining (52)-(55) yields
\[
E\left[\left(\frac{\sum_{i \ne j \ne k} h(X_i, X_j, X_k)}{n \sqrt{6n}\, \sigma_{3n}}\right)^{2m}\right] = (2m-1)!! + o(1). \tag{56}
\]
Hence the $2m$-th moment of the normalized sum converges to the $2m$-th moment of the standard normal distribution.

Next, we show that the odd-order moments converge to zero. The $(2m+1)$-th moment can be expressed as
\[
E\left[\left(\frac{\sum_{i \ne j \ne k} h(X_i, X_j, X_k)}{n \sqrt{6n}\, \sigma_{3n}}\right)^{2m+1}\right] = \frac{\sum_{J_1, J_2, \ldots, J_{2m+1}} E[h_{J_1} h_{J_2} \cdots h_{J_{2m+1}}]}{n \sqrt{6n}\, \sigma_{3n}\, (n^3 \sigma_{3n}^2)^m}.
\]
Let $C_l$ ($1 \le l \le t$) be defined as in the even-order case; here $1 \le t \le m$. By a similar argument as in (54) and (55), we have
\[
E\left[\left(\frac{\sum_{i \ne j \ne k} h(X_i, X_j, X_k)}{n \sqrt{6n}\, \sigma_{3n}}\right)^{2m+1}\right] = O\left(\frac{(n^3 r_n^2)^t}{n \sqrt{n}\, r_n\, (n^3 r_n^2)^m}\right) = O\left(\frac{1}{\sqrt{n^3 r_n^2}}\right) = o(1).
\]
Then the $(2m+1)$-th moments converge to zero. In view of (56), it follows that the moments of the normalized sum converge to those of the standard normal distribution. Consequently, we conclude that (51) holds, completing the proof.

3.3. Proof of Theorem 2.3

Recall that $\Delta_{123} = A_{12}A_{23}A_{31}$, $P_{123} = A_{12}A_{23}$, and $\mu_n = \frac{E[\Delta_{123}]}{E[P_{123}]}$.
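The algebraic identity behind the decomposition used in this proof, namely that $(\sum_{i \ne j \ne k} A_{ij}A_{jk})(C_n - \mu_n)$ equals the $U$-statistic $\sum_{i \ne j \ne k} h(X_i, X_j, X_k)$, holds exactly for every realization and for any constant in place of $\mu_n$. A brute-force sketch over ordered triples (the helper names, and the reference constant `mu`, are ours):

```python
import random

def identity_gap(n, r, mu, seed=2):
    """Gap between (#2-paths) * (C_n - mu) and the U-statistic U_n.

    Builds a 1-D random geometric graph on the circle; the gap is zero
    (up to floating-point rounding) for any constant mu.
    """
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]

    def a(i, j):
        d = abs(xs[i] - xs[j])
        return 1 if min(d, 1.0 - d) <= r else 0

    tri = paths = 0
    un = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                if i == j or j == k or i == k:
                    continue
                delta = a(i, j) * a(j, k) * a(k, i)
                p123 = a(i, j) * a(j, k)   # 2-path centered at j
                p213 = a(j, i) * a(i, k)   # 2-path centered at i
                p231 = a(j, k) * a(k, i)   # 2-path centered at k
                tri += delta
                paths += p123
                # h = Delta - (mu/3) * (sum of the three 2-paths)
                un += delta - (mu / 3.0) * (p123 + p213 + p231)
    cn = tri / paths  # global clustering coefficient
    return paths * (cn - mu) - un
```

Because the identity is purely algebraic, the gap vanishes regardless of whether `mu` equals the true ratio $E[\Delta_{123}]/E[P_{123}]$.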
Define a $U$-statistic $U_n$ as follows:
\[
U_n = \sum_{i \ne j \ne k} h(X_i, X_j, X_k),
\]
where the function $h(X_1, X_2, X_3)$ is given in (33). Then the global clustering coefficient $C_n$ can be expressed in terms of $U_n$ as follows:
\[
\left(\sum_{i \ne j \ne k} A_{ij}A_{jk}\right)(C_n - \mu_n) = U_n. \tag{57}
\]
We will determine the leading-order term of $U_n$ and derive its asymptotic distribution. To this end, we evaluate its variance. It is straightforward to verify that $E[h(X_1, X_2, X_3)] = 0$. Moreover, if the sets of indices $\{i, j, k\}$ and $\{l, s, t\}$ are disjoint, then $h(X_i, X_j, X_k)$ and $h(X_l, X_s, X_t)$ are independent. In this case, we have
\[
E[h(X_i, X_j, X_k)\, h(X_l, X_s, X_t)] = E[h(X_i, X_j, X_k)]\, E[h(X_l, X_s, X_t)] = 0.
\]
Then the variance of $U_n$ can be written as
\begin{align*}
\mathrm{Var}(U_n) &= \sum_{\substack{i \ne j \ne k \\ l \ne s \ne t}} E[h(X_i, X_j, X_k)\, h(X_l, X_s, X_t)] \\
&= 9 \sum_{i \ne j \ne k \ne s \ne l} E[h(X_i, X_j, X_k)\, h(X_i, X_l, X_s)] + 18 \sum_{l \ne i \ne j \ne k} E[h(X_i, X_j, X_k)\, h(X_i, X_j, X_l)] + 6 \sum_{i \ne j \ne k} E[h^2(X_i, X_j, X_k)].
\end{align*}
Note that
\begin{align*}
E[h(X_i, X_j, X_k)\, h(X_i, X_l, X_s)] &= E\left[E[h(X_i, X_j, X_k) \mid X_i]\, E[h(X_i, X_l, X_s) \mid X_i]\right] = E[h_1^2(X_i)], \\
E[h(X_i, X_j, X_k)\, h(X_i, X_j, X_l)] &= E\left[E[h(X_i, X_j, X_k) \mid X_i, X_j]\, E[h(X_i, X_j, X_l) \mid X_i, X_j]\right] = E[h_2^2(X_i, X_j)].
\end{align*}
Then the variance of $U_n$ takes the form
\begin{align}
\mathrm{Var}(U_n) &= 9 n^5 E[h_1^2(X_1)] + 18 n^4 E[h_2^2(X_1, X_2)] + 6 n^3 E[h^2(X_1, X_2, X_3)] \nonumber \\
&\quad + O(n^4)\, E[h_1^2(X_1)] + O(n^3)\, E[h_2^2(X_1, X_2)] + O(n^2)\, E[h^2(X_1, X_2, X_3)]. \tag{58}
\end{align}

Proof of Case (I). Suppose $n r_n^5 \to \infty$. Then $n^4 r_n^3 = o(n^5 r_n^8)$ and $n^3 r_n^2 = o(n^5 r_n^8)$. By Lemma 3.3, Lemma 3.4 and Lemma 3.5, the first term of (58) is the leading term.
We show that $U_n$ is asymptotically equal to $3 n^2 \sum_i h_1(X_i)$. Note that
\[
E\left[\left(U_n - 3 n^2 \sum_i h_1(X_i)\right)^2\right] = E[U_n^2] - 6 n^2 E\left[U_n \sum_i h_1(X_i)\right] + 9 n^4 E\left[\left(\sum_i h_1(X_i)\right)^2\right]. \tag{59}
\]
Next, we simplify the expectations in (59) to show that their sum is of order $O(n^4 r_n^3)$. By (58), one has
\[
E[U_n^2] = 9 n^5 E[h_1^2(X_1)] + O(n^4 r_n^3 + n^3 r_n^2 + n^2 r_n^2). \tag{60}
\]
The second expectation of (59) is equal to
\[
E\left[U_n \sum_i h_1(X_i)\right] = \sum_{j \ne k \ne l,\, i} E[h(X_j, X_k, X_l)\, h_1(X_i)]. \tag{61}
\]
If $i \notin \{j, k, l\}$, then $X_i$ is independent of $X_j, X_k, X_l$. In this case,
\[
E[h(X_j, X_k, X_l)\, h_1(X_i)] = E[h(X_j, X_k, X_l)]\, E[h_1(X_i)] = 0.
\]
Therefore, only the indices with $i \in \{j, k, l\}$ contribute. The expectations in (61) are identical for all three cases $i = j$, $i = k$, and $i = l$. Without loss of generality, let $i = j$. Then
\[
E[h(X_j, X_k, X_l)\, h_1(X_j)] = E\left[E[h(X_j, X_k, X_l)\, h_1(X_j) \mid X_j]\right] = E\left[h_1(X_j)\, E[h(X_j, X_k, X_l) \mid X_j]\right] = E\left[h_1^2(X_j)\right].
\]
It then follows from (61) that
\[
E\left[U_n \sum_i h_1(X_i)\right] = 3 n^3 E\left[h_1^2(X_1)\right] + O(n^2)\, E\left[h_1^2(X_1)\right]. \tag{62}
\]
Note that $E[h_1(X_i) h_1(X_j)] = E[h_1(X_i)]\, E[h_1(X_j)] = 0$ if $i \ne j$. Then the last expectation of (59) is equal to
\[
E\left[\left(\sum_i h_1(X_i)\right)^2\right] = \sum_{i,j} E[h_1(X_i) h_1(X_j)] = \sum_i E\left[h_1^2(X_i)\right] = n E\left[h_1^2(X_1)\right]. \tag{63}
\]
In view of (59), (60), (62) and (63), we have
\[
E\left[\left(U_n - 3 n^2 \sum_i h_1(X_i)\right)^2\right] = O(n^4 r_n^3).
\]
Then we conclude that $U_n = 3 n^2 \sum_i h_1(X_i) + O_P(n^2 r_n \sqrt{r_n})$. By Lemma 3.3, one has
\[
\frac{4 U_n}{3 n^2 \sqrt{n}\, r_n^4 \sigma_1} = \frac{4 \sum_i h_1(X_i)}{\sqrt{n}\, r_n^4 \sigma_1} + O_P\left(\frac{1}{\sqrt{n r_n^5}}\right) \Rightarrow N(0, 1).
\]
In view of (57) and Lemma 3.1, the result in (I) follows.

Proof of Case (II). Suppose $n r_n^5 = o(1)$, $n r_n = \omega(1)$ and $f(x)$ is not the uniform density.
Then $n^3 r_n^2 = o(n^4 r_n^3)$ and $n^5 r_n^8 = o(n^4 r_n^3)$. By Lemma 3.3, Lemma 3.4 and Lemma 3.5, the second term of (58) is the leading term. We will show that $U_n$ is asymptotically equal to $3 n \sum_{i \ne j} h_2(X_i, X_j)$. It is easy to verify that
\[
E\left[\left(U_n - 3 n \sum_{i \ne j} h_2(X_i, X_j)\right)^2\right] = E[U_n^2] - 6 n E\left[U_n \sum_{i \ne j} h_2(X_i, X_j)\right] + 9 n^2 E\left[\left(\sum_{i \ne j} h_2(X_i, X_j)\right)^2\right]. \tag{64}
\]
By (58), the first expectation of (64) is equal to
\[
E[U_n^2] = 18 n^4 E[h_2^2(X_1, X_2)] + O(n^5 r_n^8 + n^3 r_n^2). \tag{65}
\]
The second expectation of (64) is expressed as
\[
E\left[U_n \sum_{i \ne j} h_2(X_i, X_j)\right] = \sum_{i_1 \ne j_1 \ne k_1,\, i \ne j} E[h(X_{i_1}, X_{j_1}, X_{k_1})\, h_2(X_i, X_j)].
\]
If $\{i_1, j_1, k_1\} \cap \{i, j\} = \emptyset$, then
\[
E[h(X_{i_1}, X_{j_1}, X_{k_1})\, h_2(X_i, X_j)] = E[h(X_{i_1}, X_{j_1}, X_{k_1})]\, E[h_2(X_i, X_j)] = 0.
\]
Suppose $|\{i, j\} \cap \{i_1, j_1, k_1\}| = 1$. There are at most $n^4$ such indices. Without loss of generality, let $i = i_1$. Then
\[
E[h(X_i, X_{j_1}, X_{k_1})\, h_2(X_i, X_j)] = E\left[E[h(X_i, X_{j_1}, X_{k_1})\, h_2(X_i, X_j) \mid X_i]\right] = E\left[h_1^2(X_i)\right] = O(r_n^8).
\]
Suppose $\{i, j\} \subset \{i_1, j_1, k_1\}$. There are at most $n^3$ such indices. Without loss of generality, let $\{i, j\} = \{i_1, j_1\}$. Then
\[
E[h(X_{i_1}, X_{j_1}, X_{k_1})\, h_2(X_i, X_j)] = E\left[E[h(X_i, X_j, X_{k_1})\, h_2(X_i, X_j) \mid X_i, X_j]\right] = E\left[h_2^2(X_i, X_j)\right].
\]
This result holds for the other cases such as $\{i, j\} = \{i_1, k_1\}$. Therefore, we have
\[
E\left[U_n \sum_{i \ne j} h_2(X_i, X_j)\right] = 6 n^3 E\left[h_2^2(X_i, X_j)\right] + O(n^4 r_n^8 + n^2 r_n^3). \tag{66}
\]
The third expectation of (64) is expressed as
\[
E\left[\left(\sum_{i \ne j} h_2(X_i, X_j)\right)^2\right] = \sum_{i_1 \ne j_1,\, i \ne j} E[h_2(X_i, X_j)\, h_2(X_{i_1}, X_{j_1})].
\]
If $\{i, j\} \cap \{i_1, j_1\} = \emptyset$, then $h_2(X_i, X_j)$ and $h_2(X_{i_1}, X_{j_1})$ are independent.
In this case,
\[
E[h_2(X_i, X_j)\, h_2(X_{i_1}, X_{j_1})] = E[h_2(X_{i_1}, X_{j_1})]\, E[h_2(X_i, X_j)] = 0.
\]
Suppose $\{i, j\} = \{i_1, j_1\}$. There are at most $n^2$ such indices, and there are two ways for the set of indices $\{i, j\}$ to equal $\{i_1, j_1\}$. Then
\[
E[h_2(X_i, X_j)\, h_2(X_{i_1}, X_{j_1})] = E\left[h_2^2(X_i, X_j)\right] = \Theta(r_n^3).
\]
Suppose $|\{i, j\} \cap \{i_1, j_1\}| = 1$. There are at most $n^3$ such indices. Without loss of generality, let $i = i_1$. In this case, we have
\[
E[h_2(X_i, X_j)\, h_2(X_{i_1}, X_{j_1})] = E\left[E[h_2(X_i, X_j)\, h_2(X_i, X_{j_1}) \mid X_i]\right] = E\left[h_1^2(X_1)\right] = O(r_n^8).
\]
Then
\[
E\left[\left(\sum_{i \ne j} h_2(X_i, X_j)\right)^2\right] = 2 n^2 E\left[h_2^2(X_1, X_2)\right] + O(n^3 r_n^8 + n^2 r_n^3). \tag{67}
\]
Combining (64)-(67) yields
\[
E\left[\left(U_n - 3 n \sum_{i \ne j} h_2(X_i, X_j)\right)^2\right] = O(n^5 r_n^8 + n^3 r_n^2),
\]
from which it follows that
\[
U_n = 3 n \sum_{i \ne j} h_2(X_i, X_j) + O_P\left(\sqrt{n^5 r_n^8 + n^3 r_n^2}\right).
\]
By Lemma 3.4, we have
\[
\frac{\sqrt{2}\, U_n}{6 n^2 \sigma_{2n}} = \frac{\sqrt{2} \sum_{i \ne j} h_2(X_i, X_j)}{2 n \sigma_{2n}} + O_P\left(\sqrt{n r_n^5 + \frac{1}{n r_n}}\right) \Rightarrow N(0, 1).
\]
In view of Lemma 3.1 and (57), the proof is complete for the case where $n r_n^5 = o(1)$, $n r_n = \omega(1)$ and $f(x)$ is not the uniform density. When $f(x)$ is the uniform density, $h_1(x) = 0$, and the previous proof still works. Then the proof of Case (II) is complete.

Proof of Case (III). The result of Case (III) follows from Lemma 3.1, equation (57) and Lemma 3.5. Then the proof of Theorem 2.3 is complete.

Acknowledgement

Mingao Yuan thanks The University of Texas at El Paso for providing generous startup funds.

References

[1] Abbe, E. (2017). Community detection and stochastic block models: recent developments. Journal of Machine Learning Research, 18, 1-86.
[2] Bangachev, K. and Bresler, G. (2024).
Detection of $L_\infty$ geometry in random geometric graphs: suboptimality of triangles and cluster expansion. Proceedings of Machine Learning Research, 247, 1-71.
[3] Badiu, M.-A. and Coon, J. P. (2023). Structural complexity of one-dimensional random geometric graphs. IEEE Transactions on Information Theory, 69(2), 794-812.
[4] Barthélemy, M. (2011). Spatial networks. Physics Reports, 499(1-3), 1-101.
[5] Dall, J. and Christensen, M. (2002). Random geometric graphs. Physical Review E, 66(1), 016121.
[6] Duchemin, Q. and De Castro, Y. (2023). Random geometric graph: some recent developments and perspectives. High Dimensional Probability IX. Progress in Probability, Birkhäuser, Cham.
[7] Higham, D. J., Rašajski, M. and Pržulj, N. (2008). Fitting a geometric graph to a protein-protein interaction network. Bioinformatics, 24(8), 1093-1099.
[8] Erdős, P. and Rényi, A. (1959). On random graphs I. Publicationes Mathematicae, 6, 290-297.
[9] Goel, A., Rai, S. and Krishnamachari, B. (2005). Monotone properties of random geometric graphs have sharp thresholds. Annals of Applied Probability, 15(4), 2535-2552.
[10] Ganesan, G. (2021). Robust paths in random geometric graphs with applications to mobile networks. 2021 International Conference on COMmunication Systems & NETworkS (COMSNETS), Bangalore, India, pp. 119-123.
[11] Gilbert, E. N. (1961). Random plane networks. Journal of the Society for Industrial and Applied Mathematics, 9(4), 533-543.
[12] Galhotra, S., Mazumdar, A., Pal, S. and Saha, B. (2023). Community recovery in the geometric block model. Journal of Machine Learning Research, 24, 1-53.
[13] Gupta, P. and Kumar, P. R. (2000). The capacity of wireless networks. IEEE Transactions on Information Theory, 46(2), 388-404.
[14] Higham, D. J., Rašajski, M. and Pržulj, N. (2008).
Fitting a geometric graph to a protein-protein interaction network. Bioinformatics, 24(8), 1093-1099.
[15] Han, G. and Makowski, A. (2009). One-dimensional geometric random graphs with nonvanishing densities - Part I: A strong zero-one law for connectivity. IEEE Transactions on Information Theory, 55(12), 5832-5839.
[16] Han, G. and Makowski, A. (2012). One-dimensional geometric random graphs with nonvanishing densities - Part II: a very strong zero-one law for connectivity. Queueing Systems, 72, 103-138.
[17] Jammalamadaka, S. R. and Janson, S. (1986). Limit theorems for a triangular scheme of U-statistics with applications to inter-point distances. The Annals of Probability, 14(4), 1347-1358.
[18] Lee, D., et al. (2014). Analysis of clustering coefficients of online social networks by duplication models. 2014 IEEE International Conference on Communications (ICC), 4095-4100.
[19] Newman, M. E. J. (2009). Random graphs with clustering. Physical Review Letters, 103, 058701.
[20] Newman, M. E. J. (2003). The structure and function of complex networks. SIAM Review, 45, 167-256.
[21] O'Malley, A. J. and Marsden, P. V. (2008). The analysis of social networks. Health Services and Outcomes Research Methodology, 8, 222-269.
[22] Paolino, R., Bojchevski, A., Günnemann, S., Kutyniok, G. and Levie, R. (2023). Unveiling the sampling density in non-uniform geometric graphs. ICLR 2023.
[23] Penrose, M. (2003). Random Geometric Graphs. Oxford University Press, Oxford.
[24] Robins, G., Pattison, P. and Woolcock, J. (2005). Small and other worlds: Global network structures from local processes. American Journal of Sociology, 110(4), 894-936.
[25] Simpson, S., Bowman, F. and Laurienti, P. (2013). Analyzing complex functional brain networks: Fusing statistics and network science to understand the brain. Statistics Surveys, 7, 1-36.
[26] Watts, D. and Strogatz, S. (1998).
Collective dynamics of 'small-world' networks. Nature, 393, 440-442.
[27] Yuan, M. (2024). Asymptotic distribution of the friendship paradox of a random geometric graph. Brazilian Journal of Probability and Statistics, 38, 444-462.
[28] Yuan, M. and Yu, F. (2025). Hypothesis testing for the dimension of random geometric graph. Preprint, ResearchGate, DOI: 10.13140/RG.2.2.31959.53920.
[29] Yuan, M. (2025). Asymptotic distribution of the global clustering coefficient in a random annulus graph.
[30] Yuan, M. (2025b). Hypothesis testing for the uniformity of random geometric graph. https://arxiv.org/pdf/2510.14210
[31] Yuan, M. (2025). Limiting distribution for the Randic index of a random geometric graph. MATCH Communications in Mathematical and in Computer Chemistry, 93, 767-789.
[32] Yuan, M. (2026). The weak law of large numbers for the friendship paradox index. https://arxiv.org/pdf/2602.10055
[33] Zheng, T., Zheng, X., Xue, B., Xiao, S. and Zhang, C. (2025). A network analysis of depressive symptoms and cognitive performance in older adults with multimorbidity: A nationwide population-based study. Journal of Affective Disorders, 383, 78-86.
[34] Zhao, J. (2015). The absence of isolated node in geometric random graphs. 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton), Monticello, IL, USA, pp. 881-886.
[35] Zhang, W., Lim, C., Korniss, G. et al. (2014). Opinion dynamics and influencing on random geometric graphs. Scientific Reports, 4, 5568.
