The asymptotic value of Randic index for trees

Let $\mathcal{T}_n$ denote the set of all unrooted and unlabeled trees with $n$ vertices, and $(i,j)$ a double-star. By assuming that every tree of $\mathcal{T}_n$ is equally likely, we show that the limiting distribution of the number of occurrences…

Authors: Xueliang Li, Yiyang Li

Let $\mathcal{T}_n$ denote the set of all unrooted and unlabeled trees with $n$ vertices. A pattern $M$ is a given tree. We say that $M$ occurs in a tree $T \in \mathcal{T}_n$ if $M$ is a subtree of $T$ such that every vertex of $M$, except for those of degree 1, has the same degree as the corresponding vertex of $T$. Of course, the vertices of degree 1 in $M$ may also be matched with vertices of degree 1 in $T$. Set $t_n = |\mathcal{T}_n|$. We introduce two generating functions:

$$t(x) = \sum_{n \ge 1} t_n x^n, \qquad t(x,u) = \sum_{n \ge 1,\, k \ge 0} t_{n,k}\, x^n u^k,$$

where the coefficient $t_{n,k}$ denotes the number of trees in $\mathcal{T}_n$ that have $k$ occurrences of the pattern $M$. We assume that every tree of $\mathcal{T}_n$ is equally likely. Let $X_n$ denote the number of occurrences of $M$ in a tree of $\mathcal{T}_n$. Then $X_n$ is a random variable on $\mathcal{T}_n$ with probability $\Pr[X_n = k] = t_{n,k}/t_n$.

In [7], Kok showed that for any pattern $M$ the limiting distribution of $(X_n - \mathbb{E}X_n)/\sqrt{\operatorname{Var}X_n}$ has a density of the form $(A + Bt^2)\exp(-Ct^2)$, with $\mathbb{E}(X_n) = (\mu + o(1))n$ and $\operatorname{Var}(X_n) = (\sigma + o(1))n$, where $A, B, C, \mu, \sigma$ are constants. Clearly, if $B = 0$, this is a normal distribution. It has been shown that if the pattern is a star or a path, the corresponding distribution is asymptotically normal. We refer the reader to [7,12,13] for more details.

Recall that a path is a graph consisting of a sequence of vertices with an edge between every two consecutive vertices. A star is a complete bipartite graph in which one part contains only one vertex, and we call this vertex the center of the star. A double-star is a graph formed from two stars by connecting their centers with an edge. In this paper, we show that if the pattern is a double-star, the corresponding limiting distribution is also normal, and we obtain an estimate for the number $X_n$ of occurrences of a double-star in almost all trees. Based on this result, we then obtain the asymptotic value of the Randić index for almost all trees in $\mathcal{T}_n$.
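
Under the matching rule above, an occurrence of a double-star whose centers have degrees $i$ and $j$ is determined by its central edge, so counting occurrences amounts to counting edges whose endpoints have degrees $i$ and $j$. The following Python sketch illustrates this on a small tree; the function name `count_double_star` and the example tree are illustrative assumptions, not part of the paper.

```python
from collections import Counter
from itertools import combinations_with_replacement


def count_double_star(edges, i, j):
    """Count edges of a tree whose endpoint degrees are i and j (in either order)."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    target = tuple(sorted((i, j)))
    return sum(
        1 for u, v in edges
        if tuple(sorted((degree[u], degree[v]))) == target
    )


# Hypothetical example: the path 0-1-2-3 with an extra leaf 4 attached to vertex 1.
example_edges = [(0, 1), (1, 2), (2, 3), (1, 4)]

if __name__ == "__main__":
    max_degree = 3
    for i, j in combinations_with_replacement(range(1, max_degree + 1), 2):
        print((i, j), count_double_star(example_edges, i, j))
```

Since every edge has exactly one pair of endpoint degrees, the counts over all pairs $i \le j$ add up to the number of edges, $n - 1$.
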

The Randić index was introduced by Randić [11] in 1975, and later Bollobás and Erdős [2] generalized it to the general Randić index. The definition will be given in Section 3, and for a detailed survey we refer the reader to [9]. There is a conjecture on the relation between the Randić index and the average distance of a connected graph, proposed by Fajtlowicz in [5].

Conjecture 1. Let $R(G)$ and $D(G)$ denote, respectively, the Randić index and the average distance of a graph $G$. Then, for any connected graph $G$, $R(G) \ge D(G)$.

We will show that the conjecture is true not only for almost all connected graphs but also for almost all trees. In Section 2, we explore the limiting distribution of $X_n$ corresponding to a double-star. In Section 3, we apply the results of Section 2 to the Randić index.

In this section, we concentrate on the limiting distribution of $X_n$ for a double-star. Throughout this paper, we use $(i, j)$ to denote the double-star in which one center has degree $i$ and the other has degree $j$. Evidently, the number of occurrences of $(i, j)$ in a tree is the number of edges of the tree such that one end of the edge has degree $i$ and the other has degree $j$. Without loss of generality, we always assume $i \le j$.

In what follows, we first introduce some terminology and notation that will be used in the sequel. For anything not defined here, we refer the reader to the book [6]. Analogous to trees, we have generating functions for rooted trees and planted trees. Let $\mathcal{R}_n$ be the set of all rooted trees with $n$ vertices and $r_n = |\mathcal{R}_n|$. We have

$$r(x) = \sum_{n \ge 1} r_n x^n, \qquad r(x,u) = \sum_{n \ge 1,\, k \ge 0} r_{n,k}\, x^n u^k,$$

where $r_{n,k}$ is the number of rooted trees in $\mathcal{R}_n$ that have $k$ occurrences of $(i, j)$. A planted tree is formed from a rooted tree and a new vertex by connecting the new vertex and the root of the rooted tree with a new edge. The new vertex is called the plant, and we never count it in the sequel. Let $\mathcal{P}_n$ denote the set of all planted trees with $n$ vertices and $p_n = |\mathcal{P}_n|$.
Then, we have the generating functions

$$p(x) = \sum_{n \ge 1} p_n x^n, \qquad p(x,u) = \sum_{n \ge 1,\, k \ge 0} p_{n,k}\, x^n u^k,$$

where $p_{n,k}$ denotes the number of planted trees in $\mathcal{P}_n$ that have $k$ occurrences of $(i, j)$. By the definitions of planted trees and rooted trees, it is easy to see that $r(x, 1) = r(x) = p(x, 1) = p(x)$. Furthermore, let $x_0$ be the radius of convergence of $r(x)$. Otter [10] showed that $x_0$ satisfies $r(x_0) = 1$ and that the asymptotic expansion of $r(x)$ is

$$r(x) = 1 - b\,(x_0 - x)^{1/2} + O(x_0 - x),$$

where $x_0 \approx 0.3383219$ and $b \approx 2.68112266$.

Let $y(x, u) = (y_1(x, u), \ldots, y_N(x, u))^T$ be a column vector. We suppose $G(x, y, u)$ is an analytic function with non-negative Taylor coefficients. Then $G(x, y(x,u), u)$ can be expanded as

$$G(x, y(x,u), u) = \sum_{n,k} g_{n,k}\, x^n u^k,$$

and we let $X_n$ be a random variable with

$$\Pr[X_n = k] = \frac{g_{n,k}}{g_n}, \qquad (2)$$

where $g_n = \sum_k g_{n,k}$. To show that the limiting distribution of the number of occurrences of the double-star $(i, j)$ in trees is normal, we need a useful lemma, which has been used to study the distribution of the number of occurrences of a pattern in other families of trees, such as planar trees, labelled trees and rooted trees. We refer the reader to [3] for more details.

Lemma 1. Let $F(x, y, u) = (F_1(x, y, u), \ldots, F_N(x, y, u))^T$ be functions analytic around $x = 0$, $y = (y_1, \ldots, y_N)^T = 0$, $u = 0$, with Taylor coefficients that are all non-negative. Suppose $F(0, y, u) \equiv 0$, $F(x, 0, u) \not\equiv 0$, $F_x(x, y, u) \not\equiv 0$, and for some $j$, $F_{y_j y_j}(x, y, u) \not\equiv 0$. Furthermore, assume that $x = x_0$, $y = y_0$ is a non-negative solution of the system of equations

$$y = F(x, y, 1), \qquad 0 = \det\bigl(I - F_y(x, y, 1)\bigr) \qquad (4)$$

inside the region of convergence of $F$, where $F_y$ is the Jacobian of $F$ with respect to $y$ and $I$ is the unit matrix. Let $y = (y_1(x, u), \ldots, y_N(x, u))^T$ denote the analytic solution of the function system

$$y = F(x, y, u) \qquad (5)$$

with $y(0, u) = 0$. Moreover, let $G(x, y, u)$ be an analytic function with non-negative Taylor coefficients such that the point $(x_0, y(x_0, 1), 1)$ is contained in its region of convergence. Finally, let $X_n$ be the random variable defined in (2).
If the dependency graph $G_F$ of the function system (5) is strongly connected, then the random variable $X_n$ is asymptotically normal with mean $\mathbb{E}X_n = \mu n + O(1)$ and variance $\operatorname{Var}X_n = \sigma n + O(1)$. Moreover, if $v^T$ is a vector satisfying $v^T (I - F_y(x_0, y_0, 1)) = 0$, then

$$\mu = \frac{1}{x_0} \cdot \frac{v^T F_u(x_0, y_0, 1)}{v^T F_x(x_0, y_0, 1)},$$

where $F_u$ and $F_x$ are the partial derivatives of $F(x, y, u)$.

Remark 1. The dependency graph $G_F$ of $y = F(x, y, u)$ is strongly connected if there is no subsystem of equations that can be solved independently of the others. If $G_F$ is strongly connected, then $I - F_y(x_0, y_0, 1)$ has rank $N - 1$, i.e., $v$ is unique up to a nonzero factor. We refer the reader to [3,4] for more details.

Now, we consider the asymptotic distribution of $X_n$ corresponding to the pattern $(i, j)$. Our main contribution is to establish functional equations and apply Lemma 1 to obtain Theorem 3. For different $i, j$, we distinguish the following three cases. Since only the tree with exactly two vertices contains the pattern $(1, 1)$, we do not need to consider the case $i = j = 1$.

Case 1. $i \ne j$ and $i, j > 1$. We split $\mathcal{P}_n$ into three subsets according to the degree of the root: the root has degree $i$, degree $j$, or neither $i$ nor $j$. We let $a_i(x, u)$, $a_j(x, u)$ and $a_0(x, u)$ be the respective generating functions (or $a_i$, $a_j$, $a_0$ for short). It is easy to see that

$$a_0(x, u) + a_i(x, u) + a_j(x, u) = p(x, u).$$

In what follows, there appear expressions of the form $Z(S_n; f(x, u))$ (or $Z(S_n; f(x))$), which denote the substitution of the counting series $f(x, u)$ (or $f(x)$) into the cycle index $Z(S_n)$ of the symmetric group $S_n$. This involves replacing each variable $s_k$ in $Z(S_n)$ by $f(x^k, u^k)$ (or $f(x^k)$). We refer the reader to [6] for details. Employing the classical Pólya Enumeration Theorem, we have $Z(S_{k-1}; p(x))$ as the counting series of the planted trees whose roots have degree $k$, and the coefficient of $x^m$ in $Z(S_{k-1}; p(x))$ is the number of such planted trees of order $m + 1$ (see [6], pp. 51-54).
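
The substitution into the cycle index can be made concrete with truncated power series. The sketch below is an illustration and is not taken from the paper. It assumes the standard recurrence $m\,Z(S_m) = \sum_{k=1}^{m} s_k Z(S_{m-k})$ for the cycle index of the symmetric group and the classical recurrence for the number of rooted unlabeled trees, and it uses $p_n = r_n$ as stated earlier. All function names and the truncation order `N` are assumptions made for the example.

```python
from fractions import Fraction

N = 10  # all power series are truncated after x^N (illustrative choice)


def rooted_tree_counts(n_max):
    """r[n] = number of rooted unlabeled trees on n vertices (r[0] = 0)."""
    r = [0, 1]
    for n in range(1, n_max):
        total = 0
        for k in range(1, n + 1):
            divisor_sum = sum(d * r[d] for d in range(1, k + 1) if k % d == 0)
            total += divisor_sum * r[n - k + 1]
        r.append(total // n)
    return r


def multiply(a, b):
    """Product of two truncated power series given as coefficient lists."""
    c = [Fraction(0)] * (N + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= N:
                c[i + j] += ai * bj
    return c


def dilate(f, k):
    """Coefficients of f(x^k), truncated after x^N."""
    g = [Fraction(0)] * (N + 1)
    for n, fn in enumerate(f):
        if n * k <= N:
            g[n * k] = fn
    return g


def cycle_index_series(m_max, f):
    """Return [Z(S_0; f), ..., Z(S_m_max; f)], replacing each s_k by f(x^k)."""
    z = [[Fraction(1)] + [Fraction(0)] * N]
    for m in range(1, m_max + 1):
        acc = [Fraction(0)] * (N + 1)
        for k in range(1, m + 1):
            term = multiply(dilate(f, k), z[m - k])
            acc = [a + t for a, t in zip(acc, term)]
        z.append([a / m for a in acc])
    return z


p = [Fraction(c) for c in rooted_tree_counts(N)]  # p_n = r_n, p_0 = 0
Z = cycle_index_series(N, p)

# Summing Z(S_k; p(x)) over all root degrees and multiplying by x
# (one factor of x for the root) should reproduce p(x) up to x^N.
total = [sum(Z[k][n] for k in range(N + 1)) for n in range(N + 1)]
shifted = [Fraction(0)] + total[:N]
```

Here `Z[2]` begins $x^2 + x^3 + 3x^4$, matching the interpretation above: there are 1, 1 and 3 planted trees of orders 3, 4 and 5 whose root has degree 3. The list `shifted` agrees with `p` coefficient by coefficient.
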
Therefore, $p(x)$ satisfies

$$p(x) = x \sum_{k \ge 0} Z(S_k; p(x)).$$

In the same way, we can obtain the functional equations (9), (10) and (11) for $a_0(x, u)$, $a_i(x, u)$ and $a_j(x, u)$. For $a_0(x, u)$, the degree of the root is neither $i$ nor $j$, so only two minor modifications are needed. For $a_i(x, u)$, if there are $\ell_2$ vertices of degree $j$ adjacent to the root, we should count $\ell_2$ additional occurrences of $(i, j)$, and thus the corresponding term is of the form $Z(S_{\ell_1}; a_0(x, u) + a_i(x, u)) \cdot Z(S_{\ell_2}; a_j(x, u))\, u^{\ell_2}$. The equation for $a_j(x, u)$ follows analogously. Then, for rooted trees, we have […]

In order to get the generating function for general trees, we need the following lemma, which was used in [10] to obtain the famous equation

$$t(x) = r(x) - \frac{1}{2} r(x)^2 + \frac{1}{2} r(x^2). \qquad (12)$$

We can also obtain a similar equation for $t(x, u)$ from this lemma. Two edges in a tree are similar if one is mapped to the other by some automorphism of the tree. To join two planted trees is to connect the two roots of the trees with a new edge and remove the two plants. If the two planted trees are the same, we say that the new edge is symmetric.

Lemma 2. For any tree, the number of rooted trees corresponding to this tree minus the number of nonsimilar edges (except the symmetric edge) is equal to 1.

Note that deleting any edge from a class of similar edges in a tree yields the same pair of trees. Hence, different pairs of planted trees correspond to nonsimilar edges. We refer the reader to [10] for details. Then, analogous to (12), we have […] The last term serves to count the occurrences of $(i, j)$ created when joining two planted trees to form a tree, in which one has a root of degree $i$ and the other has a root of degree $j$.

Now, we use Lemma 1 to show that the distribution of $X_n$ converges to a normal distribution and to get the asymptotic value of $\mathbb{E}(X_n)$ corresponding to $(i, j)$. We just need to verify that the system of functions (9), (10), (11) satisfies equation (4), since the other conditions are easy to verify. We still denote the system of functions by $F$.
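
Otter's relation between the counting series of rooted and unrooted trees, $t(x) = r(x) - \frac{1}{2}\bigl(r(x)^2 - r(x^2)\bigr)$, can be checked numerically on the first few coefficients. The sketch below is only an illustration: it assumes the classical recurrence for the number of rooted unlabeled trees, and the function names are made up for this example.

```python
def rooted_tree_counts(n_max):
    """r[n] = number of rooted unlabeled trees on n vertices (r[0] = 0)."""
    r = [0, 1]
    for n in range(1, n_max):
        total = 0
        for k in range(1, n + 1):
            divisor_sum = sum(d * r[d] for d in range(1, k + 1) if k % d == 0)
            total += divisor_sum * r[n - k + 1]
        r.append(total // n)
    return r


def unrooted_tree_counts(n_max):
    """t[n] from t(x) = r(x) - (r(x)^2 - r(x^2)) / 2, compared coefficientwise."""
    r = rooted_tree_counts(n_max)
    t = [0] * (n_max + 1)
    for n in range(1, n_max + 1):
        square = sum(r[k] * r[n - k] for k in range(1, n))  # [x^n] r(x)^2
        even = r[n // 2] if n % 2 == 0 else 0               # [x^n] r(x^2)
        t[n] = r[n] - (square - even) // 2
    return t


if __name__ == "__main__":
    print(rooted_tree_counts(10)[1:])
    print(unrooted_tree_counts(10)[1:])
```

For $n = 1, \ldots, 10$ this gives $r_n = 1, 1, 2, 4, 9, 20, 48, 115, 286, 719$ and $t_n = 1, 1, 1, 2, 3, 6, 11, 23, 47, 106$.
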
It is a function of the vector $a(x, u) = (a_0(x, u), a_i(x, u), a_j(x, u))^T$. Let $F_{a_0}$, $F_{a_i}$, $F_{a_j}$ be the respective partial derivatives. Combining the fact that the partial derivative of the cycle index satisfies (see [13])

$$\frac{\partial}{\partial s_1} Z(S_n; s_1, \ldots, s_n) = Z(S_{n-1}; s_1, \ldots, s_{n-1})$$

with (1), we obtain that

$$F_{a_0}(x_0, a(x_0, 1), 1) = \begin{pmatrix} 1 - x_0 Z(S_{i-2}; p(x_0, 1)) - x_0 Z(S_{j-2}; p(x_0, 1)) \\ x_0 \sum_{\ell_1 + \ell_2 = i-1} Z(S_{\ell_1 - 1}; a_0(x_0, 1) + a_i(x_0, 1))\, Z(S_{\ell_2}; a_j(x_0, 1)) \\ x_0 \sum_{r_1 + r_2 = j-1} Z(S_{r_1 - 1}; a_0(x_0, 1) + a_j(x_0, 1))\, Z(S_{r_2}; a_i(x_0, 1)) \end{pmatrix} = \begin{pmatrix} 1 - x_0 Z(S_{i-2}; p(x_0, 1)) - x_0 Z(S_{j-2}; p(x_0, 1)) \\ x_0 Z(S_{i-2}; p(x_0, 1)) \\ x_0 Z(S_{j-2}; p(x_0, 1)) \end{pmatrix}.$$

Similarly, we get $F_{a_i}(x_0, a(x_0, 1), 1) = F_{a_j}(x_0, a(x_0, 1), 1) = F_{a_0}(x_0, a(x_0, 1), 1)$. Since the entries of each of these columns sum to 1, one can readily see that $\det(I - F_a(x_0, a(x_0, 1), 1)) = 0$. Moreover, from equation (12) and $r(x_0) = 1$ it follows that $t(x_0, 1) = (1 + r(x_0^2))/2$. Note that $x_0 < 1$, and thus $x_0^2$ is inside the region of convergence of $r(x)$. So, for the generating function $t(x, u)$, all the conditions required by Lemma 1 are satisfied. Thus, the distribution of $X_n$ corresponding to $(i, j)$, with $i \ne j$ and $i, j > 1$, is asymptotically normal.

From the form of $F_a(x_0, a(x_0, 1), 1)$, it is not difficult to see that $v^T = (1, 1, 1)$ is a basic solution. In what follows, we compute $v^T F_x(x_0, a(x_0, 1), 1)$ and $v^T F_u(x_0, a(x_0, 1), 1)$ to estimate $\mu$, which is simpler than working with $F_x(x_0, a(x_0, 1), 1)$ and $F_u(x_0, a(x_0, 1), 1)$ themselves. Then, we have […] In view of $p(x, 1) = p(x) = r(x)$, combining with (1) and (8), it follows that […] and thus […] However, we were unable to simplify (15) any further. For convenience, denote the value of $v^T F_u(x_0, a(x_0, 1), 1)$ by $w(i, j)$. One can use a computer to get an approximate value of it. Thus,

$$\mu = \frac{2}{x_0 b^2}\, w(i, j).$$

Case 2. $i = 1$, $j > 1$. We proceed in the same way as in Case 1, and we still use the same notation.
But notice that when we split up $\mathcal{P}_n$ according to the degrees of the roots, there exists only one planted tree with a root of degree 1, namely the tree with only two nodes. Thus, we have $x + a_0(x, u) + a_j(x, u) = p(x, u)$, and the system of functions is as follows: […] As before, we can establish the generating function for rooted trees […] and for general trees […] It is not difficult to verify that (16), (17) and (18) satisfy the conditions of Lemma 1. We obtain $v^T = (1, 1)$, and […] Again, for convenience, we denote $v^T F_u(x_0, a(x_0, 1), 1)$ by $w(1, j)$. Then, it follows that

$$\mu = \frac{2}{x_0 b^2}\, w(1, j).$$

Case 3. $i = j > 1$. Since the procedure is the same as before, we omit the details of the proof for brevity. We still use the same notation here, which causes no conflict. For general trees, we have […] Employing Lemma 1, asymptotic analysis of the functional equations gives $v^T = (1, 1)$, $v^T F_x(x_0, a(x_0, 1), 1) = b^2/2$, and $v^T F_u(x_0, a(x_0, 1), 1)$ as a sum of terms of the form $Z(S_{m_1}; a_0(x_0, 1))\, Z(S_{m_2}; a_j(x_0, 1)) \cdot m_2$. Then, we obtain that $\mu = \frac{2}{x_0 b^2}\, w(j, j)$, where $w(j, j)$ denotes the value of $v^T F_u(x_0, a(x_0, 1), 1)$.

In conclusion, we can now establish the following theorem.

Theorem 3. Suppose $X_n$ is the random variable corresponding to the occurrences of the pattern $(i, j)$. The probability measure of $X_n$ is defined as in (2) for the generating function of trees $t(x, u)$. Then, the distribution of $X_n$ is asymptotically normal with mean

$$\mathbb{E}X_n = \frac{2}{x_0 b^2}\, w(i, j)\, n + O(1)$$

and variance $\operatorname{Var}X_n = \sigma(i, j)\, n + O(1)$, where $w(i, j)$ and $\sigma(i, j)$ are constants.

Following the book [1], we say that almost every (a.e.) graph in a graph space $\mathcal{G}_n$ has a certain property $Q$ if the probability $\Pr(Q)$ in $\mathcal{G}_n$ converges to 1 as $n$ tends to infinity. Occasionally, we say almost all instead of almost every. From the above theorem and the Chebyshev inequality, it is easy to see that, for instance,

$$\Pr\bigl[\,|X_n - \mathbb{E}X_n| > n^{3/4}\,\bigr] \le \frac{\operatorname{Var}X_n}{n^{3/2}} \to 0 \quad \text{as } n \to \infty.$$

Thus, for almost all trees in $\mathcal{T}_n$, $X_n$ equals $\bigl(\frac{2}{x_0 b^2} \cdot w(i, j) + o(1)\bigr) n$. Consequently, the following result is relevant.
Corollary 4. For almost all trees, the number of occurrences of the pattern $(i, j)$ is $\bigl(\frac{2}{x_0 b^2} \cdot w(i, j) + o(1)\bigr) n$.

In this section, we use the result of Corollary 4 to investigate the values of the Randić index and the general Randić index, and show that Conjecture 1 is true for almost all trees. Let $G = (V, E)$ be a graph with vertex set $V$ and edge set $E$. The Randić index is defined as

$$R(G) = \sum_{uv \in E} \frac{1}{\sqrt{d_u d_v}},$$

where $d_u$, $d_v$ are the degrees of the vertices $u, v \in V$. We know that the number of occurrences of the pattern $(i, j)$ is the number of edges with one end of degree $i$ and the other of degree $j$ in the tree. We still assume $i \le j$. Then, the number of edges $(i, j)$ in almost all trees of $\mathcal{T}_n$ is $\bigl(\frac{2}{x_0 b^2} \cdot w(i, j) + o(1)\bigr) n$. Moreover, every tree in $\mathcal{T}_n$ has $n - 1$ edges. So, for any integer $K$, the partial sum $\sum_{i \le j \le K} \frac{2}{x_0 b^2} \cdot w(i, j)$ is at most 1 and therefore converges as $K \to \infty$. Since $1/\sqrt{i j} \le 1$, the sum $\sum_{i \le j \le K} \frac{2}{x_0 b^2 \sqrt{i \cdot j}} \cdot w(i, j)$ also converges to some constant $\lambda$. Although the exact value of $\lambda$ cannot be given, one can employ a computer to get that $0.1 < \lambda < 1$. Then, for any $\varepsilon > 0$, there exists an integer $K_0$ such that for any $K \ge K_0$,

$$\sum_{i \le j,\; j \ge K} \frac{2}{x_0 b^2} \cdot w(i, j) < \varepsilon.$$

That is, for almost all trees, the number of edges with one end of degree larger than $K$ is less than $\varepsilon n$. Hence, the Randić index satisfies […] Immediately, we obtain the following result.

Theorem 5. For any $\varepsilon > 0$, the Randić index of almost all trees satisfies

$$(\lambda - \varepsilon)\, n < R(T_n) < (\lambda + \varepsilon)\, n.$$

Bollobás and Erdős [2] generalized the Randić index as

$$R_\alpha(G) = \sum_{uv \in E} (d_u d_v)^{\alpha},$$

which is called the general Randić index, where $\alpha$ is a real number. Clearly, if $\alpha = -\frac{1}{2}$, then $R_{-1/2}(G) = R(G)$. We refer the reader to the survey [9] for more details on this index. Here, we suppose $\alpha < 0$. Following the argument used to obtain (19), we can analogously get an estimate of $R_\alpha(T_n)$. Then, the following result is relevant.

Corollary 6. Suppose $\alpha < 0$. For any $\varepsilon > 0$, almost all trees satisfy

$$(\lambda_\alpha - \varepsilon)\, n < R_\alpha(T_n) < (\lambda_\alpha + \varepsilon)\, n,$$

where $\lambda_\alpha$ is a constant depending on $\alpha$.

In what follows, we consider Conjecture 1. Let $d(u, v)$ be the distance between vertices $u, v \in V$.
The average distance is defined as the average value of the distances between all pairs of vertices of a graph $G$ with $n$ vertices, i.e.,

$$D(G) = \binom{n}{2}^{-1} \sum_{\{u, v\} \subseteq V} d(u, v).$$

We will show that Conjecture 1 is true for almost all trees. To this end, we first introduce the Wiener index of a graph $G$, which is defined as

$$W(G) = \sum_{\{u, v\} \subseteq V} d(u, v).$$

Clearly, $W(G) = \binom{n}{2} D(G)$. $W(T_n)$ is a random variable on $\mathcal{T}_n$, and Wagner [14] established the following result.
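
Independently of the asymptotic results, the quantities appearing in Conjecture 1 can be computed directly for a small tree. The following Python sketch evaluates the general Randić index, the Wiener index and the average distance from an edge list using breadth-first search. The function names and the two example trees are illustrative assumptions, and the code assumes a connected graph with at least two vertices.

```python
from collections import defaultdict, deque
from math import comb, sqrt


def adjacency(edges):
    """Adjacency lists of an undirected graph given as a list of edges."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def general_randic_index(edges, alpha=-0.5):
    """Sum of (d_u * d_v)^alpha over all edges; alpha = -1/2 gives R(G)."""
    adj = adjacency(edges)
    return sum((len(adj[u]) * len(adj[v])) ** alpha for u, v in edges)


def wiener_index(edges):
    """Sum of distances over all unordered pairs of vertices (BFS from each vertex)."""
    adj = adjacency(edges)
    total = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
    return total // 2  # every unordered pair was counted twice


def average_distance(edges):
    """D(G) = W(G) / C(n, 2)."""
    n = len(adjacency(edges))
    return wiener_index(edges) / comb(n, 2)


# Hypothetical examples: the path and the star on four vertices.
path4 = [(0, 1), (1, 2), (2, 3)]
star4 = [(0, 1), (0, 2), (0, 3)]

if __name__ == "__main__":
    for name, tree in (("path", path4), ("star", star4)):
        print(name, general_randic_index(tree), wiener_index(tree), average_distance(tree))
```

For the path on four vertices, $R = \frac{1}{2} + \sqrt{2}$ and $W = 10$, so $D = 10/6$; for the star on four vertices, $R = \sqrt{3}$ and $W = 9$, so $D = 3/2$. In both cases $R \ge D$, in line with Conjecture 1.
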
