Small components in k-nearest neighbour graphs

Mark Walters*

June 8, 2018

*School of Mathematical Sciences, Queen Mary, University of London, London E1 4NS, England. m.walters@qmul.ac.uk

Abstract

Let $G = G_{n,k}$ denote the graph formed by placing points in a square of area $n$ according to a Poisson process of density 1 and joining each point to its $k$ nearest neighbours. In [2] Balister, Bollobás, Sarkar and Walters proved that if $k < 0.3043 \log n$ then the probability that $G$ is connected tends to 0, whereas if $k > 0.5139 \log n$ then the probability that $G$ is connected tends to 1. We prove that, around the threshold for connectivity, all vertices near the boundary of the square are part of the (unique) giant component. This shows that arguments about the connectivity of $G$ do not need to consider 'boundary' effects. We also improve the upper bound for the threshold for connectivity of $G$ to $k = 0.4125 \log n$.

1 Introduction

Let $S_n$ denote a $\sqrt{n} \times \sqrt{n}$ square and let $G_{n,k}$ denote the graph formed by placing points in $S_n$ according to a Poisson process $\mathcal{P}$ of density 1 and joining each point to its $k$ nearest neighbours by an undirected edge. Since we shall be interested in the asymptotic behaviour of this graph as $n \to \infty$, it is convenient to introduce one piece of notation. For a graph property $\Pi$ we say that $G_{n,k}$ has $\Pi$ with high probability (abbreviated to whp) if $\mathbb{P}(G_{n,k} \text{ has } \Pi) \to 1$ as $n \to \infty$.

Xue and Kumar [5] proved that the threshold for connectivity is $\Theta(\log n)$; more precisely they showed that if $k = k(n) > 5.1774 \log n$ then $G_{n,k}$ is connected whp, and if $k = k(n) < 0.074 \log n$ then $G_{n,k}$ is whp not connected. Subsequent work by Balister, Bollobás, Sarkar and Walters [2] substantially improved the upper and lower bounds to $0.5139 \log n$ and $0.3043 \log n$ respectively.
In their proof they also showed that for any $k = \Theta(\log n)$ the graph consists of a giant component containing a proportion $1 - o(1)$ of all vertices and (possibly) some other 'small' components of (Euclidean) diameter $O(\sqrt{\log n})$ (for a formal statement see Lemma 3).

Moreover, they showed that if $k > 0.311 \log n$ then $G$ has no small component within distance $O(\sqrt{\log n})$ of the boundary of $S_n$. Unfortunately, there is a gap between this bound and the lower bound of $0.3043 \log n$ mentioned above. This means that close to the threshold for connectivity the obstruction to connectivity could occur near the boundary of the square or it could occur in the centre (their methods did rule out the possibility that the obstruction occurs in the corner of the square). This has caused several problems in later papers (e.g., [3]) where the authors had to consider both cases in their proofs.

Our main result is the following theorem showing that, in fact, the obstruction must occur away from the boundary of $S_n$. This should simplify subsequent work in the area as only central components need to be considered. (Of course, the improvement itself is only of minor interest; what matters is that the new upper bound for the existence of components near the boundary is smaller than the general lower bound.)

Theorem 1. Suppose that $G = G_{n,k}$ for some $k > 0.272 \log n$. Then there is a constant $\varepsilon > 0$ such that the probability that there exists a vertex within distance $\log n$ of the boundary of $S_n$ that is not contained in the giant component is $O(n^{-\varepsilon})$.

Remark. The distance $\log n$ to the boundary is much larger than the typical edge length and the (non-giant) component sizes, which are $O(\sqrt{\log n})$. Moreover, the theorem would still be true with $\log n$ replaced by a small power of $n$.
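The model defined in the Introduction is easy to simulate for small $n$, which can help build intuition for Theorem 1 and for the role of $k$. The following is a minimal sketch, not part of the paper's argument: the function names `sample_poisson_points` and `knn_components` are illustrative choices, the nearest-neighbour search is brute force, and nothing here says anything about the asymptotic regime the theorems concern.

```python
import math
import random

def sample_poisson_points(n, rng):
    """Sample a density-1 Poisson process on the sqrt(n) x sqrt(n) square."""
    side = math.sqrt(n)
    # The number of unit-rate arrivals in [0, n] has a Poisson(n) distribution.
    count, t = 0, rng.expovariate(1.0)
    while t <= n:
        count += 1
        t += rng.expovariate(1.0)
    return [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(count)]

def knn_components(points, k):
    """Number of connected components of the undirected k-nearest-neighbour graph."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, p in enumerate(points):
        # Brute-force search: sort all other points by distance to p.
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: math.dist(p, points[j]))
        for j in others[:k]:
            parent[find(i)] = find(j)  # undirected edge between i and j
    return len({find(i) for i in range(len(points))})
```

Calling `knn_components(sample_poisson_points(n, random.Random(seed)), k)` for several seeds and values of `k` gives a rough empirical picture of when small components appear; the quadratic running time limits this to modest `n`.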
Our second result is the following improvement on the upper bound for connectivity of $G$.

Theorem 2. Suppose that $G = G_{n,k}$ for some $k > 0.4125 \log n$. Then whp $G$ is connected.

To illustrate Theorem 2 let $D$ be a disc of radius $r$ and consider the event that there are $k+1$ points inside $D$ and no points in $3D \setminus D$ (where $3D$ denotes the disc with the same centre as $D$ and three times the radius). If this event occurs then the $k$ nearest neighbours of any point in $D$ also lie in $D$: in particular, there are no 'out'-edges from $D$ to the rest of the graph. If we choose $r$ such that $9\pi r^2 \approx k+1$ (to maximise the probability of this event) then the probability of a specific instance of this event is about $9^{-(k+1)}$. Since we can fit $\Theta(n/\log n)$ disjoint copies of this event into $S_n$ we see that if $k < \left(\frac{1}{\log 9} - \varepsilon\right) \log n$ (for some $\varepsilon > 0$) then whp this event occurs somewhere in $S_n$ and thus $G$ has a subgraph with no out-degree. Since $1/\log 9 \approx 0.455 > 0.4125$, Theorem 2 shows that there is a range of $k$ for which the graph is connected whp but contains pieces with no out-degree. (The corresponding result for in-degree was proved in [2].)

The proofs of these two theorems are broadly similar: they use the ideas from [2] but also consider points which are near the small component but not contained in it. Indeed, if one looks at the lower bound proved in [2] one sees that the density of points near the small component is higher than average. This is an unlikely event and we incorporate it into our bounds. Indeed, the above observation that there are small pieces of the graph with no out-degree shows that any proof of Theorem 2 (or any stronger bound) must consider points outside of a potential small component and show that they send edges in.
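For readers who want the calculation behind the estimate $9^{-(k+1)}$ used above to illustrate Theorem 2, here is a short sketch. It assumes the event is read as "exactly $k+1$ points in $D$" and uses only the Poisson distribution and Stirling's formula; the "about" in the text then hides a factor polynomial in $k$.

```latex
\begin{align*}
\mathbb{P}\bigl(\#D = k+1,\ \#(3D \setminus D) = 0\bigr)
  &= e^{-\pi r^2}\,\frac{(\pi r^2)^{k+1}}{(k+1)!}\cdot e^{-8\pi r^2}
   = e^{-9\pi r^2}\,\frac{(\pi r^2)^{k+1}}{(k+1)!}, \\[4pt]
\frac{d}{d(\pi r^2)}\Bigl[-9\pi r^2 + (k+1)\log(\pi r^2)\Bigr] = 0
  &\iff 9\pi r^2 = k+1, \\[4pt]
\text{and for this choice of } r:\qquad
e^{-(k+1)}\,\frac{(k+1)^{k+1}}{9^{k+1}\,(k+1)!}
  &= \frac{9^{-(k+1)}}{\sqrt{2\pi(k+1)}}\,\bigl(1+o(1)\bigr).
\end{align*}
```

With $\Theta(n/\log n)$ disjoint copies of the event, the expected number of occurrences is $n^{1-o(1)}\, 9^{-k}$, which tends to infinity when $k < \left(\frac{1}{\log 9} - \varepsilon\right)\log n$, in line with the threshold quoted in the text.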
The key step is to split into two regimes depending on whether there is a point 'close' to the small component. If there is no such point then the 'excluded area' from the small component is quite large (which is unlikely), whereas if there is such a point then it must have a small $k$-nearest neighbour radius (which is also unlikely).

2 Notation and Preliminaries

We start with some notation. For any point $x$ and real number $r$ let $D(x, r)$ denote the closed disc of radius $r$ about $x$. We shall also use the term half-disc of radius $r$ based at $x$ to mean one of the four regions obtained by dividing the disc $D(x, r)$ in half vertically or horizontally. For a set $A$ in $S_n$ let $|A|$ denote the measure of $A$, and $\#A$ denote the number of points of $\mathcal{P}$ in $A$. For any real number $r$ let $A_{(r)}$ be the $r$-blowup of $A$ defined by $A_{(r)} = \{x \in \mathbb{R}^2 : d(x, A) < r\}$. Note that we do allow $A_{(r)}$ to contain points outside of $S_n$. Finally, whenever we use the term diameter we shall always mean the Euclidean diameter: we do not use graph diameter at any point in the paper.

We shall need a few results from the paper of Balister, Bollobás, Sarkar and Walters [2]. Since our notation is slightly different we quote them here for convenience. The first is a slight variant of Lemma 6 of [2] which follows immediately from the proof given there (see also Lemma 1 of [3]).

Lemma 3. For fixed $c > 0$ and $L$, there exists $c_1 = c_1(c, L) > 0$, depending only on $c$ and $L$, such that for any $k \ge c \log n$, the probability that $G_{n,k}$ contains two components each of (Euclidean) diameter at least $c_1\sqrt{\log n}$ is $O(n^{-L})$.

The second bounds the probability of a small component near one side, or two sides, of $S_n$; it is explicit in the proof of Theorem 7 of [2]. (Note that Theorem 1 improves the first of these bounds.)

Lemma 4. Suppose that $k = \Theta(\log n)$.
The probability that there is a small component containing a vertex within $\log n$ of one boundary of $S_n$ is $O(n^{\frac12 + o(1)}\, 5^{-k})$ and the probability that there is a small component containing a vertex within $\log n$ of two sides of $S_n$ is $O(n^{o(1)}\, 3^{-k})$.

The final result follows easily from concentration results for the Poisson distribution (see e.g. [1]) and most of it is implicit in Lemma 2 of [2].

Lemma 5. For any fixed $c$ and $L$ there is a constant $c_2(c, L)$ such that for any $k$ with $c \log n < k < \log n$ the probability that there is any edge of length at least $c_2\sqrt{\log n}$, or any two points within distance $\frac{1}{c_2}\sqrt{\log n}$ of each other not joined by an edge, or a point $x \in \mathcal{P}$ with a half-disc of radius $c_2\sqrt{\log n}$ based at $x$ contained entirely inside $S_n$ that contains no points of $\mathcal{P}$, is $O(n^{-L})$.

We will use the following simple but technical lemma several times.

Lemma 6. Suppose that $A$, $B$, $C$ are three sets in $S_n$ with $|A| \le |C|$ and $|B| \le |C|$. Then

$$\mathbb{P}(\#A \ge k,\ \#B \ge k,\ \#(A \cap B) = 0 \text{ and } \#C = 0) \le \left(\frac{4|A||B|}{(|A| + |B| + |C|)^2}\right)^k.$$

Proof. Let $A' = (A \setminus B) \setminus C$, $B' = (B \setminus A) \setminus C$, $C' = C \cup (A \cap B)$, and $U = A \cup B \cup C = A' \cup B' \cup C'$. We see that $A'$, $B'$ and $C'$ are pairwise disjoint so $|U| = |A'| + |B'| + |C'|$ and, since $\#(A \cap B) = 0$ and $\#C = 0$, that $\#A' \ge k$ and $\#B' \ge k$. We have

$$\begin{aligned}
\mathbb{P}(\#A \ge k,\ \#B \ge k,\ \#(A \cap B) = 0 \text{ and } \#C = 0)
&= \mathbb{P}(\#A' \ge k,\ \#B' \ge k \text{ and } \#C' = 0) \\
&= \sum_{l \ge k,\, m \ge k} \mathbb{P}(\#A' = l,\ \#B' = m \text{ and } \#U = l + m) \\
&= \sum_{l \ge k,\, m \ge k} \mathbb{P}(\#A' = l,\ \#B' = m \mid \#U = l + m)\, \mathbb{P}(\#U = l + m) \\
&\le \max_{l \ge k,\, m \ge k} \mathbb{P}(\#A' = l,\ \#B' = m \mid \#U = l + m)
\end{aligned}$$

(the final line follows since $\sum_{l \ge k,\, m \ge k} \mathbb{P}(\#U = l + m) \le 1$). We have $|A'| \le |A| \le |C| \le |C'|$ so $|A'| \le \frac12 |U|$ and similarly $|B'| \le \frac12 |U|$.
Hence, for $l, m \ge k$,

$$\begin{aligned}
\mathbb{P}(\#A' = l,\ \#B' = m \mid \#U = l + m)
&= \binom{l+m}{l} \left(\frac{|A'|}{|U|}\right)^{l} \left(\frac{|B'|}{|U|}\right)^{m} \\
&\le 2^{l+m} \left(\frac{|A'|}{|U|}\right)^{l} \left(\frac{|B'|}{|U|}\right)^{m} \\
&\le 2^{2k} \left(\frac{|A'|}{|U|}\right)^{k} \left(\frac{|B'|}{|U|}\right)^{k} \\
&= \left(\frac{4|A'||B'|}{|U|^2}\right)^{k}
 = \left(\frac{4|A'||B'|}{(|A'| + |B'| + |C'|)^2}\right)^{k}.
\end{aligned}$$

Finally, observe that $|A'| \le |A| \le |C| \le |C'|$ and $|B'| \le |B| \le |C| \le |C'|$ imply that

$$\frac{4|A'||B'|}{(|A'| + |B'| + |C'|)^2} \le \frac{4|A||B'|}{(|A| + |B'| + |C'|)^2} \le \frac{4|A||B|}{(|A| + |B| + |C'|)^2} \le \frac{4|A||B|}{(|A| + |B| + |C|)^2},$$

which completes the proof.

3 Proof of Theorem 2

By hypothesis we have $k > 0.4125 \log n$. Also, we may assume that $k < 0.6 \log n$ since we already know that $G_{n,k}$ is connected whp if $k \ge 0.6 \log n$. Let $c' = \max\{c_1(0.25, 1),\, c_2(0.25, 1),\, 1\}$ be as given by Lemmas 3 and 5 and let $M = 20000c'$. (We shall reuse some of the bounds we prove here in the proof of Theorem 1 so these are convenient values.)

Tile $S_n$ with small squares of side length $s = \sqrt{\log n}/M$. We form a graph $\widehat{G}$ on these tiles by joining two tiles whenever the distance between their centres is at most $2c'\sqrt{\log n}$. We call a pointset $\mathcal{P}$ bad if any of the following hold:

1. there exist two points that are joined in $G$ but the tiles containing these points are not joined in $\widehat{G}$,
2. there exist two points, at most distance $20000s$ apart, that are not joined,
3. there exists a half-disc based at a point of $\mathcal{P}$ of radius $c'\sqrt{\log n}$ that is contained entirely in $S_n$ and contains no (other) point of $\mathcal{P}$,
4. there exist two components in $G_{n,k}$ with Euclidean diameter at least $c'\sqrt{\log n}$,
5. there exists a component of diameter at most $c'\sqrt{\log n}$ containing a vertex within distance $2c'\sqrt{\log n}$ of the boundary of $S_n$,

and good otherwise.
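The tiling and the tile graph $\widehat{G}$ used in this definition can be sketched in a few lines. This is only an illustration under stated assumptions: the names `tile_of`, `tiles_joined` and `violates_condition_1` are hypothetical, `s` and `reach` stand for $\sqrt{\log n}/M$ and $2c'\sqrt{\log n}$ and would be tiny illustrative values in practice, and two points in the same tile are treated here as never violating the first condition.

```python
import math

def tile_of(point, s):
    """Index (column, row) of the side-s tile containing the given point."""
    x, y = point
    return (int(x // s), int(y // s))

def tile_centre(tile, s):
    """Centre of the tile with the given (column, row) index."""
    col, row = tile
    return ((col + 0.5) * s, (row + 0.5) * s)

def tiles_joined(t1, t2, s, reach):
    """Distinct tiles are joined when their centres are at distance at most reach."""
    return t1 != t2 and math.dist(tile_centre(t1, s), tile_centre(t2, s)) <= reach

def violates_condition_1(edges, points, s, reach):
    """True if some edge joins two points whose distinct tiles are not joined."""
    for i, j in edges:
        ti, tj = tile_of(points[i], s), tile_of(points[j], s)
        if ti != tj and not tiles_joined(ti, tj, s, reach):
            return True
    return False
```

Here `edges` is any iterable of index pairs into `points`, for example the edge list of a simulated $k$-nearest neighbour graph.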
We see that our choice of $c'$ and $M$ together with Lemma 5 imply that the probability that any of the first three conditions occurs is $O(n^{-1})$. By Lemma 3 the probability of the fourth condition is $O(n^{-1})$. Since $k > 0.4125 \log n > \frac{1}{\log 25} \log n$, Lemma 4 implies the probability of the last condition is $O(n^{-\varepsilon})$ for some $0 < \varepsilon < 1$. (Alternatively this follows from Theorem 1.) Combining these we see that the probability of a bad configuration is $O(n^{-\varepsilon})$.

Suppose that $\mathcal{P}$ is a good configuration but $G$ is not connected. Then there exists a component $F$ with diameter at most $c'\sqrt{\log n}$ not containing any vertex within $2c'\sqrt{\log n}$ of the boundary of $S_n$. Let $A$ be the collection of tiles that contain a point of $F$. Since the configuration is good, $A$ is a connected subset of $\widehat{G}$ containing no tile within $c'\sqrt{\log n}$ of the boundary of $S_n$. Moreover, the bound on the diameter of $F$ implies that $A$ contains at most $16(c'M)^2$ tiles.

The heart of the proof is in the following lemma that bounds the probability of $G$ having such a component.

Lemma 7. Suppose $A$ is a connected subset of $\widehat{G}$ containing no tile within $c'\sqrt{\log n}$ of the boundary of $S_n$. The probability that the configuration is good and that $G$ has a component contained entirely inside $A$ meeting every tile of $A$ is $O(11.3^{-k})$.

Proof. Suppose that $F$ is a component of $G$ meeting every tile in $A$. The proof of this lemma naturally divides into three steps. In the first step we define some regions based on the component $F$, some of which must contain many points and some of which must be empty. In the second step we bound the area of these regions. In the final step we bound the probability that these regions do indeed contain the required number of points.

Step 1: Defining the regions.
We use the following hexagonal construction which was introduced by Balister, Bollobás, Sarkar and Walters in [2].

[Figure 1: The circumscribed hexagon $H$ and associated regions $H_1, \dots, H_6$ and $A_0, A_1, \dots, A_6$, with the points $P_1, \dots, P_6$.]

Let $H$ be the circumscribed hexagon of the points of $F$ obtained by taking the six tangents to the convex hull of $F$ at angles $0$ and $\pm 60^\circ$ to the horizontal, and let $H_1, \dots, H_6$ be the regions bounded by the exterior angle bisectors of $H$ as in Figure 1. Let $P_1, \dots, P_6$ be the points of $F$ on these tangents, and let $D_1, \dots, D_6$ denote the $k$-nearest neighbour discs of $P_1, \dots, P_6$. For $1 \le i \le 6$ let $A_i = D_i \cap H_i$. Let $A_0$ be the set $D_i \cap H$ with the smallest area. We see that for each $1 \le i \le 6$ the set $A_i$ contains no points of $\mathcal{P}$. Also $A_0$ contains $k+1$ points all of which must be in $F$ and thus in $A$. Writing $A'$ for the set $A_0 \cap A$, we see that $A'$ contains at least $k+1$ points of $\mathcal{P}$.

We also wish to take account of points near to but not contained in $F$. Let $P \in F$ and $Q \in G \setminus F$ be vertices minimising the distance between $F$ and $G \setminus F$. Let $r_0 = d(P, Q)$ and $r = r_0 - \sqrt{2}s$. Since we are assuming that every square of $A$ contains a point in $F$ we see that $A_{(r)} \setminus A$ contains no point of $\mathcal{P}$. Indeed, suppose there is a point of $\mathcal{P}$ in $A_{(r)} \setminus A$. Then this point is in $G \setminus F$ and lies at distance less than $r + \sqrt{2}s = r_0$ from some point of $F$ (each tile of $A$ has diameter $\sqrt{2}s$ and contains a point of $F$), which contradicts the definition of $r_0$. Obviously the points $Q$ and $P$ are not joined so, in particular, the $k$ points nearest to $Q$ must all be nearer to $Q$ than $P$ is. Moreover, since $Q$ is the point closest to $F$, we see that these $k$ points must all be further away from $P$ than $Q$ is. Combining these we see that these $k$ points lie in the set $B = D(Q, r_0) \setminus D(P, r_0)$.
Summarising all of the above, we see that $A'$ and $B$ each contain at least $k$ points and $A_{(r)} \setminus A$ and $\bigcup_{i=1}^{6} A_i$ are both empty. The intersection $A' \cap B$ contains no points (so we can think of them as disjoint) but $A_{(r)}$ and $\bigcup_{i=1}^{6} A_i$ will overlap significantly. Thus we will use Lemma 6 to form two separate bounds, one based on $A_{(r)} \setminus A$ being empty and one based on $\bigcup_{i=1}^{6} A_i$ being empty.

Step 2: Bounding the area of the regions. In this step we assume that the configuration is good. First we bound $\left|\bigcup_{i=1}^{6} A_i\right|$. Since the configuration is good each disc $D_i$ has radius at most $c'\sqrt{\log n}$ and each point $P_i$ is more than $2c'\sqrt{\log n}$ from the boundary of $S_n$. In particular $D_i$ is contained in $S_n$ for each $i$. Moreover, since $|D_i \cap H_i| \ge |D_i \cap H|$ for each $1 \le i \le 6$, we see that $|A_i| \ge |A_0|$. Since the $H_i$ and therefore the $A_i$ are disjoint, we have

$$\left|\bigcup_{i=1}^{6} A_i\right| \ge 6|A_0| \ge 6|A'|.$$

The sets $B$ and $A_{(r)}$ both depend on $r$ so it is convenient to write $r$ in terms of $|A'|$ by letting $x = r/\sqrt{|A'|/\pi}$. Since $B = D(Q, r_0) \setminus D(P, r_0)$ a simple calculation shows that $|B| = \left(\frac{\pi}{3} + \frac{\sqrt{3}}{2}\right) r_0^2$. Since the configuration is good, $r_0 > 20000s$ so $r = r_0 - \sqrt{2}s > r_0(1 - 10^{-4})$. Hence,

$$|B| = \left(\frac{\pi}{3} + \frac{\sqrt{3}}{2}\right) r_0^2 \le \left(\frac{\pi}{3} + \frac{\sqrt{3}}{2}\right) \frac{x^2 |A'|}{\pi (1 - 10^{-4})^2} < 0.61\, x^2 |A'|.$$

Finally we bound $|A_{(r)} \setminus A|$. Let $D$ and $D'$ be balls of area $|A|$ and $|A'|$ respectively. Since the configuration is good the half-disc of radius $c'\sqrt{\log n}$ about the right-most point of $F$ must contain a point of $\mathcal{P}$. In particular $r < r_0 \le c'\sqrt{\log n}$, and so $A_{(r)}$ is contained in $S_n$. By the isoperimetric inequality in the plane $|A_{(r)} \setminus A| \ge |D_{(r)} \setminus D|$, and it is easy to see that $|D_{(r)} \setminus D| \ge |D'_{(r)} \setminus D'|$.
Since $D'$ is a ball of radius $\sqrt{|A'|/\pi}$, $D'_{(r)}$ is a ball of radius $\sqrt{|A'|/\pi} + r = (1 + x)\sqrt{|A'|/\pi}$, and we have

$$|D'_{(r)} \setminus D'| = ((x + 1)^2 - 1)|A'|.$$

Step 3: Bounding the probability of such a configuration. We have seen that if there is such a component $F$ then there exist regions as defined in Step 1. These regions are determined by 14 points: the six points defining sides of the hexagonal hull, their six $k$th nearest neighbour points and the points $P$ and $Q$; that is, if there is such a component $F$ then there are 14 points of $\mathcal{P}$ defining regions $A'$, $B$, $A_1, \dots, A_6$ and $A_{(r)}$ with $\#A' \ge k$, $\#B \ge k$, $\#(A' \cap B) = 0$, and both $\#\bigcup_{i=1}^{6} A_i = 0$ and $\#(A_{(r)} \setminus A) = 0$. Moreover, if the configuration is good all of these points must lie within $c'\sqrt{\log n}$ of $A$. Let $Z$ be the event that there are 14 points of $\mathcal{P}$ all within $c'\sqrt{\log n}$ of $A$ defining regions with the above properties. We have

$$\mathbb{P}(\text{there exists } F \text{ and the configuration is good}) \le \mathbb{P}(Z \text{ and the configuration is good}) \le \mathbb{P}(Z).$$

We bound the probability that $Z$ occurs (note we are not assuming that the configuration is good). Fix a particular collection of 14 points of $\mathcal{P}$ and let $Z'$ be the event that these particular points witness $Z$. Note that, since we are assuming these 14 points all lie within $c'\sqrt{\log n}$ of $A$, the corresponding regions all lie entirely within $S_n$. We apply Lemma 6 to the sets $A'$, $B$ together with each of $\bigcup_{i=1}^{6} A_i$ and $A_{(r)} \setminus A$.

First we form the bound based on $\#\bigcup_{i=1}^{6} A_i = 0$. We have $|A'| \le \left|\bigcup_{i=1}^{6} A_i\right|$ and, provided $x < 3.13$, we have $|B| \le 0.61x^2|A'| < 6|A'| \le \left|\bigcup_{i=1}^{6} A_i\right|$ so Lemma 6 applies. Thus we see that

$$\mathbb{P}(Z') \le \left(\frac{4|A'||B|}{\left(|A'| + \left|\bigcup_{i=1}^{6} A_i\right| + |B|\right)^2}\right)^{k} \le \left(\frac{4 \cdot 0.61x^2}{(7 + 0.61x^2)^2}\right)^{k}.$$

Secondly we form a bound based on $\#(A_{(r)} \setminus A) = 0$. This time $|B| \le 0.61x^2|A'| \le ((x + 1)^2 - 1)|A'| \le |A_{(r)} \setminus A|$ and, provided that $x > \sqrt{2} - 1$, we have $|A_{(r)} \setminus A| \ge |A'|$ so the conditions of Lemma 6 are satisfied. Thus

$$\mathbb{P}(Z') \le \left(\frac{4|A'||B|}{(|A'| + |A_{(r)} \setminus A| + |B|)^2}\right)^{k} \le \left(\frac{4 \cdot 0.61x^2}{((x + 1)^2 + 0.61x^2)^2}\right)^{k}.$$

It is easy to check that the maximum of the minimum of these two bounds occurs when they are equal, i.e., when $x = \sqrt{7} - 1$; at this point they are $\alpha^{-k}$ for some $\alpha > 11.3$. Therefore $\mathbb{P}(Z') \le \alpha^{-k}$.

Since all 14 points must lie within $c'\sqrt{\log n}$ of $A$ there are $O((\log n)^{14})$ ways of choosing them. Hence, the expected number of 14-point sets for which $Z'$ occurs is $O((\log n)^{14} \alpha^{-k}) = O(11.3^{-k})$. Thus $\mathbb{P}(Z) = O(11.3^{-k})$ and the proof of the lemma is complete.

Since the degree of vertices in $\widehat{G}$ is bounded and $16(c'M)^2$ is a (large) constant, there are only a constant number of connected sets of $\widehat{G}$ of size at most $16(c'M)^2$ which contain a fixed tile, and therefore $O(n)$ such sets in total. Since $k > 0.4125 \log n > \frac{1}{\log 11.3} \log n$ the expected number of small components in $G$ with the configuration good is $O(n \cdot 11.3^{-k}) = o(1)$. Thus

$$\mathbb{P}(G \text{ is not connected}) \le \mathbb{P}(\text{there is a small component and } \mathcal{P} \text{ is good}) + \mathbb{P}(\mathcal{P} \text{ is bad}) = o(1) + O(n^{-\varepsilon}) = o(1),$$

so whp $G$ is connected.

4 Proof of Theorem 1

Much of this is the same as the proof of Theorem 2 so we shall concentrate on the differences. This time, by hypothesis we have $k > 0.272 \log n$ and again we may assume $k < 0.6 \log n$. We use exactly the same tessellation of $S_n$ with small squares of side length $s = \sqrt{\log n}/M$ where $c' = \max\{c_1(0.25, 1),\, c_2(0.25, 1),\, 1\}$ and $M = 20000c'$ are given by Lemmas 3 and 5 as before. Again we form a graph $\widehat{G}$ on these tiles by joining two tiles whenever the distance between their centres is at most $2c'\sqrt{\log n}$.
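Before turning to the modified definition of a bad pointset, it may help to sanity-check numerically the optimisation behind the constant 11.3 in Lemma 7. The sketch below is not part of the proof: it evaluates the two bases of the $k$th-power bounds from Step 3 on a grid, respects their stated validity ranges ($x < 3.13$ and $x > \sqrt{2} - 1$), and reports the worst case. The function names, the grid step and the cut-off `x_max` are assumptions made for the illustration.

```python
import math

def bound_hexagon(x):
    """Base of the bound using the empty regions A_1..A_6 (valid for x < 3.13)."""
    t = 0.61 * x * x
    return 4 * t / (7 + t) ** 2

def bound_blowup(x):
    """Base of the bound using the empty blow-up region (valid for x > sqrt(2) - 1)."""
    t = 0.61 * x * x
    return 4 * t / ((x + 1) ** 2 + t) ** 2

def best_bound(x):
    """Smallest base among the bounds whose validity condition holds at x."""
    valid = []
    if x < 3.13:
        valid.append(bound_hexagon(x))
    if x > math.sqrt(2) - 1:
        valid.append(bound_blowup(x))
    return min(valid)  # for every x > 0 at least one of the two bounds is valid

def lemma7_alpha(step=1e-3, x_max=20.0):
    """Reciprocal of the worst-case base over the grid step, 2*step, ..., x_max."""
    n_steps = int(round(x_max / step))
    worst = max(best_bound(i * step) for i in range(1, n_steps + 1))
    return 1 / worst

alpha = lemma7_alpha()
print(alpha, 1 / math.log(alpha))  # alpha and the resulting multiple of log n
```

On this grid the worst case sits near $x = \sqrt{7} - 1$, where the two bounds coincide, and the reciprocal comes out a little above 11.3, consistent with the claim in the proof and with the requirement $k > 0.4125 \log n$.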
We need a slightly different definition of a bad pointset: the first four conditions are exactly as before but we replace the fifth condition by

5. there exists a component of diameter at most $c'\sqrt{\log n}$ containing a vertex within distance $3c'\sqrt{\log n}$ of two sides of $S_n$.

Note that this condition, together with Condition 4 on the diameter of small components, implies that for any small component at most one side of $S_n$ can have points of this small component within distance $2c'\sqrt{\log n}$ of it.

Since the tessellation is the same as in the proof of Theorem 2 we see that the probability that any of the original four conditions hold is $O(n^{-1})$ as before. Since $k > 0.272 \log n$, Lemma 4 implies that the probability of the new condition above is $O(n^{-\varepsilon})$ for some $0 < \varepsilon < 1$. Combining these we see that the probability of a bad configuration is $O(n^{-\varepsilon})$.

[Figure 2: The circumscribing set $H$ and associated regions $H_1, \dots, H_4$ and $A_0, A_1, \dots, A_4$, with the points $P_1, \dots, P_4$ and the side $E$.]

Suppose that $\mathcal{P}$ is a good configuration but not all points within $\log n$ of the boundary of $S_n$ are contained in the giant component. Then there exists a component $F$ with diameter at most $c'\sqrt{\log n}$ containing a vertex within $\log n$ of the boundary of $S_n$. Let $A$ be the collection of tiles that contain a point of $F$. Since the configuration is good, $A$ is a connected subset of $\widehat{G}$ and, as before, the bound on the diameter of $F$ implies that $A$ contains at most $16(c'M)^2$ tiles. This time at most one side of $S_n$ has any tiles of $A$ within $c'\sqrt{\log n}$ of it.

The following lemma, which is similar to Lemma 7, bounds the probability of such a small component.

Lemma 8. Suppose $A$ is a connected subset of $\widehat{G}$ such that at most one side of $S_n$ has any tiles in $A$ within $c'\sqrt{\log n}$ of it.
The probability that the configuration is good and that $G$ has a small component contained entirely inside $A$ which meets every square of $A$ is at most $6.3^{-k}$.

Remark. Obviously this lemma is only of interest for sets $A$ near the boundary, since otherwise Lemma 7 is stronger.

Proof. The proof divides into the same three steps as Lemma 7.

Step 1: Defining the regions. As before suppose that $F$ is a component of $G$ meeting every tile in $A$. Let $E$ be the (almost surely unique) side of $S_n$ closest to $F$. This time let $H$ be the region bounded by the four interior sides of the circumscribed hexagon of the points of $F$ obtained by taking four of the tangents to the convex hull of $F$ at angles $90^\circ$ and $\pm 30^\circ$ to $E$, together with $E$ as in Figure 2. Let $H_1, \dots, H_4$ be the regions bounded by the exterior angle bisectors of $H$ and $E$. Let $P_1, \dots, P_4$ be the points of $F$ on these tangents, and let $D_1, \dots, D_4$ denote the $k$-nearest neighbour discs of $P_1, \dots, P_4$. For $1 \le i \le 4$ let $A_i = D_i \cap H_i$. Let $A_0$ be the set $D_i \cap H$ with the smallest area and write $A'$ for the set $A_0 \cap A$. Exactly as before we see that for $1 \le i \le 4$ the set $A_i$ is empty, and that $A'$ must contain at least $k+1$ points of $\mathcal{P}$.

As before let $P \in F$ and $Q \in G \setminus F$ be vertices minimising the distance between $F$ and $G \setminus F$, $r_0 = d(P, Q)$ and $r = r_0 - \sqrt{2}s$. Again, since $F$ meets every tile of $A$ we see that $A_{(r)} \setminus A$ must be empty. Also, as before, the set $B = (D(Q, r_0) \setminus D(P, r_0)) \cap S_n$ must contain at least $k$ points.

Step 2: Bounding the area of the regions. In this step we assume the configuration is good. First we bound $\left|\bigcup_{i=1}^{4} A_i\right|$. Similarly to before we see that each disc $D_i$ has radius at most $c'\sqrt{\log n}$ so meets no side of $S_n$ apart from possibly $E$. Thus, we have $|D_i \cap H_i| \ge |D_i \cap H|$ for each $1 \le i \le 4$, so we see that $|A_i| \ge |A_0|$.
As before the $H_i$ and therefore the $A_i$ are disjoint so

$$\left|\bigcup_{i=1}^{4} A_i\right| \ge 4|A_0| \ge 4|A'|.$$

As before let $x = r/\sqrt{|A'|/\pi}$ and exactly as in the proof of Lemma 7 we have $|B| < 0.61x^2|A'|$.

Finally we bound $|A_{(r)} \setminus A|$. Consider the point of $F$ furthest from $E$ and the half-disc of radius $c'\sqrt{\log n}$ about that point facing away from $E$. Since no point of $F$ is within $c'\sqrt{\log n}$ of any side of $S_n$ apart from $E$, this half-disc is entirely inside $S_n$, and so must contain a point of $\mathcal{P}$ (which is obviously not in $F$). Therefore, as before, $r < r_0 \le c'\sqrt{\log n}$. Thus $A_{(r)} \cap S_n = A_{(r)} \cap E^+$ where $E^+$ denotes the half-plane bounded by $E$ that contains $S_n$. This time let $D$ and $D'$ be half-discs of area $|A|$ and $|A'|$ respectively centred on $E$. Then, by the isoperimetric inequality in the half-plane $E^+$ (an easy consequence of the same inequality in the whole plane),

$$|(A_{(r)} \cap E^+) \setminus A| \ge |(D_{(r)} \cap E^+) \setminus D| \ge |(D'_{(r)} \cap E^+) \setminus D'|.$$

Now $D'$ is half a disc of radius $\sqrt{2}\sqrt{|A'|/\pi}$ and $D'_{(r)} \cap E^+$ is half a disc of radius $\sqrt{2}\sqrt{|A'|/\pi} + r = (1 + x/\sqrt{2})\sqrt{2|A'|/\pi}$, so this time we have

$$|(D'_{(r)} \cap E^+) \setminus D'| = ((1 + x/\sqrt{2})^2 - 1)|A'|.$$

Step 3: Bounding the probability of such a configuration. We have seen that if there is such a component $F$ then there exist regions as defined above. These regions are determined by 10 points: the four points defining sides of the hexagonal hull, their four $k$th nearest neighbour points and the points $P$ and $Q$; that is, if there is such a component $F$ then there are 10 points of $\mathcal{P}$ defining regions $A'$, $B$, $A_1, \dots, A_4$ and $A_{(r)}$ with $\#A' \ge k$, $\#B \ge k$, $\#(A' \cap B) = 0$, and both $\#\bigcup_{i=1}^{4} A_i = 0$ and $\#((A_{(r)} \cap S_n) \setminus A) = 0$. Again, if the configuration is good, all these points must lie within $c'\sqrt{\log n}$ of $A$.
Similarly to before, let $Z$ be the event that there are 10 points of $\mathcal{P}$ all within $c'\sqrt{\log n}$ of $A$ defining regions with the above properties. Again

$$\mathbb{P}(\text{there exists } F \text{ and the configuration is good}) \le \mathbb{P}(Z \text{ and the configuration is good}) \le \mathbb{P}(Z)$$

so, as before, we bound $\mathbb{P}(Z)$. Fix a particular collection of 10 points and let $Z'$ be the event that these 10 points witness $Z$. Note that, since we are assuming these 10 points all lie within $c'\sqrt{\log n}$ of $A$, the regions $A'$, $A_1, \dots, A_4$ all lie entirely within $S_n$. By definition, $B$ and $(A_{(r)} \cap S_n) \setminus A$ also lie in $S_n$. Again we apply Lemma 6 to the sets $A'$, $B$ together with each of $\bigcup_{i=1}^{4} A_i$ and $(A_{(r)} \cap S_n) \setminus A$. This time, however, neither bound will be valid for large $x$ so we form a third bound based just on the two sets $A'$ and $(A_{(r)} \cap S_n) \setminus A$.

As before we base the first bound on $\#\bigcup_{i=1}^{4} A_i = 0$. We have $|A'| \le \left|\bigcup_{i=1}^{4} A_i\right|$ and, provided $x < 2.56$, we have $|B| \le 0.61x^2|A'| < 4|A'| \le \left|\bigcup_{i=1}^{4} A_i\right|$ so Lemma 6 implies

$$\mathbb{P}(Z') \le \left(\frac{4|A'||B|}{\left(|A'| + \left|\bigcup_{i=1}^{4} A_i\right| + |B|\right)^2}\right)^{k} \le \left(\frac{4 \cdot 0.61x^2}{(5 + 0.61x^2)^2}\right)^{k}.$$

The second bound, based on $\#((A_{(r)} \cap S_n) \setminus A) = 0$, is also very similar to before. However, this time the middle inequality in

$$|B| \le 0.61x^2|A'| \le ((1 + x/\sqrt{2})^2 - 1)|A'| \le |(A_{(r)} \cap S_n) \setminus A|$$

is not valid for all $x$, but it is valid for all $x < 12$. Also, provided that $x > 2 - \sqrt{2}$, we have $|(A_{(r)} \cap S_n) \setminus A| \ge |A'|$ so for $2 - \sqrt{2} < x < 12$ the conditions of Lemma 6 are satisfied. Thus

$$\mathbb{P}(Z') \le \left(\frac{4|A'||B|}{(|A'| + |(A_{(r)} \cap S_n) \setminus A| + |B|)^2}\right)^{k} \le \left(\frac{4 \cdot 0.61x^2}{((1 + x/\sqrt{2})^2 + 0.61x^2)^2}\right)^{k}.$$

Since neither bound applies for large $x$ we form a third bound based on the two sets $A'$ and $(A_{(r)} \cap S_n) \setminus A$. We know $A'$ contains at least $k$ points and $(A_{(r)} \cap S_n) \setminus A$ is empty.
This has probability at most

$$\mathbb{P}(Z') \le \left(\frac{|A'|}{|A'| + |(A_{(r)} \cap S_n) \setminus A|}\right)^{k} \le \frac{1}{(1 + x/\sqrt{2})^{2k}},$$

which is less than $80^{-k}$ for all $x \ge 12$.

As before the maximum of the minimum of the first two bounds occurs when they are equal at $x = \sqrt{2}(\sqrt{5} - 1)$; at this point they are $\alpha^{-k}$ for some $\alpha > 6.3$. Moreover the third bound is tiny in comparison. Thus, in all cases, $\mathbb{P}(Z') \le \alpha^{-k}$ for some $\alpha > 6.3$.

Since all 10 points must lie within $c'\sqrt{\log n}$ of $A$ there are $O((\log n)^{10})$ ways of choosing them. Hence, similarly to before, the expected number of 10-point sets for which $Z'$ occurs is $O((\log n)^{10} \alpha^{-k}) = O(6.3^{-k})$. Hence $\mathbb{P}(Z) = O(6.3^{-k})$ and the proof of the lemma is complete.

The remainder of the proof is very similar to before. There are only a constant number of connected sets of $\widehat{G}$ of size at most $16(c'M)^2$ which contain a fixed tile, and therefore $O(\sqrt{n} \log n)$ such sets which contain a tile within distance $\log n$ of the boundary of $S_n$. Since $k > 0.272 \log n > \frac{1 + \varepsilon'}{\log 6.3} \log(\sqrt{n})$ for some $\varepsilon' > 0$ the expected number of small components of $G$ that contain a vertex within distance $\log n$ of the boundary of $S_n$ when the configuration is good is $O(\sqrt{n} \log n \cdot 6.3^{-k}) = o(n^{-\varepsilon'/2})$. Let $\varepsilon = \min(\varepsilon'/2, 1)$ and let $p$ be the probability that there exists a point $P$ within $\log n$ of the boundary of $S_n$ that is not in the giant component. Then

$$p \le \mathbb{P}(\text{there is a small boundary component and } \mathcal{P} \text{ is good}) + \mathbb{P}(\mathcal{P} \text{ is bad}) = o(n^{-\varepsilon}) + O(n^{-\varepsilon}) = O(n^{-\varepsilon})$$

as claimed.

Open Questions

In this paper we have proved two results about the behaviour of the small components in the graph $G_{n,k}$. However, several questions about their properties remain open. We are interested in the behaviour near the connectivity threshold so, in particular, we assume in the following questions that $k$ is at least $0.3 \log n$.

Question 1. Must the small components of $G_{n,k}$ be isolated? More precisely, is it the case that, whp, there do not exist two small components within distance $O(\sqrt{\log n})$ of each other?

Since the first draft of this paper Falgas-Ravry [4] has answered this question in the affirmative provided that the probability that $G$ is connected is not too small: more precisely he proves it whenever $\mathbb{P}(G \text{ is connected}) = \Omega(n^{\gamma})$ (where $\gamma$ is an absolute constant).

Question 2. How many vertices do small components contain?

It is immediate from Lemma 6 of [2] (quoted as Lemma 3 of this paper) that all small components contain $O(k)$ vertices. If the lower bound construction of Balister, Bollobás, Sarkar and Walters in [2] is extremal then, as the authors remark there, all small components would contain $k + O(1)$ vertices.

Question 3. Are all the small components convex in the sense that all points of $\mathcal{P}$ within the convex hull of a small component are actually part of the small component?

References

[1] N. Alon and J. H. Spencer. The probabilistic method. Wiley-Interscience Series in Discrete Mathematics and Optimization. John Wiley & Sons Inc., Hoboken, NJ, third edition, 2008.

[2] P. Balister, B. Bollobás, A. Sarkar, and M. Walters. Connectivity of random k-nearest-neighbour graphs. Adv. in Appl. Probab., 37(1):1-24, 2005.

[3] P. Balister, B. Bollobás, A. Sarkar, and M. Walters. A critical constant for the k-nearest-neighbour model. Adv. in Appl. Probab., 41(1):1-12, 2009.

[4] V. Falgas-Ravry. On the distribution of small components in the k-nearest neighbours random geometric graph model. Preprint.

[5] F. Xue and P. R. Kumar. The number of neighbors needed for connectivity of wireless networks. Wireless Networks, 10:169-181, 2004.
