Cubefree words with many squares

James Currie and Narad Rampersad

Abstract. We construct infinite cubefree binary words containing exponentially many distinct squares of length $n$. We also show that for every positive integer $n$, there is a cubefree binary square of length $2n$.

1. Introduction

A square is a non-empty word of the form $xx$, and a cube is a non-empty word of the form $xxx$. An overlap is a word of the form $axaxa$, where $a$ is a letter and $x$ is a word (possibly empty). A word is squarefree (resp. cubefree, overlap-free) if none of its factors are squares (resp. cubes, overlaps). For further background material concerning combinatorics on words we refer the reader to [2].

It is well known that there exist infinite squarefree words over a ternary alphabet and infinite overlap-free words over a binary alphabet. Clearly, any overlap-free word is also cubefree. Any infinite cubefree binary word must contain squares; however, Dekking [8] proved that there exists an infinite cubefree binary word containing no squares $xx$ where the length of $x$ is greater than 3 (see also [13, 14]). In this paper we consider instead the existence of infinite cubefree binary words with many distinct squares.

Most known constructions of infinite cubefree words involve the iteration of a morphism. Words constructed in this manner are often referred to as infinite D0L words. Ehrenfeucht and Rozenberg [9, 10, 11] proved several results concerning the factor complexity of infinite D0L words. They showed that any squarefree or cubefree D0L word has $O(n \log n)$ factors of length $n$. Thus, an infinite cubefree D0L word cannot have exponentially many distinct square factors of length $n$. By contrast, we show here how to construct infinite cubefree binary words containing exponentially many distinct squares of length $n$. Other work related to the problems considered here includes [1, 6, 7].
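The definitions of square, cube and overlap given at the start of this section translate directly into brute-force tests on finite words. The following Python sketch is an illustration only and is not part of the paper; the function names (`has_power`, `is_squarefree`, `is_cubefree`, `is_overlap_free`) are hypothetical, and words are assumed to be given as Python strings.

```python
def has_power(word: str, exponent: int) -> bool:
    """Return True if `word` has a factor x^exponent with x non-empty."""
    n = len(word)
    for p in range(1, n // exponent + 1):          # p = |x|
        for i in range(n - exponent * p + 1):      # start of the candidate factor
            if word[i:i + exponent * p] == word[i:i + p] * exponent:
                return True
    return False


def is_squarefree(word: str) -> bool:
    """No factor of the form xx."""
    return not has_power(word, 2)


def is_cubefree(word: str) -> bool:
    """No factor of the form xxx."""
    return not has_power(word, 3)


def is_overlap_free(word: str) -> bool:
    """No factor of the form axaxa, i.e. no factor of length 2p + 1 with period p."""
    n = len(word)
    for p in range(1, (n - 1) // 2 + 1):           # p = |ax|
        for i in range(n - 2 * p):                 # start of the candidate factor
            if all(word[j] == word[j + p] for j in range(i, i + p + 1)):
                return False
    return True


if __name__ == "__main__":
    prefix = "0110100110010110"   # a prefix of the Thue-Morse word
    print(is_overlap_free(prefix), is_cubefree(prefix), is_squarefree(prefix))
```

These tests take time polynomial in the word length, which is adequate for the short words that appear in the case analyses of Section 2.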
Let $\mu$ denote the Thue–Morse morphism, i.e., the morphism that maps $0 \to 01$ and $1 \to 10$. The Thue–Morse word is the infinite word

    $t = 011010011001011010010110 \cdots$

obtained by iteratively applying $\mu$ to the word $0$. The Thue–Morse word is well known to be overlap-free, and hence, a fortiori, cubefree [16]. The squares occurring in the Thue–Morse word were characterized by Pansiot [12] and Brlek [4] as follows. Define sets

    $A = \{00, 11, 010010, 101101\}$  and  $\mathcal{A} = \bigcup_{k \ge 0} \mu^k(A)$.

The set $\mathcal{A}$ is the set of squares appearing in the Thue–Morse word. Shelton and Soni [15] characterized the overlap-free squares (the result is also attributed to Thue by Berstel [3]) as being the conjugates of the words in $\mathcal{A}$. (A conjugate of $x$ is a word $y$ such that $x = uv$ and $y = vu$ for some $u, v$.) Currie and Rampersad [6] showed that the conjugates of the words in $\mathcal{A}$ are also precisely the $7/3$-power-free squares. Thus, $7/3$-power-free squares of length $2n$ exist only when $n$ is a power of 2 or 3 times a power of 2. By contrast, we show that there are cubefree binary squares of length $2n$ for every positive integer $n$. We use this result to construct infinite cubefree binary words containing exponentially many distinct squares.

Date: October 18, 2018. 2000 Mathematics Subject Classification: 68R15. The first author is supported by an NSERC Discovery Grant. The second author is supported by an NSERC Postdoctoral Fellowship.

2. Main results

The main results of this paper are the following two theorems.

Theorem 1. Let $n$ be a positive integer. There exists a cubefree binary square of length $2n$.

Theorem 2. There exists an infinite cubefree binary word containing exponentially many distinct squares of length $n$.

We first establish some preliminary results.

Lemma 3.
The Thue–Morse word contains a factor of the form $x = 1001x'' = x'1001$ of every positive even length $n \ne 2, 6$.

Proof. Aberkane and Currie [1, Lemma 4] proved that for every integer $m \ge 6$, the Thue–Morse word contains a factor of length $m$ of the form $10y10$. Then the Thue–Morse word also contains the factor $\mu(10y10) = 1001\mu(y)1001$, which has length $2m$. Finally, we observe that $10011001$ and $1001101001$ are factors of the Thue–Morse word of lengths 8 and 10 respectively. □

Lemma 4. If $y$ is overlap-free and $ayb$ is a cube of period $p$, then $p \le |ab|$.

Proof. Otherwise deleting $a$ and $b$ removes less than a full period from $ayb$, leaving an overlap. □

Lemma 5. If $z$ is a factor of $yyy$ where $|y| = p$ and $|z| \le p + 1$, then there are two occurrences of $z$ in $yyy$.

Proof. Certainly if $z$ is a factor of $yy$ it occurs twice in $yyy$. If $z$ is a factor of $yyy$ but not of $yy$, then $z$ must span the central $y$ of $yyy$ and a bit more on both ends, giving $z$ a length of $p + 2$ or more. □

Theorem 6. Let $x$ be a factor of the Thue–Morse word of the form $x = 1001x'' = x'1001$. Then the word $x0x0$ is cubefree.

Remark 1. Word $01010$ occurs exactly once in $x0x0$. (Note that this word is an overlap, and hence not a factor of the Thue–Morse word.)

Proof of Theorem 6. Suppose $yyy$ is a cube in $x0x0$ with $|y| = p > 0$.

Case 1: Period $p \ge 4$. By Lemma 5 and Remark 1, word $01010$ is not a factor of $yyy$. We have two possibilities:

Case 1a: Cube $yyy$ is a factor of $x'100101$. This is impossible by Lemma 4, since $x'1001$ is overlap-free, $|01| = 2$, and $p \ge 4 > 2$.

Case 1b: Cube $yyy$ is a factor of $101001x''0$. This is again impossible by Lemma 4, since $1001x''$ is overlap-free.

Case 2: Period $p \le 3$. If $01010$ is a factor of $yyy$, then one of $001010$ and $010100$ is a factor.
However, neither of these has period 1, 2 or 3; this is impossible. We conclude that $01010$ is not a factor of $yyy$. This gives a similar case breakdown as in Case 1.

Case 2a: Cube $yyy$ is a factor of $x'100101$.

Case 2ai: Cube $yyy$ is a suffix of $x'100101$. In this case, $p \le 2$ by Lemma 4, since $x'1001$ is overlap-free. However, the longest suffix of $x'100101$ of period 1 or 2 is $0101$, which is cubefree.

Case 2aii: Cube $yyy$ is a suffix of $x'10010$. This forces $p = 1$, which is impossible.

Case 2b: Cube $yyy$ is a factor of $101001x''0$.

Case 2bi: Cube $yyy$ is a prefix of $101001x''0$ or of $01001x''0$. Since $|yyy| = 3p \le 9 \le |01001x''|$, $yyy$ is a factor of $101001x''$. This is symmetrical to Case 2a.

Case 2bii: Cube $yyy$ is a factor of $1001x''0 = x0$. This is impossible by Case 2a. □

Theorem 7. Let $x$ be a factor of the Thue–Morse word of the form $x = 1001x'' = x'1001$. Then the word $x101100x101100$ is cubefree.

Remark 2. Word $00100$ occurs exactly once in $x101100x101100$. Word $11011$ occurs exactly twice.

Proof of Theorem 7. Suppose $yyy$ is a cube in $x101100x101100$ with $|y| = p > 0$.

Case 1: Period $p \ge 4$. By Lemma 5 and Remark 2, word $00100$ is not a factor of $yyy$. We have two possibilities:

Case 1a: Cube $yyy$ is a factor of $x10110010$. Word $x10110010$ contains $11011$ as a factor exactly once. By Lemma 5 and Remark 2, there are two possibilities:

Case 1ai: Cube $yyy$ is contained in $x101$. In this case, $p \le 3$ by Lemma 4, since $x$ is overlap-free. This is a contradiction.

Case 1aii: Cube $yyy$ is contained in $10110010$. This is clearly impossible.

Case 1b: Cube $yyy$ is a factor of $0x101100$. Again, word $0x101100$ contains $11011$ as a factor exactly once. Therefore, $yyy$ is contained either in $101100$ or in $0x101$. The first alternative is evidently impossible, while the second is ruled out by Lemma 4.
Case 2: Period $p \le 3$. If $00100$ is a factor of $yyy$, then we must have $p = 3$, since $00100$ does not have period 1 or 2. However, in $x101100x101100$, the maximal factor of period 3 containing $00100$ is $1001001$, which is not a cube. We conclude that $00100$ is not a factor of $yyy$. This gives a similar case breakdown to Case 1:

Case 2a: Cube $yyy$ is a factor of $x10110010$. By Lemma 4 the word $x10$ must be cubefree. Therefore, $yyy$ must be a suffix of one of these words:

    $w_8 = x'100110110010$
    $w_7 = x'10011011001$
    $w_6 = x'1001101100$
    $w_5 = x'100110110$
    $w_4 = x'10011011$
    $w_3 = x'1001101$

None of the $w_n$ ends in a cube of period 1, 2 or 3. (In the case of words $w_4$, $w_3$, the longest suffixes of period 3 have lengths 6 and 5 respectively.) It follows that $yyy$ is not a suffix of any of the $w_n$, and this case does not occur.

Case 2b: Cube $yyy$ is a factor of $0x101100$. Since $|yyy| = 3p \le 9 \le |0x|$, $yyy$ is a factor of $0x$ or of $x101100$. The first possibility was ruled out in Theorem 6, and the second in Case 2a. □

Theorems 6 and 7 together establish Theorem 1. Next we show that the number of cubefree binary squares of length $n$ grows exponentially.

Proposition 8. There exist exponentially many cubefree binary squares of length $n$.

Proof. Let $m$ be a positive integer and let $xx$ be a cubefree binary square of length $2m$ over $\{0, 1\}$. Suppose that 0 occurs at least as often as 1 in $x$. Construct a new cubefree square $yy$ over $\{0, 1, 2\}$, where $y$ is obtained from $x$ by arbitrarily replacing some of the 0's in $x$ by 2's. There are at least $2^{m/2}$ such squares $yy$ of length $2m$. Let $h$ be the morphism

    $0 \to 001011$
    $1 \to 001101$
    $2 \to 011001$.

Brandenburg [5, Theorem 6] showed that $h$ maps cubefree words to cubefree words. Moreover, since $h$ is uniform and injective, the set of words $h(yy)$ consists of at least $2^{m/2}$ cubefree squares of length $12m$.
Asymptotically, we thus have exponentially many cubefree binary squares of length $n$, as required. □

We now prove Theorem 2.

Proof of Theorem 2. In the proof of Proposition 8 we showed that there are at least $2^{m/2}$ cubefree binary squares of length $12m$ for every positive integer $m$. Let $S$ therefore be any set of cubefree squares over $\{0, 1\}$ where $S$ contains at least $2^{m/2}$ words of length $12m$ for every positive integer $m$. Let $x = x_1 x_2 \cdots$ be any infinite cubefree binary word over $\{2, 3\}$. Construct a word $w = x_1 S_1 x_2 S_2 \cdots$, where the set of $S_i$'s is equal to the set $S$, so that $w$ is cubefree and contains exponentially many distinct squares of length $n$. Let $g$ be the morphism

    $0 \to 001001101$
    $1 \to 001010011$
    $2 \to 001101011$
    $3 \to 011001011$.

Brandenburg [5, Theorem 6] showed that $g$ maps cubefree words to cubefree words. Thus, $g(w)$ is cubefree and, by the uniformity and injectivity of $g$, contains exponentially many distinct squares of length $n$. □

Note that Theorem 2 implies the existence of an infinite cubefree binary word with exponential factor complexity, i.e., with exponentially many factors of length $n$. Similarly, one can easily construct an infinite squarefree word over $\{0, 1, 2\}$ with exponential factor complexity.

Proposition 9. There exists an infinite squarefree word over $\{0, 1, 2\}$ with exponential factor complexity.

Proof. Let $w$ be any infinite squarefree word over $\{0, 1, 2\}$ and let $x$ be any infinite word over $\{3, 4\}$ with $2^n$ factors of length $n$ for every positive $n$. Let $y$ be the word obtained by forming the perfect shuffle of $w$ and $x$: that is, if $w = w_0 w_1 w_2 \cdots$ and $x = x_0 x_1 x_2 \cdots$, then define $y = w_0 x_0 w_1 x_1 w_2 x_2 \cdots$. Clearly, $y$ is a squarefree word with exponential factor complexity.
Let $f$ be the morphism

    $0 \to 010201202101210212$
    $1 \to 010201202102010212$
    $2 \to 010201202120121012$
    $3 \to 010201210201021012$
    $4 \to 010201210212021012$.

Brandenburg [5, Theorem 4] showed that $f$ maps squarefree words to squarefree words. The uniformity and injectivity of $f$ imply that $f(y)$ is a squarefree word with exponential factor complexity, as required. □

References

[1] A. Aberkane, J. Currie, "There exist binary circular $5/2^+$ power free words of every length", Electron. J. Combinatorics 11 (2004), #R10.
[2] J.-P. Allouche, J. Shallit, Automatic Sequences: Theory, Applications, Generalizations, Cambridge, 2003.
[3] J. Berstel, "Axel Thue's work on repetitions in words". In P. Leroux, C. Reutenauer, eds., Séries formelles et combinatoire algébrique, Publications du LaCIM, pp. 65–80, UQAM, 1992.
[4] S. Brlek, "Enumeration of factors in the Thue–Morse word", Discrete Appl. Math. 24 (1989), 83–96.
[5] F.-J. Brandenburg, "Uniformly growing $k$th power-free homomorphisms", Theoret. Comput. Sci. 23 (1983), 69–82.
[6] J. Currie, N. Rampersad, "Infinite words containing squares at every position". In Proceedings of Journées Montoises D'Informatique Théorique 2008.
[7] J. Currie, N. Rampersad, J. Shallit, "Binary words containing infinitely many overlaps", Electron. J. Combinatorics 13 (2006), #R82.
[8] F. M. Dekking, "On repetitions of blocks in binary sequences", J. Combin. Theory. Ser. A 20 (1976), 292–299.
[9] A. Ehrenfeucht, G. Rozenberg, "On the subword complexity of square-free DOL languages", Theoret. Comput. Sci. 16 (1981), 25–32.
[10] A. Ehrenfeucht, G. Rozenberg, "On the subword complexity of $m$-free DOL languages", Inform. Process. Lett. 17 (1983), 121–124.
[11] A. Ehrenfeucht, G. Rozenberg, "On the size of the alphabet and the subword complexity of square-free DOL languages", Semigroup Forum 26 (1983), 215–223.
[12] J.-J. Pansiot, "The Morse sequence and iterated morphisms", Inform. Process. Lett. 12 (1981), 68–70.
[13] N. Rampersad, J. Shallit, M.-w. Wang, "Avoiding large squares in infinite binary words", Theoret. Comput. Sci. 339 (2005), 19–34.
[14] J. Shallit, "Simultaneous avoidance of large squares and fractional powers in infinite binary words", Int'l. J. Found. Comput. Sci. 15 (2004), 317–327.
[15] R. Shelton, R. Soni, "Chains and fixing blocks in irreducible sequences", Discrete Math. 54 (1985), 93–99.
[16] A. Thue, "Über die gegenseitige Lage gleicher Teile gewisser Zeichenreihen", Kra. Vidensk. Selsk. Skrifter. I. Math. Nat. Kl. 1 (1912), 1–67.

Department of Mathematics and Statistics, University of Winnipeg, 515 Portage Avenue, Winnipeg, Manitoba R3B 2E9 (Canada)
E-mail address: {j.currie,n.rampersad}@uwinnipeg.ca
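Computational aside. Theorems 6 and 7 can be spot-checked on small instances. The Python sketch below is not part of the paper's argument. It collects the factors $x$ of a finite Thue–Morse prefix that begin and end with $1001$, then tests whether $x0x0$ and $x101100x101100$ are cubefree. The function names, the prefix length 256 and the length bound 20 are illustrative assumptions; the prefix is generated using the standard fact that the $i$-th letter of the Thue–Morse word is the parity of the number of 1's in the binary expansion of $i$.

```python
def thue_morse_prefix(length: int) -> str:
    """First `length` letters of the Thue-Morse word (bit-count parity of i)."""
    return "".join(str(bin(i).count("1") % 2) for i in range(length))


def is_cubefree(word: str) -> bool:
    """Brute-force test: no factor of the form yyy with y non-empty."""
    n = len(word)
    for p in range(1, n // 3 + 1):
        for i in range(n - 3 * p + 1):
            if word[i:i + 3 * p] == word[i:i + p] * 3:
                return False
    return True


def bordered_factors(text: str, max_len: int) -> set:
    """Factors x of `text`, 4 <= |x| <= max_len, that begin and end with 1001."""
    found = set()
    for i in range(len(text)):
        for n in range(4, max_len + 1):
            x = text[i:i + n]
            if len(x) == n and x.startswith("1001") and x.endswith("1001"):
                found.add(x)
    return found


def check_theorems_6_and_7(prefix_len: int = 256, max_len: int = 20) -> list:
    """Check both constructions on every bordered factor found in the prefix.

    Returns the sorted list of lengths |x| that were examined.
    """
    xs = bordered_factors(thue_morse_prefix(prefix_len), max_len)
    for x in xs:
        assert is_cubefree(x + "0" + x + "0"), x            # Theorem 6
        assert is_cubefree(x + "101100" + x + "101100"), x  # Theorem 7
    return sorted({len(x) for x in xs})


if __name__ == "__main__":
    print(check_theorems_6_and_7())
```

Such a finite check only illustrates the statements on short factors; it does not replace the case analyses in the proofs.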
