A connection between palindromic and factor complexity using return words

In this paper we prove that for any infinite word W whose set of factors is closed under reversal, the following conditions are equivalent: (I) all complete returns to palindromes are palindromes; (II) P(n) + P(n+1) = C(n+1) - C(n) + 2 for all n,…

Authors: Michelangelo Bucci, Alessandro De Luca, Amy Glen, Luca Q. Zamboni

Michelangelo Bucci ∗, Alessandro De Luca †, Amy Glen ‡, Luca Q. Zamboni §

Submitted: February 9, 2008; Accepted: March 25, 2008; Revised: April 10, 2008

∗ Dipartimento di Matematica e Applicazioni "R. Caccioppoli", Università degli Studi di Napoli Federico II, Via Cintia, Monte S. Angelo, I-80126, Napoli, ITALY (micbucci@unina.it).
† Dipartimento di Matematica e Applicazioni "R. Caccioppoli", Università degli Studi di Napoli Federico II, Via Cintia, Monte S. Angelo, I-80126, Napoli, ITALY (alessandro.deluca@unina.it).
‡ Corresponding author: LaCIM, Université du Québec à Montréal, C.P. 8888, succursale Centre-ville, Montréal, Québec, H3C 3P8, CANADA (amy.glen@gmail.com). Supported by CRM, ISM, and LaCIM.
§ Institut Camille Jordan, Université Claude Bernard Lyon 1, 43 boulevard du 11 novembre 1918, 69622 Villeurbanne Cedex FRANCE (luca@unt.edu).

Abstract

In this paper we prove that for any infinite word $\mathbf{w}$ whose set of factors is closed under reversal, the following conditions are equivalent: (I) all complete returns to palindromes are palindromes; (II) $P(n) + P(n+1) = C(n+1) - C(n) + 2$ for all $n$, where $P$ (resp. $C$) denotes the palindromic complexity (resp. factor complexity) function of $\mathbf{w}$, which counts the number of distinct palindromic factors (resp. factors) of each length in $\mathbf{w}$.

Keywords: return word; palindrome; palindromic complexity; factor complexity; Rauzy graph; rich word.
MSC (2000): 68R15.

1 Introduction

Given an infinite word $\mathbf{w}$, let $P(n)$ (resp. $C(n)$) denote the palindromic complexity (resp. factor complexity) of $\mathbf{w}$, i.e., the number of distinct palindromic factors (resp. factors) of $\mathbf{w}$ of length $n$. In [1], J.-P. Allouche, M. Baake, J. Cassaigne, and D. Damanik established the following inequality relating the palindromic and factor complexities of a non-ultimately periodic infinite word:

$$P(n) \le \frac{16}{n}\, C\!\left(n + \left\lfloor \frac{n}{4} \right\rfloor\right) \quad \text{for all } n \in \mathbb{N}.$$

More recently, using Rauzy graphs, P. Baláži, Z. Masáková, and E. Pelantová [5] proved that for any uniformly recurrent infinite word whose set of factors is closed under reversal,

$$P(n) + P(n+1) \le C(n+1) - C(n) + 2 \quad \text{for all } n \in \mathbb{N}. \tag{1.1}$$

They also provided several examples of infinite words for which $P(n) + P(n+1)$ always reaches the upper bound given in relation (1.1). Such infinite words include Arnoux-Rauzy sequences, complementation-symmetric sequences, certain words associated with $\beta$-expansions where $\beta$ is a simple Parry number, and a class of words coding $r$-interval exchange transformations.

In this paper we give a characterization of all infinite words with factors closed under reversal for which the equality $P(n) + P(n+1) = C(n+1) - C(n) + 2$ holds for all $n$: these are exactly the infinite words with the property that all 'complete returns' to palindromes are palindromes.

Given a finite or infinite word $w$ and a factor $u$ of $w$, we say that a factor $r$ of $w$ is a complete return to $u$ in $w$ if $r$ contains exactly two occurrences of $u$, one as a prefix and one as a suffix. Return words play an important role in the study of minimal subshifts; see [12, 13, 14, 15, 20, 24].

Our main theorem is the following:

Theorem 1.1. For any infinite word $\mathbf{w}$ whose set of factors is closed under reversal, the following conditions are equivalent:
(I) all complete returns to any palindromic factor of $\mathbf{w}$ are palindromes;
(II) $P(n) + P(n+1) = C(n+1) - C(n) + 2$ for all $n \in \mathbb{N}$.

Recently, in [19], it was shown that property (I) is equivalent to every factor $u$ of $\mathbf{w}$ having exactly $|u| + 1$ distinct palindromic factors (including the empty word).
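The two equivalent properties just mentioned can be tested by brute force on short finite words. The following Python sketch is only an illustration: the helper names (`palindromic_factors`, `complete_returns`, `has_property_I`, `has_full_palindrome_count`) are hypothetical and do not come from the paper, and the enumeration is far too slow for long words.

```python
def factors(word):
    """All distinct non-empty factors (contiguous substrings) of a finite word."""
    return {word[i:j] for i in range(len(word)) for j in range(i + 1, len(word) + 1)}


def palindromic_factors(word):
    """Distinct palindromic factors of `word`, including the empty word."""
    return {""} | {f for f in factors(word) if f == f[::-1]}


def complete_returns(word, u):
    """Complete returns to a non-empty factor u: factors of `word` containing
    exactly two occurrences of u, one as a prefix and one as a suffix."""
    starts = [i for i in range(len(word) - len(u) + 1) if word.startswith(u, i)]
    # Each pair of consecutive occurrences of u delimits one complete return.
    return {word[s:t + len(u)] for s, t in zip(starts, starts[1:])}


def has_property_I(word):
    """Property (I): every complete return to a palindromic factor is a palindrome."""
    return all(r == r[::-1]
               for p in palindromic_factors(word) if p
               for r in complete_returns(word, p))


def has_full_palindrome_count(word):
    """Every factor u of `word` has exactly |u| + 1 distinct palindromic factors."""
    return all(len(palindromic_factors(u)) == len(u) + 1 for u in factors(word))


for w in ["abaab", "abca"]:
    print(w, has_property_I(w), has_full_palindrome_count(w))
```

On these two inputs the checks agree: `abaab` passes both, while `abca` fails both, since `abca` is itself a non-palindromic complete return to the palindrome `a`.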
Such words are 'rich' in palindromes in the sense that they contain the maximum number of different palindromic factors. Indeed, X. Droubay, J. Justin, and G. Pirillo [10] observed that any finite word $w$ of length $|w|$ contains at most $|w| + 1$ distinct palindromes.

The family of finite and infinite words having property (I) are called rich words in [19]. In independent work, P. Ambrož, C. Frougny, Z. Masáková, and E. Pelantová [2] have considered the same class of words which they call full words, following earlier work of S. Brlek, S. Hamel, M. Nivat, and C. Reutenauer in [6].

Rich words encompass the well-known family of episturmian words originally introduced by X. Droubay, J. Justin, and G. Pirillo in [10] (see Section 4 for more details). Another special class of rich words consists of S. Fischler's sequences with "abundant palindromic prefixes", which were introduced and studied in [16] in relation to Diophantine approximation (see also [17]). Other examples of rich words that are neither episturmian nor of "Fischler type" include: non-recurrent rich words, like $abbbb\cdots$ and $abaabaaabaaaab\cdots$; the periodic rich infinite words $(aab^kaabab)(aab^kaabab)\cdots$, with $k \ge 0$; the non-ultimately periodic recurrent rich infinite word $\psi(\mathbf{f})$ where $\mathbf{f} = abaababaaba\cdots$ is the Fibonacci word and $\psi$ is the morphism $a \mapsto aab^kaabab$, $b \mapsto bab$; and the recurrent, but not uniformly recurrent, rich infinite word generated by the morphism $a \mapsto aba$, $b \mapsto bb$. (See [19] for these examples and more.)

From the work in [10, 19], we have the following equivalences.

Proposition 1.2.
A finite or infinite word $w$ is rich if equivalently:
• all complete returns to any palindromic factor of $w$ are palindromes;
• every factor $u$ of $w$ contains $|u| + 1$ distinct palindromes;
• the longest palindromic suffix of any prefix $p$ of $w$ occurs exactly once in $p$.

From the perspective of richness, our main theorem can be viewed as a characterization of recurrent rich infinite words since any rich infinite word is recurrent if and only if its set of factors is closed under reversal (see [19] or Remark 2.1). Interestingly, the proof of Theorem 1.1 relies upon another new characterization of rich words (Proposition 2.3), which is useful for establishing the key step, namely that the so-called super reduced Rauzy graph is a tree. This answers a claim made in the last few lines of [5] where it was remarked that the Rauzy graphs of words satisfying equality (II) must have a very special form.

After some preliminary definitions and results in the next section, Section 3 is devoted to the proof of Theorem 1.1 and some interesting consequences are proved in Section 4.

2 Preliminaries

2.1 Notation and terminology

In this paper, all words are taken over a finite alphabet $\mathcal{A}$, i.e., a finite non-empty set of symbols called letters. A finite word over $\mathcal{A}$ is a finite sequence of letters from $\mathcal{A}$. The empty word $\varepsilon$ is the empty sequence. A (right) infinite word $\mathbf{x}$ is a sequence indexed by $\mathbb{N}^+$ with values in $\mathcal{A}$, i.e., $\mathbf{x} = x_1 x_2 x_3 \cdots$ with each $x_i \in \mathcal{A}$. For easier reading, infinite words are hereafter typed in boldface to distinguish them from finite words.

Given a finite word $w = x_1 x_2 \cdots x_m$ (where each $x_i$ is a letter), the length of $w$, denoted by $|w|$, is equal to $m$. By convention, the empty word is the unique word of length 0. We denote by $\tilde{w}$ the reversal of $w$, given by $\tilde{w} = x_m \cdots x_2 x_1$.
If $w = \tilde{w}$, then $w$ is called a palindrome.

A finite word $z$ is a factor of a finite or infinite word $w$ if $w = uzv$ for some words $u$, $v$. In the special case $u = \varepsilon$ (resp. $v = \varepsilon$), we call $z$ a prefix (resp. suffix) of $w$. If $u \ne \varepsilon$ and $v \ne \varepsilon$, then we say that $z$ is an interior factor of $w = uzv$. Moreover, $z$ is said to be a central factor of $w$ if $|u| = |v|$. We say that $z$ is unioccurrent in $w$ if $z$ occurs exactly once in $w$.

For any finite or infinite word $w$, the set of all factors of $w$ is denoted by $F(w)$ and we denote by $F_n(w)$ the set of all factors of $w$ of length $n$, i.e., $F_n(w) := F(w) \cap \mathcal{A}^n$ (where $|w| \ge n$ if $w$ is finite). We say that $F(w)$ is closed under reversal if for any $u \in F(w)$, $\tilde{u} \in F(w)$.

A factor of an infinite word $\mathbf{w}$ is recurrent in $\mathbf{w}$ if it occurs infinitely often in $\mathbf{w}$, and $\mathbf{w}$ itself is said to be recurrent if all of its factors are recurrent in it. Furthermore, $\mathbf{w}$ is uniformly recurrent if any factor of $\mathbf{w}$ occurs infinitely many times in $\mathbf{w}$ with bounded gaps.

Remark 2.1. A noteworthy fact (proved in [19]) is that a rich infinite word is recurrent if and only if its set of factors is closed under reversal.

More generally, we have the following well-known result:

Proposition 2.2 (folklore). If $\mathbf{w}$ is an infinite word with $F(\mathbf{w})$ closed under reversal, then $\mathbf{w}$ is recurrent.

Proof. Consider some occurrence of a factor $u$ in $\mathbf{w}$ and let $v$ be a prefix of $\mathbf{w}$ containing $u$. As $F(\mathbf{w})$ is closed under reversal, $\tilde{v} \in F(\mathbf{w})$. Thus, if $v$ is long enough, there is an occurrence of $\tilde{u}$ strictly on the right of this particular occurrence of $u$ in $\mathbf{w}$. Similarly $u$ occurs on the right of this $\tilde{u}$ and thus $u$ is recurrent in $\mathbf{w}$.

2.2 Key results

We now prove two useful results, the first being a new characterization of rich words.

Proposition 2.3.
A finite or infinite word $w$ is rich if and only if, for each factor $v \in F(w)$, any factor of $w$ beginning with $v$ and ending with $\tilde{v}$ and not containing $v$ or $\tilde{v}$ as an interior factor is a palindrome.

Proof. ONLY IF: Consider any factor $v \in F(w)$ and let $u$ be a factor of $w$ beginning with $v$ and ending with $\tilde{v}$ and not containing $v$ or $\tilde{v}$ as an interior factor. If $v$ is a palindrome, then either $u = v = \tilde{v}$ (in which case $u$ is clearly a palindrome), or $u$ is a complete return to $v$ in $w$, and hence $u$ is (again) a palindrome by Proposition 1.2. Now assume that $v$ is not a palindrome. Suppose by way of contradiction that $u$ is not a palindrome and let $p$ be the longest palindromic suffix of $u$ (which is unioccurrent in $u$ by richness). Then $|p| < |u|$ as $u$ is not a palindrome. If $|p| > |v|$, then $\tilde{v}$ is a proper suffix of $p$, and hence $v$ is a proper prefix of $p$. But then $v$ is an interior factor of $u$, a contradiction. On the other hand, if $|p| \le |v|$, then $|p| \ne |v|$ and $p$ is a proper suffix of $\tilde{v}$ (as $\tilde{v}$ is not a palindrome), and hence $p$ is a proper prefix of $v$. Thus $p$ is both a prefix and a suffix of $u$; in particular $p$ is not unioccurrent in $u$, a contradiction.

IF: The given conditions tell us that any complete return to a palindromic factor $v$ ($= \tilde{v}$) of $w$ is a palindrome. Hence $w$ is rich by Proposition 1.2.

Proposition 2.4. Suppose $w$ is a rich word. Then, for any non-palindromic factor $v$ of $w$, $\tilde{v}$ is a unioccurrent factor of any complete return to $v$ in $w$.

Proof. Let $r$ be a complete return to $v$ in $w$ and let $p$ be the longest palindromic suffix of $r$. Then $|p| > |v|$; otherwise, if $|p| \le |v|$, then $p$ would occur at least twice in $r$ (as a suffix of each of the two occurrences of $v$ in $r$), which is impossible as $r$ is rich. Thus $v$ is a proper suffix of $p$, and hence $\tilde{v}$ is a proper prefix of $p$.
So $\tilde{v}$ is clearly an interior factor of $r$. It remains to show that $\tilde{v}$ is unioccurrent in $r$. Arguing by contradiction, we suppose that $\tilde{v}$ occurs more than once in $r$. Then a complete return $r'$ to $\tilde{v}$ occurs as a proper factor of $r$. Using the same reasoning as above, $v$ is an interior factor of $r'$, and hence an interior factor of $r$, contradicting the fact that $r$ is a complete return to $v$. Thus $\tilde{v}$ is unioccurrent in $r$.

Note. The above proposition tells us that for any factor $v$ of a rich word $w$, occurrences of $v$ and $\tilde{v}$ alternate in $w$.

3 Proof of Theorem 1.1

Following the method of Baláži et al. [5], a key tool for the proof of our main theorem is the notion of a Rauzy graph, defined as follows. Given an infinite word $\mathbf{w}$, the Rauzy graph of order $n$ for $\mathbf{w}$, denoted by $\Gamma_n(\mathbf{w})$, is the directed graph with set of vertices $F_n(\mathbf{w})$ and set of edges $F_{n+1}(\mathbf{w})$ such that an edge $e \in F_{n+1}(\mathbf{w})$ starts at a vertex $v$ and ends at a vertex $v'$ if and only if $v$ is a prefix of $e$ and $v'$ is a suffix of $e$. For a vertex $v$, the out-degree of $v$ (denoted by $\deg^+(v)$) is the number of distinct edges leaving $v$, and the in-degree of $v$ (denoted by $\deg^-(v)$) is the number of distinct edges entering $v$. More precisely:

$$\deg^+(v) = \#\{x \in \mathcal{A} \mid vx \in F_{n+1}(\mathbf{w})\} \quad \text{and} \quad \deg^-(v) = \#\{x \in \mathcal{A} \mid xv \in F_{n+1}(\mathbf{w})\}.$$

We observe that, for all $n \in \mathbb{N}$,

$$\sum_{v \in F_n(\mathbf{w})} \deg^+(v) = \#F_{n+1}(\mathbf{w}) = \sum_{v \in F_n(\mathbf{w})} \deg^-(v).$$

(Note that $\#F_{n+1}(\mathbf{w}) = C(n+1)$.) Hence

$$C(n+1) - C(n) = \sum_{v \in F_n(\mathbf{w})} (\deg^+(v) - 1) = \sum_{v \in F_n(\mathbf{w})} (\deg^-(v) - 1). \tag{3.1}$$

It is therefore easy to see that a factor $v \in F_n(\mathbf{w})$ positively contributes to $C(n+1) - C(n)$ if and only if $\deg^+(v) \ge 2$, i.e., if and only if there exist at least two distinct letters $a$, $b$ such that $va$, $vb \in F_{n+1}(\mathbf{w})$, in which case $v$ is said to be a right-special factor of $\mathbf{w}$.
Similarly, a factor $v \in F_n(\mathbf{w})$ is said to be a left-special factor of $\mathbf{w}$ if there exist at least two distinct letters $a$, $b$ such that $av$, $bv \in F_{n+1}(\mathbf{w})$. A factor of $\mathbf{w}$ is said to be special if it is either left-special or right-special (not necessarily both). With this terminology, if we let $S_n(\mathbf{w})$ denote the set of special factors of $\mathbf{w}$ of length $n$, then formula (3.1) may be expressed as:

$$C(n+1) - C(n) = \sum_{v \in S_n(\mathbf{w})} (\deg^+(v) - 1) \quad \text{for all } n \in \mathbb{N}. \tag{3.2}$$

Using similar terminology to that in [5], a directed path $P$ in the Rauzy graph $\Gamma_n(\mathbf{w})$ is said to be a simple path of order $n$ if it begins with a special factor $v$ and ends with a special factor $v'$ and contains no other special factors, i.e., $P$ is a directed path of the form $vv'$ or $vz_1 \cdots z_kv'$ where each $z_i$ is a non-special factor of length $n$. A special factor $v \in S_n(\mathbf{w})$ is called a trivial simple path of order $n$.

In what follows, we use the following terminology for paths. Hereafter, "path" should be taken to mean "directed path".

Definition 3.1. Suppose $\mathbf{w}$ is an infinite word and let $P = v \cdots v'$ be a path in $\Gamma_n(\mathbf{w})$.
• The first vertex $v$ (resp. last vertex $v'$) is called the initial vertex (resp. terminal vertex) of $P$.
• A vertex of $P$ that is neither an initial vertex nor a terminal vertex of $P$ is called an interior vertex of $P$.
• $P$ is said to be a non-trivial path if it consists of at least two distinct vertices.
• The reversal $\tilde{P}$ of the path $P$ is the path obtained from $P$ by reversing all edge labels (and arrows) and all labels of vertices.
• We say that $P$ is palindromic (or that $P$ is invariant under reversal) if $P = \tilde{P}$.

Note. Given a path $P$ in $\Gamma_n(\mathbf{w})$, the reversal of $P$ does not necessarily exist in $\Gamma_n(\mathbf{w})$.
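The definitions of the Rauzy graph, the degrees, and the special factors can be sketched in a few lines of Python. This is a hedged illustration, not part of the paper: it approximates the factor sets of the Fibonacci word by those of a finite prefix (an assumption that is only safe for small $n$ and a sufficiently long prefix), and the function names are hypothetical.

```python
from collections import Counter


def fibonacci_prefix(iterations=12):
    """Prefix of the Fibonacci word, obtained by iterating a -> ab, b -> a on 'a'."""
    w = "a"
    for _ in range(iterations):
        w = "".join("ab" if c == "a" else "a" for c in w)
    return w


def factors(word, n):
    """Set of factors of length n of a finite word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def degrees(word, n):
    """Out- and in-degrees in the Rauzy graph of order n: each factor e of
    length n + 1 is an edge from its prefix e[:-1] to its suffix e[1:]."""
    out_deg, in_deg = Counter(), Counter()
    for e in factors(word, n + 1):
        out_deg[e[:-1]] += 1
        in_deg[e[1:]] += 1
    return out_deg, in_deg


def special_factors(word, n):
    """Factors of length n that are left-special or right-special."""
    out_deg, in_deg = degrees(word, n)
    return {v for v in factors(word, n) if out_deg[v] >= 2 or in_deg[v] >= 2}


f = fibonacci_prefix()
for n in range(1, 6):
    out_deg, _ = degrees(f, n)
    lhs = len(factors(f, n + 1)) - len(factors(f, n))         # C(n+1) - C(n)
    rhs = sum(out_deg[v] - 1 for v in special_factors(f, n))  # right side of (3.2)
    print(n, sorted(special_factors(f, n)), lhs, rhs)
```

For each printed $n$ the two sides of formula (3.2) coincide, and for $n = 2$ the special factors are `ab` and `ba`.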
Suppose $P = w_1 w_2 \cdots w_k$ is a non-trivial path in $\Gamma_n(\mathbf{w})$, and for each $i$ with $1 \le i \le k$, let $a_i$ and $b_i$ denote the respective first and last letters of $w_i$. Then, by the definition of $\Gamma_n(\mathbf{w})$, we have $w_1 b_2 \cdots b_k = a_1 \cdots a_{k-1} w_k$. We call this word the label of the path $P$, denoted by $\ell_P$. Note that the $i$-th shift of $\ell_P := w_1 b_2 \cdots b_k$ begins with $w_{i+1}$ for all $i$ with $1 \le i \le k-1$.

For our purposes, it is convenient to consider the reduced Rauzy graph of order $n$, denoted by $\Gamma'_n(\mathbf{w})$, which is the directed graph obtained from $\Gamma_n(\mathbf{w})$ by replacing each simple path $P = w_1 w_2 \cdots w_{k-1} w_k$ with a directed edge $w_1 \to w_k$ labelled by $\ell_P$. Thus the set of vertices of $\Gamma'_n(\mathbf{w})$ is $S_n(\mathbf{w})$.

For example, consider the (rich) Fibonacci word:

$$\mathbf{f} = abaababaabaababaababaabaababaabaababaababaabaababaaba\cdots$$

which is generated by the Fibonacci morphism $\varphi : a \mapsto ab$, $b \mapsto a$. The reduced Rauzy graph $\Gamma'_2(\mathbf{f})$ consists of the two (special) vertices $ab$, $ba$ and three directed edges: $ab \to ba$, $ba \to ab$, $ba \to ab$ with respective labels $aba$, $baab$, $bab$. (Each label begins with the initial vertex and ends with the terminal vertex of its edge, so both $baab$ and $bab$ label edges from $ba$ to $ab$.)

Lemma 3.2. Let $\mathbf{w}$ be a rich infinite word and suppose $P = w_1 w_2 \cdots w_k$ is a non-trivial path in $\Gamma_n(\mathbf{w})$ with $k \ge 2$. Then the label $\ell_P = w_1 b_2 \cdots b_k$ is a rich word.

Proof. We proceed by induction on the number of vertices $k$ in $P$. The lemma is clearly true for $k = 2$ since $\ell_P = w_1 b_2$ is a factor of $\mathbf{w}$ of length $n + 1$. Now suppose $k \ge 3$ and assume that the label of any path consisting of $k - 1$ vertices is rich. Consider any path consisting of $k$ vertices, namely $P = w_1 w_2 \cdots w_k$, and suppose by way of contradiction that its label $\ell_P = w_1 b_2 \cdots b_k$ is not rich. Then the longest palindromic prefix $p$ of $\ell_P$ occurs more than once in $\ell_P$. Hence there exists a complete return $r$ to $p$ which is a prefix of $\ell_P$.
It follows that $r = \ell_P$, otherwise $r$ would be a factor of the prefix $u := w_1 b_2 \cdots b_{k-1}$ of $\ell_P$, and hence a palindrome since $u$ is rich by the induction hypothesis. But this contradicts the maximality of the palindromic prefix $p$. So $\ell_P$ is a non-palindromic complete return to $p$. Let $q$ be the longest palindromic prefix of $u$ (which is unioccurrent in $u$ by richness). If $|p| > |q|$, then $q$ is a proper prefix of $p$, and hence $q$ occurs more than once in $u$, a contradiction. On the other hand, if $|p| \le |q|$, then $p$ is a prefix of $q$, and hence $p$ is an interior factor of $\ell_P$ (occurring as a suffix of $q$), a contradiction. Thus $\ell_P$ is rich, as required.

The proof of Theorem 1.1 relies upon the following extensions of Propositions 2.3–2.4 to paths.

Lemma 3.3. (Analogue of Proposition 2.3.) Suppose $\mathbf{w}$ is a rich infinite word and let $v$ be any factor of $\mathbf{w}$ of length $n$. If $P = v \cdots \tilde{v}$ is a path from $v$ to $\tilde{v}$ in $\Gamma_n(\mathbf{w})$ that does not contain $v$ or $\tilde{v}$ as an interior vertex, then $P$ is palindromic. This property also holds for paths in $\Gamma'_n(\mathbf{w})$.

Proof. We first observe that if $P$ consists of a single vertex, then $P = v = \tilde{v}$, and hence $P$ is palindromic. Now suppose $P$ is a non-trivial path. If $P = v\tilde{v}$, then $P$ is clearly palindromic. So suppose $P = v z_1 \cdots z_k \tilde{v}$ where the $z_i$ are factors of $\mathbf{w}$ of length $n$. By definition, the label $\ell_P = v b_1 \cdots b_k b_{k+1}$ begins with $v$ and ends with $\tilde{v}$ and contains neither $v$ nor $\tilde{v}$ as an interior factor (otherwise $P$ would contain $v$ or $\tilde{v}$ as an interior vertex, which is not possible). Thus, as $\ell_P$ is rich (by Lemma 3.2), it follows that $\ell_P$ is a palindrome by Proposition 2.3; whence $P$ must be invariant under reversal too. It is easy to see that this property is also true for paths in the reduced Rauzy graph $\Gamma'_n(\mathbf{w})$.

Lemma 3.4. (Analogue of Proposition 2.4.)
Suppose $\mathbf{w}$ is a rich infinite word and let $v$ be any non-palindromic factor of $\mathbf{w}$ of length $n$. If $P = v \cdots v$ is a non-trivial path in $\Gamma_n(\mathbf{w})$ that does not contain $v$ as an interior vertex, then $P$ passes through $\tilde{v}$ exactly once. This property also holds for paths in $\Gamma'_n(\mathbf{w})$.

Note. Of particular usefulness is the fact that any path from $v$ to $v$ must pass through $\tilde{v}$.

Proof. Let us write $P = v z_1 \cdots z_k v$ where the $z_i$ are factors of $\mathbf{w}$ of length $n$. By definition, the label $\ell_P = v b_1 \cdots b_k b_{k+1}$ contains exactly two occurrences of $v$, one as a prefix and one as a suffix (otherwise, if $\ell_P$ contained $v$ as an interior factor, then $v$ would be an interior vertex of $P$, which is not possible). Thus, as $\ell_P$ is rich (by Lemma 3.2), it follows that $\tilde{v}$ is a unioccurrent (interior) factor of $\ell_P$ by Proposition 2.4; whence $P$ passes through $\tilde{v}$ exactly once. It is easy to see that this property is also true for paths in the reduced Rauzy graph $\Gamma'_n(\mathbf{w})$.

3.1 (I) implies (II)

Suppose $\mathbf{w}$ is an infinite word with $F(\mathbf{w})$ closed under reversal and satisfying property (I). Then $\mathbf{w}$ is recurrent by Proposition 2.2 (i.e., $\mathbf{w}$ is a recurrent rich infinite word). Moreover, recurrence implies that for all $n$, the Rauzy graph $\Gamma_n(\mathbf{w})$ is strongly connected, i.e., there exists a directed path from any vertex $v$ to every other vertex $v'$ in $\Gamma_n(\mathbf{w})$.

Fix $n \in \mathbb{N}$ and let us now consider the super reduced Rauzy graph of order $n$, denoted by $\Gamma''_n(\mathbf{w})$, whose set of vertices consists of all $[v] := \{v, \tilde{v}\}$ where $v$ is any special factor of length $n$. Any two distinct vertices $[v]$, $[w]$ (with $v \notin \{w, \tilde{w}\}$) are joined by an undirected edge with label $[\ell_P] := \{\ell_P, \ell_{\tilde{P}}\}$ if $P$ or $\tilde{P}$ is a simple path beginning with $v$ or $\tilde{v}$ and ending with $w$ or $\tilde{w}$.

For example, in the case of the Fibonacci word, $\Gamma''_2(\mathbf{f})$ consists of only one vertex: $[ab]$.
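The passage from the Rauzy graph to the reduced and super reduced Rauzy graphs can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' construction: factor sets are approximated from a finite prefix of the Fibonacci word, $n \ge 1$ is assumed, and the names `simple_path_labels` and `super_reduced` are hypothetical.

```python
def fibonacci_prefix(iterations=12):
    """Prefix of the Fibonacci word, obtained by iterating a -> ab, b -> a on 'a'."""
    w = "a"
    for _ in range(iterations):
        w = "".join("ab" if c == "a" else "a" for c in w)
    return w


def factors(word, n):
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def simple_path_labels(word, n):
    """Special factors of length n, and the labels of all non-trivial simple
    paths of order n (i.e. the edge labels of the reduced Rauzy graph)."""
    fn, fn1 = factors(word, n), factors(word, n + 1)
    succ = {v: sorted(e[1:] for e in fn1 if e[:-1] == v) for v in fn}
    in_deg = {v: sum(1 for e in fn1 if e[1:] == v) for v in fn}
    special = {v for v in fn if len(succ[v]) >= 2 or in_deg[v] >= 2}
    labels = []
    for v in sorted(special):
        for nxt in succ[v]:
            label, cur = v + nxt[-1], nxt
            # A non-special vertex has a single outgoing edge: follow it.
            while cur not in special and succ[cur]:
                cur = succ[cur][0]
                label += cur[-1]
            if cur in special:
                labels.append(label)
    return special, labels


def super_reduced(word, n):
    """Vertices [v] = {v, reversal of v} and undirected edges, each edge being
    identified with the pair {label, reversed label} of a simple path."""
    special, labels = simple_path_labels(word, n)

    def cls(v):
        return frozenset({v, v[::-1]})

    vertices = {cls(v) for v in special}
    edges = {frozenset({lab, lab[::-1]})
             for lab in labels if cls(lab[:n]) != cls(lab[-n:])}
    return vertices, edges


f = fibonacci_prefix()
print(simple_path_labels(f, 2))
print(super_reduced(f, 2))
```

For $n = 2$ the sketch finds the special factors `ab`, `ba` and the simple-path labels `aba`, `baab`, `bab`; the super reduced graph then has the single vertex $\{ab, ba\}$ and no edges, which is trivially a tree.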
In general, the super reduced Rauzy graph consists of more than one vertex and may contain multiple edges between vertices.

Suppose $\Gamma''_n(\mathbf{w})$ consists of $s$ vertices; namely $[v_i]$, $i = 1, \ldots, s$. Since $\Gamma_n(\mathbf{w})$ is strongly connected (by recurrence), $\Gamma''_n(\mathbf{w})$ is connected; thus it contains at least $s - 1$ edges. Now, from Lemma 3.3, we know that if $v$ is a special factor, any simple path from $v$ to $\tilde{v}$ is palindromic (i.e., invariant under reversal). Moreover, by closure under reversal, if there exists a simple path $P$ from a special factor $v$ to a special factor $w$, with $v \notin \{w, \tilde{w}\}$, then there is also a simple path from $\tilde{w}$ to $\tilde{v}$ (namely, the reversal of the path $P$). Neither of these simple paths is palindromic. We thus deduce that there exist at least $2(s-1)$ non-trivial simple paths in the Rauzy graph $\Gamma_n(\mathbf{w})$ that are non-palindromic (i.e., not invariant under reversal).

In fact, we will show that there are exactly $2(s-1)$ non-trivial simple paths of order $n$ that are non-palindromic. Indeed, if this is true then, as each palindromic factor of length $n$ or $n + 1$ is a central factor of a (unique) palindromic simple path of order $n$, we have:

$$P(n) + P(n+1) = \sum_{v \in S_n(\mathbf{w})} \deg^+(v) - 2(s-1) + p \tag{3.3}$$

where, on the right hand side, the first summand is the total number of non-trivial simple paths, the second summand is the number of non-trivial simple paths that are non-palindromic, and $p$ is the number of special palindromes of length $n$ (i.e., the number of trivial simple paths of order $n$ that are palindromic). By observing that the number of special factors of length $n$ is $2s - p$, we can simplify equation (3.3) to obtain the required equality (II) as follows:

$$\begin{aligned}
P(n) + P(n+1) &= \sum_{v \in S_n(\mathbf{w})} \deg^+(v) - (2s - p) + 2 \\
&= \sum_{v \in S_n(\mathbf{w})} (\deg^+(v) - 1) + 2 \\
&= C(n+1) - C(n) + 2 \quad \text{(by (3.2))}.
\end{aligned}$$
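Equality (II) can be checked numerically for small $n$. The sketch below is only a sanity check under assumptions: $P(n)$ and $C(n)$ are estimated from finite prefixes, which is reliable only for small $n$. The Fibonacci word is the rich example from the text; the Thue-Morse word is not discussed in this paper and is brought in here purely as an outside contrast, since its factor set is closed under reversal but the equality fails for it.

```python
def fibonacci_prefix(iterations=12):
    """Prefix of the Fibonacci word (a -> ab, b -> a)."""
    w = "a"
    for _ in range(iterations):
        w = "".join("ab" if c == "a" else "a" for c in w)
    return w


def thue_morse_prefix(iterations=10):
    """Prefix of the Thue-Morse word (a -> ab, b -> ba); an outside example."""
    w = "a"
    for _ in range(iterations):
        w = "".join("ab" if c == "a" else "ba" for c in w)
    return w


def factors(word, n):
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def C(word, n):
    """Factor complexity, estimated from a finite prefix."""
    return len(factors(word, n))


def P(word, n):
    """Palindromic complexity, estimated from a finite prefix."""
    return sum(1 for u in factors(word, n) if u == u[::-1])


def gap(word, n):
    """C(n+1) - C(n) + 2 - (P(n) + P(n+1)); zero exactly when (II) holds at n."""
    return C(word, n + 1) - C(word, n) + 2 - (P(word, n) + P(word, n + 1))


f, t = fibonacci_prefix(), thue_morse_prefix()
print([gap(f, n) for n in range(8)])  # Fibonacci: equality (II) for each n tested
print([gap(t, n) for n in range(8)])  # Thue-Morse: a positive gap appears
```

A gap of zero for every tested $n$ is consistent with richness of the Fibonacci word, while a positive gap (first seen at $n = 3$ for Thue-Morse) means that inequality (1.1) is strict there.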
We observe, in particular, that any infinite word $\mathbf{w}$ with $F(\mathbf{w})$ closed under reversal satisfies equality (II) if and only if any simple path between a special factor and its reversal is palindromic, and for each $n$, there are exactly $2(s-1)$ non-trivial simple paths of order $n$ that are non-palindromic. The latter condition says that, for all $n$, the super reduced Rauzy graph $\Gamma''_n(\mathbf{w})$ contains exactly $s - 1$ edges (with each edge corresponding to a simple path and its reversal), and hence $\Gamma''_n(\mathbf{w})$ is a tree as it contains $s$ vertices, $s - 1$ edges, and must be connected by the recurrence of $\mathbf{w}$ (which follows from Proposition 2.2). More formally:

Proposition 3.5. An infinite word $\mathbf{w}$ with $F(\mathbf{w})$ closed under reversal satisfies equality (II) if and only if the following conditions hold:
1) any simple path between a special factor and its reversal is palindromic;
2) the super reduced Rauzy graph $\Gamma''_n(\mathbf{w})$ is a tree for all $n$.

Proof. Suppose $\mathbf{w}$ is an infinite word with $F(\mathbf{w})$ closed under reversal. Then $\mathbf{w}$ is recurrent by Proposition 2.2. We have already shown that conditions 1) and 2) imply that $\mathbf{w}$ satisfies equality (II). Conversely, if at least one of conditions 1) and 2) does not hold, then $P(n) + P(n+1) < C(n+1) - C(n) + 2$ for some $n$ (by the arguments preceding this proposition), i.e., $\mathbf{w}$ does not satisfy equality (II).

To complete the proof of "(I) ⇒ (II)", it remains to show that any recurrent rich infinite word $\mathbf{w}$ satisfies condition 2) of Proposition 3.5, since we have already shown that condition 1) holds for any such $\mathbf{w}$ (using Lemma 3.3). The proof of the fact that $\mathbf{w}$ satisfies condition 2) uses the following two lemmas (Lemmas 3.6–3.7).

Notation.
Given two distinct special factors $v$, $w$ of the same length $n$, we write $v \not\to w$ if there does not exist a directed edge from $v$ to $w$ in the reduced Rauzy graph $\Gamma'_n(\mathbf{w})$ (i.e., if there does not exist a simple path from $v$ to $w$).

Lemma 3.6. Suppose $\mathbf{w}$ is a recurrent rich infinite word and let $v$, $w$ be two distinct special factors of $\mathbf{w}$ of the same length with $v \notin \{w, \tilde{w}\}$. If there exists a simple path $P$ from $v$ to $w$, then $P$ is unique and there also exists a unique simple path from $\tilde{w}$ to $\tilde{v}$ (namely, the reversal of $P$). Moreover:
i) $v \not\to \tilde{w}$, and hence $w \not\to \tilde{v}$ (unless $w$ is a palindrome);
ii) $\tilde{w} \not\to v$, and hence $\tilde{v} \not\to w$ (unless $v$ is a palindrome);
iii) $w \not\to v$, and hence $\tilde{v} \not\to \tilde{w}$ (unless $v$ and $w$ are both palindromes).

Proof. By closure under reversal (Remark 2.1), if there exists a simple path $P$ from $v$ to $w$, then the reversal of $P$ is a simple path from $\tilde{w}$ to $\tilde{v}$ in the Rauzy graph of order $|v| = |w| = n$. To prove the uniqueness of $P$, let us suppose there exist two different simple paths $P_1$, $P_2$ from $v$ to $w$ in the Rauzy graph $\Gamma_n(\mathbf{w})$. Then $P_1 = v u_1 \cdots u_k w$ and $P_2 = v z_1 \cdots z_\ell w$ for some $k, \ell \in \mathbb{N}$, where $u_1, \ldots, u_k, z_1, \ldots, z_\ell$ are non-special factors of $\mathbf{w}$ of length $n$ and $u_i \ne z_i$ for some $i$. Note that either $P_1$ or $P_2$ (not both) may be of the form $vw$. To keep the rest of the proof as simple as possible, we assume hereafter that neither $v$ nor $w$ is a palindrome; the arguments are similar, and in fact easier, in the cases when either $v$ or $w$ (or both) is a palindrome.

Consider a path $Q$ of minimal length beginning with $P_1$ and ending with $P_2$ (in the Rauzy graph $\Gamma_n(\mathbf{w})$):

$$Q = P_1 \cdots P_2 = \underbrace{v u_1 \cdots u_k w \cdots v}_{Q_1}\, z_1 \cdots z_\ell w.$$

First we observe that $Q$ contains $\tilde{v}$ since any path from $v$ to itself must pass through $\tilde{v}$, by Lemma 3.4.
Moreover, the left-most $\tilde{v}$ in $Q$ must occur in the subpath $Q_1$ (since $\tilde{v}$ is not equal to any of the non-special factors $u_i$, $z_j$ and $\tilde{v} \ne w$). Therefore

$$Q = \underbrace{v u_1 \cdots u_k w \cdots \tilde{v}}_{Q_2} \cdots v z_1 \cdots z_\ell w$$

where the subpath $Q_2$ ends with the left-most $\tilde{v}$ in the path $Q$. By Lemma 3.4, $Q_2$ is a path from $v$ to $\tilde{v}$ that does not contain $v$ or $\tilde{v}$ as an interior vertex. Thus, by Lemma 3.3, $Q_2$ is palindromic, and hence $Q_2$ ends with the reversal of the path $P_1$ since it begins with $P_1$. More explicitly:

$$Q = \underbrace{v u_1 \cdots u_k w}_{P_1} \cdots \overbrace{\underbrace{\tilde{w} \tilde{u}_k \cdots \tilde{u}_1 \tilde{v}}_{\tilde{P}_1} \cdots \underbrace{v z_1 \cdots z_\ell w}_{P_2}}^{Q_3}.$$

We distinguish two cases.

Case 1: If the subpath $Q_3$ contains $w$ as a terminal vertex only, then $\tilde{w}$ is not an interior vertex of $Q_3$ by Lemma 3.4, and hence $Q_3$ is palindromic by Lemma 3.3. It follows that $k = \ell$ and $z_i = u_i$ for all $i = 1, \ldots, k$. Thus $P_1 = P_2$; a contradiction.

Case 2: If the subpath $Q_3$ contains $w$ as an interior vertex, then $Q_3$ first passes through $w$ after taking the path $\tilde{P}_1$ (at the beginning) and before taking the path $P_2$ (at the end). Hence, by Lemma 3.3, $Q_3$ begins with a palindromic path from $\tilde{w}$ to $w$ that begins with $\tilde{P}_1$ and hence ends with $P_1$. But then $Q$ passes through the path $P_1$ at least twice before taking the path $P_2$, contradicting the fact that $Q$ is a path of minimal length beginning with $P_1$ and ending with $P_2$.

Both cases lead to a contradiction; thus the simple path $P$ from $v$ to $w$ is unique (and its reversal $\tilde{P}$ is the unique simple path from $\tilde{w}$ to $\tilde{v}$).

It remains to show that conditions i)–iii) hold. As ii) is symmetric to i), we prove only that i) and iii) are satisfied. By what precedes, it suffices to consider paths in the reduced Rauzy graph $\Gamma'_n(\mathbf{w})$.
i): Arguing by contradiction, let us suppose that there exists a (unique) simple path from $v$ to $\tilde{w}$, i.e., there exists a directed edge from $v$ to $\tilde{w}$ in the reduced Rauzy graph $\Gamma'_n(\mathbf{w})$. Then (from above) we know that there also exists a directed edge from $w$ to $\tilde{v}$. Consider a shortest path $Q$ in the reduced Rauzy graph $\Gamma'_n(\mathbf{w})$ beginning with $v\tilde{w}$ and ending with $vw$. By Lemma 3.4, any path from $v$ to itself passes through $\tilde{v}$, so we may write

$$Q = \underbrace{v \tilde{w} \cdots \tilde{v}}_{Q_1} \cdots v w,$$

where the subpath $Q_1$ ends with the left-most $\tilde{v}$ in the path $Q$. By Lemmas 3.3–3.4, the path $Q_1 = v \tilde{w} \cdots \tilde{v}$ is palindromic, and hence it ends with $w\tilde{v}$. So we have $Q = v \tilde{w} \cdots w \tilde{v} \cdots v w$; moreover, by Lemma 3.4, $\tilde{w}$ must occur between the last two $w$'s shown here. In particular,

$$Q = v \tilde{w} \cdots \underbrace{w \tilde{v} \cdots \tilde{w}}_{Q_2} \cdots v w$$

where the subpath $Q_2$ contains $\tilde{w}$ as a terminal vertex only. Thus, by Lemmas 3.3–3.4, the path $Q_2 = w \tilde{v} \cdots \tilde{w}$ is palindromic, and hence it ends with $v\tilde{w}$. But then $Q$ ends with a shorter path of the form $v \tilde{w} \cdots v w$, contradicting the fact that $Q$ is a path of minimal length beginning with $v\tilde{w}$ and ending with $vw$.

iii): Again, the proof proceeds by contradiction. Suppose there exists a (unique) simple path from $w$ to $v$. Consider a shortest path $Z$ in the reduced Rauzy graph $\Gamma'_n(\mathbf{w})$ beginning with $wv$ and ending with $vw$. By Lemma 3.4, the path $Z$ must pass through $\tilde{w}$; thus

$$Z = \underbrace{w v \cdots \tilde{w}}_{Z_1} \cdots v w,$$

where the subpath $Z_1$ ends with the left-most $\tilde{w}$ in the path $Z$. Now it follows from Lemmas 3.3–3.4 that the subpath $Z_1$ is palindromic, and hence $Z_1$ must end with $\tilde{v}\tilde{w}$. So we may write

$$Z = w v \cdots \underbrace{\tilde{v} \tilde{w} \cdots v}_{Z_2}\, w.$$

If the subpath $Z_2$ contains $v$ as a terminal vertex only, then neither $v$ nor $\tilde{v}$ is an interior vertex of $Z_2$ by Lemma 3.4.
Thus $Z_2$ is palindromic by Lemma 3.3, and hence $Z_2$ ends with $wv$. But then the path $Z$ ends with the path $wvw$, which is impossible by Lemma 3.4. Thus, the subpath $Z_2$ must pass through $v$ at an earlier point, and hence we have $Z_2 = \tilde{v} \tilde{w} \cdots v \cdots v$. In particular, the path $Z_2$ begins with a palindromic subpath of the form $\tilde{v} \tilde{w} \cdots w v$, by Lemma 3.3. But then the path $Z$ ends with a shorter path from $wv$ to $vw$, contradicting the minimality of $Z$.

Notation. For a finite word $v$, let $v^{\epsilon}$ represent either $v$ or $\tilde{v}$ and set $v^{-\epsilon} := \widetilde{v^{\epsilon}}$.

Lemma 3.7. Let $\mathbf{w}$ be a recurrent rich infinite word. For fixed $n \in \mathbb{N}^+$, suppose the super reduced Rauzy graph $\Gamma''_n(\mathbf{w})$ contains at least three distinct vertices: $[v_1], [v_2], \ldots, [v_s]$, $s \ge 3$. Then, for each $k$ with $3 \le k \le s$, the reduced Rauzy graph $\Gamma'_n(\mathbf{w})$ contains a path from $v_1$ to $v_k^{\epsilon_k}$ of the form:

$$v_1 v_2^{\epsilon_2} \cdots v_2 v_3^{\epsilon_3} \cdots v_{k-2} v_{k-1}^{\epsilon_{k-1}} \cdots v_{k-1} v_k^{\epsilon_k},$$

where for all $i = 2, \ldots, k-1$, the subpath $v_i^{\epsilon_i} \cdots v_i$ (which may consist of only the single vertex $v_i^{\epsilon_i}$) does not contain $v_j$, $\tilde{v}_j$ for all $j$ with $1 \le j \le k$, $j \ne i$.

Proof. We use induction on $k$ and employ similar reasoning to the proof of Lemma 3.6.

First consider the case $k = 3$. Recurrence implies that $\Gamma'_n(\mathbf{w})$ is connected, so we may assume without loss of generality that $\Gamma'_n(\mathbf{w})$ contains a directed edge from $v_1$ to $v_2^{\epsilon_2}$, a directed edge from $v_2$ to $v_3^{\epsilon_3}$, and a path from $v_2^{\epsilon_2}$ to $v_2$. That is, $\Gamma'_n(\mathbf{w})$ contains a path beginning with $v_1 v_2^{\epsilon_2}$ and ending with $v_2 v_3^{\epsilon_3}$. Consider such a path of minimal length:

$$Q = v_1 v_2^{\epsilon_2} \cdots v_2 v_3^{\epsilon_3}.$$

To prove the claim for $k = 3$, we show that none of the special factors $v_1$, $\tilde{v}_1$, $v_3$, $\tilde{v}_3$ are interior vertices of $Q$. If $v_2^{\epsilon_2} = v_2$, then $Q = v_1 v_2 v_3^{\epsilon_3}$ (by minimality) and we are done.
So let us assume that $v_2^{\epsilon_2} = \tilde{v}_2 \ne v_2$. Observe that if $v_1$ is an interior vertex of $Q$, then $\tilde{v}_1$ must be an interior vertex of $Q$ since any path from $v_1$ to itself must contain $\tilde{v}_1$, by Lemma 3.4. Similarly, if $v_3^{\epsilon_3}$ is an interior vertex of $Q$, then $v_3^{-\epsilon_3}$ is an interior vertex of $Q$. Therefore it suffices to show that $\tilde{v}_1$ and $v_3^{-\epsilon_3}$ are not interior vertices of $Q$. We prove this fact only for $\tilde{v}_1$ as the proof is similar for $v_3^{-\epsilon_3}$.

Arguing by contradiction, suppose $\tilde{v}_1$ is an interior vertex of $Q$. Then $Q$ begins with a palindromic path from $v_1\tilde{v}_2$ to $\tilde{v}_1$ (by Lemmas 3.3–3.4), and this palindromic path clearly ends with $v_2\tilde{v}_1$. Hence
\[ Q = v_1\tilde{v}_2 \cdots \underbrace{v_2\tilde{v}_1 \cdots v_2}_{Q'}\, v_3^{\epsilon_3}, \]
where the subpath $Q'$ begins with a palindromic path from $v_2\tilde{v}_1$ to $\tilde{v}_2$ (by Lemmas 3.3–3.4), and this palindromic path clearly ends with $v_1\tilde{v}_2$. But then the path $Q$ ends with a shorter path from $v_1\tilde{v}_2$ to $v_2 v_3^{\epsilon_3}$, contradicting the minimality of $Q$. Thus the lemma holds for $k = 3$.

Now suppose $4 \le k \le s$ and assume the claim holds for $k-1$. Since $\Gamma'_n(\mathbf{w})$ is connected, it contains a path beginning with $v_1 v_2^{\epsilon_2} \cdots v_2 v_3^{\epsilon_3} \cdots v_{k-2} v_{k-1}^{\epsilon_{k-1}}$ and ending with $v_{k-1} v_k^{\epsilon_k}$ (where the former path satisfies the conditions of the lemma). Consider such a path of minimal length:
\[ Z = \underbrace{v_1 v_2^{\epsilon_2} \cdots v_2 v_3^{\epsilon_3} \cdots v_{k-2}}_{Z_1}\, \underbrace{v_{k-1}^{\epsilon_{k-1}} \cdots v_{k-1}}_{Z_2}\, v_k^{\epsilon_k} \tag{3.4} \]
where for all $i = 2, \ldots, k-2$, the subpath $v_i^{\epsilon_i} \cdots v_i$ (which may consist of only the single vertex $v_i^{\epsilon_i}$) does not contain $v_j$, $\tilde{v}_j$ for all $j$ with $1 \le j \le k-1$, $j \ne i$. To prove the induction step, we show that the path $Z$ satisfies the following two conditions:
i) the subpath $Z_1$ contains neither $v_k$ nor $\tilde{v}_k$;
ii) the subpath $Z_2 = v_{k-1}^{\epsilon_{k-1}} \cdots v_{k-1}$ does not contain $v_j$, $\tilde{v}_j$ for all $j$ with $1 \le j \le k$, $j \ne k-1$.
First suppose that condition i) is not satisfied, i.e., $Z_1$ contains $v_k$ or $\tilde{v}_k$. Without loss of generality we assume that $v_k$ is the right-most of the vertices $v_k$, $\tilde{v}_k$ appearing in $Z_1$.

Case 1: Suppose $v_k^{\epsilon_k} = v_k \ne \tilde{v}_k$. Then $Z$ ends with a path from $v_k$ to itself, which must pass through $\tilde{v}_k$ by Lemma 3.4; moreover, $\tilde{v}_k$ must be an interior vertex of $Z_2$ (by the choice of $v_k$). Thus, by Lemmas 3.3–3.4, $Z_2 v_k$ (and hence $Z$) ends with a palindromic path from $\tilde{v}_k$ to $v_{k-1} v_k$. Hence $Z_2$ contains $\tilde{v}_k \tilde{v}_{k-1}$, and we have:
\[ Z_2 v_k = \underbrace{v_{k-1}^{\epsilon_{k-1}} \cdots \tilde{v}_k \tilde{v}_{k-1}}_{Z_3} \cdots v_{k-1} v_k, \]
where the subpath $Z_3$ ends with a palindromic path from $v_{k-1}$ to $\tilde{v}_k \tilde{v}_{k-1}$ (by Lemmas 3.3–3.4); thus $Z_3$ contains $v_{k-1} v_k$. But then $Z$ begins with a shorter path from $Z_1$ to $v_{k-1} v_k^{\epsilon_k}$, contradicting the minimality of $Z$.

Case 2: Suppose $v_k^{\epsilon_k} = \tilde{v}_k$. Then the path $Z$ ($= Z_1 Z_2 \tilde{v}_k$) ends with a path of the form:
\[ Z_4 = v_k \underbrace{\cdots\cdots}_{\text{no } v_k,\ \tilde{v}_k} Z_2\, \tilde{v}_k. \]
If $v_k$ or $\tilde{v}_k$ is an interior vertex of $Z_2$, then we reach a contradiction using the same arguments as in Case 1. On the other hand, if neither $v_k$ nor $\tilde{v}_k$ is an interior vertex of $Z_2$, then $Z_4$ is palindromic by Lemma 3.3. So the path $Z_4$ begins with $v_k \tilde{v}_{k-1}$ since it ends with $v_{k-1} \tilde{v}_k$. But then $\tilde{v}_{k-1}$ is an interior vertex of $Z_1$, a contradiction.

Thus the path $Z$ satisfies condition i). In proving this fact, we have also shown that $v_k$, $\tilde{v}_k$ are not interior vertices of $Z_2$. It remains to show that the subpath $Z_2$ does not contain $v_j$, $\tilde{v}_j$ for all $j$ with $1 \le j \le k-2$ (and hence $Z$ satisfies condition ii)). We prove only that $Z_2$ does not contain $v_1$ or $\tilde{v}_1$ since the proof is similar when considering other $v_j$, $\tilde{v}_j$. Suppose on the contrary that $Z_2$ contains $v_1$ or $\tilde{v}_1$.
Then, by Lemmas 3.3–3.4, $Z$ begins with a palindromic path from $v_1$ to $\tilde{v}_1$, and this palindromic path begins with $Y = Z_1 v_{k-1}^{\epsilon_{k-1}}$ (and hence ends with $\tilde{Y}$) by the conditions on $Z$ under the induction hypothesis. More explicitly, we have:
\[ Z = \overbrace{\underbrace{v_1 v_2^{\epsilon_2} \cdots v_{k-2} v_{k-1}^{\epsilon_{k-1}}}_{Y} \cdots \underbrace{v_{k-1}^{-\epsilon_{k-1}} \tilde{v}_{k-2} \cdots v_2^{-\epsilon_2} \tilde{v}_1}_{\tilde{Y}}}^{\text{palindromic}}\ \underbrace{\cdots v_{k-1} v_k^{\epsilon_k}}_{Z_5}. \]
Hence, as $v_{k-1}$ and $\tilde{v}_{k-1}$ are not interior vertices of $Y$ (by the induction hypothesis), the subpath $\tilde{Y} Z_5$ begins with a palindromic path from $v_{k-1}^{-\epsilon_{k-1}}$ to $v_{k-1}^{\epsilon_{k-1}}$, and this palindromic path begins with $\tilde{Y}$ (and hence ends with $Y$), by Lemmas 3.3–3.4. But then $Z$ ends with a shorter path from $Y$ to $v_{k-1} v_k^{\epsilon_k}$, contradicting the minimality of $Z$. We conclude that the subpath $Z_2 = v_{k-1}^{\epsilon_{k-1}} \cdots v_{k-1}$ does not contain $v_j$, $\tilde{v}_j$ for all $j$ with $1 \le j \le k$, $j \ne k-1$ (i.e., the path $Z$ satisfies condition ii)), and the proof is thus complete.

Lemma 3.8. Suppose $\mathbf{w}$ is a recurrent rich infinite word. Then the super reduced Rauzy graph $\Gamma''_n(\mathbf{w})$ is a tree for all $n \in \mathbb{N}^+$.

Proof. First recall that for all $n$, $\Gamma''_n(\mathbf{w})$ is connected (by the recurrence property of $\mathbf{w}$). Moreover, Lemma 3.6 tells us that if two distinct vertices in $\Gamma''_n(\mathbf{w})$ are joined by an edge, then this edge is unique (and corresponds to a simple path and its reversal). It remains to show that $\Gamma''_n(\mathbf{w})$ does not contain any cycle (i.e., does not contain a chain linking a vertex with itself). Suppose on the contrary that $\Gamma''_n(\mathbf{w})$ contains a cycle for some $n$. Then $\Gamma''_n(\mathbf{w})$ must contain at least three distinct vertices: $[v_1], [v_2], \ldots, [v_s]$, $s \ge 3$, and a cycle of the following form:
$[v_1] \text{---} [v_2] \text{---} \cdots \text{---} [v_k] \text{---} [v_1]$ for some $k$ with $3 \le k \le s$.
(3.5)

We thus deduce from Lemma 3.7 that the reduced Rauzy graph $\Gamma'_n(\mathbf{w})$ contains a path from $v_1$ to $v_1^{\epsilon_1}$ of the form:
\[ P = v_1 v_2^{\epsilon_2} \cdots v_2 v_3^{\epsilon_3} \cdots v_{k-2} v_{k-1}^{\epsilon_{k-1}} \cdots v_{k-1} v_k^{\epsilon_k} \cdots v_k v_1^{\epsilon_1}, \]
where for all $i = 2, \ldots, k$, the subpath $v_i^{\epsilon_i} \cdots v_i$ (which may consist of only the single vertex $v_i^{\epsilon_i}$) does not contain $v_j$, $\tilde{v}_j$ for all $j$ with $1 \le j \le k$, $j \ne i$. (Note that $P$ corresponds to the cycle given in (3.5).)

First suppose that $v_1$ is a palindrome. In this case, as neither $v_1$ nor $\tilde{v}_1$ is an interior vertex of $P$, it must be a palindromic path by Lemma 3.3. But then $v_k = v_2^{-\epsilon_2}$, a contradiction (as $k \ge 3$).

Now suppose that $v_1$ is not a palindrome. If $v_1^{\epsilon_1} = \tilde{v}_1$, then we deduce (as above, using Lemma 3.3) that the path $P$ must be palindromic, yielding a contradiction. On the other hand, if $v_1^{\epsilon_1} = v_1$, then, by Lemma 3.4, the path $P$ must pass through $\tilde{v}_1$, a contradiction. Thus $\Gamma''_n(\mathbf{w})$ is a tree.

This concludes our proof of the "(I) $\Rightarrow$ (II)" part of Theorem 1.1.

3.2 (II) implies (I)

Conversely, suppose $\mathbf{w}$ is an infinite word with $F(\mathbf{w})$ closed under reversal and satisfying equality (II). Then $\mathbf{w}$ satisfies conditions 1) and 2) of Proposition 3.5. Now, arguing by contradiction, suppose $\mathbf{w}$ does not satisfy property (I) (i.e., $\mathbf{w}$ is not rich). Then there exists a palindromic factor $p$ that has a non-palindromic complete return $u$ in $\mathbf{w}$; in particular, we have $u = pqavb\tilde{q}p$ for some words $q$, $v$ (possibly empty) and letters $a$, $b$, with $a \ne b$. So the words $pqa$, $b\tilde{q}p$ and their reversals $a\tilde{q}p$, $pqb$ are factors of $\mathbf{w}$. Thus $pq$ (resp. $\tilde{q}p$) is a right-special (resp. left-special) factor of $\mathbf{w}$. Hence, if $u$ does not contain any other special factors, then $u$ forms the label of a non-palindromic simple path beginning with $pq$ and ending with $\tilde{q}p$. But this contradicts condition 1) of Proposition 3.5.
Therefore $u$ must contain other special factors of length $n := |pq|$, besides $pq$ and $\tilde{q}p$. In particular, $u$ begins with the label of a simple path of order $n$ beginning with $pq$ and ending with another special factor $s_1$ of length $n$. Similarly, $u$ ends with the label of a simple path of order $n$ beginning with a special factor $s_2$ of length $n$ and ending with $\tilde{q}p$. Moreover, since $u$ is a complete return to $p$, neither $s_1$ nor $s_2$ is equal to $pq$ or $\tilde{q}p$ (otherwise $p$ occurs as an interior factor of $u$). Thus, in the super reduced Rauzy graph $\Gamma''_n(\mathbf{w})$, there is an edge between the vertex $[pq]$ and each of the vertices $[s_1]$ and $[s_2]$. In particular, there exists a path of the form: $[s_1] \text{---} [pq] \text{---} [s_2]$. Furthermore, as $u$ contains a factor that begins with $s_1$ and ends with $s_2$ and contains no occurrence of $pq$ or $\tilde{q}p$, there also exists a chain (or possibly just an edge) linking $[s_1]$ and $[s_2]$ that does not contain the vertex $[pq]$. Thus, if $\{s_1, \tilde{s}_1\} \ne \{s_2, \tilde{s}_2\}$, then we see that $\Gamma''_n(\mathbf{w})$ contains a cycle, contradicting condition 2) of Proposition 3.5.

On the other hand, if $\{s_1, \tilde{s}_1\} = \{s_2, \tilde{s}_2\}$, then there are at least two edges joining the vertices $[s_1]$ and $[pq]$. Indeed, there exists a simple path $P_1$ from $pq$ to $s_1$ and there also exists a simple path $P_2$ either from $s_1$ to $\tilde{q}p$ or from $\tilde{s}_1$ to $\tilde{q}p$. By closure under reversal, the reversals $\tilde{P}_1$, $\tilde{P}_2$ of the respective simple paths $P_1$, $P_2$ also exist. Moreover, none of these four simple paths coincide. Certainly, $P_1 \ne P_2$, $P_1 \ne \tilde{P}_1$, and $P_2 \ne \tilde{P}_2$ as neither $s_1$ nor $\tilde{s}_1$ is equal to $pq$ or $\tilde{q}p$, and $P_1 \ne \tilde{P}_2$ as the second vertex in $P_1$ ends with the letter $a$, whereas the second vertex in the path $\tilde{P}_2$ ends with the letter $b \ne a$. So $\Gamma''_n(\mathbf{w})$ is not a tree, contradicting condition 2) of Proposition 3.5.
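The notion of a non-palindromic complete return $u = pqavb\tilde{q}p$ used in the argument above can be made concrete for finite words. The following Python sketch is only an illustration under our own assumptions: the helper names `complete_returns` and `non_palindromic_returns` are hypothetical (they do not come from the paper), and the search is a brute-force enumeration over a finite word rather than anything involving Rauzy graphs.

```python
def is_palindrome(x):
    """True if the word x equals its reversal."""
    return x == x[::-1]


def complete_returns(w, p):
    """Complete returns to p in the finite word w.

    A complete return to p is a factor of w that has exactly two
    occurrences of p, one as a prefix and one as a suffix.  These are
    obtained from each pair of consecutive occurrences of p in w.
    """
    starts = [i for i in range(len(w) - len(p) + 1) if w.startswith(p, i)]
    return [w[i:j + len(p)] for i, j in zip(starts, starts[1:])]


def non_palindromic_returns(w):
    """Pairs (p, u): p a non-empty palindromic factor of w and
    u a complete return to p in w that is not a palindrome."""
    palindromes = {
        w[i:j]
        for i in range(len(w))
        for j in range(i + 1, len(w) + 1)
        if is_palindrome(w[i:j])
    }
    return sorted(
        (p, u)
        for p in palindromes
        for u in complete_returns(w, p)
        if not is_palindrome(u)
    )


# "abca" has the non-palindromic complete return u = abca to p = a
# (here q and v are empty, and the two distinct letters are b and c).
print(non_palindromic_returns("abca"))   # [('a', 'abca')]

# In "abaab" every complete return to a palindrome is a palindrome.
print(non_palindromic_returns("abaab"))  # []
```

An empty result for a finite word corresponds to property ii) of Theorem 4.1 restricted to that word; it says nothing by itself about an infinite word of which the finite word is only a prefix.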
This concludes our proof of Theorem 1.1.

4 A few consequences and remarks

From Theorem 1.1, we easily deduce that property (I) is equivalent to equality (II) for any uniformly recurrent infinite word. Indeed, equality (II) implies the existence of arbitrarily long palindromes since $P(n) + P(n+1) \ge 2$ for all $n$, so together with uniform recurrence one can readily show that factors are closed under reversal; hence property (I) holds by Theorem 1.1. Conversely, richness (property (I)) together with uniform recurrence implies closure under reversal by Remark 2.1, and hence equality (II) holds.

Question: In the statement of Theorem 1.1, can the hypothesis of factors being closed under reversal be replaced by the weaker hypothesis of recurrence?

As above, it follows directly from Theorem 1.1 and Remark 2.1 that for any recurrent infinite word $\mathbf{w}$, if $\mathbf{w}$ satisfies property (I) (i.e., if $\mathbf{w}$ is rich, and hence has factors closed under reversal), then equality (II) holds. However, to prove the converse using our methods, one would need to know that any recurrent infinite word satisfying equality (II) has factors closed under reversal. We could not find a proof of this claim nor could we find a counter-example.

Let us point out that whilst uniform recurrence and the existence of arbitrarily long palindromes imply closure under reversal, this is not true in the case of recurrence only. For instance, consider the following infinite word:
\[ \mathbf{s} = bca^2bca^3bca^2bca^4bca^2bca^3bca^2bca^5bc\cdots, \]
which is the limit as $n$ goes to infinity of the sequence $(s_n)_{n \ge 1}$ of finite words defined by:
\[ s_1 = bc \quad \text{and} \quad s_n = s_{n-1} a^n s_{n-1} \ \text{for } n > 1. \]
This infinite word is clearly recurrent (but not uniformly recurrent) and contains arbitrarily long palindromes (the powers $a^n$), but its set of factors is not closed under reversal: for instance, $bc$ is a factor of $\mathbf{s}$, whereas $cb$ is not.
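The recursive definition of $s_n$ is easy to experiment with. The short Python sketch below builds the finite words $s_n$ and lists short factors whose reversal does not occur; the function names `s`, `factors` and `missing_reversals` are illustrative assumptions of ours. A check on a finite word $s_n$ can only suggest (not prove) a statement about the infinite limit word; for the factors reported here the conclusion does carry over, because in the limit word every $b$ is followed by $c$, every $c$ by $a$, and every block of $a$'s by $b$.

```python
def s(n):
    """Return the finite word s_n, where s_1 = 'bc' and
    s_k = s_{k-1} a^k s_{k-1} for k > 1."""
    word = "bc"
    for k in range(2, n + 1):
        word = word + "a" * k + word
    return word


def factors(word, length):
    """Set of all factors (contiguous subwords) of the given length."""
    return {word[i:i + length] for i in range(len(word) - length + 1)}


def missing_reversals(word, max_len):
    """Sorted list of the factors of length at most max_len whose
    reversal is not a factor of word."""
    return sorted(
        f
        for length in range(1, max_len + 1)
        for f in factors(word, length)
        if f[::-1] not in word
    )


print(s(3))                        # bcaabcaaabcaabc
print(missing_reversals(s(4), 2))  # ['ab', 'bc', 'ca']
```

The output for $s_4$ shows that none of the reversals $ba$, $cb$, $ac$ occurs, in line with the claim that the set of factors is not closed under reversal.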
(Note that $\mathbf{s}$ is not rich and does not satisfy equality (II).) If one could show that recurrence together with equality (II) implies arbitrarily long palindromic prefixes, this would be enough to prove that factors are closed under reversal.

In the context of finite words $w$, the hypothesis of factors being closed under reversal can be replaced by the requirement that $w$ is a palindrome. Indeed, all we really need is the super reduced Rauzy graph to be connected, which is true for palindromes.

Theorem 4.1. For any palindrome $w$, the following properties are equivalent:
i) $w$ contains $|w| + 1$ distinct palindromes;
ii) all complete returns to palindromes in $w$ are palindromes;
iii) $P(i) + P(i+1) = C(i+1) - C(i) + 2$ for all $i$ with $0 \le i \le |w|$.

We now prove two easy consequences of Theorem 1.1.

Corollary 4.2. Suppose $\mathbf{w}$ is a recurrent rich infinite word. Then the following properties hold.
i) $\mathbf{w}$ is (purely) periodic if and only if $P(n) + P(n+1) = 2$ for some $n$.
ii) $(P(n))_{n \ge 1}$ is eventually periodic with period 2 if and only if there exist non-negative integers $K$, $L$, $N$ such that $C(n) = Kn + L$ for all $n \ge N$.

Proof. Suppose $\mathbf{w}$ is a recurrent rich infinite word. Then $P(n) + P(n+1) = C(n+1) - C(n) + 2$ for all $n$, by Theorem 1.1 and Remark 2.1.

i): If $P(n) + P(n+1) = 2$ for some $n$, then $C(n+1) = C(n)$, and hence $\mathbf{w}$ is eventually periodic; in particular, $\mathbf{w}$ must be (purely) periodic as it is recurrent. Conversely, if $\mathbf{w}$ is periodic, then $C(n+1) = C(n)$ for some $n$, and hence $P(n) + P(n+1) = 2$.

ii): The condition on $C(n)$ implies that for all $n \ge N$, $C(n+1) - C(n) = K$, and hence $P(n) + P(n+1) = K + 2 = P(n+1) + P(n+2)$. Thus $P(n) = P(n+2)$ for all $n \ge N$. Conversely, suppose $(P(n))_{n \ge 1}$ is eventually periodic with period 2.
Then there exists a non-negative integer $N$ such that $P(n) = P(n+2)$ for all $n \ge N$. Hence, for all $n \ge N$, $P(n) + P(n+1) = P(n+1) + P(n+2)$, so the sum $P(n) + P(n+1) = C(n+1) - C(n) + 2$ takes a constant value $M \ge 2$. Therefore $C(n+1) - C(n) = M - 2$ for all $n \ge N$.

Remark 4.3. Item ii) of the above corollary can be compared with a result of J. Cassaigne [8], who proved that if $C(n)$ has linear growth, then $C(n+1) - C(n)$ is bounded.

Remark 4.4. In [5], Baláži et al. remarked: "According to our knowledge, all known examples of infinite words which satisfy the equality $P(n) + P(n+1) = C(n+1) - C(n) + 2$ for all $n \in \mathbb{N}$ have sublinear factor complexity." Actually, there do exist recurrent rich infinite words with non-sublinear complexity. For instance, the following example from [19]:
\[ abab^2abab^3abab^2abab^4abab^2abab^3abab^2abab^5 \cdots \]
(which is the fixed point of the morphism $a \mapsto abab$, $b \mapsto b$) is a recurrent rich infinite word and its complexity $C(n)$ grows quadratically with $n$. Another example that was indicated to us by J. Cassaigne is the fixed point of $a \mapsto aab$, $b \mapsto b$:
\[ aabaabbaabaabbbaabaabbaabaabbbbaabaabbaabaabbbaabaabbaabaabbbbb \cdots. \]
It is a recurrent rich infinite word and its complexity is equivalent to $n^2/2$. More precisely,
\[ P(n) + P(n+1) - 2 = C(n+1) - C(n) = n + 1 - \sharp\{k > 0 \mid n > 2^k + k - 2\}. \]

In [10], X. Droubay et al. showed that the family of episturmian words (e.g., see [10, 21, 18]), which includes the well-known Sturmian words, comprises a special class of uniformly recurrent rich infinite words. Specifically, they proved that if an infinite word $\mathbf{w}$ is episturmian, then any factor $u$ of $\mathbf{w}$ contains exactly $|u| + 1$ distinct palindromic factors (see [10, Cor. 2]). An alternative proof of the richness of episturmian words can be found in the paper [3] where the fourth author, together with V. Anne and I.
Zorca, proved that for episturmian words, all complete returns to palindromes are palindromes. (A shorter proof of this fact is also given in [7].) More recently, P. Baláži et al. [5] showed that all strict episturmian words (i.e., Arnoux-Rauzy sequences [4, 23]) satisfy $P(n) + P(n+1) = C(n+1) - C(n) + 2$ for all $n$. This fact, together with Theorem 1.1, provides yet another proof that all episturmian words are rich (since any factor of an episturmian word is a factor of some strict episturmian word).

Sturmian words are exactly the aperiodic episturmian words over a 2-letter alphabet. They have complexity $n + 1$ for each $n$ and are characterized by their palindromic complexity: any Sturmian word has $P(n) = 1$ whenever $n$ is even and $P(n) = 2$ whenever $n$ is odd (see [11]). From these observations, one can readily check that Sturmian words satisfy equality (II) (and hence they are rich). We can now say even more: the set of factors of all Sturmian words satisfies equality (II). To show this, we first recall that F. Mignosi [22] proved that, for any $n \ge 0$, the number $c(n)$ of finite Sturmian words of length $n$ is given by
\[ c(n) = 1 + \sum_{i=1}^{n} (n + 1 - i)\,\phi(i), \]
where $\phi$ is Euler's totient function. More recently, in [9], the second author together with A. de Luca proved that for any $n \ge 0$, the number $p(n)$ of Sturmian palindromes of length $n$ is given by
\[ p(n) = 1 + \sum_{i=0}^{\lceil n/2 \rceil - 1} \phi(n - 2i). \]
Equivalently, for any $n \ge 0$,
\[ p(2n) = 1 + \sum_{i=1}^{n} \phi(2i) \quad \text{and} \quad p(2n+1) = 1 + \sum_{i=0}^{n} \phi(2i+1). \]
Thus, for all $n \ge 0$,
\[ p(2n) + p(2n+1) = 2 + \phi(1) + \sum_{i=1}^{n} \{\phi(2i) + \phi(2i+1)\} = \sum_{i=1}^{2n+1} \phi(i) + 2, \]
and
\[ c(2n+1) - c(2n) + 2 = \sum_{i=1}^{2n+1} (2n + 2 - i)\,\phi(i) - \sum_{i=1}^{2n} (2n + 1 - i)\,\phi(i) + 2 = \phi(2n+1) + \sum_{i=1}^{2n} \phi(i) + 2 = \sum_{i=1}^{2n+1} \phi(i) + 2 = p(2n) + p(2n+1). \]
An analogous computation gives $p(2n+1) + p(2n+2) = \sum_{i=1}^{2n+2} \phi(i) + 2 = c(2n+2) - c(2n+1) + 2$, so the equality holds for indices of both parities.
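The identity just derived can also be checked numerically from the two counting formulas. The following Python sketch is a sanity check only, not a proof; the function names `phi`, `c` and `p` simply mirror the symbols above, and the totient is computed naively from `math.gcd` (an implementation choice of ours, adequate for small arguments).

```python
from math import gcd


def phi(m):
    """Euler's totient function, computed naively for m >= 1."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)


def c(n):
    """Number of finite Sturmian words of length n (Mignosi's formula)."""
    return 1 + sum((n + 1 - i) * phi(i) for i in range(1, n + 1))


def p(n):
    """Number of Sturmian palindromes of length n.

    The sum runs over i = 0, ..., ceil(n/2) - 1, and ceil(n/2) equals
    (n + 1) // 2 for non-negative integers n.
    """
    return 1 + sum(phi(n - 2 * i) for i in range((n + 1) // 2))


print([c(n) for n in range(5)])  # [1, 2, 4, 8, 14]
print([p(n) for n in range(5)])  # [1, 2, 2, 4, 4]

# Check p(n) + p(n+1) = c(n+1) - c(n) + 2 for small n of both parities.
print(all(p(n) + p(n + 1) == c(n + 1) - c(n) + 2 for n in range(50)))  # True
```

The first few values agree with direct counts: for example, all $8$ binary words of length $3$ are finite Sturmian words, and exactly $4$ of them ($aaa$, $aba$, $bab$, $bbb$) are palindromes.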
From this point of view, it would be interesting to count for instance the number of all binary rich words of length $n$ for each $n$.

Acknowledgements. The authors would like to thank Jacques Justin for helpful comments and suggestions on a preliminary version of this paper. The first three authors would also like to acknowledge the hospitality of the Department of Mathematics at the University of North Texas where this work was done.

References

[1] J.-P. Allouche, M. Baake, J. Cassaigne, D. Damanik, Palindrome complexity, Theoret. Comput. Sci. 292 (2003) 9–31.
[2] P. Ambrož, C. Frougny, Z. Masáková, E. Pelantová, Palindromic complexity of infinite words associated with simple Parry numbers, Ann. Inst. Fourier (Grenoble) 56 (2006) 2131–2160.
[3] V. Anne, L.Q. Zamboni, I. Zorca, Palindromes and pseudo-palindromes in episturmian and pseudo-palindromic infinite words, in: Proceedings of the Fifth International Conference on Words (Montréal, Canada), September 13–17, 2005. Publications du LaCIM 36 (2005) 91–100.
[4] P. Arnoux, G. Rauzy, Représentation géométrique de suites de complexité 2n + 1, Bull. Soc. Math. France 119 (1991) 199–215.
[5] P. Baláži, Z. Masáková, E. Pelantová, Factor versus palindromic complexity of uniformly recurrent infinite words, Theoret. Comput. Sci. 380 (2007) 266–275.
[6] S. Brlek, S. Hamel, M. Nivat, C. Reutenauer, On the palindromic complexity of infinite words, Internat. J. Found. Comput. Sci. 15 (2004) 293–306.
[7] M. Bucci, A. de Luca, A. De Luca, L.Q. Zamboni, On some problems related to palindromic closure, Theoret. Inform. Appl. (in press), doi:10.1051/ita:2007064.
[8] J. Cassaigne, Special factors of sequences with linear subword complexity, in: Developments in Language Theory II, World Scientific, Singapore, 1996, pp. 25–34.
[9] A. de Luca, A.
De Luca, Combinatorial properties of Sturmian palindromes, Internat. J. Found. Comput. Sci. 17 (2006) 557–573.
[10] X. Droubay, J. Justin, G. Pirillo, Episturmian words and some constructions of de Luca and Rauzy, Theoret. Comput. Sci. 255 (2001) 539–553.
[11] X. Droubay, G. Pirillo, Palindromes and Sturmian words, Theoret. Comput. Sci. 223 (1999) 73–85.
[12] F. Durand, A characterization of substitutive sequences using return words, Discrete Math. 179 (1998) 89–101.
[13] F. Durand, A generalization of Cobham's theorem, Theory Comput. Syst. 31 (1998) 169–185.
[14] F. Durand, Linearly recurrent subshifts have a finite number of non-periodic subshift factors, Ergodic Theory Dynam. Systems 19 (1999) 953–993.
[15] S. Ferenczi, C. Mauduit, A. Nogueira, Substitutional dynamical systems: algebraic characterization of eigenvalues, Ann. Sci. École Norm. Sup. 29 (1995) 519–533.
[16] S. Fischler, Palindromic prefixes and episturmian words, J. Combin. Theory Ser. A 113 (2006) 1281–1304.
[17] S. Fischler, Palindromic prefixes and diophantine approximation, Monatsh. Math. 151 (2007) 11–37.
[18] A. Glen, J. Justin, Episturmian words: a survey, Preprint, 2007, arXiv:0801.1655.
[19] A. Glen, J. Justin, S. Widmer, L.Q. Zamboni, Palindromic richness, European J. Combin., to appear, arXiv:0801.1656.
[20] C. Holton, L.Q. Zamboni, Descendants of primitive substitutions, Theory Comput. Syst. 32 (1999) 133–157.
[21] J. Justin, G. Pirillo, Episturmian words and episturmian morphisms, Theoret. Comput. Sci. 276 (2002) 281–313.
[22] F. Mignosi, On the number of factors of Sturmian words, Theoret. Comput. Sci. 82 (1991) 71–84.
[23] G. Rauzy, Suites à termes dans un alphabet fini, in: Sémin. Théorie des Nombres, Exp. No. 25, pp. 16, Univ. Bordeaux I, Talence, 1982–1983.
[24] A.
Siegel, Pure discrete spectrum dynamical systems and periodic tiling associated with a substitution, Ann. Inst. Fourier (Grenoble) 54 (2004) 341–381.
