The complexity of finite smooth words over binary alphabets
Smooth words over an alphabet of non-negative integers $\{a,b\}$ are infinite words that are infinitely derivable, the most famous example being the Oldenburger-Kolakoski word over $\{1,2\}$. The main way to study their language is to consider a finite version of smooth words that we call f-smooth words. In this paper we prove that the f-smooth words are exactly the factors of smooth words, and we make progress towards the conjecture of Sing that the complexity of f-smooth words over $\{a,b\}$ grows like $\Theta\left(n^{\log(a+b)/\log((a+b)/2)}\right)$: we prove it over even alphabets, we prove the lower bound over any binary alphabet and we improve the known upper bound over odd alphabets.
💡 Research Summary
The paper investigates finite smooth words (f‑smooth words) and infinite smooth words over any binary alphabet A = {a, b} with 1 ≤ a < b. A smooth word is an infinite word that remains in the set C₁ after any number of applications of the derivative operator D, which mimics run‑length encoding: each block length must itself be a letter of the alphabet. An f‑smooth word is a finite word that can be repeatedly derived by the finite derivative D_f until the empty word ε is reached; equivalently, it belongs to the set C∞f of infinitely derivable finite words.
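As a minimal sketch of the derivative D described above, the following Python snippet generates a prefix of the Oldenburger-Kolakoski word over {1, 2} and run-length encodes it. The function names `kolakoski_prefix` and `derivative` are illustrative and not taken from the paper. Dropping the last run of a finite prefix is an assumption made here because that run may be cut off; the paper's finite derivative D_f may treat boundary runs differently.

```python
from itertools import groupby


def kolakoski_prefix(n):
    """First n letters of the Oldenburger-Kolakoski word 1221121221... over {1, 2}."""
    word = [1, 2, 2]
    i = 2  # index of the letter that gives the length of the next run
    while len(word) < n:
        next_letter = 1 if word[-1] == 2 else 2
        word.extend([next_letter] * word[i])
        i += 1
    return word[:n]


def derivative(word):
    """Run-length encoding of a finite prefix.

    Assumption of this sketch: the last run is dropped, since a finite
    prefix may cut it short.
    """
    runs = [len(list(group)) for _, group in groupby(word)]
    return runs[:-1]


w = kolakoski_prefix(50)
d = derivative(w)
# Every run length is again a letter of the alphabet {1, 2} ...
print(all(r in (1, 2) for r in d))
# ... and the run lengths reproduce a shorter prefix of the same word.
print(d == w[:len(d)])
```

Because this word is a fixed point of D, the run lengths of any prefix spell out a shorter prefix of the same word, which is what the final check illustrates.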
The authors first prove Theorem 1.31, establishing that the set of factors of all smooth words, L(C∞), coincides exactly with C∞f. This result closes the gap left by Dekking’s original conjecture (that every factor of the Kolakoski word κ₂,₁ is f‑smooth) and extends it to any binary alphabet, showing that p_{C∞}(n) = p_{C∞f}(n) for all n.
The main focus then shifts to the factor complexity p_{C∞f}(n). Sing conjectured that for any alphabet {a, b} the complexity grows like Θ(n^{ρ}) where ρ = log(a+b) / log((a+b)/2). Earlier work gave only polynomial upper and lower bounds. The paper makes three substantial advances:
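To get a feel for the conjectured exponent, it can be evaluated directly from the formula ρ = log(a+b) / log((a+b)/2). The helper name `sing_exponent` and the sample alphabets below are chosen here for illustration only.

```python
import math


def sing_exponent(a, b):
    """Conjectured growth exponent rho = log(a+b) / log((a+b)/2) for the alphabet {a, b}."""
    return math.log(a + b) / math.log((a + b) / 2)


# Sample alphabets: the Kolakoski alphabet, one odd alphabet, one even alphabet.
for a, b in [(1, 2), (1, 3), (2, 4)]:
    print((a, b), round(sing_exponent(a, b), 4))
```

For {1, 2} the exponent is log 3 / log(3/2), roughly 2.71; for {1, 3} it is log 4 / log 2 = 2; for {2, 4} it is log 6 / log 3, roughly 1.63.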
- Even alphabets (both a and b even). Theorem 1.32 proves both the lower and upper bounds match Sing’s exponent ρ, i.e., p_{C∞f}(n) = Θ(n^{ρ}). The proof relies on a precise analysis of bispecial factors: the authors compute the second finite difference b(n) = Σ_{u∈BS(n)} m(u) where m(u) ∈ {−1,0,1} is the multiplicity of a bispecial factor u. By showing that b(n) behaves like a constant times n^{ρ−2}, they integrate twice to obtain the desired growth.
- All binary alphabets. The same theorem provides the conjectured lower bound for any {a, b}. Using a counting argument on bispecial factors of minimal height, they demonstrate that at least c·n^{ρ} distinct factors exist for large n, for some constant c > 0, establishing the Ω(n^{ρ}) part of the conjecture universally.
- Odd alphabets (both a and b odd). Theorem 1.33 improves the previously known upper bound, which involved log(2b) in the numerator of the exponent. The improvement stems from correcting an error in Huang’s earlier work, which over‑estimated the contribution of certain “cut” operations. By introducing a refined length‑height relationship and a tighter contraction analysis for D_f, the authors bound the number of bispecial factors more sharply. This gives a sharper upper bound on p_{C∞f}(n) for odd alphabets; the matching bound O(n^{ρ}) is established only over even alphabets.
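The finite differences used in the bispecial-factor analysis can be illustrated by brute force over {1, 2}. The sketch below assumes the classical definition for that alphabet: a word is derivable if no run is longer than 2, and boundary runs of length 1 are dropped when taking run lengths. The paper's D_f over general {a, b} may differ in details. All function names are hypothetical.

```python
from itertools import groupby


def finite_derivative(word):
    """Run lengths of a word over {1, 2}, with boundary runs of length 1 dropped.

    Returns None when some run is longer than 2 (the word is not derivable).
    """
    runs = [len(list(group)) for _, group in groupby(word)]
    if any(r > 2 for r in runs):
        return None
    if runs and runs[0] == 1:
        runs = runs[1:]
    if runs and runs[-1] == 1:
        runs = runs[:-1]
    return tuple(runs)


def is_f_smooth(word):
    """True if repeated derivation reaches the empty word."""
    while word:
        word = finite_derivative(word)
        if word is None:
            return False
    return True


def complexity(max_len):
    """p(n) for n = 0..max_len, extending shorter f-smooth words letter by letter.

    This relies on prefixes of f-smooth words being f-smooth.
    """
    level = [()]
    counts = [1]
    for _ in range(max_len):
        level = [w + (x,) for w in level for x in (1, 2) if is_f_smooth(w + (x,))]
        counts.append(len(level))
    return counts


p = complexity(12)
s = [p[n + 1] - p[n] for n in range(len(p) - 1)]  # first difference s(n)
b = [s[n + 1] - s[n] for n in range(len(s) - 1)]  # second difference b(n)
print(p[:6])
print(b[:4])
```

The counts begin 1, 2, 4, 6, 10, 14. Summing b(n) once recovers s(n) and summing again recovers p(n), which is the discrete counterpart of "integrating twice": a second difference of order n^{ρ−2} leads to growth of order n^{ρ}.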
The paper also discusses a mistake in Huang’s results, explaining how the erroneous assumption inflated the upper bound and how the corrected analysis restores consistency with the conjecture.
In the concluding remarks, the authors note two “blind spots”: (i) the factor complexity of individual smooth words over even alphabets remains open, since the dichotomy shows that not every smooth word contains all f‑smooth factors; (ii) the problem of letter frequencies in smooth words (generalizing Keane’s conjecture for the Kolakoski word) is untouched. They cite recent work on extremal smooth words and on the existence of frequencies for the {1,3} case, suggesting directions for future research.
Overall, the paper solidifies the central role of f‑smooth words in the study of smooth words, confirms Sing’s conjectured exponent for a wide range of alphabets, and refines the understanding of bispecial factor structures that drive factor complexity.