Finding Short Synchronizing Words for Prefix Codes
We study the problems of finding a shortest synchronizing word and its length for a given prefix code. This is done in two different settings: when the code is defined by an arbitrary decoder recognizing its star and when the code is defined by its literal decoder (whose size is polynomially equivalent to the total length of all words in the code). For the first case for every $\varepsilon > 0$ we prove $n^{1 - \varepsilon}$-inapproximability for recognizable binary maximal prefix codes, $\Theta(\log n)$-inapproximability for finite binary maximal prefix codes and $n^{\frac{1}{2} - \varepsilon}$-inapproximability for finite binary prefix codes. By $c$-inapproximability here we mean the non-existence of a $c$-approximation polynomial time algorithm under the assumption P $\ne$ NP, and by $n$ the number of states of the decoder in the input. For the second case, we propose approximation and exact algorithms and conjecture that for finite maximal prefix codes the problem can be solved in polynomial time. We also study the related problems of finding a shortest mortal and a shortest avoiding word.
💡 Research Summary
The paper investigates the computational problem of finding a shortest synchronizing word for a given prefix code, considering two distinct representations of the code. The first representation is an arbitrary decoder that recognizes the star of the code (i.e., a deterministic finite automaton, not necessarily minimal, that accepts all concatenations of codewords). The second representation is the literal decoder, whose size is polynomially equivalent to the total length of all codewords (its states are the proper prefixes of the codewords).
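To make the literal decoder concrete, here is a minimal Python sketch under the usual textbook convention, which is assumed here rather than quoted from the paper: the states are the proper prefixes of the codewords, the empty prefix is the root, and completing a codeword returns to the root. The function name `literal_decoder` and the dictionary encoding of the transition function are illustrative choices.

```python
def literal_decoder(code):
    """Build the literal decoder of a finite prefix code (illustrative sketch).

    Returns (states, alphabet, delta), where delta is a partial transition
    function stored as a dict {(state, letter): state}.
    """
    codewords = set(code)
    alphabet = sorted({a for w in codewords for a in w})
    # States are the proper prefixes of codewords; "" is the root.
    states = {w[:i] for w in codewords for i in range(len(w))}
    if codewords & states:
        raise ValueError("not a prefix code: a codeword is a prefix of another")
    delta = {}
    for p in states:
        for a in alphabet:
            q = p + a
            if q in codewords:
                delta[(p, a)] = ""   # a codeword is completed: back to the root
            elif q in states:
                delta[(p, a)] = q    # still strictly inside some codeword
            # otherwise the transition is left undefined (non-maximal code)
    return states, alphabet, delta


states, alphabet, delta = literal_decoder(["0", "10", "11"])
```

For the maximal code {0, 10, 11} every transition is defined, while for the non-maximal code {0, 10} the transition from state "1" on letter "1" is missing, so the decoder is a partial automaton.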
The authors first formalize the notion of synchronizing words for partial deterministic automata and recall known results on the Černý conjecture and the general hardness of the Short Sync Word problem. They then focus on the specific families of automata that arise from prefix codes.
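The notion of a synchronizing word can be illustrated with a brute-force search. The sketch below assumes one common definition for partial automata, namely that a word w is synchronizing when the image of the whole state set under w consists of exactly one state; the paper's formal definition may differ in details. It runs a breadth-first search over subsets of states, so it takes exponential time in the worst case and is meant only for small examples. The names `shortest_sync_word` and `cerny` are hypothetical.

```python
from collections import deque


def shortest_sync_word(states, alphabet, delta):
    """Shortest word w with |Q.w| = 1, found by BFS over subsets (sketch).

    delta is a partial transition function: dict {(state, letter): state}.
    Returns None if no synchronizing word exists.
    """
    start = frozenset(states)
    seen = {start}
    queue = deque([(start, "")])
    while queue:
        subset, word = queue.popleft()
        if len(subset) == 1:
            return word
        for a in alphabet:
            # Image of the subset under letter a; undefined transitions drop out.
            image = frozenset(delta[(q, a)] for q in subset if (q, a) in delta)
            if image and image not in seen:
                seen.add(image)
                queue.append((image, word + a))
    return None


def cerny(n):
    """Cerny automaton on states 0..n-1: 'a' is a cyclic shift, 'b' sends n-1 to 0."""
    delta = {}
    for q in range(n):
        delta[(q, "a")] = (q + 1) % n
        delta[(q, "b")] = 0 if q == n - 1 else q
    return list(range(n)), "ab", delta


word = shortest_sync_word(*cerny(3))
```

The Černý automata are the classical family attaining the conjectured worst case, a shortest synchronizing word of length (n-1)^2, which makes them a convenient sanity check for any implementation.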
Main hardness results (arbitrary decoder).
Using a reduction based on the Gawrychowski-Straszak construction from CSP to automata, they prove that, for every $\varepsilon > 0$, the Short Sync Word problem cannot be approximated in polynomial time within a factor of $n^{1 - \varepsilon}$ for binary strongly connected automata that correspond to recognizable maximal prefix codes, unless P = NP. Here $n$ is the number of states of the decoder given as input. By adapting the same reduction to acyclic automata they obtain a logarithmic lower bound: for binary weakly acyclic automata (which model finite prefix codes) the problem is not approximable within $c \cdot \log n$ for some constant $c > 0$. By a further refinement they show $n^{\frac{1}{2} - \varepsilon}$-inapproximability for finite binary (not necessarily maximal) prefix codes and $\Theta(\log n)$-inapproximability for finite binary maximal prefix codes. These results strengthen earlier inapproximability bounds for general automata and show that the structural restrictions imposed by prefix codes do not make the problem substantially easier.
Algorithms and conjectures (literal decoder).
Turning to the literal decoder, the authors exploit the fact that such automata are strongly acyclic (all cycles are self‑loops) and that a synchronizing word must eventually bring every state to the unique sink. They construct a polynomial‑time reduction from a strongly acyclic automaton to a Huffman decoder with only two extra letters, preserving the length of the shortest synchronizing word. This yields an upper bound on the synchronizing length that improves earlier results for Huffman decoders.
Based on this structural insight they propose several algorithms: a simple greedy approximation that repeatedly selects a letter reducing the number of active states, and an exact exponential‑time algorithm that enumerates candidate words up to the known length bound. Empirical evaluation (not detailed in the excerpt) suggests that for finite maximal prefix codes the exact algorithm runs in polynomial time in practice. Consequently, the authors conjecture that the Short Sync Word problem is polynomial‑time solvable for finite maximal prefix codes when the code is given by its literal decoder.
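The summary does not spell out the greedy procedure, so the following Python sketch substitutes a well-known heuristic in the same spirit (Eppstein-style pair merging) and should not be read as the authors' algorithm. It assumes a complete automaton, as arises from a maximal prefix code, and repeatedly appends a shortest word that merges some pair of the currently active states. All function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def merge_word(p, q, alphabet, delta):
    """Shortest word sending states p and q to one state (BFS over pairs), or None."""
    start = frozenset((p, q))
    seen = {start}
    queue = deque([(start, "")])
    while queue:
        pair, word = queue.popleft()
        if len(pair) == 1:
            return word
        for a in alphabet:
            nxt = frozenset(delta[(s, a)] for s in pair)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, word + a))
    return None


def apply_word(subset, word, delta):
    """Image of a set of states under a word in a complete automaton."""
    for a in word:
        subset = {delta[(s, a)] for s in subset}
    return subset


def greedy_sync_word(states, alphabet, delta):
    """Greedy heuristic: keep merging the cheapest pair of active states."""
    current = set(states)
    result = ""
    while len(current) > 1:
        words = [merge_word(p, q, alphabet, delta)
                 for p, q in combinations(list(current), 2)]
        words = [w for w in words if w is not None]
        if not words:
            return None  # some pair can never be merged: not synchronizing
        best = min(words, key=len)
        result += best
        current = apply_word(current, best, delta)
    return result


# Example: complete literal decoder of the maximal prefix code {0, 10, 11}.
delta = {("", "0"): "", ("", "1"): "1", ("1", "0"): "", ("1", "1"): ""}
word = greedy_sync_word({"", "1"}, "01", delta)
```

Each round shrinks the active set by at least one state, so the loop runs fewer than n times and the whole procedure is polynomial. The word it returns is synchronizing but not necessarily shortest, which is consistent with the hardness of approximation discussed above.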
Related problems: mortal and avoiding words.
The paper also extends the techniques to the shortest mortal word problem (a word that makes the transition function undefined for every state) and the shortest avoiding word problem (a word that never reaches a designated “bad” state). By analogous reductions they obtain similar hardness and algorithmic results for these variants.
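A mortal word, as described above, can be found on small instances by the same kind of subset search used for synchronizing words. The Python sketch below is a brute-force illustration with exponential worst-case running time, not an algorithm from the paper; the name `shortest_mortal_word` and the dictionary encoding of the partial transition function are assumptions.

```python
from collections import deque


def shortest_mortal_word(states, alphabet, delta):
    """Shortest word undefined from every state, by BFS over subsets (sketch).

    delta is a partial transition function: dict {(state, letter): state}.
    Returns None if no mortal word exists (e.g. for a complete automaton).
    """
    start = frozenset(states)
    if not start:
        return ""
    seen = {start}
    queue = deque([(start, "")])
    while queue:
        subset, word = queue.popleft()
        for a in alphabet:
            image = frozenset(delta[(q, a)] for q in subset if (q, a) in delta)
            if not image:
                return word + a  # no state survives: the word is mortal
            if image not in seen:
                seen.add(image)
                queue.append((image, word + a))
    return None


# Partial literal decoder of the non-maximal prefix code {0, 10}:
# the transition from state "1" on letter "1" is undefined.
delta = {("", "0"): "", ("", "1"): "1", ("1", "0"): ""}
word = shortest_mortal_word({"", "1"}, "01", delta)
```

In this example no single letter is undefined from every state, but the word 11 is, so the search returns a word of length two.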
Conclusion.
Overall, the work provides a comprehensive picture of the difficulty of finding short synchronizing words for prefix codes. It establishes strong inapproximability results for the natural representation via arbitrary decoders, while showing that the literal representation admits efficient algorithms and possibly polynomial‑time solvability for maximal codes. The paper opens several avenues for future research, notably proving the conjectured polynomial‑time solvability, tightening the bounds for mortal/avoiding words, and exploring other code families (e.g., non‑prefix variable‑length codes).