Algorithms for Glushkov K-graphs


The automata arising from the well-known conversion of regular expressions to nondeterministic automata have rather particular transition graphs. We refer to them as the Glushkov graphs, to honour his nice expression-to-automaton algorithmic shortcut (On a synthesis algorithm for abstract automata, Ukr. Matem. Zhurnal, 12(2):147-156, 1960, in Russian). The Glushkov graphs have been characterized (P. Caron and D. Ziadi, Characterization of Glushkov automata, Theoret. Comput. Sci., 233(1-2):75-90, 2000) in terms of simple graph-theoretical properties and certain reduction rules. We show how to carry this characterization over, under certain restrictions, to the weighted Glushkov graphs. With the weights in a semiring K, the Glushkov K-graphs are defined as the transition graphs of the Weighted Finite Automata (WFA) obtained from the K-expressions by the generalized Glushkov construction (P. Caron and M. Flouret, Glushkov construction for series: the non commutative case, Internat. J. Comput. Math., 80(4):457-472, 2003). The characterization works provided that the semiring K is factorial and the K-expressions are in the so-called star normal form (SNF) of Brüggemann-Klein (Regular expressions into finite automata, Theoret. Comput. Sci., 120(2):197-213, 1993). The restriction to factorial semirings ensures that algorithms can be obtained. The restriction to the SNF would not be necessary if every K-expression were equivalent to one in SNF with the same literal length, as is the case for the Boolean semiring B; whether this holds for a general K remains an open question.


💡 Research Summary

The paper investigates the extension of Glushkov graphs—special transition graphs that arise from the classic regular‑expression‑to‑NFA construction—to the weighted setting. In the Boolean case, Glushkov’s algorithm assigns a unique state to each occurrence of a symbol in a regular expression and connects states according to the possible successive occurrences, yielding an automaton with n + 1 states for an expression of literal length n. The authors generalize this construction to weighted finite automata (WFA) over an arbitrary semiring K, defining “Glushkov K‑graphs” as the transition graphs of the weighted automata produced by the generalized Glushkov construction.
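The Boolean construction described above can be sketched as follows. This is a minimal illustration of the textbook position-based construction, not code from the paper; the class names `Sym`, `Cat`, `Alt`, `Star` and the function `glushkov` are hypothetical, and positions are assumed to be numbered by hand from 1.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sym:            # one occurrence of a letter, tagged with its position
    letter: str
    pos: int

@dataclass(frozen=True)
class Cat:            # concatenation
    left: object
    right: object

@dataclass(frozen=True)
class Alt:            # union
    left: object
    right: object

@dataclass(frozen=True)
class Star:           # Kleene star
    inner: object

def nullable(e):
    """True if the empty word belongs to the language of e."""
    if isinstance(e, Sym):
        return False
    if isinstance(e, Star):
        return True
    if isinstance(e, Cat):
        return nullable(e.left) and nullable(e.right)
    return nullable(e.left) or nullable(e.right)

def first(e):
    """Positions that can start a word of e."""
    if isinstance(e, Sym):
        return {e.pos}
    if isinstance(e, Star):
        return first(e.inner)
    if isinstance(e, Alt):
        return first(e.left) | first(e.right)
    return first(e.left) | (first(e.right) if nullable(e.left) else set())

def last(e):
    """Positions that can end a word of e."""
    if isinstance(e, Sym):
        return {e.pos}
    if isinstance(e, Star):
        return last(e.inner)
    if isinstance(e, Alt):
        return last(e.left) | last(e.right)
    return last(e.right) | (last(e.left) if nullable(e.right) else set())

def follow(e):
    """Pairs (p, q) such that position q can directly follow position p."""
    if isinstance(e, Sym):
        return set()
    if isinstance(e, Star):
        loop = {(p, q) for p in last(e.inner) for q in first(e.inner)}
        return follow(e.inner) | loop
    if isinstance(e, Alt):
        return follow(e.left) | follow(e.right)
    bridge = {(p, q) for p in last(e.left) for q in first(e.right)}
    return follow(e.left) | follow(e.right) | bridge

def letters(e):
    """Map each position to the letter it carries."""
    if isinstance(e, Sym):
        return {e.pos: e.letter}
    if isinstance(e, Star):
        return letters(e.inner)
    return {**letters(e.left), **letters(e.right)}

def glushkov(e):
    """Return (states, edges, finals); state 0 is the added initial state."""
    label = letters(e)
    edges = {(0, label[q], q) for q in first(e)}
    edges |= {(p, label[q], q) for (p, q) in follow(e)}
    finals = last(e) | ({0} if nullable(e) else set())
    return {0} | set(label), edges, finals

# (a + b)* . a, with positions a=1, b=2, a=3
expr = Cat(Star(Alt(Sym("a", 1), Sym("b", 2))), Sym("a", 3))
states, edges, finals = glushkov(expr)
```

For this expression of literal length 3, the sketch yields 4 states, in line with the n + 1 count stated above. Every edge entering a position state carries that position's letter.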

Two restrictive hypotheses are required for the theory to hold. First, the semiring K must be factorial (i.e., zero‑divisor‑free, with every non‑zero element admitting a unique factorization). This property guarantees that weights can be decomposed unambiguously during the graph reductions, so that the procedures are effective algorithms. Second, the input K‑expression must be in Star Normal Form (SNF), a syntactic condition originally introduced by Brüggemann‑Klein for Boolean expressions. In SNF, for every starred subexpression H∗, no position of First(H) already follows a position of Last(H) inside H, so the star contributes each of its loop‑back edges exactly once instead of duplicating an edge that is already present.

The authors first reformulate the classical Glushkov construction using ordered pairs (coefficient, position) and define K‑versions of the First, Last, and Follow functions. These functions return sets of ordered pairs and are combined using a specialized ⊎ operation that merges coefficients belonging to the same position. With these definitions, they introduce a collection of reduction rules—called K‑rules—that iteratively eliminate non‑essential vertices and edges from an acyclic K‑graph while preserving language equivalence. A key technical result is the confluence of the K‑rules: regardless of the order in which the rules are applied, the same reduced graph is obtained, which makes the reduction process deterministic and polynomial‑time.
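As an illustration of how coefficients attached to the same position can be merged, here is a minimal sketch of a ⊎-style operation on lists of (coefficient, position) pairs. It assumes the semiring of natural numbers with ordinary addition and multiplication; the names `merge_pairs` and `scale`, and the choice to drop pairs whose coefficient becomes zero, are assumptions made for this example rather than definitions taken from the paper.

```python
def merge_pairs(xs, ys, add=lambda a, b: a + b, zero=0):
    """Combine two lists of (coefficient, position) pairs.

    Pairs that share a position are merged by adding their coefficients
    with the semiring addition `add`. Dropping pairs whose merged
    coefficient equals `zero` is an assumption of this sketch.
    """
    acc = {}
    for coeff, pos in list(xs) + list(ys):
        acc[pos] = add(acc[pos], coeff) if pos in acc else coeff
    return [(c, p) for p, c in sorted(acc.items()) if c != zero]

def scale(k, xs, mul=lambda a, b: a * b):
    """Left-multiply every coefficient by the semiring element k."""
    return [(mul(k, c), p) for c, p in xs]

# Position 2 occurs on both sides, so its coefficients 3 and 5 are added.
merged = merge_pairs([(2, 1), (3, 2)], [(5, 2), (1, 4)])
# merged == [(2, 1), (8, 2), (1, 4)]
```

Passing `add` and `mul` as parameters keeps the sketch independent of the particular semiring; only the defaults are tied to the natural numbers.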

To handle more complex structures, the paper introduces the notion of an “orbit”: a maximal strongly connected component of vertices that share the same label. When each orbit satisfies the SNF conditions, it can be reduced independently, and the reduced orbits can be recombined to obtain a globally minimal Glushkov K‑graph. This orbit‑based decomposition enables a modular algorithm that first partitions the graph, applies local reductions, and finally merges the results.
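The partitioning step can be sketched with a standard strongly connected component computation. The example below uses Kosaraju's two-pass algorithm and covers only the strong-connectivity part of the orbit notion; any further condition on orbits is not modelled. The function names and the rule that a single vertex counts only if it has a self-loop are assumptions of this sketch.

```python
def strongly_connected_components(graph):
    """graph maps each vertex to an iterable of its successors."""
    vertices = set(graph) | {w for succ in graph.values() for w in succ}

    # First pass: record vertices in order of DFS completion.
    order, seen = [], set()
    def visit(v):
        seen.add(v)
        for w in graph.get(v, ()):
            if w not in seen:
                visit(w)
        order.append(v)
    for v in sorted(vertices):
        if v not in seen:
            visit(v)

    # Build the reversed graph.
    reverse = {v: [] for v in vertices}
    for v, succ in graph.items():
        for w in succ:
            reverse[w].append(v)

    # Second pass: explore the reversed graph in decreasing finish order.
    components, assigned = [], set()
    for v in reversed(order):
        if v in assigned:
            continue
        stack, comp = [v], set()
        assigned.add(v)
        while stack:
            u = stack.pop()
            comp.add(u)
            for w in reverse[u]:
                if w not in assigned:
                    assigned.add(w)
                    stack.append(w)
        components.append(comp)
    return components

def candidate_orbits(graph):
    """Keep components that contain at least one cycle."""
    return [c for c in strongly_connected_components(graph)
            if len(c) > 1 or any(v in graph.get(v, ()) for v in c)]

# Transition graph of the Glushkov automaton of (a + b)* . a
# (state 0 is initial; positions 1 and 2 lie under the star).
g = {0: [1, 2, 3], 1: [1, 2, 3], 2: [1, 2, 3], 3: []}
orbits = candidate_orbits(g)
```

On this graph the only component containing a cycle is the pair of positions under the star, which matches the intuition that orbits come from starred subexpressions.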

The central algorithm takes a weighted automaton M as input and decides whether M is the Glushkov automaton of some proper K‑expression. If so, the algorithm reconstructs an equivalent K‑expression E. The reconstruction proceeds by (1) identifying the unique initial state and the set of final states, (2) extracting First, Last, and Follow information for each state, (3) traversing the graph backward to rebuild the closure and concatenation structure, and (4) applying the inverse of the K‑rules to obtain the original expression. If M does not arise from the Glushkov construction, the algorithm reports failure.
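Before any such reconstruction, two well-known necessary conditions on Glushkov automata can be checked cheaply: the initial state has no incoming transition, and all transitions entering a given state carry the same letter. The sketch below tests only these two conditions. It is far from sufficient and is not the decision procedure of the paper; the function name and the tuple layout `(source, letter, weight, target)` are hypothetical.

```python
def passes_basic_glushkov_checks(initial, transitions):
    """Check two necessary (not sufficient) conditions.

    transitions: iterable of (source, letter, weight, target) tuples.
    The weight is carried along but not inspected by this sketch.
    """
    incoming_letters = {}
    for source, letter, weight, target in transitions:
        if target == initial:
            return False          # the initial state must not be re-entered
        incoming_letters.setdefault(target, set()).add(letter)
    # every state must be entered by a single letter
    return all(len(ls) == 1 for ls in incoming_letters.values())

ok = passes_basic_glushkov_checks(0, [
    (0, "a", 2, 1), (0, "b", 1, 2),
    (1, "b", 3, 2), (2, "a", 1, 1),
])
bad_letters = passes_basic_glushkov_checks(0, [
    (0, "a", 2, 1), (2, "b", 1, 1), (0, "b", 1, 2),
])
bad_initial = passes_basic_glushkov_checks(0, [
    (0, "a", 2, 1), (1, "a", 1, 0),
])
```

An automaton that fails either check cannot be a Glushkov automaton, so the full procedure would only need to run on inputs that pass.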

The contributions of the paper are threefold: (i) a rigorous extension of the Glushkov characterization to weighted automata over factorial semirings, (ii) a set of confluent reduction rules that enable polynomial‑time minimization of weighted Glushkov graphs, and (iii) an effective procedure for converting a Glushkov K‑graph back into a compact K‑expression, thereby closing the loop between expressions and automata in the weighted setting. The work also corrects an error in a previous characterization by Caron and Ziadi and adapts the epsilon‑normal and star‑normal forms to the weighted case.

Limitations are acknowledged. The SNF restriction is needed because, for a general semiring K, it is not known whether every K‑expression can be transformed into an equivalent SNF expression without increasing its literal length; for the Boolean semiring such a transformation exists. This open problem is why the results are stated for K‑expressions in SNF rather than for arbitrary K‑expressions. Future research directions include investigating SNF equivalence for broader classes of semirings and exploring extensions to semirings that are not factorial.

