Split-or-decompose: Improved FPT branching algorithms for maximum agreement forests

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

Phylogenetic trees are leaf-labelled trees used to model the evolution of species. In practice it is not uncommon to obtain two topologically distinct trees for the same set of species, and this motivates the use of distance measures to quantify dissimilarity. A well-known measure is the maximum agreement forest (MAF): a minimum-size partition of the leaf labels which splits both trees into the same set of disjoint, leaf-labelled subtrees (up to isomorphism after suppressing degree-2 vertices). Computing such a MAF is NP-hard and so considerable effort has been invested in finding FPT algorithms, parameterised by $k$, the number of components of a MAF. The state of the art has been unchanged since 2015, with running times of $O^*(3^k)$ for unrooted trees and $O^*(2.3431^k)$ for rooted trees. In this work we present improved algorithms for both the unrooted and rooted cases, with runtimes $O^*(2.846^k)$ and $O^*(2.3391^k)$ respectively. The key to our improvement is a novel branching strategy in which we show that any overlapping components obtained on the way to a MAF can be ‘split’ by a branching rule with favourable branching factor, and then the problem can be decomposed into disjoint subproblems to be solved separately. We expect that this technique may be more widely applicable to other problems in algorithmic phylogenetics.


💡 Research Summary

The paper addresses the Maximum Agreement Forest (MAF) problem, a fundamental measure for quantifying the dissimilarity between two phylogenetic trees that share the same set of leaf labels. Computing a MAF is NP‑hard (indeed APX‑hard), and the parameterized‑complexity community has long sought fixed‑parameter tractable (FPT) algorithms whose running time depends exponentially only on the optimum size k (the number of components in a maximum agreement forest). Prior to this work, the best known FPT branching algorithms achieved running times of O*(3^k) for the unrooted variant (uMAF) and O*(2.3431^k) for the rooted variant (rMAF). These algorithms rely on extensive case analyses (more than twenty distinct configurations) and achieve a branching factor of at most three.

The authors introduce a novel “split‑or‑decompose” branching technique that dramatically simplifies the analysis and improves the exponential base. The key observation is that, during the iterative process of cutting edges in one tree to form a forest F′, the components of F′ may overlap when viewed in the other tree T. Whenever such an overlap occurs, they prove that a safe branching rule exists that either (i) leaves the overlapping component unchanged or (ii) splits it into at least two smaller components. Crucially, this rule requires at most one additional cut, so the parameter k decreases by one in each branch, yielding a branching factor of 2.
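To make the "branching factor" terminology concrete, the following sketch computes the branching number of a branching vector, that is, the root x ≥ 1 of x^(-a_1) + ... + x^(-a_r) = 1, where a_i is the amount by which the parameter k drops in branch i. This is the textbook definition used in the analysis of branching algorithms, not code from the paper; the function name `branching_number` and its tolerance parameter are illustrative assumptions.

```python
from typing import Sequence


def branching_number(vector: Sequence[float], tol: float = 1e-12) -> float:
    """Return the root x >= 1 of sum(x**-a for a in vector) == 1.

    `vector` lists how much the parameter k decreases in each branch.
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    if len(vector) < 2 or any(a <= 0 for a in vector):
        raise ValueError("need at least two branches with positive decreases")

    def excess(x: float) -> float:
        # Strictly decreasing in x for x >= 1; positive at x = 1.
        return sum(x ** -a for a in vector) - 1.0

    lo, hi = 1.0, 2.0
    while excess(hi) > 0:  # grow the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol:  # plain bisection on the bracket [lo, hi]
        mid = (lo + hi) / 2.0
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    print(branching_number([1, 1]))     # two branches, k drops by 1 in each
    print(branching_number([1, 1, 1]))  # three branches, k drops by 1 in each
```

A two-way branch in which k drops by one on each side has vector (1, 1) and branching number 2, while a three-way branch with vector (1, 1, 1) has branching number 3, which matches the bases quoted in the summary.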

After repeatedly applying the split rule until no overlaps remain, the instance decomposes into a collection of independent sub‑instances, each consisting of a single component of F′ that is disjoint from all others in T. Because the sub‑instances are independent, the total number of cuts needed across all of them still sums to k, and the overall recursion solves each sub‑instance separately using the existing branching rules. The decomposition therefore reduces the effective branching factor of the whole algorithm from three to two for the split steps, while the remaining steps retain the previously known recurrences. This structural approach eliminates the need for the cumbersome 23‑case analysis that underlies earlier algorithms.
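A short calculation illustrates why solving disjoint sub-instances separately does not hurt the overall bound. It is a generic argument rather than the paper's own analysis, and it assumes (as labelled symbols, not the paper's notation) that there are m ≥ 2 sub-instances with parameters k_1, ..., k_m ≥ 1 summing to k, each solved by a search tree with at most c^{k_i} leaves for some base c ≥ 2.

```latex
% Assumption: k_1 + \dots + k_m = k, each k_i \ge 1, and c \ge 2.
% For x, y \ge 2 we have x + y \le xy, because (x-1)(y-1) \ge 1.
% Applying this repeatedly to the terms c^{k_i} \ge 2 gives
\[
  \sum_{i=1}^{m} c^{k_i}
  \;\le\; \prod_{i=1}^{m} c^{k_i}
  \;=\; c^{\,k_1 + \dots + k_m}
  \;=\; c^{k}.
\]
```

Under these assumptions the work is additive across sub-instances rather than multiplicative, so the combined cost never exceeds the c^k bound of a single undivided instance and is typically much smaller.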

Applying this framework to the unrooted case, the authors obtain an algorithm with running time O*(2.846^k), a substantial improvement over the previous O*(3^k). For the rooted case, they achieve O*(2.3391^k), a modest but non‑trivial gain over O*(2.3431^k). They also demonstrate the technique on a special class of trees—caterpillars—improving the best known bound from O*(2.49^k) to O*(2.4634^k).
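To give a feel for what the improved bases mean, the sketch below compares the old and new worst-case bounds by computing the ratio (old_base / new_base)^k for a few values of k. It uses only the bases quoted above; the helper name `bound_ratio` and the chosen values of k are illustrative assumptions. Because the O* notation hides polynomial factors, the numbers compare worst-case bounds only and say nothing about measured running times.

```python
def bound_ratio(old_base: float, new_base: float, k: int) -> float:
    """Ratio old_base**k / new_base**k of two exponential worst-case bounds.

    Hypothetical helper for illustration; polynomial factors are ignored.
    """
    return (old_base / new_base) ** k


# Bases quoted in the summary: (previous bound, new bound).
BOUNDS = {
    "unrooted": (3.0, 2.846),
    "rooted": (2.3431, 2.3391),
}

if __name__ == "__main__":
    for name, (old, new) in BOUNDS.items():
        for k in (10, 20, 50):
            print(f"{name:9s} k={k:2d}  bound ratio = {bound_ratio(old, new, k):.3f}")
```

The ratio grows exponentially in k, so the gap between the unrooted bases widens quickly, whereas the rooted bases are so close that the ratio stays near 1 for moderate k.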

The paper’s contributions can be summarised as follows:

  1. Structural Insight – Proving that any overlapping pair of components can be safely handled by a binary split with branching factor 2.
  2. Decomposition Strategy – After all overlaps are eliminated, the problem naturally splits into disjoint sub‑problems that can be solved independently, preserving the total parameter k.
  3. Algorithmic Improvements – Concrete FPT algorithms for uMAF (O*(2.846^k)) and rMAF (O*(2.3391^k)), and an improved bound for caterpillar instances.
  4. Methodological Impact – The split‑or‑decompose technique reduces reliance on exhaustive case analysis and is likely applicable to many MAF variants (e.g., hybridisation number, subtree prune‑and‑regraft distance, multi‑tree agreement forests).

While the authors do not provide empirical evaluation, they argue that overlapping components are common in practice, suggesting that the new branching rule will yield tangible speed‑ups on real phylogenetic data sets. The work represents the first structural advance in MAF branching algorithms since the seminal papers of Whidden et al., and it opens a promising avenue for further refinements and extensions in algorithmic phylogenetics.

