Benchmarking of algorithms for set partitions


Set partitions are arrangements of distinct objects into groups. The problem of listing all set partitions arises in a variety of settings, in particular in combinatorial optimization tasks. After a brief review, we give practical approximate formulas for determining the number of set partitions, both for small and large set sizes. Several algorithms for enumerating all set partitions are reviewed and benchmarked. The algorithm of Djokic et al. is recommended for practical use.

💡 Research Summary

The paper “Benchmarking of algorithms for set partitions” studies both the combinatorial background of set partitions (Bell numbers) and the practical performance of four non-recursive algorithms that generate all partitions of a set of size n. After a brief motivation (enumerating all partitions is useful in optimization, scheduling, packing, and even the generation of poetic rhyme schemes), the authors first discuss how many partitions exist. They recall the exact recurrence B_n = 1 + ∑_{k=1}^{n-1} C(n-1, k) B_k, where C(n-1, k) is a binomial coefficient, and list the first twenty Bell numbers. Because exact computation quickly becomes cumbersome, they present an asymptotic approximation derived by Moser and Wyman that involves the Lambert W function. Their numerical tests show that this formula, denoted B*_n, has a relative error below 3 × 10⁻³ for 2 ≤ n ≤ 50, and that its integer part is exact for n ≤ 7. They also examine a simpler upper bound due to Berend and Tassa, noting that it is tight for small n but overestimates dramatically for larger n (e.g., by a factor of four at n = 20).
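The recurrence quoted above can be checked directly with exact integer arithmetic. The following is a minimal sketch, not code from the paper: the function names are illustrative, and the closed form used for the Berend and Tassa bound, B_n < (0.792 n / ln(n + 1))^n, is an assumption taken from the general literature rather than from this summary.

```python
from math import comb, log


def bell_numbers(n_max):
    """Return [B_0, ..., B_n_max] via B_n = 1 + sum_{k=1}^{n-1} C(n-1, k) B_k."""
    bell = [1]  # B_0 = 1: the empty set has exactly one partition
    for n in range(1, n_max + 1):
        # The leading 1 is the k = 0 term, C(n-1, 0) * B_0.
        bell.append(1 + sum(comb(n - 1, k) * bell[k] for k in range(1, n)))
    return bell


def berend_tassa_bound(n):
    """Assumed form of the upper bound: (0.792 n / ln(n + 1)) ** n."""
    return (0.792 * n / log(n + 1)) ** n


if __name__ == "__main__":
    bell = bell_numbers(20)
    for n in (5, 10, 20):
        ratio = berend_tassa_bound(n) / bell[n]
        print(f"n={n:2d}  B_n={bell[n]}  bound/B_n={ratio:.2f}")
```

Under that assumed form, the ratio of the bound to the exact value at n = 20 comes out close to four, in line with the overestimate described in the summary.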

The core of the work evaluates four algorithms for enumerating set partitions, all of which are non-recursive to avoid function-call overhead. The algorithms are: (1) Hutchinson’s classic method (1963), as reproduced by Knuth; (2) Semba’s fast iterative scheme (1984); (3) Er’s algorithm (1988), which borrows ideas from Gray codes without constructing them explicitly; and (4) the iterative algorithm of Djokic et al. (1989). The authors give a concise description of each method and provide a reference implementation in C.
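The summary does not reproduce any of the four algorithms. As a rough illustration of what non-recursive enumeration looks like, the sketch below steps through restricted growth strings, where a[i] is the block index of element i and a[i] may exceed the largest earlier entry by at most one. It is a plain lexicographic successor scheme chosen for clarity, not the Djokic et al. method and not the authors' C code; the names `set_partitions` and `to_blocks` are hypothetical.

```python
def set_partitions(n):
    """Yield each partition of n items as a restricted growth string (tuple)."""
    if n == 0:
        yield ()
        return
    a = [0] * n                  # a[i] = block index of element i
    limit = [0] + [1] * (n - 1)  # limit[i] = 1 + max(a[:i]) for i >= 1
    while True:
        yield tuple(a)
        # Find the rightmost position that can still be incremented.
        i = n - 1
        while i > 0 and a[i] == limit[i]:
            i -= 1
        if i == 0:
            return               # a[0] is fixed at 0, so enumeration is done
        a[i] += 1
        # Reset the tail to zeros and refresh its limits.
        prefix_max = max(a[i], limit[i] - 1)
        for j in range(i + 1, n):
            a[j] = 0
            limit[j] = prefix_max + 1


def to_blocks(rgs, items):
    """Convert a restricted growth string into a list of blocks."""
    blocks = [[] for _ in range(max(rgs) + 1)]
    for item, block_index in zip(items, rgs):
        blocks[block_index].append(item)
    return blocks


if __name__ == "__main__":
    for rgs in set_partitions(3):
        print(rgs, to_blocks(rgs, "abc"))
```

The loop uses no recursion and keeps only two arrays of length n, which is the general shape shared by iterative generators of this kind.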

Benchmarking was performed on three hardware setups (a laptop, a desktop PC, and cloud VMs) under both Linux and Windows, using three compilers (GCC, Intel ICC, and Microsoft Visual C++) at optimization levels O2 and O3. The results are reported qualitatively and through a representative table of CPU times (in milliseconds) for n = 8…15. Hutchinson’s algorithm is consistently the slowest, while Semba’s and Er’s are comparable and substantially faster. The algorithm of Djokic et al. is the fastest for all n, with the performance gap widening as n grows. Compiler and OS effects are also significant: Linux builds (GCC/ICC) are roughly twice as fast as Windows/MSVC builds, and for the Djokic et al. code ICC outperforms GCC by a factor of about two, whereas the opposite holds for the older algorithms.
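A CPU-time measurement of this kind can be sketched as a small harness that runs a generator for each n and records the best of several repetitions. This is an assumption-laden illustration, not the authors' benchmarking setup: `time_enumeration` and `placeholder_generator` are hypothetical names, and the placeholder workload simply yields 2**n items so that the block runs on its own.

```python
import time


def time_enumeration(generate, sizes, repeats=3):
    """Return {n: (item_count, best_cpu_ms)} for each n in sizes.

    `generate(n)` must return an iterable; it is fully consumed on each run.
    """
    results = {}
    for n in sizes:
        best_ms = float("inf")
        count = 0
        for _ in range(repeats):
            start = time.process_time()  # CPU time, not wall-clock time
            count = sum(1 for _ in generate(n))
            elapsed_ms = (time.process_time() - start) * 1000.0
            best_ms = min(best_ms, elapsed_ms)
        results[n] = (count, best_ms)
    return results


def placeholder_generator(n):
    """Hypothetical stand-in workload: yields 2**n items, not set partitions."""
    return iter(range(2 ** n))


if __name__ == "__main__":
    for n, (count, ms) in time_enumeration(placeholder_generator, range(8, 16)).items():
        print(f"n={n:2d}  items={count:6d}  cpu={ms:8.3f} ms")
```

Taking the minimum over repetitions reduces noise from other processes. Absolute numbers from an interpreted language are not comparable to the compiled C timings summarized above.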

Based on these observations, the authors recommend the Djokic et al. algorithm for practical use. It is short, easy to implement, and delivers the best runtime when compiled with at least O2 optimization; on Linux, ICC gives the best performance for this algorithm. The paper also notes that extending the methods to restricted set partitions (where block size is limited) or to Gray‑code based indexing would be valuable future work, as would a systematic study of parallel or hardware‑accelerated implementations.

In summary, the study bridges theory and practice: it supplies a reliable, low-error approximation for Bell numbers and delivers an evidence-based recommendation that the Djokic et al. iterative algorithm is the most efficient of the four tested methods for generating all set partitions on typical modern hardware.
