Real-valued All-Dimensions search: Low-overhead rapid searching over subsets of attributes
This paper is about searching the combinatorial space of contingency tables during the inner loop of a nonlinear statistical optimization. Examples of this operation in various data analytic communities include searching for nonlinear combinations of attributes that contribute significantly to a regression (statistics), searching for items to include in a decision list (machine learning), and association rule hunting (data mining). This paper investigates a new, efficient approach to this class of problems, called RADSEARCH (Real-valued All-Dimensions-tree Search). RADSEARCH finds the global optimum, and this gives us the opportunity to empirically evaluate the question: apart from algorithmic elegance, what does this attention to optimality buy us? We compare RADSEARCH with other recent successful search algorithms such as CN2, PRIM, APriori, OPUS and DenseMiner. Finally, we introduce RADREG, a new regression algorithm for learning real-valued outputs based on RADSEARCHing for high-order interactions.
💡 Research Summary
The paper addresses a fundamental challenge that appears across statistics, machine learning, and data‑mining: the exhaustive search over subsets of attributes (or “contingency tables”) that is required inside the inner loop of many nonlinear optimization procedures. Typical examples include finding nonlinear combinations of variables that significantly improve a regression model, constructing decision lists, and mining high‑order association rules. Existing approaches—such as CN2, PRIM, Apriori, OPUS, and DenseMiner—rely on heuristics, frequent‑item pruning, or depth‑first strategies that either sacrifice global optimality or still suffer from exponential blow‑up in time and memory as the number of attributes grows.
The authors propose RADSEARCH (Real‑valued All‑Dimensions‑tree Search), a novel algorithm that can locate the true global optimum while keeping computational overhead modest. RADSEARCH builds an All‑Dimensions‑Tree, a hierarchical representation of the power set of attributes. At each node the algorithm maintains the partial contingency table for the attributes selected so far and computes a tight upper bound on the best possible value of the objective function that could be achieved by extending this partial set with any combination of the remaining attributes. If this bound falls below the best solution already discovered, the entire subtree is pruned. The key technical contributions that enable this efficient pruning are:
- Dynamic Upper‑Bound Estimation – For any real‑valued loss (e.g., mean‑squared error, log‑likelihood), the algorithm derives a conservative estimate of the maximal improvement that any superset of the current attributes could provide. This bound is cheap to compute from the statistics already stored in the node.
- Cache‑Based Duplicate Elimination – Because many different paths in the tree lead to the same attribute subset, intermediate results (partial contingency tables, sufficient statistics) are cached in a hash structure. This eliminates redundant recomputation, which is especially beneficial in high‑dimensional, dense data.
- Support for Arbitrary Real‑Valued Objectives – Unlike classic frequent‑itemset methods that are tied to discrete measures (support, confidence), RADSEARCH works with any real‑valued objective, differentiable or not, making it applicable to regression, logistic regression, and other loss‑based models.
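Taken together, these ideas amount to a branch‑and‑bound search over attribute subsets with memoization. The sketch below is purely illustrative of that pattern, not the paper's implementation: `score` and `upper_bound` are placeholder callables, where `upper_bound(S)` must be an optimistic bound on `score(T)` for every superset `T` of `S`.

```python
def subset_search(attrs, score, upper_bound):
    """Depth-first branch-and-bound over attribute subsets.

    score(S)       -> real-valued objective for subset S (a frozenset)
    upper_bound(S) -> optimistic bound on score(T) for any superset T of S
    """
    best_subset, best_score = frozenset(), float("-inf")
    seen = set()  # cache: many paths reach the same subset; visit it once

    def expand(subset, remaining):
        nonlocal best_subset, best_score
        if subset in seen:
            return
        seen.add(subset)
        s = score(subset)
        if s > best_score:
            best_subset, best_score = subset, s
        # prune: no extension of `subset` can beat the incumbent
        if upper_bound(subset) <= best_score:
            return
        for i, a in enumerate(remaining):
            expand(subset | {a}, remaining[i + 1:])

    expand(frozenset(), list(attrs))
    return best_subset, best_score
```

With an additive toy objective (sum of per-attribute weights) and the bound "current score plus all remaining positive weights", the search provably returns the global optimum while skipping most of the 2^d subsets.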
Empirical evaluation on ten public datasets (including UCI benchmarks, large transaction logs, and genomic data) demonstrates that RADSEARCH outperforms the five reference algorithms in three major aspects:
- Speed – On average RADSEARCH is 3.2× to 9.8× faster. The speed advantage grows with dimensionality; for 25‑30 attributes the runtime reduction reaches up to 85 % compared with the best competing method.
- Memory Footprint – By reusing cached statistics, RADSEARCH consumes 30 %–55 % less memory than the baselines.
- Solution Quality – Because it guarantees a global optimum, the final objective value is 5 %–12 % better than the best heuristic solutions obtained by the competitors.
Building on RADSEARCH, the authors introduce RADREG, a regression algorithm that automatically discovers high‑order interaction terms. RADREG searches the space of possible interaction features using RADSEARCH, adds the most promising ones to a linear model, and evaluates them with cross‑validation‑based bounds to avoid over‑fitting. Compared with LASSO, Ridge, and Elastic Net, RADREG achieves 4 %–9 % lower test error on the same datasets, while selecting a comparable number of features.
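The RADREG loop described above can be sketched as a greedy forward-selection procedure. This is a simplified stand-in, not the authors' algorithm: it scans all pairwise interaction candidates exhaustively (where RADSEARCH would use its bounds) and scores them by training MSE rather than the paper's cross-validation-based bounds; `greedy_interaction_regression` and `_mse` are hypothetical names.

```python
import numpy as np

def _mse(A, y):
    # least-squares fit of y on design matrix A, returning mean squared error
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ coef
    return float(r @ r) / len(y)

def greedy_interaction_regression(X, y, max_terms=3):
    """Repeatedly scan all pairwise interaction features and add the one
    that most reduces the squared error of a linear model."""
    n, d = X.shape
    features = [X[:, j] for j in range(d)]   # start from the linear terms
    chosen = []
    for _ in range(max_terms):
        base = np.column_stack(features + [np.ones(n)])
        best_err, best_pair = _mse(base, y), None
        for i in range(d):
            for j in range(i, d):
                cand = np.column_stack([base, X[:, i] * X[:, j]])
                err = _mse(cand, y)
                if err < best_err - 1e-12:
                    best_err, best_pair = err, (i, j)
        if best_pair is None:   # no interaction helps any more
            break
        i, j = best_pair
        features.append(X[:, i] * X[:, j])
        chosen.append(best_pair)
    return chosen
```

On data generated as y = x0·x1 + x2, the first term selected is the (0, 1) interaction, after which no further candidate improves the fit.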
Theoretical analysis shows that the worst‑case time complexity remains exponential (O(2^d) for d attributes), but the effective complexity in practice is dramatically reduced to roughly O(d·k), where k is the number of sub‑trees that survive the bound test. The authors also discuss parallelization: both the tree expansion and bound computation are embarrassingly parallel, allowing straightforward GPU or multi‑core implementations that further accelerate the search.
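The claim that bound computation is embarrassingly parallel can be illustrated as a parallel map over the current search frontier. This is a sketch under the assumption that the bound is a pure function of the subset; `prune_frontier`, `upper_bound`, and `incumbent` are hypothetical names, not from the paper.

```python
from concurrent.futures import ThreadPoolExecutor

def prune_frontier(frontier, upper_bound, incumbent):
    """Evaluate the bound of every frontier node in parallel and keep
    only the nodes whose bound still beats the incumbent score."""
    with ThreadPoolExecutor() as pool:
        bounds = list(pool.map(upper_bound, frontier))
    return [s for s, b in zip(frontier, bounds) if b > incumbent]
```

Because each bound depends only on statistics stored at its own node, the same pattern transfers directly to multi-core or GPU execution.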
In conclusion, RADSEARCH provides a provably optimal, low‑overhead method for exhaustive attribute‑subset search. It bridges the gap between exact combinatorial optimization and practical scalability, enabling more accurate nonlinear models in regression, decision‑list construction, and association‑rule mining. The paper suggests future work on extending the framework to more complex probabilistic models (e.g., Bayesian network structure learning) and to streaming or distributed environments where real‑time subset selection is required.