Simple Complexity Analysis of Simplified Direct Search
We consider the problem of unconstrained minimization of a smooth function in the derivative-free setting. In particular, we propose and study a simplified variant of the direct search method (of directional type), which we call simplified direct search (SDS). Unlike standard direct search methods, which depend on a large number of parameters that need to be tuned, SDS depends on a single scalar parameter only. Despite relevant research activity in direct search methods spanning several decades, complexity guarantees—bounds on the number of function evaluations needed to find an approximate solution—were not established until very recently. In this paper we give a surprisingly brief and unified analysis of SDS for nonconvex, convex and strongly convex functions. We match the existing complexity results for direct search in their dependence on the problem dimension ($n$) and error tolerance ($\epsilon$), but the overall bounds are simpler, easier to interpret, and have better dependence on other problem parameters. In particular, we show that for the set of directions formed by the standard coordinate vectors and their negatives, the number of function evaluations needed to find an $\epsilon$-solution is $O(n^2 /\epsilon)$ (resp. $O(n^2 \log(1/\epsilon))$) for the problem of minimizing a convex (resp. strongly convex) smooth function. In the nonconvex smooth case, the bound is $O(n^2/\epsilon^2)$, with the goal being the reduction of the norm of the gradient below $\epsilon$.
💡 Research Summary
The paper introduces a streamlined variant of direct‑search methods for derivative‑free optimization, called Simplified Direct Search (SDS). Traditional direct‑search algorithms rely on a multitude of scalar parameters (forcing constant, exponent, initial step size, two step‑size reduction factors, and an increase factor) together with a “search step” and a “poll step”. This complexity makes parameter tuning difficult and obscures the theoretical analysis. SDS eliminates all but a single scalar parameter—either the forcing constant c or the initial step size α₀—by fixing the other to a default value and by using a fixed set of search directions throughout the run.
The algorithm works with a positive spanning set D of directions; the authors focus on the canonical set D⁺ = {±eᵢ | i = 1,…,n}, i.e., the coordinate axes and their negatives. The quality of D is measured by its cosine measure μ: the minimum, over all non‑zero vectors v, of the largest cosine of the angle between v and a direction in D. For D⁺, |D| = 2n and μ = 1/√n, so the ratio |D|/μ² = 2n², a quantity that appears directly in the complexity bounds. The authors conjecture that for any positive spanning set with μ > 0 this ratio cannot be smaller than n², which would make D⁺ essentially optimal.
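As a quick numerical sanity check (not from the paper; the helper `max_cosine` and the test values are illustrative), the cosine measure of D⁺ and the ratio |D|/μ² = 2n² can be verified directly:

```python
import numpy as np

def max_cosine(v, D):
    """Largest cosine of the angle between v and any direction in D."""
    v = v / np.linalg.norm(v)
    return max(float(d @ v) / np.linalg.norm(d) for d in D)

n = 5
D = [s * e for e in np.eye(n) for s in (1.0, -1.0)]   # D+ : the 2n vectors ±e_i

# The worst-case input for D+ is the all-ones diagonal direction, where the
# best attainable cosine is exactly 1/sqrt(n); every other v does at least as well.
mu = max_cosine(np.ones(n), D)
print(mu, 1 / np.sqrt(n))      # the two values agree
print(len(D) / mu**2)          # 2*n**2

rng = np.random.default_rng(0)
assert all(max_cosine(rng.standard_normal(n), D) >= mu - 1e-12 for _ in range(100))
```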
At each iteration SDS tries directions d ∈ D at the current step size α. If a trial point yields a function decrease of at least ρ(α) = cα², the algorithm moves there and keeps α; otherwise the step size is halved (β = 0.5). Each iteration costs at most |D| function evaluations. The number of successful iterations is controlled by the cα² sufficient‑decrease requirement, while the number of unsuccessful (step‑size‑reducing) iterations is controlled by the geometric shrinking of α; together these two counts determine the total work.
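A minimal sketch of this loop, assuming unit directions, the cα² acceptance test, and β = 0.5 (the function names, termination thresholds, and the quadratic test problem are illustrative, not from the paper):

```python
import numpy as np

def sds(f, x0, D, c=1.0, alpha0=1.0, beta=0.5, alpha_min=1e-6, max_evals=100_000):
    """Simplified direct search (sketch): a single forcing constant c,
    a fixed direction set D, and a step size halved after each failed sweep."""
    x = np.asarray(x0, dtype=float)
    fx, alpha, evals = f(x), alpha0, 0
    while alpha > alpha_min and evals < max_evals:
        for d in D:
            fy = f(x + alpha * d)
            evals += 1
            if fy <= fx - c * alpha**2:     # sufficient decrease achieved
                x, fx = x + alpha * d, fy   # successful step: keep alpha
                break
        else:                               # no direction gave enough decrease
            alpha *= beta                   # unsuccessful step: halve alpha
    return x, fx, evals

n = 4
D = [s * e for e in np.eye(n) for s in (1.0, -1.0)]   # D+ in R^4
f = lambda x: float(x @ x)                            # smooth, strongly convex test
x, fx, evals = sds(f, np.full(n, 2.0), D)
print(fx)   # essentially 0, the minimum value
```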
Assuming the objective f is L‑smooth (i.e., its gradient is L‑Lipschitz), the authors prove a bound for every unsuccessful iterate xₖ: ‖∇f(xₖ)‖ ≤ (c + L/2) αₖ / μ, where αₖ is the current step size. Since αₖ is halved after every unsuccessful iteration, this bound shrinks geometrically. From it they derive three main complexity results:
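The bound follows in a few lines from L‑smoothness and the definition of the cosine measure; a sketch of the argument, consistent with the quantities above and assuming unit‑norm directions:

```latex
\begin{align*}
&\text{If iteration $k$ is unsuccessful, every unit direction $d\in D$ fails the test:}\\
&\qquad f(x_k + \alpha_k d) > f(x_k) - c\,\alpha_k^2. \\
&\text{$L$-smoothness bounds the left-hand side:}\quad
  f(x_k + \alpha_k d) \le f(x_k) + \alpha_k \langle \nabla f(x_k), d\rangle
  + \tfrac{L}{2}\,\alpha_k^2, \\
&\text{so}\quad \langle \nabla f(x_k), d\rangle
  > -\bigl(c + \tfrac{L}{2}\bigr)\alpha_k \quad \text{for all } d\in D. \\
&\text{The cosine measure supplies some $d\in D$ with }
  \langle \nabla f(x_k), d\rangle \le -\mu\,\|\nabla f(x_k)\|, \\
&\text{hence}\quad \|\nabla f(x_k)\| \le \frac{(c + L/2)\,\alpha_k}{\mu}.
\end{align*}
```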
- Non‑convex case – The goal is to find a point with ‖∇f‖ ≤ ε. Once the step size drops below roughly με/(c + L/2), every unsuccessful iterate already meets the target, while the number of successful iterations at each step‑size level is bounded through the cα² sufficient decrease. With at most |D| function evaluations per iteration, the total is O(|D|/(μ²ε²)) = O(n²/ε²) function evaluations.
- Convex case – For convex smooth f with a bounded initial level set of radius R₀, convexity gives f(x) − f* ≤ R₀‖∇f(x)‖, so the gradient bound translates into a bound on the objective gap. Achieving f(xₖ) − f* ≤ ε then requires O(|D|/(μ²ε)) = O(n²/ε) function evaluations.
- Strongly convex case – If f is λ‑strongly convex, then f(x) − f* ≤ ‖∇f(x)‖²/(2λ), so it suffices to drive the gradient norm below √(2λε). Because the step size, and with it the gradient bound, is halved after every unsuccessful iteration, the method converges linearly, and an ε‑accurate solution requires O(n² log(1/ε)) function evaluations.
The paper also proposes three initialization strategies. The first fixes the forcing constant to c = L/2 (the optimal choice when L is known) and automatically selects α₀; the second fixes α₀ and adaptively estimates c during the run; the third mimics the traditional “run until the first failure” approach. The first two strategies remove the need for manual tuning of either c or α₀, preserving the single‑parameter nature of SDS.
Compared with earlier works (Vicente 2013, Dodangeh & Vicente 2014, etc.), SDS matches the same dimension‑and‑ε dependence but offers a dramatically shorter and more transparent proof (6, 10, and 7 lines for the non‑convex, convex, and strongly convex cases, respectively). Moreover, by choosing c = L/2 the dependence on the Lipschitz constant becomes linear rather than quadratic, which is a notable improvement when L is known. The authors also discuss how the simplified analysis facilitates extensions to stochastic oracles, constrained problems, nonsmooth functions, and randomized variants.
In summary, Simplified Direct Search provides a practically appealing, theoretically sound derivative‑free optimization method that requires only one tunable scalar, works with a fixed, easily constructed direction set, and achieves optimal (up to constants) evaluation complexity for smooth non‑convex, convex, and strongly convex problems. Its streamlined analysis and reduced parameter burden make it a strong candidate for real‑world zero‑order optimization tasks and a solid foundation for future methodological extensions.