Note on computational complexity of the Gromov-Wasserstein distance


This note addresses the computational difficulty of the Gromov-Wasserstein distance that is frequently mentioned in the literature. We provide details on the structure of the Gromov-Wasserstein optimization problem that show its non-convex quadratic nature for any instance of input data. We further illustrate the non-convexity of the problem with several explicit examples.


💡 Research Summary

The paper investigates the computational difficulty of the Gromov‑Wasserstein (GW) distance, a metric widely used for comparing objects residing in different metric‑measure spaces. After a brief motivation covering applications in cryo‑EM, chemistry, biology, machine learning, and graph analysis, the authors focus on the finite‑space case, which is the most common in practice.

They start by recalling the definition of the $p$-GW distance for two finite metric-measure spaces $\mathcal{X}=(X,d_X,\mu_X)$ and $\mathcal{Y}=(Y,d_Y,\mu_Y)$. The optimization variable is a coupling $\mu$, represented as a vector of length $|X|\cdot|Y|$ subject to linear marginal constraints $A\mu=b$ and non-negativity. The objective can be written as $\mu^{\top}\Gamma_p\mu$, where $\Gamma_p$ is the matrix obtained by flattening the 4-way tensor with entries $|d_X(x_i,x_k)-d_Y(y_j,y_l)|^p$. This formulation shows that the GW problem is a linearly constrained quadratic program.
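This quadratic-program formulation can be sketched in a few lines of NumPy. The two-point distance matrices below are illustrative assumptions, not data from the note; the pair index $(i,j)$ is flattened in row-major order.

```python
import numpy as np

def gw_gamma(DX, DY, p=2):
    """Gamma_p[(i,j),(k,l)] = |d_X(x_i,x_k) - d_Y(y_j,y_l)|^p,
    with the pair (i,j) flattened to row index i*|Y| + j."""
    n, m = DX.shape[0], DY.shape[0]
    # Broadcast to a 4-way tensor, then flatten to an (n*m) x (n*m) matrix.
    G = np.abs(DX[:, None, :, None] - DY[None, :, None, :]) ** p
    return G.reshape(n * m, n * m)

# Toy instance (assumed for illustration): two 2-point spaces
# with pairwise distances 1 and 2, respectively.
DX = np.array([[0.0, 1.0], [1.0, 0.0]])
DY = np.array([[0.0, 2.0], [2.0, 0.0]])
Gamma = gw_gamma(DX, DY, p=2)

# Any flattened coupling mu gives the GW objective mu^T Gamma mu.
mu = np.full(4, 0.25)          # independent coupling of uniform marginals
obj = mu @ Gamma @ mu
```

Note that $\Gamma_p$ is symmetric with a zero diagonal, since $d_X(x_i,x_i)=d_Y(y_j,y_j)=0$.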

The central theoretical contribution is Theorem 4, which proves that the matrix $\Gamma_p$ is never positive semidefinite for any pair of finite metric-measure spaces containing at least two points each. The proof extracts the top-left $2\times 2$ principal submatrix of $\Gamma_p$ and observes that, writing $d$ and $h$ for the distance matrices of the two spaces, its diagonal vanishes (since $d_{11}=h_{11}=h_{22}=0$) while its off-diagonal entries equal $h_{12}^p=h_{21}^p>0$, so its determinant is $-h_{12}^p h_{21}^p<0$. Because every principal minor of a positive semidefinite matrix is non-negative (a consequence of Sylvester's criterion), $\Gamma_p$ cannot be positive semidefinite. Consequently, the quadratic program is non-convex.
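The argument is easy to check numerically. The sketch below builds $\Gamma_p$ for a pair of two-point spaces (the distance values are illustrative assumptions) and verifies both the negative principal minor and the existence of a negative eigenvalue:

```python
import numpy as np

# Two small spaces (assumed values; any spaces with >= 2 points work).
DX = np.array([[0.0, 1.0], [1.0, 0.0]])
DY = np.array([[0.0, 3.0], [3.0, 0.0]])
p = 2
G = np.abs(DX[:, None, :, None] - DY[None, :, None, :]) ** p
Gamma = G.reshape(4, 4)

# Top-left 2x2 principal submatrix: rows/cols indexed by (1,1) and (1,2).
# Its diagonal is 0 and its off-diagonal entries are h_12^p > 0.
sub = Gamma[:2, :2]
det = np.linalg.det(sub)       # = -h_12^p * h_21^p < 0

# A PSD matrix has all principal minors >= 0, so Gamma is indefinite.
min_eig = np.linalg.eigvalsh(Gamma).min()
```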

Since general non-convex quadratic programming is NP-hard, the GW distance inherits this hardness. The authors relate GW to the classic Quadratic Assignment Problem (QAP), noting that GW can be viewed as a continuous relaxation of QAP in which the flow matrix is taken as $-d^2$ and the distance matrix as $h^2$. Because QAP is NP-hard and the relaxation remains non-convex, no polynomial-time algorithm can be expected to guarantee a global optimum for arbitrary finite GW instances (unless P = NP).
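As a sanity check on the GW-QAP correspondence, one can verify that restricting the coupling to (scaled) permutation matrices turns the GW quadratic form into a QAP-style double sum. The random symmetric matrices below are illustrative assumptions, not data from the paper:

```python
import numpy as np
from itertools import permutations

rng = np.random.default_rng(0)
n = 3
# Random symmetric matrices with zero diagonal (illustrative "distances").
A = rng.random((n, n)); DX = np.triu(A, 1) + np.triu(A, 1).T
B = rng.random((n, n)); DY = np.triu(B, 1) + np.triu(B, 1).T

p = 2
Gamma = (np.abs(DX[:, None, :, None] - DY[None, :, None, :]) ** p).reshape(n*n, n*n)

matches = []
for sigma in permutations(range(n)):
    # Permutation coupling: mass 1/n on each pair (i, sigma(i)).
    P = np.zeros((n, n))
    for i, j in enumerate(sigma):
        P[i, j] = 1.0 / n
    mu = P.reshape(-1)
    # QAP-style objective for the assignment sigma.
    qap = sum(abs(DX[i, k] - DY[sigma[i], sigma[k]]) ** p
              for i in range(n) for k in range(n)) / n**2
    matches.append(np.isclose(mu @ Gamma @ mu, qap))
all_match = all(matches)
```

On permutation couplings the two objectives coincide exactly; the GW problem additionally optimizes over all non-vertex couplings in the transportation polytope.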

To illustrate the theory, two concrete examples are provided. Example 3 presents a pair of two-point spaces, computes the corresponding $\Gamma$ matrix, and shows it has a negative eigenvalue $-2$. Example 6 extends this observation: (a) using the family $\Delta_n$ of uniform simplex spaces, the number of negative eigenvalues of $\Gamma_p$ grows roughly linearly with the size of the second space; (b) using 3-D trajectory data from a dynamical system model, the same linear growth pattern is observed. These experiments confirm that negative eigenvalues are not pathological but ubiquitous.
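The simplex experiment is simple to reproduce in spirit. The sketch below assumes $\Delta_n$ denotes $n$ points with all pairwise distances equal to 1 (uniform simplex space) and counts negative eigenvalues of $\Gamma_p$ as the second space grows; the choice of $n=3$ and the range of $m$ are illustrative:

```python
import numpy as np

def simplex(n):
    """Delta_n: n points with all pairwise distances equal to 1."""
    return np.ones((n, n)) - np.eye(n)

def n_negative_eigs(DX, DY, p=2):
    """Count strictly negative eigenvalues of the flattened Gamma_p."""
    n, m = DX.shape[0], DY.shape[0]
    Gamma = (np.abs(DX[:, None, :, None] - DY[None, :, None, :]) ** p
             ).reshape(n * m, n * m)
    return int((np.linalg.eigvalsh(Gamma) < -1e-10).sum())

# Fix the first space as Delta_3 and grow the second one.
counts = [n_negative_eigs(simplex(3), simplex(m)) for m in range(2, 8)]
```

For this family the count grows linearly in $m$ (for $\Delta_n$ vs. $\Delta_m$ a short Kronecker-product computation gives $(n-1)(m-1)$ negative eigenvalues, all equal to $-2$), consistent with the linear growth reported in the note.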

The paper also surveys practical solution strategies. The conditional gradient (Frank-Wolfe) method converges only to stationary points, which in a non-convex landscape may be merely local minima or saddle points. Entropy-regularized formulations introduce a smoothing term but do not restore convexity; nevertheless, they enable efficient Sinkhorn-type algorithms that are widely used despite lacking global optimality guarantees.
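A minimal conditional-gradient sketch for the GW quadratic program is shown below. It is an assumption-laden illustration, not the note's algorithm: the gradient of $\mu^{\top}\Gamma_p\mu$ is $2\Gamma_p\mu$, the linear-minimization step is an optimal-transport LP (solved here with `scipy.optimize.linprog`), and an exact line search is used along the Frank-Wolfe direction. The helper name `fw_gw` and the toy instance are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def fw_gw(DX, DY, a, b, p=2, iters=50):
    """Conditional-gradient (Frank-Wolfe) sketch for
    min_mu mu^T Gamma_p mu over couplings of (a, b).
    Converges only to a stationary point of the non-convex objective."""
    n, m = len(a), len(b)
    Gamma = (np.abs(DX[:, None, :, None] - DY[None, :, None, :]) ** p
             ).reshape(n * m, n * m)

    # Marginal constraints A_eq mu = [a; b] on the flattened coupling.
    A_eq = np.zeros((n + m, n * m))
    for i in range(n):
        A_eq[i, i*m:(i+1)*m] = 1.0          # row sums equal a
    for j in range(m):
        A_eq[n + j, j::m] = 1.0             # column sums equal b
    b_eq = np.concatenate([a, b])

    mu = np.outer(a, b).reshape(-1)         # start at the independent coupling
    for _ in range(iters):
        grad = 2.0 * Gamma @ mu
        # Linear minimization over the transportation polytope (an OT LP);
        # dual simplex guarantees a vertex solution.
        s = linprog(grad, A_eq=A_eq, b_eq=b_eq, bounds=(0, None),
                    method="highs-ds").x
        d = s - mu
        q = d @ Gamma @ d
        lin = 2.0 * (mu @ Gamma @ d)
        # Exact line search on gamma -> q*gamma^2 + lin*gamma over [0, 1].
        if q > 0:
            gamma = float(np.clip(-lin / (2.0 * q), 0.0, 1.0))
        else:
            gamma = 1.0 if q + lin < 0 else 0.0
        mu = mu + gamma * d
    return mu, mu @ Gamma @ mu

# Toy run on two 2-point spaces with uniform marginals (assumed data).
DX = np.array([[0.0, 1.0], [1.0, 0.0]])
DY = np.array([[0.0, 2.0], [2.0, 0.0]])
mu, val = fw_gw(DX, DY, np.array([0.5, 0.5]), np.array([0.5, 0.5]))
```

On this tiny instance the method moves from the independent coupling (objective $1.5$) to a permutation coupling (objective $0.5$); in general, only stationarity is guaranteed.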

In summary, the note rigorously establishes that every finite‑instance GW distance problem is a non‑convex quadratic program with a necessarily indefinite objective matrix. This explains the persistent computational challenges reported in the literature and suggests that future research should focus on specialized heuristics, approximation schemes, or exploiting additional structure rather than seeking exact polynomial‑time solvers. The work also clarifies the relationship between GW and QAP, reinforcing the view of GW as a non‑convex relaxation of a classic combinatorial optimization problem.

