Comparing Euclidean and Hyperbolic K-Means for Generalized Category Discovery
Hyperbolic representation learning has been widely used to extract implicit hierarchies within data, and has recently been applied to the open-world classification task of Generalized Category Discovery (GCD). However, prior hyperbolic GCD methods use hyperbolic geometry only for representation learning and transform back to Euclidean geometry for clustering. We hypothesize that this is suboptimal. We therefore present Hyperbolic Clustered GCD (HC-GCD), which learns embeddings in the Lorentz hyperboloid model of hyperbolic geometry and clusters these embeddings directly in hyperbolic space using a hyperbolic K-Means algorithm. We test our model on the Semantic Shift Benchmark datasets and demonstrate that HC-GCD is on par with the previous state-of-the-art hyperbolic GCD method. Furthermore, we show that hyperbolic K-Means yields better accuracy than Euclidean K-Means. Our ablation studies show that clipping the norm of the Euclidean embeddings decreases clustering accuracy on unseen classes and increases it on seen classes, while the effect on overall accuracy is dataset dependent. We also show that hyperbolic K-Means produces more consistent clusters when the label granularity is varied.
💡 Research Summary
The paper addresses a fundamental limitation in recent Generalized Category Discovery (GCD) approaches that exploit hyperbolic representation learning. While prior works such as HypCD and HIDISC learn embeddings in hyperbolic space, they revert to Euclidean geometry for the clustering stage, thereby potentially distorting the hierarchical structure that hyperbolic space naturally encodes. To overcome this, the authors propose Hyperbolic Clustered GCD (HC‑GCD), a framework that (1) learns embeddings directly on the Lorentz hyperboloid model of hyperbolic geometry, and (2) clusters those embeddings without leaving the hyperbolic manifold by employing a novel hyperbolic K‑Means algorithm.
The hyperbolic K‑Means algorithm is built on two ingredients: the hyperbolic distance d_H defined via the Lorentz inner product, and a centroid computation that uses the Lorentz centroid. The authors prove that the Lorentz centroid, when projected onto the Klein model, coincides with the Einstein midpoint—a closed‑form weighted average that is well‑known in hyperbolic geometry. This equivalence (Lemma 1, Corollary 1.1, Theorem 1) eliminates the need for model‑to‑model transformations during clustering and guarantees that the centroid truly minimizes the sum of squared hyperbolic distances.
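The clustering step described above can be sketched in a few lines of NumPy. This is an illustrative Lloyd-style implementation under stated assumptions, not the authors' code: the function names, the empty-cluster handling, and the random initialization are our own choices; the curvature magnitude `c = 0.05` matches the value quoted later in the summary.

```python
import numpy as np

def lorentz_inner(x, y):
    """Lorentz (Minkowski) inner product: -x0*y0 + <x_rest, y_rest>."""
    return -x[..., 0] * y[..., 0] + np.sum(x[..., 1:] * y[..., 1:], axis=-1)

def lorentz_dist(x, y, c=0.05):
    """Geodesic distance on the hyperboloid of curvature -c."""
    u = np.clip(-c * lorentz_inner(x, y), 1.0, None)  # arccosh needs argument >= 1
    return np.arccosh(u) / np.sqrt(c)

def lorentz_centroid(points, c=0.05):
    """Lorentz centroid: normalize the coordinate-wise sum back onto the hyperboloid."""
    s = points.sum(axis=0)
    return s / (np.sqrt(c) * np.sqrt(np.abs(lorentz_inner(s, s))))

def hyperbolic_kmeans(X, k, c=0.05, iters=50, init=None, seed=0):
    """Lloyd-style K-means using hyperbolic distances and Lorentz centroids."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)] if init is None else init.copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        d = np.stack([lorentz_dist(X, m, c) for m in centers], axis=1)
        labels = d.argmin(axis=1)
        for j in range(k):  # keep the old center if a cluster empties out
            if np.any(labels == j):
                centers[j] = lorentz_centroid(X[labels == j], c)
    return labels, centers
```

Because the centroid update is a closed-form normalization rather than an iterative Fréchet-mean solve, each Lloyd iteration costs essentially the same as Euclidean K-Means.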
HC‑GCD’s training pipeline mirrors that of HypCD: a Vision Transformer (ViT‑B/14) pretrained with DINOv2 provides visual features, which are passed through a four‑layer projection head. The Euclidean output is norm‑clipped (maximum norm 2.3) and mapped onto the Lorentz hyperboloid with curvature c = −0.05 using the exponential map. The loss combines distance‑based and angle‑based contrastive terms for both supervised (labeled) and unsupervised (unlabeled) samples. Distance‑based contrast uses the negative hyperbolic distance as similarity, while angle‑based contrast employs the exterior angle derived from the Lorentz inner product. A weighting schedule (α linearly decays from 1 to 0; λ balances supervised vs. unsupervised contrast) guides training. After convergence, the hyperbolic K‑Means algorithm clusters the latent points in the Lorentz space, optionally using a small set of labeled centroids for semi‑supervised refinement.
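The clip-then-lift step of this pipeline can be sketched as follows. This is a minimal NumPy sketch, assuming the standard exponential map at the hyperboloid origin; the function names `clip_norm` and `expmap0` are our own, and the constants (maximum norm 2.3, curvature magnitude 0.05) are the ones quoted above.

```python
import numpy as np

def clip_norm(z, max_norm=2.3):
    """Scale Euclidean embeddings down so their norm is at most max_norm."""
    n = np.linalg.norm(z, axis=-1, keepdims=True)
    return z * np.minimum(1.0, max_norm / np.maximum(n, 1e-12))

def expmap0(v, c=0.05):
    """Exponential map at the hyperboloid origin (curvature -c):
    lifts a Euclidean vector v, treated as a tangent vector, onto the Lorentz model."""
    sc = np.sqrt(c)
    r = np.maximum(np.linalg.norm(v, axis=-1, keepdims=True), 1e-12)
    time = np.cosh(sc * r) / sc          # timelike coordinate x0
    space = np.sinh(sc * r) * v / (sc * r)  # spatial coordinates
    return np.concatenate([time, space], axis=-1)
```

Every lifted point x satisfies the hyperboloid constraint ⟨x, x⟩_L = −1/c, so the hyperbolic distance and centroid formulas apply to it directly.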
Experiments are conducted on the Semantic Shift Benchmark (SSB), comprising CUB‑200‑2011, Stanford Cars, and FGVC‑Aircraft. Following the standard GCD protocol, half of the classes are designated as “seen,” and only 50% of the samples from those classes receive labels, forming a partially labeled training set. The remaining data (both seen and unseen) constitute the unlabeled pool. Results show that HC‑GCD matches or slightly exceeds the overall accuracy of the previous state‑of‑the‑art hyperbolic method (HypCD). More importantly, when the hyperbolic K‑Means is used instead of Euclidean K‑Means, the accuracy on unseen classes improves by 2–3 percentage points on average, confirming that preserving hyperbolic geometry during clustering benefits the discovery of novel categories.
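The GCD split protocol above (half the classes seen, half of the seen-class samples labeled) can be sketched as follows. This is an illustrative helper, assuming a flat array of class labels; the function name `gcd_split` and the return format are our own, not from the paper or the SSB toolkit.

```python
import numpy as np

def gcd_split(labels, seed=0):
    """GCD-style split: half the classes become 'seen'; 50% of the samples
    from seen classes form the labeled set, everything else is unlabeled."""
    rng = np.random.default_rng(seed)
    classes = np.unique(labels)
    seen = rng.choice(classes, size=len(classes) // 2, replace=False)
    labeled = np.zeros(len(labels), dtype=bool)
    for cls in seen:
        idx = np.flatnonzero(labels == cls)
        pick = rng.choice(idx, size=len(idx) // 2, replace=False)
        labeled[pick] = True  # only seen-class samples can be labeled
    return labeled, set(seen.tolist())
```

The unlabeled pool (`~labeled`) then mixes the remaining seen-class samples with all unseen-class samples, which is what makes the clustering step an open-world problem.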
Ablation studies explore (a) the effect of clipping the Euclidean norm before the exponential map, (b) the impact of using the Poincaré model versus the Lorentz model, and (c) the stability of clusters under varying label granularity. Clipping the norm tends to boost seen‑class performance while slightly hurting unseen‑class performance, a trade‑off that varies across datasets. The label‑granularity analysis demonstrates that hyperbolic K‑Means yields more consistent cluster assignments as the number of fine‑grained labels changes, whereas Euclidean K‑Means exhibits larger fluctuations.
The paper’s contributions are threefold: (1) a mathematically grounded hyperbolic K‑Means algorithm with a provably optimal centroid; (2) an end‑to‑end GCD system that remains entirely within hyperbolic space, thereby preserving hierarchical information; (3) empirical evidence that hyperbolic clustering improves unseen‑class discovery and yields more stable clusters across granularity levels.
Limitations include the focus on image data and a single backbone architecture; extending the approach to text, multimodal, or larger‑scale datasets remains future work. Additionally, the sensitivity of hyperbolic K‑Means to initialization and convergence speed is not exhaustively examined. Nonetheless, HC‑GCD establishes a solid baseline for fully hyperbolic GCD and opens avenues for further research on hierarchical open‑world learning.