Weisfeiler and Lehman Go Categorical
While lifting maps have significantly enhanced the expressivity of graph neural networks, extending this paradigm to hypergraphs remains fragmented. To address this, we introduce the categorical Weisfeiler-Lehman framework, which formalizes lifting as a functorial mapping from an arbitrary data category to the unifying category of graded posets. When applied to hypergraphs, this perspective allows us to systematically derive Hypergraph Isomorphism Networks, a family of neural architectures whose message-passing topology is strictly determined by the choice of functor. We introduce two distinct functors from the category of hypergraphs: an incidence functor and a symmetric simplicial complex functor. While the incidence architecture structurally mirrors standard bipartite schemes, our functorial derivation enforces a richer information flow over the resulting poset, capturing complex intersection geometries often missed by existing methods. We theoretically characterize the expressivity of these models, proving that both the incidence-based and symmetric simplicial approaches subsume the expressive power of the standard Hypergraph Weisfeiler-Lehman test. Extensive experiments on real-world benchmarks validate these theoretical findings.
💡 Research Summary
The paper addresses a fundamental limitation of graph neural networks (GNNs): their expressive power is bounded by the 1‑Weisfeiler‑Lehman (1‑WL) test. While recent work has lifted graphs into higher‑order structures such as simplicial complexes or regular cell complexes to surpass this bound, extending such lifting to hypergraphs has remained fragmented because hypergraphs lack a canonical geometric realization.
To solve this, the authors introduce the Categorical Weisfeiler‑Lehman (CatWL) framework, which treats lifting as a functor from an arbitrary data category (e.g., the category of hypergraphs) to a universal target category of graded posets. A graded poset is a partially ordered set equipped with a dimension function, capable of representing incidence structures of graphs, simplicial complexes, and regular cell complexes within a single algebraic object.
The paper first defines a Graded WL (GWL) algorithm that iteratively refines colors on a graded poset. At each iteration an element's color is updated by hashing together the colors of four adjacency types: boundary (lower covers), coboundary (upper covers), downward “pairwise” adjacencies (elements sharing a lower cover), and upward “pairwise” adjacencies (elements sharing an upper cover). Two graded posets are flagged as non‑isomorphic as soon as their color histograms diverge at some iteration; matching histograms at every iteration are necessary but not sufficient for isomorphism.
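One GWL refinement round can be sketched as follows. This is a minimal illustration, assuming a dictionary-of-sets encoding of the cover relation; the function name `gwl_round` and the use of Python's built-in `hash` as a stand-in for an injective hash are my own choices, not the paper's code.

```python
def gwl_round(lower, upper, colors):
    """One Graded-WL refinement round (sketch).

    lower[s] / upper[s]: sets of lower / upper covers of poset element s.
    colors: current color (an int) of each element.
    """
    def down_adj(s):  # downward pairwise: elements sharing a lower cover with s
        return [t for c in lower[s] for t in upper[c] if t != s]

    def up_adj(s):    # upward pairwise: elements sharing an upper cover with s
        return [t for c in upper[s] for t in lower[c] if t != s]

    new = {}
    for s in colors:
        # Hash the element's own color together with the four
        # adjacency-type color multisets (sorted tuples make them canonical).
        sig = (colors[s],
               tuple(sorted(colors[t] for t in lower[s])),     # boundary
               tuple(sorted(colors[t] for t in upper[s])),     # coboundary
               tuple(sorted(colors[t] for t in down_adj(s))),  # downward pairwise
               tuple(sorted(colors[t] for t in up_adj(s))))    # upward pairwise
        new[s] = hash(sig)  # stand-in for an injective hash function
    return new
```

On the incidence poset of a path hypergraph with hyperedges {a,b} and {b,c}, one round already separates the degree-2 vertex b from a and c.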
A functor F : C → Poset maps each object X in the source category C to a graded poset F(X) and each morphism to a morphism in Poset, preserving composition and identities. Because functors preserve isomorphisms, the F‑CatWL test—lift, then run GWL—provides a necessary condition for isomorphism in C.
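The preservation of isomorphisms is a one-line consequence of functoriality: if f : X → Y is an isomorphism in C with inverse f⁻¹, then

```latex
F(f) \circ F(f^{-1}) = F(f \circ f^{-1}) = F(\mathrm{id}_Y) = \mathrm{id}_{F(Y)},
\qquad
F(f^{-1}) \circ F(f) = F(\mathrm{id}_X) = \mathrm{id}_{F(X)},
```

so F(f) is an isomorphism in Poset. Contrapositively, whenever GWL distinguishes F(X) from F(Y), the originals X and Y cannot be isomorphic in C.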
Two concrete functors for hypergraphs are presented:
- Incidence Functor (I) – maps a hypergraph to its bipartite incidence poset where vertices have dimension 0 and hyperedges have dimension 1, with the order relation given by incidence (v ≺ e iff v∈e). This recovers the standard bipartite expansion but, within the graded‑poset framework, also makes use of the full set of four adjacencies, enriching information flow.
- Symmetric Simplicial Functor (S) – treats each hyperedge of size k as a (k‑1)-dimensional simplex, thereby constructing a symmetric simplicial complex. All subsets of a hyperedge become lower‑dimensional faces automatically, yielding a highly regular graded poset that captures intersection geometry in a symmetric fashion.
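The object parts of the two lifts can be sketched directly from the definitions above. This is an illustrative construction, assuming sets of hashable vertices; the function names and the (element, dimension) encoding are mine, not the paper's.

```python
from itertools import combinations

def incidence_lift(vertices, hyperedges):
    """Incidence functor I (object part, sketch):
    vertices get dimension 0, hyperedges dimension 1,
    and v < e exactly when v belongs to e."""
    dims = {('v', v): 0 for v in vertices}
    dims.update({('e', i): 1 for i in range(len(hyperedges))})
    order = {(('v', v), ('e', i))
             for i, e in enumerate(hyperedges) for v in e}
    return dims, order

def simplicial_lift(hyperedges):
    """Symmetric simplicial functor S (object part, sketch):
    every nonempty subset of a hyperedge becomes a face of
    dimension |subset| - 1, ordered by inclusion."""
    faces = set()
    for e in hyperedges:
        for k in range(1, len(e) + 1):
            faces.update(frozenset(c) for c in combinations(sorted(e), k))
    dims = {f: len(f) - 1 for f in faces}
    # Cover relation: a proper subset that is smaller by exactly one vertex.
    order = {(a, b) for a in faces for b in faces
             if a < b and len(b) == len(a) + 1}
    return dims, order
```

A single hyperedge {1, 2, 3} already shows the contrast: the incidence lift stays bipartite, while the simplicial lift materializes all seven faces of a 2-simplex.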
Both functors are shown to strictly subsume the expressive power of the standard Hypergraph WL test; i.e., they can distinguish hypergraph pairs that the original test cannot.
To obtain a learnable model, the authors replace discrete colors with continuous feature vectors and the hash function with learnable message functions and aggregators, defining F‑Categorical Message Passing Networks (F‑CatMPN). For each element σ in the lifted poset, four messages are computed corresponding to the four adjacency types, aggregated, and passed through an update function. Theorem 7 proves that any F‑CatMPN retains the same distinguishing power as its underlying F‑CatWL test, extending the classic WL‑MPN equivalence to the categorical setting.
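One F‑CatMPN layer can be sketched by replacing the hash with learnable maps. Below, sum aggregation and a single shared weight matrix `W` with a tanh update stand in for the paper's learnable message functions and aggregators; this parameterization is an assumption for illustration, not the architecture's exact form.

```python
import numpy as np

def catmpn_layer(h, lower, upper, W):
    """One F-CatMPN layer (sketch).

    h: dict mapping each poset element to a feature vector of dimension d.
    lower[s] / upper[s]: sets of lower / upper covers of element s.
    W: weight matrix of shape (d, 5 * d) combining the element's own
       state with the four aggregated adjacency messages (illustrative).
    """
    d = next(iter(h.values())).shape[0]
    zero = np.zeros(d)

    def agg(neigh):  # permutation-invariant (sum) aggregation
        return sum((h[t] for t in neigh), zero)

    new = {}
    for s in h:
        down = [t for c in lower[s] for t in upper[c] if t != s]  # N_down
        up   = [t for c in upper[s] for t in lower[c] if t != s]  # N_up
        msg = np.concatenate([h[s],
                              agg(lower[s]),   # boundary message
                              agg(upper[s]),   # coboundary message
                              agg(down),       # downward pairwise message
                              agg(up)])        # upward pairwise message
        new[s] = np.tanh(W @ msg)
    return new
```

With uniform input features on the {a,b}, {b,c} incidence poset, one layer already separates the representation of b from those of a and c, mirroring the GWL refinement.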
Extensive experiments on real‑world hypergraph benchmarks (co‑authorship networks, protein complexes, recommendation systems) compare the two derived architectures—Incidence‑HIN and Simplicial‑HIN—against state‑of‑the‑art hypergraph GNNs (HyperGCN, HGNN, Hypergraph Transformers, etc.). Both models achieve higher accuracy, F1, and AUC scores, with the Simplicial‑HIN showing especially large gains on datasets with rich intersection patterns. Ablation studies confirm that the inclusion of the downward and upward pairwise messages (N↓, N↑) contributes most to performance improvements.
In summary, the paper provides a rigorous, category‑theoretic foundation for lifting hypergraph data into a unified graded‑poset domain, demonstrates how the choice of functor dictates the message‑passing topology, and proves that the resulting neural architectures achieve provably superior expressive power. The framework opens a systematic design space for higher‑order GNNs, suggesting future work on dynamic hypergraphs, multimodal data, and more efficient approximations of the hashing step.